DaisukeMiyamoto / parallelcluster-closednetwork

Set up AWS ParallelCluster in a closed network environment

parallelcluster-closednetwork

A tutorial for setting up AWS ParallelCluster in a closed network environment. This feature is supported by AWS ParallelCluster 2.7.0 or later.

This tutorial covers the following steps:

  1. Set up a VPC and subnet without internet access (no IGW), either with a CloudFormation template or manually
  2. Launch ParallelCluster in the closed VPC
  3. Test the environment with Systems Manager Session Manager

Usage

1. Set up VPC and Private Gateways

1a: Set up closed network environment with CloudFormation

Launch the CloudFormation template below. It creates a VPC, a private subnet, and the Private Endpoints required for the various services. If you set UseSSM to true, the template also sets up PrivateLinks for Systems Manager Session Manager so that you can test the cluster.

Info: Some AZs are missing a PrivateLink service, which causes the CloudFormation template to fail. In that case, choose a different AZ by setting its letter in SubnetAZLetter (a/b/c/d, etc.).

Launch the template from the CloudFormation console, or with the CLI:

$ aws cloudformation create-stack --stack-name ClosedEnvironment --template-url https://midaisuk-public-templates.s3.amazonaws.com/parallelcluster-closednetwork/closed-vpc-privatelink.yml
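The UseSSM and SubnetAZLetter settings described above can be passed on the command line as stack parameters. The following is a minimal sketch, not the author's script: the function name create_closed_stack is hypothetical, and the parameter values ("true", "b") are assumptions about what the template accepts. If the template creates IAM resources, a --capabilities flag may also be needed.

```shell
#!/bin/bash
# Sketch: create the stack with explicit parameters, then wait for completion.
# UseSSM and SubnetAZLetter are the parameter names mentioned in this README;
# the values passed in are assumptions about what the template accepts.
TEMPLATE_URL="https://midaisuk-public-templates.s3.amazonaws.com/parallelcluster-closednetwork/closed-vpc-privatelink.yml"

# usage: create_closed_stack <stack-name> <az-letter> <use-ssm>
create_closed_stack() {
  local stack_name="$1" az_letter="$2" use_ssm="$3"
  aws cloudformation create-stack \
    --stack-name "$stack_name" \
    --template-url "$TEMPLATE_URL" \
    --parameters \
      "ParameterKey=SubnetAZLetter,ParameterValue=$az_letter" \
      "ParameterKey=UseSSM,ParameterValue=$use_ssm" &&
  # Block until CloudFormation reports CREATE_COMPLETE (or fail on rollback).
  aws cloudformation wait stack-create-complete --stack-name "$stack_name"
}
```

For example, create_closed_stack ClosedEnvironment b true would retry the stack in AZ "b" with the Session Manager endpoints enabled.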

1b: Set up closed network environment manually

This step is not required if you use the CloudFormation template.

You need to set up the following components.

  • VPC
  • Private Subnet for the VPC
  • Security Group for PrivateLinks
  • Private Endpoints
    • s3
    • dynamodb
    • logs
    • cloudformation
    • monitoring
    • ec2
    • sqs
    • sns
    • autoscaling

If you want to use Systems Manager Session Manager to test the cluster, you also need to set up the following PrivateLinks.

  • ssmmessages
  • ec2messages
  • ssm

2. Launch AWS ParallelCluster

Launch ParallelCluster with a config file written for the closed network environment.

  • closed-network.config
[aws]
aws_region_name = <REGION>

[global]
update_check = true
sanity_check = true
cluster_template = closed

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

[cluster closed]
key_name = <KEY_NAME>
additional_iam_policies = arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
base_os = alinux2
scheduler = slurm
master_instance_type = c5.large
compute_instance_type = c5.xlarge
disable_hyperthreading = true
initial_queue_size = 0
max_queue_size = 10
vpc_settings = closed

[vpc closed]
vpc_id = <VPC_ID>
master_subnet_id = <SUBNET_ID>
use_public_ips = false

Replace <REGION>, <KEY_NAME>, <VPC_ID>, and <SUBNET_ID> with your own values. You can find <VPC_ID> and <SUBNET_ID> in the outputs of the CloudFormation stack.

The AmazonSSMManagedInstanceCore policy is required to connect to the Master node with Session Manager.
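One way to look up the stack outputs and fill in the placeholders is sketched below. The helper names show_stack_outputs and render_config are hypothetical, and the output key names of the stack are not assumed: the IDs are read from the table by eye and passed in as arguments.

```shell
#!/bin/bash
# Sketch: list the stack outputs, then substitute the four placeholders.

# usage: show_stack_outputs <stack-name>
show_stack_outputs() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].Outputs' --output table
}

# usage: render_config <config-file> <region> <key-name> <vpc-id> <subnet-id>
# Prints the filled-in config to stdout; the input file is left untouched.
render_config() {
  sed -e "s|<REGION>|$2|" \
      -e "s|<KEY_NAME>|$3|" \
      -e "s|<VPC_ID>|$4|" \
      -e "s|<SUBNET_ID>|$5|" \
      "$1"
}
```

For example, run show_stack_outputs ClosedEnvironment, then redirect the output of render_config to a new file and pass that file to pcluster with -c.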

$ pcluster create -c closed-network.config closed-cluster

3. Connect to the cluster with Session Manager

Open the Systems Manager page in the Management Console and select Session Manager. Select Start session, choose the Master instance, and start the session.

After connecting to the Master instance, you need to switch users before submitting jobs. Example input is shown below.
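The same session can also be started from a terminal, provided the AWS CLI and the Session Manager plugin are installed locally. The sketch below is an assumption-heavy illustration: it assumes the Master instance carries the tags Name=Master and Application=parallelcluster-<cluster name>, which should be verified in the EC2 console before relying on it. The function names are hypothetical.

```shell
#!/bin/bash
# Sketch: locate the Master instance by tag and open a Session Manager session.
# ASSUMPTION: the tags Name=Master and Application=parallelcluster-<name> are
# present on the Master instance; check your instance's tags first.

# usage: find_master_instance <cluster-name>
find_master_instance() {
  aws ec2 describe-instances \
    --filters "Name=tag:Application,Values=parallelcluster-$1" \
              "Name=tag:Name,Values=Master" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' --output text
}

# usage: connect_master <cluster-name>
connect_master() {
  local instance_id
  instance_id="$(find_master_instance "$1")"
  if [ -z "$instance_id" ]; then
    echo "no running Master instance found for $1" >&2
    return 1
  fi
  aws ssm start-session --target "$instance_id"
}
```

For the cluster created above, this would be invoked as connect_master closed-cluster.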

$ sudo su ec2-user
$ cd
$ cat > job.sh << 'EOF'
#!/bin/bash
hostname
EOF
$ sbatch job.sh
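Because initial_queue_size is 0, the first job has to wait for a compute node to start. A small helper such as the one below, run on the Master instance, can submit the job and print its output once it leaves the queue. This is a sketch: the function name submit_and_wait is hypothetical, and it assumes Slurm's default output file name (slurm-<jobid>.out in the submission directory).

```shell
#!/bin/bash
# Sketch: submit a job, poll until it is no longer queued or running,
# then print its output file.

# usage: submit_and_wait <job-script>
submit_and_wait() {
  local job_id
  # --parsable prints "jobid" or "jobid;cluster" instead of a sentence.
  job_id="$(sbatch --parsable "$1")" || return 1
  job_id="${job_id%%;*}"
  # squeue -h suppresses the header, so empty output means the job is gone.
  while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
    sleep 10
  done
  cat "slurm-$job_id.out"
}
```

Running submit_and_wait job.sh should eventually print the hostname of the compute node that ran the job.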

Cost

PrivateLink endpoints incur additional cost.

https://aws.amazon.com/privatelink/pricing/

Notes

  • Currently, Amazon Linux and Amazon Linux 2 can be used in a closed network environment.
  • In a closed network environment, scale-out seems to take a few more minutes because instances wait for connections to external repositories to time out.
  • You can restrict access to S3 buckets by setting a PolicyDocument on the Private Endpoint for S3, but you need to allow the following buckets.
    • for ParallelCluster
      • arn:aws:s3:::${AWS::Region}-aws-parallelcluster/*
    • for Amazon Linux
      • arn:aws:s3:::packages.${AWS::Region}.amazonaws.com/*
      • arn:aws:s3:::repo.${AWS::Region}.amazonaws.com/*
    • for Amazon Linux 2
      • arn:aws:s3:::amazonlinux.${AWS::Region}.amazonaws.com
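A policy document for the S3 endpoint covering the buckets above could be generated per region as sketched below. Several details are assumptions rather than something this README states: the action is limited to s3:GetObject, the Amazon Linux 2 bucket is given a trailing /* so that it matches objects, and the function name s3_endpoint_policy is hypothetical. A real cluster may need further permissions, so treat this as a starting point.

```shell
#!/bin/bash
# Sketch: print an S3 endpoint policy that allows reads from the buckets
# listed above for one region.
# ASSUMPTIONS: s3:GetObject is sufficient; the Amazon Linux 2 bucket ARN is
# written with /* so that it matches objects rather than the bucket itself.

# usage: s3_endpoint_policy <region>
s3_endpoint_policy() {
  local region="$1"
  cat << EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::${region}-aws-parallelcluster/*",
        "arn:aws:s3:::packages.${region}.amazonaws.com/*",
        "arn:aws:s3:::repo.${region}.amazonaws.com/*",
        "arn:aws:s3:::amazonlinux.${region}.amazonaws.com/*"
      ]
    }
  ]
}
EOF
}
```

The output can be saved to a file and applied with aws ec2 modify-vpc-endpoint --vpc-endpoint-id <ENDPOINT_ID> --policy-document file://policy.json, where <ENDPOINT_ID> is the ID of your S3 endpoint.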

License:Apache License 2.0

