pgasior / terraform-aws-metaflow

Deploy production-grade Metaflow cloud infrastructure on AWS

Home Page: https://registry.terraform.io/modules/outerbounds/metaflow/aws/latest

Metaflow Terraform module

Terraform module that provisions AWS resources to run Metaflow in production.

This module consists of submodules that can be used separately as well:

[Figure: modules diagram]

You can use either this high-level module or the submodules individually. See each module's corresponding README.md for more details.

A complete example that uses this module and also sets up a VPC and other non-Metaflow-specific infrastructure is available in this repo.
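As a rough illustration of using the high-level module, the sketch below sets only the inputs marked as required in the Inputs table. The module label `metaflow`, the registry address (inferred from the home page URL, and assumed to expose the same inputs as listed in this README), and every ID, CIDR, ARN, and tag value are placeholders rather than values from this repository.

```hcl
# Sketch only: every value below is a hypothetical placeholder.
module "metaflow" {
  # Registry address inferred from the home page URL (assumption).
  # Pin a specific version in real use.
  source = "outerbounds/metaflow/aws"

  # Inputs listed as required (no default) in the Inputs table.
  enable_step_functions = true
  vpc_id                = "vpc-00000000"   # placeholder VPC ID
  vpc_cidr_block        = "10.0.0.0/16"    # placeholder CIDR of that VPC
  subnet1_id            = "subnet-aaaaaaaa" # placeholder, first availability zone
  subnet2_id            = "subnet-bbbbbbbb" # placeholder, second availability zone
  ui_certificate_arn    = "arn:aws:acm:us-west-2:123456789012:certificate/example"

  tags = {
    project = "metaflow-example"
  }

  # Optional inputs (batch_type, resource_prefix, and so on)
  # fall back to the defaults listed in the Inputs table.
}
```

The VPC, subnets, and certificate are expected to exist already; this module does not create them.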

Modules

| Name | Source | Version |
|------|--------|---------|
| metaflow-computation | ./modules/computation | n/a |
| metaflow-datastore | ./modules/datastore | n/a |
| metaflow-metadata-service | ./modules/metadata-service | n/a |
| metaflow-step-functions | ./modules/step-functions | n/a |
| metaflow-ui | ./modules/ui | n/a |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| access_list_cidr_blocks | List of CIDRs we want to grant access to our Metaflow Metadata Service. Usually this is our VPN's CIDR blocks. | list(string) | [] | no |
| api_basic_auth | Enable basic auth for API Gateway? (requires key export) | bool | true | no |
| batch_type | AWS Batch Compute Type ('ec2', 'fargate') | string | "ec2" | no |
| compute_environment_desired_vcpus | Desired Starting VCPUs for Batch Compute Environment [0-16] for EC2 Batch Compute Environment (ignored for Fargate) | number | 8 | no |
| compute_environment_instance_types | The instance types for the compute environment | list(string) | ["c4.large", "c4.xlarge", "c4.2xlarge", "c4.4xlarge", "c4.8xlarge"] | no |
| compute_environment_max_vcpus | Maximum VCPUs for Batch Compute Environment [16-96] | number | 64 | no |
| compute_environment_min_vcpus | Minimum VCPUs for Batch Compute Environment [0-16] for EC2 Batch Compute Environment (ignored for Fargate) | number | 8 | no |
| enable_custom_batch_container_registry | Provisions infrastructure for custom Amazon ECR container registry if enabled | bool | false | no |
| enable_step_functions | Provisions infrastructure for step functions if enabled | bool | n/a | yes |
| extra_ui_backend_env_vars | Additional environment variables for UI backend container | map(string) | {} | no |
| extra_ui_static_env_vars | Additional environment variables for UI static app | map(string) | {} | no |
| iam_partition | IAM Partition (Select aws-us-gov for AWS GovCloud, otherwise leave as is) | string | "aws" | no |
| resource_prefix | string prefix for all resources | string | "metaflow" | no |
| resource_suffix | string suffix for all resources | string | "" | no |
| subnet1_id | First subnet used for availability zone redundancy | string | n/a | yes |
| subnet2_id | Second subnet used for availability zone redundancy | string | n/a | yes |
| tags | aws tags | map(string) | n/a | yes |
| ui_certificate_arn | SSL certificate for UI | string | n/a | yes |
| vpc_cidr_block | The VPC CIDR block that we'll access list on our Metadata Service API to allow all internal communications | string | n/a | yes |
| vpc_id | The id of the single VPC we stood up for all Metaflow resources to exist in. | string | n/a | yes |

Outputs

| Name | Description |
|------|-------------|
| METAFLOW_BATCH_JOB_QUEUE | AWS Batch Job Queue ARN for Metaflow |
| METAFLOW_DATASTORE_SYSROOT_S3 | Amazon S3 URL for Metaflow DataStore |
| METAFLOW_DATATOOLS_S3ROOT | Amazon S3 URL for Metaflow DataTools |
| METAFLOW_ECS_S3_ACCESS_IAM_ROLE | Role for AWS Batch to Access Amazon S3 |
| METAFLOW_EVENTS_SFN_ACCESS_IAM_ROLE | IAM role for Amazon EventBridge to access AWS Step Functions. |
| METAFLOW_SERVICE_INTERNAL_URL | URL for Metadata Service (Accessible in VPC) |
| METAFLOW_SERVICE_URL | URL for Metadata Service (Accessible in VPC) |
| METAFLOW_SFN_DYNAMO_DB_TABLE | AWS DynamoDB table name for tracking AWS Step Functions execution metadata. |
| METAFLOW_SFN_IAM_ROLE | IAM role for AWS Step Functions to access AWS resources (AWS Batch, AWS DynamoDB). |
| api_gateway_rest_api_id_key_id | API Gateway Key ID for Metadata Service. Fetch Key from AWS Console [METAFLOW_SERVICE_AUTH_KEY] |
| datastore_s3_bucket_kms_key_arn | The ARN of the KMS key used to encrypt the Metaflow datastore S3 bucket |
| metadata_svc_ecs_task_role_arn | n/a |
| metaflow_api_gateway_rest_api_id | The ID of the API Gateway REST API we'll use to accept MetaData service requests to forward to the Fargate API instance |
| metaflow_batch_container_image | The ECR repo containing the metaflow batch image |
| metaflow_profile_json | Metaflow profile JSON object that can be used to communicate with this Metaflow Stack. Store this in ~/.metaflow/config_[stack-name] and select with $ export METAFLOW_PROFILE=[stack-name]. |
| metaflow_s3_bucket_arn | The ARN of the bucket we'll be using as blob storage |
| metaflow_s3_bucket_name | The name of the bucket we'll be using as blob storage |
| migration_function_arn | ARN of DB Migration Function |
| ui_alb_dns_name | UI ALB DNS name |
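Module outputs are only visible from the command line if the root configuration re-exports them. The fragment below is a minimal sketch of doing that for two of the outputs listed above. It assumes a `module "metaflow"` block for this module already exists in the same root configuration; the root-level output name `metaflow_ui_dns_name` and both descriptions are illustrative choices, not part of the module.

```hcl
# Assumption: a `module "metaflow"` block calling this module is declared
# elsewhere in the same root configuration.

# Re-export the generated Metaflow profile so it can be read after apply.
output "metaflow_profile_json" {
  description = "Metaflow profile to save as ~/.metaflow/config_[stack-name]"
  value       = module.metaflow.metaflow_profile_json
}

# Re-export the UI load balancer DNS name (hypothetical root output name).
output "metaflow_ui_dns_name" {
  description = "DNS name of the load balancer in front of the Metaflow UI"
  value       = module.metaflow.ui_alb_dns_name
}
```

After `terraform apply`, `terraform output metaflow_profile_json` prints the profile. Whether `terraform output -raw` or `terraform output -json` is the right way to write it to a file depends on whether the module emits a string or an object, which the table above does not specify.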

License: Apache License 2.0

