poldridge / terraform-gitlab-runner-aws-spot

Terraform module to provision self-hosted autoscaling Gitlab runner on AWS spot instances

terraform-gitlab-runner-aws-spot

Terraform module to provision self-hosted autoscaling Gitlab runner on AWS spot instances.

Terraform versions

Terraform 0.12. Pin module version to ~> 1.0. Submit pull-requests to master branch.


This module is 100% open source and licensed under the Apache License 2.0.

Introduction

This module provisions a self-hosted Gitlab runner with docker+machine executor and auto-scaling configuration.

Architecture

The architecture is fairly standard. It consists mainly of an EC2 instance (the manager) that has all required software installed and automatically registers itself with Gitlab. The manager spawns worker instances that run CI/CD jobs; it does not run any jobs itself.

Features:

Implementation notes:

  • This module is designed to work with Amazon Linux 2 AMIs. Other Linux distros most likely won't work!

Security considerations:

  • SSM Session Manager is the recommended way of accessing the manager instance, as it provides centralized access control, full auditing and activity logging
  • Consider limiting self-hosted runners to private and internal repositories, as running CI/CD pipelines for public repositories on your infrastructure introduces additional attack surface
  • Consider dedicating a separate VPC, subnets and AWS (sub)account to Gitlab runners to reduce blast radius and attack surface. Setting a budget and a billing alarm for your infrastructure may also be a wise choice

Cost optimization recommendations:

  • Consider purchasing a Savings Plan or Reserved Instance for the manager instance
  • Consider using AMD-powered EC2 instance types for the manager instance (they were 10% cheaper than Intel-powered instances at the time of writing)

Other recommendations:

  • If you use the distributed cache feature, consider provisioning a Gateway VPC Endpoint for S3 and routing all S3 traffic through it. This avoids additional data transfer charges and keeps the traffic on the AWS backbone network
  • Make sure you are acquainted with the Caveats related to using Spot instances for running CI/CD jobs
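
The Gateway VPC Endpoint recommendation can be sketched roughly as follows. This resource is not created by the module. The variable names vpc_id and route_table_ids are hypothetical placeholders, and the service name assumes the us-west-2 region used in the usage example.

```hcl
# Hypothetical inputs: the VPC hosting the runners and the route tables
# of the subnets where the manager and worker instances are launched.
variable "vpc_id" {
  type = string
}

variable "route_table_ids" {
  type = list(string)
}

# Gateway endpoint for S3. Associating it with the route tables adds
# routes that send S3 traffic through the endpoint.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.us-west-2.s3" # assumes us-west-2
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.route_table_ids
}
```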

Backlog:

  • Allow manager instance deployment as ECS service with Fargate launch type
  • Switch to Circle CI for CI/CD pipelines
  • Add tests
  • Add examples to the repo
  • Support Autoscaling periods
  • Add an option to request regular on-demand instances instead of the spot

This module is backed by best of breed terraform modules maintained by Cloudposse.

Usage

IMPORTANT: The master branch is used in source just as an example. In your code, do not pin to master because there may be breaking changes between releases. Instead pin to the release tag (e.g. ?ref=tags/x.y.z) of one of our latest releases.

This example creates a Gitlab runner in the us-west-2 region, availability zone d (us-west-2d), with the registration token passed via the registration_token input.

data "aws_ami" "amzn_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-ebs"]
  }
}

data "aws_ami" "ubuntu_18_04" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }
}

module "gitlab_runner" {
  source    = "git::https://github.com/aleks-fofanov/terraform-gitlab-runner-aws-spot.git?ref=master"
  name      = "stack"
  namespace = "cp"
  stage     = "prod"

  region            = "us-west-2"
  availability_zone = "d"

  registration_token = "XXXXXXXX"

  vpc = {
    vpc_id     = "XXXXXXXX"
    cidr_block = "10.0.0.0/16"
  }

  manager = {
    ami_id                      = data.aws_ami.amzn_linux_2.id
    ami_owner                   = "amazon"
    instance_type               = "t3.micro"
    key_pair                    = null
    subnet_id                   = "subnet-XXXXXXXX"
    associate_public_ip_address = true
    assign_eip_address          = false
    enable_detailed_monitoring  = false
    root_volume_size            = 8
    ebs_optimized               = false
  }

  runner = {
    concurrent = 2
    limit      = 2
    tags       = ["shared", "docker", "spot", "us-west-2d"]
    image      = "docker:19.03.8"

    instance_type       = "c5.large"
    ami_id              = data.aws_ami.ubuntu_18_04.id
    use_private_address = true

    run_untagged    = false
    lock_to_project = true

    request_spot_instances = true
    spot_bid_price         = 0.09
    spot_block_duration    = 60

    idle = {
      count = 0
      time  = 1200
    }

    autoscaling_periods = [
      {
        periods    = ["* * 9-17 * * mon-fri *"]
        idle_count = 1
        idle_time  = 1200
        timezone   = "UTC"
      }
    ]
  }
}
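
The example above relies on the default runner_advanced_config. That input is declared as an object type, so Terraform 0.12 expects every attribute to be present when you override it. Below is a minimal sketch, assuming you only want to change max_builds. The value 10 is an arbitrary illustration; all other values are copied from the defaults listed in the Inputs section.

```hcl
locals {
  # Full object with the documented defaults; only max_builds differs.
  runner_advanced_config = {
    pre_build_script                  = ""
    post_build_script                 = ""
    pre_clone_script                  = ""
    environment                       = []
    request_concurrency               = 1
    output_limit                      = 4096
    shm_size                          = 0
    max_builds                        = 10 # illustrative value, default is 0
    pull_policy                       = "always"
    additional_volumes                = ["/certs/client"]
    additional_docker_machine_options = []
    root_volume_size                  = 20
    ebs_optimized                     = false
    enable_detailed_monitoring        = false
  }
}
```

The local value can then be passed to the module block as runner_advanced_config = local.runner_advanced_config.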

Examples

For more examples please refer to the example folder in this repo.

Requirements

| Name | Version |
|------|---------|
| terraform | ~> 0.12.0 |
| aws | ~> 2.12 |
| local | ~> 1.2 |
| null | ~> 2.1 |

Providers

| Name | Version |
|------|---------|
| aws | ~> 2.12 |
| null | ~> 2.1 |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| additional_security_groups | List of Security Group IDs allowed to be associated with manager instance | list(string) | [] | no |
| allowed_metrics_cidr_blocks | CIDR blocks that should be able to access metrics port exposed on manager instance | list(object({ cidr_blocks = list(string), description = string })) | [] | no |
| allowed_ssh_cidr_blocks | CIDR blocks that should be able to communicate with manager's 22 port | list(object({ cidr_blocks = list(string), description = string })) | [] | no |
| attributes | Additional attributes, e.g. 1 | list(string) | [] | no |
| authentication_token_ssm_param | An override for SSM Parameter name that will store runner authentication token | string | null | no |
| authentication_token_ssm_param_kms_key | Identifier of KMS key used for encryption of SSM Parameter that will store authentication token | string | null | no |
| availability_zone | Availability Zone (e.g. a, b, c etc.) for instances to be launched in | string | "a" | no |
| cloudwatch_logs_kms_key_arn | The ARN of the KMS Key to use when encrypting log data. Please note, after the AWS KMS CMK is disassociated from the log group, AWS CloudWatch Logs stops encrypting newly ingested data for the log group. All previously ingested data remains encrypted, and AWS CloudWatch Logs requires permissions for the CMK whenever the encrypted data is requested. | string | null | no |
| cloudwatch_logs_retention | Number of days you want to retain log events in Cloudwatch log group | number | 30 | no |
| create_service_linked_roles | Defines whether required service-linked roles should be created | bool | true | no |
| delimiter | Delimiter to be used between namespace, name, stage and attributes | string | "-" | no |
| docker_machine_version | Docker machine version to be installed on manager instance | string | "0.16.2-gitlab.2" | no |
| enable_access_to_ecr_repositories | A list of ECR repositories in specified region that manager instance should have read-only access to | list(string) | [] | no |
| enable_cloudwatch_logs | Defines whether manager instance should ship its logs to Cloudwatch | bool | true | no |
| enable_s3_cache | Defines whether s3 should be created and used as a source for distributed cache | bool | true | no |
| enable_ssm_sessions | Defines whether access via SSM Session Manager should be enabled for manager instance | bool | true | no |
| gitlab_runner_version | Gitlab runner version to be installed on manager instance | string | "13.2.0" | no |
| gitlab_url | Gitlab URL | string | "https://gitlab.com" | no |
| manager | Runners' manager (aka bastion) configuration | object({ ami_id = string, ami_owner = string, instance_type = string, key_pair = string, subnet_id = string, associate_public_ip_address = bool, assign_eip_address = bool, root_volume_size = number, ebs_optimized = bool, enable_detailed_monitoring = bool }) | n/a | yes |
| metrics_port | See https://docs.gitlab.com/runner/monitoring/#configuration-of-the-metrics-http-server for more details | number | 9252 | no |
| name | Solution name, e.g. 'app' or 'jenkins' | string | n/a | yes |
| namespace | Namespace (e.g. cp or cloudposse) | string | "" | no |
| region | AWS Region identifier for instances to be launched in | any | n/a | yes |
| registration_token | Runner registration token | string | null | no |
| registration_token_ssm_param | SSM Parameter name that stores runner registration token. This parameter takes precedence over registration_token | string | null | no |
| registration_token_ssm_param_kms_key | Identifier of KMS key used for encryption of SSM Parameter that stores registration token | string | null | no |
| runner | Gitlab runner configuration. See https://docs.gitlab.com/runner/configuration/advanced-configuration.html | object({ concurrent = number, limit = number, image = string, tags = list(string), use_private_address = bool, instance_type = string, ami_id = string, run_untagged = bool, lock_to_project = bool, idle = object({ count = number, time = number }), autoscaling_periods = list(object({ periods = list(string), idle_count = number, idle_time = number, timezone = string })), request_spot_instances = bool, spot_bid_price = number, spot_block_duration = number }) | n/a | yes |
| runner_advanced_config | Advanced configuration options for gitlab runner | object({ pre_build_script = string, post_build_script = string, pre_clone_script = string, environment = list(string), request_concurrency = number, output_limit = number, shm_size = number, max_builds = number, pull_policy = string, additional_volumes = list(string), additional_docker_machine_options = list(string), root_volume_size = number, ebs_optimized = bool, enable_detailed_monitoring = bool }) | { "additional_docker_machine_options": [], "additional_volumes": ["/certs/client"], "ebs_optimized": false, "enable_detailed_monitoring": false, "environment": [], "max_builds": 0, "output_limit": 4096, "post_build_script": "", "pre_build_script": "", "pre_clone_script": "", "pull_policy": "always", "request_concurrency": 1, "root_volume_size": 20, "shm_size": 0 } | no |
| s3_cache_expiration | Number of days you want to retain cache in S3 bucket | number | 45 | no |
| s3_cache_infrequent_access_transition | Number of days to persist in the standard storage tier before moving to the infrequent access tier | number | 30 | no |
| stage | Stage (e.g. prod, dev, staging) | string | "" | no |
| tags | Additional tags (e.g. map(BusinessUnit,XYZ)) | map(string) | {} | no |
| vpc | VPC configuration | object({ vpc_id = string, cidr_block = string }) | n/a | yes |
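
As noted for registration_token_ssm_param, that input takes precedence over registration_token. One way to avoid hard-coding the token in the module call is sketched below. The parameter name /gitlab-runner/registration-token and the variable gitlab_registration_token are hypothetical, and how the module reads the parameter internally is not shown here. Note that a value managed this way is still recorded in the Terraform state.

```hcl
# Hypothetical variable: supply the token from outside the configuration,
# for example through a TF_VAR_gitlab_registration_token environment variable.
variable "gitlab_registration_token" {
  type        = string
  description = "Gitlab runner registration token"
}

# Store the token as an encrypted SSM parameter.
# The parameter name is an assumption made for this sketch.
resource "aws_ssm_parameter" "registration_token" {
  name  = "/gitlab-runner/registration-token"
  type  = "SecureString"
  value = var.gitlab_registration_token
}

# In the module "gitlab_runner" block, reference the parameter by name
# instead of passing the token directly:
#
#   registration_token           = null
#   registration_token_ssm_param = aws_ssm_parameter.registration_token.name
```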

Outputs

| Name | Description |
|------|-------------|
| auth_token_ssm_param_arn | ARN of SSM Parameter that stores runner's authentication token |
| auth_token_ssm_param_name | Name of SSM Parameter that stores runner's authentication token |
| manager_instance | Disambiguated ID of manager instance |
| manager_instance_cloudwatch_alarm | CloudWatch Alarm ID created for manager instance |
| manager_instance_cloudwatch_log_group_arn | ARN of CloudWatch Log Group created for manager instance |
| manager_instance_cloudwatch_log_group_name | Name of CloudWatch Log Group created for manager instance |
| manager_instance_name | Manager instance name |
| manager_instance_policy_arn | ARN of AWS IAM Policy associated with manager instance IAM role |
| manager_instance_policy_name | Name of AWS IAM Policy associated with manager instance IAM role |
| manager_instance_primary_security_group_id | ID of the security group created for and associated with manager instance |
| manager_instance_private_dns | Private DNS of manager instance |
| manager_instance_private_ip | Private IP of manager instance |
| manager_instance_public_dns | Public DNS of manager instance (or DNS of EIP) |
| manager_instance_public_ip | Public IP of manager instance (or EIP) |
| manager_instance_role_arn | ARN of AWS IAM Role associated with manager instance |
| manager_instance_role_name | Name of AWS IAM Role associated with manager instance |
| manager_instance_security_group_ids | List of IDs of all security groups associated with manager instance |
| manager_instance_ssh_key_pair | Name of the SSH key pair provisioned on manager instance |
| runner_instance_primary_security_group_id | ID of the security group created for and associated with runner instance(s) |
| runner_instance_role_arn | ARN of AWS IAM Role associated with runner instance(s) |
| runner_instance_role_name | Name of AWS IAM Role associated with runner instance(s) |
| s3_cache_bucket_arn | Cache bucket ARN |
| s3_cache_bucket_id | Cache bucket Name (aka ID) |
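
The outputs can be re-exported from the root module or passed to other resources. A short sketch, assuming the module block is named gitlab_runner as in the Usage section (the output names on the left are hypothetical):

```hcl
# Private IP of the manager, e.g. for monitoring or firewall rules.
output "gitlab_runner_manager_private_ip" {
  value = module.gitlab_runner.manager_instance_private_ip
}

# Name of the S3 bucket used for the distributed cache.
output "gitlab_runner_cache_bucket" {
  value = module.gitlab_runner.s3_cache_bucket_id
}

# Security group of the manager, useful when other resources need to
# allow traffic from it.
output "gitlab_runner_manager_security_group_id" {
  value = module.gitlab_runner.manager_instance_primary_security_group_id
}
```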

Help

Got a question?

File a GitHub issue.

Contributing

Bug Reports & Feature Requests

Please use the issue tracker to report any bugs or file feature requests.

Developing

In general, PRs are welcome. We follow the typical "fork-and-pull" Git workflow.

  1. Fork the repo on GitHub
  2. Clone the project to your own machine
  3. Commit changes to your own branch
  4. Push your work back up to your fork
  5. Submit a Pull Request so that we can review your changes

NOTE: Be sure to merge the latest changes from "upstream" before making a pull request!

Copyright

Copyright © 2017-2020 Aleksandr Fofanov

License

See LICENSE for full details.

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.

Trademarks

All other trademarks referenced herein are the property of their respective owners.

Contributors

Aleksandr Fofanov
