chore: update test iac (#223)
Co-authored-by: UncleGedd <42304551+UncleGedd@users.noreply.github.com>
TristanHoladay and UncleGedd authored Aug 20, 2024
1 parent e3f5f71 commit 9f21f91
Showing 4 changed files with 184 additions and 59 deletions.
82 changes: 82 additions & 0 deletions .github/test-infra/README.md
@@ -0,0 +1,82 @@
# Runtime Ephemeral Infrastructure

The UDS Runtime IAC is used by the [nightly-infra workflow](../workflows/nightly-infra.yaml), via [uds tasks](./tasks/infra.yaml), to destroy and create ephemeral testing clusters, using the latest `nightly-unstable` image of UDS Runtime.

## How it Works

When the nightly workflow kicks off, it runs `tofu init` with the backend variables defined in the workflow, then destroys the currently running EC2 instance and related infra. After removing the old infra, it creates a new EC2 instance in the UDS CI AWS account that, on startup, does the following:

1. clone the [uds-k3d](https://github.com/defenseunicorns/uds-k3d) repo, setting `nginx.conf` to redirect for the `.burning.boats` domain
1. run the default task of `uds-k3d`, creating the k3d cluster on the instance
1. set up the `kubecontext` to be used by `uds`
1. pull the `.burning.boats` tls cert and key from secrets manager
1. deploy the `init` and `UDS Core` packages
1. deploy the `UDS Runtime` package
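
The init / destroy / create cycle described above can be sketched roughly as follows. This is a minimal sketch, not the workflow's actual script: the backend values (`STATE_BUCKET`, `STATE_KEY`, `LOCK_TABLE`) are hypothetical placeholders, and an S3 backend with a DynamoDB lock table is assumed. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```bash
#!/usr/bin/env bash
# Sketch of the nightly cycle: init with backend config, destroy, re-create.
set -euo pipefail

# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical backend settings; the real values live in the workflow.
STATE_BUCKET="${STATE_BUCKET:-example-state-bucket}"
STATE_KEY="${STATE_KEY:-runtime/terraform.tfstate}"
STATE_REGION="${STATE_REGION:-us-east-1}"
LOCK_TABLE="${LOCK_TABLE:-example-lock-table}"

# Run a command, or just print it when in dry-run mode.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

nightly_cycle() {
  # Pass backend settings at init time rather than hardcoding them.
  run tofu init \
    -backend-config="bucket=${STATE_BUCKET}" \
    -backend-config="key=${STATE_KEY}" \
    -backend-config="region=${STATE_REGION}" \
    -backend-config="dynamodb_table=${LOCK_TABLE}"
  # Tear down the previous night's instance, then build a fresh one.
  run tofu destroy -auto-approve
  run tofu apply -auto-approve
}

nightly_cycle
```

Destroying before applying means every nightly run starts from a clean instance instead of mutating the previous one in place.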

## Custom AMI

The EC2 instance is created from a custom AMI. We use `packer` to define the AMI in [runtime.pkr.hcl](./packer/runtime.pkr.hcl) and to build and push it to our AWS accounts.

***Only needed if you're updating the AMI***

Prerequisites:
* [packer](https://developer.hashicorp.com/packer/tutorials/docker-get-started/get-started-install-cli)
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)

*Don't forget to authenticate to the AWS account*
```bash
cd .github/test-infra/packer
packer init runtime.pkr.hcl
packer build runtime.pkr.hcl
```
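
Before running the build, it can help to confirm that the prerequisites are installed and that the terminal is authenticated. The helpers below are a hypothetical sketch (the names `require_tools` and `check_aws_auth` are not part of this repo); `aws sts get-caller-identity` is a standard AWS CLI call that fails when no valid credentials are present.

```bash
#!/usr/bin/env bash
# Preflight helpers: verify required CLIs exist and AWS credentials work.

# Return non-zero if any of the named tools is not on PATH.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Print the account ID of the current credentials; fails if unauthenticated.
check_aws_auth() {
  aws sts get-caller-identity --query Account --output text
}
```

For example, `require_tools packer aws && check_aws_auth` prints the account ID you are about to push the AMI to, which is an easy way to catch being logged in to the wrong account.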

> **NOTE**
> Please delete old instances of the AMI from whatever AWS account you push to

## Development and Testing

> **NOTE**
> **Please use the UDS Dev AWS Account instead of CI**

For local development and testing:

Prerequisites:
* [opentofu](https://opentofu.org/)
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)

1. Make sure your terminal is authenticated to the AWS Dev account
1. Create a state bucket and dynamo table (either via CLI or through UI)
1. Alter the [variables](./terraform/variables.tf)
   * Set the region to `us-east-1`.
   * Set the permissions boundary ARN and name. You can find both under Policies in the IAM console.
   * If you want to debug using SSH, enable SSH and add your public IP.
1. Comment out the EIP association in [main.tf](./terraform/main.tf). This EIP is a dedicated EIP in the CI account attached to the `runtime-canary.burning.boats` domain.
1. Init and Apply:

Via uds task from the root level of this repo: `uds run -f .github/test-infra/tasks/infra.yaml create-iac`

OR:

```bash
cd .github/test-infra/terraform
tofu init
tofu apply -auto-approve
```
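
The state bucket and Dynamo table from step 2 could be created from the CLI along these lines. This is a sketch under assumptions: the bucket and table names are hypothetical, and the table uses a string partition key named `LockID`, which is what the S3 backend expects for state locking. The script prints the commands by default; set `DRY_RUN=0` to run them against the account you are authenticated to.

```bash
#!/usr/bin/env bash
# Sketch: create an S3 state bucket and a DynamoDB lock table for OpenTofu.
set -euo pipefail

# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

REGION="us-east-1"
# Hypothetical names; pick your own (bucket names must be globally unique).
STATE_BUCKET="${STATE_BUCKET:-my-runtime-dev-tfstate}"
LOCK_TABLE="${LOCK_TABLE:-my-runtime-dev-tflock}"

# Run a command, or just print it when in dry-run mode.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

create_state_backend() {
  # us-east-1 needs no LocationConstraint on bucket creation.
  run aws s3api create-bucket --bucket "$STATE_BUCKET" --region "$REGION"
  # The lock table needs a string partition key called LockID.
  run aws dynamodb create-table \
    --table-name "$LOCK_TABLE" \
    --attribute-definitions AttributeName=LockID,AttributeType=S \
    --key-schema AttributeName=LockID,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST \
    --region "$REGION"
}

create_state_backend
```

Remember to delete both resources when you are done testing, since they live outside the Terraform state they support.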


> **WARNING**
> **DO NOT PUSH CHANGES TO VARIABLES SUCH AS ENABLING SSH AND PERMISSIONS BOUNDARY INFORMATION**

## Debug with SSH

If you enabled ssh and added your IP when developing locally, you can access your instance using the `runtime-dev.pem` that gets dropped in `.github/test-infra/terraform`.

```bash
ssh -i /path/to/runtime-dev.pem ubuntu@<public-ip>
```
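
OpenSSH refuses private keys that are readable by other users. If the connection fails with an "unprotected private key file" warning, tightening the permissions first may help. The helper name `prepare_key` below is hypothetical, and whether the generated `runtime-dev.pem` needs this at all depends on how it was written to disk.

```bash
#!/usr/bin/env bash
# Restrict a private key to owner read/write so ssh will accept it.
prepare_key() {
  local key="$1"
  if [ ! -f "$key" ]; then
    echo "key not found: $key" >&2
    return 1
  fi
  chmod 600 "$key"
}
```

Usage: `prepare_key .github/test-infra/terraform/runtime-dev.pem`, then run the `ssh` command above.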

## Debug with SSM

The EC2 instance is configured with SSM, so you can debug running clusters without SSH. To start an SSM session from the AWS console:

`Systems Manager` > click `Session Manager` under `Node Management` > click `start session` > select `runtime-ephemeral-*` > click `start session`
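
The same session can likely be started from the AWS CLI as well. The sketch below assumes the Session Manager plugin for the AWS CLI is installed and that the instance carries the `runtime-ephemeral-*` Name tag; the function names are hypothetical. `start_session` only prints the command by default; set `DRY_RUN=0` to open a real session.

```bash
#!/usr/bin/env bash
# Sketch: look up the ephemeral instance by Name tag and open an SSM session.

# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
NAME_PATTERN="runtime-ephemeral-*"

# Print the IDs of running instances whose Name tag matches the pattern.
find_instance_id() {
  aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=${NAME_PATTERN}" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}

# Open (or, in dry-run mode, print) an SSM session to the given instance.
start_session() {
  local instance_id="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ aws ssm start-session --target ${instance_id}"
  else
    aws ssm start-session --target "$instance_id"
  fi
}
```

Usage: `DRY_RUN=0 start_session "$(find_instance_id)"`.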
12 changes: 12 additions & 0 deletions .github/test-infra/packer/install-tools.sh
@@ -54,3 +54,15 @@ sudo ./aws/install
sudo snap install amazon-ssm-agent --classic
sudo systemctl enable snap.amazon-ssm-agent.amazon-ssm-agent.service
sudo systemctl start snap.amazon-ssm-agent.amazon-ssm-agent.service

# Set ulimit values for running core / software factory
echo "* soft nofile 1000000" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 1000000" | sudo tee -a /etc/security/limits.conf
echo "* soft nproc 8192" | sudo tee -a /etc/security/limits.conf
echo "* hard nproc 8192" | sudo tee -a /etc/security/limits.conf

# Update sysctl settings for running core / software factory
echo "fs.file-max = 1000000" | sudo tee -a /etc/sysctl.conf
echo "vm.max_map_count = 1524288" | sudo tee -a /etc/sysctl.conf
echo "fs.inotify.max_user_instances = 8192" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
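
Note that `tee -a` appends unconditionally, so running the script twice on the same machine would duplicate these entries. That is harmless for a one-shot AMI build, but if idempotence is ever wanted, a guard like the following could be used. This is a sketch only: `append_once` and the `SUDO` override are hypothetical and not part of the script.

```bash
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
# Writes via sudo by default; set SUDO="" to write as the current user.
append_once() {
  local line="$1" file="$2"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" | ${SUDO-sudo} tee -a "$file" >/dev/null
  fi
}
```

Usage: `append_once "fs.file-max = 1000000" /etc/sysctl.conf`.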
98 changes: 57 additions & 41 deletions .github/test-infra/terraform/main.tf
@@ -1,11 +1,23 @@
provider "aws" {
region = var.region

default_tags {
tags = {
Name = "runtime-ephemeral-${random_id.unique_id.hex}"
ManagedBy = "Terraform"
CreationDate = time_static.creation_time.rfc3339
nuke = "DO-NOT-DELETE"
PermissionsBoundary = "${var.permissions_boundary_name}"
}
}
}

resource "random_id" "unique_id" {
byte_length = 4
}

data "aws_partition" "current" {}

data "aws_caller_identity" "current" {}

data "aws_ami" "latest_runtime_ephemeral_ami" {
@@ -19,25 +31,16 @@ data "aws_ami" "latest_runtime_ephemeral_ami" {
owners = ["${data.aws_caller_identity.current.account_id}"]
}

locals {
suffix = random_id.unique_id.hex
tags = tomap({
"Name" = "runtime-ephemeral-${local.suffix}"
"ManagedBy" = "Terraform"
"CreationDate" = time_static.creation_time.rfc3339
"nuke" : "DO-NOT-DELETE"
"PermissionsBoundary" = "${var.permissions_boundary_name}"
})
}

resource "time_static" "creation_time" {}

#
# EC2 INSTANCE
#
resource "aws_instance" "runtime" {
ami = data.aws_ami.latest_runtime_ephemeral_ami.image_id
instance_type = "m5.2xlarge"
iam_instance_profile = aws_iam_instance_profile.runtime_profile.name
key_name = var.enable_ssh ? aws_key_pair.ssh[0].key_name : null
tags = local.tags

vpc_security_group_ids = [aws_security_group.security_group.id]
user_data = file("setup.sh")
@@ -48,26 +51,61 @@ resource "aws_instance" "runtime" {
}
}

// Get EIP ID
#
# EIP ASSOCIATION
#
data "aws_eip" "runtime_eip" {
filter {
name = "tag:Name"
values = ["runtime-ephemeral"]
}
}

// Attach EIP to Instance
resource "aws_eip_association" "runtime" {
resource "aws_eip_association" "runtime_eip_association" {
instance_id = aws_instance.runtime.id
allocation_id = data.aws_eip.runtime_eip.id
}

#
# IAM ROLE
#
resource "aws_iam_role" "runtime_role" {
name = "runtime-ephemeral-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
Service = "ec2.amazonaws.com"
}
Action = "sts:AssumeRole"
}
]
})
permissions_boundary = var.permissions_boundary_arn
}

resource "aws_iam_instance_profile" "runtime_profile" {
name = "runtime-ephemeral-EC2InstanceProfile"
role = aws_iam_role.runtime_role.name
tags = local.tags
}

#
# SSM POLICY
#
data "aws_iam_policy" "AmazonSSMManagedInstanceCore" {
arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_role_policy_attachment" "ssm_policy" {
role = aws_iam_role.runtime_role.name
policy_arn = data.aws_iam_policy.AmazonSSMManagedInstanceCore.arn
}

#
# SECRETS MANAGER POLICY
#
resource "aws_iam_policy" "secrets_manager_policy" {
name = "runtime-ephemeral-SecretsManagerPolicy"
description = "Allows access to specific secrets"
@@ -89,34 +127,14 @@ resource "aws_iam_policy" "secrets_manager_policy" {
})
}

resource "aws_iam_role" "runtime_role" {
name = "runtime-ephemeral-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
Service = "ec2.amazonaws.com"
}
Action = "sts:AssumeRole"
}
]
})
permissions_boundary = var.permissions_boundary_arn
tags = local.tags
}

resource "aws_iam_role_policy_attachment" "ssm_policy" {
role = aws_iam_role.runtime_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_role_policy_attachment" "secrets_manager_policy_attachment" {
role = aws_iam_role.runtime_role.name
policy_arn = aws_iam_policy.secrets_manager_policy.arn
}

#
# SECURITY GROUP
#
resource "aws_security_group" "security_group" {
name = "runtime-ephemeral-sg-${random_id.unique_id.hex}"
ingress {
@@ -151,8 +169,6 @@ resource "aws_security_group" "security_group" {
cidr_blocks = ["${var.ssh_ip}/32"]
}
}

tags = local.tags
}

#
51 changes: 33 additions & 18 deletions CONTRIBUTING.md
@@ -58,24 +58,35 @@ A list of runnable tasks from `uds run --list-all`

| Name | Description |
| -------------------- | -------------------------------------------------------------------------------------------------------------- |
| dev-server | run the api server in dev mode (requires air https://github.com/air-verse/air?tab=readme-ov-file#installation) |
| dev-ui | run the ui in dev mode |
| test:e2e | run end-to-end tests (assumes api server is running on port 8080) |
| test:go | run api server unit tests |
| test:ui-unit | run frontend unit tests |
| test:unit | run all unit tests (backend and frontend) |
| lint:all | Run all linters |
| lint:golangci | Run golang linters |
| lint:yaml | Run yaml linters |
| lint:ui | Run ui linters |
| lint:format-ui | Format ui code |
| setup:build-api | build the go api server |
| setup:build-ui | build ui |
| setup:slim-cluster | Create a k3d cluster and deploy core slim dev with metrics server |
| setup:simple-cluster | Create a k3d cluster, no core |
| setup:golangci | Install golangci-lint to GOPATH using install.sh |
| setup:clone-core | Clone uds-core for custom slim dev setup |
| setup:metrics-server | Create and deploy metrics server from cloned core |
| dev-server | run the api server in dev mode (requires air https://github.com/air-verse/air?tab=readme-ov-file#installation)
| dev-ui | run the ui in dev mode
| compile | compile the api server and ui outputting to build/
| test:e2e | run end-to-end tests (assumes api server is running on port 8080)
| test:go | run api server unit tests
| test:ui-unit | run frontend unit tests
| test:unit | run all unit tests (backend and frontend)
| test:deploy-load | deploy some Zarf packages to test against
| test:deploy-min-core | install min resources for UDS Core
| lint:all | Run all linters
| lint:golangci | Run golang linters
| lint:yaml | Run yaml linters
| lint:ui | Run ui lint and type check
| lint:format-ui | Format ui code
| setup:build-api | build the go api server for the local platform
| setup:build-api-linux-amd64 | build the go api server for linux amd64 (used for multi-arch container)
| setup:build-api-linux-arm64 | build the go api server for linux arm64 (used for multi-arch container)
| setup:build-ui | build ui
| setup:slim-cluster | Create a k3d cluster and deploy core slim dev with metrics server
| setup:simple-cluster | Create a k3d cluster, no core
| setup:golangci | Install golangci-lint to GOPATH using install.sh
| setup:clone-core | Clone uds-core for custom slim dev setup
| setup:metrics-server | Create and deploy metrics server from cloned core
| build:publish-uds-runtime | publish the uds runtime including its image and Zarf pkg (multi-arch)
| build:push-container | build container and push to GHCR (multi-arch)
| build:build-zarf-packages | build the uds runtime zarf packages (multi-arch)
| build:publish-zarf-packages | publish uds runtime zarf packages (multi-arch)
| swagger:generate | Generate Swagger docs
| swagger:test | Ensure no changes to Swagger docs

### Pre-Commit Hooks and Linting

@@ -103,3 +114,7 @@ E2E tests reside in the `ui/tests/` directory and can be named `*.test.ts` or `*
1. build the api server
1. setup the slim cluster (core-slim-dev + metrics server)
1. run the e2e script, which starts the api server (serves ui) to test against.

#### Ephemeral EC2 for Usability Tests

An ephemeral EC2 instance deploys the nightly release of UDS Runtime along with `UDS Core`. For more details, please see the test IAC [README.md](./.github/test-infra/README.md).
