This repository is for setting up a load testing environment on GKE with Terraform.

# Prerequisites
- gcloud >= Google Cloud SDK 349.0.0
- kubernetes-cli >= 1.22.1
- terraform >= 1.0.5
- python >= 3.9 (to generate the diagram)

# How to Set Up on GKE
## Configure Makefile
Copy `Makefile.example` and fill in the attributes below:

| Value | Description |
|:-- |:--|
| PROJECT_ID | GCP Project ID |
| CLUSTER_NAME | Cluster base name. Because cluster deletion takes time, this tool appends random text to the end of the base cluster name |
| REGION | GCP Region name |
| ZONE | GCP Zone name |
| MACHINE_TYPE | Machine type of the load-generating machines. See [machine types](https://cloud.google.com/compute/docs/general-purpose-machines) for more details |
| CREDENTIALS | The full path to the Service Account JSON file |
| SERVICE_ACCOUNT_EMAIL | Service account email, e.g. `[User name]@[Project name].iam.gserviceaccount.com` |
| TARGET_HOST | Target host URL |
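
For reference, a filled-in `Makefile` might look like the following sketch (every value here is a hypothetical placeholder; substitute your own):

```
PROJECT_ID=my-gcp-project
CLUSTER_NAME=loadtest
REGION=us-central1
ZONE=us-central1-a
MACHINE_TYPE=e2-standard-2
CREDENTIALS=/path/to/service-account.json
SERVICE_ACCOUNT_EMAIL=loadtest@my-gcp-project.iam.gserviceaccount.com
TARGET_HOST=https://example.com
```
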
## Set Up Google Kubernetes Cluster (GKE)
1. Navigate to the `deploy` folder and run
```
make init_all
```
to set up `terraform`.
1. Run
```
make build
```
to set up a GKE cluster and configure the `gcloud` command to point to the created GKE cluster.
1. Run
```
make a_locust
```
to set up `locust` and the required ConfigMaps (storing the load test scripts) for performance testing.
1. Run
```
make locust
```
This will port-forward the master to your local machine. You can then access the Locust master at `localhost:8089`.
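
Under the hood, the port forwarding is a `kubectl` command along these lines (the label selectors assume the default `locust-cluster` release name):

```
kubectl port-forward $(kubectl get pod --selector="app.kubernetes.io/instance=locust-cluster,app.kubernetes.io/name=locust,component=master,load_test=locust-cluster" --output jsonpath='{.items[0].metadata.name}') 8089:8089
```
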
1. Stop `make locust` and run
```
make refresh
```
This will refresh the Locust cluster with the updated `main.py` script and `values.yaml` content. Once the Locust cluster is up and running, connect to the master with `make locust`.

## Tear Down GKE Cluster
Run
```
make d_all
```
## Update Code for Load Testing
Whenever the load testing scripts are updated, the workers need to be redeployed so they pick up the latest ConfigMaps where the scripts are stored, per the Kubernetes specification. The steps below let you do the whole update with one command.

1. If you are already connected to the load cluster with `make locust`, press Ctrl+C to stop it.
1. All code is stored under the `locust` directory. `main.py` contains the main logic, and libraries live under the `lib` directory (see the sketch after this list).
1. Once the code is updated, run
```
make refresh
```
to reload the `ConfigMap` and restart the Locust pods so they read the updated scripts.
1. Run `make locust` again to connect to the load cluster.
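
For orientation, a minimal script of the kind `main.py` contains might look like the sketch below (the class name and endpoint are hypothetical; the repository's actual `main.py` defines the real test logic):

```python
# Minimal Locust test script sketch (hypothetical example)
from locust import HttpUser, task, between


class WebsiteUser(HttpUser):
    # Simulate think time: wait 1-2 seconds between tasks
    wait_time = between(1, 2)

    @task
    def index(self):
        # Paths are resolved against the configured target host
        self.client.get("/")
```
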
## How to Adjust Balance of Workers and Users
To generate the load at a lower cost, you may want to use as few workers as possible. Here is a sample procedure for balancing the number of users and workers.

For the case of generating 10000 RPS, here are the steps I tried.

1. Enable HPA, start with 10 workers and 2000 users, and see how much load the Locust cluster can generate. In this case, Locust generated 3000 RPS and saturated there. No CPU errors were observed in Cloud Logging, which implies the CPUs were not yet pushed to their limit.
1. Assume roughly three times as many users would generate 10000 RPS. Change the user count to 6000 and run `make refresh` to recreate the `ConfigMap` and Locust pods.
1. Observe that the workers automatically scale to 15 and the load exceeds 10000 RPS.
1. Set the initial worker count to `15` in `values.yaml` and run `make refresh` to update the Locust pods.
## Reference Settings
If you use `spike_load.py` to generate **10000 RPS** with the Locust cluster on GKE, here is a reference configuration.

`spike_load.py` hatches users all at once and holds requests until all users are spawned **in each worker** (not across all workers).

| Parameter | Value |
|:-- | :-- |
| Machine type of locust worker (`MACHINE_TYPE` in `Makefile`) | e2-standard-2 |
| Replicas for worker (line 66 of `values.yaml`) | 15 |
| User amount (line 15 of `spike_load.py`, `user_amount`) | 10000 |

With these settings:
- The RPS in the first second is around 600
- It reaches 10000 RPS in 15 to 20 seconds and keeps climbing. If you are targeting exactly 10000 RPS and want to dwell (stay) there for a while, you may want to pace requests with the `constant_pacing` function, as sketched below.
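
A minimal sketch of such pacing with Locust's built-in `constant_pacing` (the class name and endpoint are hypothetical):

```python
# Each user starts a task at most once per second, so with
# user_amount users the total load is capped near user_amount RPS.
from locust import HttpUser, task, constant_pacing


class PacedUser(HttpUser):
    wait_time = constant_pacing(1)

    @task
    def index(self):
        self.client.get("/")
```
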
In `spike_load.py`, the line below configures the dwell time: it holds `user_amount` users for 120 seconds. Adjust the dwell time accordingly.
```python
targets_with_times = Step(user_amount, 120)
```
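
A spike-and-dwell profile like this can be written as a custom Locust `LoadTestShape`. The following is only a sketch of the idea, assuming `Step` is a simple `(users, duration)` pair; the actual `spike_load.py` may be implemented differently:

```python
# Hypothetical sketch of a spike-and-dwell load shape
from collections import namedtuple

from locust import LoadTestShape

Step = namedtuple("Step", ["users", "duration"])
user_amount = 10000
targets_with_times = Step(user_amount, 120)  # dwell 120 s at user_amount users


class SpikeShape(LoadTestShape):
    def tick(self):
        # Spawn all users at once and hold them for the dwell duration
        if self.get_run_time() < targets_with_times.duration:
            return (targets_with_times.users, targets_with_times.users)
        return None  # returning None stops the test
```
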
# How to Run Locally
You may want to iterate quickly by trial and error while building a test script, and loading the script onto GKE every time is quite troublesome. For the development phase, you can leverage Docker to run a small cluster locally.

To spin up the small Locust cluster, run
```
docker-compose up --build --scale worker=1
```
and you can access the master at `localhost:8089`.

# Tips
## Test Scripts Locally First, Then Move to Production

Locust stops with an exception when the load script contains syntax errors. For a faster turnaround, make sure the script works correctly locally before moving to production.
## Help for Commands
Run `make help`.
## How to Access Locust Master Manually
1. Go to the GCP console > `Services & Ingress`.
1. Open `locust-cluster` and scroll down to `Ports`.
1. Click the `PORT FORWARDING` button on the `master-p3` row with port `8089`.
1. A dialog pops up displaying the port forwarding command. Copy and paste it into the terminal and run it.
1. You can then access the `locust-cluster` master pod at `localhost:8080` from your browser.
## How to Configure gcloud for the GKE Cluster by Default
This is done automatically when you run `make build`, but it can also be done separately as below:
1. Build the cluster with
```
make build_cluster
```
1. Run
```
make gcloud_init
```
This command configures your `gcloud` environment to point to the newly created GKE cluster.
## How to Generate the Diagram
1. Install `Diagrams` following [this step](https://diagrams.mingrammer.com/docs/getting-started/installation).
1. Go to the `docs` directory and run `python diagram.py` (a sketch follows this list).
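
The sketch below is hypothetical: the node choices are illustrative only, and the repository's actual `diagram.py` defines the real diagram.

```python
# Hypothetical sketch of a diagram script using the Diagrams package
from diagrams import Diagram
from diagrams.gcp.compute import GKE
from diagrams.onprem.client import Users

with Diagram("Load Testing on GKE", show=False):
    # Simulated users driving the Locust cluster on GKE
    Users("locust users") >> GKE("locust-cluster")
```
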
## How to Enable Autoscaling
Autoscaling relies on Kubernetes's [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#how-does-the-horizontal-pod-autoscaler-work) (HPA). To enable HPA, the Kubernetes manifest needs to include a `resources` section specifying the pod's resource allocation so that Kubernetes can manage the pods based on CPU usage.
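
As an illustration, a `resources` section in a worker container spec might look like the fragment below (the values are hypothetical; tune them in `values.yaml` to match your machine type):

```
resources:
  requests:
    cpu: 500m      # HPA scales on CPU utilization relative to this request
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 1Gi
```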