Reduce your Cloud Composer bills (Part 2)

Author: Murphy  |  View: 28328  |  Time: 2025-03-23 19:07:52
Photo by Sasun Bughdaryan on Unsplash

This is the second part of a two-part series whose goal is to introduce an efficient way to save money while using Cloud Composer for job orchestration. I highly recommend checking out the first part of the series if you haven't already.

Following are the main topics that will be covered:

Understanding Cloud Composer 2 pricing (Part 1)

Snapshots as a way to shut down Composer and still preserve its state (Part 1)

Creating Composer Environments using Snapshots (Part 1)

Summing Up (Part 1)

Destroying Composer Environments To Save Money (Part 2)

Updating Composer Environments (Part 2)

Automating Composer Environments Creation and Destruction (Part 2)

Summing Up (Part 2)


Destroying Composer Environments To Save Money

Here is how to reduce the Cloud Composer bill: shut down the environment while it is not being used.

The key to the cost reduction strategy is to save a snapshot of the Composer Environment's state before destroying it. Note that this strategy won't work for Environments that run orchestration jobs continuously, twenty-four hours a day, seven days a week, like production Environments.

The pipeline that destroys the Composer Environment has 3 steps:

  1. Save a snapshot of the Environment
  2. Copy the tasks logs to a backup Cloud Storage bucket
  3. Delete the Environment and its bucket

steps:

  - name: gcr.io/cloud-builders/gcloud
    entrypoint: /bin/bash
    id: 'Save snapshot'
    args:
      - -c
      - |
        set -e
        # This is an example project_id and env_name. Use your own
        project_id=reduce-composer-bill
        env_name=my-basic-environment
        snap_folder=$(gsutil ls gs://${project_id}-europe-west1-backup/snapshots) || snap_folder=empty
        gcloud composer environments snapshots save ${env_name} \
          --location europe-west1 --project ${project_id} \
          --snapshot-location gs://${project_id}-europe-west1-backup/snapshots
        if [[ $snap_folder != empty ]]
        then
          gsutil -m rm -r $snap_folder
        fi

  - name: gcr.io/cloud-builders/gcloud
    entrypoint: /bin/bash
    id: 'Save Tasks Logs'
    args:
      - -c
      - |
        set -e
        # This is an example project_id and env_name. Use your own
        project_id=reduce-composer-bill
        env_name=my-basic-environment
        dags_folder=$(gcloud composer environments describe ${env_name} --project ${project_id} \
              --location europe-west1 --format="get(config.dagGcsPrefix)")
        logs_folder=$(echo $dags_folder | cut -d / -f-3)/logs
        gsutil -m cp -r ${logs_folder}/* gs://${project_id}-europe-west1-backup/tasks-logs/

  - name: gcr.io/cloud-builders/gcloud
    entrypoint: /bin/bash
    id: 'Delete Composer Environment'
    args:
      - -c
      - |
        set -e
        # This is an example project_id and env_name. Use your own
        project_id=reduce-composer-bill
        env_name=my-basic-environment
        dags_folder=$(gcloud composer environments describe ${env_name} --project ${project_id} \
              --location europe-west1 --format="get(config.dagGcsPrefix)")
        gcloud composer environments delete --project ${project_id} --quiet \
          ${env_name} --location europe-west1
        dags_bucket=$(echo $dags_folder | cut -d / -f-3)
        gsutil -m rm -r $dags_bucket
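The `dagGcsPrefix` value returned by `describe` looks like `gs://<bucket>/dags`, and the `cut -d / -f-3` in the steps above keeps only the first three `/`-separated fields (`gs:`, an empty field, and the bucket name), i.e. the bucket root. A quick local sanity check of that parsing, using a hypothetical prefix:

```shell
#!/bin/bash
# Hypothetical dagGcsPrefix, as returned by:
#   gcloud composer environments describe ... --format="get(config.dagGcsPrefix)"
dags_folder="gs://europe-west1-my-basic-envir-abc123-bucket/dags"

# Fields 1-3 of the gs:// URL are "gs:", "" and the bucket name,
# so -f-3 yields the bucket root
dags_bucket=$(echo "$dags_folder" | cut -d / -f-3)
logs_folder="${dags_bucket}/logs"

echo "$dags_bucket"   # gs://europe-west1-my-basic-envir-abc123-bucket
echo "$logs_folder"   # gs://europe-west1-my-basic-envir-abc123-bucket/logs
```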

Similarly to the Environment Creation, we create a trigger to run the Environment Destruction CI/CD pipeline.

gcloud builds triggers create manual --name trg-environment-destroyer \
  --build-config destroy_environment.yaml --repo reduce_composer_bill \
  --branch main --repo-type CLOUD_SOURCE_REPOSITORIES

Updating Composer Environments

As time goes by, you will need to update the Cloud Composer Environment. Updating the Environment is handled as a separate CI/CD pipeline.

The number of steps in this pipeline is expected to increase as the number of update operations grows. The following is a simple example of an update pipeline where an environment variable is added to the Cloud Composer Environment after its creation.

steps:

  - name: gcr.io/cloud-builders/gcloud
    entrypoint: /bin/bash
    id: 'Add environment variables'
    args:
      - -c
      - |
        set -e
        # This is an example project_id and env_name. Use your own
        project_id=reduce-composer-bill
        env_name=my-basic-environment
        gcloud composer environments update ${env_name} --location europe-west1 \
          --project ${project_id} --update-env-variables ENV=dev

The associated CI/CD pipeline trigger needs to be created following the same principle as for Environment creation and destruction.

gcloud builds triggers create manual --name trg-environment-updater \
  --build-config update_environment.yaml --repo reduce_composer_bill \
  --branch main --repo-type CLOUD_SOURCE_REPOSITORIES

Automating Composer Environments Creation, Update and Destruction

At this point, there are 3 pipelines in place to implement the Cloud Composer cost reduction strategy:

  1. The creation pipeline creates the Composer Environment from the latest snapshot to keep the state of orchestration jobs executions
  2. The destruction pipeline destroys the Composer Environment after having saved its state in a snapshot
  3. The update pipeline is useful to apply any update to the Composer Environment every once in a while

While it's okay to keep the update pipeline manual, the cost reduction strategy requires automating the creation and destruction pipelines. This can be accomplished with the help of Cloud Scheduler.

Cloud Scheduler, as the name suggests, is a managed Google Cloud service that can be used to schedule different tasks. In our case, the tasks that need to run on a schedule are the Cloud Build triggers for Composer Environment creation and destruction.

From the Cloud Build interface, it's possible to add a schedule to a trigger. First, we enable the Cloud Scheduler API.

gcloud services enable cloudscheduler.googleapis.com

Then, we open the Environment Creation Trigger and click on RUN ON SCHEDULE

Image By Author, Add a Schedule to a Cloud Build Trigger

Then we configure the schedule. The most important parameters are the name of the schedule, its frequency, and the service account used to run it. For instance, name the schedule trg-environment-creator-schedule, make it run every weekday at 7 a.m., and use the sac-cmp service account to run the scheduled job.

Image By Author, Add a Schedule to a Cloud Build Trigger

That is it for the Environment Creation. Now, every weekday at 7 a.m., the Environment will be automatically recreated from its latest snapshot.

Finally, we add a schedule to the Environment Destruction pipeline. We might use the name trg-environment-destroyer-schedule and a frequency of 0 21 * * 1-5, meaning that the Composer Environment will be automatically destroyed every weekday at 9 p.m.
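Both frequencies use the standard unix-cron format that Cloud Scheduler expects: five fields for minute, hour, day of month, month, and day of week. A small sketch, assuming the two schedules described above, that spells the fields out:

```shell
#!/bin/bash
# unix-cron format: minute hour day-of-month month day-of-week
create_schedule="0 7 * * 1-5"   # 7 a.m., Monday to Friday (recreate Environment)
destroy_schedule="0 21 * * 1-5" # 9 p.m., Monday to Friday (destroy Environment)

# Split the destruction schedule into its five named fields
read -r minute hour dom month dow <<< "$destroy_schedule"
echo "minute=${minute} hour=${hour} day-of-week=${dow}"  # minute=0 hour=21 day-of-week=1-5
```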


Summing Up

Cloud Composer does not allow starting and stopping Environments, which puts the service on the expensive side as far as cost is concerned. In my opinion, the best way to substantially reduce the price of Cloud Composer is to destroy and recreate the Environments.

However, the state of an Environment is lost when the Environment is destroyed. This is where Snapshots come into play. In the proposed cost reduction strategy, Snapshots are leveraged to save the state of Environments. In addition, task logs are also saved to a Cloud Storage backup bucket, because they are not natively included in Snapshots.

The destroy and recreate processes are made transparent to users with the help of Cloud Build and Cloud Scheduler. The Environment update is handled via a separate pipeline which is meant to be triggered manually on demand, anytime an update is needed.

The CI/CD pipeline code is available in this GitLab repository. Feel free to check it out. Thank you for your time, and stay tuned for more.

Tags: Automation Cicd Pipeline Finops Google Cloud Composer Google Cloud Platform
