Automating Cloud Budgets: Borrowing the 'Total Campaign Budget' Model from Google Ads

2026-03-10

Set fixed cloud budgets for a period and auto-scale non-critical resources as the period ends. Practical templates, IaC and GitHub Actions included.

Cut runaway cloud bills with a "total campaign budget" controller

Unpredictable cloud spend is one of the top pain points that platform and FinOps teams raise with me in 2026: multiple teams launch bursts, toolchains spin up ephemeral infra, and billing spikes arrive after hours. What if you could set a single total budget for a fixed period, as marketers do with Google's new total campaign budget for Search, and have a controller automatically enforce that cap and gracefully scale down non-critical resources as the period ends?

This article gives you a pragmatic, engineer-first plan and starter templates (IaC + GitHub Actions + controller code patterns) to implement a periodic budget controller that: enforces spend caps, forecasts burn rate, and automatically scales non-critical resources toward the end of the period so you hit — not blow — your budget.

Why a "total campaign budget" model matters for cloud in 2026

Cloud-native teams are adopting FinOps automation and platform engineering practices fast in 2025–26. Providers and tools now expose richer consumption datasets and near-real-time telemetry (billing export to BigQuery/S3/ADLS, cost streaming APIs). That makes periodic budgets feasible:

  • Predictable caps: You can set a fixed amount for a period (72 hours, 7 days, 30 days) and automate enforcement.
  • Graceful scaling: Instead of hard shutdowns, scale non-critical parts progressively so production SLAs remain intact.
  • FinOps velocity: Platform teams can let teams run experiments without manual budget policing.
Reference: Google rolled out total campaign budgets for Search in Jan 2026 — marketers now set total budgets over a period and let the platform optimize spend. We can borrow that model for cloud infrastructure.

High-level architecture

Here’s the minimal, production-ready architecture for a periodic budget controller:

  1. Billing export / stream — export cost records to a data sink (BigQuery, S3, ADLS) or consume provider cost APIs (AWS Cost Explorer, GCP Billing, Azure Consumption).
  2. Controller function — serverless function or Kubernetes operator that reads costs, computes forecasts vs total budget, and decides actions.
  3. Enforcement layer — acts via IaC APIs: Kubernetes API, cloud compute auto-scaling groups, serverless config, CI runner throttles.
  4. Policy store — labels/tags or a CRD describing resource priorities and scaling strategies.
  5. CI/CD — GitHub Actions pipeline that deploys the controller, notifies teams, and provides an override workflow.

Design principles

  • Period-first: Budgets are tied to a start and end date. The controller enforces the total for that window, not daily limits.
  • Priority-based: Tag resources as critical or non-critical. Critical resources are protected until budget exhaustion approaches.
  • Progressive actions: Scale non-critical resources down gradually as the end date approaches or as spend outpaces forecast.
  • Predictive: Use historical burn and forecast techniques to avoid last-minute shocks. Account for billing latency.
  • Auditable & reversible: Keep audit logs and a manual override (with a final approval gate).

How the controller makes decisions — algorithm

At the core is a simple forecasting loop run regularly (hourly or per billing event):

  1. Read totalBudget, startTime, endTime.
  2. Fetch consumedToDate (sum of costs) and compute remainingBudget = totalBudget - consumedToDate.
  3. Compute timeLeft in the period and expected burn rate needed to spend remainingBudget evenly.
  4. Compute currentBurnRate (consumedToDate / elapsedTime).
  5. Decide scale factor for non-critical resources using a safety margin. Increase scale-down intensity when currentBurnRate > target or timeLeft is small.
  6. Apply actions and notify.

Sample pseudo-code (Python)

# start_ts, end_ts, and now are datetime objects; returns a factor in [0.0, 1.0]
def compute_scale(consumed, total_budget, start_ts, end_ts, now, safety=0.95):
    elapsed_s = (now - start_ts).total_seconds()
    remaining_s = (end_ts - now).total_seconds()

    consumed = float(consumed)
    remaining_budget = max(total_budget - consumed, 0.0)

    # budget exhausted or period over: scale non-critical resources to zero
    if remaining_budget <= 0.0 or remaining_s <= 0:
        return 0.0
    # period just started or nothing billed yet: no burn rate to compare
    if elapsed_s <= 0 or consumed <= 0:
        return 1.0

    # target spend per second to evenly use remaining budget
    target_rate = remaining_budget / remaining_s

    # current burn rate
    current_rate = consumed / elapsed_s

    # scale_factor in [0.0, 1.0] where 1.0 = full capacity
    if current_rate <= target_rate:
        return 1.0

    # burning too fast: reduce non-critical capacity proportionally;
    # critical resources are excluded by the caller, not here
    reduction_ratio = target_rate / current_rate
    return max(reduction_ratio * safety, 0.0)

Practical scaling actions

Different resources require different actions. Examples:

  • Kubernetes — scale Deployments/StatefulSets down via kubectl scale --replicas=, or patch HorizontalPodAutoscaler targets.
  • VMs / Instance Groups — reduce ASG desired capacity or schedule instance hibernation.
  • Serverless — lower concurrency limits or pause non-critical functions.
  • Batch jobs / CI runners — reduce concurrent job slots and slow queue workers.
  • Data services — switch to cheaper tiers or reduce retention/replication temporarily.
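
For the Kubernetes case, the scale factor has to become a concrete replica count before anything is changed. The sketch below is one possible way to do that. The helper names `target_replicas` and `kubectl_scale_command`, the `min_replicas` floor, and the sample deployment are all illustrative assumptions. The second helper only builds the `kubectl scale` argument list, so it can be logged or reviewed before anything is executed.

```python
import math
from typing import List


def target_replicas(default_replicas: int, scale_factor: float, min_replicas: int = 0) -> int:
    """Map a scale factor to a replica count.

    The factor is clamped to [0.0, 1.0]. Rounding up means a small
    non-zero factor still keeps at least one replica.
    """
    factor = min(max(scale_factor, 0.0), 1.0)
    wanted = math.ceil(default_replicas * factor)
    return max(wanted, min_replicas)


def kubectl_scale_command(name: str, namespace: str, replicas: int) -> List[str]:
    """Build (but do not run) a kubectl scale command for a Deployment."""
    return [
        "kubectl", "scale", f"deployment/{name}",
        f"--replicas={replicas}", "--namespace", namespace,
    ]


# Hypothetical deployment: 4 replicas by default, scaled to 50%.
replicas = target_replicas(default_replicas=4, scale_factor=0.5)
print(kubectl_scale_command("batch-reports", "sandbox", replicas))
```

Keeping the command as a list makes it safe to pass to `subprocess.run` later without shell quoting issues.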

Tagging and policies (required)

Your controller needs a way to know what it can touch. Use consistent labels/tags:

# Kubernetes example label
metadata:
  labels:
    budget.priority: "critical"   # or non-critical
    budget.controller: "enabled"

# AWS tag example (GCP label keys cannot contain colons; use a key such as budget-priority there)
Tags:
  - Key: budget:priority
    Value: non-critical
  - Key: budget:controller
    Value: enabled
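
A controller can treat those labels as an allow-list before it touches anything. The following is a small sketch of that filter. It assumes resources have already been listed into plain dictionaries with a `labels` mapping. The `is_scalable` helper and the sample resource names are hypothetical.

```python
from typing import Dict, Iterable, List

CONTROLLER_LABEL = "budget.controller"
PRIORITY_LABEL = "budget.priority"


def is_scalable(labels: Dict[str, str]) -> bool:
    """True only if the resource opted in and is not marked critical."""
    opted_in = labels.get(CONTROLLER_LABEL) == "enabled"
    # A missing priority is treated as critical, so untagged resources are left alone.
    priority = labels.get(PRIORITY_LABEL, "critical")
    return opted_in and priority != "critical"


def scalable_resources(resources: Iterable[dict]) -> List[str]:
    """Return the names of resources the controller may scale."""
    return [r["name"] for r in resources if is_scalable(r.get("labels", {}))]


resources = [
    {"name": "checkout-api", "labels": {"budget.controller": "enabled", "budget.priority": "critical"}},
    {"name": "batch-reports", "labels": {"budget.controller": "enabled", "budget.priority": "non-critical"}},
    {"name": "legacy-cron", "labels": {"budget.priority": "non-critical"}},
    {"name": "untagged", "labels": {}},
]
print(scalable_resources(resources))  # ['batch-reports']
```

Defaulting to "critical" when a label is missing is a deliberately conservative choice: a tagging mistake then results in no action rather than an unexpected scale-down.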

Starter templates and file layout

Below is a minimal starter repo layout you can copy. Each component has sample boilerplate so you can deploy quickly.

budget-controller-starter/
├─ iac/
│  ├─ main.tf            # Terraform module to deploy controller infra
│  ├─ variables.tf
│  └─ outputs.tf
├─ k8s/
│  ├─ crd/budget.yaml    # CRD for BudgetPeriod
│  └─ controllers/       # k8s manifests for controller
├─ functions/
│  └─ controller.py      # serverless controller (Python) reading costs
├─ .github/workflows/    # GitHub only runs workflows from this path
│  └─ deploy.yml         # GitHub Actions to deploy infra and controller
└─ README.md

Example Terraform snippet (AWS Lambda + IAM minimal)

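# Trust policy that lets the Lambda service assume the controller role
data "aws_iam_policy_document" "lambda_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "controller" {
  name = "budget-controller-role"
  assume_role_policy = data.aws_iam_policy_document.lambda_assume.json
}

resource "aws_iam_policy" "billing_read" {
  name = "ReadBillingPolicy"
  path = "/"
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Action = ["ce:GetCostAndUsage", "ce:GetCostForecast"],
        Effect = "Allow",
        Resource = "*"
      }
    ]
  })
}

# Attach the billing policy; without this the role cannot read costs
resource "aws_iam_role_policy_attachment" "billing_read" {
  role       = aws_iam_role.controller.name
  policy_arn = aws_iam_policy.billing_read.arn
}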

resource "aws_lambda_function" "controller" {
  filename         = "controller.zip"
  function_name    = "budget-controller"
  role             = aws_iam_role.controller.arn
  handler          = "controller.handler"
  runtime          = "python3.11"
  source_code_hash = filebase64sha256("controller.zip")
}
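
The `ce:GetCostAndUsage` permission above is what lets the controller compute consumed-to-date. As a rough sketch, the function below sums the `UnblendedCost` amounts from a response shaped like Cost Explorer's `GetCostAndUsage` output. The sample dictionary is made-up data and `consumed_to_date` is a hypothetical helper. The API call itself (for example through boto3) is left out so the snippet runs without credentials.

```python
from decimal import Decimal


def consumed_to_date(response: dict, metric: str = "UnblendedCost") -> Decimal:
    """Sum one cost metric across all time buckets in the response."""
    total = Decimal("0")
    for bucket in response.get("ResultsByTime", []):
        amount = bucket.get("Total", {}).get(metric, {}).get("Amount", "0")
        # Amounts arrive as strings; Decimal avoids float rounding on money.
        total += Decimal(amount)
    return total


# Made-up sample with two daily buckets.
sample = {
    "ResultsByTime": [
        {"TimePeriod": {"Start": "2026-03-01", "End": "2026-03-02"},
         "Total": {"UnblendedCost": {"Amount": "41.25", "Unit": "USD"}}},
        {"TimePeriod": {"Start": "2026-03-02", "End": "2026-03-03"},
         "Total": {"UnblendedCost": {"Amount": "58.50", "Unit": "USD"}}},
    ]
}
print(consumed_to_date(sample))  # 99.75
```

If the query uses a group-by, the amounts appear under each bucket's `Groups` list rather than `Total`, so this helper would need adjusting for that case.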

GitHub Actions: scheduled enforcement and deployment

Use two workflows: one to deploy (manual/PR) and one scheduled to run the enforcement logic hourly.

# .github/workflows/enforce-budget.yml
name: Enforce Budgets (scheduled)

on:
  schedule:
    - cron: '0 * * * *'  # hourly
  workflow_dispatch:

jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Run controller script
        uses: ./.github/actions/execute-controller
        with:
          AWS_REGION: us-east-1

Extending: Kubernetes-native Budget CRD

For teams that run Kubernetes, a native CRD and operator is useful. Example CRD:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: budgetperiods.finops.example.com
spec:
  group: finops.example.com
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
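                totalBudget:
                  type: number
                startTime:
                  type: string
                  format: date-time
                endTime:
                  type: string
                  format: date-time
                actions:
                  type: object
                  # free-form for now; without this, unknown fields are pruned
                  x-kubernetes-preserve-unknown-fields: true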
  scope: Namespaced
  names:
    plural: budgetperiods
    singular: budgetperiod
    kind: BudgetPeriod
    shortNames:
      - bp

Controller behavior

The operator watches BudgetPeriod resources, computes scale factors, and issues Kubernetes patches to target Deployments labelled budget.controller=enabled. Keep the operator logic small and idempotent.
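
One way to keep the logic idempotent is to compute the full desired state on every pass and patch only what differs. The sketch below assumes the operator has already read current and default replica counts into dictionaries. The `plan_patches` helper and the deployment names are hypothetical, and the Kubernetes API calls are omitted.

```python
import math
from typing import Dict, List, Tuple


def plan_patches(
    current: Dict[str, int],
    defaults: Dict[str, int],
    scale_factor: float,
) -> List[Tuple[str, int]]:
    """Return (deployment, replicas) pairs that need a patch.

    Desired replicas are always derived from the recorded default,
    never from the current value, so repeated runs converge.
    """
    factor = min(max(scale_factor, 0.0), 1.0)
    patches = []
    for name, default in sorted(defaults.items()):
        desired = math.ceil(default * factor)
        if current.get(name) != desired:
            patches.append((name, desired))
    return patches


defaults = {"batch-reports": 4, "preview-envs": 6}
current = {"batch-reports": 4, "preview-envs": 6}

first = plan_patches(current, defaults, 0.5)
print(first)  # [('batch-reports', 2), ('preview-envs', 3)]

# Apply the plan, then run again with the same inputs: nothing left to do.
current.update(dict(first))
print(plan_patches(current, defaults, 0.5))  # []
```

Deriving the target from the default rather than the current count avoids compounding reductions (50% of 50%) when the loop runs every hour. It also means the default has to be stored somewhere durable, for instance in an annotation on the Deployment.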

Testing and rollout

  1. Enable billing export for your provider (BigQuery/S3/ADLS). Many providers added faster cost streaming APIs by late 2025; use them if you need near-real-time enforcement.
  2. Tag a subset of non-production resources with budget.priority=non-critical for early tests.
  3. Deploy the controller to a sandbox namespace or account and run with a simulated budget using historical cost data.
  4. Validate scaling actions under controlled conditions and confirm graceful recovery after the period ends.
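
The simulation in step 3 can start as a plain replay of historical costs against a pretend budget. This sketch uses made-up hourly figures and a hypothetical `replay` helper. It reports the first hour at which each threshold would have been crossed, which is roughly when the controller would have acted.

```python
from typing import Dict, Optional, Sequence


def replay(
    hourly_costs: Sequence[float],
    total_budget: float,
    thresholds: Sequence[float] = (0.75, 0.90, 1.0),
) -> Dict[float, Optional[int]]:
    """Return, per threshold, the index of the first hour whose cumulative
    spend reaches that fraction of the budget (None if never reached)."""
    crossed: Dict[float, Optional[int]] = {t: None for t in thresholds}
    spent = 0.0
    for hour, cost in enumerate(hourly_costs):
        spent += cost
        for t in thresholds:
            if crossed[t] is None and spent >= t * total_budget:
                crossed[t] = hour
    return crossed


# Made-up history: steady spend with a burst in hours 4 and 5.
history = [10, 10, 10, 10, 25, 20, 10, 10]
print(replay(history, total_budget=100))  # {0.75: 5, 0.9: 6, 1.0: 7}
```

Running the same replay with the burst removed shows how much headroom the budget has under normal load, which helps when picking thresholds.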

Security and IAM best practices

  • Grant the controller the minimal permissions: read billing, read/list targeted resources, and modify only labeled resources.
  • Use short-lived credentials (OIDC tokens with GitHub Actions, or workload identity in GCP/Azure) for CI/CD.
  • Log all actions and store audit events in an immutable store (S3/Blob/BigQuery) for compliance.

Edge cases and reliability

Two constraints to watch:

  • Billing latency: Cloud billing can lag. Use conservative safety margins and prefer trend-based forecasting over raw momentary numbers.
  • Transient spikes: Short spikes can distort burn rate. Smooth with EMA (exponential moving average) or use percentile-based burn estimates.
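
As an illustration of the smoothing idea, here is a minimal exponential moving average over hourly cost samples. The `alpha` value and the sample data are arbitrary choices for the sketch, not tuned recommendations.

```python
from typing import Iterable, List


def ema(samples: Iterable[float], alpha: float = 0.3) -> List[float]:
    """Exponential moving average: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The first sample seeds the average.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in samples:
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1 - alpha) * prev)
    return smoothed


hourly = [10.0, 10.0, 40.0, 10.0, 10.0]  # one transient spike
print([round(v, 2) for v in ema(hourly)])  # [10.0, 10.0, 19.0, 16.3, 14.41]
```

The spike of 40 shows up as 19 in the smoothed series and then decays, so a single noisy hour moves the burn estimate far less than the raw value would. A lower `alpha` smooths more but reacts more slowly to a real change in spend.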

Example: scaling strategy matrix

Map policies to actions so teams know what to expect.

priority: critical -> protect replicas and CPU limits
priority: standard -> reduce replicas to 50% when 75% budget used
priority: non-critical -> scale to 0 when 90% budget used
end-of-period: progressively step down 75% -> 50% -> 25% -> 0%
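
The threshold rows of that matrix translate directly into a lookup. The sketch below encodes only the three priority rows, using the percentages listed above. The function name and the choice to treat unknown priorities as critical are assumptions, and the end-of-period step-down is not covered.

```python
def target_fraction(priority: str, budget_used: float) -> float:
    """Fraction of default capacity to keep, given the share of budget used.

    budget_used is consumed / total_budget, e.g. 0.8 for 80%.
    """
    if priority == "standard":
        return 0.5 if budget_used >= 0.75 else 1.0
    if priority == "non-critical":
        return 0.0 if budget_used >= 0.90 else 1.0
    # "critical" and any unrecognized priority are protected.
    return 1.0


for used in (0.5, 0.8, 0.95):
    print(used, target_fraction("standard", used), target_fraction("non-critical", used))
# 0.5 1.0 1.0
# 0.8 0.5 1.0
# 0.95 0.5 0.0
```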

Observability and alerts

Integrate metrics into your monitoring stack:

  • Expose controller metrics: current burn rate, remaining budget, applied scale factor.
  • Create alerts for anomalies: sudden spend increase, controller failures, or action rejections.
  • Notify teams via Slack/Teams and create a ticket with proposed mitigation if manual intervention is required.
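
If your monitoring stack scrapes Prometheus-style metrics, the three controller values could be rendered in the text exposition format roughly as follows. The metric names here are invented for the example, and a real controller would more likely use a client library than hand-format the output.

```python
from typing import Dict, Tuple


def render_metrics(values: Dict[str, Tuple[str, float]]) -> str:
    """Render gauges in the Prometheus text exposition format.

    values maps metric name -> (help text, current value).
    """
    lines = []
    for name, (help_text, value) in sorted(values.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names and values.
output = render_metrics({
    "budget_burn_rate_usd_per_hour": ("Smoothed spend rate", 12.5),
    "budget_remaining_usd": ("Budget left in the current period", 430.0),
    "budget_scale_factor": ("Scale factor applied to non-critical resources", 0.8),
})
print(output)
```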

In late 2025 and into 2026, we saw three trends that make this approach both timely and sustainable:

  • Providers add periodic budget constructs — Google’s total campaign budgets for Search (Jan 2026) show that platform-enforced period budgets are workable at scale in advertising. Cloud vendors may follow with first-class periodic budget objects.
  • Real-time cost streaming — faster exports and cost streaming let you enforce budgets with finer granularity.
  • AI-driven forecasting — modern FinOps tools can predict burn rate and suggest scaling policies; use them to refine controller thresholds.

Quick wins you can deploy in a day

  1. Export billing to a data sink and run a quick query to compute daily burn rate.
  2. Tag a small set of non-critical apps and write a simple Lambda/Python script to scale their replicas based on a manual threshold.
  3. Create a GitHub Actions scheduled workflow to run that script hourly and send Slack alerts.
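
For the first quick win, once billing rows are exported, the daily burn rate is a group-by on the usage date. A minimal sketch over in-memory rows follows. The `(timestamp, cost)` tuple layout is an assumption, since real export schemas differ by provider.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def daily_burn(rows: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum cost per calendar day from (ISO 8601 timestamp, cost) rows."""
    per_day: Dict[str, float] = defaultdict(float)
    for ts, cost in rows:
        day = datetime.fromisoformat(ts).date().isoformat()
        per_day[day] += cost
    return dict(per_day)


# Made-up export rows.
rows = [
    ("2026-03-01T02:00:00", 4.0),
    ("2026-03-01T15:30:00", 6.0),
    ("2026-03-02T09:00:00", 12.5),
]
burn = daily_burn(rows)
print(burn)  # {'2026-03-01': 10.0, '2026-03-02': 12.5}
print(sum(burn.values()) / len(burn))  # 11.25
```

In practice the same aggregation is usually cheaper to run as SQL in the data sink itself, with the script only reading the per-day totals.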

Advanced strategies (next steps)

  • Integrate with rightsizing and AI suggestions to reduce sizes before scaling to zero.
  • Add per-team allocations and chargeback metadata so platform teams can expose consumed amounts in dashboards.
  • Use policy-as-code (OPA/Gatekeeper) to prevent new resource creation that would violate a live budget.

Sample runbook (incident: fast budget burn)

  1. Controller detects burn rate >2x target. It emits a PagerDuty/SMS alert and posts to #finops-alerts.
  2. Controller reduces non-critical replicas by 50% and throttles CI runners.
  3. Platform engineer reviews logs, approves further reduction with an on-call override (manual GitHub Action workflow), or increases budget after PR approval.
  4. After period end, controller restores non-critical services to default sizes or leaves them scaled down until manual reconciliation.

Wrap-up: actionable checklist

  • Enable billing export or streaming for your cloud account.
  • Define budget periods (start, end, totalBudget) and tagging conventions.
  • Deploy a lightweight enforcement script and run on a schedule (hourly).
  • Test with tagged non-critical resources and a simulated budget.
  • Iterate on forecasting, safety margins, and escalation playbooks.

Final takeaways

Borrowing the marketing concept of a total campaign budget gives you a straightforward, predictable way to run bounded cloud campaigns (sales promotions, experiments, short-term projects) without constant firefighting. The pattern works now because of improved billing telemetry, and platforms in 2026 are increasingly supporting automation-first FinOps patterns.

If you want to move faster, use the starter template layout above: export costs, tag resources, deploy a small controller, and run it from an hourly scheduled GitHub Actions workflow. Start with conservative thresholds and build trust with your teams.

Call to action

Ready to try a periodic budget controller in your environment? Clone the starter layout, tag a sandbox app as budget.priority=non-critical, and deploy the controller with the GitHub Actions workflow. If you prefer, email the platform team to request the starter repo and a one-hour pairing session to get it running on your account.
