Declarative Provisioning (Terraform)
Deployment Scripts
We provide a wrapper script to simplify Terraform deployments and handle secret injection automatically:
./cli/sdlc/cloud-run-grpc/terraform-deploy.sh [env]
This script:
- Validates the environment.
- Sources secrets from env/${ENV}/firebase/m3/setvars.secrets.sh (or setvars.secrets.sh), injecting TF_VAR_ environment variables.
- Runs terraform plan and asks for confirmation.
- Runs terraform apply.
Usage:
./cli/sdlc/cloud-run-grpc/terraform-deploy.sh stg
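The flow the wrapper follows (validate, source secrets, plan, confirm, apply) can be sketched roughly as below. This is an illustrative outline, not the contents of terraform-deploy.sh: the function names, the VALID_ENVS list, and the use of a saved plan file (tfplan) are assumptions made for the example.

```shell
# Hypothetical sketch of the wrapper flow; the real script may differ.

# Assumed list of Terraform-managed environments.
VALID_ENVS="dev stg test"

# Return 0 when the argument names a known environment.
validate_env() {
  local env="$1" candidate
  for candidate in $VALID_ENVS; do
    [ "$candidate" = "$env" ] && return 0
  done
  return 1
}

# Plan, ask for confirmation, then apply exactly the plan that was reviewed.
deploy() {
  local env="$1" secrets answer
  validate_env "$env" || { echo "unknown environment: $env" >&2; return 1; }

  # The secrets file is expected to export TF_VAR_* variables.
  secrets="env/${env}/firebase/m3/setvars.secrets.sh"
  if [ -f "$secrets" ]; then
    . "$secrets"
  fi

  terraform -chdir="terraform/live/${env}" plan -out=tfplan || return 1
  printf 'Apply this plan to %s? [y/N] ' "$env"
  read -r answer
  [ "$answer" = "y" ] || { echo "aborted"; return 1; }
  terraform -chdir="terraform/live/${env}" apply tfplan
}
```

Saving the plan with -out and applying that file means the changes applied are the ones that were reviewed, even if the configuration changes in between.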
Declarative Environment Provisioning (Terraform)
This guide details the modern, declarative approach to provisioning environments using Terraform. This method is preferred for Staging (stg) and Production comparisons (test is also managed via Terraform).
Overview
We use a modular Terraform structure to manage infrastructure as code (IaC). This ensures consistency, reproducibility, and drift detection.
Architecture Overview
Our GCP infrastructure follows a hub-and-spoke pattern with centralized shared services and isolated environment projects:
┌─────────────────────────────────────────────────────────────────────────────┐
│                            SHARED INFRASTRUCTURE                            │
├─────────────────────────────────┬───────────────────────────────────────────┤
│  construction-code-expert-admin │  construction-code-expert-repo            │
│  ─────────────────────────────  │  ────────────────────────────             │
│  • Terraform state buckets      │  • Central Artifact Registry              │
│  • Billing/quota project        │  • Docker images (gRPC backend, ESPv2)    │
│  • Cloud Build API              │  • Shared across all environments         │
│  • Bootstrap scripts            │                                           │
└─────────────────────────────────┴───────────────────────────────────────────┘
                                       │
                       ┌───────────────┼───────────────┐
                       │               │               │
                       ▼               ▼               ▼
    ┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
    │ construction-code-  │ │ construction-code-  │ │ construction-code-  │
    │ expert-dev          │ │ expert-stg          │ │ expert-test         │
    │ ─────────────────   │ │ ─────────────────   │ │ ─────────────────   │
    │ • Cloud Run         │ │ • Cloud Run         │ │ • Cloud Run         │
    │ • Firestore         │ │ • Firestore         │ │ • Firestore         │
    │ • GCS Buckets       │ │ • GCS Buckets       │ │ • GCS Buckets       │
    │ • Firebase          │ │ • Firebase          │ │ • Firebase          │
    │ • Secrets           │ │ • Secrets           │ │ • Secrets           │
    └─────────────────────┘ └─────────────────────┘ └─────────────────────┘
          ENVIRONMENT             ENVIRONMENT             ENVIRONMENT
Centralized Projects
| Project | Purpose | Key Resources |
|---|---|---|
| construction-code-expert-admin | Infrastructure management | Terraform state buckets (gs://construction-code-expert-tf-state-{env}), billing/quota for gcloud commands, Cloud Build API |
| construction-code-expert-repo | Artifact storage & Build Factory | Central Artifact Registry (custom-docker-image-repo), Cloud Build execution environment |
Centralized Build Factory Pattern
We strictly separate Build concerns from Runtime concerns.
- Builds run in the -repo project.
- Runtime (Cloud Run) runs in -dev, -stg, -test.
Why?
- Supply Chain Security: The -repo project owns the "software factory". It creates signed artifacts.
- Simplified IAM: The Cloud Build Service Account in -repo automatically has push access to the Artifact Registry in -repo.
- Auditability: All build logs are centralized.
Manual Setup Required:
Because -repo and -admin projects are long-lived and established before the environment Terraform runs, certain APIs must be enabled manually if not already present:
# Enable proper APIs on the repo project to support builds
gcloud services enable cloudbuild.googleapis.com --project=construction-code-expert-repo
gcloud services enable artifactregistry.googleapis.com --project=construction-code-expert-repo
Why This Pattern?
- State Isolation: Each environment has its own Terraform state bucket, preventing accidental cross-environment changes
- Centralized Artifacts: Docker images are built once and promoted through environments (test → stg → prod)
- Billing Control: The admin project handles API quota/billing, solving the "chicken-and-egg" problem when bootstrapping new projects
- Access Control: Environment projects can be granted different IAM permissions without affecting shared infrastructure
Cross-Project Permissions
When provisioning a new environment, the following cross-project permissions are required:
# Grant Cloud Run Service Agent access to pull images from central repo
gcloud artifacts repositories add-iam-policy-binding custom-docker-image-repo \
--project=construction-code-expert-repo \
--location=us-central1 \
--member="serviceAccount:service-NEW_PROJECT_NUMBER@serverless-robot-prod.iam.gserviceaccount.com" \
--role="roles/artifactregistry.reader"
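The project number in the member string can be looked up rather than copied by hand. The sketch below wraps the binding above in two helper functions; the function names are hypothetical, and it assumes gcloud is authenticated with permission to read the new project and to modify IAM on the repo project.

```shell
# Build the IAM member string for a project's Cloud Run Service Agent.
cloud_run_service_agent() {
  printf 'serviceAccount:service-%s@serverless-robot-prod.iam.gserviceaccount.com\n' "$1"
}

# Look up the project number, then grant read access on the central repository.
grant_image_pull_access() {
  local project_id="$1" project_number
  project_number="$(gcloud projects describe "$project_id" --format='value(projectNumber)')" || return 1
  gcloud artifacts repositories add-iam-policy-binding custom-docker-image-repo \
    --project=construction-code-expert-repo \
    --location=us-central1 \
    --member="$(cloud_run_service_agent "$project_number")" \
    --role="roles/artifactregistry.reader"
}
```

For example, grant_image_pull_access construction-code-expert-ENVNAME would resolve the number for that project and apply the binding.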
Directory Structure
- terraform/live/: Contains environment-specific configurations (e.g., dev, stg, test).
  - main.tf: The primary entry point, instantiating the environment module.
  - outputs.tf: Exposes key resource IDs (e.g., espv2_uri, web_api_key).
  - variables.tf: Environment-specific inputs.
  - backend.tf: GCS backend configuration for state storage.
- terraform/modules/: Reusable modules.
  - environment/: The core module that bundles Project, Storage, Cloud Run, and Firebase resources.
  - project/: Project creation and API enablement.
- terraform/shared_vars.yaml: Centralized configuration for project IDs, regions, and image tags.
Prerequisites
- Terraform: v1.5+ installed.
  - macOS (Homebrew):
      brew tap hashicorp/tap
      brew install hashicorp/tap/terraform
  - DevContainer / Linux (Debian/Ubuntu):
      wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
      echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
      sudo apt update && sudo apt install terraform
- GCP Access: Owner or Editor role on the GCP Project (or folder).
  - Note: This applies to the Application Default Credentials (ADC) of the entity running Terraform (e.g., your personal account on the host machine, or the Service Account inside the DevContainer).
- Admin Setup: Ensure the bootstrap scripts have run (if creating a fresh project).
  - See cli/sdlc/new-environment-provisioning/setup-tf-admin.sh for details on setting up the Admin Project and State Buckets.
Step-by-Step Usage
1. Initialize
Navigate to the specific environment directory:
cd terraform/live/stg
terraform init
Tip: Use terraform init -upgrade if you see errors about provider version mismatches or if you want to update to the latest allowed versions of providers/modules.
2. Plan
Review the changes Terraform will make. This step is critical to avoid unintended destruction of resources.
terraform plan
3. Apply
Provision the resources.
terraform apply
Provisioning a New Environment
This playbook covers end-to-end provisioning of a completely new environment (e.g., phil, crsr).
Phase 1: Terraform Configuration
Recommended: Use the Automation Script
We provide a script to scaffold the new environment from a standard template. This ensures all naming conventions and boilerplates are correct.
# Basic usage (defaults to 'construction-code-expert' prefix)
./cli/utils/create-terraform-env.sh ENVNAME
# Custom project prefix
./cli/utils/create-terraform-env.sh ENVNAME --project-prefix=my-custom-prefix
This will:
- Create terraform/live/ENVNAME.
- Populate it with main.tf, backend.tf, terraform.tfvars, etc.
- Replace placeholders ({{ENV}}, {{PROJECT_PREFIX}}) with your values.
Manual Steps (if script is not used):
- Create Terraform state bucket (in admin project):
    gsutil mb -p construction-code-expert-admin -l us-central1 \
      gs://construction-code-expert-tf-state-ENVNAME
- Create environment directory:
    mkdir -p terraform/live/ENVNAME
- Copy and customize configuration:
  - Copy contents of terraform/templates/env-app/ to terraform/live/ENVNAME/
  - Replace {{ENV}}, {{PROJECT_PREFIX}}, and {{BILLING_ACCOUNT}} with real values in main.tf, backend.tf, and terraform.tfvars.
- Key terraform.tfvars settings:
    env_suffix = "newenv"
    billing_account = "018A1F-2219A5-D47906"
    enable_stripe = false # Unless Stripe is configured
    enable_google_group_allowlist_check = false # For dev/agent envs
    grpc_max_instance_count = 10 # Lower for dev envs
    hierarchical_namespace_enabled = true
    gcs_cors_allow_localhost = true # For local development
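The copy-and-replace step can be scripted when create-terraform-env.sh is not available. The following is a minimal sketch of that substitution using sed; render_env_template is a hypothetical helper, not part of the repository, and it only handles plain files directly inside the template directory.

```shell
# Copy a template directory and substitute the placeholders in every file.
# Arguments: template dir, destination dir, env name, project prefix, billing account.
render_env_template() {
  local src="$1" dst="$2" env="$3" prefix="$4" billing="$5" f
  mkdir -p "$dst" || return 1
  for f in "$src"/*; do
    [ -f "$f" ] || continue
    sed -e "s/{{ENV}}/${env}/g" \
        -e "s/{{PROJECT_PREFIX}}/${prefix}/g" \
        -e "s/{{BILLING_ACCOUNT}}/${billing}/g" \
        "$f" > "$dst/$(basename "$f")"
  done
}
```

A call might look like render_env_template terraform/templates/env-app terraform/live/ENVNAME ENVNAME construction-code-expert YOUR_BILLING_ACCOUNT. Values containing / or & would need escaping before being passed to sed.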
Phase 2: Brownfield Imports (Optional)
If resources were created imperatively before Terraform adoption, create imports.tf:
# Import existing GCP project
import {
  id = "construction-code-expert-ENVNAME"
  to = module.environment.module.project.google_project.main
}
# Import existing service account
import {
  id = "projects/construction-code-expert-ENVNAME/serviceAccounts/cce-app-service@construction-code-expert-ENVNAME.iam.gserviceaccount.com"
  to = module.environment.google_service_account.app_service_account
}
# Import existing APIs (example)
import {
  id = "construction-code-expert-ENVNAME/firestore.googleapis.com"
  to = module.environment.module.project.google_project_service.enabled_apis["firestore.googleapis.com"]
}
Tip: Run terraform plan first. If Terraform tries to create a resource that already exists, the error message will tell you the import ID.
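When many APIs were enabled by hand, writing one import block per API gets repetitive. As a rough sketch, a small generator can emit the blocks in the same shape as the API example above; api_import_block is a hypothetical helper, and the resource address is copied from that example, so it should be checked against the module before use.

```shell
# Print an import block for one already-enabled API.
# Arguments: project ID, API service name (e.g. run.googleapis.com).
api_import_block() {
  local project="$1" api="$2"
  cat <<EOF
import {
  id = "${project}/${api}"
  to = module.environment.module.project.google_project_service.enabled_apis["${api}"]
}
EOF
}
```

For example, looping over a list of service names and appending the output of api_import_block construction-code-expert-ENVNAME "$api" to imports.tf produces one block per API.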
Phase 3: Environment Shims (Simplified)
Modern deployment scripts automatically fall back to the env/common shims if environment-specific ones are not found. This means you do not need to create manual shim files in env/ENVNAME/.
The scripts will automatically delegate to env/common and load configuration using the environment name passed as an argument (e.g., ./deploy.sh gcli).
Custom Overrides (Optional):
Only create specific shims in env/ENVNAME/ if you need to explicitly override the standard behavior or values provided by Terraform.
Legacy: Manual Shim Creation (Historical Reference)
If you are maintaining older environments or need to manually configure shims, here is the procedure:
Create shim scripts that source configuration from Terraform outputs:
- Root shim (env/ENVNAME/setvars.sh):
    #!/bin/bash
    # SHIM: Sources dynamic configuration from Terraform outputs
    # The shim lives in env/ENVNAME/, so the repository root is two levels up.
    REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
    source "$REPO_ROOT/env/terraform-sourced/load.sh" ENVNAME
- Per-service shims (create in env/ENVNAME/gcp/cloud-run/{grpc,endpoints,websocket}/setvars.sh):
    #!/bin/bash
    # Assumes REPO_ROOT is already set, e.g. by sourcing the root shim first.
    source "$REPO_ROOT/env/terraform-sourced/load.sh" ENVNAME
- Firebase shim (env/ENVNAME/firebase/m3/setvars.sh):
    #!/bin/bash
    # Assumes REPO_ROOT is already set, e.g. by sourcing the root shim first.
    source "$REPO_ROOT/env/terraform-sourced/load.sh" ENVNAME
    # Map Terraform outputs to Angular environment variables
    export CODEPROOF_API_SERVER="${ESPV2_URI#https://}"
    export CODEPROOF_WS_SERVER="${WEBSOCKET_URI#https://}"
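The Firebase shim relies on shell prefix removal to turn a service URL into a bare host name. The short example below shows what that expansion does; the URL is a made-up value standing in for the Terraform output.

```shell
# Hypothetical value; in the shim this comes from the Terraform outputs.
ESPV2_URI="https://espv2-example-uc.a.run.app"

# "${VAR#pattern}" removes the shortest match of pattern from the start of VAR,
# so the scheme is dropped and only the host name remains.
CODEPROOF_API_SERVER="${ESPV2_URI#https://}"
echo "$CODEPROOF_API_SERVER"   # prints espv2-example-uc.a.run.app
```

If the value does not start with https://, the expansion leaves it unchanged.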
Phase 4: Frontend Configuration
Add Firebase hosting target to web-ng-m3/.firebaserc:
{
  "targets": {
    "construction-code-expert-ENVNAME": {
      "hosting": {
        "ENVNAME": [
          "construction-code-expert-ENVNAME"
        ]
      }
    }
  }
}
Add hosting target to web-ng-m3/firebase.json:
{
  "target": "ENVNAME",
  "public": "dist",
  "ignore": [
    "firebase.json",
    "**/.*",
    "**/node_modules/**"
  ],
  "rewrites": [
    {
      "source": "**",
      "destination": "/index.html"
    }
  ]
}
Phase 5: Apply and Deploy
- Initialize and apply Terraform:
    terraform -chdir=terraform/live/ENVNAME init
    terraform -chdir=terraform/live/ENVNAME apply
- Create OAuth credentials (see Manual Steps section below)
- Deploy backend services:
    cli/sdlc/cloud-run-grpc/deploy.sh ENVNAME
    cd env && ./deploy-websocket.sh ENVNAME && cd ..
    cli/sdlc/cloud-run-job/deploy.sh ENVNAME
- Deploy frontend:
    cd web-ng-m3 && ./deploy.sh ENVNAME && cd ..
  ⚠️ IMPORTANT: The frontend deploy script (web-ng-m3/deploy.sh) MUST be run from within the Devcontainer to avoid esbuild platform mismatch errors (See Issue #399).
Managed Resources
Terraform currently manages:
- Google Cloud Project: Labels, API enablement (e.g., run.googleapis.com, aiplatform.googleapis.com).
- IAM: Service Account creation and role assignments (web-api-key secret access, Cloud Run invoker).
- Cloud Run Services:
  - default_grpc_backend: The main gRPC application.
    - Note: Uses an environment-agnostic Docker image (promoted from Test -> Staging -> Prod).
  - espv2: The Envoy proxy for gRPC-Web transcoding (Infrastructure Shell).
    - Note: Uses an environment-specific image (baked with the specific backend URL).
  - websocket: The real-time chat service.
- Firebase: Project linking and Web App creation.
- Secrets: API Keys (web_api_key, maps_api_key) and Stripe keys.
Backend Service Promotion Strategy
The default_grpc_backend uses a centralized image catalog defined in terraform/shared_vars.yaml. This allows for safe, explicit promotion of immutable artifacts.
To promote an image (e.g., Test -> Staging):
- Identify the Image:
  - Find the verified image tag currently running in test (or from the CI/CD build output).
  - Example: 1cc4057
- Update Catalog:
  - Edit terraform/shared_vars.yaml.
  - Update image_catalog.grpc_backend.tags.stg_current with the new tag.
      image_catalog:
        grpc_backend:
          tags:
            test_current: "1cc4057" # Verified
            stg_current: "1cc4057" # Promoting this tag
- Deploy:
  - Run terraform apply in the target environment (terraform/live/stg).
  - Terraform will detect the configuration change and redeploy the Cloud Run service with the new image.
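The catalog edit in the promotion steps above is small enough to script. The sketch below reads the tag under one key and writes it to another using sed. The helper names are hypothetical. It assumes each key appears once in the file, on its own line with a double-quoted value, as in the snippet above; if several images share key names such as stg_current, a YAML-aware tool would be safer.

```shell
# Print the tag stored under a key such as test_current.
get_tag() {
  local file="$1" key="$2"
  sed -n "s/^[[:space:]]*${key}:[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file" | head -n 1
}

# Rewrite the tag under a key, leaving every other line (and trailing comments) untouched.
set_tag() {
  local file="$1" key="$2" tag="$3" tmp
  tmp="$(mktemp)" || return 1
  sed "s/^\([[:space:]]*${key}:[[:space:]]*\)\"[^\"]*\"/\1\"${tag}\"/" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Copy the tag from one key to another, e.g. test_current -> stg_current.
promote_tag() {
  local file="$1" from_key="$2" to_key="$3" tag
  tag="$(get_tag "$file" "$from_key")"
  [ -n "$tag" ] || { echo "no tag found for ${from_key}" >&2; return 1; }
  set_tag "$file" "$to_key" "$tag" || return 1
  echo "${to_key} -> ${tag}"
}
```

Running promote_tag terraform/shared_vars.yaml test_current stg_current and then reviewing the diff before terraform apply keeps the promotion explicit.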
⚠️ Manual Steps & Caveats
While Terraform handles the bulk of the infrastructure, some steps remain manual or imperative due to provider limitations or security constraints:
1. Firebase/GCP OAuth Credentials
After the first terraform apply, you must create OAuth credentials for Firebase Google Sign-In:
- Create OAuth Consent Screen:
  - Go to GCP Console → APIs & Services → OAuth consent screen
  - Configure: App name, Support email, Authorized domains
- Create OAuth Client ID:
  - Go to GCP Console → APIs & Services → Credentials
  - Create OAuth client ID → Web application
  - Authorized JavaScript origins: https://construction-code-expert-ENVNAME.web.app
  - Authorized redirect URIs: https://construction-code-expert-ENVNAME.firebaseapp.com/__/auth/handler
- Store Client Secret:
    echo -n "YOUR_CLIENT_SECRET" | gcloud secrets versions add firebase-login-oauth2-client-secret \
      --project=construction-code-expert-ENVNAME --data-file=-
- Update Terraform:
  - Add to terraform.tfvars: firebase_login_oauth2_client_id = "YOUR_CLIENT_ID.apps.googleusercontent.com"
  - Re-run terraform apply to enable Google Sign-In in Identity Platform
2. Service Account Domain-Wide Delegation
Granting the service account the rights it needs to check Google Workspace group membership requires Google Workspace Admin privileges and is not done via GCP Terraform.
- Action: Manually configure Domain-Wide Delegation in the Google Admin Console if utilizing group-based RBAC.
3. ESPv2 Image Build
Unlike the backend, the ESPv2 Proxy image is environment-specific because it contains the baked-in OpenAPI configuration, which points to the specific backend URL for that environment. We use an "Infrastructure Shell" pattern in which Terraform first provisions the Cloud Run service with a placeholder in order to generate the URL.
To build and apply a new ESPv2 image:
- Prerequisites:
  - Ensure Terraform has applied successfully (so the backend service exists and has a URL).
  - Ensure you are authenticated with gcloud and docker.
- Run Build Script:
  - Execute the helper script from the env directory:
      cd env
      ./build-espv2-image.sh stg
  - What this does:
    - Downloads Service & Google APIs.
    - Deploys Cloud Endpoints configuration.
    - Builds a new Docker image with the specific backend URL.
    - Promotes the image to the Central Artifact Registry.
- Update Terraform (Optional but Recommended):
  - The script outputs the new image URI (e.g., ...:2024-01-01r0).
  - Update terraform/live/stg/main.tf if you are pinning specific versions.
  - Note: Currently, Terraform may use a placeholder or specific tag. If the tag changes, run terraform apply.
4. Database Indexes (Firestore)
Firestore composite indexes are defined in web-ng-m3/firestore.indexes.json and must be deployed via the Firebase CLI. They are not managed by Terraform.
Action: Deploy indexes to each environment after making changes:
cd web-ng-m3
# Deploy to all environments
firebase deploy --only firestore:indexes --project construction-code-expert-dev
firebase deploy --only firestore:indexes --project construction-code-expert-test
firebase deploy --only firestore:indexes --project construction-code-expert-stg
Note: Index builds may take 2-5 minutes. Check the Firebase Console for build status.
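Deploying to several environments in a row can be wrapped in a loop that keeps going after a failure and reports which environments need attention. This is an illustrative sketch only; deploy_indexes is a hypothetical helper, and it assumes it is run from web-ng-m3 with the Firebase CLI already authenticated.

```shell
# Deploy Firestore indexes to each named environment and report any that failed.
deploy_indexes() {
  local env failed=""
  for env in "$@"; do
    echo "Deploying indexes to construction-code-expert-${env}"
    firebase deploy --only firestore:indexes --project "construction-code-expert-${env}" \
      || failed="${failed} ${env}"
  done
  if [ -n "$failed" ]; then
    echo "index deployment failed for:${failed}" >&2
    return 1
  fi
}
```

For example, deploy_indexes dev test stg covers the three environments listed above and returns non-zero if any of them failed.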
5. OAuth Consent Screen
The OAuth Consent Screen must be configured manually before Identity Platform Google Sign-In can work.
- Action: Go to GCP Console → APIs & Services → OAuth consent screen
- Configure: App name, Support email (can use a Google Group like info@permitproof.com), Authorized domains
- Note: To use a Google Group email, the GCP project owner must have manager privileges on that group
6. Secret Values (Stripe, etc.)
Terraform creates secret containers but not the actual values (versions). Secrets must be populated separately.
- Action: Use gcloud secrets versions add or the provision-secrets script
- Placeholder workaround (for staging without real keys):
    echo -n "sk_test_placeholder_not_configured" | gcloud secrets versions add stripe-secret-key \
      --project=construction-code-expert-stg --data-file=-
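Adding a placeholder creates a new latest version, so on an environment that already holds a real key it would shadow that key. One way to guard against this, sketched below, is to add the placeholder only when the secret has no versions yet. ensure_secret_has_version is a hypothetical helper; it assumes gcloud is authenticated with access to Secret Manager in the target project.

```shell
# Add a placeholder version only if the secret currently has no versions.
# Arguments: project ID, secret name, placeholder value.
ensure_secret_has_version() {
  local project="$1" secret="$2" placeholder="$3" existing
  existing="$(gcloud secrets versions list "$secret" \
    --project="$project" --limit=1 --format='value(name)')" || return 1
  if [ -n "$existing" ]; then
    echo "${secret}: already has a version, leaving it alone"
    return 0
  fi
  printf '%s' "$placeholder" \
    | gcloud secrets versions add "$secret" --project="$project" --data-file=-
}
```

A call such as ensure_secret_has_version construction-code-expert-stg stripe-secret-key sk_test_placeholder_not_configured is then safe to repeat.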
🔧 Common Issues & Troubleshooting
API Enablement Race Condition
Symptom: First terraform apply fails with "API has not been used in project... or it is disabled" errors
Cause: GCP API enablement is eventually consistent. Terraform enables the API but doesn't wait long enough before creating resources that depend on it.
Fix: Simply re-run terraform apply:
terraform -chdir=terraform/live/ENVNAME apply
The API should be fully propagated on the second run.
Note: This commonly affects Secret Manager, Firestore, and Identity Platform on new environments.
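Because the fix is simply to run the same command again after a pause, it can be expressed as a generic retry helper. The sketch below is one possible shape; the retry function, attempt count, and delay are illustrative choices, not part of the existing scripts.

```shell
# Run a command up to max_attempts times, sleeping between failed attempts.
# Arguments: max attempts, delay in seconds, then the command and its arguments.
retry() {
  local max_attempts="$1" delay_seconds="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay_seconds}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}
```

For example, retry 3 60 terraform -chdir=terraform/live/ENVNAME apply would allow up to three attempts a minute apart. Each attempt still shows the plan and asks for confirmation unless -auto-approve is passed.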
Cross-Project Artifact Registry Access
Symptom: Cloud Run deployment fails with Permission "artifactregistry.repositories.downloadArtifacts" denied
Cause: New environments need their Cloud Run Service Agent granted access to the central Artifact Registry.
Fix:
gcloud artifacts repositories add-iam-policy-binding custom-docker-image-repo \
--project=construction-code-expert-repo \
--location=us-central1 \
--member="serviceAccount:service-PROJECT_NUMBER@serverless-robot-prod.iam.gserviceaccount.com" \
--role="roles/artifactregistry.reader"
Note: Replace PROJECT_NUMBER with the new environment's project number (e.g., 381589830306 for stg).
Tainted Resource Recovery
Symptom: After a failed deployment, Terraform wants to destroy/recreate a resource but deletion_protection=true blocks it.
Fix:
terraform untaint 'module.environment.google_cloud_run_v2_service.default_grpc_backend'
Force New Revision After Permission Fix
Symptom: After fixing permissions, the Cloud Run service still shows the old error (revision stuck in failed state).
Fix: Force a new revision deployment:
gcloud run services update SERVICE_NAME \
--region=us-central1 \
--project=PROJECT_ID \
--update-labels=force-redeploy=$(date +%s)
Backend Configuration Changed
Symptom: terraform init fails with "Backend configuration changed"
Fix: Use -reconfigure to reinitialize with the new backend:
terraform init -reconfigure
Cloud Build API on Admin Project
Symptom: gcloud builds submit fails or hangs asking to enable API on admin project
Cause: The admin project is used as the billing/quota project for gcloud commands.
Fix:
gcloud services enable cloudbuild.googleapis.com --project=construction-code-expert-admin
Image Promotion Hanging (gcloud container images add-tag)
Symptom: The build-espv2-image.sh script hangs at the promotion step despite --quiet flag.
Cause: gcloud container images add-tag has issues with interactive prompts when copying between GCR and Artifact Registry.
Fix: The script now uses Docker pull/tag/push instead:
gcloud auth configure-docker us-central1-docker.pkg.dev,gcr.io --quiet
docker pull "SOURCE_IMAGE"
docker tag "SOURCE_IMAGE" "DEST_IMAGE"
docker push "DEST_IMAGE"
GCS Hierarchical Namespace + Versioning Conflict
Symptom: Error: Versioning is not supported for hierarchical namespace buckets
Cause: GCS buckets with Hierarchical Namespace enabled cannot have versioning.
Fix: In the Terraform module, versioning is automatically disabled when HNS is enabled:
versioning = var.hierarchical_namespace != null && try(var.hierarchical_namespace.enabled, false) ? false : true
Firebase Auth Authorized Domains
Symptom: Frontend shows auth/unauthorized-domain error after deployment.
Cause: New hosting domains must be manually added to Firebase Auth settings.
Fix: Go to Firebase Console → Authentication → Settings → Authorized Domains → Add the new domain.
Note: Terraform cannot manage Firebase Auth Authorized Domains (provider limitation).