Background Task Architecture Overview

Problem Statement

Running long background tasks inside a Cloud Run Service is a fundamental architectural challenge. The platform is built around request-response cycles: once the response is sent, the instance's CPU is heavily throttled or frozen entirely. As a result, background tasks started from gRPC service calls execute slowly, even when they run on separate threads spawned by the request handler. (See GitHub Issue #161 for detailed observations of this behavior.)

Solution: Hybrid Architecture

PermitProof implements a hybrid architecture that combines two execution strategies:

Primary Strategy: Cloud Run Jobs (separate container execution)
Fallback Strategy: ExecutorService thread pools (in-process execution)

Architecture Decision Flow

gRPC Request Received
        ↓
Create Task in Firestore
        ↓
  Cloud Run Jobs Available?
        ↓
    Yes ────→ Trigger Cloud Run Job ────→ Separate Container
     │                                    Full CPU Allocation
     │                                    Optimal for Long Tasks
     │
    No ─────→ ExecutorService Fallback ─→ Background Thread
                                          Same Container
                                          CPU Throttled After Response
                                          Best for Short Tasks (< 60s)

When Each Approach is Used

Cloud Run Jobs (Primary)

Trigger Condition: CloudRunTaskTrigger successfully initialized
Environment: Production with proper GCP configuration
CPU Allocation: Full, dedicated CPU resources
Duration: Optimal for tasks > 60 seconds
Isolation: Completely independent container
Examples: Page ingestion, code applicability analysis, compliance report generation

ExecutorService Fallback

Trigger Condition: Cloud Run Jobs initialization fails
Environment: Development, testing, or degraded GCP setup
CPU Allocation: Throttled after gRPC response sent
Duration: Suitable for tasks < 60 seconds
Isolation: Runs in same container as gRPC service
Examples: Quick validations, metadata updates, short analyses
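
The selection rule described above can be sketched as follows. This is a minimal illustration, not PermitProof's actual code: `JobTrigger`, `ExecutionStrategy`, `StrategySelector`, and the supplier-based initialization are hypothetical names introduced here. Only the rule itself (use Cloud Run Jobs when the trigger initializes, otherwise fall back to in-process execution) comes from the text.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical stand-in for CloudRunTaskTrigger; the real class's API is not shown here. */
interface JobTrigger {
    void trigger(String taskId);
}

/** The two execution strategies described above. */
enum ExecutionStrategy { CLOUD_RUN_JOB, EXECUTOR_SERVICE }

final class StrategySelector {
    private final Optional<JobTrigger> jobTrigger;

    /**
     * Tries to initialize the trigger once, at startup. Any failure
     * (for example missing GCP configuration) leaves the trigger empty,
     * which selects the ExecutorService fallback.
     */
    StrategySelector(Supplier<JobTrigger> triggerFactory) {
        Optional<JobTrigger> initialized;
        try {
            initialized = Optional.ofNullable(triggerFactory.get());
        } catch (RuntimeException e) {
            initialized = Optional.empty();
        }
        this.jobTrigger = initialized;
    }

    /** Cloud Run Jobs when the trigger is available, otherwise the in-process fallback. */
    ExecutionStrategy strategy() {
        return jobTrigger.isPresent()
                ? ExecutionStrategy.CLOUD_RUN_JOB
                : ExecutionStrategy.EXECUTOR_SERVICE;
    }

    public static void main(String[] args) {
        // Assumed production-like setup: the factory returns a working (no-op) trigger.
        StrategySelector production = new StrategySelector(() -> taskId -> { });
        // Assumed local setup: initialization fails, so the fallback is selected.
        StrategySelector local = new StrategySelector(() -> {
            throw new IllegalStateException("Cloud Run Jobs not configured");
        });
        System.out.println(production.strategy()); // prints CLOUD_RUN_JOB
        System.out.println(local.strategy());      // prints EXECUTOR_SERVICE
    }
}
```

Deciding once at startup keeps the per-request check cheap: each handler only needs to ask whether a trigger is present.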

Architecture Components

1. gRPC Service Layer

Purpose: Request handler and job orchestrator
Responsibility: Create Firestore task, trigger background execution, return immediately
Implementations:
- ArchitecturalPlanWriteAsyncServiceImpl - Page ingestion tasks
- CodeApplicabilityServiceImpl - Code applicability analysis
- ArchitecturalPlanReviewServiceImpl - Compliance report generation
- TaskServiceImpl - Generic task management
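
The "create task, trigger background execution, return immediately" responsibility might look roughly like the sketch below. It is an illustration under stated assumptions: an in-memory map stands in for the Firestore tasks collection, and `AsyncHandlerSketch`, `startTask`, `statusOf`, and `shutdownAndWait` are hypothetical names, not the methods of the services listed above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch of a handler that records a task, starts the work, and returns at once. */
final class AsyncHandlerSketch {
    enum Status { PENDING, PROCESSING, COMPLETE, FAILED }

    // Assumption: this map stands in for the Firestore tasks collection.
    private final Map<String, Status> tasks = new ConcurrentHashMap<>();
    private final ExecutorService backgroundExecutor = Executors.newFixedThreadPool(2);

    /** Returns the new task ID without waiting for the work to finish. */
    String startTask(Runnable work) {
        String taskId = UUID.randomUUID().toString();
        tasks.put(taskId, Status.PENDING);          // 1. create the task record
        backgroundExecutor.submit(() -> {           // 2. start background execution
            tasks.put(taskId, Status.PROCESSING);
            try {
                work.run();
                tasks.put(taskId, Status.COMPLETE);
            } catch (RuntimeException e) {
                tasks.put(taskId, Status.FAILED);
            }
        });
        return taskId;                              // 3. respond immediately
    }

    Status statusOf(String taskId) {
        return tasks.get(taskId);
    }

    /** Stops accepting work and waits up to the given time for running tasks. */
    boolean shutdownAndWait(long seconds) {
        backgroundExecutor.shutdown();
        try {
            return backgroundExecutor.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AsyncHandlerSketch handler = new AsyncHandlerSketch();
        String taskId = handler.startTask(() -> { /* long-running work goes here */ });
        System.out.println("Returned task " + taskId + " to the caller");
        handler.shutdownAndWait(5);
        System.out.println("Final status: " + handler.statusOf(taskId));
    }
}
```

The caller only ever receives the task ID; everything else is observed through the task record, which is what lets the frontend follow progress independently of the original request.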

2. Task Tracking Service

Purpose: Generic task management infrastructure
Storage: Firestore tasks collection
Features: Real-time progress, step tracking, cost analysis
Implementation: TaskServiceImpl

3. Background Execution Layer

Cloud Run Jobs: CloudRunTaskTrigger → separate container instances
ExecutorService: Thread pools within Cloud Run Service
Selection: Automatic fallback if Cloud Run Jobs unavailable

4. Frontend Integration

Real-time Updates: Firestore subscriptions via FirestoreTaskTrackingService
Progress Display: AsyncTaskProgressComponent with step-by-step tracking
User Experience: Non-blocking UI with live progress bars

Key Features

Graceful Degradation

The system keeps working even if Cloud Run Jobs setup fails, because each async service falls back to in-process execution.

The same pattern is used across all async services (CodeApplicabilityServiceImpl, ArchitecturalPlanWriteAsyncServiceImpl, ArchitecturalPlanReviewServiceImpl):

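// jobTrigger is null when CloudRunTaskTrigger failed to initialize.
if (jobTrigger != null) {
    // Primary path: hand the task to a Cloud Run Job. The trigger call is
    // submitted to the executor so the gRPC handler can return immediately.
    logger.info("🚀 Triggering Cloud Run Job for task: " + taskId);
    backgroundExecutor.submit(() -> triggerJob(taskId, request));
} else {
    // Fallback path: run the task in-process.
    logger.info("🔄 Using background processing for task: " + taskId);
    executeBackgroundTask(taskId, request);
}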

Real-time Progress Tracking

Firestore Integration: Tasks stored in tasks collection (provisioned by TaskServiceImpl)
Step History: Comprehensive progress tracking with timestamps
Cost Metadata: Per-model LLM cost accumulation
Status Enums: Type-safe status management (PENDING, PROCESSING, COMPLETE, FAILED)
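
To make the tracked data concrete, here is a minimal in-memory sketch of a task with a status enum, timestamped step history, and per-model cost accumulation. The status values come from the list above; the class name `TrackedTask`, its fields and methods, and the model name `model-a` are assumptions for illustration and do not reflect the actual Firestore schema.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a tracked task; field names are illustrative, not the real schema. */
final class TrackedTask {
    /** Status values listed above. */
    enum Status { PENDING, PROCESSING, COMPLETE, FAILED }

    /** One timestamped entry in the step history. */
    static final class Step {
        final String description;
        final Instant timestamp;

        Step(String description, Instant timestamp) {
            this.description = description;
            this.timestamp = timestamp;
        }
    }

    private Status status = Status.PENDING;
    private final List<Step> steps = new ArrayList<>();
    private final Map<String, Double> costByModel = new HashMap<>();

    /** Appends a timestamped step; the first step moves the task to PROCESSING. */
    void recordStep(String description) {
        if (status == Status.PENDING) {
            status = Status.PROCESSING;
        }
        steps.add(new Step(description, Instant.now()));
    }

    /** Adds the cost of one LLM call to the running total for that model. */
    void addCost(String model, double usd) {
        costByModel.merge(model, usd, Double::sum);
    }

    void complete() { status = Status.COMPLETE; }
    void fail() { status = Status.FAILED; }

    Status status() { return status; }
    List<Step> steps() { return List.copyOf(steps); }
    double costFor(String model) { return costByModel.getOrDefault(model, 0.0); }
    double totalCost() {
        return costByModel.values().stream().mapToDouble(Double::doubleValue).sum();
    }

    public static void main(String[] args) {
        TrackedTask task = new TrackedTask();
        task.recordStep("Extracting page text");
        task.addCost("model-a", 0.25); // hypothetical model name and cost
        task.addCost("model-a", 0.5);
        task.complete();
        System.out.println(task.status() + " after " + task.steps().size()
                + " step(s), total cost " + task.totalCost());
    }
}
```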

Parallelization Strategy

One Task per Unit: Each page/chapter gets its own task for maximum parallelism
Cloud Run Scaling: Multiple instances process tasks simultaneously
Independent Progress: Each task tracks progress independently
Fault Tolerance: Failure of one task doesn't affect others
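
The "one task per unit" idea and its fault isolation can be sketched in-process as shown below. A local thread pool stands in for multiple Cloud Run instances, which is an assumption made so the example is runnable on its own; `PerPageTasks` and `processPages` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Sketch: one independent task per page, with per-task outcomes. */
final class PerPageTasks {
    /** Submits one task per page and reports each page's outcome independently. */
    static Map<Integer, String> processPages(List<Integer> pages, Consumer<Integer> pageWork) {
        // Assumption: a local pool stands in for parallel Cloud Run instances.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Map<Integer, Future<?>> futures = new LinkedHashMap<>();
            for (Integer page : pages) {
                futures.put(page, pool.submit(() -> pageWork.accept(page)));
            }
            Map<Integer, String> outcomes = new LinkedHashMap<>();
            for (Map.Entry<Integer, Future<?>> entry : futures.entrySet()) {
                try {
                    entry.getValue().get();
                    outcomes.put(entry.getKey(), "COMPLETE");
                } catch (ExecutionException e) {
                    // A failing page is recorded as FAILED; other pages are unaffected.
                    outcomes.put(entry.getKey(), "FAILED");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    outcomes.put(entry.getKey(), "FAILED");
                }
            }
            return outcomes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> outcomes = processPages(List.of(1, 2, 3), page -> {
            if (page == 2) {
                throw new IllegalStateException("simulated failure on page 2");
            }
        });
        System.out.println(outcomes); // prints {1=COMPLETE, 2=FAILED, 3=COMPLETE}
    }
}
```

Because each page has its own task and its own outcome, a retry only needs to resubmit the failed page rather than the whole document.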

Performance Characteristics Comparison

Aspect	Cloud Run Jobs	ExecutorService
CPU Allocation	Full, dedicated	Throttled after response
Task Duration	Minutes to hours	Seconds to ~60s
Scalability	High (parallel jobs)	Limited (thread pool)
Isolation	Complete	Shared container
Overhead	Container startup	Minimal
Production Use	✅ Recommended	⚠️ Fallback only
Development	Optional	✅ Convenient

Implementation Status

Currently Implemented

✅ Hybrid architecture with automatic fallback
✅ Firestore task tracking with real-time updates
✅ Cloud Run Jobs integration for code applicability
✅ ExecutorService fallback for all async operations
✅ Step-by-step progress tracking
✅ Cost analysis metadata accumulation
✅ Frontend real-time progress display

Architecture Evolution

The system evolved from a pure ExecutorService design to the current hybrid approach:

Phase 1: ExecutorService only (CPU throttling issues discovered)
Phase 2: Cloud Run Jobs added as primary strategy
Phase 3: Hybrid approach with graceful fallback (current)

Security Considerations

Authentication: All gRPC calls require valid Firebase authentication
Authorization: RBAC service checks project permissions
Firestore Rules: Secure access to task documents by user
Data Validation: Validate all input parameters

Next Steps

For detailed implementation guides:

Cloud Run Jobs Pattern: See Cloud Run Jobs
ExecutorService Fallback: See ExecutorService Fallback
Complete Implementation: See Implementation Guide
