Query Task Timing Utility

Overview

The query-task-timing.sh script analyzes task timing data from Firestore to help optimize progress percentage calculations and identify performance bottlenecks in background tasks.

Script Location: cli/sdlc/utils/query-task-timing.sh

Purpose

This utility provides detailed timing analysis for background tasks (plan ingestion, code applicability, compliance report) by:

  • Analyzing step durations - Shows how long each step takes
  • Calculating proportions - Determines actual time distribution vs. assumed percentages
  • Identifying bottlenecks - Highlights slow steps that need optimization
  • Optimizing progress bars - Provides data to calibrate progress percentage calculations
  • Tracking performance - Compares timing across different task runs

Usage

Basic Syntax

./cli/sdlc/utils/query-task-timing.sh <taskId> [project]

Parameters:

  • taskId - Task ID to analyze (UUID format)
  • project - (Optional) GCP project ID (default: construction-code-expert-test)

Examples

Analyze Plan Ingestion Task

# Analyze plan ingestion task
./cli/sdlc/utils/query-task-timing.sh a7326bb7-9d32-442c-85bc-0807541f893d

# Analyze from dev environment
./cli/sdlc/utils/query-task-timing.sh a7326bb7-9d32-442c-85bc-0807541f893d construction-code-expert-dev

Output:

⏱️  Analyzing task timing data
📁 Project: construction-code-expert-test
🆔 Task ID: a7326bb7-9d32-442c-85bc-0807541f893d

📥 Fetching task data from Firestore...
✅ Task data fetched successfully!

📈 Analyzing Progress Steps...
📈 Progress Steps Analysis:
Step | Title | Duration (ms) | Duration (s) | Status
-----|-----------------------------------------------|---------------|-------------|--------
1 | Initializing plan ingestion | 1234 | 1.2 | COMPLETED
2 | Extracting pages from PDF | 45678 | 45.7 | COMPLETED
3 | Converting pages to markdown | 123456 | 123.5 | COMPLETED
4 | Analyzing building codes | 67890 | 67.9 | COMPLETED
5 | Generating compliance report | 34567 | 34.6 | COMPLETED

🎯 Duration Analysis:
Total Duration: 272825ms (272s)

📊 Proportional Progress Percentages:
Step | Current % | Actual Duration | Proportional % | Cumulative %
-----|-----------|-----------------|----------------|-------------
1 | 7% | 1234ms | 0% | 0%
2 | 15% | 45678ms | 16% | 16%
3 | 23% | 123456ms | 45% | 61%
4 | 30% | 67890ms | 24% | 85%
5 | 38% | 34567ms | 12% | 97%

✅ Analysis complete!

Analyze Code Applicability Task

./cli/sdlc/utils/query-task-timing.sh <code-applicability-task-id>

Analyze Compliance Report Task

./cli/sdlc/utils/query-task-timing.sh <compliance-report-task-id>

Output Explanation

Progress Steps Analysis

Shows each step with:

  • Step Number - Sequential step identifier
  • Title - Human-readable step name (from proto annotations)
  • Duration (ms) - Time taken in milliseconds
  • Duration (s) - Time taken in seconds
  • Status - Step completion status

Duration Analysis

  • Total Duration - Sum of all step durations
  • Helps identify which steps take the most time
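
As a quick sanity check against the example output, the total is simply the sum of the per-step durations:

# Sum of the example step durations (ms)
echo $((1234 + 45678 + 123456 + 67890 + 34567))   # 272825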

Proportional Progress Percentages

Compares current progress calculation with actual timing data:

  • Current % - Current progress percentage assigned to this step
  • Actual Duration - Measured time for this step
  • Proportional % - What percentage this step should get based on actual timing
  • Cumulative % - Running total of progress

Use Case: If "Proportional %" differs significantly from "Current %", you should adjust the progress percentage annotations in your proto file.
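
The proportional percentage is each step's share of the total duration. A minimal sketch in plain bash, using the example durations from the output above (it reproduces the proportional values shown there):

# Proportional % = step duration * 100 / total duration
DURATIONS=(1234 45678 123456 67890 34567)   # example durations (ms) from the output above
TOTAL=0
for D in "${DURATIONS[@]}"; do TOTAL=$((TOTAL + D)); done
for D in "${DURATIONS[@]}"; do
  echo "${D}ms -> $((D * 100 / TOTAL))%"
done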

Use Cases

1. Optimize Progress Bar Accuracy

When users report inaccurate progress bars:

# 1. Get task ID from logs or UI
TASK_ID="a7326bb7-9d32-442c-85bc-0807541f893d"

# 2. Analyze timing
./cli/sdlc/utils/query-task-timing.sh $TASK_ID

# 3. Compare "Current %" vs "Proportional %"
# If they differ significantly, update proto annotations

Example: If "Extract Pages" shows:

  • Current %: 15%
  • Proportional %: 45%

Then update task.proto:

EXTRACT_PAGES = 2 [
  (step_title) = "Extracting pages from PDF",
  (step_progress_percent) = 45  // Updated from 15 based on timing data
];

2. Identify Performance Bottlenecks

Find slow steps that need optimization:

# Analyze multiple task runs
for TASK_ID in $TASK_IDS; do
  echo "Analyzing $TASK_ID..."
  ./cli/sdlc/utils/query-task-timing.sh $TASK_ID | grep "Duration Analysis" -A 5
done

# Look for steps that consistently take >60 seconds

Action Items:

  • Steps >60s: Consider breaking into smaller steps
  • Steps >120s: Investigate for optimization opportunities
  • Steps with high variance: Check for external dependencies
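
To apply these thresholds automatically, you can filter the steps table for long-running steps. A minimal sketch, assuming the tabular output format shown in the example above (duration in milliseconds in the third pipe-separated column):

# Flag steps that took longer than 60 seconds
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 && $3+0 > 60000 {print "Slow step:", $2, "-", $3+0, "ms"}'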

3. Compare Task Performance

Compare timing across different environments or versions:

# Dev environment
./cli/sdlc/utils/query-task-timing.sh $TASK_ID construction-code-expert-dev > timing-dev.txt

# Test environment
./cli/sdlc/utils/query-task-timing.sh $TASK_ID construction-code-expert-test > timing-test.txt

# Compare
diff timing-dev.txt timing-test.txt

4. Track Performance Improvements

Before and after optimization:

# Before optimization
./cli/sdlc/utils/query-task-timing.sh $OLD_TASK_ID > timing-before.txt

# Deploy optimized code

# After optimization
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID > timing-after.txt

# Calculate improvement
echo "Before: $(grep "Total Duration" timing-before.txt)"
echo "After: $(grep "Total Duration" timing-after.txt)"

5. Debug Stuck Tasks

When a task appears stuck at a specific percentage:

# Analyze the stuck task
./cli/sdlc/utils/query-task-timing.sh $STUCK_TASK_ID

# Check:
# - Which step is it stuck on?
# - Is that step taking unusually long?
# - Are there any error messages?

# Cross-reference with logs
gcloud logging read "jsonPayload.taskId=$STUCK_TASK_ID" --limit 100

Integration with Other Tools

With fetch-firestore-object.sh

This script internally uses fetch-firestore-object.sh to retrieve task data:

# query-task-timing.sh calls:
./fetch-firestore-object.sh tasks $TASK_ID $PROJECT_ID

# Then parses the progress_steps array

With grep and awk for Custom Analysis

Extract specific timing data:

# Get just step durations
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 {print $3}' | \
  xargs

# Calculate average step duration
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 {sum+=$3; count++} END {print "Average:", sum/count "ms"}'

In Performance Monitoring Scripts

#!/bin/bash
# monitor-task-performance.sh

TASK_IDS=$(gcloud firestore documents list tasks --filter="status=COMPLETED" --limit 10 --format="value(name)" | cut -d'/' -f6)

echo "Task ID,Total Duration (ms),Slowest Step,Slowest Duration (ms)"

for TASK_ID in $TASK_IDS; do
  OUTPUT=$(./cli/sdlc/utils/query-task-timing.sh "$TASK_ID" 2>&1)

  TOTAL=$(echo "$OUTPUT" | grep "Total Duration:" | awk '{print $3}' | sed 's/ms//')
  # Restrict to the steps table, then sort by the duration column (descending)
  SLOWEST=$(echo "$OUTPUT" | grep -A 100 "Progress Steps Analysis" | awk -F'|' '/^$/ {exit} NR>3' | sort -t'|' -k3 -nr | head -1)
  SLOWEST_STEP=$(echo "$SLOWEST" | awk -F'|' '{print $2}' | xargs)
  SLOWEST_DURATION=$(echo "$SLOWEST" | awk -F'|' '{print $3}' | xargs)

  echo "$TASK_ID,$TOTAL,$SLOWEST_STEP,$SLOWEST_DURATION"
done

Understanding Progress Steps

Firestore Data Structure

Progress steps are stored as an array of maps in Firestore:

{
  "progress_steps": {
    "arrayValue": {
      "values": [
        {
          "mapValue": {
            "fields": {
              "stepNumber": {"doubleValue": 1},
              "name": {"stringValue": "Initializing plan ingestion"},
              "durationMs": {"stringValue": "1234"},
              "status": {"stringValue": "COMPLETED"}
            }
          }
        }
      ]
    }
  }
}
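
To pull timing fields out of this structure yourself, a short jq filter works. A minimal sketch, assuming the JSON above is saved as task.json exactly as shown (adjust the leading path if your document nests progress_steps under a top-level fields key):

# Flatten progress steps to CSV: stepNumber, name, durationMs, status
jq -r '
  .progress_steps.arrayValue.values[]
  | .mapValue.fields
  | [.stepNumber.doubleValue, .name.stringValue, .durationMs.stringValue, .status.stringValue]
  | @csv
' task.json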

Step Status Values

  • PENDING - Step not yet started
  • IN_PROGRESS - Step currently executing
  • COMPLETED - Step finished successfully
  • FAILED - Step failed with error
  • SKIPPED - Step skipped (conditional logic)
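
Continuing the hypothetical task.json example above, you can filter on these status values, for instance to list steps that did not finish:

# Show steps whose status is anything other than COMPLETED
jq -r '
  .progress_steps.arrayValue.values[]
  | .mapValue.fields
  | select(.status.stringValue != "COMPLETED")
  | .name.stringValue + " (" + .status.stringValue + ")"
' task.json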

Best Practices

1. Analyze Completed Tasks

For accurate timing data, analyze completed tasks:

# Good - completed task
./cli/sdlc/utils/query-task-timing.sh $COMPLETED_TASK_ID

# Less useful - in-progress task (incomplete timing data)
./cli/sdlc/utils/query-task-timing.sh $IN_PROGRESS_TASK_ID

2. Sample Multiple Tasks

Don't optimize based on a single task run:

# Analyze 5-10 tasks of the same type
for TASK_ID in "${TASK_IDS[@]}"; do   # TASK_IDS: array holding 5-10 task IDs of the same type
  ./cli/sdlc/utils/query-task-timing.sh "$TASK_ID" >> timing-analysis.txt
done

# Calculate averages
grep "Total Duration" timing-analysis.txt | awk '{sum+=$3} END {print "Average:", sum/NR "ms"}'

3. Document Findings

Create a timing report:

# Generate report
cat > timing-report.md <<EOF
# Task Timing Analysis - $(date +%Y-%m-%d)

## Task Type: Plan Ingestion

### Sample Size: 10 tasks

### Findings:
$(./cli/sdlc/utils/query-task-timing.sh $TASK_ID)

### Recommendations:
- Step 3 (Convert to Markdown) takes 45% of total time
- Consider parallel processing for page conversion
- Current progress % should be adjusted from 23% to 45%
EOF

4. Update Proto Annotations

Based on timing analysis, update progress percentages in proto files:

// task.proto
enum PlanIngestionStep {
  INITIALIZE = 1 [
    (step_title) = "Initializing plan ingestion",
    (step_progress_percent) = 1  // Takes <1% of time
  ];

  EXTRACT_PAGES = 2 [
    (step_title) = "Extracting pages from PDF",
    (step_progress_percent) = 16  // Takes ~16% of time
  ];

  CONVERT_TO_MARKDOWN = 3 [
    (step_title) = "Converting pages to markdown",
    (step_progress_percent) = 45  // Takes ~45% of time (UPDATED!)
  ];

  // ... other steps
}

Troubleshooting

No progress_steps Found

❌ No progress_steps found in task data
💡 Available fields:
[
"taskId",
"taskType",
"status",
"progressPercent"
]

Possible Causes:

  1. Task is too old (before progress tracking was added)
  2. Task type doesn't support step-by-step progress
  3. Task failed before any steps were recorded

Solution: Use a more recent task or check if the task type supports progress steps.
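
One quick way to confirm which fields a task document actually contains is to fetch it and list its keys. A rough sketch, assuming fetch-firestore-object.sh prints the document as a JSON object (the exact shape of its output may differ):

# List the top-level fields on the task document
./cli/sdlc/utils/fetch-firestore-object.sh tasks $TASK_ID $PROJECT_ID | jq 'keys'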

Invalid Task ID

❌ Failed to fetch task data:
❌ Document not found or access denied

Solution: Verify the task ID is correct and exists in the specified project.

bc Command Not Found

bash: bc: command not found

Solution: Install bc (basic calculator):

# macOS
brew install bc

# Ubuntu/Debian
sudo apt-get install bc

Advanced Usage

Export to CSV

# Export timing data to CSV
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 {print $1","$2","$3","$4","$5}' > timing-data.csv

Aggregate Multiple Tasks

#!/bin/bash
# aggregate-timing.sh - Analyze multiple tasks

TASK_IDS=(
  "task-id-1"
  "task-id-2"
  "task-id-3"
)

echo "Step,Avg Duration (ms),Min Duration (ms),Max Duration (ms)"

for STEP in {1..13}; do
  DURATIONS=()

  for TASK_ID in "${TASK_IDS[@]}"; do
    # -m 1 keeps only the row from the steps table (the proportional table repeats the step number)
    DURATION=$(./cli/sdlc/utils/query-task-timing.sh "$TASK_ID" | \
      grep -m 1 "^[[:space:]]*$STEP |" | \
      awk -F'|' '{print $3}' | \
      xargs)

    if [[ -n "$DURATION" ]]; then
      DURATIONS+=("$DURATION")
    fi
  done

  if [[ ${#DURATIONS[@]} -gt 0 ]]; then
    AVG=$(echo "${DURATIONS[@]}" | awk '{sum=0; for(i=1;i<=NF;i++)sum+=$i; print sum/NF}')
    MIN=$(echo "${DURATIONS[@]}" | tr ' ' '\n' | sort -n | head -1)
    MAX=$(echo "${DURATIONS[@]}" | tr ' ' '\n' | sort -n | tail -1)

    echo "$STEP,$AVG,$MIN,$MAX"
  fi
done

Compare Task Types

# Compare plan ingestion vs compliance report
echo "Plan Ingestion:"
./cli/sdlc/utils/query-task-timing.sh $PLAN_INGESTION_TASK_ID | grep "Total Duration"

echo "Compliance Report:"
./cli/sdlc/utils/query-task-timing.sh $COMPLIANCE_REPORT_TASK_ID | grep "Total Duration"

Understanding the Analysis

Current % vs Proportional %

Current %: The progress percentage currently assigned in code (proto annotations)

Proportional %: What the percentage should be based on actual timing data

Example:

Step | Current % | Actual Duration | Proportional % | Cumulative %
-----|-----------|-----------------|----------------|-------------
3 | 23% | 123456ms | 45% | 61%

Interpretation:

  • Step 3 is currently assigned 23% of progress
  • But it actually takes 45% of total time
  • This causes the progress bar to move slowly during this step
  • Action: Update proto annotation from 23% to 45%

Cumulative Progress

Shows the running total of progress percentages:

Good Distribution (cumulative reaches ~100%):

Step 1: 10% cumulative
Step 2: 25% cumulative
Step 3: 70% cumulative
Step 4: 95% cumulative
Step 5: 100% cumulative

Poor Distribution (cumulative doesn't match reality):

Step 1: 5% cumulative   (too low)
Step 2: 10% cumulative (too low)
Step 3: 55% cumulative (big jump - step 3 is slow)
Step 4: 60% cumulative (too low)
Step 5: 100% cumulative (big jump at end)

Performance Optimization Workflow

1. Collect Baseline Data

# Run timing analysis on current implementation
./cli/sdlc/utils/query-task-timing.sh $TASK_ID > baseline-timing.txt

# Identify slowest steps
grep -A 100 "Progress Steps Analysis" baseline-timing.txt | awk -F'|' '/^$/ {exit} NR>3' | sort -t'|' -k3 -nr | head -5

2. Optimize Slow Steps

Focus on steps that take >30% of total time:

# If "Convert to Markdown" takes 45% of time:
# - Consider parallel processing
# - Optimize markdown conversion logic
# - Cache intermediate results

3. Measure Improvement

# After optimization
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID > optimized-timing.txt

# Compare
echo "Before:"
grep "Total Duration" baseline-timing.txt

echo "After:"
grep "Total Duration" optimized-timing.txt

# Calculate improvement percentage
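# A minimal sketch (assumes each file contains a "Total Duration: <N>ms" line):
BEFORE_MS=$(grep "Total Duration:" baseline-timing.txt | grep -Eo '[0-9]+' | head -1)
AFTER_MS=$(grep "Total Duration:" optimized-timing.txt | grep -Eo '[0-9]+' | head -1)
echo "Improvement: $(echo "scale=1; ($BEFORE_MS - $AFTER_MS) * 100 / $BEFORE_MS" | bc)%"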

4. Update Progress Percentages

Based on new timing data:

# Get proportional percentages
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID | \
grep -A 100 "Proportional Progress Percentages"

# Update proto file with new percentages
# Regenerate code: mvn clean protobuf:compile protobuf:compile-custom

Prerequisites

  • gcloud CLI - Must be authenticated
  • jq - JSON processor
  • bc - Basic calculator (for percentage calculations)
  • fetch-firestore-object.sh - Required dependency (same directory)

Installation

# Install prerequisites
brew install jq bc # macOS
sudo apt-get install jq bc # Ubuntu/Debian

# Authenticate
gcloud auth login
gcloud config set project construction-code-expert-test

See Also