Query Task Timing Utility

Overview

The query-task-timing.sh script analyzes task timing data from Firestore to help optimize progress percentage calculations and identify performance bottlenecks in background tasks.

Script Location: cli/sdlc/utils/query-task-timing.sh

Purpose

This utility provides detailed timing analysis for background tasks (plan ingestion, code applicability, compliance report) by:

  • Analyzing step durations - Shows how long each step takes
  • Calculating proportions - Determines actual time distribution vs. assumed percentages
  • Identifying bottlenecks - Highlights slow steps that need optimization
  • Optimizing progress bars - Provides data to calibrate progress percentage calculations
  • Tracking performance - Compares timing across different task runs

Usage

Basic Syntax

./cli/sdlc/utils/query-task-timing.sh <taskId> [project]

Parameters:

  • taskId - Task ID to analyze (UUID format)
  • project - (Optional) GCP project ID (default: construction-code-expert-test)

Examples

Analyze Plan Ingestion Task

# Analyze plan ingestion task
./cli/sdlc/utils/query-task-timing.sh a7326bb7-9d32-442c-85bc-0807541f893d

# Analyze from dev environment
./cli/sdlc/utils/query-task-timing.sh a7326bb7-9d32-442c-85bc-0807541f893d construction-code-expert-dev

Output:

⏱️  Analyzing task timing data
📁 Project: construction-code-expert-test
🆔 Task ID: a7326bb7-9d32-442c-85bc-0807541f893d

📥 Fetching task data from Firestore...
✅ Task data fetched successfully!

📈 Analyzing Progress Steps...
📈 Progress Steps Analysis:
Step | Title | Duration (ms) | Duration (s) | Status
-----|-----------------------------------------------|---------------|-------------|--------
1 | Initializing plan ingestion | 1234 | 1.2 | COMPLETED
2 | Extracting pages from PDF | 45678 | 45.7 | COMPLETED
3 | Converting pages to markdown | 123456 | 123.5 | COMPLETED
4 | Analyzing building codes | 67890 | 67.9 | COMPLETED
5 | Generating compliance report | 34567 | 34.6 | COMPLETED

🎯 Duration Analysis:
Total Duration: 272825ms (272s)

📊 Proportional Progress Percentages:
Step | Current % | Actual Duration | Proportional % | Cumulative %
-----|-----------|-----------------|----------------|-------------
1 | 7% | 1234ms | 0% | 0%
2 | 15% | 45678ms | 16% | 16%
3 | 23% | 123456ms | 45% | 61%
4 | 30% | 67890ms | 24% | 85%
5 | 38% | 34567ms | 12% | 97%

✅ Analysis complete!

Analyze Code Applicability Task

./cli/sdlc/utils/query-task-timing.sh <code-applicability-task-id>

Analyze Compliance Report Task

./cli/sdlc/utils/query-task-timing.sh <compliance-report-task-id>

Output Explanation

Progress Steps Analysis

Shows each step with:

  • Step Number - Sequential step identifier
  • Title - Human-readable step name (from proto annotations)
  • Duration (ms) - Time taken in milliseconds
  • Duration (s) - Time taken in seconds
  • Status - Step completion status

Duration Analysis

  • Total Duration - Sum of all step durations
  • Helps identify which steps take the most time
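
As a quick sanity check against the example output, the total is simply the sum of the per-step durations:

# Sum of the example step durations (ms)
echo $((1234 + 45678 + 123456 + 67890 + 34567))   # 272825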

Proportional Progress Percentages

Compares current progress calculation with actual timing data:

  • Current % - Current progress percentage assigned to this step
  • Actual Duration - Measured time for this step
  • Proportional % - What percentage this step should get based on actual timing
  • Cumulative % - Running total of progress

Use Case: If "Proportional %" differs significantly from "Current %", you should adjust the progress percentage annotations in your proto file.
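
The proportional percentage is each step's share of the total duration. A minimal sketch in plain bash, using the example durations from the output above (it reproduces the proportional values shown there):

# Proportional % = step duration * 100 / total duration
DURATIONS=(1234 45678 123456 67890 34567)   # example durations (ms) from the output above
TOTAL=0
for D in "${DURATIONS[@]}"; do TOTAL=$((TOTAL + D)); done
for D in "${DURATIONS[@]}"; do
  echo "${D}ms -> $((D * 100 / TOTAL))%"
done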

Use Cases

1. Optimize Progress Bar Accuracy

When users report inaccurate progress bars:

# 1. Get task ID from logs or UI
TASK_ID="a7326bb7-9d32-442c-85bc-0807541f893d"

# 2. Analyze timing
./cli/sdlc/utils/query-task-timing.sh $TASK_ID

# 3. Compare "Current %" vs "Proportional %"
# If they differ significantly, update proto annotations

Example: If "Extract Pages" shows:

  • Current %: 15%
  • Proportional %: 45%

Then update task.proto:

EXTRACT_PAGES = 2 [
  (step_title) = "Extracting pages from PDF",
  (step_progress_percent) = 45  // Updated from 15 based on timing data
];

2. Identify Performance Bottlenecks

Find slow steps that need optimization:

# Analyze multiple task runs
for TASK_ID in $TASK_IDS; do
  echo "Analyzing $TASK_ID..."
  ./cli/sdlc/utils/query-task-timing.sh $TASK_ID | grep "Duration Analysis" -A 5
done

# Look for steps that consistently take >60 seconds

Action Items:

  • Steps >60s: Consider breaking into smaller steps
  • Steps >120s: Investigate for optimization opportunities
  • Steps with high variance: Check for external dependencies
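
To apply these thresholds automatically, you can filter the steps table for long-running steps. A minimal sketch, assuming the tabular output format shown in the example above (duration in milliseconds in the third pipe-separated column):

# Flag steps that took longer than 60 seconds
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 && $3+0 > 60000 {print "Slow step:", $2, "-", $3+0, "ms"}'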

3. Compare Task Performance

Compare timing across different environments or versions:

# Dev environment
./cli/sdlc/utils/query-task-timing.sh $TASK_ID construction-code-expert-dev > timing-dev.txt

# Test environment
./cli/sdlc/utils/query-task-timing.sh $TASK_ID construction-code-expert-test > timing-test.txt

# Compare
diff timing-dev.txt timing-test.txt

4. Track Performance Improvements

Before and after optimization:

# Before optimization
./cli/sdlc/utils/query-task-timing.sh $OLD_TASK_ID > timing-before.txt

# Deploy optimized code

# After optimization
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID > timing-after.txt

# Calculate improvement
echo "Before: $(grep "Total Duration" timing-before.txt)"
echo "After: $(grep "Total Duration" timing-after.txt)"

5. Debug Stuck Tasks

When a task appears stuck at a specific percentage:

# Analyze the stuck task
./cli/sdlc/utils/query-task-timing.sh $STUCK_TASK_ID

# Check:
# - Which step is it stuck on?
# - Is that step taking unusually long?
# - Are there any error messages?

# Cross-reference with logs
gcloud logging read "jsonPayload.taskId=$STUCK_TASK_ID" --limit 100

Integration with Other Tools

With fetch-firestore-object.sh

This script internally uses fetch-firestore-object.sh to retrieve task data:

# query-task-timing.sh calls:
./fetch-firestore-object.sh tasks $TASK_ID $PROJECT_ID

# Then parses the progress_steps array

With grep and awk for Custom Analysis

Extract specific timing data:

# Get just step durations
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 {print $3}' | \
  xargs

# Calculate average step duration
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 {sum+=$3; count++} END {print "Average:", sum/count "ms"}'

In Performance Monitoring Scripts

#!/bin/bash
# monitor-task-performance.sh

TASK_IDS=$(gcloud firestore documents list tasks --filter="status=COMPLETED" --limit 10 --format="value(name)" | cut -d'/' -f6)

echo "Task ID,Total Duration (ms),Slowest Step,Slowest Duration (ms)"

for TASK_ID in $TASK_IDS; do
  OUTPUT=$(./cli/sdlc/utils/query-task-timing.sh "$TASK_ID" 2>&1)

  TOTAL=$(echo "$OUTPUT" | grep "Total Duration:" | awk '{print $3}' | sed 's/ms//')
  # Restrict to the steps table, then sort by the duration column (descending)
  SLOWEST=$(echo "$OUTPUT" | grep -A 100 "Progress Steps Analysis" | awk -F'|' '/^$/ {exit} NR>3' | sort -t'|' -k3 -nr | head -1)
  SLOWEST_STEP=$(echo "$SLOWEST" | awk -F'|' '{print $2}' | xargs)
  SLOWEST_DURATION=$(echo "$SLOWEST" | awk -F'|' '{print $3}' | xargs)

  echo "$TASK_ID,$TOTAL,$SLOWEST_STEP,$SLOWEST_DURATION"
done

Understanding Progress Steps

Firestore Data Structure

Progress steps are stored as an array of maps in Firestore:

{
  "progress_steps": {
    "arrayValue": {
      "values": [
        {
          "mapValue": {
            "fields": {
              "stepNumber": {"doubleValue": 1},
              "name": {"stringValue": "Initializing plan ingestion"},
              "durationMs": {"stringValue": "1234"},
              "status": {"stringValue": "COMPLETED"}
            }
          }
        }
      ]
    }
  }
}
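
To pull timing fields out of this structure yourself, a short jq filter works. A minimal sketch, assuming the JSON above is saved as task.json exactly as shown (adjust the leading path if your document nests progress_steps under a top-level fields key):

# Flatten progress steps to CSV: stepNumber, name, durationMs, status
jq -r '
  .progress_steps.arrayValue.values[]
  | .mapValue.fields
  | [.stepNumber.doubleValue, .name.stringValue, .durationMs.stringValue, .status.stringValue]
  | @csv
' task.json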

Step Status Values

  • PENDING - Step not yet started
  • IN_PROGRESS - Step currently executing
  • COMPLETED - Step finished successfully
  • FAILED - Step failed with error
  • SKIPPED - Step skipped (conditional logic)
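
Continuing the hypothetical task.json example above, you can filter on these status values, for instance to list steps that did not finish:

# Show steps whose status is anything other than COMPLETED
jq -r '
  .progress_steps.arrayValue.values[]
  | .mapValue.fields
  | select(.status.stringValue != "COMPLETED")
  | .name.stringValue + " (" + .status.stringValue + ")"
' task.json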

Best Practices

1. Analyze Completed Tasks

For accurate timing data, analyze completed tasks:

# Good - completed task
./cli/sdlc/utils/query-task-timing.sh $COMPLETED_TASK_ID

# Less useful - in-progress task (incomplete timing data)
./cli/sdlc/utils/query-task-timing.sh $IN_PROGRESS_TASK_ID

2. Sample Multiple Tasks

Don't optimize based on a single task run:

# Analyze 5-10 tasks of the same type
for TASK_ID in "${TASK_IDS[@]}"; do   # TASK_IDS: array holding 5-10 task IDs of the same type
  ./cli/sdlc/utils/query-task-timing.sh "$TASK_ID" >> timing-analysis.txt
done

# Calculate averages
grep "Total Duration" timing-analysis.txt | awk '{sum+=$3} END {print "Average:", sum/NR "ms"}'

3. Document Findings

Create a timing report:

# Generate report
cat > timing-report.md <<EOF
# Task Timing Analysis - $(date +%Y-%m-%d)

## Task Type: Plan Ingestion

### Sample Size: 10 tasks

### Findings:
$(./cli/sdlc/utils/query-task-timing.sh $TASK_ID)

### Recommendations:
- Step 3 (Convert to Markdown) takes 45% of total time
- Consider parallel processing for page conversion
- Current progress % should be adjusted from 23% to 45%
EOF

4. Update Proto Annotations

Based on timing analysis, update progress percentages in proto files:

// task.proto
enum PlanIngestionStep {
  INITIALIZE = 1 [
    (step_title) = "Initializing plan ingestion",
    (step_progress_percent) = 1  // Takes <1% of time
  ];

  EXTRACT_PAGES = 2 [
    (step_title) = "Extracting pages from PDF",
    (step_progress_percent) = 16  // Takes ~16% of time
  ];

  CONVERT_TO_MARKDOWN = 3 [
    (step_title) = "Converting pages to markdown",
    (step_progress_percent) = 45  // Takes ~45% of time (UPDATED!)
  ];

  // ... other steps
}

Troubleshooting

No progress_steps Found

❌ No progress_steps found in task data
💡 Available fields:
[
"taskId",
"taskType",
"status",
"progressPercent"
]

Possible Causes:

  1. Task is too old (before progress tracking was added)
  2. Task type doesn't support step-by-step progress
  3. Task failed before any steps were recorded

Solution: Use a more recent task or check if the task type supports progress steps.
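
One quick way to confirm which fields a task document actually contains is to fetch it and list its keys. A rough sketch, assuming fetch-firestore-object.sh prints the document as a JSON object (the exact shape of its output may differ):

# List the top-level fields on the task document
./cli/sdlc/utils/fetch-firestore-object.sh tasks $TASK_ID $PROJECT_ID | jq 'keys'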

Invalid Task ID

❌ Failed to fetch task data:
❌ Document not found or access denied

Solution: Verify the task ID is correct and exists in the specified project.

bc Command Not Found

bash: bc: command not found

Solution: Install bc (basic calculator):

# macOS
brew install bc

# Ubuntu/Debian
sudo apt-get install bc

Advanced Usage

Export to CSV

# Export timing data to CSV
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
  grep -A 100 "Progress Steps Analysis" | \
  awk -F'|' '/^$/ {exit} NR>3 {print $1","$2","$3","$4","$5}' > timing-data.csv

Aggregate Multiple Tasks

#!/bin/bash
# aggregate-timing.sh - Analyze multiple tasks

TASK_IDS=(
  "task-id-1"
  "task-id-2"
  "task-id-3"
)

echo "Step,Avg Duration (ms),Min Duration (ms),Max Duration (ms)"

for STEP in {1..13}; do
  DURATIONS=()

  for TASK_ID in "${TASK_IDS[@]}"; do
    # -m 1 keeps only the row from the steps table (the proportional table repeats the step number)
    DURATION=$(./cli/sdlc/utils/query-task-timing.sh "$TASK_ID" | \
      grep -m 1 "^[[:space:]]*$STEP |" | \
      awk -F'|' '{print $3}' | \
      xargs)

    if [[ -n "$DURATION" ]]; then
      DURATIONS+=("$DURATION")
    fi
  done

  if [[ ${#DURATIONS[@]} -gt 0 ]]; then
    AVG=$(echo "${DURATIONS[@]}" | awk '{sum=0; for(i=1;i<=NF;i++)sum+=$i; print sum/NF}')
    MIN=$(echo "${DURATIONS[@]}" | tr ' ' '\n' | sort -n | head -1)
    MAX=$(echo "${DURATIONS[@]}" | tr ' ' '\n' | sort -n | tail -1)

    echo "$STEP,$AVG,$MIN,$MAX"
  fi
done

Compare Task Types

# Compare plan ingestion vs compliance report
echo "Plan Ingestion:"
./cli/sdlc/utils/query-task-timing.sh $PLAN_INGESTION_TASK_ID | grep "Total Duration"

echo "Compliance Report:"
./cli/sdlc/utils/query-task-timing.sh $COMPLIANCE_REPORT_TASK_ID | grep "Total Duration"

Understanding the Analysis

Current % vs Proportional %

Current %: The progress percentage currently assigned in code (proto annotations)

Proportional %: What the percentage should be based on actual timing data

Example:

Step | Current % | Actual Duration | Proportional % | Cumulative %
-----|-----------|-----------------|----------------|-------------
3 | 23% | 123456ms | 45% | 61%

Interpretation:

  • Step 3 is currently assigned 23% of progress
  • But it actually takes 45% of total time
  • This causes the progress bar to move slowly during this step
  • Action: Update proto annotation from 23% to 45%

Cumulative Progress

Shows the running total of progress percentages:

Good Distribution (cumulative reaches ~100%):

Step 1: 10% cumulative
Step 2: 25% cumulative
Step 3: 70% cumulative
Step 4: 95% cumulative
Step 5: 100% cumulative

Poor Distribution (cumulative doesn't match reality):

Step 1: 5% cumulative   (too low)
Step 2: 10% cumulative (too low)
Step 3: 55% cumulative (big jump - step 3 is slow)
Step 4: 60% cumulative (too low)
Step 5: 100% cumulative (big jump at end)

Performance Optimization Workflow

1. Collect Baseline Data

# Run timing analysis on current implementation
./cli/sdlc/utils/query-task-timing.sh $TASK_ID > baseline-timing.txt

# Identify slowest steps
grep -A 100 "Progress Steps Analysis" baseline-timing.txt | awk -F'|' '/^$/ {exit} NR>3' | sort -t'|' -k3 -nr | head -5

2. Optimize Slow Steps

Focus on steps that take >30% of total time:

# If "Convert to Markdown" takes 45% of time:
# - Consider parallel processing
# - Optimize markdown conversion logic
# - Cache intermediate results

3. Measure Improvement

# After optimization
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID > optimized-timing.txt

# Compare
echo "Before:"
grep "Total Duration" baseline-timing.txt

echo "After:"
grep "Total Duration" optimized-timing.txt

# Calculate improvement percentage
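# A minimal sketch (assumes each file contains a "Total Duration: <N>ms" line):
BEFORE_MS=$(grep "Total Duration:" baseline-timing.txt | grep -Eo '[0-9]+' | head -1)
AFTER_MS=$(grep "Total Duration:" optimized-timing.txt | grep -Eo '[0-9]+' | head -1)
echo "Improvement: $(echo "scale=1; ($BEFORE_MS - $AFTER_MS) * 100 / $BEFORE_MS" | bc)%"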

4. Update Progress Percentages

Based on new timing data:

# Get proportional percentages
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID | \
grep -A 100 "Proportional Progress Percentages"

# Update proto file with new percentages
# Regenerate code: mvn clean protobuf:compile protobuf:compile-custom

Prerequisites

  • gcloud CLI - Must be authenticated
  • jq - JSON processor
  • bc - Basic calculator (for percentage calculations)
  • fetch-firestore-object.sh - Required dependency (same directory)

Installation

# Install prerequisites
brew install jq bc # macOS
sudo apt-get install jq bc # Ubuntu/Debian

# Authenticate
gcloud auth login
gcloud config set project construction-code-expert-test

See Also