Stats Sync-First Resolution Policy
StatsOrchestrator.resolve() follows a sync-first policy: on a store miss it attempts a
bounded synchronous capture before falling back to an async reconcile job. This lets queries
receive fresh statistics within a latency budget rather than always degrading to no-stats-available
on a cold cache.
Resolution flow
resolve(StatsCaptureRequest)
│
├─ store read ──────────────────────────────────────── HIT (stats returned, no capture)
│
└─ miss
   │
   ├─ executionMode == SYNC
   │    && latencyBudget present
   │    && floecat.stats.sync.enabled == true
   │    │
   │    ├─ enqueue CAPTURE_ONLY reconcile job
   │    │    poll every 100 ms until terminal or budget exceeded
   │    │    │
   │    │    ├─ job succeeded → re-read store
   │    │    │    ├─ record present ───────── CAPTURED (stats returned, no follow-up)
   │    │    │    └─ record absent ────────── PARTIAL (async follow-up enqueued)
   │    │    │
   │    │    ├─ budget exceeded ──────────────── TIMEOUT (async follow-up enqueued)
   │    │    └─ job failed / no connector ────── FAILED (async follow-up enqueued)
   │
   └─ otherwise (ASYNC mode / no budget / sync disabled)
        └────────────────────────────────────── SKIPPED (async job enqueued)
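The decision tree above can be sketched as a small simulation. This is an illustrative model, not the actual StatsOrchestrator code: the `Request` class, the dict-backed store, and the `job_state` / `enqueue_async` / `sleep_ms` callables are all hypothetical stand-ins, and only the 100 ms poll interval and the six outcome names come from the flow itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    HIT = "HIT"
    CAPTURED = "CAPTURED"
    PARTIAL = "PARTIAL"
    TIMEOUT = "TIMEOUT"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"


@dataclass
class Request:  # hypothetical stand-in for StatsCaptureRequest
    table: str
    mode: str = "SYNC"               # "SYNC" or "ASYNC"
    budget_ms: Optional[int] = None  # latency budget, if any


POLL_MS = 100  # poll interval taken from the flow above


def resolve(req, store, sync_enabled, job_state, enqueue_async,
            sleep_ms=lambda ms: None):
    """Return (outcome, stats) for one request.

    store:         dict mapping table name -> stats record (assumed shape)
    job_state:     callable returning "RUNNING", "SUCCEEDED" or "FAILED"
    enqueue_async: callable invoked whenever an async job is needed
    sleep_ms:      injectable wait so the sketch can run without real delays
    """
    record = store.get(req.table)
    if record is not None:
        return Outcome.HIT, record

    # Sync capture needs SYNC mode, a budget, and the master switch.
    if not (req.mode == "SYNC" and req.budget_ms is not None and sync_enabled):
        enqueue_async(req.table)
        return Outcome.SKIPPED, None

    # Poll the capture job until it is terminal or the budget is spent.
    waited = 0
    state = job_state()
    while state == "RUNNING" and waited < req.budget_ms:
        sleep_ms(POLL_MS)
        waited += POLL_MS
        state = job_state()

    if state == "SUCCEEDED":
        record = store.get(req.table)  # re-read the store
        if record is not None:
            return Outcome.CAPTURED, record
        outcome = Outcome.PARTIAL
    elif state == "FAILED":
        outcome = Outcome.FAILED
    else:
        outcome = Outcome.TIMEOUT

    # PARTIAL, FAILED and TIMEOUT all enqueue an async follow-up.
    enqueue_async(req.table)
    return outcome, None
```

Calling `resolve(Request("orders", budget_ms=300), {}, True, lambda: "RUNNING", print)` walks the TIMEOUT branch: the job never reaches a terminal state, so the budget runs out after three polls and the follow-up is enqueued.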
Outcome reference
| Outcome | Stats present | Async follow-up | When |
|---|---|---|---|
| HIT | yes | no | Record already in store |
| CAPTURED | yes | no | Sync capture succeeded; record now in store |
| PARTIAL | no | yes | Sync capture succeeded but record not yet visible |
| TIMEOUT | no | yes | Sync capture exceeded latency budget |
| FAILED | no | yes | Sync capture error or no upstream connector |
| SKIPPED | no | yes | ASYNC mode, no budget, or sync disabled |
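For callers that branch on an outcome, the table can be held as a lookup. The sketch below is only an illustration: `OutcomeTraits`, `OUTCOME_TRAITS`, and `needs_follow_up` are hypothetical names, and the values are copied from the table rather than from any real API.

```python
from typing import NamedTuple


class OutcomeTraits(NamedTuple):
    stats_present: bool
    async_follow_up: bool


# Values transcribed from the outcome table above.
OUTCOME_TRAITS = {
    "HIT": OutcomeTraits(stats_present=True, async_follow_up=False),
    "CAPTURED": OutcomeTraits(stats_present=True, async_follow_up=False),
    "PARTIAL": OutcomeTraits(stats_present=False, async_follow_up=True),
    "TIMEOUT": OutcomeTraits(stats_present=False, async_follow_up=True),
    "FAILED": OutcomeTraits(stats_present=False, async_follow_up=True),
    "SKIPPED": OutcomeTraits(stats_present=False, async_follow_up=True),
}


def needs_follow_up(outcome: str) -> bool:
    """True when the outcome implies an async job was enqueued."""
    return OUTCOME_TRAITS[outcome].async_follow_up
```

One property visible in the table is that every outcome either returns stats or enqueues async work, never both and never neither.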
Configuration
| Key | Default | Description |
|---|---|---|
| floecat.stats.sync.enabled | true | Master switch for sync-first resolution |
| floecat.stats.sync.latency-budget | 1s | Per-request sync capture budget |
| floecat.stats.sync.max-latency-budget | 10s | Hard ceiling applied to any request budget |
Environment variable overrides follow the pattern FLOECAT_STATS_SYNC_ENABLED,
FLOECAT_STATS_SYNC_LATENCY_BUDGET, FLOECAT_STATS_SYNC_MAX_LATENCY_BUDGET.
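The three variable names listed are consistent with a simple rule: replace every non-alphanumeric character in the key with an underscore, then uppercase the result. The sketch below applies that rule and also shows one reading of the "hard ceiling" row, namely clamping the requested budget to the maximum. Both helper names are hypothetical, and the clamp is an assumption about how the ceiling is applied rather than confirmed behavior.

```python
import re
from datetime import timedelta


def env_var_name(key: str) -> str:
    """Derive the override variable name from a config key.

    Rule inferred from the three names listed above: non-alphanumeric
    characters become underscores and the result is uppercased.
    """
    return re.sub(r"[^A-Za-z0-9]", "_", key).upper()


def effective_budget(requested: timedelta, ceiling: timedelta) -> timedelta:
    """Clamp a request budget to the ceiling (assumed semantics)."""
    return min(requested, ceiling)
```

Under these assumptions, a request asking for 30 s with the default ceiling of 10 s would run with a 10 s sync capture budget, while the default 1 s budget passes through unchanged.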
Planner integration
StatsProviderFactory reads floecat.stats.sync.enabled and floecat.stats.sync.latency-budget
at startup and sets executionMode(SYNC) and latencyBudget(...) on every
StatsCaptureRequest it creates for query-time resolution. Set floecat.stats.sync.enabled=false
to revert to async-only resolution for all queries without changing other behavior.
PlannerStatsBundleService maps StatsResolutionResult outcomes to planner quality:
- HIT / CAPTURED → FOUND
- TIMEOUT / FAILED / SKIPPED → NOT_FOUND (planner proceeds without stats)
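The mapping can be written as a lookup table. This is a sketch with hypothetical names, not the PlannerStatsBundleService code. The mapping above does not list PARTIAL; the sketch treats it as NOT_FOUND because a PARTIAL result carries no stats, and that choice is an assumption.

```python
# Outcome -> planner quality, transcribed from the mapping above.
PLANNER_QUALITY = {
    "HIT": "FOUND",
    "CAPTURED": "FOUND",
    "TIMEOUT": "NOT_FOUND",
    "FAILED": "NOT_FOUND",
    "SKIPPED": "NOT_FOUND",
}


def planner_quality(outcome: str) -> str:
    """Map a sync outcome name to a planner quality label."""
    if outcome == "PARTIAL":
        # ASSUMPTION: PARTIAL is not covered by the documented mapping.
        # It returns no stats, so it is treated like the other misses.
        return "NOT_FOUND"
    # Unknown outcome names raise KeyError instead of being guessed.
    return PLANNER_QUALITY[outcome]
```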
Metrics
| Metric | Type | Tags | Description |
|---|---|---|---|
| floecat.service.stats.sync_outcomes.total | counter | result, component, operation | Count per sync outcome |
| floecat.service.stats.sync.latency | timer (ms) | result, component, operation | End-to-end resolve latency |
result values match StatsSyncOutcome names: HIT, CAPTURED, PARTIAL, TIMEOUT,
FAILED, SKIPPED.
Troubleshooting
High TIMEOUT rate
The most common cause is the capture budget being shorter than actual connector round-trip time.
Check floecat.service.stats.sync.latency (p95) against floecat.stats.sync.latency-budget.
Increase the budget (up to max-latency-budget) or set floecat.stats.sync.enabled=false to
stop blocking queries on sync attempts.
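The p95-versus-budget comparison can be made concrete with a small helper. This is a rough sketch: it assumes you have a list of observed latencies in milliseconds from your metrics backend, uses the nearest-rank method for the percentile, and the function names are hypothetical.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def suggest_budget_ms(samples_ms, current_ms, max_ms):
    """Suggest a latency budget that covers the observed p95.

    Keeps the current budget when it already covers p95; otherwise
    raises it to p95, capped at the max-latency-budget ceiling.
    """
    observed = p95(samples_ms)
    if observed <= current_ms:
        return current_ms
    return min(observed, max_ms)
```

If the suggestion comes back equal to the ceiling, raising the budget alone is unlikely to remove the timeouts, and disabling sync may be the more practical option.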
FAILED outcomes with no connector
Tables without an upstream connector cannot be captured synchronously. The orchestrator returns
FAILED immediately and enqueues an async follow-up. This is expected for system tables and
non-connected tables.
Disabling sync globally
Set floecat.stats.sync.enabled=false (or FLOECAT_STATS_SYNC_ENABLED=false). All queries
fall back to SKIPPED + async enqueue, which matches the pre-sync-policy behavior.
Verifying counters
Query /q/metrics after triggering a stats miss. Look for lines containing
floecat_service_stats_sync_outcomes_total.
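To total the counter per outcome, the metrics text can be parsed with a few lines of Python. The sketch assumes the endpoint returns Prometheus text exposition format with a `result` label on each sample. The `SAMPLE` text is made up for illustration and is not real output; the `component` and `operation` label values in it are invented.

```python
import re
from collections import Counter

# Matches one sample line: name{labels} value
LINE = re.compile(
    r'^floecat_service_stats_sync_outcomes_total\{([^}]*)\}\s+(\S+)'
)
RESULT = re.compile(r'result="([^"]*)"')


def outcome_counts(metrics_text: str) -> Counter:
    """Sum counter values per `result` label value."""
    counts = Counter()
    for line in metrics_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # comments, blank lines, and other metrics
        result = RESULT.search(match.group(1))
        if result:
            counts[result.group(1)] += float(match.group(2))
    return counts


# Made-up sample for illustration only; label values are invented.
SAMPLE = """\
# TYPE floecat_service_stats_sync_outcomes_total counter
floecat_service_stats_sync_outcomes_total{component="a",operation="x",result="HIT"} 3.0
floecat_service_stats_sync_outcomes_total{component="a",operation="x",result="TIMEOUT"} 2.0
floecat_service_stats_sync_outcomes_total{component="b",operation="x",result="HIT"} 1.0
"""
```

Passing the body of a `/q/metrics` response to `outcome_counts` gives a per-outcome total, which makes it easy to confirm that a triggered miss incremented the expected result.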