Skip to content

Telemetry Hub Contract

This lists all metrics available in the repository right now:

Core Metrics

Metric Type Unit Since Description Required Tags Allowed Tags
floecat.core.cache.accounts GAUGE count v1 Number of accounts with an active cache entry, tagged by cache name. cache, component, operation account, cache, component, operation
floecat.core.cache.enabled GAUGE v1 Indicator that the cache is enabled (1=enabled, 0=disabled). cache, component, operation account, cache, component, operation
floecat.core.cache.entries GAUGE count v1 Approximate number of entries in the cache, tagged by cache name. cache, component, operation account, cache, component, operation
floecat.core.cache.errors COUNTER v1 Number of cache operation failures (load errors), tagged by cache name. cache, component, operation, result account, cache, component, exception, operation, result
floecat.core.cache.hits COUNTER v1 Number of cache lookup hits, tagged by cache name. cache, component, operation account, cache, component, operation
floecat.core.cache.latency TIMER seconds v1 Cache latency distribution for operations. cache, component, operation, result account, cache, component, exception, operation, result
floecat.core.cache.max.entries GAUGE count v1 Configured max entries for the cache. cache, component, operation account, cache, component, operation
floecat.core.cache.max.weight.bytes GAUGE bytes v1 Configured maximum weight (bytes) for the cache. cache, component, operation account, cache, component, operation
floecat.core.cache.misses COUNTER v1 Number of cache lookup misses, tagged by cache name. cache, component, operation account, cache, component, operation
floecat.core.cache.weighted.size.bytes GAUGE bytes v1 Total weight (bytes) of cache entries, tagged by cache name. cache, component, operation account, cache, component, operation
floecat.core.exec.active GAUGE count v1 Number of threads actively executing tasks per pool. component, operation, pool component, operation, pool
floecat.core.exec.queue.depth GAUGE count v1 Number of work items waiting in the executor queue per pool. component, operation, pool component, operation, pool
floecat.core.exec.rejected COUNTER v1 Number of task submissions rejected by the executor. component, operation, pool component, operation, pool
floecat.core.exec.task.run TIMER seconds v1 Duration spent running the task on a worker thread. component, operation, pool component, operation, pool, result
floecat.core.exec.task.wait TIMER seconds v1 Duration spent waiting in the queue before execution starts. component, operation, pool component, operation, pool, result
floecat.core.gc.collections COUNTER v1 Number of GC collections per GC type. component, gc, operation, result component, exception, gc, operation, result
floecat.core.gc.errors COUNTER v1 GC failures per GC type. component, gc, operation, result component, exception, gc, operation, result
floecat.core.gc.pause TIMER seconds v1 GC pause time per GC type. component, gc, operation, result component, exception, gc, operation, result
floecat.core.gc.retries COUNTER v1 GC retries per component/operation. component, operation component, operation
floecat.core.observability.dropped.metric.total COUNTER v1 Total number of metric emissions rejected because validation failed. reason
floecat.core.observability.dropped.tags.total COUNTER v1 Total number of tags dropped because they violated telemetry contracts.
floecat.core.observability.duplicate.gauge.total COUNTER v1 Count of duplicate gauge registration attempts. reason
floecat.core.observability.invalid.metric.total COUNTER v1 Total number of metrics rejected because they were not registered. reason
floecat.core.observability.registry.size GAUGE count v1 Current size of the telemetry registry.
floecat.core.rpc.active GAUGE v1 Number of in-flight RPCs per component/operation. component, operation component, operation
floecat.core.rpc.errors COUNTER v1 Count of RPC failures per component/operation. component, operation, result account, component, exception, operation, result, status
floecat.core.rpc.latency TIMER seconds v1 Latency distribution for RPC operations. component, operation, result account, component, exception, operation, result, status
floecat.core.rpc.requests COUNTER v1 Total RPC requests processed, tagged by account and status. account, component, operation, status account, component, operation, status
floecat.core.rpc.retries COUNTER v1 Number of RPC retries invoked. component, operation component, operation
floecat.core.store.bytes COUNTER bytes v1 Count of bytes processed by store operations. component, operation, result account, component, exception, operation, result
floecat.core.store.errors COUNTER v1 Store failure count per component/operation. component, operation, result account, component, exception, operation, result
floecat.core.store.latency TIMER seconds v1 Store operation latency distribution. component, operation, result account, component, exception, operation, result
floecat.core.store.requests COUNTER v1 Number of store requests emitted per component/operation. component, operation, result account, component, exception, operation, result
floecat.core.store.retries COUNTER v1 Store retries per component/operation. component, operation component, operation
floecat.core.task.enabled GAUGE v1 Indicator that a scheduled task is enabled (1=enabled, 0=disabled). component, operation, task account, component, operation, task
floecat.core.task.last.tick.end.ms GAUGE milliseconds v1 Timestamp (ms since epoch) when the scheduled task last finished a tick. component, operation, task account, component, operation, task
floecat.core.task.last.tick.start.ms GAUGE milliseconds v1 Timestamp (ms since epoch) when the scheduled task last started a tick. component, operation, task account, component, operation, task
floecat.core.task.running GAUGE v1 Number of active ticks for the scheduled task (usually 0 or 1). component, operation, task account, component, operation, task

Extra JVM Metrics

Metric Type Unit Since Description Required Tags Allowed Tags
floecat.jvm.gc.live.data.bytes GAUGE bytes v1 Estimated live data (bytes) held by each garbage collector. component, gc, operation component, gc, operation
floecat.jvm.gc.live.data.growth.rate GAUGE bytes_per_second v1 Live data growth rate (bytes/second) for GC-managed pools. component, gc, operation component, gc, operation

Profiling Metrics

Metric Type Unit Since Description Required Tags Allowed Tags
floecat.profiling.captures.total COUNTER v1 Profiling capture lifecycle counts (started/completed/failed/dropped). component, mode, operation, result, scope, trigger component, mode, operation, policy, reason, result, scope, trigger

Service Metrics

Metric Type Unit Since Description Required Tags Allowed Tags
floecat.service.flight.cancelled.total COUNTER v1 Flight request cancellations by operation, table, and reason. component, operation, reason component, operation, reason, resource, status
floecat.service.flight.errors.total COUNTER v1 Flight request failures by operation, table, and reason. component, operation, reason component, operation, reason, resource, status
floecat.service.flight.inflight GAUGE v1 Current number of in-flight Flight streams. component, operation component, operation, resource
floecat.service.flight.latency TIMER ms v1 Flight request latency by operation, table, and terminal status. component, operation, status component, operation, reason, resource, status
floecat.service.flight.requests.total COUNTER v1 Total Flight requests by operation, table, and terminal status. component, operation, status component, operation, reason, resource, status
floecat.service.reconcile.cancel_job.total COUNTER v1 CancelReconcileJob request outcomes. component, operation, result component, operation, reason, result
floecat.service.reconcile.capture_now.total COUNTER v1 CaptureNow request outcomes by trigger type. component, operation, result, trigger component, operation, reason, result, trigger
floecat.service.reconcile.errors.total COUNTER v1 Errors recorded by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.get_job.total COUNTER v1 GetReconcileJob request outcomes. component, operation, result component, operation, reason, result
floecat.service.reconcile.get_settings.total COUNTER v1 GetReconcilerSettings request outcomes. component, operation, result component, operation, reason, result
floecat.service.reconcile.job.latency TIMER ms v1 Reconcile job terminal latency by execution mode. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.jobs.cancelling GAUGE v1 Current number of reconcile jobs waiting for cancellation. component, operation component, operation
floecat.service.reconcile.jobs.queued GAUGE v1 Current number of queued reconcile jobs. component, operation component, operation
floecat.service.reconcile.jobs.running GAUGE v1 Current number of running reconcile jobs. component, operation component, operation
floecat.service.reconcile.jobs.total COUNTER v1 Reconcile job terminal outcomes by execution mode. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.list_jobs.total COUNTER v1 ListReconcileJobs request outcomes. component, operation, result component, operation, reason, result
floecat.service.reconcile.planner.enqueue.total COUNTER v1 Automatic reconcile planner enqueue decisions by mode. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.planner.tick.latency TIMER ms v1 Automatic reconcile planner tick latency. component, operation, result component, operation, reason, result
floecat.service.reconcile.planner.ticks.total COUNTER v1 Automatic reconcile planner tick outcomes. component, operation, result component, operation, reason, result
floecat.service.reconcile.queue.oldest_age GAUGE ms v1 Age in milliseconds of the oldest queued reconcile job. component, operation component, operation
floecat.service.reconcile.snapshots_processed.total COUNTER v1 Snapshots processed by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.start_capture.total COUNTER v1 StartCapture request outcomes by trigger type. component, operation, result, trigger component, operation, reason, result, trigger
floecat.service.reconcile.stats_processed.total COUNTER v1 Statistics payloads processed by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.tables_changed.total COUNTER v1 Tables changed by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.tables_scanned.total COUNTER v1 Tables scanned by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.update_settings.total COUNTER v1 UpdateReconcilerSettings request outcomes. component, operation, result component, operation, reason, result
floecat.service.reconcile.views_changed.total COUNTER v1 Views changed by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.reconcile.views_scanned.total COUNTER v1 Views scanned by reconcile jobs. component, mode, operation, result component, mode, operation, reason, result
floecat.service.stats.batch_groups.total COUNTER v1 Stats batch request groups processed. component, operation component, mode, operation, reason, resource, result, scope, trigger
floecat.service.stats.batch_items.total COUNTER v1 Stats batch item counters (untagged series tracks submitted items; tagged series with result=* tracks per-outcome items; query with result tag filters to avoid double-counting). component, operation component, mode, operation, reason, resource, result, scope, trigger
floecat.service.stats.engine_batch_calls.total COUNTER v1 Stats engine batch capture calls. component, operation component, mode, operation, reason, resource, result, scope, trigger
floecat.service.stats.store_hits.total COUNTER v1 Stats store hit count for batch resolution. component, operation component, mode, operation, reason, resource, result, scope, trigger
floecat.service.stats.store_misses.total COUNTER v1 Stats store miss count for batch resolution. component, operation component, mode, operation, reason, resource, result, scope, trigger
floecat.service.storage.account.bytes GAUGE bytes v1 Estimated per-account storage byte consumption (sampled, not exact). account account
floecat.service.storage.account.pointers GAUGE v1 Per-account pointer count stored in the service. account account

Standard JVM metrics

Quarkus enables the standard Micrometer JVM binders (JvmMemoryMetrics, JvmThreadMetrics, JvmGcMetrics, ProcessorMetrics, etc.) by default, so the jvm.*, process_cpu_usage, system_cpu_usage, and related runtime gauges are still emitted alongside Floecat’s custom GC live-data metrics. Refer to the OpenTelemetry JVM runtime semantic conventions for the full list of names and tags.

Correlation contract

The hub expects metrics, traces, and logs to share a small, predictable key set so dashboards can link between systems:

  • Spans carry floecat.component and floecat.operation (plus floecat.rpc.status for RPC spans and floecat.store.operation for storage observations) so Tempo queries can reuse the same component/operation filters as the Prometheus panels.
  • Logs expose floecat_component and floecat_operation via MDC (and can emit traceId/spanId if your JSON logging pipeline is configured to capture OpenTelemetry context). That keeps Loki’s Logs for this trace button and Tempo’s trace-to-log links usable even when you jump directly from a metric graph, while making it clear the trace identifiers only show up when your logging stack surfaces them.
  • Metrics retain their canonical component/operation tags from the catalog, and the Micrometer backend logs the current trace/span IDs at TRACE level while it records latency or error counters so you can correlate back to the active span.
  • Telemetry contract version (telemetry.contract.version) travels with every meter (and can be mirrored into spans/logs or log labels) so dashboards know which catalog version produced a series.

Adhering to this contract means a slow RPC bucket in Grafana can open Tempo with the matching span, jump from the span to Loki logs, and still point back to the same metric tags.