Service Runtime¶
Overview¶
The service/ module is the authoritative runtime for Floecat. It hosts the Quarkus gRPC server,
implements every public API from proto/, manages multi-account security contexts,
translates requests into pointer/blob mutations, assembles execution scan bundles, and operates background
tasks such as idempotency/CAS/pointer/transaction/reconcile-job GC and repository seeding.
It is structured for testability: each gRPC service delegates to repository abstractions, which in
turn encapsulate storage backends. Tests such as
service/src/test/java/ai/floedb/floecat/service/repo/impl/TableRepositoryTest.java and
service/src/test/java/ai/floedb/floecat/service/it/QueryServiceIT.java probe repository semantics and
query lifecycle / scan bundle logic.
Architecture & Responsibilities¶
┌────────────────────────────────────────────────────────────────────────────┐
│ Quarkus runtime │
│ ├─ Interceptors (context, localization, metering) │
│ ├─ Security (PrincipalProvider, Authorizer) │
│ ├─ Services (Catalog, Namespace, Table, View, Snapshot, Account, │
│ │ Directory, Statistics, Connectors, QueryService) │
│ ├─ QueryService (QueryContextStore, QueryServiceImpl) │
│ ├─ Repositories (CatalogRepository, NamespaceRepository, TableRepository, │
│ │ ViewRepository, SnapshotRepository, StatsRepository, │
│ │ ConnectorRepository, AccountRepository, │
│ │ IdempotencyRepositoryImpl) │
│ └─ GC & Bootstrap (IdempotencyGc, CasBlobGc, PointerGc, TransactionGc, │
│ ReconcileJobGc, SeedRunner) │
└────────────────────────────────────────────────────────────────────────────┘
Key packages¶
service/common– Shared helpers (BaseServiceImpl,IdempotencyGuard,Canonicalizer, pagination utilities, structured logging).service/context– gRPC interceptors injectingPrincipalContext, correlation IDs, query IDs, engine versions, and bridging outbound headers.service/security– MinimalPrincipalProviderandAuthorizerscaffolding; pluggable for production identity providers.service/repo– Resource repositories layering pointer/blob stores and parsing protobuf payloads; includes key generation utilities (Keys,ResourceKey) and value normalizers.service/catalog/directory/account/statistics/connector– gRPC service implementations.catalog/builtin– Shared builtin catalog data model, validator, and loader helpers.service/query– Query lifecycle management (QueryContext,QueryContextStore,QueryServiceImpl).service/query/graph– MetadataGraph cache + immutable node models shared by planners/executors (seedocs/metadata-graph.md).service/gc– Scheduled cleanup for idempotency records, orphan pointers/blobs, stale transaction artifacts, and durable reconcile jobs.service/bootstrap– Optional seeding of demo accounts and catalog data.service/metrics–ServiceTelemetryInterceptor+StorageUsageMetricsfor Micrometer integration.
Public API / Surface Area¶
Each gRPC implementation derives from BaseServiceImpl, gaining retry semantics, error mapping, and
helpers like randomResourceId (UUIDv4). Highlights:
- CatalogServiceImpl – Enforces
catalog.read/catalog.writepermissions, canonicalises names, usesIdempotencyGuardfor Create, and ensures namespace cascading checks during Delete. - NamespaceServiceImpl – Handles hierarchical selectors, supports recursive listing, and ensures
require_emptysemantics on deletion by inspecting repository counts. - TableServiceImpl – Validates
UpstreamRef, enforces unique names before writing, supports partial updates viaFieldMask, and coordinates snapshot/statistics purging. - ViewServiceImpl – Stores SQL definitions and references to base tables.
- SnapshotServiceImpl – Binds snapshots to tables, ensuring parent-child relationships remain intact.
- TableStatisticsServiceImpl – Persists per-snapshot target/file stats; validates
NDV/histogram payloads; paginates target listings (optionally filtered by target kind); uses
client-streaming
PutTargetStatsto batch writes per stream. - DirectoryServiceImpl – Provides fast name↔ID lookup via
MetadataGraph(Resolve/Lookup) and reuses the graph’s ResolveFQ helpers for list/prefix pagination. - AccountServiceImpl – Administers accounts and enforces conventional permissions.
- ConnectorsImpl – Manages connector lifecycle, validates
ConnectorSpecvia SPI factories, wires reconciliation job submission, and exposesValidateConnector+StartCapture.CaptureNowmaps to reconciler capture modes: - metadata only ->
METADATA_ONLY - stats only ->
STATS_ONLY - metadata plus stats ->
METADATA_AND_STATS - QueryServiceImpl – Administers query leases (
BeginQuery,RenewQuery,EndQuery,GetQuery) and exposesFetchScanBundleso planners can request connector scan metadata on demand. - SystemObjectsServiceImpl – Loads immutable builtin catalogs from disk/classpath, caches them
per engine version, and serves them via
GetSystemObjects.
Important Internal Details¶
BaseServiceImpl & Idempotency¶
BaseServiceImpl centralises retry policies (BACKOFF_MIN/MAX, jitter, RETRIES), correlation-ID
propagation, and error translation (storage/repository exceptions → gRPC status codes per
errors_en.properties). IdempotencyGuard stores request fingerprints inside
IdempotencyRepository (backed by the pointer/blob store) so replays reuse prior results.
Repository Layer¶
Each repository extends BaseResourceRepository<T>:
- Reserves pointer keys via CAS before writing blobs.
- Writes blobs with checksum verification (sha256B64).
- Maintains MutationMeta (pointer key, blob URI, pointer version, ETag, timestamp).
- Provides convenience accessors such as getByName, getById, list, and metaForSafe.
- Deletes tolerate missing blobs when cleaning up pointers, so skewed pointer/blob states can still be removed safely.
BaseResourceRepository also exposes reserveAllOrRollback for multi-key updates, and
compareAndDelete semantics for CAS-based deletions. Tests ensure parity between in-memory and AWS
implementations.
Security and Context¶
InboundContextInterceptor reads x-query-id, x-engine-version, and x-correlation-id headers,
plus optional OIDC session/authorization headers, validates account membership, hydrates
MDC/OpenTelemetry attributes, and enforces the configured floecat.auth.mode.
OutboundContextClientInterceptor mirrors the same headers for internal gRPC calls
(service-to-service).
Authorizer currently performs simple list membership checks on PrincipalContext.permissions; it
can be replaced by injecting a custom implementation.
External session header authentication is documented in
docs/external-authentication.md.
Query Lifecycle Service¶
QueryContextStore is a Caffeine cache keyed by query ID. Each QueryContext tracks state,
expiration, PrincipalContext, encoded SnapshotSet, and ExpansionMap.
QueryServiceImpl.beginQuery resolves name or ID references via Directory/Snapshot/Table services,
pins snapshots, and stores the lease. Planners request connector ScanBundles later via
FetchScanBundle, which streams data/delete files for a specific table. The lease data (snapshots,
expansion map, obligations) is returned to the caller inside the QueryDescriptor.
Builtin Catalog Service¶
SystemObjectsLoader reads immutable builtin catalogs (<engine_kind>.pb[pbtxt]) from the
configured location, caches them by engine kind, and exposes them through
SystemObjectsService.GetSystemObjects. Clients must send both x-engine-kind and
x-engine-version; the RPC always returns the filtered builtin bundle for the requested engine.
GC and Bootstrap¶
IdempotencyGc runs on a configurable cadence (see floecat.gc.* config) and sweeps expired
idempotency records in slices to avoid starvation. CasBlobGc enumerates blob prefixes and removes
CAS blobs with no remaining pointers once they exceed the configured min-age. PointerGc removes
orphan/stale pointers. TransactionGc reaps expired/aborted transaction artifacts and dangling
intent indices. ReconcileJobGc enforces durable reconcile retention and queue/dedupe cleanup for
terminal jobs. SeedRunner
populates demo data when floecat.seed.enabled=true.
For connector-backed fixture tables, seeding now runs a combined reconcile pass per fixture scope
using METADATA_AND_STATS.
This ingests metadata/snapshots first and enqueues scoped STATS_ONLY follow-up capture for stats.
Query scan bundles remain available immediately; stats availability follows queued capture completion.
Follow-up payloads now use reconcile scoped stats requests
(table_id, snapshot_id, target_spec, column_selectors) so background capture stays targeted
without depending on separate unresolved snapshot-id and target lists.
When a STATS_ONLY follow-up batch captures only a subset of requested items, the reconcile result
is degraded; when none of the requested items are captured, the reconcile result fails instead of
silently reporting zero processed stats.
Statistics streaming semantics¶
TableStatisticsServiceImpl enforces a single table_id + snapshot_id per streamed call to
PutTargetStats, rejects mixed idempotency keys within a stream, and applies
idempotent writes when a key is present. Each stream returns one response summarising the total
rows upserted after all batches have been consumed.
Data Flow & Lifecycle¶
Typical request path¶
client → Quarkus Server
→ InboundContextInterceptor (principal/query/correlation)
→ LocalizeErrorsInterceptor (message catalog)
→ ServiceTelemetryInterceptor (metrics/latency)
→ ServiceImpl (authz + validation)
→ Repository (CAS pointer/blob operations)
← response + MutationMeta
Configuration & Extensibility¶
Notable application.properties keys:
| Property | Purpose |
|---|---|
quarkus.grpc.server.* |
Port, HTTP2, plaintext/reflection toggles. |
quarkus.grpc.clients.floecat.* |
Loopback client config for internal RPC calls. |
floecat.seed.enabled |
Enable demo data seeding. |
floecat.kv / floecat.blob |
Select pointer/blob store implementation (memory, dynamodb, s3). |
floecat.query.* |
Default TTL, grace period, max cache size, safety expiry for query contexts. |
floecat.gc.idempotency.* |
Cadence, page size, batch limit, slice duration for idempotency GC. |
floecat.gc.cas.* |
Cadence, page size, min-age, tick slice settings for CAS blob GC. |
floecat.gc.pointer.* |
Cadence, page size, min-age, tick slice settings for pointer GC. |
floecat.gc.reconcile-jobs.* |
Cadence, retention, and slice settings for durable reconcile-job GC. |
floecat.reconciler.job-store.* |
Durable reconcile queue selection and retry/lease tuning. |
quarkus.log.* |
JSON logging, file rotation, audit handlers per RPC package. |
quarkus.otel.* / quarkus.micrometer.* |
Observability exporters (see docs/operations.md). |
floecat.auth.mode |
Auth enforcement mode (oidc, dev). |
floecat.auth.platform-admin.role |
IdP role name granted permission to manage accounts (default platform-admin). |
floecat.secrets.aws.role-arn |
Optional role to assume per account when using AWS Secrets Manager. |
Extension points:
- Storage – Provide custom PointerStore/BlobStore (see docs/storage-spi.md).
- Security – Replace Authorizer or interceptors with CDI alternatives.
- Connectors – Register new SPI implementations and expose them via ConnectorRepository.
- QueryService – Extend query metadata by enriching QueryContext creation or injecting
additional connector metadata via the FetchScanBundle RPC / ScanBundleService. BeginQuery
optionally accepts a client-specified query_id plus common.QueryInput records so lifecycle can
pre-pin snapshots/expansions for deterministic replay.
Secrets Manager integration (tags + optional per-account assume-role) is documented in
docs/secrets-manager.md.
Examples & Scenarios¶
- Create Catalog –
CatalogServiceImpl.createCatalogcanonicalisesdisplay_name, allocates a UUIDv4 identifier, reserves/accounts/{account}/catalogs/by-name/{name}and/by-id/{uuid}pointer keys, writes thecatalog.pbblob, and returnsMutationMeta. If the caller supplies anIdempotencyKey, the repository short-circuits duplicates. - Delete Namespace – Namespace deletions with
require_empty=truecheck child counts viaNamespaceRepository.countChildren. If tables exist, the service raisesMC_CONFLICT.namespace.not_empty. - Query lease renewal – Clients call
QueryService.RenewQuerybeforeexpires_at; the store extends the TTL if the query remainsACTIVE. A stale or ended query returnsMC_NOT_FOUND.query.not_found.
Cross-References¶
- RPC contracts:
docs/proto.md - Connector SPI & implementations:
docs/connectors-spi.md,docs/connectors-iceberg.md,docs/connectors-delta.md - Storage implementations consumed by repositories:
docs/storage-spi.md,docs/storage-memory.md,docs/storage-aws.md - Reconciler orchestrating connectors:
docs/reconciler.md