Migrating from monolithic GIS to data mesh
1. Routing Topology & Dependency Mapping
The cutover from a centralized PostGIS/ArcSDE monolith to a domain-scoped spatial product pipeline requires a deterministic routing layer that intercepts, validates, and forwards spatial queries only after strict schema and catalog parity are achieved. This topology replaces legacy query dispatchers with a domain-driven routing model aligned to Geospatial Data Mesh Fundamentals. Platform engineers must treat the routing layer as a state machine: traffic admission is gated by eight explicit architectural controls. Failure to map these dependencies results in spatial index fragmentation during bulk vector-to-GeoParquet conversion and metadata catalog desynchronization, triggering silent spatial drift or timeout cascades.
The deployment manifest must enforce the following dependency chain before any production routing activation:
- Spatial Domain Boundary Design dictates query scoping and prevents cross-domain spatial joins without explicit federation tokens.
- Product Thinking for GIS Datasets establishes clear ownership boundaries, routing accountability to domain data product owners rather than central DBAs.
- Metadata Cataloging for Raster/Vector mandates strict schema validation (geometry types, CRS, attribute nullability) prior to traffic admission.
- Scoping Rules for Spatial Products governs deterministic partition key generation, ensuring uniform spatial indexing across domain boundaries.
- Spatial Product Lifecycle Management configures automated retention policies and archival triggers, preventing stale partition reads.
- Spatial Product Versioning Strategies enforces immutable semantic tags (v1.2.0-rc.1), blocking routing to unversioned or mutable staging artifacts.
- Cross-Team Governance Workflows require automated approval gates (e.g., OPA/Rego policy evaluation) before production routing activation.
- Data Mesh vs Traditional GIS Architecture documents the architectural shift from centralized query optimization to domain-localized execution, requiring explicit routing table updates during phased migration.
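The dependency chain above amounts to a single admission predicate: routing activates only when all eight controls report success. The following is a minimal POSIX shell sketch of that gate, not the router's actual implementation. The gate names and the `admission_state` function are hypothetical labels chosen for illustration, and the sketch assumes some upstream check has already produced the list of controls that passed.

```shell
# Hypothetical gate names, one per control in the dependency chain (assumed labels).
GATES="domain_boundary product_ownership metadata_catalog scoping_rules lifecycle versioning governance_approval routing_table"

# admission_state PASSED_GATES
# PASSED_GATES is a space-separated list of gates that reported success.
# Prints ADMIT when every gate is present; otherwise prints
# BLOCKED:<first missing gate> and returns 1.
admission_state() {
  passed=" $1 "
  for gate in $GATES; do
    case "$passed" in
      *" $gate "*) ;;                          # gate satisfied, keep checking
      *) echo "BLOCKED:$gate"; return 1 ;;     # first unmet gate blocks admission
    esac
  done
  echo "ADMIT"
}
```

Calling `admission_state "domain_boundary product_ownership"` reports the first unmet control, which gives the deployment manifest a concrete reason for refusing activation.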
2. Deterministic Routing Configuration (Idempotent Execution)
The routing layer must be deployed using declarative, idempotent manifests. The following configuration pattern enforces CRS normalization, partition alignment, and SLA circuit breaking. Execution must be wrapped in an atomic state check to prevent partial routing activation.
# routing-layer-config.yaml
apiVersion: routing.mesh/v1
kind: SpatialDomainRouter
metadata:
  name: gis-domain-router
  annotations:
    # Placeholder: shell substitution is not evaluated inside YAML,
    # so the deploy tooling must inject the config digest here.
    idempotency-key: "sha256:<config-digest>"
spec:
  admission_gates:
    crs_validation:
      target_crs: "EPSG:4326"
      fallback_transform: true
      reject_on_mismatch: true
    schema_validation:
      required_fields: ["partition_key", "geom", "temporal_index"]
      geometry_type: ["Polygon", "MultiPolygon", "Point"]
    catalog_sync:
      max_drift_seconds: 120
      block_traffic_on_desync: true
  circuit_breaker:
    error_threshold: 0.05
    timeout_ms: 3000
    fallback_route: "monolith-legacy-fallback"
  routing_rules:
    - match:
        header: "x-domain-scope: hydrology"
      route: "domain-hydrology-v1.4.2"
      partition_strategy: "h3_resolution_7"
Apply idempotently using a checksum-locked deployment controller:
# Idempotent routing activation
CONFIG_HASH=$(sha256sum routing-layer-config.yaml | awk '{print $1}')
kubectl apply -f routing-layer-config.yaml --server-side --field-manager=platform-engineer
kubectl rollout status deploy/gis-domain-router --timeout=300s
echo "Routing state locked: $CONFIG_HASH"
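The activation commands above compute a digest but never compare it with anything. One way to make the checksum lock explicit is to skip the apply when the manifest digest is unchanged since the last successful run. The sketch below assumes the last-applied digest is kept in a local state file. The `needs_apply` and `record_apply` helpers and the state file path are hypothetical, and a real controller would more likely store the digest alongside the deployed resource.

```shell
# needs_apply CONFIG_FILE STATE_FILE
# Returns 0 (apply needed) when the config digest differs from the digest
# recorded in STATE_FILE, or when STATE_FILE does not exist yet.
needs_apply() {
  current=$(sha256sum "$1" | awk '{print $1}')
  previous=$(cat "$2" 2>/dev/null || true)
  [ "$current" != "$previous" ]
}

# record_apply CONFIG_FILE STATE_FILE
# Stores the digest of CONFIG_FILE; call only after a successful apply.
record_apply() {
  sha256sum "$1" | awk '{print $1}' > "$2"
}

# Usage (hypothetical state file path):
#   if needs_apply routing-layer-config.yaml .last-applied.sha256; then
#     kubectl apply -f routing-layer-config.yaml --server-side \
#       && record_apply routing-layer-config.yaml .last-applied.sha256
#   fi
```

Recording the digest only after `kubectl apply` succeeds means a failed run is retried on the next invocation instead of being treated as already applied.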
3. Diagnostic Baselines & Log Pattern Isolation
Pre-cutover validation requires deterministic execution against staging replicas. Run the following diagnostic sequence to isolate spatial partition misalignment and catalog drift before traffic admission.
# 1. Validate spatial partition alignment and CRS consistency
duckdb -c "
  INSTALL spatial; LOAD spatial;
  -- Any row returned is an anomaly: wrong CRS or invalid geometry.
  SELECT partition_key,
         ST_Transform(geom, crs, 'EPSG:4326') AS normalized_geom
  FROM spatial_product_staging
  WHERE crs != 'EPSG:4326'
     OR NOT ST_IsValid(geom);
"
# 2. Audit metadata catalog sync latency
curl -s -H "Authorization: Bearer $CATALOG_TOKEN" \
https://catalog-api.internal/v1/spatial/metadata/sync-status | \
jq '.drift_seconds'
If drift_seconds exceeds 120 or the DuckDB query returns any rows, the routing layer must block traffic until catalog parity is restored. Monitor the following log patterns in centralized aggregation (Splunk/ELK) to detect silent failures:
| Failure Mode | Log Pattern (Regex) | Severity | Action |
|---|---|---|---|
| CRS Mismatch | CRS_MISMATCH.*expected=EPSG:4326.*actual=EPSG:\d{4} | P2 | Reject query, trigger transform pipeline |
| Partition Drift | PARTITION_DRIFT.*h3_cell_mismatch.*index_fragmentation | P1 | Halt routing, rebuild spatial index |
| Catalog Desync | CATALOG_SYNC_TIMEOUT.*drift_seconds>\d{3} | P1 | Block traffic, invoke reconciliation job |
| Schema Violation | SCHEMA_REJECT.*geometry_type_violation.*null_geometry | P2 | Drop malformed batch, notify steward |
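For a quick local check outside Splunk/ELK, the patterns in the table can be applied to a single log line with `grep -E`. This is a sketch only: the `classify_log` function is hypothetical, the sample log line formats are assumed, and `\d` is rewritten as `[0-9]` because POSIX extended regular expressions do not support the `\d` shorthand.

```shell
# classify_log LINE
# Prints the severity from the table for the first matching pattern,
# or NONE when no pattern matches. P1 patterns are checked first.
classify_log() {
  line=$1
  if printf '%s\n' "$line" | grep -Eq 'PARTITION_DRIFT.*h3_cell_mismatch.*index_fragmentation'; then
    echo "P1"
  elif printf '%s\n' "$line" | grep -Eq 'CATALOG_SYNC_TIMEOUT.*drift_seconds>[0-9]{3}'; then
    echo "P1"
  elif printf '%s\n' "$line" | grep -Eq 'CRS_MISMATCH.*expected=EPSG:4326.*actual=EPSG:[0-9]{4}'; then
    echo "P2"
  elif printf '%s\n' "$line" | grep -Eq 'SCHEMA_REJECT.*geometry_type_violation.*null_geometry'; then
    echo "P2"
  else
    echo "NONE"
  fi
}
```

Note that `drift_seconds>[0-9]{3}` matches any value with three or more digits, so it fires from 100 seconds upward rather than exactly at the 120-second limit.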
4. SLA Circuit Breaking & Traffic Admission Gates
Circuit breakers must operate at the query execution boundary, not at the network layer. Configure the routing proxy to evaluate spatial join latency and catalog sync status atomically.
# Circuit breaker state evaluation (idempotent)
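CIRCUIT_STATE=$(curl -sf http://router-internal:8080/health/circuit-breaker | jq -r '.state')
# Fail closed: an empty state means the health endpoint could not be read.
if [[ -z "$CIRCUIT_STATE" || "$CIRCUIT_STATE" == "OPEN" ]]; then
  echo "Circuit OPEN or unreadable: Routing blocked. Awaiting catalog sync stabilization."
  exit 1
fi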
# Atomic traffic shift
kubectl patch configmap gis-routing-flags -p '{"data":{"allow_domain_routing":"true"}}'
Traffic admission requires three concurrent conditions:
- drift_seconds < 120 across all domain catalogs.
- Zero CRS_MISMATCH or PARTITION_DRIFT logs in the last 15-minute window.
- Immutable semantic tag validation passes against the registry.
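The three conditions can be combined into one check that names the first condition that failed. The sketch below is illustrative only. It assumes the caller has already collected the worst-case drift as an integer number of seconds and the alert count for the 15-minute window. The `may_admit` function is hypothetical, and the tag check validates only the shape of the version string, not its presence or immutability in the registry.

```shell
# may_admit MAX_DRIFT_SECONDS ALERT_COUNT TAG
# MAX_DRIFT_SECONDS: largest drift_seconds across domain catalogs (integer).
# ALERT_COUNT: CRS_MISMATCH plus PARTITION_DRIFT log lines in the last 15 minutes.
# TAG: version tag of the target route, e.g. v1.4.2.
may_admit() {
  drift=$1; alerts=$2; tag=$3
  [ "$drift" -lt 120 ] || { echo "DENY:catalog_drift"; return 1; }
  [ "$alerts" -eq 0 ] || { echo "DENY:recent_alerts"; return 1; }
  # Shape check only (assumed tag format): vMAJOR.MINOR.PATCH with an
  # optional pre-release suffix.
  printf '%s\n' "$tag" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.]+)?$' \
    || { echo "DENY:invalid_tag"; return 1; }
  echo "ADMIT"
}
```

A drift of exactly 120 seconds is denied here because the condition is a strict less-than comparison.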
Reference the OGC Coordinate Reference Systems Registry for authoritative CRS transformation matrices and the GeoParquet Specification v1.0.0 for partition-aligned geometry encoding standards.
5. Incident Response & Escalation Matrix
When routing activation fails or spatial drift is detected post-cutover, execute the following escalation path. All steps assume idempotent rollback capability via the monolith-legacy-fallback route.
| Trigger | Immediate Action | Escalation Path | Resolution SLA |
|---|---|---|---|
| drift_seconds > 120 | Block domain routing, shift to fallback | GIS Data Steward → Catalog Ops | 30 min |
| PARTITION_DRIFT detected | Halt query ingestion, trigger index rebuild | Platform Engineer → Data Mesh SRE | 1 hr |
| SCHEMA_REJECT > 5% | Quarantine batch, validate against catalog contract | Domain Owner → Data Governance Board | 2 hrs |
| Circuit breaker OPEN > 10 min | Force rollback, capture heap/trace dumps | Incident Commander → Architecture Review | 15 min |
Rollback Procedure (Idempotent):
# 1. Force routing to legacy fallback
kubectl patch configmap gis-routing-flags -p '{"data":{"allow_domain_routing":"false"}}'
# 2. Verify traffic shift
curl -s http://router-internal:8080/routing/status | jq '.active_route'
# 3. Capture diagnostic snapshot
kubectl logs deploy/gis-domain-router --tail=5000 > /tmp/router-failure-$(date +%s).log
Post-rollback, isolate the root cause using the diagnostic baselines in Section 3. Do not reactivate domain routing until catalog parity, partition alignment, and schema validation return zero anomalies. Document the failure vector in the incident ledger and update the routing manifest to prevent recurrence.