Data Flow
Two core data flows define the Spatial.Properties platform. The first transforms raw geospatial data into versioned, validated Spatial Packs. The second takes a natural-language question and turns it into a map visualization with supporting data. Both flows are designed to be reproducible and auditable.
Pack build pipeline
The spatialpack CLI reads a Pipeline YAML file and executes a sequence of processing stages to produce a complete Spatial Pack.
```mermaid
graph LR
    Sources["Data\nSources"] --> YAML["Pipeline\nYAML"]
    YAML --> Parser["Parser"]
    Parser --> Vars["Variable\nResolution"]
    Vars --> Exec["Stage\nExecutor"]
    Exec --> Layers["GeoParquet\n+ PMTiles"]
    Layers --> Validate["Schema\nValidation"]
    Validate --> Pack["Spatial Pack"]
```

FIG. 1 · PACK BUILD PIPELINE
Data Sources are the raw geospatial files — Shapefiles, GeoPackages, GeoJSON, raster DEMs — declared in the Pipeline YAML’s sources section. Each source includes a file path and license identifier.
The Parser reads the YAML and resolves variables like ${pack.region.bbox} before execution begins. Variable interpolation happens once, producing a fully concrete pipeline definition with no unresolved references.
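The one-pass interpolation step can be sketched as follows. This is a minimal illustration, not the CLI's actual implementation; the `${dotted.path}` syntax is taken from the example above, while the error handling and context shape are assumptions.

```python
import re

def resolve_vars(text: str, context: dict) -> str:
    """Replace ${dotted.path} references with values from a nested context.

    A minimal sketch of one-pass variable interpolation; the real CLI's
    lookup rules and error messages may differ.
    """
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError on an unresolved reference
        return str(value)

    return re.sub(r"\$\{([^}]+)\}", lookup, text)

context = {"pack": {"region": {"bbox": "-124.5,32.5,-114.1,42.0"}}}
print(resolve_vars("clip --bounds ${pack.region.bbox}", context))
# → clip --bounds -124.5,32.5,-114.1,42.0
```

Because interpolation runs before execution, every stage sees only concrete values, which keeps stage handlers simple and the resolved pipeline easy to log.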
The Stage Executor processes each stage in declaration order. The platform includes 12 built-in stage types covering vector conversion, raster processing, tile generation, file extraction, metric computation, and integrity hashing. Each stage reads its input (a named source or a previous stage’s output), runs the action handler, and writes the result to the output directory.
After all stages complete, the executor generates a spatialpack.json manifest from accumulated layer metadata. The manifest is then validated against the Spatial Pack JSON Schema to verify that all required fields are present, every declared layer file exists, and governance requirements (license, provenance) are satisfied. Packs that fail validation cannot be published.
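The gate described above, required fields present, declared layer files on disk, can be sketched with a simplified stand-in for the JSON Schema check. The field names and manifest shape here are illustrative, not the real Spatial Pack schema.

```python
from pathlib import Path

# Illustrative required fields; the real Spatial Pack JSON Schema is richer.
REQUIRED_FIELDS = {"name", "version", "license", "layers"}

def validate_manifest(manifest: dict, pack_dir: Path) -> list[str]:
    """Return a list of validation errors; an empty list means the pack may publish."""
    errors = [f"missing required field: {f}"
              for f in sorted(REQUIRED_FIELDS - manifest.keys())]
    for layer in manifest.get("layers", []):
        if not (pack_dir / layer["path"]).exists():
            errors.append(f"declared layer file not found: {layer['path']}")
    return errors
```

A publish step would refuse to proceed unless `validate_manifest(...)` returns an empty list, mirroring the rule that packs failing validation cannot be published.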
The final output is a self-contained directory with GeoParquet files for analytics, PMTiles for map visualization, and a manifest that ties everything together.
For a deep dive into Pipeline YAML syntax and stage types, see Pipeline Architecture.
Chat-to-map pipeline
The chat-to-map flow is the platform’s signature capability. A user asks a spatial question in natural language and receives both a data answer and a map visualization — no GIS expertise required.
```mermaid
graph LR
    User["User\nQuestion"] --> Chat["Chat\nInterface"]
    Chat --> Gateway["AI\nGateway"]
    Gateway --> LLM["Language\nModel"]
    LLM --> Tools["Tool\nCalls"]
    Tools --> Data["Pack\nData"]
    Data --> Map["Map\nUpdate"]
    Map --> User
```

FIG. 2 · CHAT-TO-MAP PIPELINE
The flow starts when a user asks a question through the web application’s chat interface — for example, “Show me all parcels larger than 5 hectares near the coast.”
The Chat Interface sends the message to the API server, which routes it to the AI Gateway. The gateway selects the appropriate language model based on database-driven configuration and forwards the request along with tool definitions that describe available spatial operations.
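A tool definition of the kind forwarded to the model might look like the following. The tool name, parameters, and descriptions are hypothetical, and the wire format is sketched in the common JSON-Schema "function calling" shape; the gateway's actual format may differ.

```python
# Hypothetical tool definition describing one spatial operation.
filter_features_tool = {
    "name": "filter_features",
    "description": "Filter a pack layer by an attribute predicate and update the map.",
    "parameters": {
        "type": "object",
        "properties": {
            "layer": {"type": "string", "description": "Layer id within the pack"},
            "where": {"type": "string", "description": "Predicate, e.g. area_ha > 5"},
        },
        "required": ["layer", "where"],
    },
}
```

Declaring tools this way lets the gateway swap models without changing the set of spatial operations the model can invoke.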
The Language Model analyzes the question and generates tool calls — structured instructions to query data, filter features, apply spatial operations, or update the map. Tools execute against pack data stored in GeoParquet files, using DuckDB for fast columnar queries and H3 spatial indexing for location-based lookups.
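For the example question above, a tool call might translate into a DuckDB query over a GeoParquet layer. The layer path and column names below are assumptions for illustration; DuckDB can read GeoParquet directly via `read_parquet`.

```python
def parcels_query(min_hectares: float, max_coast_dist_m: float) -> str:
    """Build the SQL a tool call might run against a GeoParquet parcels layer.

    A sketch only: 'layers/parcels.parquet', 'area_ha', and 'coast_dist_m'
    are hypothetical names, and a real tool would bind parameters safely
    rather than interpolate them into the string.
    """
    return (
        "SELECT parcel_id, area_ha "
        "FROM read_parquet('layers/parcels.parquet') "
        f"WHERE area_ha > {min_hectares} AND coast_dist_m < {max_coast_dist_m}"
    )

# "parcels larger than 5 hectares near the coast" → one columnar scan
sql = parcels_query(5, 1000)
```

Because GeoParquet is columnar, a query like this touches only the columns it filters on, which is what makes ad-hoc chat queries fast without a database server.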
Results flow back through the gateway as a stream of events. The chat interface renders text responses while the map component applies layer updates, filters, and style changes. The user sees both a written answer and a visual result on the map.
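The split between text rendering and map updates can be sketched as a small event router. The event type names (`text_delta`, `map_update`) are hypothetical; the real event schema may differ.

```python
def apply_events(events: list[dict]) -> tuple[str, list[dict]]:
    """Route a stream of gateway events to the chat pane or the map.

    A sketch under an assumed event schema: text deltas accumulate into the
    written answer, map operations collect for the map component to apply.
    """
    text_parts, map_ops = [], []
    for event in events:
        if event["type"] == "text_delta":
            text_parts.append(event["text"])
        elif event["type"] == "map_update":
            map_ops.append(event["op"])
    return "".join(text_parts), map_ops

answer, ops = apply_events([
    {"type": "text_delta", "text": "Found 12 matching parcels."},
    {"type": "map_update", "op": {"action": "set_filter", "layer": "parcels"}},
])
```

Streaming both kinds of events over one channel keeps the text answer and the map in sync as the model's response arrives.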
Every step is logged. The AI Gateway records token usage, cost, and latency for each model call. The credit system deducts the appropriate amount from the organization’s balance. Failed calls are logged with error context for debugging.
Pack lifecycle flow
Beyond building and querying, a Spatial Pack follows a complete lifecycle from creation to consumption. Each stage enforces a specific guarantee.
```mermaid
graph LR
    Build["Build"] --> Validate["Validate"]
    Validate -->|pass| Hash["Hash"]
    Hash --> Publish["Publish"]
    Publish --> CDN["CDN"]
    CDN --> Load["Load"]
    Validate -->|fail| Fix["Fix &\nRebuild"]
    Fix --> Build
```

FIG. 3 · PACK LIFECYCLE FLOW
- Build — The Pipeline executor produces layers and a manifest from source data.
- Validate — The CLI checks the manifest against JSON Schema and verifies layer file existence.
- Hash — BLAKE3 integrity hashes are computed for every artifact and recorded in the manifest.
- Publish — The validated pack is uploaded to object storage at a versioned, immutable path.
- Load — Consumers fetch the manifest and access individual layers via HTTP range requests.
If validation fails, the developer fixes the issue and rebuilds. Published packs are never modified — updates produce new versions.
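The Hash step above can be sketched as a walk over the pack directory. The platform uses BLAKE3, which is not in Python's standard library, so this sketch substitutes `hashlib.blake2b` as a stand-in; a real build would use the `blake3` package.

```python
import hashlib
from pathlib import Path

def artifact_hashes(pack_dir: Path) -> dict[str, str]:
    """Map each artifact's relative path to a content hash for the manifest.

    Sketch only: blake2b stands in for BLAKE3 here, and a real build would
    stream large files instead of reading them whole.
    """
    hashes = {}
    for path in sorted(pack_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(pack_dir).as_posix()
            hashes[rel] = hashlib.blake2b(path.read_bytes()).hexdigest()
    return hashes
```

Recording these hashes in the manifest is what lets a consumer detect a corrupted or tampered artifact before using it, and it is why published packs can be safely cached forever at their versioned paths.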
For the full lifecycle with code examples at each stage, see Pack Lifecycle.
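The Load step's range requests can be sketched with the standard library. The base URL and layer path below are hypothetical; the point is that a reader fetches only the byte range it needs (as PMTiles readers do for individual tiles) rather than the whole file.

```python
import urllib.request

def layer_range_request(base_url: str, layer_path: str,
                        start: int, end: int) -> urllib.request.Request:
    """Build (but do not send) a ranged GET for a slice of a pack layer.

    Sketch only: the URL layout is assumed, and end is inclusive per the
    HTTP Range header's bytes=start-end form.
    """
    req = urllib.request.Request(f"{base_url}/{layer_path}")
    req.add_header("Range", f"bytes={start}-{end}")
    return req

req = layer_range_request("https://cdn.example.com/packs/demo/1.0.0",
                          "layers/parcels.pmtiles", 0, 16383)
```

Because published packs are immutable, these ranged responses can be cached aggressively by the CDN without any invalidation logic.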
Next steps
- Architecture Overview — Component descriptions, data formats, and architectural principles.
- Getting Started — Build and validate your first Spatial Pack hands-on.