Spatial Security Boundaries: Production Workflows for Graph-Based Access Control

A routing engine that ignores geographic perimeters will happily return the cheapest path — straight through a restricted military zone, a competitor’s exclusive delivery territory, or a toll cordon the customer never agreed to pay. Spatial security boundaries are the engineering control that prevents this: they constrain graph traversal and shortest-path computation to a set of explicitly permitted geographic or administrative regions, so a route is rejected the moment it would cross an unauthorized segment rather than after an expensive client-side audit. The failure cost is concrete and asymmetric. A single leaked path can breach a service-level agreement, expose another tenant’s network topology, or send a vehicle into a regulated area — and because the violation is correct shortest-path output, it never trips a unit test that only checks reachability. This guide covers how to stamp boundaries onto graph edges, how to make the planner resolve those boundaries before it expands a frontier, and how to enforce all of it from async Python without sacrificing routing latency.

This guide is part of Spatial Graph Database Fundamentals for Python, and it builds directly on the topology produced by node and edge spatial mapping — every boundary check below assumes edges already carry stable identities and indexed geometry.

Prerequisites

Boundary enforcement is a Cypher-and-Python pattern layered on the async Neo4j driver. The polygon-to-edge assignment step depends on shapely for containment tests, and the optional region-scoped routing path uses the Graph Data Science plugin’s relationship filtering.

Requirement	Minimum version	Notes
Python	3.10+	Union types (`dict | None`), structural `match`
Neo4j	5.13+	Native `point` type and point indexes
neo4j (driver)	5.x	Async driver (`AsyncGraphDatabase`)
graphdatascience / GDS plugin	2.5+	Only for boundary-filtered Dijkstra projections
shapely	2.0+	Polygon containment for boundary stamping
pyproj	3.6+	CRS alignment before containment tests

pip install "neo4j>=5.18" "shapely>=2.0" "pyproj>=3.6"

Before any enforcement logic runs, confirm that the boundary polygons and the graph geometry share a coordinate reference system. A containment test between a WGS 84 segment and a Web Mercator polygon silently returns wrong answers, and a misassigned boundary_id is far harder to detect than a crash.
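One inexpensive guard, sketched here under stated assumptions, is to refuse any geometry whose declared CRS is not the expected one and to sanity-check that its coordinates fall inside degree ranges. The function name `assert_wgs84` and its `crs` label argument are hypothetical and are not part of shapely or pyproj. The range check is only a heuristic: it catches Web Mercator metres mislabelled as degrees, not every possible mismatch, and real reprojection would still go through pyproj.

```python
from typing import Iterable

WGS84 = "EPSG:4326"


def assert_wgs84(name: str, crs: str, coords: Iterable[tuple[float, float]]) -> None:
    """Raise if a geometry is not declared as, or does not look like, WGS 84 lon/lat."""
    if crs != WGS84:
        raise ValueError(f"{name}: declared CRS {crs!r}, expected {WGS84!r}")
    for lon, lat in coords:
        # Projected coordinates (metres) almost always fall outside degree ranges.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"{name}: ({lon}, {lat}) is not a plausible lon/lat pair")
```

Calling a check like this on both the boundary polygons and the segments before stamping turns a silent misassignment into an immediate failure.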

Core Concept & Mechanism

A spatial security boundary is, mechanically, a set membership stamped onto the graph. Each routable edge is tagged with the identifier of the region it lies inside; each request carries the set of regions the caller is permitted to traverse. Enforcement reduces to a single invariant: every relationship in a returned path must belong to the permitted set. The subtlety is not the predicate — it is when the predicate runs.

There are two places a boundary check can happen, and only one of them is safe at scale:

Post-traversal filtering. The engine computes a shortest path, then the application discards it if any segment is out of bounds. This is correct but ruinous: the planner explores and ranks paths it will throw away, latency tracks the unconstrained graph size, and a disconnected permitted region can force the engine to traverse the entire network before discovering there is no legal route.
Predicate-before-expansion. The boundary set is pushed into the traversal itself, so an out-of-bounds edge is never added to the frontier. The candidate space collapses to the permitted region, and the cost of enforcement is bounded by the size of what the caller is actually allowed to see.
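The second placement can be illustrated without a database. The sketch below is a dependency-free Dijkstra standing in for the planner, run over a small hypothetical four-node graph; the function name `cheapest_legal_path` and the adjacency-list layout are assumptions made for illustration. The point it shows is that an out-of-bounds edge is skipped at expansion time, so it never enters the frontier.

```python
import heapq

# node -> list of (neighbor, cost, boundary_id)
Graph = dict[str, list[tuple[str, float, str]]]


def cheapest_legal_path(graph: Graph, source: str, target: str, allowed: set[str]):
    """Dijkstra that prunes edges outside the permitted set before expanding them."""
    best = {source: 0.0}
    heap = [(0.0, source, [source])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight, boundary_id in graph.get(node, []):
            if boundary_id not in allowed:
                continue  # predicate-before-expansion: never joins the frontier
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor, path + [neighbor]))
    return None  # no route exists inside the permitted set


graph: Graph = {
    "A": [("C", 1.0, "Z1"), ("B", 5.0, "Z1")],
    "C": [("D", 1.0, "Z2")],
    "B": [("D", 5.0, "Z1")],
}
```

With `allowed={"Z1"}` the search returns the detour through B at cost 10.0, and with `allowed={"Z1", "Z2"}` it returns the shortcut through C at cost 2.0. In neither case is a path computed and then thrown away.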

The mechanism that makes option 2 work rests on three invariants:

Boundary identity lives on the edge, not inferred at query time. Computing polygon containment during a traversal would re-run a point-in-polygon test per hop. Instead, containment is resolved once at ingestion and frozen as a boundary_id property, turning an O(geometry) test into an O(1) property comparison.
The permitted set is a parameter, never interpolated geometry. Callers pass a list of boundary identifiers, not raw polygons. This keeps the query plan cache warm and removes an injection surface.
Spatial scoping precedes cost evaluation. The bounding-box pre-filter that the spatial indexing strategies layer provides shrinks the candidate edge set to the request envelope before any weight is summed, so the boundary predicate runs over a small, index-resolved set.

Raw networks arrive without any of this. Segments straddle polygon edges, ingestion mixes coordinate systems, and boundary polygons overlap at administrative seams. The stamping phase exists to resolve every edge to exactly one permitted-set membership before a single routing request is served.

Schema & Data Model

The contract is only enforceable if boundary_id is a first-class, indexed property on both nodes and the relationships that carry traversal cost. Model each junction as a :Node with a native point location, a stable id, and the boundary_id of the region it sits in; model each traversable segment as a :ROUTE relationship carrying cost and its own boundary_id. The relationship boundary is what enforcement reads: a node can sit exactly on a seam, but after stamping every edge carries exactly one boundary_id, even when the raw segment straddles a polygon edge.

// Stable node identity so boundary stamping is idempotent across re-imports
CREATE CONSTRAINT node_id IF NOT EXISTS
FOR (n:Node) REQUIRE n.id IS UNIQUE;

// Stable segment identity so re-stamping replaces in place, never orphans topology
CREATE CONSTRAINT route_id IF NOT EXISTS
FOR ()-[r:ROUTE]-() REQUIRE r.id IS UNIQUE;

// Point index so the request envelope resolves against the index, not a label scan
CREATE POINT INDEX node_location IF NOT EXISTS
FOR (n:Node) ON (n.location);

// Selective boundary key the planner uses to scope the start set
CREATE INDEX node_boundary IF NOT EXISTS
FOR (n:Node) ON (n.boundary_id);

// Relationship-property index so boundary filtering resolves without scanning every edge
CREATE INDEX route_boundary IF NOT EXISTS
FOR ()-[r:ROUTE]-() ON (r.boundary_id);

// Representative shape of the boundary-aware routing graph
// (:Node {id, location: point({latitude, longitude}), boundary_id})
//   -[:ROUTE {id, cost, boundary_id}]->
// (:Node {id, location, boundary_id})

The physical structure backing location — native point index versus R-tree bucket — is a decision owned by the indexing layer; this schema only exposes the geometry and the two selective keys the planner consumes. When several customers share one graph, the same boundary_id seam is the isolation primitive enforced in depth by multi-tenant security in spatial graphs.

Step-by-Step Implementation

The pipeline has two halves: a one-time (or on-update) stamping pass that assigns boundary_id to every edge, and a per-request routing client that enforces the permitted set. We build it in three stages.

1. Stamp edges with their boundary membership

Boundary assignment is a point-in-polygon test against the edge’s representative point — the midpoint is robust for segments that begin or end exactly on a seam. Resolve containment once, in Python, and persist the result so the database never re-tests geometry at query time. Use a prepared spatial index over the boundary polygons so assignment stays near-linear in edge count.

import asyncio
from typing import Iterable, Iterator

from shapely.geometry import LineString, shape
from shapely.strtree import STRtree


class BoundaryStamper:
    """Resolve each segment to exactly one boundary_id via midpoint containment."""

    def __init__(self, boundary_features: Iterable[dict]):
        # boundary_features: GeoJSON-like dicts with properties.boundary_id
        self.polygons = []
        self.ids = []
        for feat in boundary_features:
            self.polygons.append(shape(feat["geometry"]))
            self.ids.append(feat["properties"]["boundary_id"])
        # STRtree gives an R-tree pre-filter so we test only candidate polygons
        self.tree = STRtree(self.polygons)

    def boundary_for(self, segment: LineString) -> str | None:
        probe = segment.interpolate(0.5, normalized=True)  # robust midpoint
        for idx in self.tree.query(probe):
            if self.polygons[idx].contains(probe):
                return self.ids[idx]
        return None  # straddles an unassigned zone — reject, do not guess

    def stamp(self, edges: Iterable[dict]) -> Iterator[dict]:
        for edge in edges:
            seg = LineString(edge["coords"])  # [(lon, lat), ...]
            bid = self.boundary_for(seg)
            if bid is None:
                # Surfacing this is mandatory: an unstamped edge is a leak waiting to happen
                raise ValueError(f"edge {edge['id']} falls in no boundary polygon")
            yield {"id": edge["id"], "boundary_id": bid,
                   "cost": edge["cost"], "source_id": edge["source_id"],
                   "target_id": edge["target_id"]}

The deliberate choice here is to reject, not default, an edge that falls in no polygon. A silent fallback ("unknown", or the nearest polygon) is exactly how leakage enters a system: the edge becomes routable under whatever set happens to include the fallback. Validation that edge midpoints fall within an assigned polygon is the cheapest place to stop a boundary violation.
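The same reasoning extends to overlapping polygons: a midpoint that falls inside two boundaries is as ambiguous as one that falls inside none. The sketch below shows one possible strict resolver that accepts only a single match. It swaps shapely for a plain ray-casting test so that it runs with no dependencies, and the names `point_in_ring` and `resolve_boundary` are hypothetical. Rings are simple polygons without holes, which is an assumption of this example.

```python
Ring = list[tuple[float, float]]


def point_in_ring(x: float, y: float, ring: Ring) -> bool:
    """Ray-casting containment test for a simple polygon ring (no holes)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def resolve_boundary(x: float, y: float, boundaries: dict[str, Ring]) -> str:
    """Return the single boundary containing (x, y); raise on zero or several."""
    matches = [bid for bid, ring in boundaries.items() if point_in_ring(x, y, ring)]
    if len(matches) != 1:
        raise ValueError(f"point ({x}, {y}) matched {len(matches)} boundaries: {matches}")
    return matches[0]


boundaries = {
    "Z1": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "Z2": [(1.0, 0.0), (3.0, 0.0), (3.0, 2.0), (1.0, 2.0)],  # overlaps Z1 for 1 <= x <= 2
}
```

Here `resolve_boundary(0.5, 1.0, boundaries)` resolves to Z1, while a probe at (1.5, 1.0) sits in the overlap and raises, as does a probe at (5.0, 5.0) that matches nothing.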

2. Persist the stamps over a pooled async session

Write the stamped edges back with a parameterized UNWIND, merging on the constrained id so re-stamping an updated network is idempotent. The boundary becomes a frozen property the routing path reads for free.

from neo4j import AsyncGraphDatabase


class BoundaryWriter:
    def __init__(self, uri: str, user: str, password: str, pool_size: int = 8):
        self.driver = AsyncGraphDatabase.driver(
            uri, auth=(user, password), max_connection_pool_size=pool_size
        )

    async def close(self) -> None:
        await self.driver.close()

    async def write_batch(self, batch: list[dict]) -> None:
        query = """
        UNWIND $batch AS row
        MATCH (s:Node {id: row.source_id})
        MATCH (t:Node {id: row.target_id})
        MERGE (s)-[e:ROUTE {id: row.id}]->(t)
        SET e.cost = row.cost,
            e.boundary_id = row.boundary_id
        """
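        async with self.driver.session(database="routing") as session:
            result = await session.run(query, batch=batch)
            # Consume explicitly so a server-side failure (for example a
            # constraint violation) raises here rather than being deferred
            # to session close.
            await result.consume()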
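Because write_batch takes a list while the stamper yields edges lazily, something has to cut the stream into bounded batches. A minimal sketch of such a helper follows; the name `chunked` is an assumption, and it is written by hand because `itertools.batched` only exists from Python 3.12.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most `size` items without materializing the input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch
```

A caller could then loop over `chunked(stamper.stamp(edges), 1000)` and await `write_batch` once per chunk. The batch size of 1000 is illustrative only and should be tuned against transaction memory limits.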

3. Enforce the permitted set at request time

The routing client expands the query envelope client-side (so the server never recomputes the bounding box), then runs a boundary-filtered traversal. The all() predicate is the enforcement point: a path survives only if every segment belongs to the permitted set.

import asyncio
import math

from neo4j import AsyncGraphDatabase
from shapely.geometry import box


class SpatialRoutingClient:
    def __init__(self, uri: str, user: str, password: str):
        self.driver = AsyncGraphDatabase.driver(
            uri,
            auth=(user, password),
            max_connection_pool_size=50,
            connection_acquisition_timeout=5.0,
            max_connection_lifetime=3600,
        )

    async def route_within_boundaries(
        self,
        origin_id: str,
        dest_id: str,
        allowed_boundaries: list[str],
        max_hops: int = 50,
        lat: float = 0.0,
        lon: float = 0.0,
        radius_km: float = 10.0,
    ) -> dict | None:
        # Expand the request envelope using a spherical approximation
        # (~111.32 km per degree of latitude)
        delta_lat = radius_km / 111.32
        delta_lon = radius_km / (111.32 * math.cos(math.radians(lat)))
        query_envelope = box(
            lon - delta_lon, lat - delta_lat, lon + delta_lon, lat + delta_lat
        )

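        # bool is a subclass of int, so exclude it explicitly; a float such as
        # 3.5 would otherwise pass the range check and reach the f-string below.
        if isinstance(max_hops, bool) or not isinstance(max_hops, int):
            raise TypeError("max_hops must be an integer")
        if not (1 <= max_hops <= 200):
            raise ValueError("max_hops must be between 1 and 200")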

        # Cypher requires a literal upper bound on a variable-length pattern,
        # so the validated integer is interpolated by trusted code — never the
        # boundary set, which stays a parameter.
        cypher = f"""
            MATCH (o:Node {{id: $origin_id}})
            MATCH (d:Node {{id: $dest_id}})
            MATCH path = (o)-[rels:ROUTE*1..{max_hops}]->(d)
            WHERE all(rel IN rels WHERE rel.boundary_id IN $allowed_boundaries)
            WITH path, reduce(w = 0.0, rel IN rels | w + rel.cost) AS total_cost
            ORDER BY total_cost ASC
            LIMIT 1
            RETURN path, total_cost
        """

        async with self.driver.session(database="routing") as session:
            result = await session.run(
                cypher,
                origin_id=origin_id,
                dest_id=dest_id,
                allowed_boundaries=allowed_boundaries,
            )
            record = await result.single()
            if record:
                return {
                    "path_nodes": [n["id"] for n in record["path"].nodes],
                    "total_cost": record["total_cost"],
                    "query_envelope_wkt": query_envelope.wkt,
                }
            return None

    async def close(self) -> None:
        await self.driver.close()

This client demonstrates the three production essentials together: a bounded connection pool sized for concurrent dispatch, a spatial envelope precomputed client-side, and strict parameterization of the boundary set. Note that in this listing the envelope is only returned to the caller as WKT; it is not applied as a predicate inside the Cypher. The only value baked into the query string is the validated integer hop cap, because Cypher forbids parameters as variable-length bounds, and it is range-checked before interpolation.
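The envelope arithmetic in the client can be isolated and checked on its own. The sketch below uses the same spherical approximation of about 111.32 km per degree of latitude; the function name `request_envelope` and the pole guard are assumptions added for illustration. It does not handle envelopes that wrap across the antimeridian.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # same spherical approximation as the client above


def request_envelope(lat: float, lon: float, radius_km: float) -> tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) around a point, in degrees."""
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    cos_lat = math.cos(math.radians(lat))
    if cos_lat < 1e-6:
        # Longitude degrees shrink toward zero width at the poles, so the
        # division below would blow up.
        raise ValueError("latitude too close to a pole for this approximation")
    delta_lat = radius_km / KM_PER_DEGREE_LAT
    delta_lon = radius_km / (KM_PER_DEGREE_LAT * cos_lat)
    return (lon - delta_lon, lat - delta_lat, lon + delta_lon, lat + delta_lat)
```

At the equator a 111.32 km radius spans one degree in every direction, while at 60° latitude the same radius spans roughly two degrees of longitude on each side. This is why the longitude term is divided by the cosine of the latitude.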

Query Patterns & Variants

Pick the variant whose anchor matches how callers parameterize the request.

Variant A — strict all-segment compliance (plain Cypher). Best when the objective is the cheapest legal path inside a small, well-connected permitted region. The hop cap prevents runaway expansion in disconnected or cyclic graphs.

MATCH (o:Node {id: $origin_id})
MATCH (d:Node {id: $dest_id})
MATCH path = (o)-[rels:ROUTE*1..30]->(d)
WHERE all(rel IN rels WHERE rel.boundary_id IN $allowed_boundaries)
WITH path, reduce(w = 0.0, rel IN rels | w + rel.cost) AS total_cost
ORDER BY total_cost ASC
LIMIT 1
RETURN path, total_cost
// $allowed_boundaries stays a parameter; the *1..30 bound is literal by necessity.

Variant B — boundary-filtered Dijkstra (GDS). Best for large networks where deterministic latency matters. Project a subgraph that already excludes out-of-bounds relationships, then run weighted Dijkstra over it — enforcement moves into the projection, so the algorithm cannot consider an illegal edge.

// One-time projection scoped to the permitted boundaries via a relationship filter
MATCH (s:Node)-[r:ROUTE]->(t:Node)
WHERE r.boundary_id IN $allowed_boundaries
WITH gds.graph.project(
  'compliant_routing',
  s, t,
  { relationshipProperties: r { .cost } }
) AS g
RETURN g.graphName AS graphName;

// Per-request weighted shortest path over the boundary-scoped projection
MATCH (src:Node {id: $origin_id}), (dst:Node {id: $dest_id})
CALL gds.shortestPath.dijkstra.stream('compliant_routing', {
  sourceNode: src, targetNode: dst, relationshipWeightProperty: 'cost'
})
YIELD totalCost, nodeIds
RETURN totalCost, [id IN nodeIds | gds.util.asNode(id).id] AS route;

Variant C — leakage audit. Run this in CI and after every re-stamp. It surfaces any path that would cross out of the permitted set, which is how you prove the enforcement predicate is actually doing its job before customers depend on it.

MATCH (o:Node {id: $origin_id})
MATCH path = (o)-[rels:ROUTE*1..30]->(:Node {id: $dest_id})
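// A null boundary_id makes `NOT ... IN` evaluate to null, which the filter
// would silently drop, so unstamped edges are tested for explicitly.
WITH path, [rel IN rels WHERE rel.boundary_id IS NULL OR NOT rel.boundary_id IN $allowed_boundaries] AS violations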
WHERE size(violations) > 0
RETURN size(violations) AS leaked_segments, [rel IN violations | rel.id] AS leaked_ids
LIMIT 5
// Any row returned here is a path that strict enforcement must reject — investigate the stamping.
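The same audit can be repeated on the application side as a last line of defense before a route is returned. This is a minimal sketch, assuming each path is available as `(edge_id, boundary_id)` pairs; the function name `leaked_edge_ids` is hypothetical. It treats a missing boundary_id as a violation, because in Cypher a null operand makes `IN` evaluate to null rather than false, which makes unstamped edges easy to overlook.

```python
from typing import Iterable, Optional


def leaked_edge_ids(
    path_edges: Iterable[tuple[str, Optional[str]]], allowed: set[str]
) -> list[str]:
    """Return ids of edges that are unstamped or outside the permitted set."""
    return [
        edge_id
        for edge_id, boundary_id in path_edges
        if boundary_id is None or boundary_id not in allowed
    ]
```

An empty list means the path is compliant. Anything else names the exact segments to investigate.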

Performance Tuning

Boundary enforcement trades a small per-edge predicate cost for a drastically smaller candidate set; the net is almost always a win, but only if the planner enters through an index.

Profile with PROFILE, validate shape with EXPLAIN. Read the plan bottom-up and find the first operator whose row count dwarfs the size of the result. A boundary_id filter applied after a full ROUTE expansion means the planner did not use the relationship index; anchor the start node by boundary_id or location so scoping happens first. This is the same discipline covered under graph query planner optimization.
Watch boundary_id cardinality. If only a handful of distinct boundaries exist, the index is unselective and the planner may rightly prefer a scan. For coarse partitions, combine boundary_id with the point index on location so the request envelope, not the boundary, drives selectivity.
Materialize masks for very large networks. Above ~10M edges, precompute per-tenant permitted-edge bitsets or zone-crossing penalties so enforcement is a membership test rather than a list scan. Cache compliant paths at the application layer with invalidation tied to topology and re-stamp events.
Bound the traversal aggressively. The *1..N cap is a safety valve, not a tuning knob — keep it as tight as the network diameter allows so a disconnected permitted region fails fast instead of exploring the whole graph.
Defragment after heavy re-stamps. Frequent boundary updates can fragment the relationship-property index; where PROFILE shows lookup latency drifting upward, scheduled drop-and-rebuild cycles (or partitioned index shards) keep it flat. Pair this with the broader read tuning in cypher performance tuning.
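The application-layer path cache mentioned in the list above could look like the following sketch. The class name `CompliantPathCache` and its methods are hypothetical. The key includes the permitted set as a frozenset, so two callers with different permissions never share an entry, and every re-stamp or topology change clears the cache outright.

```python
from typing import Iterable, Optional


class CompliantPathCache:
    """Cache routes per (origin, destination, permitted set); drop all on re-stamp."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str, frozenset[str]], list[str]] = {}

    @staticmethod
    def _key(origin_id: str, dest_id: str, allowed: Iterable[str]):
        # frozenset makes the key independent of the order of the permitted list
        return (origin_id, dest_id, frozenset(allowed))

    def get(self, origin_id: str, dest_id: str, allowed: Iterable[str]) -> Optional[list[str]]:
        return self._entries.get(self._key(origin_id, dest_id, allowed))

    def put(self, origin_id: str, dest_id: str, allowed: Iterable[str],
            path_nodes: Iterable[str]) -> None:
        self._entries[self._key(origin_id, dest_id, allowed)] = list(path_nodes)

    def on_restamp(self) -> None:
        # Any re-stamp or topology change may alter which paths are legal.
        self._entries.clear()
```

Clearing everything is the bluntest possible invalidation. Finer-grained schemes, such as invalidating only entries that touch a re-stamped boundary, are possible but need care to avoid serving a path that is no longer compliant.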

Edge Cases & Gotchas

Boundary leakage from coordinate drift. A segment whose endpoints round across a polygon seam can be stamped to the wrong region. Test containment on the midpoint, not an endpoint, and reject edges that match no polygon rather than defaulting them.
SRID mismatch in containment tests. A geographic segment (SRID 4326) tested against a projected polygon returns nonsense, and point.distance across SRIDs returns null — a null predicate silently drops rows. Align CRS before stamping and before any in-query distance filter.
Variable-length bound interpolation. Cypher forbids a parameter as a *1..N bound, so the cap must be interpolated. Range-check it as an integer first; never let a request value reach the query string unchecked, and never put the boundary set anywhere but a parameter.
GDS projection staleness. A boundary-scoped projection is a snapshot. After re-stamping or a topology change, drop and re-project, or Variant B will route over the old permitted set and quietly leak.
Unselective boundary keys. A near-constant boundary_id makes the index worthless and pushes the planner to a scan; lean on the spatial envelope for selectivity in that case, or partition the graph by tenant.
relationships(path) vs a named list. Binding the relationships once (-[rels:ROUTE*..]->) and reusing rels in both the WHERE and the reduce avoids re-deriving the list and keeps the predicate and the cost in lockstep.
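The interpolation gotcha in the list above is easiest to contain by funnelling every hop bound through one small function. The sketch below is one way to do that; `validated_hop_bound` and `variable_length_pattern` are hypothetical names, and the ceiling of 200 simply mirrors the routing client's own limit.

```python
def validated_hop_bound(value: object, maximum: int = 200) -> int:
    """Return value only if it is a real int within [1, maximum]."""
    # bool is a subclass of int, so it has to be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("hop bound must be an integer")
    if not 1 <= value <= maximum:
        raise ValueError(f"hop bound must be between 1 and {maximum}")
    return value


def variable_length_pattern(max_hops: object) -> str:
    """Build the only query fragment that is ever interpolated."""
    return f"-[rels:ROUTE*1..{validated_hop_bound(max_hops)}]->"
```

Keeping the f-string inside the same module as the validator means no caller can assemble the pattern from an unchecked value. Everything else in the query, including the boundary set, stays a parameter.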

Verification & Testing

Enforcement is only trustworthy if you can prove a blocked path is actually blocked. A reachability test alone passes whether or not boundaries work — the regression you must catch is a path that should be rejected but is returned. Seed a fixture where the only short route crosses a forbidden zone and assert it is refused.

import pytest
from neo4j import AsyncGraphDatabase

SEED = """
CREATE (a:Node {id: 'A', location: point({latitude: 47.60, longitude: -122.33}), boundary_id: 'Z1'})
CREATE (b:Node {id: 'B', location: point({latitude: 47.62, longitude: -122.35}), boundary_id: 'Z1'})
CREATE (c:Node {id: 'C', location: point({latitude: 47.64, longitude: -122.30}), boundary_id: 'Z2'})
CREATE (d:Node {id: 'D', location: point({latitude: 47.66, longitude: -122.28}), boundary_id: 'Z1'})
// Short route A->C->D crosses Z2; long route A->B->D stays in Z1
CREATE (a)-[:ROUTE {id: 'e1', cost: 1.0, boundary_id: 'Z1'}]->(c)
CREATE (c)-[:ROUTE {id: 'e2', cost: 1.0, boundary_id: 'Z2'}]->(d)
CREATE (a)-[:ROUTE {id: 'e3', cost: 5.0, boundary_id: 'Z1'}]->(b)
CREATE (b)-[:ROUTE {id: 'e4', cost: 5.0, boundary_id: 'Z1'}]->(d)
"""

ROUTE = """
MATCH path = (o:Node {id: 'A'})-[rels:ROUTE*1..10]->(d:Node {id: 'D'})
WHERE all(rel IN rels WHERE rel.boundary_id IN $allowed)
WITH path, reduce(w = 0.0, rel IN rels | w + rel.cost) AS cost
ORDER BY cost ASC LIMIT 1
RETURN [r IN relationships(path) | r.id] AS edges, cost
"""


@pytest.mark.asyncio
async def test_boundary_enforcement_rejects_the_cheaper_illegal_path():
    driver = AsyncGraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "test"))
    async with driver.session(database="neo4j") as s:
        await s.run("MATCH (n) DETACH DELETE n")
        await s.run(SEED)

        # Caller permitted only in Z1: must take the costlier compliant detour
        legal = await (await s.run(ROUTE, allowed=["Z1"])).single()
        assert legal is not None, "a compliant path A->B->D must exist"
        assert legal["edges"] == ["e3", "e4"], "must avoid the Z2 shortcut"
        assert legal["cost"] == 10.0

        # With Z2 permitted, the cheaper crossing path becomes legal
        relaxed = await (await s.run(ROUTE, allowed=["Z1", "Z2"])).single()
        assert relaxed["edges"] == ["e1", "e2"], "Z2 allowed -> take the shortcut"
        assert relaxed["cost"] == 2.0

    await driver.close()

The asymmetry is the whole point: the first assertion fails loudly if enforcement regresses and the engine returns the cheaper e1/e2 path while the caller is scoped to Z1. Run the leakage-audit query (Variant C) in the same suite to assert zero out-of-bounds segments after every re-stamp.

FAQ

Why stamp boundary_id onto edges instead of testing polygons at query time?

Point-in-polygon is an O(geometry) operation. Running it per hop during a traversal multiplies that cost by the frontier size and makes latency depend on polygon complexity. Resolving containment once at ingestion and freezing it as a boundary_id property turns the per-hop check into an O(1) property comparison the planner can resolve against an index. The polygons only need to be consulted again when the network or the boundaries change.

Can I pass the permitted boundary set as a parameter, or must I inline it?

Always pass it as a parameter ($allowed_boundaries). Parameters keep the query-plan cache warm across requests and remove an injection surface. The only value that must be interpolated into the query string is the variable-length upper bound (*1..N), because Cypher forbids a parameter there — and that value should be range-checked as an integer before it ever touches the string.

What happens to an edge that falls in no boundary polygon?

Reject it at stamping time and surface the error. The dangerous alternative is a silent default — assigning "unknown" or the nearest polygon — because the edge then becomes routable under whatever permitted set happens to include that fallback, which is precisely how leakage enters production. An edge straddling an unassigned zone is a data problem to fix upstream, not a value to guess.

Should I enforce boundaries in plain Cypher or in a GDS projection?

Use plain Cypher with an all() predicate for cheapest-legal-path queries inside small, well-connected regions. Switch to a boundary-filtered GDS projection when networks are large and you need deterministic latency: project a subgraph whose relationship filter already excludes out-of-bounds edges, so the algorithm physically cannot consider an illegal segment. Remember the projection is a snapshot — re-project after any re-stamp.

How does this relate to multi-tenant isolation?

Boundary enforcement and tenant isolation use the same boundary_id seam, but isolation adds a hard requirement that one tenant can never observe another’s topology even through planner expansion. That demands composite tenant-geometry index keys and scoping that runs before any spatial predicate. The full treatment lives in enforcing multi-tenant security in spatial graphs.

Node and Edge Spatial Mapping — the topology and stable identities that boundary stamping is applied to.
Spatial Indexing Strategies — the bounding-volume pre-filter that scopes a request before the boundary predicate runs.
Graph Query Planner Optimization — making the planner seek your boundary and point indexes instead of scanning.
Enforcing Multi-Tenant Security in Spatial Graphs — extending boundary control into hard cross-tenant isolation.
Scoping Routes with Composite Tenant-Geometry Indexes — making the tenant filter and spatial predicate resolve as one index seek.
Distance Filter Query Patterns — the envelope and radius predicates that pair with boundary scoping.

