`aperta.network_processing`

Graph-construction and -annotation helpers for transport networks.

Aperta operates on networkx.Graph (and its multi/directed variants) as its canonical graph type. This module is OSM-agnostic: it consumes whatever edge/node attributes the caller provides and never reads OSM-specific tag content. OSM-aware helpers (tag classification by highway rank, OSMnx-wrapped consolidation, OSM POI fetch) live in aperta_atlas.osm / aperta_atlas.osm_helpers; see the 2026-06-07 CHANGELOG entry.

What’s here:

Intersection topology flags: flag_node_intersection_topology — network-agnostic per-node n_streets / is_t_junction / is_4way from graph degree.
Spatial primitives: snap_features_to_nodes, a KDTree nearest-within-radius assignment of point locations to graph nodes that mutates a per-node boolean flag. Useful for any “label nearby nodes” workflow (POI snap, AOI membership, sensor attribution), not only OSM data.
Edge / node attribute helpers: aggregate node attributes onto edges (aggregate_nodes_to_edges), aggregate edge attributes onto nodes (aggregate_edges_to_nodes), and write attribute values through to a graph in a tolerant way (set_nx_edge_attributes_filled).
Edge betweenness sampling: get_nested_edge_betweenness runs the per-origin Dijkstra + path-walking accumulator used by the traffic-flow estimation pipeline in traffic_flows.py.

Sibling modules cover the workflow steps that consume this module’s output:

aperta.network_snap — snap-target resolution: snap_to_network_nodes, insert_projected_nodes, and companions.
aperta.routing_prep — mode-aware preparation: prepare_network, compute_snap_eligibility, PreparedGraph, MODE_DEFAULTS.

aperta.network_processing.parse_edge_id(s)

Parse a ‘u:v[:k]’ edge-id string into a (u, v[, key]) tuple.

The string form is aperta’s canonical on-disk edge encoding (produced when serialising graph edges to CSV / GeoPackage). Numeric-looking parts are coerced back to int; non-numeric parts pass through as strings.

Parameters:

s (str)

Return type:

tuple
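The parsing rule described above can be illustrated with a minimal sketch. This is not aperta’s implementation: the name parse_edge_id_sketch is hypothetical, and raising ValueError on a wrong number of parts is an assumption the documentation does not confirm.

```python
def parse_edge_id_sketch(s: str) -> tuple:
    """Split a 'u:v[:k]' string and coerce integer-looking parts to int."""

    def coerce(part: str):
        try:
            return int(part)
        except ValueError:
            return part  # non-numeric parts pass through as strings

    parts = s.split(":")
    if len(parts) not in (2, 3):
        # Assumption: malformed IDs are rejected rather than passed through.
        raise ValueError(f"expected 'u:v' or 'u:v:k', got {s!r}")
    return tuple(coerce(p) for p in parts)


plain_edge = parse_edge_id_sketch("101:202")      # (101, 202)
multi_edge = parse_edge_id_sketch("101:202:0")    # (101, 202, 0)
mixed_edge = parse_edge_id_sketch("stop_a:202")   # ('stop_a', 202)
```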

aperta.network_processing.attach_node_properties(graph, df)

Attach a node-properties DataFrame to graph in place.

df.index must be the graph’s node IDs (strict set equality). Use this to layer property tables from arbitrary sources onto a network, typically after loading a graph from disk, to attach companion CSVs.

Parameters:

graph (Graph)
df (DataFrame)

Return type:

None

aperta.network_processing.attach_edge_properties(graph, df)

Attach an edge-properties DataFrame to graph in place.

df.index is either:

tuple-keyed (u, v) (non-multigraph) or (u, v, key) (multigraph) — for callers building the frame programmatically; or
string-keyed in aperta’s canonical ‘u:v[:key]’ form — for the common case of loading edge properties from a CSV produced via edge-id serialisation. Auto-parsed via parse_edge_id.

For non-multigraphs the parsed key element is dropped before matching. Keys must match the graph’s edges with strict set equality.

Parameters:

graph (Graph)
df (DataFrame)

Return type:

None
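The key handling described above (string or tuple keys, key element dropped for non-multigraphs, strict set equality) might look roughly like the following sketch. It works on plain Python collections instead of a DataFrame and a networkx graph. The helper names are hypothetical, ValueError is an assumed error type, and tuples are compared exactly as given, with no (u, v) / (v, u) normalisation.

```python
def normalise_edge_keys(index, is_multigraph: bool) -> set:
    """Turn tuple- or string-keyed edge IDs into a set of comparable tuples."""

    def coerce(part: str):
        try:
            return int(part)
        except ValueError:
            return part

    keys = set()
    for raw in index:
        if isinstance(raw, str):
            parts = tuple(coerce(p) for p in raw.split(":"))  # 'u:v[:key]' form
        else:
            parts = tuple(raw)
        # Non-multigraphs match on (u, v) only, so any key element is dropped.
        keys.add(parts if is_multigraph else parts[:2])
    return keys


def check_strict_match(index, graph_edges, is_multigraph: bool) -> None:
    """Raise if the property keys and the graph's edges are not the same set."""
    keys = normalise_edge_keys(index, is_multigraph)
    edges = set(graph_edges)
    if keys != edges:
        raise ValueError(
            f"edge keys differ: {len(keys - edges)} unknown, {len(edges - keys)} missing"
        )


graph_edges = [(1, 2), (2, 3)]
# String keys carrying a ':0' key element still match a non-multigraph.
check_strict_match(["1:2:0", "2:3:0"], graph_edges, is_multigraph=False)
```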

aperta.network_processing.verify_odm_against_network(odm, network_nodes, check_destinations=False)

Verify ODM origins (and optionally destinations) appear in the network’s nodes.

Use this after loading an ODM (e.g. Context.get_tiered_odm, on each populated tier dict) to catch ODM / network drift — e.g. an ODM that was built against an older snapshot of the network than what the caller is now routing on. Pure data check; does no I/O.

check_destinations=True validates the values too — only meaningful for ‘idx’-style ODMs whose values are destination node IDs. For value-style ODMs (floats, type codes), leave it False.

Parameters:

odm (dict)
network_nodes (DataFrame | GeoDataFrame)
check_destinations (bool)

Return type:

None
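As a rough illustration of the check, the sketch below compares ODM keys (and optionally values) against a set of node IDs. It assumes that origins are the dict keys, that an ‘idx’-style ODM stores an iterable of destination node IDs per origin, and that a failed check raises ValueError; none of these details are confirmed by the documentation, and verify_odm_sketch is a hypothetical name.

```python
def verify_odm_sketch(odm: dict, node_ids, check_destinations: bool = False) -> None:
    """Raise if any ODM origin (or destination) is not a known network node."""
    known = set(node_ids)
    missing = set(odm) - known  # origins are the dict keys
    if check_destinations:
        for dests in odm.values():
            missing |= set(dests) - known
    if missing:
        sample = sorted(missing, key=str)[:5]
        raise ValueError(f"{len(missing)} ODM node IDs absent from the network, e.g. {sample}")


node_ids = [10, 11, 12]
odm = {10: [11, 12], 11: [10, 99]}  # destination 99 is stale

verify_odm_sketch(odm, node_ids)  # origins only: passes
```

With check_destinations=True the same ODM would fail, because destination 99 is not among the node IDs.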

aperta.network_processing.set_nx_edge_attributes_filled(graph, attr, attr_name, fill_value=0, strict=False)

Set per-edge attribute attr_name on graph, filling missing edges with fill_value.

nx.set_edge_attributes silently skips edges that are absent from the input mapping, leaving them without the attribute. That is a footgun for downstream code that expects the attribute to be present on every edge. This wrapper writes fill_value to those edges instead.

Parameters:

graph (MultiGraph) – a MultiGraph (uses (u, v, k) edge keys).
attr (dict | Series) – edge → value mapping, keyed by (u, v, k) tuples.
attr_name (str) – edge attribute name to write.
fill_value – value to assign to edges missing from attr. Default 0.
strict (bool) – if True, raise DataError when attr is missing any of the graph’s edges. Default False (silently fill).

Returns:

graph, mutated in place.
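The fill logic can be sketched on a plain edge list. The helper below is hypothetical and only builds the completed mapping; with networkx, that mapping could then be passed to nx.set_edge_attributes(graph, filled, name=attr_name). ValueError stands in for the library’s DataError, which is not reproduced here.

```python
def fill_edge_values(edges, attr: dict, fill_value=0, strict: bool = False) -> dict:
    """Return a value for every edge, using fill_value where attr has none."""
    missing = [e for e in edges if e not in attr]
    if strict and missing:
        # Stand-in for the DataError described above.
        raise ValueError(f"attr is missing {len(missing)} edge(s), e.g. {missing[0]}")
    return {e: attr.get(e, fill_value) for e in edges}


edges = [(1, 2, 0), (2, 3, 0), (2, 3, 1)]
speeds = {(1, 2, 0): 50.0, (2, 3, 1): 30.0}

filled = fill_edge_values(edges, speeds, fill_value=0)
# Every edge now has a value; (2, 3, 0) received the fill value.
```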

aperta.network_processing.get_nested_edge_betweenness(graph, nested_node_sample, weight, *, cutoff=None)

Edge usage counts from a nested (origin → sampled-destinations) sample.

For each origin in nested_node_sample, runs a single-source Dijkstra on graph (via scipy.sparse.csgraph.dijkstra with return_predecessors=True), walks the predecessor chain from each sampled destination back to the origin, and adds 1 to every edge on the path. The result is the sum of these increments over all sampled OD pairs: a “traffic-stress”-style edge usage count, not classical Brandes betweenness.

Repeated destinations in the per-origin sample count multiple times (each occurrence adds 1 to its path’s edges), so the relative weighting of OD pairs comes from the upstream sampling step’s destination distribution, not from this function.

Parameters:

graph (Graph) – networkx graph (any variant). MultiGraph parallel edges with the same (u, v) collapse to the min-weight edge for routing, and the chosen key is the one credited in the output.
nested_node_sample (dict) – {origin_node -> array_of_dest_nodes}, typically from traffic_flows.nested_node_sample. Origins are unique; duplicate destinations within an origin’s array are fine.
weight (str) – edge attribute name to use as the per-edge cost (e.g. ‘duration_s’). Required — there’s no “all edges weight 1” default since traffic-flow sampling always needs real costs.
cutoff (float | None) – optional network-distance cutoff in weight units. Passed to csg.dijkstra(limit=cutoff), which stops each per-origin search at that distance and treats anything farther away as unreachable. Set this to the upstream sampling radius (typically r_zones from od_pairs.get_pairs): destinations sampled within that radius are guaranteed reachable within cutoff, and the truncation gives a large speed-up on country-scale graphs. Default None = no cutoff.

Returns:

pd.Series indexed by edge ID — (u, v) for plain graphs, (u, v, k) for multigraphs — with the accumulated edge usage count.

Return type:

Series
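The accumulator can be illustrated with a small pure-Python sketch. It swaps scipy’s csgraph Dijkstra for a heapq-based one and uses a nested dict {u: {v: cost}} in place of a networkx graph. The function names are hypothetical, and silently skipping destinations that are unreachable or beyond the cutoff is an assumption about behaviour the documentation does not spell out.

```python
import heapq
from collections import Counter


def dijkstra_predecessors(adj, origin, cutoff=None):
    """Single-source Dijkstra; returns {node: predecessor} for reached nodes."""
    dist = {origin: 0.0}
    pred = {origin: None}
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in adj.get(u, {}).items():
            nd = d + cost
            if cutoff is not None and nd > cutoff:
                continue  # beyond the search limit
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return pred


def edge_usage_counts(adj, nested_node_sample, cutoff=None):
    """Add 1 to every edge on the shortest path of each sampled OD pair."""
    counts = Counter()
    for origin, dests in nested_node_sample.items():
        pred = dijkstra_predecessors(adj, origin, cutoff)  # one search per origin
        for dest in dests:
            if dest not in pred:
                continue  # assumption: unreachable destinations add nothing
            node = dest
            while pred[node] is not None:  # walk back to the origin
                counts[(pred[node], node)] += 1
                node = pred[node]
    return counts


adj = {"a": {"b": 1.0, "c": 5.0}, "b": {"c": 1.0}, "c": {}}
sample = {"a": ["c", "c", "b"]}  # "c" is sampled twice, so its path counts twice

usage = edge_usage_counts(adj, sample)
usage_cut = edge_usage_counts(adj, sample, cutoff=1.5)
```

In this toy graph the route a → b → c (cost 2) beats the direct a → c edge (cost 5), so the direct edge is never credited. With cutoff=1.5 only the a → b trip is counted.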

aperta.network_processing.aggregate_nodes_to_edges(df_nodes, cols, node_edge_relations, *, aggregator)

Aggregate node-level features onto the edges they touch.

Parameters:

df_nodes (DataFrame) – list of nodes, supplied as a DataFrame.
cols (list[str]) – list of columns in df_nodes to be mapped to edges.
node_edge_relations (str | Graph) – if str, must list the edges belonging to each node in column ‘node_edge_relations’ in df_nodes, separated by a comma (,). Otherwise, supply an nx.Graph where the ID of each node corresponds to the index in df_nodes.
aggregator (str | Callable) – how to aggregate values from different nodes onto a single edge. One of ‘max’, ‘min’, ‘mean’, ‘sum’, ‘median’, or a callable that takes a 1-D numpy array of per-node values and returns a scalar. String aggregators use the nan-safe numpy variants (silently skip NaN values).

Return type:

DataFrame
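A minimal sketch of the node-to-edge direction follows, using plain dicts instead of a DataFrame and covering only the case where each edge takes the values of its two endpoints. The helper names are hypothetical, and returning NaN when every contributing value is NaN is an assumption.

```python
import math


def nan_safe(func, values):
    """Apply func to the non-NaN values; NaN if nothing is left (assumption)."""
    kept = [v for v in values if not math.isnan(v)]
    return func(kept) if kept else math.nan


def node_values_to_edges(node_values, edges, func=max):
    """Aggregate the values of both endpoint nodes onto each (u, v) edge."""
    return {
        (u, v): nan_safe(func, [node_values[u], node_values[v]])
        for u, v in edges
    }


node_values = {1: 0.2, 2: math.nan, 3: 0.9, 4: math.nan}
edges = [(1, 2), (2, 3), (1, 3), (2, 4)]

edge_max = node_values_to_edges(node_values, edges, func=max)
# (1, 2) -> 0.2 because the NaN at node 2 is skipped; (2, 4) -> NaN
```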

aperta.network_processing.snap_features_to_nodes(graph, locations, *, flag_name, max_distance)

Snap a list of point locations to nearest-within-radius graph nodes, writing is_<flag_name>=1 on matched nodes and is_<flag_name>=0 on all others. Mutates graph in place.

Use case: re-attach OSM-tagged obstacle features (traffic signals, stops, give-ways, roundabout midpoints) to a consolidated graph after their original OSM nodes were merged away by osmnx.consolidate_intersections. The same primitive works for any “label nearby nodes with a flag” workflow — POI snap, AOI membership, sensor-station attribution, etc.

Uses a scipy.spatial.KDTree query with distance_upper_bound for a strict radius cap; locations whose nearest node is farther than max_distance are silently dropped (they flag no node). For an OSMnx-style MultiDiGraph the node coordinates come from each node’s x / y attributes; the graph must be in a metric CRS so that max_distance is meaningful in metres.

Parameters:

graph (MultiDiGraph) – MultiDiGraph with node x and y attributes in a metric CRS.
locations (list[tuple[float, float]]) – list of (x, y) tuples in the same CRS as the graph’s node coords.
flag_name (str) – bare attribute name; the boolean is written as f"is_{flag_name}" (so flag_name='traffic_signal' produces is_traffic_signal).
max_distance (float) – maximum snap distance, in the graph CRS’s units. Locations farther than this are dropped.

Return type:

None
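The snapping rule can be illustrated without scipy. The sketch below replaces the KDTree query with a brute-force nearest-node search, which gives the same assignment on small inputs but scales poorly. The function name is hypothetical, and treating a distance exactly equal to max_distance as a match is an assumption.

```python
import math


def snap_flags(node_xy, locations, max_distance):
    """Return {node: 0 or 1}; 1 if some location snapped to that node."""
    flags = {node: 0 for node in node_xy}  # every node starts unflagged
    for x, y in locations:
        nearest, best = None, math.inf
        for node, (node_x, node_y) in node_xy.items():
            d = math.hypot(x - node_x, y - node_y)
            if d < best:
                nearest, best = node, d
        if nearest is not None and best <= max_distance:
            flags[nearest] = 1
        # otherwise the location is dropped and flags no node
    return flags


node_xy = {1: (0.0, 0.0), 2: (100.0, 0.0)}
locations = [(3.0, 4.0), (50.0, 50.0)]  # 5 m from node 1; about 70.7 m from either node

is_traffic_signal = snap_flags(node_xy, locations, max_distance=10.0)
```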

aperta.network_processing.flag_node_intersection_topology(graph)

Mutate graph in place to add per-node topology-only intersection flags. Network-agnostic — works on any graph regardless of where it came from (OSM, a custom road dataset, a synthetic graph) since it inspects only neighbour count, not edge tags.

Per-node attributes written:

n_streets — number of distinct neighbour nodes (degree in the undirected sense, ignoring edge direction and parallel edges). The “physical” intersection size: 1 = dead-end, 2 = passthrough, 3 = T-junction, 4+ = 4-way intersection or denser.
is_t_junction — 1 if n_streets == 3, else 0.
is_4way — 1 if n_streets >= 4, else 0.

is_t_junction and is_4way are mutually exclusive — a 4-way node carries only is_4way. (Degree 1 / 2 nodes — leaves and passthroughs — get neither.)

OSM-tag-based per-node classifications (highway rank, _major / _anchor variants) live in the companion function aperta_atlas.osm.flag_node_osm_classification, which must be called AFTER this one if you want the rank-conditional variants (since they’re conditional on is_t_junction / is_4way). A project working with a non-OSM road network (e.g., LUMOS’s simplified 3-tier networks) can call this function alone and supply its own project-specific classifier on top.

Per-node obstacle flags (is_traffic_signal, is_stop, etc.) are written by aperta_atlas.osm.consolidate_intersections’s obstacle re-attachment pass — also OSM-specific and in aperta-lab.

Parameters:

graph (Graph)

Return type:

None
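The three flags follow directly from the distinct-neighbour count, as the sketch below shows on a plain edge list. The function name is hypothetical, self-loops are ignored as an assumption, and nodes with no edges do not appear in the result because an edge list cannot express them.

```python
from collections import defaultdict


def topology_flags(edges):
    """Per-node n_streets / is_t_junction / is_4way from distinct neighbours."""
    neighbours = defaultdict(set)
    for u, v, *_ in edges:  # accepts (u, v) and (u, v, key) tuples
        if u == v:
            continue  # assumption: self-loops do not count as a street
        # Sets ignore both edge direction and parallel edges.
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {
        node: {
            "n_streets": len(nb),
            "is_t_junction": int(len(nb) == 3),
            "is_4way": int(len(nb) >= 4),
        }
        for node, nb in neighbours.items()
    }


# Node 0 has three distinct neighbours despite the reverse and parallel edges.
edges = [(0, 1), (1, 0), (0, 1, 1), (0, 2), (0, 3)]
flags = topology_flags(edges)
flags_4way = topology_flags(edges + [(4, 0)])
```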

aperta.network_processing.aggregate_edges_to_nodes(graph, edge_attribute, *, aggregator='max')

For each node in graph, aggregate edge_attribute across its connected edges.

The inverse of aggregate_nodes_to_edges (which propagates per-node features onto edges). Common use: classify each node by the highest-class road that touches it (aggregator=’max’) — useful for filtering snap targets in snap_to_network_nodes (skip motorway-only nodes, etc.).

For MultiGraphs / MultiDiGraphs, parallel edges between the same (u, v) each contribute their own value: for ‘max’ this is harmless, while for ‘mean’ it gives duplicated edges extra weight. For OSMnx graphs, where parallel edges typically carry identical attributes, this is fine.

Parameters:

graph (Graph) – NetworkX graph.
edge_attribute (str | Callable) – name of an edge attribute (str) or a callable (u, v, data) -> value. With the string form, edges that lack the attribute contribute NaN.
aggregator (str | Callable) – ‘max’ (default), ‘min’, ‘mean’, ‘sum’, ‘median’, or a callable that takes a 1-D numpy array of per-edge values and returns a scalar. String aggregators use the nan-safe numpy variants (silently skip NaN edge values).

Returns:

pd.Series indexed by node ID with the per-node aggregated value. Isolated nodes (no edges) are absent from the result.

Return type:

Series
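The edge-to-node direction, including the effect of parallel edges on ‘mean’, can be sketched on a list of (u, v, value) triples. The helper name is hypothetical, and returning NaN for a node whose edge values are all NaN is an assumption.

```python
import math
from collections import defaultdict
from statistics import fmean


def edge_values_to_nodes(edges, func=max):
    """Aggregate each node's incident edge values, skipping NaN."""
    per_node = defaultdict(list)
    for u, v, value in edges:
        per_node[u].append(value)  # every parallel edge contributes its own value
        per_node[v].append(value)
    result = {}
    for node, values in per_node.items():
        kept = [x for x in values if not math.isnan(x)]
        result[node] = func(kept) if kept else math.nan
    return result


edges = [(1, 2, 2.0), (1, 2, 2.0), (1, 3, 8.0), (3, 4, math.nan)]

node_max = edge_values_to_nodes(edges, func=max)
node_mean = edge_values_to_nodes(edges, func=fmean)
# At node 1 the duplicated 2.0 edge pulls the mean to 4.0 instead of 5.0.
```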

aperta.network_processing.smooth_node_attribute(graph, node_attr, *, length_scale, length_attr='length', out_attr=None, n_iterations=1)

Topology-weighted Gaussian smoothing of a per-node attribute.

For each node v, the smoothed value is a weighted mean of v itself (distance 0 → weight 1) plus its one-hop graph neighbours, each weighted by a Gaussian kernel of the connecting edge length:

smoothed[v] = ( val[v] + Σ_u w(v, u) · val[u] )  /  ( 1 + Σ_u w(v, u) )
w(v, u)     = exp(-(edge_length(v, u) / length_scale)² / 2)

length_scale is the Gaussian’s 1-σ: neighbours at this edge-length distance get weight ≈ 0.61; at 2·length_scale, ≈ 0.14; at 3·length_scale, ≈ 0.01. Bigger length_scale → more smoothing.

Topology-based — neighbours are graph-incident, not Euclidean-near. Avoids the Euclidean-ambiguity failure modes of disk-mean smoothing (bridges / parallel roads at different levels / switchbacks all stay correctly separated because they’re not connected by an edge). For directed graphs, uses nx.all_neighbors so both successors and predecessors contribute. For MultiGraphs, the shortest parallel edge sets the distance.

n_iterations > 1 re-applies the one-hop kernel; weights compound, so a 2-iteration pass implicitly reaches 2-hop neighbours via the convolution of two Gaussians — a cheap way to widen the effective kernel without running BFS or assembling a graph Laplacian.

NaN / missing-value semantics:

A NaN at v itself passes through unchanged.
NaN values at neighbours are skipped (don’t enter the sum).
Nodes without the input attribute are left untouched.

Parameters:

graph (Graph) – graph with node_attr set on a subset of nodes and length_attr set on every edge.
node_attr (str) – per-node attribute to smooth.
length_scale (float) – Gaussian 1-σ in length_attr units (typically metres for road networks). Pick to match the noise length scale you want to suppress.
length_attr (str) – edge attribute holding edge length. Default ‘length’.
out_attr (str | None) – where to write the smoothed values. If None, overwrites node_attr.
n_iterations (int) – number of one-hop passes. Default 1.

Return type:

None

Mutates graph in place via nx.set_node_attributes.
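The one-hop kernel and its NaN rules can be sketched in pure Python. The code below is an illustration under stated assumptions, not the library’s implementation: the graph is a nested dict {node: {neighbour: edge_length}} in which parallel edges are already reduced to their shortest length, values is a dict holding only the nodes that carry the attribute, and the function names are hypothetical.

```python
import math


def smooth_once(values, adjacency, length_scale):
    """One pass of the one-hop Gaussian-weighted mean."""
    out = dict(values)
    for v, val in values.items():
        if math.isnan(val):
            continue  # a NaN at v itself passes through unchanged
        num, den = val, 1.0  # v contributes with weight 1
        for u, length in adjacency.get(v, {}).items():
            if u not in values or math.isnan(values[u]):
                continue  # missing or NaN neighbours do not enter the sum
            w = math.exp(-((length / length_scale) ** 2) / 2)
            num += w * values[u]
            den += w
        out[v] = num / den
    return out


def smooth(values, adjacency, length_scale, n_iterations=1):
    """Re-apply the one-hop pass n_iterations times."""
    for _ in range(n_iterations):
        values = smooth_once(values, adjacency, length_scale)
    return values


adjacency = {"a": {"b": 100.0}, "b": {"a": 100.0, "c": 50.0}, "c": {"b": 50.0}}
values = {"a": 0.0, "b": 1.0, "c": math.nan}

smoothed = smooth(values, adjacency, length_scale=100.0)
# The a-b edge is exactly one length_scale long, so its weight is exp(-0.5), about 0.61.
```

Each pass reads only the values from the previous pass, so the result does not depend on node iteration order.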
