aperta.network_processing
Graph-construction and -annotation helpers for transport networks (OSM-flavoured).
Aperta operates on networkx.Graph (and its multi/directed variants) as its canonical graph type. This module supplies the operations that take a raw (typically OSMnx) graph and produce a richly-annotated one ready for downstream snap / route work:
Intersection consolidation: consolidate_intersections wraps osmnx.consolidate_intersections but preserves intersection-attribute nodes (traffic signals, stop signs, roundabouts) that the OSMnx default drops, which matters for any route-level analysis that counts those features (§3.3 of the toolkit paper). It calls flag_node_intersection_topology and flag_node_osm_classification as post-processing.
Highway-class machinery: OSM_HIGHWAY_RANKS, collapse_osm_highway_lists_by_rank, lanes_per_direction, the per-node topology / OSM-classification flag functions.
Edge / node attribute helpers: aggregate node attributes onto edges (aggregate_nodes_to_edges), aggregate edge attributes onto nodes (aggregate_edges_to_nodes), and write attribute values through to a graph in a tolerant way (set_nx_edge_attributes_filled).
Edge betweenness sampling: get_nested_edge_betweenness runs the per-origin Dijkstra + path-walking accumulator used by the traffic-flow estimation pipeline in traffic_flows.py.
Graphml round-trip: load_consolidated_graphml with the dtype maps that preserve custom integer flags through .graphml serialization.
Sibling modules cover the workflow steps that consume this module’s output:
aperta.network_snap — snap-target resolution: snap_to_network_nodes, insert_projected_nodes, nodes_incident_to_edges, transport_centroid, split_edge_at_point, split_two_way_edge_at_point. Re-exported below for back-compat.
aperta.routing_prep — mode-aware preparation: prepare_network, compute_snap_eligibility, PreparedGraph, MODE_DEFAULTS. Re-exported below for back-compat.
New code should import directly from aperta.network_snap / aperta.routing_prep; the re-exports here exist so existing callsites (from aperta.network_processing import prepare_network etc.) keep working without churn.
- aperta.network_processing.collapse_osm_highway_lists_by_rank(graph)
Mutate graph in place: collapse list-valued edge highway to a single string (the highest-rank value via OSM_HIGHWAY_RANKS).
After osmnx.consolidate_intersections, edges built from multiple source edges have highway as a list of strings. Most downstream code expects a single string and silently picks the first element (e.g. osmnx.add_edge_speeds does this internally), which is not principled when the merged edges differ in road class. This helper picks the most major value instead (motorway > trunk > primary > … > unclassified).
Unknown highway names map to rank -1; when a list contains only unknowns the resulting collapsed value is the unknown with the highest dict-order. Edges without a highway attribute are left alone.
Auto-called from inside consolidate_intersections; callable standalone for graphs consolidated by external tooling.
- Parameters:
graph (Graph)
- Return type:
None
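The collapse rule can be sketched in plain Python. This is a minimal illustration rather than the library's code: EXAMPLE_RANKS is a hypothetical stand-in for OSM_HIGHWAY_RANKS (the real values are not reproduced here), collapse_highway is a hypothetical helper, and edges are plain dicts instead of graph edge data.

```python
# Hypothetical rank table for illustration only; not the real OSM_HIGHWAY_RANKS.
EXAMPLE_RANKS = {
    "unclassified": 2,
    "tertiary": 3,
    "secondary": 4,
    "primary": 5,
    "trunk": 6,
    "motorway": 7,
}


def collapse_highway(value, ranks=EXAMPLE_RANKS):
    """Return a single highway string; lists collapse to the highest-ranked entry."""
    if not isinstance(value, list):
        return value  # already a single string: nothing to do
    # Unknown names rank -1, so any known class beats them.
    return max(value, key=lambda name: ranks.get(name, -1))


# Edge attribute dicts standing in for graph edges.
edges = [
    {"highway": ["unclassified", "primary"]},
    {"highway": "secondary"},
    {"highway": ["footway", "tertiary"]},
    {"name": "no highway tag"},
]
for data in edges:
    if "highway" in data:  # edges without the attribute are left alone
        data["highway"] = collapse_highway(data["highway"])

collapsed = [data.get("highway") for data in edges]
```

Picking by rank rather than by list position means the result does not depend on the order in which consolidation happened to merge the source edges.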
- aperta.network_processing.set_nx_edge_attributes_filled(graph, attr, attr_name, fill_value=0, strict=False)
Set per-edge attribute attr_name on graph, filling missing edges with fill_value.
nx.set_edge_attributes silently leaves edges absent from the input mapping without the attribute, which is a footgun for downstream code that expects the attribute to be present on every edge. This wrapper writes fill_value instead.
- Parameters:
graph (MultiGraph) – a MultiGraph (uses (u, v, k) edge keys).
attr (dict | Series) – edge → value mapping, keyed by (u, v, k) tuples.
attr_name (str) – edge attribute name to write.
fill_value – value to assign to edges missing from attr. Default 0.
strict (bool) – if True, raise DataError when attr is missing any of the graph’s edges. Default False (silently fill).
- Returns:
graph, mutated in place.
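The fill-versus-strict behaviour described above can be sketched without networkx. Everything here is illustrative: fill_edge_attribute is a hypothetical helper, edge_data is a plain dict keyed by (u, v, k) standing in for a MultiGraph, and ValueError stands in for the DataError the real function is documented to raise.

```python
def fill_edge_attribute(edge_data, values, attr_name, fill_value=0, strict=False):
    """Write attr_name onto every edge, using fill_value where values has no entry."""
    missing = [key for key in edge_data if key not in values]
    if strict and missing:
        # Stand-in exception; the documented function raises DataError instead.
        raise ValueError(f"{len(missing)} edge(s) missing from the mapping")
    for key, data in edge_data.items():
        data[attr_name] = values.get(key, fill_value)
    return edge_data  # same object, mutated in place


# Three edges, but only two of them appear in the mapping.
edge_data = {(1, 2, 0): {}, (2, 3, 0): {}, (2, 3, 1): {}}
flows = {(1, 2, 0): 120.0, (2, 3, 1): 45.0}
fill_edge_attribute(edge_data, flows, "flow")
```

After the call every edge carries a flow attribute, so downstream code can read it without guarding against missing keys.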
- aperta.network_processing.get_nested_edge_betweenness(graph, nested_node_sample, weight=None, *, cutoff=None)
Edge usage counts from a nested (origin → sampled-destinations) sample.
For each origin in nested_node_sample, runs a single-source Dijkstra on graph (via scipy.sparse.csgraph.dijkstra with return_predecessors), walks the predecessor chain from each sampled destination back to the origin, and adds 1 to every edge on the path. The result is the weighted sum over all sampled OD pairs — a “traffic-stress”-style edge usage count, not classical Brandes’ betweenness.
Repeated destinations in the per-origin sample naturally count multiple times (each occurrence adds 1 to its path’s edges), so weight comes from the upstream sampling step’s destination distribution.
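The accumulator can be illustrated with a small pure-Python sketch. It swaps scipy.sparse.csgraph.dijkstra for a heapq-based Dijkstra so that it runs without dependencies; the function names, the adjacency-dict input, and the choice to skip unreachable destinations are all assumptions made for the example.

```python
import heapq
from collections import Counter


def dijkstra_predecessors(adj, source):
    """Single-source Dijkstra returning {node: predecessor} for every reached node."""
    dist = {source: 0.0}
    pred = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return pred


def edge_usage(adj, nested_sample):
    """Count how often each directed edge lies on a sampled shortest path."""
    counts = Counter()
    for origin, destinations in nested_sample.items():
        pred = dijkstra_predecessors(adj, origin)  # one search per origin
        for dest in destinations:
            if dest != origin and dest not in pred:
                continue  # assumption: unreachable destinations contribute nothing
            node = dest
            while node != origin:  # walk the predecessor chain back to the origin
                counts[(pred[node], node)] += 1
                node = pred[node]
    return counts


# a -> b -> c costs 2.0, cheaper than the direct a -> c edge at 5.0.
adj = {"a": {"b": 1.0, "c": 5.0}, "b": {"c": 1.0}, "c": {}}
usage = edge_usage(adj, {"a": ["c", "c", "b"]})
```

Destination c appears twice in the sample, so both edges on its path are credited twice; the extra visit to b adds one more count to the first edge.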
- Parameters:
graph (Graph) – networkx graph (any variant). MultiGraph parallel edges with the same (u, v) collapse to the min-weight edge for routing, and the chosen key is the one credited in the output.
nested_node_sample (dict) – {origin_node -> array_of_dest_nodes}, typically from traffic_flows.nested_node_sample. Origins are unique; duplicate destinations within an origin’s array are fine.
weight (str | None) – edge attribute name to use as the per-edge cost (e.g. ‘duration_s’). Required, despite the None default in the signature: there’s no “all edges weight 1” fallback, since traffic-flow sampling always needs real costs.
cutoff (float | None) – optional network-distance cutoff in weight units. Passed to csg.dijkstra(limit=cutoff) to truncate each per-origin search once destinations beyond the cutoff are unreachable anyway. Set this to the upstream sampling radius (typically r_zones from od_pairs.get_pairs) — destinations sampled within that radius are guaranteed reachable within cutoff, and the truncation gives a large speed-up on country-scale graphs. Default None = no cutoff.
- Returns:
pd.Series indexed by edge ID — (u, v) for plain graphs, (u, v, k) for multigraphs — with the accumulated edge usage count.
- Return type:
Series
- aperta.network_processing.aggregate_nodes_to_edges(df_nodes, cols, node_edge_relations, *, aggregator)
Aggregate node-level features onto the edges they touch (sum, mean, or median).
- Parameters:
df_nodes (DataFrame) – list of nodes, supplied as a DataFrame.
cols (list[str]) – list of columns in df_nodes to be mapped to edges.
node_edge_relations (str | Graph) – if str, must list the edges belonging to each node in column ‘node_edge_relations’ in df_nodes, separated by a comma (,). Otherwise, supply an nx.Graph where the ID of each node corresponds to the index in df_nodes.
aggregator (str) – how to aggregate values from different nodes onto a single edge. One of ‘sum’, ‘mean’, ‘median’.
- Return type:
- aperta.network_processing.lanes_per_direction(edge_data)
Per-direction lane count for a directed edge. OSM-specific: reads the OSM lanes and oneway tag conventions.
OSM’s lanes tag is the total lane count across both directions on two-way roads, and OSMnx inherits the same value on both directional edges. Any per-direction quantity (directional AADT, per-lane capacity) is therefore off by ~2× on two-way segments without correction — and biased unequally between mostly-one-way road classes (motorways) and mostly-two-way ones (primary / secondary), which a single coefficient can’t absorb.
- Rules:
oneway=True: all lanes are in this direction → return lanes.
lanes missing: OSM implicit default (1 per direction) → return 1.
lanes ≤ 1: can’t split a single lane → return 1.
otherwise: return lanes / 2.
Pure function over edge_data — caller decides whether to write the result back as an edge attribute. consolidate_intersections calls this for every consolidated edge and stores the result as lanes_per_direction.
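The four rules can be written out as a short sketch. lanes_per_direction_sketch is a hypothetical name, and two simplifications are assumptions rather than documented behaviour: a missing lanes tag is checked before oneway, and lanes is taken to be a single number or numeric string (not a list).

```python
def lanes_per_direction_sketch(edge_data):
    """Apply the documented rules to a dict of edge attributes."""
    lanes = edge_data.get("lanes")
    if lanes is None:
        return 1  # lanes missing: implicit default of one lane per direction
    lanes = float(lanes)  # OSM tag values often arrive as strings
    if edge_data.get("oneway", False):
        return lanes  # one-way: every lane runs in this direction
    if lanes <= 1:
        return 1  # a single lane cannot be split
    return lanes / 2  # two-way: the tag counts both directions


two_way = lanes_per_direction_sketch({"lanes": "4"})
one_way = lanes_per_direction_sketch({"lanes": 3, "oneway": True})
untagged = lanes_per_direction_sketch({})
```

A four-lane two-way road yields 2 lanes per direction, whereas reading the raw tag on each directional edge would give 4, which is the factor-of-two error the function corrects.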
- aperta.network_processing.load_consolidated_graphml(filepath, *, node_dtypes=None, edge_dtypes=None, **kwargs)
Load a graphml saved by consolidate_intersections, casting aperta’s custom is_* / *_highway_rank / is_snap_eligible_<mode> / cost_excluded_<mode> / is_virtual attrs back to their proper types. OSM-specific — expects graphs in the OSMnx node/edge attribute convention; for non-OSM pickle/graphml saves, call nx.read_graphml directly and apply your own project-specific dtype casts.
Thin wrapper around osmnx.load_graphml. Three sources of casts get merged in (in order; later ones win):
_CONSOLIDATED_NODE_DTYPES / _CONSOLIDATED_EDGE_DTYPES — fixed casts for attrs consolidate_intersections writes (e.g. n_streets, is_t_junction, lanes_per_direction).
Prefix-scan of the graphml <key> schema for the dynamic per-mode flags written by prepare_network / compute_snap_eligibility / insert_projected_nodes: is_snap_eligible_<mode> and is_virtual (per-node); cost_excluded_<mode> (per-edge). Handles any mode label, including subtypes like car_night, without hardcoding.
Caller-supplied node_dtypes / edge_dtypes overrides.
Without the prefix-scan, the per-mode bool flags round-trip as the literal strings ‘True’ / ‘False’, and bool(‘False’) == True would silently invert the if not d.get(cost_excluded_flag, False) test in _compute_snap_eligible_nodes.
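Both the merge order and the string-to-bool pitfall can be shown in a few lines. The three cast tables below are hypothetical stand-ins for the three sources (the real _CONSOLIDATED_EDGE_DTYPES contents are not reproduced), and parse_bool is an illustrative converter, not necessarily the one the loader uses.

```python
def parse_bool(text):
    """Parse the string form of a boolean; plain bool('False') would return True."""
    return str(text).strip().lower() == "true"


# Hypothetical cast tables, lowest priority first.
fixed_casts = {"lanes_per_direction": float}
scanned_casts = {"cost_excluded_car": parse_bool, "cost_excluded_car_night": parse_bool}
caller_casts = {"lanes_per_direction": str}  # caller chooses to keep the raw string

# Later dicts win on key collisions, matching the documented merge order.
edge_casts = {**fixed_casts, **scanned_casts, **caller_casts}

# Attribute values as they come back from a graphml file: all strings.
raw = {"lanes_per_direction": "1.5", "cost_excluded_car": "False"}
typed = {key: edge_casts[key](value) for key, value in raw.items()}

naive = bool(raw["cost_excluded_car"])  # any non-empty string is truthy
```

The naive cast turns the stored "False" into True, while the explicit parser recovers False, which is why the per-mode flags need a dedicated converter.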
- Parameters:
filepath – path to a .graphml produced by consolidate_intersections or prepare_network / compute_snap_eligibility.
node_dtypes (dict | None) – optional override merged on top (caller’s values win).
edge_dtypes (dict | None) – optional override merged on top (caller’s values win).
**kwargs – forwarded to osmnx.load_graphml.
- Returns:
nx.MultiDiGraph for graphs saved from a directed source, or nx.MultiGraph for undirected ones. (OSMnx’s docstring claims MultiDiGraph unconditionally; that’s wrong — nx.read_graphml honors the file’s <graph edgedefault> attribute.)
- aperta.network_processing.extract_obstacle_locations(graph, *, obstacle_node_tags=None, detect_roundabouts=True)
Pull obstacle + roundabout (x, y) locations from a raw OSMnx graph.
Companion to consolidate_intersections. Returns the two structures the consolidator needs (obstacle_xy, roundabout_xy) so callers can extract obstacles once from a canonical source (typically the raw car graph — the most signal-complete) and reuse for every network type’s consolidation. This matters because OSMnx’s per-network-type filters drop ways that signals sit on (e.g. trunk roads excluded from walk graphs), losing those signal nodes entirely from the walk graph’s node set; passing the union of locations via obstacle_locations= / roundabout_locations= to consolidate_intersections reattaches them to whichever consolidated node is nearest in each network.
- aperta.network_processing.consolidate_intersections(graph, tolerance, *, obstacle_buffer=30.0, obstacle_node_tags=None, obstacle_locations=None, detect_roundabouts=True, roundabout_locations=None, node_attr_aggs=None, edge_attr_aggs=None, drop_edge_attrs=None)
OSMnx intersection consolidation + obstacle-aware re-flagging.
Wraps osmnx.consolidate_intersections(rebuild_graph=True) with the post-processing OSMnx alone misses: traffic-signal / stop / give-way nodes typically sit a few metres off the geometric intersection centre, so OSMnx’s tolerance-based merge can throw those nodes away rather than carrying the highway=traffic_signals tag onto the surviving consolidated node. The result is a consolidated graph in which most intersections are not flagged as signalised even when they actually are — a distortion for any edge-weight model that penalises signals.
This wrapper captures obstacle locations from the original graph before consolidation, then spatially re-attaches them to the nearest surviving consolidated node within obstacle_buffer metres. The same trick handles roundabouts, whose junction=roundabout tag lives on edges (not nodes) in OSM and is otherwise lost when the roundabout collapses to a single consolidated node.
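The re-attachment step amounts to a nearest-node search with a distance cap. The sketch below uses a brute-force search over projected (x, y) coordinates in metres; reattach_obstacles and its inputs are hypothetical, and the real implementation may well use a spatial index instead.

```python
import math


def reattach_obstacles(node_xy, obstacle_xy, buffer_m=30.0):
    """Return IDs of the nodes nearest to each obstacle, ignoring obstacles beyond buffer_m."""
    flagged = set()
    for ob_x, ob_y in obstacle_xy:
        nearest, best = None, buffer_m
        for node, (node_x, node_y) in node_xy.items():
            d = math.hypot(node_x - ob_x, node_y - ob_y)
            if d <= best:
                nearest, best = node, d
        if nearest is not None:
            flagged.add(nearest)  # this consolidated node inherits the obstacle flag
    return flagged


# Two consolidated nodes 100 m apart, and two signal locations from the raw graph.
consolidated = {0: (0.0, 0.0), 1: (100.0, 0.0)}
signals = [(4.0, 3.0), (60.0, 0.0)]
signalised = reattach_obstacles(consolidated, signals, buffer_m=30.0)
```

The first signal sits 5 m from node 0 and is re-attached; the second is 40 m from its closest node, outside the buffer, so it flags nothing.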
The returned graph has the per-node attributes set by flag_node_intersection_topology (n_streets, is_t_junction, is_4way) and flag_node_osm_classification (max_highway_rank, min_highway_rank, is_t_junction_major, is_4way_major, is_t_junction_anchor, is_4way_anchor), plus one is_<name> per requested obstacle type, plus is_roundabout if detect_roundabouts=True. Edge highway lists from the consolidation are collapsed to the highest-rank single string via collapse_osm_highway_lists_by_rank. Each edge also gets lanes_per_direction (the OSM lanes tag corrected for two-way roads — see lanes_per_direction()). Node IDs are new integer IDs (per OSMnx behaviour) — caller must re-snap geo units to the consolidated graph.
OSM-specific. The wrapper consumes OSMnx graphs, reads the OSM highway / junction / traffic_signals tag conventions, and writes OSM_HIGHWAY_RANKS-derived per-node attributes. Projects working with non-OSM road networks (e.g. LUMOS’s simplified 3-tier network) should skip consolidation entirely and call flag_node_intersection_topology directly for the network-agnostic is_t_junction / is_4way flags.
Geometry guarantee: every consolidated edge carries a geometry LineString (OSMnx attaches one during the rebuild). This isn’t true of raw OSMnx graphs — simplify=True omits geometry from pure point-to-point edges (~10 % of edges typically), and downstream code that needs per-edge geometry (e.g. dual-graph construction, plotting with curvature) on a raw graph has to call osmnx.graph_to_gdfs(…, fill_edge_geometry=True) and copy geometry back. Consolidating first sidesteps that step.
- Parameters:
graph (MultiDiGraph) – an OSMnx MultiDiGraph (projected; tolerance is in graph CRS units, usually metres). osmnx is required (optional extra osm).
tolerance (float) – nodes within this distance are merged. Typical urban values: 5–15 m; ~25 m for sparser networks.
obstacle_buffer (float) – max distance to which an obstacle from the original graph is re-attached to a consolidated node. Should be at least as large as tolerance; default 30 m comfortably covers signalised intersections.
obstacle_node_tags (dict[str, tuple[str, str]] | None) – {flag_name -> (osm_key, osm_value)} — OSM node tags to extract as obstacles. Default: {‘traffic_signal’: (‘highway’, ‘traffic_signals’)}. Add ‘stop’: (‘highway’, ‘stop’), ‘give_way’: (‘highway’, ‘give_way’), etc., as needed.
obstacle_locations (dict[str, list[tuple[float, float]]] | None) – pre-supplied {flag_name -> [(x, y), …]} map. When given, the obstacle extraction from obstacle_node_tags is skipped — useful when obstacles come from a non-OSM source or were captured upstream.
detect_roundabouts (bool) – if True (default), edges with junction=roundabout are detected before consolidation and their midpoints get re-attached as is_roundabout.
roundabout_locations (list[tuple[float, float]] | None) – pre-supplied list of roundabout midpoints [(x, y), …]. When given, skips the edge-based roundabout detection from detect_roundabouts.
node_attr_aggs (dict | None) – passed through to ox.consolidate_intersections. Any per-node attribute not listed here that varies across the nodes being merged will be carried through as a list of values.
edge_attr_aggs (dict | None) – passed through to ox.consolidate_intersections to control how per-edge attributes are aggregated when parallel edges between the same (u, v) are collapsed.
drop_edge_attrs (list[str] | None) – edge attributes to drop after consolidation. Use for attributes that osmnx’s aggregation leaves in a confusing list-of-values form. Defaults to _DEFAULT_DROP_EDGE_ATTRS.
- Returns:
Consolidated nx.MultiDiGraph with new integer node IDs.
- aperta.network_processing.flag_node_intersection_topology(graph)
Mutate graph in place to add per-node topology-only intersection flags. Network-agnostic — works on any graph regardless of where it came from (OSM, a custom road dataset, a synthetic graph) since it inspects only neighbour count, not edge tags.
Per-node attributes written:
n_streets — number of distinct neighbour nodes (degree in the undirected sense, ignoring edge direction and parallel edges). The “physical” intersection size: 1 = dead-end, 2 = passthrough, 3 = T-junction, 4+ = 4-way intersection or denser.
is_t_junction — 1 if n_streets == 3, else 0.
is_4way — 1 if n_streets >= 4, else 0.
is_t_junction and is_4way are mutually exclusive — a 4-way node carries only is_4way. (Degree 1 / 2 nodes — leaves and passthroughs — get neither.)
OSM-tag-based per-node classifications (highway rank, _major / _anchor variants) live in the companion function flag_node_osm_classification, which must be called AFTER this one if you want the rank-conditional variants (since they’re conditional on is_t_junction / is_4way). A project working with a non-OSM road network (e.g., LUMOS’s simplified 3-tier networks) can call this function alone and supply its own project-specific classifier on top.
Per-node obstacle flags (is_traffic_signal, is_stop, etc.) live in consolidate_intersections, which is also OSM-specific.
- Parameters:
graph (Graph)
- Return type:
None
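The topology flags depend only on distinct-neighbour counts, which a plain edge list is enough to demonstrate. topology_flags is a hypothetical helper that returns a dict instead of mutating a graph, and its treatment of self-loops is an assumption.

```python
def topology_flags(edges):
    """Compute n_streets, is_t_junction and is_4way from an iterable of (u, v) pairs."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set())
        neighbours.setdefault(v, set())
        if u == v:
            continue  # assumption: a self-loop does not count as a neighbour
        # Sets absorb reversed duplicates and parallel edges.
        neighbours[u].add(v)
        neighbours[v].add(u)
    flags = {}
    for node, others in neighbours.items():
        n = len(others)
        flags[node] = {
            "n_streets": n,
            "is_t_junction": int(n == 3),
            "is_4way": int(n >= 4),
        }
    return flags


# "x" touches a, b, c, d (one reversed edge, one parallel edge); "d" touches x, e, f.
edges = [("a", "x"), ("x", "a"), ("b", "x"), ("c", "x"), ("c", "x"),
         ("x", "d"), ("d", "e"), ("d", "f")]
flags = topology_flags(edges)
```

Node x comes out as a 4-way intersection and node d as a T-junction, while the leaves get neither flag.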
- aperta.network_processing.flag_node_osm_classification(graph)
Mutate graph in place to add OSM-tag-based per-node classification attributes derived from the per-edge highway tag (OSM convention).
Reads the per-edge highway attribute via OSM_HIGHWAY_RANKS and the per-node is_t_junction / is_4way flags. Call flag_node_intersection_topology first so those flags are present.
Per-node attributes written:
max_highway_rank — max OSM_HIGHWAY_RANKS value over edges incident to this node (-1 for unknown / not-a-real-road, e.g. footways).
min_highway_rank — same with min.
is_t_junction_major — is_t_junction AND min_highway_rank >= 3 (every incident edge is tertiary or better — a “fully classified” T-junction with no minor branches).
is_4way_major — is_4way AND min_highway_rank >= 3.
is_t_junction_anchor — is_t_junction AND max_highway_rank >= 3 AND min_highway_rank <= 5 (at least one tertiary-or-better edge, and not exclusively trunk / motorway — a trip-anchor T-junction where car trips can naturally begin or end).
is_4way_anchor — is_4way AND the same rank condition.
The two OSM-derived intersection tiers — _major, _anchor — capture progressively different selection criteria for downstream snap targets and edge-weight features:
`_major`: intersections where every connecting street is at least tertiary class. Used when only “real road” junctions matter (e.g., generating a coarse zone-snap candidate set).
`_anchor`: intersections that touch at least one main road (tertiary or better) and aren’t purely highway interchanges. Used as priority snap targets for car routing — trips begin and end at anchor nodes.
Non-OSM networks: this function only fires on graphs whose edges carry the OSM highway attribute (or OSM_HIGHWAY_RANKS-compatible string values for it). For networks with a different classification scheme (e.g., LUMOS’s simplified 3-tier network with highway / autostrasse / main_street tiers), write a project-specific classifier that follows the same per-node-attribute pattern.
- Parameters:
graph (Graph)
- Return type:
None
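The rank conditions can be checked on a single node at a time. In this sketch classify_node is a hypothetical helper that takes the ranks of a node's incident edges as plain integers; only the thresholds (3 and 5) come from the description above, and storing the results as 0/1 to match the topology flags is an assumption.

```python
def classify_node(incident_ranks, is_t_junction, is_4way):
    """Derive the rank-conditional flags for one node from its incident edge ranks."""
    max_rank = max(incident_ranks)
    min_rank = min(incident_ranks)
    major = min_rank >= 3  # every incident edge is tertiary or better
    anchor = max_rank >= 3 and min_rank <= 5  # touches a main road, not only trunk / motorway
    return {
        "max_highway_rank": max_rank,
        "min_highway_rank": min_rank,
        "is_t_junction_major": int(is_t_junction and major),
        "is_4way_major": int(is_4way and major),
        "is_t_junction_anchor": int(is_t_junction and anchor),
        "is_4way_anchor": int(is_4way and anchor),
    }


# A T-junction where a minor street (rank 1) meets two main roads (ranks 5 and 3).
t_node = classify_node([5, 3, 1], is_t_junction=1, is_4way=0)
# A 4-way node whose incident edges all rank 6 or 7.
interchange = classify_node([7, 7, 6, 6], is_t_junction=0, is_4way=1)
```

The T-junction is an anchor but not major because of its minor branch; the high-rank 4-way node is major but not an anchor, since none of its edges rank 5 or lower.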
- aperta.network_processing.aggregate_edges_to_nodes(graph, edge_attribute, *, aggregator='max')
For each node in graph, aggregate edge_attribute across its connected edges.
The inverse of aggregate_nodes_to_edges (which propagates per-node features onto edges). Common use: classify each node by the highest-class road that touches it (aggregator='max'), which is useful for filtering snap targets in snap_to_network_nodes (skip motorway-only nodes, etc.).
For MultiGraphs / MultiDiGraphs, parallel edges between the same (u, v) each contribute their own value — for ‘max’ this is harmless, for ‘mean’ it slightly weights duplicated edges. For OSMnx graphs (where parallel edges typically carry identical attributes), this is fine.
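A dependency-free sketch of the per-node aggregation, including the parallel-edge behaviour just described. The helper name and the (u, v, value) tuple input are hypothetical, and the sketch returns a dict where the real function returns a pd.Series.

```python
import math
import statistics


def aggregate_edges_to_nodes_sketch(edges, aggregator=max):
    """Aggregate per-edge values onto both endpoint nodes, skipping NaN values."""
    per_node = {}
    for u, v, value in edges:
        for node in (u, v):
            per_node.setdefault(node, []).append(value)
    result = {}
    for node, values in per_node.items():
        valid = [x for x in values if not math.isnan(x)]
        result[node] = aggregator(valid) if valid else math.nan
    return result


# Two parallel edges join nodes 2 and 3; one of them is missing its value (NaN).
edges = [(1, 2, 5.0), (2, 3, 3.0), (2, 3, float("nan")), (3, 4, 7.0)]
max_by_node = aggregate_edges_to_nodes_sketch(edges)
mean_by_node = aggregate_edges_to_nodes_sketch(edges, statistics.mean)
```

Nodes are discovered from the edge list, so a node with no edges never appears in the result, mirroring the documented treatment of isolated nodes.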
- Parameters:
graph (Graph) – NetworkX graph.
edge_attribute (str | Callable) – name of an edge attribute (str) or a callable (u, v, data) -> value. Edges where the attribute is missing and the string form is used contribute NaN.
aggregator (str | Callable) – 'max' (default), 'min', 'mean', 'sum', or a callable that takes a 1-D numpy array of per-edge values and returns a scalar. NaN handling is left to the aggregator ('max'/'min'/'mean'/'sum' use the nan-safe numpy variants and silently skip NaN edge values).
- Returns:
pd.Series indexed by node ID with the per-node aggregated value. Isolated nodes (no edges) are absent from the result.
- Return type:
Series