aperta.traffic_flows

Lightweight one-shot traffic-flow estimation via sampled betweenness centrality.

Estimates daily per-edge traffic volumes (interpretable as AADT once calibrated) by simulating a quick three-step travel demand model: trip generation (origin sampling weighted by population), trip distribution (per-origin destination sampling weighted by a cost-decay function and per-destination attractiveness), and route assignment (shortest-path routing on the current edge weights, accumulating per-edge counts). Outputs can be calibrated against ground-truth counter data via the helpers in aperta.calibration.

Scope and limitations. This is a one-shot estimation pass: the routing step uses the input edge weights without iterating toward congestion equilibrium. It is intended for users who (1) want a per-edge traffic-flow estimate to feed into travel-time calibration or as an accessibility feature, and (2) do not already have detailed outputs from a full traffic-assignment model (which would be a more rigorous alternative and could be plugged in directly). The library reuses aperta’s existing infrastructure — tiered OD matrices, edge-weight calibration, scipy routing backend — to keep the estimation cheap and consistent with the rest of the pipeline; it does not aim to replace a dedicated traffic-assignment tool. An iterative congestion-aware variant is theoretically possible as a future extension.

This module supplies the sampling primitives (nested_node_sample) and the normalisation step (estimate_edge_flows) that turns raw sampled-betweenness counts into a per-edge volume calibrated against an expected total vehicle-kilometres figure. The routing + per-edge accumulation itself lives in network_processing.get_nested_edge_betweenness. A simpler alternative for small study areas — radius-limited Brandes betweenness without explicit OD sampling — also lives in network_processing.

aperta.traffic_flows.estimate_edge_flows(graph, weight, expected_km_driven, nested_node_sample, *, cutoff=None)[source]

Traffic-flow estimation from a nested OD sample, normalised to expected_km_driven total vehicle-kilometres.

Routes each sampled (origin, dest) pair via scipy Dijkstra, accumulates per-edge usage counts (see network_processing.get_nested_edge_betweenness), then scales so that sum(flow_e × length_e) matches expected_km_driven.
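The scaling step can be sketched as follows. This is a minimal illustration and not aperta's actual implementation: scale_counts_to_vkt and its arguments are hypothetical names, and edge lengths are assumed to be given in kilometres.

```python
import numpy as np


def scale_counts_to_vkt(counts, lengths_km, expected_km_driven):
    """Scale raw per-edge usage counts so that sum(flow * length) equals
    expected_km_driven. Hypothetical helper, for illustration only."""
    counts = np.asarray(counts, dtype=float)
    lengths_km = np.asarray(lengths_km, dtype=float)
    raw_km = float(np.sum(counts * lengths_km))  # count-weighted km before scaling
    if raw_km == 0.0:
        # No sampled route used any edge, so there is nothing to scale.
        return np.zeros_like(counts)
    return counts * (expected_km_driven / raw_km)


counts = np.array([4, 0, 6])              # sampled-route traversals per edge
lengths_km = np.array([1.0, 2.0, 0.5])    # edge lengths (assumed km)
flows = scale_counts_to_vkt(counts, lengths_km, expected_km_driven=700.0)
```

Because a single global factor is applied, the relative ranking of edges by sampled betweenness is preserved; only the overall level is pinned to the expected total.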

cutoff (optional): network-distance limit in weight units passed through to the per-origin Dijkstra. Set this to the upstream sampling radius (e.g. r_zones from od_pairs.get_pairs) — sampled destinations are guaranteed reachable within that radius, so the cutoff is correctness-preserving and gives a large speed-up on country-scale graphs. Default None = no cutoff.
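The effect of a distance cutoff on a single-origin Dijkstra can be seen on a toy graph. The sketch below uses scipy.sparse.csgraph.dijkstra and its limit argument; whether aperta forwards cutoff to exactly this argument is an assumption, and the graph is made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# Toy directed path graph 0 -> 1 -> 2 -> 3 with edge weights 1, 1, 5.
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 3])
data = np.array([1.0, 1.0, 5.0])
toy_graph = csr_matrix((data, (rows, cols)), shape=(4, 4))

# Without a limit, every reachable node gets a finite distance.
full = dijkstra(toy_graph, directed=True, indices=0)

# With a limit, the search is not expanded past that distance;
# nodes farther away are reported as unreachable (inf).
limited = dijkstra(toy_graph, directed=True, indices=0, limit=2.5)
```

Nodes within the limit keep the same distances as in the unrestricted run, which is why a cutoff at least as large as the sampling radius does not change the routed paths.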

Return type:

Series

aperta.traffic_flows.nested_node_sample(pairs, weights, costs, cell_to_zone_node, orig_weights, cost_to_weight, n_orig, n_dest, random_state, *, mask=None)[source]

Sample n_dest destinations for n_orig weighted-sampled origin cells, integrating all three tiers (cell, middle, far) into one combined pool.

Per origin cell, the tier dest arrays are concatenated on the fly into one combined dest pool with per-pair scores weight * cost_to_weight(cost). Sampling is then a single np.random.choice-equivalent (JITted) over the pool. Peak memory is bounded by the largest single per-origin concatenation, not by n_orig × total_dests.

The far-tier scores are shared per zone, so they are computed once per zone (not per cell) and the cost_to_weight call is amortised across all cells in the zone. The middle tier (cells_to_zones) is keyed per cell: cells in a zone share the same destination zones but have different per-cell costs, so it cannot be amortised the same way. The per-cell cost is what makes the score correct.
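For a single origin, the pooling and scoring described above can be sketched in plain NumPy. This is an illustrative stand-in for the JITted sampler, not its actual code: sample_dests_for_origin is a hypothetical helper, the tiers are passed as plain lists of arrays rather than TieredODPairs, and the exponential decay is an assumed cost_to_weight.

```python
import numpy as np


def sample_dests_for_origin(tier_dests, tier_weights, tier_costs,
                            cost_to_weight, n_dest, random_state):
    """Draw n_dest destinations for one origin from the combined tier pool.
    Hypothetical helper; each tier is a 1-D array."""
    dests = np.concatenate(tier_dests)
    weights = np.concatenate(tier_weights).astype(float)
    costs = np.concatenate(tier_costs).astype(float)
    scores = weights * cost_to_weight(costs)      # per-pair score
    total = scores.sum()
    if total <= 0:
        return np.empty(0, dtype=dests.dtype)     # nothing samplable
    probs = scores / total
    # Sampling with replacement over the combined pool.
    return random_state.choice(dests, size=n_dest, replace=True, p=probs)


rng = np.random.RandomState(0)
picked = sample_dests_for_origin(
    tier_dests=[np.array([10, 11]), np.array([20]), np.array([30])],
    tier_weights=[np.array([100, 50]), np.array([400]), np.array([0])],
    tier_costs=[np.array([500.0, 800.0]), np.array([5_000.0]),
                np.array([40_000.0])],
    cost_to_weight=lambda c: np.exp(-c / 2_000.0),  # assumed decay function
    n_dest=5,
    random_state=rng,
)
```

Only one origin's pool is materialised at a time, which is the property that keeps peak memory bounded by the largest per-origin concatenation.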

Parameters:
  • pairs (TieredODPairs) – destination IDs per tier.

  • weights (TieredODPairs) – destination weights per tier (e.g. populations), same shape as pairs. Typically the output of od_pairs.dest_values.

  • costs (TieredODPairs) – per-pair costs (e.g. line distances), same shape as pairs. Typically the output of od_pairs.get_euclidean_dists.

  • cell_to_zone_node (dict) – {cell_node -> zone_node} mapping; build via od_pairs.build_cell_to_zone_node_map.

  • orig_weights (ndarray | Series) – per-origin sampling weights, aligned position-wise with list(pairs.cells_to_cells.keys()).

  • cost_to_weight (Callable) – monotone-decreasing function mapping a cost (e.g. distance in metres) to a per-pair weight. Vectorized — receives a 1-D array.

  • n_orig (int) – number of origins to sample. Origins are sampled with replacement; repeats in the origin sample are deduped (each origin processed once).

  • n_dest (int) – number of destinations per sampled origin. Destinations are sampled with replacement.

  • random_state (RandomState) – numpy RandomState; the only source of randomness.

  • mask (TieredODPairs | None) – optional boolean TieredODPairs (build via od_pairs.make_mask). Destinations where the mask is False are removed from the sampling pool. Missing origins or missing tiers in the mask are treated as “no filter” for that origin / tier.
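As an example of a callable that satisfies the cost_to_weight contract above (vectorised, monotone-decreasing), an exponential decay could look like the sketch below. The factory name make_exponential_decay and the scale parameterisation are assumptions for illustration; aperta does not prescribe a particular decay form here.

```python
import numpy as np


def make_exponential_decay(scale):
    """Return a vectorised, monotone-decreasing cost -> weight function.
    `scale` is the cost at which the weight drops to 1/e (assumed form)."""
    def cost_to_weight(costs):
        costs = np.asarray(costs, dtype=float)
        return np.exp(-costs / scale)
    return cost_to_weight


decay = make_exponential_decay(scale=10_000.0)    # e.g. costs in metres
w = decay(np.array([0.0, 10_000.0, 50_000.0]))
```

Any other decreasing function (power law, log-normal density tail, and so on) fits the same slot as long as it accepts and returns a 1-D array.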

Returns:

Mapping {origin_cell_node -> np.ndarray[dest_node]} with one entry per distinct sampled origin; each array has length n_dest.

Return type:

dict

aperta.traffic_flows.percentile_bin_edges(survey_costs, n_bins=20)[source]

Equal-probability cost-bin edges from observed trip-cost data.

Returns n_bins + 1 edges such that each bin contains roughly 1 / n_bins of the survey data by count. Suitable as the bin_edges input to bin_adjusted_dest_weights for sampling that targets the empirical cost distribution non-parametrically (no need to fit a log-normal or similar).

Parameters:
  • survey_costs (ndarray | Series) – observed trip costs (e.g. observed travel times). NaNs and non-finite values are dropped before percentile estimation.

  • n_bins (int) – number of equal-probability bins. Default 20 balances granularity vs. per-origin sample-budget headroom; 10–30 are reasonable choices.

Returns:

Sorted 1-D array of length n_bins + 1 giving bin edges.

Return type:

ndarray
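A minimal sketch of how equal-probability edges can be derived with np.quantile is shown below. It mirrors the behaviour described for percentile_bin_edges (non-finite values dropped, n_bins + 1 sorted edges) but is an illustrative re-implementation under those assumptions, not the library's source; the function name and the survey values are made up.

```python
import numpy as np


def percentile_bin_edges_sketch(survey_costs, n_bins=20):
    """Equal-probability bin edges via quantiles (illustrative only)."""
    costs = np.asarray(survey_costs, dtype=float)
    costs = costs[np.isfinite(costs)]             # drops NaN and +/-inf
    # n_bins + 1 evenly spaced quantile levels from 0 to 1.
    return np.quantile(costs, np.linspace(0.0, 1.0, n_bins + 1))


survey = np.array([5.0, 7.0, np.nan, 12.0, 20.0, 33.0, np.inf, 8.0, 15.0, 45.0])
edges = percentile_bin_edges_sketch(survey, n_bins=4)
```

The first and last edges are the minimum and maximum finite survey cost, so destinations cheaper or costlier than anything observed fall outside the binned range.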

aperta.traffic_flows.bin_adjusted_dest_weights(pairs, costs, dest_weights, bin_edges, *, renormalize_per_origin=True)[source]

Per-origin per-bin reweight of destination weights, fixing the cost-weighted sampling bias at sparse-periphery origins.

For each origin and each cost bin, the bin’s target probability mass (1 / n_bins) is divided among the destinations that fall in that bin in proportion to their existing weight W(D). Bins with no destinations contribute nothing. The result is a per-origin adjusted weight array of the same shape as dest_weights, which can be passed to nested_node_sample (or any weighted sampler) in place of the raw weights — replacing the cost_to_weight callable entirely.

Destinations whose cost falls outside [bin_edges[0], bin_edges[-1]] receive zero weight (treated as too rare to be informative).

Parameters:
  • pairs (TieredODPairs) – destination IDs per tier (any of the three may be None).

  • costs (TieredODPairs) – per-pair costs, same shape as pairs (e.g. travel times).

  • dest_weights (TieredODPairs) – base destination weights W(D) per pair, same shape as pairs (e.g. populations, employment counts).

  • bin_edges (ndarray) – n_bins + 1 sorted values, typically from percentile_bin_edges applied to a travel-survey cost column.

  • renormalize_per_origin (bool) – when True (default), the adjusted weights for each origin are normalised to sum to 1, so each origin has the same total sampling weight regardless of how many cost bins its destinations populate. When False, sparse-periphery origins end up with a smaller total weight (the empty bins’ target mass is not redistributed) — this naturally reduces their effective trip count, useful when the bin adjustment is the only mechanism reducing trips from sparse origins. The default True matches the recommended decoupling: bin-adjustment fixes the cost distribution only; trip-generation count stays controlled separately at the orig_weights stage.

Returns:

Same TieredODPairs subclass as pairs with per-origin adjusted weight arrays. Origins whose destinations are entirely out of range (or whose dest_weights sum to zero) receive an all-zero array.

Return type:

TieredODPairs
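The per-bin reweighting can be illustrated for a single origin with flat arrays. This is a sketch of the rule described above, not the library's code: bin_adjust_one_origin is a hypothetical helper, it ignores the tier structure, and treating the last bin as closed on the right is an assumption.

```python
import numpy as np


def bin_adjust_one_origin(costs, weights, bin_edges, renormalize=True):
    """Per-bin reweighting for a single origin (illustrative sketch)."""
    costs = np.asarray(costs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n_bins = len(bin_edges) - 1
    adjusted = np.zeros_like(weights)
    # Bin index per destination; the last bin is closed on the right (assumption).
    idx = np.searchsorted(bin_edges, costs, side="right") - 1
    idx[costs == bin_edges[-1]] = n_bins - 1
    in_range = (costs >= bin_edges[0]) & (costs <= bin_edges[-1])
    for b in range(n_bins):
        members = in_range & (idx == b)
        mass = weights[members].sum()
        if mass > 0:
            # Split the bin's target mass 1/n_bins in proportion to W(D).
            adjusted[members] = weights[members] / mass / n_bins
    total = adjusted.sum()
    if renormalize and total > 0:
        adjusted = adjusted / total
    return adjusted


edges = np.array([0.0, 10.0, 20.0, 30.0])         # 3 bins
costs = np.array([2.0, 5.0, 12.0, 40.0])          # last one is out of range
weights = np.array([30.0, 10.0, 5.0, 100.0])
raw = bin_adjust_one_origin(costs, weights, edges, renormalize=False)
norm = bin_adjust_one_origin(costs, weights, edges, renormalize=True)
```

In this toy case the third bin is empty, so the unnormalised weights sum to 2/3 rather than 1; with renormalisation the same origin is scaled back to a total of 1, which is the difference the renormalize_per_origin flag controls.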