`aperta.traffic_flows`

Lightweight one-shot traffic-flow estimation via sampled betweenness centrality.

Estimates daily per-edge traffic volumes (interpretable as AADT once calibrated) by simulating a quick three-step travel demand model: trip generation (origin sampling weighted by population), trip distribution (per-origin destination sampling weighted by a cost-decay function and per-destination attractiveness), and route assignment (shortest-path routing on the current edge weights, accumulating per-edge counts). Outputs can be calibrated against ground-truth counter data via the helpers in aperta.calibration.
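The three steps above can be illustrated with a toy sketch on plain NumPy/SciPy structures. This is not the aperta implementation: the function name `toy_flow_estimate`, the exponential decay, the use of population as destination attractiveness, and the integer node IDs are all assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra


def toy_flow_estimate(coords, population, edges, n_orig, n_dest, decay_m, seed=0):
    """Hypothetical helper: return {(u, v): trip_count} for a one-shot estimate."""
    rng = np.random.RandomState(seed)
    coords = np.asarray(coords, dtype=float)
    population = np.asarray(population, dtype=float)
    n_nodes = len(population)
    u, v, w = zip(*edges)  # directed edges given as (from, to, weight)
    graph = csr_matrix((np.asarray(w, dtype=float), (u, v)), shape=(n_nodes, n_nodes))

    # 1) Trip generation: sample origins with replacement, weighted by population.
    origins = rng.choice(n_nodes, size=n_orig, replace=True,
                         p=population / population.sum())

    counts = {}
    for origin, n_picks in zip(*np.unique(origins, return_counts=True)):
        # 2) Trip distribution: score = attractiveness * cost decay (assumed exponential).
        dist = np.linalg.norm(coords - coords[origin], axis=1)
        score = population * np.exp(-dist / decay_m)
        score[origin] = 0.0  # no intra-node trips in this toy
        if score.sum() <= 0:
            continue
        dests = rng.choice(n_nodes, size=n_picks * n_dest, replace=True,
                           p=score / score.sum())

        # 3) Route assignment: one Dijkstra per unique origin, then walk the
        #    predecessor array back from each destination, counting edges.
        _, pred = dijkstra(graph, directed=True, indices=int(origin),
                           return_predecessors=True)
        for node in dests:
            node = int(node)
            while node != origin and pred[node] >= 0:
                prev = int(pred[node])
                counts[(prev, node)] = counts.get((prev, node), 0) + 1
                node = prev
    return counts
```

On a three-node line graph with population only at the two end nodes, every sampled trip crosses both edges, so the per-edge counts sum to 2 × n_orig × n_dest.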

Scope and limitations. This is a one-shot estimation pass: the routing step uses the input edge weights without iterating toward congestion equilibrium. It is intended for users who (1) want a per-edge traffic-flow estimate to feed into travel-time calibration or as an accessibility feature, and (2) do not already have detailed outputs from a full traffic-assignment model (which would be a more rigorous alternative and could be plugged in directly). The library reuses aperta’s existing infrastructure — tiered OD matrices, edge-weight calibration, scipy routing backend — to keep the estimation cheap and consistent with the rest of the pipeline; it does not aim to replace a dedicated traffic-assignment tool. An iterative congestion-aware variant is theoretically possible as a future extension.

This module supplies the sampling primitive nested_node_sample. The routing + per-edge accumulation itself lives in network_processing.get_nested_edge_betweenness. A simpler alternative for small study areas — radius-limited Brandes betweenness without explicit OD sampling — also lives in network_processing. Downstream callers apply their own normalisation of the raw sampled-betweenness counts (e.g. scaling to an expected vehicle-kilometres total).
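As one possible form of the downstream normalisation mentioned above, the following sketch scales raw sampled counts so that the network total matches an expected vehicle-kilometres figure. The function name `scale_to_vkt` and its arguments are hypothetical and not part of aperta; callers may well normalise differently.

```python
import numpy as np


def scale_to_vkt(raw_counts, edge_length_km, target_vkt):
    """Hypothetical helper: rescale raw counts to hit a target vehicle-km total."""
    raw_counts = np.asarray(raw_counts, dtype=float)
    edge_length_km = np.asarray(edge_length_km, dtype=float)
    # Vehicle-kilometres implied by the unscaled sample.
    sampled_vkt = float(np.sum(raw_counts * edge_length_km))
    if sampled_vkt <= 0:
        raise ValueError("raw counts imply zero vehicle-kilometres; cannot scale")
    # A single global factor preserves the relative flow pattern across edges.
    return raw_counts * (target_vkt / sampled_vkt)
```

For example, raw counts of 10 and 30 on edges of 1 km and 2 km imply 70 vehicle-km; a target of 700 gives a factor of 10.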

aperta.traffic_flows.nested_node_sample(pairs, weights, costs, *, cell_to_zone_node, orig_weights, cost_to_weight, n_orig, n_dest, random_state, mask=None, chosen=None)

Sample n_dest destinations for n_orig weighted-sampled origin cells, integrating all three tiers (cell, middle, far) into one combined pool.

Per origin cell, the tier dest arrays are concatenated on the fly into one combined dest pool with per-pair scores weight * cost_to_weight(cost). Sampling is then a single JIT-compiled weighted draw over the pool, equivalent to np.random.choice. Peak memory is bounded by the largest single per-origin concatenation, not by n_orig × total_dests.

The far-tier scores are shared per zone and computed once per zone (not per cell), so the cost_to_weight call is amortised across all cells in the zone. The middle tier (cells_to_zones) is keyed per cell: cells in a zone share the same destination zones but have different per-cell costs, so it cannot be amortised the same way. The per-cell cost is what makes the score correct.
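The combined-pool scoring for a single origin can be sketched with plain NumPy arrays. This is a simplified illustration, not the JIT-compiled routine: the function name `sample_dests_for_origin` and the list-of-arrays inputs are assumptions that stand in for the per-tier entries of a TieredODPairs, and the per-zone caching of far-tier scores is omitted.

```python
import numpy as np


def sample_dests_for_origin(tier_dests, tier_weights, tier_costs,
                            cost_to_weight, n_dest, random_state):
    """Hypothetical helper: draw n_dest destinations from one combined pool."""
    # Concatenate the cell, middle and far tiers into a single pool.
    dests = np.concatenate(tier_dests)
    weights = np.concatenate(tier_weights).astype(float)
    costs = np.concatenate(tier_costs).astype(float)
    # Per-pair score: destination weight times the cost-decay weight.
    scores = weights * cost_to_weight(costs)
    total = scores.sum()
    if total <= 0:
        return np.empty(0, dtype=dests.dtype)  # nothing to sample from
    # i.i.d. draws (with replacement) from the normalised score distribution.
    return random_state.choice(dests, size=n_dest, replace=True, p=scores / total)


# Example decay: a monotone-decreasing, vectorised function of cost in metres.
def exp_decay(cost_m):
    return np.exp(-np.asarray(cost_m, dtype=float) / 5000.0)
```

A destination with zero weight gets a zero score and is never drawn, whichever tier it comes from.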

Parameters:

pairs (TieredODPairs) – destination IDs per tier.
weights (TieredODPairs) – destination weights per tier (e.g. populations), same shape as pairs. Typically the output of od_pairs.lookup_dest_column_node.
costs (TieredODPairs) – per-pair costs (e.g. line distances), same shape as pairs. Typically the output of od_pairs.get_euclidean_dists.
cell_to_zone_node (dict) – {cell_node -> zone_node} mapping; build via od_pairs.build_cell_to_zone_node_map.
orig_weights (ndarray | Series | None) – per-origin sampling weights, aligned position-wise with list(pairs.cells_to_cells.keys()). Required when chosen is None; ignored when chosen is provided.
cost_to_weight (Callable) – monotone-decreasing function mapping a cost (e.g. distance in metres) to a per-pair weight. Vectorized — receives a 1-D array.
n_orig (int) – number of origins to sample. Origin sampling is with replacement (popular origins can appear multiple times in the underlying random_state.choice). Ignored when chosen is provided; the sample size then comes from len(chosen).
n_dest (int) – number of destinations sampled per origin pick. Each duplicate pick generates its own batch of n_dest destinations, all i.i.d. from the same per-origin score distribution, so an origin picked k times ends up with a destination array of length k × n_dest. That origin contributes k× the flow downstream when get_nested_edge_betweenness walks predecessors, while still running only one Dijkstra from that origin (no wasted routing work; only the destination set grows). The total number of OD pairs sampled is exactly n_orig × n_dest regardless of how the duplicates are distributed, which is useful for AADT scaling (denominator = n_orig × n_dest, no dedup correction needed). When chosen is provided, n_dest still controls destinations per pick.
random_state (RandomState) – numpy RandomState; the only source of randomness.
mask (TieredODPairs | None) – optional boolean TieredODPairs (build via od_pairs.make_mask). Destinations where the mask is False are removed from the sampling pool. Missing origins or missing tiers in the mask are treated as “no filter” for that origin / tier.
chosen (ndarray | None) – optional pre-sampled origin array (with replacement, so duplicates carry their n_picks weight). When provided, the internal random_state.choice(origins, n_orig, True, p) is skipped; orig_weights and n_orig are ignored. Useful when the caller pre-samples origins externally to restrict the upstream tiered_path_costs work to only origins that will actually contribute — every entry of chosen must be a key in pairs.cells_to_cells.

Return type:

dict

Returns: {origin_cell_node -> np.ndarray[dest_node]}, where each value array has length n_picks × n_dest (= n_dest for origins picked once, longer for origins picked multiple times by the with-replacement origin sampling).
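The relationship between with-replacement origin picks and the returned array lengths can be checked with a small sketch. Both helper names are hypothetical, and integer origin IDs are assumed; the first mirrors the random_state.choice(origins, n_orig, True, p) call described for chosen, and the second computes the length each origin's destination array would have.

```python
import numpy as np


def presample_origins(origins, orig_weights, n_orig, random_state):
    """Hypothetical helper: pre-sample origins with replacement (a `chosen` array)."""
    p = np.asarray(orig_weights, dtype=float)
    p = p / p.sum()  # normalise weights to probabilities
    return random_state.choice(np.asarray(origins), n_orig, True, p)


def expected_dest_lengths(chosen, n_dest):
    """Map each unique origin to n_picks * n_dest, its expected array length."""
    uniq, n_picks = np.unique(chosen, return_counts=True)
    return {int(o): int(k) * n_dest for o, k in zip(uniq, n_picks)}
```

Summing the lengths over all origins gives len(chosen) × n_dest, which is the scaling denominator described for n_dest.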

aperta.traffic_flows.percentile_bin_edges(survey_costs, n_bins=20)

Equal-probability cost-bin edges from observed trip-cost data.

Returns n_bins + 1 edges such that each bin contains roughly 1 / n_bins of the survey data by count. Suitable as the bin_edges input to bin_adjusted_dest_weights for sampling that targets the empirical cost distribution non-parametrically (no need to fit a log-normal or similar).

Parameters:

survey_costs (ndarray | Series) – observed trip costs (e.g. observed travel times). NaNs and non-finite values are dropped before percentile estimation.
n_bins (int) – number of equal-probability bins. Default 20 balances granularity vs. per-origin sample-budget headroom; 10–30 are reasonable choices.

Returns:

Sorted 1-D array of length n_bins + 1 giving bin edges.

Return type:

ndarray
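A minimal sketch of equal-probability bin edges, assuming they are plain percentiles of the finite survey costs. The function name `percentile_bin_edges_sketch` is hypothetical, and the interpolation used by the actual function may differ from NumPy's default.

```python
import numpy as np


def percentile_bin_edges_sketch(survey_costs, n_bins=20):
    """Hypothetical re-implementation: n_bins + 1 equal-probability edges."""
    costs = np.asarray(survey_costs, dtype=float)
    costs = costs[np.isfinite(costs)]  # drop NaN and +/-inf before estimation
    if costs.size == 0:
        raise ValueError("no finite survey costs to estimate percentiles from")
    # Percentiles at 0, 100/n_bins, ..., 100 give edges with equal counts per bin.
    return np.percentile(costs, np.linspace(0.0, 100.0, n_bins + 1))
```

With heavily tied survey values, adjacent edges can coincide and produce zero-width bins, so a caller may want to inspect the edges before using them.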

aperta.traffic_flows.bin_adjusted_dest_weights(pairs, costs, dest_weights, bin_edges, *, renormalize_per_origin=True)

Per-origin per-bin reweight of destination weights, fixing the cost-weighted sampling bias at sparse-periphery origins.

For each origin and each cost bin, the bin’s target probability mass (1 / n_bins) is divided among the destinations that fall in that bin in proportion to their existing weight W(D). Bins with no destinations contribute nothing. The result is a per-origin adjusted weight array of the same shape as dest_weights, which can be passed to nested_node_sample (or any weighted sampler) in place of the raw weights — replacing the cost_to_weight callable entirely.

Destinations whose cost falls outside [bin_edges[0], bin_edges[-1]] receive zero weight (treated as too rare to be informative).

Parameters:

pairs (TieredODPairs) – destination IDs per tier (any of the three may be None).
costs (TieredODPairs) – per-pair costs, same shape as pairs (e.g. travel times).
dest_weights (TieredODPairs) – base destination weights W(D) per pair, same shape as pairs (e.g. populations, employment counts).
bin_edges (ndarray) – n_bins + 1 sorted values, typically from percentile_bin_edges applied to a travel-survey cost column.
renormalize_per_origin (bool) – when True (default), the adjusted weights for each origin are normalised to sum to 1, so each origin has the same total sampling weight regardless of how many cost bins its destinations populate. When False, sparse-periphery origins end up with a smaller total weight (the empty bins’ target mass is not redistributed) — this naturally reduces their effective trip count, useful when the bin adjustment is the only mechanism reducing trips from sparse origins. The default True matches the recommended decoupling: bin-adjustment fixes the cost distribution only; trip-generation count stays controlled separately at the orig_weights stage.

Returns:

Same TieredODPairs subclass as pairs with per-origin adjusted weight arrays. Origins whose destinations are entirely out of range (or whose dest_weights sum to zero) receive an all-zero array.

Return type:

TieredODPairs
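The per-bin reweighting for a single origin can be sketched on flat arrays. This illustrates the rule described above rather than the library code: the function name `bin_adjust_one_origin` is hypothetical, and the bin-membership convention (half-open bins, last edge inclusive) is an assumption.

```python
import numpy as np


def bin_adjust_one_origin(costs, weights, bin_edges, renormalize=True):
    """Hypothetical helper: split each bin's 1/n_bins mass by existing weight."""
    costs = np.asarray(costs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    n_bins = len(edges) - 1
    # Assumed convention: bin b covers [edges[b], edges[b+1]), last edge inclusive.
    idx = np.searchsorted(edges, costs, side="right") - 1
    idx[costs == edges[-1]] = n_bins - 1
    in_range = (costs >= edges[0]) & (costs <= edges[-1])

    adjusted = np.zeros_like(weights)  # out-of-range destinations stay at zero
    for b in range(n_bins):
        members = in_range & (idx == b)
        mass = weights[members].sum()
        if mass > 0:  # empty or zero-weight bins contribute nothing
            adjusted[members] = (1.0 / n_bins) * weights[members] / mass
    total = adjusted.sum()
    if renormalize and total > 0:
        adjusted = adjusted / total  # each origin sums to 1
    return adjusted
```

With three bins of which one is empty, the non-renormalised weights sum to 2/3, which is the reduced total weight that renormalize_per_origin=False leaves at sparse origins.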
