aperta.traffic_flows
Lightweight one-shot traffic-flow estimation via sampled betweenness centrality.
Estimates daily per-edge traffic volumes (interpretable as AADT, annual average daily traffic, once calibrated) by simulating a quick three-step travel demand model: trip generation (origin sampling weighted by population), trip distribution (per-origin destination sampling weighted by a cost-decay function and per-destination attractiveness), and route assignment (shortest-path routing on the current edge weights, accumulating per-edge counts). Outputs can be calibrated against ground-truth counter data via the helpers in aperta.calibration.
Scope and limitations. This is a one-shot estimation pass: the routing step uses the input edge weights without iterating toward congestion equilibrium. It is intended for users who (1) want a per-edge traffic-flow estimate to feed into travel-time calibration or as an accessibility feature, and (2) do not already have detailed outputs from a full traffic-assignment model (which would be a more rigorous alternative and could be plugged in directly). The library reuses aperta’s existing infrastructure — tiered OD matrices, edge-weight calibration, scipy routing backend — to keep the estimation cheap and consistent with the rest of the pipeline; it does not aim to replace a dedicated traffic-assignment tool. An iterative congestion-aware variant is theoretically possible as a future extension.
This module supplies the sampling primitives (nested_node_sample) and the normalisation step (estimate_edge_flows) that turns raw sampled-betweenness counts into a per-edge volume calibrated against an expected total vehicle-kilometres figure. The routing + per-edge accumulation itself lives in network_processing.get_nested_edge_betweenness. A simpler alternative for small study areas — radius-limited Brandes betweenness without explicit OD sampling — also lives in network_processing.
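The route-assignment step described above can be illustrated with a minimal sketch. It is not the code in network_processing.get_nested_edge_betweenness; the helper name accumulate_edge_counts, the toy graph and the hand-written OD sample are all assumptions made for illustration. It routes each sampled (origin, dest) pair with scipy's Dijkstra and counts how often each edge is used.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# Toy undirected network: a line 0-1-2-3 plus a slow direct link 0-3.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 5.0)]
rows, cols, w = zip(*edges)
graph = csr_matrix((w, (rows, cols)), shape=(4, 4))


def accumulate_edge_counts(graph, od_sample):
    """Hypothetical helper: count edge usage over sampled shortest paths."""
    counts = {}
    for origin, dests in od_sample.items():
        # One Dijkstra run per origin; pred[v] is the node before v on the path.
        _, pred = dijkstra(graph, directed=False, indices=origin,
                           return_predecessors=True)
        for dest in dests:
            node = int(dest)
            # Walk back from the destination to the origin.
            # Unreachable nodes have a negative predecessor and are skipped.
            while node != origin and pred[node] >= 0:
                prev = int(pred[node])
                key = (min(prev, node), max(prev, node))
                counts[key] = counts.get(key, 0) + 1
                node = prev
    return counts


# {origin -> sampled destinations}; repeats count as separate trips.
od_sample = {0: np.array([3, 3, 2]), 3: np.array([1])}
counts = accumulate_edge_counts(graph, od_sample)
```

In this toy case the direct 0-3 link is never used because the three-hop route is cheaper, so its count stays absent from the result.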
- aperta.traffic_flows.estimate_edge_flows(graph, weight, expected_km_driven, nested_node_sample, *, cutoff=None)
Traffic-flow estimation from a nested OD sample, normalised to expected_km_driven total vehicle-kilometres.
Routes each sampled (origin, dest) pair via scipy Dijkstra, accumulates per-edge usage counts (see network_processing.get_nested_edge_betweenness), then scales so that sum(flow_e × length_e) matches expected_km_driven.
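The scaling step can be sketched as follows. This is an illustration only, not the body of estimate_edge_flows: the function name scale_counts_to_vkm is hypothetical, and edge lengths are assumed to be given in kilometres.

```python
import numpy as np


def scale_counts_to_vkm(counts, lengths_km, expected_km_driven):
    """Scale raw per-edge counts so that sum(flow_e * length_e) equals
    expected_km_driven. Hypothetical helper; lengths assumed to be in km."""
    counts = np.asarray(counts, dtype=float)
    lengths_km = np.asarray(lengths_km, dtype=float)
    raw_km = float(np.sum(counts * lengths_km))
    if raw_km == 0.0:
        # No routed trips: nothing to scale.
        return np.zeros_like(counts)
    return counts * (expected_km_driven / raw_km)


counts = np.array([10, 0, 5, 25])           # raw sampled-betweenness counts
lengths_km = np.array([1.0, 2.0, 0.5, 4.0])
flows = scale_counts_to_vkm(counts, lengths_km, expected_km_driven=1_000.0)
```

Because the scaling is a single global factor, the relative ranking of edges by raw count is preserved; only the total vehicle-kilometres is fixed.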
cutoff (optional): network-distance limit in weight units passed through to the per-origin Dijkstra. Set this to the upstream sampling radius (e.g. r_zones from od_pairs.get_pairs) — sampled destinations are guaranteed reachable within that radius, so the cutoff is correctness-preserving and gives a large speed-up on country-scale graphs. Default None = no cutoff.
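To illustrate what a distance cutoff does, the sketch below uses the limit argument of scipy.sparse.csgraph.dijkstra on a toy path graph. Whether estimate_edge_flows forwards cutoff to this exact argument is an assumption; the graph and the values are made up.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# Path graph 0-1-2-3 with 100 weight units per edge.
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 3])
w = np.array([100.0, 100.0, 100.0])
graph = csr_matrix((w, (rows, cols)), shape=(4, 4))

# Without a limit, every node reachable from node 0 gets a finite distance.
full = dijkstra(graph, directed=False, indices=0)

# With a limit, the search stops expanding beyond 250 weight units;
# nodes further away are reported as unreachable (inf).
limited = dijkstra(graph, directed=False, indices=0, limit=250.0)
```

If every sampled destination lies within the sampling radius, a limit equal to that radius leaves their distances unchanged while skipping the rest of the graph.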
- aperta.traffic_flows.nested_node_sample(pairs, weights, costs, cell_to_zone_node, orig_weights, cost_to_weight, n_orig, n_dest, random_state, *, mask=None)
Sample n_dest destinations for n_orig weighted-sampled origin cells, integrating all three tiers (cell, middle, far) into one combined pool.
Per origin cell, the tier dest arrays are concatenated on the fly into one combined dest pool with per-pair scores weight * cost_to_weight(cost). Sampling is then a single np.random.choice-equivalent (JITted) over the pool. Peak memory is bounded by the largest single per-origin concatenation, not by n_orig × total_dests.
The per-zone shared scores (far tier) are computed once per zone, not once per cell, so the cost_to_weight call is amortised across all cells in the zone. The middle tier (cells_to_zones) is keyed per cell: cells in the same zone share destination zones but have different per-cell costs, so the middle-tier scores cannot be amortised the same way. The per-cell cost is what makes the score correct.
- Parameters:
pairs (TieredODPairs) – destination IDs per tier.
weights (TieredODPairs) – destination weights per tier (e.g. populations), same shape as pairs. Typically the output of od_pairs.dest_values.
costs (TieredODPairs) – per-pair costs (e.g. line distances), same shape as pairs. Typically the output of od_pairs.get_euclidean_dists.
cell_to_zone_node (dict) – {cell_node -> zone_node} mapping; build via od_pairs.build_cell_to_zone_node_map.
orig_weights (ndarray | Series) – per-origin sampling weights, aligned position-wise with list(pairs.cells_to_cells.keys()).
cost_to_weight (Callable) – monotone-decreasing function mapping a cost (e.g. distance in metres) to a per-pair weight. Vectorized — receives a 1-D array.
n_orig (int) – number of origins to sample. Sampling is with replacement; repeats in the origin sample are deduplicated (each origin is processed once).
n_dest (int) – number of destinations to sample per sampled origin. Sampling is with replacement.
random_state (RandomState) – numpy RandomState; the only source of randomness.
mask (TieredODPairs | None) – optional boolean TieredODPairs (build via od_pairs.make_mask). Destinations where the mask is False are removed from the sampling pool. Missing origins or missing tiers in the mask are treated as “no filter” for that origin / tier.
- Return type:
Returns: {origin_cell_node -> np.ndarray[dest_node]}, with one destination array of length n_dest per distinct sampled origin.
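The per-origin pooling and scoring can be sketched in plain NumPy. This is a simplified stand-in for the JITted sampler: the helper sample_dests_from_pool, the exponential decay exp_decay with its 5 km scale, and the toy tier arrays are all assumptions. It does not use TieredODPairs or the mask.

```python
import numpy as np


def sample_dests_from_pool(tier_dests, tier_weights, tier_costs,
                           cost_to_weight, n_dest, random_state):
    """Hypothetical single-origin sampler: concatenate the tier arrays,
    score each pair as weight * cost_to_weight(cost), sample with replacement."""
    dests = np.concatenate(tier_dests)
    weights = np.concatenate(tier_weights).astype(float)
    costs = np.concatenate(tier_costs).astype(float)
    scores = weights * cost_to_weight(costs)
    total = scores.sum()
    if total <= 0:
        # Nothing samplable for this origin.
        return np.empty(0, dtype=dests.dtype)
    return random_state.choice(dests, size=n_dest, replace=True,
                               p=scores / total)


def exp_decay(cost_m):
    """Assumed cost_to_weight: vectorised, monotone decreasing in distance."""
    return np.exp(-np.asarray(cost_m) / 5_000.0)


rng = np.random.RandomState(42)
# Toy tiers for one origin: cell, middle and far destinations.
tier_dests = [np.array([11, 12]), np.array([201]), np.array([301, 302])]
tier_weights = [np.array([500, 800]), np.array([12_000]), np.array([90_000, 0])]
tier_costs = [np.array([400.0, 900.0]), np.array([6_000.0]),
              np.array([40_000.0, 55_000.0])]

sampled = sample_dests_from_pool(tier_dests, tier_weights, tier_costs,
                                 exp_decay, n_dest=8, random_state=rng)
```

Destination 302 has zero weight, so it can never be drawn regardless of its cost.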
- aperta.traffic_flows.percentile_bin_edges(survey_costs, n_bins=20)
Equal-probability cost-bin edges from observed trip-cost data.
Returns n_bins + 1 edges such that each bin contains roughly 1 / n_bins of the survey data by count. Suitable as the bin_edges input to bin_adjusted_dest_weights for sampling that targets the empirical cost distribution non-parametrically (no need to fit a log-normal or similar).
- Parameters:
survey_costs (ndarray | Series) – observed trip costs (e.g. observed travel times). NaNs and non-finite values are dropped before percentile estimation.
n_bins (int) – number of equal-probability bins. Default 20 balances granularity vs. per-origin sample-budget headroom; 10–30 are reasonable choices.
- Returns:
Sorted 1-D array of length n_bins + 1 giving bin edges.
- Return type:
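A minimal sketch of equal-probability bin edges, assuming they are taken as evenly spaced quantiles of the cleaned survey costs. The function name equal_probability_edges is hypothetical, and the library may use a different quantile method.

```python
import numpy as np


def equal_probability_edges(survey_costs, n_bins=20):
    """Hypothetical stand-in: n_bins + 1 quantile edges of the finite costs."""
    costs = np.asarray(survey_costs, dtype=float)
    costs = costs[np.isfinite(costs)]  # drops NaN and +/-inf
    return np.quantile(costs, np.linspace(0.0, 1.0, n_bins + 1))


# Toy travel-time survey in minutes, including two invalid entries.
costs = np.array([5, 7, 8, 12, 15, 20, 22, 30, 45, 60, np.nan, np.inf])
edges = equal_probability_edges(costs, n_bins=5)
```

The first and last edges are the minimum and maximum of the finite observations, so every valid survey cost falls inside the returned range.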
- aperta.traffic_flows.bin_adjusted_dest_weights(pairs, costs, dest_weights, bin_edges, *, renormalize_per_origin=True)
Per-origin per-bin reweight of destination weights, fixing the cost-weighted sampling bias at sparse-periphery origins.
For each origin and each cost bin, the bin’s target probability mass (1 / n_bins) is divided among the destinations that fall in that bin in proportion to their existing weight W(D). Bins with no destinations contribute nothing. The result is a per-origin adjusted weight array of the same shape as dest_weights, which can be passed to nested_node_sample (or any weighted sampler) in place of the raw weights, replacing the cost_to_weight callable entirely.
Destinations whose cost falls outside [bin_edges[0], bin_edges[-1]] receive zero weight (treated as too rare to be informative).
- Parameters:
pairs (TieredODPairs) – destination IDs per tier (any of the three may be None).
costs (TieredODPairs) – per-pair costs, same shape as pairs (e.g. travel times).
dest_weights (TieredODPairs) – base destination weights W(D) per pair, same shape as pairs (e.g. populations, employment counts).
bin_edges (ndarray) – n_bins + 1 sorted values, typically from percentile_bin_edges applied to a travel-survey cost column.
renormalize_per_origin (bool) – when True (default), the adjusted weights for each origin are normalised to sum to 1, so each origin has the same total sampling weight regardless of how many cost bins its destinations populate. When False, sparse-periphery origins end up with a smaller total weight (the empty bins’ target mass is not redistributed); this naturally reduces their effective trip count, which is useful when the bin adjustment is the only mechanism reducing trips from sparse origins. The default True matches the recommended decoupling: bin-adjustment fixes the cost distribution only; trip-generation count stays controlled separately at the orig_weights stage.
- Returns:
Same TieredODPairs subclass as pairs with per-origin adjusted weight arrays. Origins whose destinations are entirely out of range (or whose dest_weights sum to zero) receive an all-zero array.
- Return type:
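The per-bin reweighting can be sketched for a single origin with flat arrays. This is an illustration under stated assumptions, not the library code: the helper bin_adjust_one_origin is hypothetical, it ignores the tier structure of TieredODPairs, and it assumes half-open bins [lo, hi) with the last bin closed on the right.

```python
import numpy as np


def bin_adjust_one_origin(costs, dest_weights, bin_edges, renormalize=True):
    """Hypothetical single-origin version of the per-bin reweighting."""
    costs = np.asarray(costs, dtype=float)
    w = np.asarray(dest_weights, dtype=float)
    n_bins = len(bin_edges) - 1
    # Bin index per destination; a cost equal to the last edge joins the last bin.
    idx = np.searchsorted(bin_edges, costs, side="right") - 1
    idx[costs == bin_edges[-1]] = n_bins - 1
    in_range = (costs >= bin_edges[0]) & (costs <= bin_edges[-1])

    adjusted = np.zeros_like(w)
    for b in range(n_bins):
        members = in_range & (idx == b)
        mass = w[members].sum()
        if mass > 0:
            # Split the bin's target mass 1 / n_bins in proportion to W(D).
            adjusted[members] = (1.0 / n_bins) * w[members] / mass
    total = adjusted.sum()
    if renormalize and total > 0:
        adjusted /= total
    return adjusted


bin_edges = np.array([0.0, 10.0, 20.0, 30.0, 40.0])  # 4 bins
costs = np.array([2.0, 5.0, 12.0, 50.0])             # last one is out of range
weights = np.array([100.0, 300.0, 50.0, 999.0])

raw = bin_adjust_one_origin(costs, weights, bin_edges, renormalize=False)
norm = bin_adjust_one_origin(costs, weights, bin_edges, renormalize=True)
```

Here only two of the four bins are populated, so the non-renormalised weights sum to 0.5; with renormalisation they sum to 1, which mirrors the difference between the two settings of renormalize_per_origin described above.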