PyPI workbook run (Track D)

From inside your Track D workbook folder (created by pystatsv1 workbook init --track d --dest ...), run:

pystatsv1 workbook run |trackd_run|

Outputs are written under outputs/track_d/ by default. If you’re unsure what a file is for, start with the Track D Outputs Guide.

For the full chapter-by-chapter run map (D00–D23), see the Track D chapter index (PyPI).

Optional: write to a custom output folder:

pystatsv1 workbook run |trackd_run| --outdir outputs/track_d_custom

Interpretation prompts (quick self-check):

What is the accounting or business measurement goal in this chapter?
Which invariant/check would catch a “numbers look fine but are wrong” mistake here?

Track D — Chapter 17

Revenue forecasting via segmentation + drivers (NSO running case)

This chapter extends the Track D planning toolkit by building a simple, explainable forecast of AR invoice revenue.

Key idea

Revenue is modeled from two drivers:

Invoice count (how many invoices we expect)
Average invoice value (the typical size of an invoice)

\[\text{Revenue} = \text{InvoiceCount} \times \text{AvgInvoiceValue}\]
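As a quick worked example of the identity above, the sketch below multiplies the two drivers for a single month. The numbers and variable names are illustrative assumptions, not values from the NSO dataset.

```python
# Illustrative values only (not taken from the NSO dataset).
invoice_count = 120          # invoices expected in the month
avg_invoice_value = 850.0    # typical invoice size, in currency units

# Revenue = InvoiceCount x AvgInvoiceValue
revenue = invoice_count * avg_invoice_value
print(f"Forecast revenue: {revenue:,.2f}")  # Forecast revenue: 102,000.00

# The identity also runs backwards: average value is revenue / count,
# which is undefined for a month with zero invoices.
recovered_avg = revenue / invoice_count if invoice_count else float("nan")
```

Because the average is undefined when a segment has no invoices in a month, a driver-based forecast needs some convention for empty months. This sketch simply returns NaN, which may not match what the chapter script does.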

To make the output actionable for planning, we segment customers into:

the top K customers by invoice revenue (each is its own segment)
an All other customers segment
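The segmentation rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not the chapter script's implementation: the function name `assign_segments`, the `(customer, amount)` input shape, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import defaultdict


def assign_segments(invoices, top_k=3):
    """Give each top-K customer its own segment; pool everyone else.

    `invoices` is an iterable of (customer, invoice_amount) pairs
    (a hypothetical input shape chosen for this sketch).
    """
    totals = defaultdict(float)
    for customer, amount in invoices:
        totals[customer] += amount

    # Rank by total invoice revenue, largest first; break ties by name
    # so the result is deterministic.
    ranked = sorted(totals, key=lambda c: (-totals[c], c))
    top = set(ranked[:top_k])

    return {c: (c if c in top else "All other customers") for c in totals}


# Toy data: Acme totals 800, Cedar 250, Birch 120, Dune 40.
invoices = [
    ("Acme", 500.0), ("Birch", 120.0), ("Acme", 300.0),
    ("Cedar", 250.0), ("Dune", 40.0),
]
segments = assign_segments(invoices, top_k=2)
print(segments)
```

With `top_k=2`, Acme and Cedar keep their own segments while Birch and Dune fall into the pooled segment.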

What you will build

A customer segmentation table based on AR invoices.
A monthly segmented revenue table.
A baseline model selection (backtest) for each segment and each driver.
A 12-month revenue forecast per segment and in total.
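To make the backtest-based model selection concrete, here is a minimal sketch of the general technique: hold out the last months of a driver series, forecast them with each candidate method, and keep the method with the lowest error. The two candidate methods (`naive_last`, `trailing_mean`), the choice of MAE as the selection metric, and the toy series are assumptions for illustration; the chapter script may use different candidates and rules.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error (in %), skipping months where actual is 0."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def naive_last(train, horizon):
    """Repeat the last observed value across the horizon."""
    return [train[-1]] * horizon


def trailing_mean(train, horizon, window=3):
    """Repeat the mean of the last `window` observed values."""
    recent = train[-window:]
    return [sum(recent) / len(recent)] * horizon


def select_method(series, holdout, methods):
    """Score each method on the last `holdout` points; pick the lowest MAE."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {}
    for name, forecast_fn in methods.items():
        predicted = forecast_fn(train, len(test))
        scores[name] = {"mae": mae(test, predicted), "mape": mape(test, predicted)}
    best = min(scores, key=lambda name: scores[name]["mae"])
    return best, scores


# Toy monthly invoice counts for one segment (illustrative only).
invoice_counts = [40, 42, 38, 41, 45, 39, 43, 41, 40, 42]
best, scores = select_method(
    invoice_counts,
    holdout=4,
    methods={"naive_last": naive_last, "trailing_mean": trailing_mean},
)
print(best, scores[best])
```

The same selection would be repeated for each segment and for each driver (invoice count and average invoice value), so different segments can end up with different methods.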

Command

From the project root:

make business-ch17

Or run the script directly:

python -m scripts.business_ch17_revenue_forecasting_segmentation_drivers \
  --datadir data/synthetic/nso_v1 \
  --outdir outputs/track_d \
  --seed 123

Outputs

Artifacts are written under:

outputs/track_d/track_d/

CSV / JSON / MD files (chapter folder)

ch17_customer_segments.csv: Customer-level totals and segment assignment.
ch17_ar_revenue_segment_monthly.csv: Monthly segmented table with invoice count, invoice amount, and average invoice value.
ch17_series_monthly.csv: Monthly TOTAL series (sum across segments).
ch17_backtest_metrics.csv: MAE / MAPE metrics for candidate driver methods.
ch17_backtest_total_revenue.csv: 12-month holdout backtest comparing TOTAL actual vs predicted revenue.
ch17_forecast_next12.csv: Forecast for the next 12 months, per segment and for TOTAL. (forecast_lo / forecast_hi are included for TOTAL only.)
ch17_memo.md: A short memo with the chosen models and headline results.
ch17_design.json: Run metadata, segmentation details, and selected driver methods.
ch17_known_events_template.json: Template to record known upcoming events (optional adjustments).
ch17_figures_manifest.csv and ch17_manifest.json: Figure metadata and the artifact manifest, respectively.
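Since the monthly TOTAL series is described as the sum across segments, one simple invariant to test is that segment rows add up to the TOTAL row for every month. The sketch below runs that check on inline sample data. The column names (`month`, `segment`, `amount`) and the single-table layout are hypothetical; inspect the actual CSV headers before adapting it.

```python
import csv
import io
from collections import defaultdict

# Inline sample with a hypothetical layout; February is deliberately off by 5.
sample = """month,segment,amount
2025-01,Acme,800.0
2025-01,All other customers,410.0
2025-01,TOTAL,1210.0
2025-02,Acme,820.0
2025-02,All other customers,400.0
2025-02,TOTAL,1225.0
"""


def find_total_mismatches(rows, tol=0.01):
    """Return the months where segment rows do not sum to the TOTAL row."""
    segment_sum = defaultdict(float)
    total = {}
    for row in rows:
        value = float(row["amount"])
        if row["segment"] == "TOTAL":
            total[row["month"]] = value
        else:
            segment_sum[row["month"]] += value
    return sorted(m for m in total if abs(total[m] - segment_sum[m]) > tol)


rows = list(csv.DictReader(io.StringIO(sample)))
mismatches = find_total_mismatches(rows)
print(mismatches)  # ['2025-02']
```

A check like this is one answer to the "numbers look fine but are wrong" prompt: each row can look plausible on its own while the segments and the TOTAL disagree.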

Compatibility aliases

Two alias files are also written for convenience:

ch17_forecast_next_12m.csv (same as ch17_forecast_next12.csv)
ch17_forecast_memo.md (same as ch17_memo.md)

Figures

Figures are saved under:

outputs/track_d/track_d/figures/

ch17_fig_segment_revenue_history.png
ch17_fig_backtest_total_revenue.png
ch17_fig_forecast_total_revenue.png

Interpretation guide

Segmentation: Treat the top-K customer segments as high-signal indicators. Each carries a large share of invoice revenue, so a large change in one of them often drives the TOTAL.
Invoice count: A proxy for activity/volume. If count is rising while average value is flat, you likely have growth via more orders.
Average invoice value: A proxy for pricing/contract size. If value is rising while count is flat, you may have price increases or bigger deals.
Backtest: Use the holdout window to sanity-check whether the methods are stable enough for planning. If backtest errors are large, consider shortening the horizon or adding a “known events” overlay.
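The count-versus-value reading above can be quantified by splitting a revenue change into a volume effect and a value effect. The sketch below uses one conventional split (count change priced at the old average, value change applied to the new count); other splits exist, and this helper is an illustration rather than something the chapter script is known to compute.

```python
def decompose_revenue_change(count_prev, value_prev, count_curr, value_curr):
    """Split a revenue change into (volume_effect, value_effect).

    volume_effect: extra revenue from more invoices at the old average value.
    value_effect:  extra revenue from a higher average value on the new count.
    The two effects sum exactly to the total revenue change.
    """
    volume_effect = (count_curr - count_prev) * value_prev
    value_effect = count_curr * (value_curr - value_prev)
    return volume_effect, value_effect


# Count rises 100 -> 110 while average value stays at 500: all volume.
volume_effect, value_effect = decompose_revenue_change(100, 500.0, 110, 500.0)
print(volume_effect, value_effect)  # 5000.0 0.0

# Count flat at 100 while average value rises 500 -> 550: all value.
volume_only, value_only = decompose_revenue_change(100, 500.0, 100, 550.0)
print(volume_only, value_only)  # 0.0 5000.0
```

When both drivers move at once, the two effects still add up to the full change in revenue, which makes it easier to say how much of a segment's growth came from activity and how much from invoice size.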