Track D — Chapter 15
PyPI workbook run (Track D)
From inside your Track D workbook folder (created by pystatsv1 workbook init --track d --dest ...), run:
pystatsv1 workbook run |trackd_run|
Outputs are written under outputs/track_d/ by default.
If you’re unsure what a file is for, start with Track D Outputs Guide.
To see the full chapter-by-chapter run map (D00–D23), see Track D chapter index (PyPI).
Optional: write to a custom output folder:
pystatsv1 workbook run |trackd_run| --outdir outputs/track_d_custom
Interpretation prompts (quick self-check):
What is the accounting or business measurement goal in this chapter?
Which invariant/check would catch a “numbers look fine but are wrong” mistake here?
Forecasting Foundations and Forecast Hygiene (NSO running case)
In Chapter 14 you built an explainable driver model (COGS explained by operational activity). In Chapter 15 you switch gears: you treat the accounting output as a time series and learn how to produce a defensible baseline forecast.
This chapter is deliberately “low-tech”:
You will compare three simple baseline forecasts.
You will backtest them on a 12-month holdout window.
You will pick a method using error metrics, not vibes.
You will create an assumptions log template so your forecast is auditable.
The goal is not to build the best forecast possible. The goal is to build a forecast process that an accountant/analyst can explain, reproduce, and improve.
What is “forecast hygiene”?
Forecast hygiene is the set of practices that keeps forecasting useful and honest:
Define the question (what are we forecasting, for whom, and for what decision?).
Define the grain (monthly vs weekly vs daily; consolidated vs by product/location).
Document assumptions (what is expected to change and why).
Measure error with backtesting (how wrong have we been, using history?).
Version the artifacts (inputs, method choice, metrics, memo, and figures).
In accounting, this matters because forecasts often feed budgeting, cash planning, staffing, and performance conversations. A forecast that can’t be explained or reproduced becomes a source of risk.
How this ties to earlier chapters
Chapter 15 builds directly on concepts you have already practiced:
Chapters 7–9 (data prep + reporting discipline): consistent month keys, clean joins, and reliable output artifacts.
Chapter 13 (controlled comparisons): “compare like with like” is the mindset behind backtesting (train vs holdout).
Chapter 14 (driver lens): the forecast is still a driver lens (planning tool), not a claim of causation or a guarantee.
What you will build
A clean monthly time series (from the NSO income statement)
From statements_is_monthly.csv you will build a wide monthly table:
month (YYYY-MM)
revenue (Sales Revenue)
cogs (Cost of Goods Sold)
gross_profit
operating_expenses
net_income
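The reshaping step can be sketched as follows. This is a minimal illustration, not the chapter script: it assumes statements_is_monthly.csv is in long format with hypothetical column names month, line, and amount, and it maps only the two source lines named above (Sales Revenue and Cost of Goods Sold). Check the real file's header before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format sample. The real statements_is_monthly.csv may use
# different column names and contain more line items.
SAMPLE = """month,line,amount
2024-02,Sales Revenue,1100
2024-01,Sales Revenue,1000
2024-01,Cost of Goods Sold,600
2024-02,Cost of Goods Sold,650
"""

# Income statement line -> wide-table column name.
RENAME = {"Sales Revenue": "revenue", "Cost of Goods Sold": "cogs"}


def to_wide(rows):
    """Pivot long rows (month, line, amount) into one dict per month."""
    wide = defaultdict(dict)
    for row in rows:
        column = RENAME.get(row["line"])
        if column is None:
            continue  # ignore lines that are not part of the series
        wide[row["month"]][column] = float(row["amount"])
    # YYYY-MM keys sort chronologically when sorted as plain strings.
    return [{"month": month, **wide[month]} for month in sorted(wide)]


series = to_wide(csv.DictReader(io.StringIO(SAMPLE)))
for record in series:
    print(record)
```

Sorting on the YYYY-MM key keeps the series in time order, which every forecasting step that follows depends on.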
Three baseline forecasts (for revenue)
You will compare three baseline methods:
naive_last — next month equals the last observed month
moving_avg_3 — next month equals the average of the last 3 months
linear_trend — fit a straight line through time and extrapolate
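The three baselines can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the chapter script's implementation: the function signatures are hypothetical, and naive_last and moving_avg_3 are assumed to hold their value flat across the whole horizon (a rolling one-step-ahead variant would behave differently).

```python
def naive_last(history, horizon):
    """Every future month equals the last observed month."""
    return [history[-1]] * horizon


def moving_avg_3(history, horizon):
    """Every future month equals the average of the last 3 observed months."""
    window = history[-3:]
    average = sum(window) / len(window)
    return [average] * horizon


def linear_trend(history, horizon):
    """Fit y = intercept + slope * t by least squares and extrapolate.

    Time is indexed 0..n-1; needs at least two observations.
    """
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + step) for step in range(horizon)]


history = [100.0, 110.0, 120.0, 130.0]  # toy revenue series
print(naive_last(history, 2))
print(moving_avg_3(history, 2))
print(linear_trend(history, 2))
```

On a series that rises steadily, only linear_trend continues the climb; the other two stay flat, which is why the backtest is needed to see which behavior fits the actual data.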
A 12-month backtest + error metrics
You will hold out the last 12 months, forecast them using the first 12 months, and compute:
MAE (mean absolute error): typical size of the miss (in currency units)
MAPE (mean absolute percentage error): typical miss as a percent of actual
Then you will select the baseline method with the lowest MAPE (tie-break on MAE).
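The holdout split, the two metrics, and the selection rule can be sketched as below. The function names, the toy numbers, and the method_a/method_b labels are hypothetical; the sketch also assumes every actual value is non-zero, since MAPE is undefined otherwise.

```python
def split_holdout(values, holdout=12):
    """Split a series into (train, test), where test is the last `holdout` points."""
    if len(values) <= holdout:
        raise ValueError("series is too short for the requested holdout window")
    return values[:-holdout], values[-holdout:]


def mae(actual, predicted):
    """Mean absolute error, in the same currency units as the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actuals must be non-zero)."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def select_method(metrics):
    """Pick the method with the lowest MAPE, breaking ties on MAE."""
    return min(metrics, key=lambda name: (metrics[name]["mape"], metrics[name]["mae"]))


# Toy example: 4 months of history, hold out the last 2.
values = [90.0, 95.0, 100.0, 200.0]
train, test = split_holdout(values, holdout=2)

# Hand-written predictions for the holdout months, for illustration only.
predictions = {"method_a": [110.0, 190.0], "method_b": [90.0, 220.0]}
metrics = {
    name: {"mae": mae(test, predicted), "mape": mape(test, predicted)}
    for name, predicted in predictions.items()
}
best = select_method(metrics)
print(metrics, best)
```

Comparing (MAPE, MAE) tuples implements the tie-break directly: Python compares the first element and only looks at the second when the first is equal.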
A forecast memo + auditable assumptions log
You will produce a short memo and an assumptions log CSV template so that:
the forecast is shareable with stakeholders,
the method selection is documented,
the “why” behind adjustments is captured in a durable, versioned form.
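To make the idea of an assumptions log concrete, here is a minimal sketch that writes a CSV template with one example row. The column names and the example row are hypothetical, chosen for illustration; the template produced by the chapter script may use different columns.

```python
import csv
import io

# Hypothetical columns; the real ch15_assumptions_log_template.csv may differ.
COLUMNS = [
    "assumption_id",
    "date_logged",
    "owner",
    "series",
    "months_affected",
    "assumption",
    "rationale",
    "expected_impact",
    "status",
]


def write_template(handle, example_rows=()):
    """Write the header row, plus any example rows, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in example_rows:
        writer.writerow(row)  # missing keys are written as empty cells


buffer = io.StringIO()
write_template(
    buffer,
    [
        {
            "assumption_id": "A-001",
            "series": "revenue",
            "assumption": "Price increase next quarter",
            "rationale": "Hypothetical example for illustration",
            "status": "draft",
        }
    ],
)
template_text = buffer.getvalue()
print(template_text)
```

Keeping the log as a plain CSV means it can be versioned alongside the metrics and memo, so each adjustment to the baseline has a traceable reason.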
How to run Chapter 15
Prerequisite: generate the NSO dataset (once)
If you already ran Chapter 14, you likely already have the NSO dataset at:
data/synthetic/nso_v1
If not:
make business-nso-sim
make business-validate
Run the Chapter 15 analysis
make business-ch15
By default this runs:
python -m scripts.business_ch15_forecasting_foundations \
--datadir data/synthetic/nso_v1 \
--outdir outputs/track_d \
--seed 123
Outputs
All artifacts are written under:
outputs/track_d/track_d
Open this first (recommended order):
ch15_backtest_metrics.csv (which baseline wins?)
figures/ch15_fig_backtest_overlay.png (does the chosen method track reality?)
ch15_forecast_memo.md (shareable summary)
ch15_forecast_next12.csv (numbers you plug into planning)
Core tables (CSV)
ch15_series_monthly.csv: Clean monthly time series used for forecasting (revenue + key IS lines).
ch15_backtest_predictions.csv: Month-by-month backtest predictions for each method (actual vs predicted + errors).
ch15_backtest_metrics.csv: Summary metrics (MAE and MAPE) by method.
ch15_forecast_next12.csv: Selected baseline method forecast for the next 12 months, including a simple range.
ch15_assumptions_log_template.csv: Template you fill in when business context requires adjustments.
Design + narrative (JSON/MD)
ch15_forecast_design.json: Machine-readable “cover sheet”: series, train/test windows, methods compared, selection rule, chosen method, and forecast months.
ch15_forecast_memo.md: Human-readable memo with a small metrics table and the forecast table.
Figures (PNG) + manifest
Figures are written under outputs/track_d/track_d/figures and listed in:
ch15_figures_manifest.csv
Troubleshooting
“Expected statements_is_monthly.csv … but not found.”
You likely pointed --datadir at the wrong folder.
Correct:
--datadir data/synthetic/nso_v1
Also confirm you ran:
make business-nso-sim
Outputs are missing or in a surprising folder
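This chapter writes into --outdir plus /track_d (to match Track D conventions). With the default --outdir outputs/track_d, the artifacts therefore land in outputs/track_d/track_d.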
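The path rule can be checked with a short sketch. The helper names artifact_dir and figures_dir are hypothetical and only mirror the convention described here; they are not functions from the chapter script.

```python
from pathlib import PurePosixPath


def artifact_dir(outdir):
    """Folder where the chapter's artifacts land: --outdir plus /track_d."""
    return PurePosixPath(outdir) / "track_d"


def figures_dir(outdir):
    """Figures live in a figures subfolder of the artifact folder."""
    return artifact_dir(outdir) / "figures"


print(artifact_dir("outputs/track_d"))
print(figures_dir("outputs/track_d"))
print(artifact_dir("outputs/track_d_custom"))
```

If your files are not where you expect, compare the folder you passed as --outdir with the folder this rule produces.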
What’s next (Chapter 16+)
Chapter 16: seasonality and seasonal baselines
Chapter 17: rolling forecasts + scenario planning
Chapter 18: forecasting with drivers (combine operational drivers + time series)
End-of-chapter exercises
Re-run Chapter 15 but forecast COGS instead of revenue.
Change the holdout window to 6 months. Does the best baseline method change?
Use the assumptions log template to document a hypothetical pricing change next quarter.
See also
Appendix 14D (Artifact QA checklist): use the same pre-share mindset before circulating forecasts.
Appendix 14E (Apply to real world): adapting the workflow to your own chart of accounts and datasets.