=============================================================
Chapter 8 -- Descriptive Statistics for Financial Performance
=============================================================

.. |trackd_run| replace:: d08
.. include:: _includes/track_d_run_strip.rst

By Chapter 7 we have an *analysis-ready* General Ledger:

* ``gl_tidy.csv`` -- line-level tidy GL (one row per journal line)
* ``gl_monthly_summary.csv`` -- monthly rollup by account

Chapter 8 answers the next practical question:

**"Now that the accounting data is tidy, how do we summarize performance and variability in a way that helps business decisions?"**

This chapter focuses on descriptive statistics that accountants use every day:

* level (mean / median)
* spread (variance / standard deviation, coefficient of variation)
* tails and skew (quantiles; why "average" can be misleading)
* simple stability checks (rolling mean / rolling std; z-score style flags)

It also includes an A/R-focused section because receivables are a common source of cash-flow surprises in small business.

Learning goals
==============

Accounting goals
----------------

* Turn monthly financial statements into *KPIs* (gross margin %, net margin %, etc.).
* Use variability measures to reason about operational stability.
* Connect A/R behavior to cash-flow risk using practical metrics:

  * credit sales vs collections
  * A/R ending balance and approximate Days Sales Outstanding (DSO)
  * a simple *FIFO application* of collections to invoices to estimate a distribution of "days outstanding".

Python/data goals
-----------------

* Convert long-form statement tables to wide data for analysis (``pivot_table``).
* Compute rolling statistics with ``Series.rolling``.
* Build "analysis artifacts" as CSV + a JSON summary/data dictionary.
* Keep scripts deterministic and testable.
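
The two pandas steps named in the Python/data goals can be sketched as follows. This is a minimal illustration, not the chapter's script: the long-form column names (``month``, ``line``, ``amount``) and the sample figures are assumptions invented for the example, while ``gross_margin_pct`` and the ``_mean_w3`` / ``_std_w3`` / ``_cv`` suffixes follow the output names used in this chapter.

.. code-block:: python

   import pandas as pd

   # Hypothetical long-form income statement: one row per (month, line).
   # Column names and amounts are illustrative assumptions, not the NSO v1 schema.
   long_is = pd.DataFrame(
       {
           "month": ["2025-01", "2025-01", "2025-02", "2025-02",
                     "2025-03", "2025-03", "2025-04", "2025-04"],
           "line": ["revenue", "gross_profit"] * 4,
           "amount": [1000.0, 400.0, 1200.0, 420.0, 900.0, 405.0, 1100.0, 385.0],
       }
   )

   # Long -> wide: one row per month, one column per statement line.
   wide = long_is.pivot_table(
       index="month", columns="line", values="amount", aggfunc="sum"
   )

   # Level: a ratio KPI.
   wide["gross_margin_pct"] = wide["gross_profit"] / wide["revenue"]

   # Stability: 3-month rolling mean and sample standard deviation.
   # The first two months are NaN because the window is not yet full.
   roll = wide["gross_margin_pct"].rolling(window=3)
   wide["gross_margin_pct_mean_w3"] = roll.mean()
   wide["gross_margin_pct_std_w3"] = roll.std()

   # Coefficient of variation over the whole period (std / mean).
   gross_margin_pct_cv = (
       wide["gross_margin_pct"].std() / wide["gross_margin_pct"].mean()
   )

   print(wide.round(4))
   print(f"gross_margin_pct_cv = {gross_margin_pct_cv:.4f}")

Because the input is a fixed in-memory table and no randomness is involved, the same run always produces the same numbers, which is the property the "deterministic and testable" goal asks for.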
Inputs
======

Chapter 8 reads from the NSO v1 synthetic dataset:

* ``chart_of_accounts.csv``
* ``gl_journal.csv``
* ``statements_is_monthly.csv``
* ``statements_bs_monthly.csv``
* ``ar_events.csv`` (added in Chapter 6)

Outputs
=======

This chapter writes the following files to ``outputs/track_d``:

``gl_kpi_monthly.csv``
   A compact monthly "performance dashboard" built from Income Statement + Balance Sheet lines. Includes ratios and rolling statistics.

``ar_monthly_metrics.csv``
   Monthly receivables metrics:

   * credit sales vs collections (from GL)
   * A/R beginning, ending, average (from B/S)
   * A/R turnover and approximate DSO

``ar_payment_slices.csv``
   **Optional-but-recommended** detail table. Each row represents a slice of a collection applied to an invoice under a FIFO assumption. This produces a realistic "days outstanding" distribution even when cash receipts do not explicitly reference invoice numbers.

``ar_days_stats.csv``
   Descriptive stats for "days outstanding" overall and by customer.

``ch08_summary.json``
   Summary metrics, checks, and a data dictionary.

How to run
==========

From the repository root:

.. code-block:: bash

   # 1) (Re)generate the NSO v1 synthetic dataset
   make business-nso-sim

   # 2) Run Chapter 8
   make business-ch08

   # 3) Inspect outputs
   ls outputs/track_d | grep -E "gl_kpi_monthly|ar_monthly_metrics|ar_days_stats|ch08_summary"

How to interpret the results
============================

KPIs: level vs variability
--------------------------

Two businesses can have the same *average* gross margin but very different risk profiles.

* If ``gross_margin_pct_std_w3`` is high, margin is unstable -- pricing, input costs, and product mix might be swinging month to month.
* ``gross_margin_pct_cv`` (coefficient of variation) normalizes volatility by the mean and is useful when comparing different scales.

A/R: why "average DSO" can hide tails
-------------------------------------

The **mean** days outstanding can be pulled upward by a few very late payments.
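
A small numeric sketch makes the pull of the tail concrete. The ``days_outstanding`` values below are invented for illustration (they are not taken from the NSO v1 dataset), the ``mean_days`` and ``median_days`` labels are hypothetical, and ``p90_days`` mirrors the column name in ``ar_days_stats.csv``. Quantiles use pandas' default linear interpolation.

.. code-block:: python

   import pandas as pd

   # Hypothetical days-outstanding values: most customers pay in about a month,
   # two pay very late. Invented for illustration only.
   days_outstanding = pd.Series([25, 28, 30, 30, 31, 32, 33, 35, 90, 120])

   stats = {
       "mean_days": days_outstanding.mean(),
       "median_days": days_outstanding.median(),
       "p90_days": days_outstanding.quantile(0.90),
   }

   # Two late payers out of ten are enough to lift the mean well above the median.
   for name, value in stats.items():
       print(f"{name}: {value:.1f}")
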
The **median** is often a better "typical" payment time.

In ``ar_days_stats.csv`` look for:

* ``p90_days`` / ``p95_days`` -- tail risk (customers who pay very late)
* differences between mean and median -- skewness

The table ``ar_monthly_metrics.csv`` is your month-by-month monitoring view. Large DSO spikes (or big gaps between credit sales and collections) are often early warnings for cash-flow pressure.

Data dictionary highlights
==========================

``gl_kpi_monthly.csv``

* ``gross_margin_pct`` = ``gross_profit / revenue``
* ``net_margin_pct`` = ``net_income / revenue``
* ``*_mean_w3`` and ``*_std_w3`` are 3-month rolling statistics

``ar_monthly_metrics.csv``

* ``credit_sales`` = increase in A/R from invoices (GL A/R debits)
* ``collections`` = decrease in A/R from cash collections (GL A/R credits)
* ``dso_approx`` = ``avg_ar / credit_sales * days_in_month``

``ar_payment_slices.csv``

* ``days_outstanding`` is computed as ``payment_date - invoice_date``
* rows are *amount-weighted* payment slices created by FIFO application

Appendix
--------

See :doc:`business_appendix_ch08_milestone_big_picture` for a big-picture recap of Chapters 1-8 and a roadmap beyond Chapter 8.

Next chapter
============

Chapter 9 focuses on **visualization and reporting that doesn't mislead**. Using the KPIs and A/R artifacts from Chapter 8, we standardize how figures are labeled, how axes are handled (to avoid "chart crimes"), and how to produce a compact executive memo that tells a coherent story from a small chart pack.