.. _business-ch01:

Ch 01 — Accounting as a measurement system
==========================================

.. |trackd_run| replace:: d01

.. include:: _includes/track_d_run_strip.rst

Why this matters (for accountants)
----------------------------------

Statistics is only useful if the underlying numbers are *meaningful* and *defensible*.
Accounting is the measurement layer that makes business analysis possible.

In other words:

* Bookkeeping creates the **data-generating process**.
* Accounting rules define what your measurements *mean*.
* Controls and ethics determine whether the measurements are trustworthy.

PyStatsV1’s promise is that we treat analysis like production software:

**Don’t just calculate your results — engineer them. We treat statistical analysis like production software.**

In Track D, that means every chapter is designed to be:

* **reproducible** (seeded simulation, deterministic reruns),
* **traceable** (inputs → transformations → outputs),
* **controls-aware** (reconciliation checks and “what could go wrong?”), and
* **decision-focused** (you end with a short memo, not just numbers).

Learning objectives
-------------------

By the end of this chapter, you will be able to:

1. Explain accounting as measurement (not just compliance).
2. Use the accounting equation to interpret business events.
3. Describe the core bookkeeping artifacts (journal, ledger, trial balance, statements).
4. Run a reproducible “mini-close” using PyStatsV1: simulate a tiny ledger, produce statements,
   and validate the accounting identity checks.

Core terms (fast refresher)
---------------------------

The accounting equation
^^^^^^^^^^^^^^^^^^^^^^^

.. math::

   \text{Assets} = \text{Liabilities} + \text{Equity}

This equation is the integrity constraint behind a balance sheet.

* **Assets**: resources the business controls (cash, receivables, inventory, equipment).
* **Liabilities**: obligations the business owes (accounts payable, loans, taxes payable).
* **Equity**: the residual claim (owner contributions + retained earnings).

Revenue, expense, and profit
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Revenue** is value earned from customers.
* **Expenses** are resources consumed to generate revenue.
* **Net income** (profit) is revenue minus expenses for a period.

Double-entry bookkeeping
^^^^^^^^^^^^^^^^^^^^^^^^

Every transaction is recorded with at least two entries:

* total **debits** = total **credits** within each transaction, and
* the system stays consistent with the accounting equation.

Key artifacts
^^^^^^^^^^^^^

* **Journal entry**: one transaction recorded as debit/credit lines.
* **General ledger (GL)**: all journal lines organized by account.
* **Trial balance**: account totals used to validate that debits = credits.
* **Financial statements**: summarized views for decision making (income statement, balance sheet).

Accrual vs cash timing (one idea to keep in mind)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Accounting often records economic activity when it is **earned/incurred** (accrual basis),
not necessarily when cash moves. This matters later for forecasting, because:

* profit does **not** equal cash, and
* timing choices can create “patterns” that are artifacts of close processes.

Accounting Connection (PDF refresher)
-------------------------------------

This chapter refreshes the PDF’s **Bookkeeping basics** pillar:

* accounting as measurement,
* the bookkeeper’s role in financial integrity,
* the double-entry method,
* the GL and financial statements as outputs.

Dataset tables used (LedgerLab core)
------------------------------------

In Track D we use a synthetic, accounting-shaped dataset family sometimes referred to as **LedgerLab**.

For Chapters 1–3 we start with a **small “core ledger” dataset** (e.g., ``ledgerlab_ch01``).
Starting in later chapters, Track D’s default running case becomes
**North Shore Outfitters (NSO v1)** written to ``data/synthetic/nso_v1``.
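
To make “accounting-shaped” concrete, here is a minimal sketch of journal lines and the two roll-ups described under Core terms: a per-transaction debit/credit check and a trial balance. The tuple layout, account names, and amounts are illustrative assumptions (only ``txn_id`` is named in this chapter); this is not the LedgerLab schema or the PyStatsV1 implementation.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical journal lines: (txn_id, account, debit, credit).
# Everything except the txn_id idea is an assumption made for illustration.
JOURNAL = [
    ("T1", "Cash", Decimal("1000.00"), Decimal("0")),
    ("T1", "Owner Equity", Decimal("0"), Decimal("1000.00")),
    ("T2", "Inventory", Decimal("400.00"), Decimal("0")),
    ("T2", "Accounts Payable", Decimal("0"), Decimal("400.00")),
    ("T3", "Accounts Receivable", Decimal("250.00"), Decimal("0")),
    ("T3", "Sales Revenue", Decimal("0"), Decimal("250.00")),
]


def unbalanced_transactions(lines):
    """Return {txn_id: debits - credits} for transactions that do not balance."""
    diff = defaultdict(Decimal)  # Decimal() is zero
    for txn_id, _account, debit, credit in lines:
        diff[txn_id] += debit - credit
    return {txn_id: d for txn_id, d in diff.items() if d != 0}


def trial_balance(lines):
    """Return {account: (total_debits, total_credits)} across all journal lines."""
    totals = defaultdict(lambda: [Decimal("0"), Decimal("0")])
    for _txn_id, account, debit, credit in lines:
        totals[account][0] += debit
        totals[account][1] += credit
    return {account: (d, c) for account, (d, c) in totals.items()}


tb = trial_balance(JOURNAL)
total_debits = sum(d for d, _ in tb.values())
total_credits = sum(c for _, c in tb.values())

print("unbalanced transactions:", unbalanced_transactions(JOURNAL))
print("trial balance totals:", total_debits, total_credits)
```

The sketch uses ``Decimal`` rather than floats so that a balanced ledger compares exactly equal; a float-based pipeline would instead compare against a small tolerance.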
In Chapter 1 we start small:

* ``chart_of_accounts.csv``
* ``gl_journal.csv`` (transaction-level debit/credit lines)
* ``trial_balance_monthly.csv``
* ``statements_is_monthly.csv`` (baseline month)
* ``statements_bs_monthly.csv`` (baseline month)

What you’ll build in this chapter
---------------------------------

You will run a “mini-close” workflow:

1. **Simulate** a small month of accounting activity (seeded and reproducible).
2. **Generate** a trial balance and basic statements.
3. **Validate** core integrity checks:

   * debits = credits per transaction,
   * the accounting equation balances on the balance sheet.

4. **Summarize** the month with a few “accountant-friendly” descriptive statistics.

What the automated checks verify (exactly)
------------------------------------------

When you run ``make business-ch01``, the script prints a small set of controls-style checks.
These checks are intentionally “audit-friendly”: they tell you whether the accounting data is
internally consistent before you trust any statistics or forecasts derived from it.

These appear under “Checks:” in the console output.

Double-entry transaction check
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- ``transactions_balanced``: for every ``txn_id``, sum(debits) == sum(credits).
  (If this fails, the GL is not a valid double-entry ledger.)

Companion diagnostics:

- ``n_transactions``: number of transactions observed
- ``n_unbalanced``: number of transactions failing the rule
- ``max_abs_diff``: worst imbalance magnitude (0.0 is ideal)

Accounting equation tie-out (balance sheet identity)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- ``accounting_equation_balances``: verifies the balance sheet integrity constraint:

  .. math::

     \text{Total Assets} = \text{Total Liabilities + Equity}

Companion diagnostics:

- ``total_assets``: computed total assets
- ``total_liabilities_plus_equity``: computed total liabilities + equity
- ``abs_diff``: absolute difference between the two totals (0.0 is ideal)

Small nonzero differences extremely close to 0 can occur due to floating-point arithmetic;
treat anything near machine precision as “effectively zero.”

If a check fails, treat it like a controls exception: trace back to the offending transaction(s),
confirm sign conventions, and verify statement rollups.

PyStatsV1 lab (Run it)
----------------------

Using Makefile targets (recommended)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   make business-sim
   make business-ch01

Using Python module commands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   # 1) Simulate the LedgerLab *core* dataset (Chapter 1 needs only the core tables)
   python -m scripts.sim_business_ledgerlab \
     --outdir data/synthetic/ledgerlab_ch01 \
     --seed 123 \
     --month 2025-01 \
     --n-sales 18

   # 2) Analyze Chapter 1 and write a summary JSON + plots
   python -m scripts.business_ch01_accounting_measurement \
     --datadir data/synthetic/ledgerlab_ch01 \
     --outdir outputs/track_d \
     --seed 123

Outputs you should see
^^^^^^^^^^^^^^^^^^^^^^

* Console output showing the integrity checks and key month metrics.
* A LedgerLab core dataset folder in ``data/synthetic/ledgerlab_ch01``.
* (Later) the NSO v1 running case in ``data/synthetic/nso_v1``.
* A chapter output folder in ``outputs/track_d`` containing:

  * ``business_ch01_summary.json``
  * ``business_ch01_cash_balance.png``
  * ``business_ch01_balance_sheet_bar.png``

How to modify the scripts (student exercises)
---------------------------------------------

1) Change the business story
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Open ``scripts/sim_business_ledgerlab.py`` and try changing:

* the number of sales (``--n-sales``),
* the average sale amount,
* the fraction of sales on account (AR vs cash), or
* the mix of expenses.

Then rerun the simulator and Chapter 1 analyzer.

2) Add your own integrity check
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In ``scripts/business_ch01_accounting_measurement.py``:

* add a new check to flag negative inventory,
* validate that revenue is never negative,
* or compute a simple KPI like gross margin percentage.

Interpretation & decision memo
------------------------------

After you run the lab, answer these memo prompts in 6–10 sentences:

1. **Measurement:** What does the accounting equation tell you about this business month?
2. **Trust:** Do the integrity checks pass? If they failed, what is the most likely root cause?
3. **Decision support:** Based on revenue, expenses, and cash balance, what is one action you would recommend?
4. **Forecasting preview:** Which number in the statements would you most want to forecast next, and why?

End-of-chapter problems
-----------------------

1) Accounting equation classification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Classify each event as increasing/decreasing assets, liabilities, or equity:

* owner contributes cash,
* inventory purchase on credit,
* customer sale on account,
* collecting on AR,
* paying AP,
* recording rent expense.
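
One way to sanity-check a classification is to encode the event as signed changes to assets, liabilities, and equity and confirm that the accounting equation still holds. The sketch below is a hypothetical helper (``Position`` and ``apply_event`` are names invented here, not part of PyStatsV1), and the example event, a bank loan, is deliberately not one of the events in the problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Balance sheet totals in whole currency units (an illustrative simplification)."""
    assets: int = 0
    liabilities: int = 0
    equity: int = 0

    def balances(self) -> bool:
        # Assets = Liabilities + Equity
        return self.assets == self.liabilities + self.equity


def apply_event(pos, d_assets=0, d_liabilities=0, d_equity=0):
    """Apply signed changes and return the new Position.

    Raises ValueError if the proposed classification breaks the accounting equation.
    """
    new = Position(
        pos.assets + d_assets,
        pos.liabilities + d_liabilities,
        pos.equity + d_equity,
    )
    if not new.balances():
        raise ValueError("classification breaks Assets = Liabilities + Equity")
    return new


start = Position()

# Example event NOT in the problem list: the business takes a 5,000 bank loan.
# Cash (an asset) rises and loans payable (a liability) rises by the same amount.
after_loan = apply_event(start, d_assets=5000, d_liabilities=5000)
print(after_loan)
```

An event that only moves value between two asset accounts has a net asset change of zero, so it passes this check trivially; the helper confirms that a classification is consistent, not that it is correct.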
2) Timing and classification gotchas
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For each scenario, describe how it can distort analytics:

* a large invoice posted in the wrong month,
* an expense miscoded to the wrong account,
* missing documentation requiring later reclass,
* a bank reconciliation not performed.

3) PyStatsV1 reproducibility note
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Write a short policy (8–12 lines) for how your team will ensure:

* every analysis can be rerun,
* outputs are saved with metadata (seed, run date, code version), and
* changes are reviewed like production code.

What’s next
-----------

Chapter 2 treats the general ledger as a dataset: we’ll structure it for analysis, build clean extracts,
and set up “analysis-friendly” conventions that make forecasting chapters much easier.

Textbook alignment notes
------------------------

Textbook Part A: Chapters 1–3.