Ch 01 — Accounting as a measurement system

PyPI workbook run (Track D)

From inside your Track D workbook folder (created by pystatsv1 workbook init --track d --dest ...), run:

pystatsv1 workbook run |trackd_run|

Outputs are written under outputs/track_d/ by default. If you’re unsure what a file is for, start with Track D Outputs Guide.

To see the full chapter-by-chapter run map (D00–D23), see Track D chapter index (PyPI).

Optional: write to a custom output folder:

pystatsv1 workbook run |trackd_run| --outdir outputs/track_d_custom

Interpretation prompts (quick self-check):

  • What is the accounting or business measurement goal in this chapter?

  • Which invariant/check would catch a “numbers look fine but are wrong” mistake here?

Why this matters (for accountants)

Statistics is only useful if the underlying numbers are meaningful and defensible. Accounting is the measurement layer that makes business analysis possible.

In other words:

  • Bookkeeping creates the data-generating process.

  • Accounting rules define what your measurements mean.

  • Controls and ethics determine whether the measurements are trustworthy.

PyStatsV1’s promise is that we treat analysis like production software:

Don’t just calculate your results — engineer them. We treat statistical analysis like production software.

In Track D, that means every chapter is designed to be:

  • reproducible (seeded simulation, deterministic reruns),

  • traceable (inputs → transformations → outputs),

  • controls-aware (reconciliation checks and “what could go wrong?”), and

  • decision-focused (you end with a short memo, not just numbers).

Learning objectives

By the end of this chapter, you will be able to:

  1. Explain accounting as measurement (not just compliance).

  2. Use the accounting equation to interpret business events.

  3. Describe the core bookkeeping artifacts (journal, ledger, trial balance, statements).

  4. Run a reproducible “mini-close” using PyStatsV1: simulate a tiny ledger, produce statements, and validate the accounting identity checks.

Core terms (fast refresher)

The accounting equation

\[\text{Assets} = \text{Liabilities} + \text{Equity}\]

This equation is the integrity constraint behind a balance sheet.

  • Assets: resources the business controls (cash, receivables, inventory, equipment).

  • Liabilities: obligations the business owes (accounts payable, loans, taxes payable).

  • Equity: the residual claim (owner contributions + retained earnings).

Revenue, expense, and profit

  • Revenue is value earned from customers.

  • Expenses are resources consumed to generate revenue.

  • Net income (profit) is revenue minus expenses for a period.

Double-entry bookkeeping

Every transaction is recorded with at least two entries:

  • total debits = total credits within each transaction, and

  • the system stays consistent with the accounting equation.

Key artifacts

  • Journal entry: one transaction recorded as debit/credit lines.

  • General ledger (GL): all journal lines organized by account.

  • Trial balance: account totals used to validate that debits = credits.

  • Financial statements: summarized views for decision making (income statement, balance sheet).

Accrual vs cash timing (one idea to keep in mind)

Accounting often records economic activity when it is earned/incurred (accrual basis), not necessarily when cash moves. This matters later for forecasting, because:

  • profit does not equal cash, and

  • timing choices can create “patterns” that are artifacts of close processes.

Accounting Connection (PDF refresher)

This chapter refreshes the PDF’s Bookkeeping basics pillar:

  • accounting as measurement,

  • the bookkeeper’s role in financial integrity,

  • the double-entry method,

  • the GL and financial statements as outputs.

Dataset tables used (LedgerLab core)

In Track D we use a synthetic, accounting-shaped dataset family sometimes referred to as LedgerLab.

For Chapters 1–3 we start with a small “core ledger” dataset (e.g., ledgerlab_ch01). Starting in later chapters, Track D’s default running case becomes North Shore Outfitters (NSO v1) written to data/synthetic/nso_v1. In Chapter 1 we start small:

  • chart_of_accounts.csv

  • gl_journal.csv (transaction-level debit/credit lines)

  • trial_balance_monthly.csv

  • statements_is_monthly.csv (baseline month)

  • statements_bs_monthly.csv (baseline month)

What you’ll build in this chapter

You will run a “mini-close” workflow:

  1. Simulate a small month of accounting activity (seeded and reproducible).

  2. Generate a trial balance and basic statements.

  3. Validate core integrity checks:

    • debits = credits per transaction,

    • the accounting equation balances on the balance sheet.

  4. Summarize the month with a few “accountant-friendly” descriptive statistics.

What the automated checks verify (exactly)

When you run make business-ch01, the script prints a small set of controls-style checks. These checks are intentionally “audit-friendly”: they tell you whether the accounting data is internally consistent before you trust any statistics or forecasts derived from it.

These appear under “Checks:” in the console output.

Double-entry transaction check

  • transactions_balanced: for every txn_id, sum(debits) == sum(credits). (If this fails, the GL is not a valid double-entry ledger.)

    Companion diagnostics: - n_transactions: number of transactions observed - n_unbalanced: number of transactions failing the rule - max_abs_diff: worst imbalance magnitude (0.0 is ideal)

Accounting equation tie-out (balance sheet identity)

  • accounting_equation_balances: verifies the balance sheet integrity constraint:

    \[\text{Total Assets} = \text{Total Liabilities + Equity}\]

    Companion diagnostics: - total_assets: computed total assets - total_liabilities_plus_equity: computed total liabilities + equity - abs_diff: absolute difference between the two totals (0.0 is ideal)

Small nonzero differences extremely close to 0 can occur due to floating-point arithmetic; treat anything near machine precision as “effectively zero.”

If a check fails, treat it like a controls exception: trace back to the offending transaction(s), confirm sign conventions, and verify statement rollups.

PyStatsV1 lab (Run it)

Using Python module commands

# 1) Simulate the LedgerLab *core* dataset (Chapter 1 needs only the core tables)
python -m scripts.sim_business_ledgerlab \
  --outdir data/synthetic/ledgerlab_ch01 \
  --seed 123 \
  --month 2025-01 \
  --n-sales 18

# 2) Analyze Chapter 1 and write a summary JSON + plots
python -m scripts.business_ch01_accounting_measurement \
  --datadir data/synthetic/ledgerlab_ch01 \
  --outdir outputs/track_d \
  --seed 123

Outputs you should see

  • Console output showing the integrity checks and key month metrics.

  • A LedgerLab core dataset folder in data/synthetic/ledgerlab_ch01.

  • (Later) the NSO v1 running case in data/synthetic/nso_v1.

  • A chapter output folder in outputs/track_d containing:

    • business_ch01_summary.json

    • business_ch01_cash_balance.png

    • business_ch01_balance_sheet_bar.png

How to modify the scripts (student exercises)

1) Change the business story

Open scripts/sim_business_ledgerlab.py and try changing:

  • the number of sales (--n-sales),

  • the average sale amount,

  • the fraction of sales on account (AR vs cash), or

  • the mix of expenses.

Then rerun the simulator and Chapter 1 analyzer.

2) Add your own integrity check

In scripts/business_ch01_accounting_measurement.py:

  • add a new check to flag negative inventory,

  • validate that revenue is never negative,

  • or compute a simple KPI like gross margin percentage.

Interpretation & decision memo

After you run the lab, answer these memo prompts in 6–10 sentences:

  1. Measurement: What does the accounting equation tell you about this business month?

  2. Trust: Do the integrity checks pass? If they failed, what is the most likely root cause?

  3. Decision support: Based on revenue, expenses, and cash balance, what is one action you would recommend?

  4. Forecasting preview: Which number in the statements would you most want to forecast next, and why?

End-of-chapter problems

1) Accounting equation classification

Classify each event as increasing/decreasing assets, liabilities, or equity:

  • owner contributes cash,

  • inventory purchase on credit,

  • customer sale on account,

  • collecting on AR,

  • paying AP,

  • recording rent expense.

2) Timing and classification gotchas

For each scenario, describe how it can distort analytics:

  • a large invoice posted in the wrong month,

  • an expense miscoded to the wrong account,

  • missing documentation requiring later reclass,

  • a bank reconciliation not performed.

3) PyStatsV1 reproducibility note

Write a short policy (8–12 lines) for how your team will ensure:

  • every analysis can be rerun,

  • outputs are saved with metadata (seed, run date, code version), and

  • changes are reviewed like production code.

What’s next

Chapter 2 treats the general ledger as a dataset: we’ll structure it for analysis, build clean extracts, and set up “analysis-friendly” conventions that make forecasting chapters much easier.

Textbook alignment notes

Textbook Part A: Chapters 1–3.