Accounting data as a dataset pipeline
Why this exists: Students often know debits and credits but not how those entries become an analyzable dataset. This page bridges that gap.
Learning objectives
Describe the path from business events to statements and analytics.
Recognize the differences among a chart of accounts, a journal, a ledger, and a trial balance.
Explain what a “normalization step” does and why it matters.
Outline
From events to reports
Business event → journal entry (date, accounts, amounts, memo).
Journal entries are the “source record”; the ledger is the “by-account view” of those entries.
Trial balance is a snapshot of balances by account.
Statements are views built from the trial balance and classifications.
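The chain above can be sketched in a few lines of Python. This is a minimal illustration, not the Track D implementation. The field names (entry_id, date, account, amount, memo), the sample entries, and the sign convention (debits positive, credits negative) are all assumptions made for the example.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical journal lines: one row per account touched by an entry.
# Assumed sign convention: debits are positive, credits are negative.
journal = [
    {"entry_id": 1, "date": "2024-01-05", "account": "Cash",
     "amount": Decimal("1000.00"), "memo": "Owner investment"},
    {"entry_id": 1, "date": "2024-01-05", "account": "Owner Equity",
     "amount": Decimal("-1000.00"), "memo": "Owner investment"},
    {"entry_id": 2, "date": "2024-01-09", "account": "Supplies Expense",
     "amount": Decimal("150.00"), "memo": "Office supplies"},
    {"entry_id": 2, "date": "2024-01-09", "account": "Cash",
     "amount": Decimal("-150.00"), "memo": "Office supplies"},
]


def build_ledger(lines):
    """Group journal lines by account (the 'by-account view')."""
    ledger = defaultdict(list)
    for line in lines:
        ledger[line["account"]].append(line)
    return dict(ledger)


def trial_balance(ledger):
    """Collapse each account's lines into a single signed balance."""
    return {
        account: sum((line["amount"] for line in lines), Decimal("0"))
        for account, lines in ledger.items()
    }


ledger = build_ledger(journal)
tb = trial_balance(ledger)

for account, balance in tb.items():
    print(f"{account:<20} {balance:>10}")

# Under the signed-amount convention, a balanced set of entries sums to zero.
print("Balanced:", sum(tb.values()) == 0)
```

Note that the ledger and trial balance are both derived from the journal; nothing new is recorded at those steps, which is why the journal serves as the source record.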
From reports to analysis
Analytics usually starts from the journal/ledger (not the formatted financial statements).
We create time series (daily/monthly totals), ratios, and variance explanations.
We then ask: what changed, why, and what should we do next?
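As a rough sketch of the analysis steps above, the following builds a monthly time series for one account from journal lines and computes the month-over-month change. The sample rows, the field names, and the helper names monthly_totals and month_over_month are hypothetical, and dates are assumed to be ISO-formatted strings (YYYY-MM-DD).

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical journal lines (signed amounts, debits positive by assumption).
lines = [
    {"date": "2024-01-09", "account": "Supplies Expense", "amount": Decimal("150.00")},
    {"date": "2024-01-22", "account": "Supplies Expense", "amount": Decimal("50.00")},
    {"date": "2024-02-03", "account": "Supplies Expense", "amount": Decimal("260.00")},
    {"date": "2024-02-03", "account": "Cash", "amount": Decimal("-260.00")},
]


def monthly_totals(lines, account):
    """Sum one account's amounts per calendar month, keyed by 'YYYY-MM'."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for line in lines:
        if line["account"] == account:
            month = line["date"][:7]  # relies on ISO 'YYYY-MM-DD' dates
            totals[month] += line["amount"]
    return dict(sorted(totals.items()))


def month_over_month(totals):
    """Change from each month to the next, keyed by the later month."""
    months = list(totals)
    return {cur: totals[cur] - totals[prev] for prev, cur in zip(months, months[1:])}


totals = monthly_totals(lines, "Supplies Expense")
changes = month_over_month(totals)

for month, total in totals.items():
    print(month, total, "change:", changes.get(month, "n/a"))
```

The numbers only answer "what changed"; the memo fields on the underlying journal lines are usually where the "why" starts.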
Where BYOD fits
Different systems export different CSV shapes.
Adapters convert exports into the Track D canonical tables.
After normalization, you typically work from normalized/gl_journal.csv (plus normalized/chart_of_accounts.csv).
After normalization, analysis scripts don’t care where the data came from.
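To make the adapter idea concrete, here is a minimal sketch of one. It is an illustration only: this page does not define the real column layout of normalized/gl_journal.csv, so the target columns used here (date, account, amount, memo), the source export columns, and the function names adapt_row and normalize are all hypothetical.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Hypothetical export from some accounting system: separate Debit/Credit
# columns and US-style dates. Real exports will differ.
RAW_EXPORT = """Txn Date,Acct,Debit,Credit,Description
01/05/2024,Cash,1000.00,,Owner investment
01/05/2024,Owner Equity,,1000.00,Owner investment
"""

# Assumed target shape for illustration; not the documented Track D schema.
TARGET_COLUMNS = ["date", "account", "amount", "memo"]


def adapt_row(row):
    """Map one source row to the assumed target shape."""
    debit = Decimal(row["Debit"] or "0")
    credit = Decimal(row["Credit"] or "0")
    return {
        "date": datetime.strptime(row["Txn Date"], "%m/%d/%Y").date().isoformat(),
        "account": row["Acct"].strip(),
        "amount": str(debit - credit),  # signed: debits positive, credits negative
        "memo": row["Description"].strip(),
    }


def normalize(raw_text):
    """Read a source export and return rows in the assumed target shape."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [adapt_row(row) for row in reader]


rows = normalize(RAW_EXPORT)

# Write the normalized rows as CSV text (to a string here, not to disk).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=TARGET_COLUMNS)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Each source system would get its own adapt_row, while everything downstream reads the same columns. That separation is what lets analysis scripts ignore where the data came from.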
Where this connects in the workbook
Track D Dataset Map (what tables exist and what they mean)
Track D BYOD: Bring Your Own Data (the adapter/normalize/validate workflow)
Note
This page is intentionally an outline right now. Expand it incrementally as we refine the Track D narrative.