Glossary (draft)

This is a student-friendly glossary for Track D. Keep definitions short, practical, and tied to what you see in the PyStatsV1 outputs.

Accounting terms

  • Account: A labeled bucket that records a type of financial activity (e.g., Cash, Sales, Rent Expense).

  • Chart of accounts (COA): The full list of accounts a business uses, usually grouped into Assets, Liabilities, Equity, Revenue, Expenses.

  • Account type: A label like Asset/Liability/Equity/Revenue/Expense that helps classify accounts for reporting and analysis.

  • Debit / credit: The two sides of double-entry bookkeeping; in every entry, total debits must equal total credits. In Track D, the scripts apply the sign convention used for analysis; when unsure what a positive number means, check the chapter page or the output summaries.

  • Journal entry: A dated record of the debits and credits for one event. Debits must equal credits, so the signed amounts sum to zero.

  • Posting: One line within a journal entry (one account + amount). A single entry usually has 2+ postings.

  • General ledger (GL): The journal entries viewed “by account over time” (a database-like history, not a formatted report).

  • Transaction vs posting: A transaction is the whole event; postings are the individual lines inside it. Analytics often works on postings, then aggregates back up.

  • Trial balance (TB): A snapshot of balances by account at a point in time; the starting point for building statements.

  • Financial statements: Summaries built from the trial balance using classifications (Income Statement / Balance Sheet / Cash Flow).

  • Balance: The net total in an account after combining debits and credits over a period.

  • Opening balance / beginning balance: The starting balance at the beginning of a period.

  • Accrual vs cash basis: The timing rule for when revenue/expense is recorded (earned/incurred vs when cash moves).

  • Accounts receivable (AR) / accounts payable (AP): AR is money customers owe you (an asset); AP is money you owe suppliers (a liability).

  • Revenue (sales): Money earned from customers for goods/services. Under accrual accounting, it is often recorded before cash is received.

  • Expense: Costs incurred to run the business (rent, wages, utilities). Under accrual accounting, it is often recorded before cash is paid.

  • COGS (cost of goods sold): Direct costs tied to producing/selling products; used to compute gross margin.

  • Gross margin: Revenue minus COGS (often shown as dollars and as a % of revenue).

  • Depreciation: Allocating an asset’s cost over time (a non-cash expense).

  • Reconciliation: A check that two “views” of the world agree (e.g., bank statement vs cash ledger).

  • Materiality: A practical threshold for “big enough to matter.” Helps decide what to investigate first.
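
To make the chain from postings to trial balance to gross margin concrete, here is a minimal sketch in plain Python. The entries, account names, and the sign convention (debits positive, credits negative) are assumptions made for this example only; they are not the Track D contract, and the PyStatsV1 scripts may use a different convention.

```python
from collections import defaultdict

# Hypothetical postings: (entry_id, account, account_type, amount).
# Assumed sign convention for this sketch: debits positive, credits negative.
postings = [
    ("JE0", "Inventory", "Asset", 1000.0),
    ("JE0", "Accounts Payable", "Liability", -1000.0),
    ("JE1", "Accounts Receivable", "Asset", 500.0),
    ("JE1", "Sales", "Revenue", -500.0),
    ("JE2", "COGS", "Expense", 300.0),
    ("JE2", "Inventory", "Asset", -300.0),
    ("JE3", "Cash", "Asset", 200.0),
    ("JE3", "Accounts Receivable", "Asset", -200.0),
]


def unbalanced_entries(rows, tol=0.005):
    """Return ids of journal entries whose postings do not sum to zero."""
    totals = defaultdict(float)
    for entry_id, _account, _type, amount in rows:
        totals[entry_id] += amount
    return [eid for eid, total in totals.items() if abs(total) > tol]


def trial_balance(rows):
    """Net balance per account across all postings."""
    balances = defaultdict(float)
    for _entry_id, account, _type, amount in rows:
        balances[account] += amount
    return dict(balances)


tb = trial_balance(postings)
revenue = -tb["Sales"]  # credits are negative here, so flip the sign
gross_margin = revenue - tb["COGS"]
gross_margin_pct = gross_margin / revenue

print(unbalanced_entries(postings))  # ids of entries that fail the balance check
print(tb)
print(gross_margin, gross_margin_pct)
```

Each journal entry is checked on its own (a reconciliation-style test), while the trial balance nets the same postings by account. Because every entry balances, the trial balance as a whole also sums to zero.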

Analytics terms

  • Observation / row: One record in a table (e.g., one posting, one invoice, one daily total).

  • Metric: A number you track (daily revenue proxy, monthly payroll total, cash balance).

  • Aggregation: Summarizing many rows into totals by day/month/category (the main move in Track D).

  • Grouping key: The fields you aggregate by (date, month, account, department, customer segment).

  • Tidy data: A table where each row is one observation and each column is one variable (easy to filter, group, and plot).

  • Time series: A metric tracked over time (daily revenue proxy, monthly expenses, weekly cash balance).

  • Baseline: A simple comparison point (last month, last year, moving average, seasonal naive).

  • Variance: A change between two periods or scenarios (this month vs last month; actual vs budget). In this glossary it means a difference, not the statistical variance of a distribution.

  • Driver: The category/account that explains most of a variance (the “why” behind the change).

  • Decomposition: Breaking a total change into parts (e.g., which accounts explain revenue growth).

  • KPI: A key performance indicator (e.g., gross margin %, days-to-pay, cash coverage).

  • Distribution: The spread of values (helpful for typical vs unusual transactions).

  • Outlier: A value that is unusual relative to typical observations (not automatically an error).

  • Seasonality: Predictable repeating patterns over time (holidays, summer sales, payroll cycles).

  • Structural break: A real change in how the business operates that makes “past ≈ future” less reliable.

  • Backtest: Testing a forecasting method on past data (train earlier, test later).

  • Error metric: A summary of forecast accuracy (e.g., MAE, mean absolute error; MAPE, mean absolute percentage error). Lower is usually better, but context matters.
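
Several of the terms above (aggregation, grouping key, variance, driver, decomposition, baseline, backtest, error metric) fit together in a few lines of code. The sketch below uses plain Python and made-up numbers; the function names are hypothetical and are not part of PyStatsV1. The "last value" baseline stands in for the simplest possible forecast, scored one step ahead on past data.

```python
from collections import defaultdict

# Hypothetical expense postings: (month, account, amount). Illustrative values only.
rows = [
    ("2024-01", "Rent", 1000.0),
    ("2024-01", "Wages", 4000.0),
    ("2024-01", "Utilities", 300.0),
    ("2024-02", "Rent", 1000.0),
    ("2024-02", "Wages", 4600.0),
    ("2024-02", "Utilities", 250.0),
]


def aggregate(rows):
    """Aggregation: sum amounts by the grouping key (month, account)."""
    totals = defaultdict(float)
    for month, account, amount in rows:
        totals[(month, account)] += amount
    return totals


def decompose_variance(totals, base, current):
    """Decomposition: per-account change between two months."""
    accounts = {acct for (_month, acct) in totals}
    return {
        acct: totals.get((current, acct), 0.0) - totals.get((base, acct), 0.0)
        for acct in accounts
    }


def backtest_last_value(series):
    """Score a 'last value' baseline: each point is forecast by the one before it."""
    actuals, forecasts = series[1:], series[:-1]
    abs_errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    mae = sum(abs_errors) / len(abs_errors)
    # MAPE divides by the actual value, so it is undefined when an actual is zero.
    mape = sum(e / abs(a) for e, a in zip(abs_errors, actuals)) / len(abs_errors)
    return mae, mape


totals = aggregate(rows)
changes = decompose_variance(totals, base="2024-01", current="2024-02")
total_variance = sum(changes.values())
driver = max(changes, key=lambda acct: abs(changes[acct]))  # largest absolute change

mae, mape = backtest_last_value([100.0, 110.0, 100.0, 125.0])
print(total_variance, driver, mae, mape)
```

The per-account changes add up to the total variance, which is what makes the decomposition useful: the account with the largest absolute change is the first candidate driver to investigate.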

Track D + BYOD terms

  • Track D: The “big picture” track, which applies statistics to accounting data using reproducible scripts and artifacts.

  • Workflow loop: Export → normalize → validate → analyze → communicate (repeat this across chapters and BYOD projects).

  • Dataset contract: The required table names + column headers + meanings that scripts assume.

  • Canonical dataset: A known-good demo dataset shipped with the workbook (used for learning and expected outputs).

  • Source export: A CSV export from your accounting system (often messy and source-specific).

  • BYOD (Bring Your Own Data): Using your own accounting exports instead of the canonical demos.

  • BYOD project folder: A reproducible folder created by pystatsv1 trackd byod init (contains config.toml, tables/, and outputs).

  • config.toml: The project’s settings file (profile + adapter + any source-specific knobs).

  • tables/: Where you place raw exports (source-specific CSVs).

  • Adapter: Code that converts a source export into the Track D contract (repeatable + testable cleanup).

  • Normalize / normalization: Running pystatsv1 trackd byod normalize to produce canonical outputs under normalized/.

  • normalized/: Canonical tables produced by normalization (typically normalized/gl_journal.csv and normalized/chart_of_accounts.csv).

  • Schema: The expected columns + types for a table (what must exist for scripts to run correctly).

  • Validate: A fast schema + sanity check that catches missing columns, bad types, and common structural problems.

  • Daily totals: A first analysis-ready time series derived from normalized/gl_journal.csv (often written to normalized/daily_totals.csv).

  • Profile: A preset that defines which tables/columns are required for a workflow (e.g., core_gl).

  • Artifacts: The outputs created by scripts (tidy CSVs, figures, JSON summaries, short memos) under outputs/track_d/.

  • Reproducible: Someone else can rerun your project and get the same tables/figures (given the same inputs).
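
The validate and daily-totals ideas can be illustrated with a small standalone sketch. It does not call PyStatsV1 and does not reproduce the real dataset contract: the required column names (date, account_id, amount), the sample rows, and the function names are all assumptions chosen for illustration. Check the actual contract and the CLI's --help for the real columns and behavior.

```python
import csv
import io
from collections import defaultdict

# Assumed columns for this sketch only; the real contract may differ.
REQUIRED_COLUMNS = {"date", "account_id", "amount"}

# Stand-in for a normalized journal table, kept in memory so the sketch is self-contained.
SAMPLE_GL = """date,account_id,amount
2024-01-02,1100,500.00
2024-01-02,4000,-500.00
2024-01-02,1100,250.00
2024-01-02,4000,-250.00
2024-01-03,5000,300.00
2024-01-03,1200,-300.00
"""


def load_and_validate(text):
    """Return (rows, problems): a schema check first, then a type check per row."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    problems = [f"missing column: {name}" for name in sorted(missing)]
    rows = []
    if problems:
        return rows, problems
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            rows.append((row["date"], row["account_id"], float(row["amount"])))
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: amount is not a number")
    return rows, problems


def daily_totals(rows):
    """Sum amounts per (date, account_id): a first analysis-ready time series."""
    totals = defaultdict(float)
    for date, account_id, amount in rows:
        totals[(date, account_id)] += amount
    return dict(totals)


rows, problems = load_and_validate(SAMPLE_GL)
if problems:
    print("validation failed:", problems)
else:
    for key, total in sorted(daily_totals(rows).items()):
        print(key, total)
```

Failing fast on missing columns before any analysis runs mirrors the order of the workflow loop: validate comes before analyze, so later steps can assume the schema holds.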

Note

Keep this glossary short and student-friendly. If you add a new term, prefer a one-sentence definition plus one concrete example. If anything here conflicts with the CLI or its outputs, treat the code as the source of truth and update the docs and --help text to match what it actually does.