Chapter 2 – Getting started with R (for Python-first learners)

This chapter is adapted from Chapter 2 (Introduction to R) of Applied Statistics with R by David Dalpiaz, reworked for a Python-first audience and the PyStatsV1 project. The original text is available at

Our goal here is to give you enough R to:

  • understand R-based textbook examples, and

  • compare them with the Python scripts in PyStatsV1.

You do not need to become an R expert to use this project, but seeing the same ideas in both languages is extremely valuable.

Why learn a bit of R if you like Python?

Many applied statistics courses and textbooks are written with R code. If you only know Python, it can feel like you are always “translating in your head”.

PyStatsV1 is designed to reduce that friction:

  • the ideas come from an R-first textbook,

  • the code you actually run is Python,

  • and R appears as a reference point.

A small investment in R gives you:

  • direct access to the original examples,

  • a second way to check your understanding,

  • and a broader mental model of how statistical software works.

Installing R and RStudio / Posit

R is both:

  • a language, and

  • a software environment for statistical computing.

To follow along with the original book or to compare outputs, you will usually install two pieces of software:

  1. R from the Comprehensive R Archive Network (CRAN).

  2. RStudio Desktop (now maintained by Posit), an IDE that provides a console, editor, plots pane, and more.

In very rough steps:

  • visit CRAN,

  • download the latest R version for your operating system,

  • then install RStudio Desktop and let it find your R installation.

(Exact installation instructions change over time; the official websites will have the current details.)

Once installed, you can:

  • run R directly from a terminal by typing R, or

  • open RStudio and work in its console and editor.

In this guide we will focus on the language, not the IDE, but RStudio is an excellent environment for learning.

Using R as a calculator (and comparing with Python)

Both R and Python can be used as fancy calculators. To warm up, we will mirror a few basic operations. The point is not to memorize every line, but to see how close the two languages really are.

Basic arithmetic

Addition, subtraction, multiplication, and division:

  • Python:

    3 + 2
    3 - 2
    3 * 2
    3 / 2    # 1.5
    
  • R:

    3 + 2
    3 - 2
    3 * 2
    3 / 2    # 1.5
    

Exponentiation and roots

  • Python (** for powers):

    3 ** 2          # 9
    2 ** -3         # 0.125
    100 ** 0.5      # 10
    import math
    math.sqrt(100)  # 10
    
  • R (^ for powers):

    3 ^ 2           # 9
    2 ^ (-3)        # 0.125
    100 ^ 0.5       # 10
    sqrt(100)       # 10
    

Constants and logarithms

R and Python both provide common mathematical constants and log functions.

  • Python (using math):

    import math
    
    math.pi          # 3.14159...
    math.e           # 2.71828...
    
    math.log(math.e)          # natural log
    math.log10(1000)          # base 10
    math.log2(8)              # base 2
    
  • R:

    pi               # 3.14159...
    exp(1)           # 2.71828...
    
    log(exp(1))      # natural log
    log10(1000)      # base 10
    log2(8)          # base 2
    log(16, base = 4)  # log base 4
    

Trigonometry

Angles are in radians in both languages.

  • Python:

    import math
    math.sin(math.pi / 2)  # 1.0
    math.cos(0)            # 1.0
    
  • R:

    sin(pi / 2)            # 1
    cos(0)                 # 1
    

The takeaway: if you can read basic Python math, R’s syntax is only a small step away.

Getting help in R

In the examples above we used functions such as sqrt(), exp(), log() and sin(). R has built-in documentation for these.

To open the help page for a function in R, type a question mark in front of its name at the console:

?log
?sin
?paste
?lm

RStudio displays the documentation in the Help pane. This help system is extremely rich and is often the best first place to look when you are unsure what a function does.

When that is not enough, general advice from the original text applies equally to R and Python:

  • Start by searching the exact error message.

  • If you ask another human for help, include: - what you expected the code to do, - what actually happened (including the full error), - and enough code to reproduce the problem.

This mindset carries over directly to PyStatsV1. When you open a GitHub issue, the same principles make it much easier for maintainers and instructors to help you.

Installing and loading packages in R

Base R ships with many functions and datasets, but one of its biggest strengths is the package system. Packages provide:

  • new functions,

  • new data sets,

  • or sometimes entire modeling frameworks.

There are two steps to using a package:

  1. Install it (once per machine), and

  2. Load it in each R session where you need it.

Installation

Use install.packages() to download and install from CRAN:

install.packages("ggplot2")

Think of this as adding a new cookbook to your shelf.

Loading

Each time you start R, call library() for any packages you want to use in that session:

library(ggplot2)

This is like taking the cookbook off the shelf and opening it. When you exit R, the package is no longer loaded, but it remains installed.

Python analogy

If you are used to Python:

  • install.packages("ggplot2") is similar to pip install matplotlib seaborn.

  • library(ggplot2) is similar to:

    import matplotlib.pyplot as plt
    

You install once per environment, but you import / load in every session.

Style guides and cheatsheets

The original R text recommends using a style guide and checking out the RStudio / Posit “cheatsheets” for base R. The exact style you follow is less important than being consistent.

For R, two well-known guides are:

  • Hadley Wickham’s style guide from Advanced R.

  • Google’s R style guide.

PyStatsV1 does not enforce a particular R style, but we do:

  • encourage consistency within an analysis,

  • follow PEP 8–style conventions on the Python side,

  • and use tools like ruff and pytest to keep scripts clean.

In your own work, it is worth choosing a small set of rules for:

  • how you name variables (snake_case vs CamelCase),

  • where you put spaces around operators,

  • and how you format function calls.

Suggested exercises

To solidify the ideas from this chapter, try the following:

  1. Mirror basic calculations. Open a Python REPL and an R console side by side. Recreate the examples above and invent a few of your own. Verify that both languages give the same results.

  2. Explore documentation. In R, run ?log and ?lm and skim the help pages. In Python, use help(math.log) or check the NumPy / SciPy docs for similar functions. Notice the similarities in structure.

  3. Install and load a package. In R, install and load ggplot2. In Python, install and import a plotting library you like (Matplotlib, Seaborn, Plotnine, …) and create a tiny plot in each language.

  4. Practice asking for help. Take an error message from either language (you can even create one on purpose), and write a short, clear question that you would be comfortable posting on a help forum or sending to an instructor.

How this chapter connects to PyStatsV1

In later chapters of PyStatsV1, you will see:

  • R-based textbook examples,

  • matching Python scripts in this repository,

  • and, where helpful, R snippets for comparison.

This chapter is your “minimal R toolkit” for understanding those connections. Whenever you see an R command that you do not recognize, you can:

  • revisit this chapter,

  • consult the original Applied Statistics with R text, or

  • open a GitHub Discussion to ask a question.

License and attribution

This chapter is a derivative work based on content from Applied Statistics with R by David Dalpiaz, used under the terms of the Creative Commons Attribution–NonCommercial–ShareAlike 4.0 International License. You can view the full license at:

We thank David Dalpiaz and contributors to the original text. PyStatsV1 extends their work by providing Python implementations, command-line workflows, and teaching-focused documentation for a Python + R environment.