By Tison Brokenshire

Statistics Formulas Cheat Sheet for Students: Every Formula You Need
You open your statistics textbook to review for the exam. Chapter 3 has one set of formulas. Chapter 7 introduces completely different notation. Chapter 10 adds Greek letters that look identical but mean different things. The formulas are scattered across 400 pages with no single reference that pulls them together. You flip back and forth, losing time and confidence.
This is not a study problem. It is a format problem. Statistics textbooks teach concepts sequentially but never consolidate the formulas into one place. The PDFs floating around online are either too dense (Stanford's 12-page reference) or too sparse (a single page missing half the formulas you need). Most are not mobile-friendly, and none explain when to use each formula.
This cheat sheet solves that. Every core formula from introductory statistics is organized by topic, presented in clean tables, and paired with a plain-English explanation of what it calculates and when you need it. Bookmark this page and use it alongside your textbook, homework, and exam prep.
Descriptive Statistics
Descriptive statistics summarize a dataset. These formulas tell you the center, spread, and shape of your data.
Measures of Central Tendency
| Formula | Name | What It Calculates | When to Use |
|---|---|---|---|
| x̄ = Σxᵢ / n | Sample Mean | Average value of the dataset | Default measure of center for symmetric data |
| Median = middle value when sorted | Median | The value that splits the dataset in half | Use when data is skewed or has outliers |
| Mode = most frequent value | Mode | The value that appears most often | Use for categorical data or to find peaks |
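As a minimal sketch, the three measures of center can be computed with Python's standard `statistics` module. The `scores` list is made-up data used only for illustration.

```python
import statistics

# Hypothetical exam scores, invented for this illustration.
scores = [72, 85, 85, 90, 98]

mean = statistics.mean(scores)      # sum of the values divided by n
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)
```

For this list the mean is 86, and the median and mode are both 85.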
Measures of Spread
| Formula | Name | What It Calculates | When to Use |
|---|---|---|---|
| Range = max − min | Range | Distance between the largest and smallest values | Quick but rough measure of spread |
| s² = Σ(xᵢ − x̄)² / (n − 1) | Sample Variance | Average squared deviation from the mean | When you need spread in squared units |
| s = √[Σ(xᵢ − x̄)² / (n − 1)] | Sample Standard Deviation | Typical distance of values from the mean, in the original units | Most common measure of spread |
| σ² = Σ(xᵢ − μ)² / N | Population Variance | Variance using the entire population | Only when you have the full population |
| σ = √[Σ(xᵢ − μ)² / N] | Population Standard Deviation | Standard deviation for the full population | Rare; usually use the sample version |
| IQR = Q3 − Q1 | Interquartile Range | Spread of the middle 50% of data | Use with median for skewed data |
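The spread formulas above can be sketched with the same standard `statistics` module. The `data` list is hypothetical. Note that textbooks differ on how quartiles are computed, so the IQR from `statistics.quantiles` (which uses its default "exclusive" method here) may not match a hand calculation that follows a different convention.

```python
import statistics

# Hypothetical dataset, invented for this illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)

sample_var = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)      # square root of the sample variance
pop_var = statistics.pvariance(data)    # divides by N
pop_sd = statistics.pstdev(data)        # square root of the population variance

# Quartile cut points; the quartile convention is an assumption of this sketch.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(data_range, sample_var, sample_sd, pop_var, pop_sd, iqr)
```

The sample variance is larger than the population variance for the same data because it divides by n − 1 instead of N.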
Position and Shape
| Formula | Name | What It Calculates | When to Use |
|---|---|---|---|
| z = (x − x̄) / s | Z-Score (sample) | How many standard deviations x is from the mean | Comparing values across different datasets |
| z = (x − μ) / σ | Z-Score (population) | Standard score using population parameters | When population mean and SD are known |
| Percentile rank = (values below x / n) × 100 | Percentile | Percentage of data below a given value | Standardized test scores, rankings |
| CV = (s / x̄) × 100% | Coefficient of Variation | Relative variability as a percentage | Comparing spread across different scales |
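A short sketch of the position formulas follows. The function names and the numbers passed to them are hypothetical, chosen only to show how z-scores let you compare values measured on different scales.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / sd

def percentile_rank(data, x):
    """Percentage of values in data that are strictly below x."""
    below = sum(1 for value in data if value < x)
    return below / len(data) * 100

def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

# Hypothetical scores on two tests with different scales.
z_math = z_score(85, mean=70, sd=10)       # 1.5 SDs above the mean
z_verbal = z_score(600, mean=500, sd=100)  # 1.0 SD above the mean

rank = percentile_rank([55, 60, 70, 85, 90], 85)
cv = coefficient_of_variation(mean=50, sd=5)

print(z_math, z_verbal, rank, cv)
```

Although 600 is the larger raw number, the math score sits further above its own mean, which is the kind of comparison z-scores are meant for.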
Probability
Probability formulas calculate the likelihood of events.
Basic Probability Rules
| Formula | Name | What It Calculates | When to Use |
|---|---|---|---|
| P(A) = favorable outcomes / total outcomes | Classical Probability | Probability of event A | Equal-likelihood outcomes (coins, dice, cards) |
| P(A') = 1 − P(A) | Complement Rule | Probability that A does not happen | When "not A" is easier to calculate |
| P(A ∪ B) = P(A) + P(B) − P(A ∩ B) | Addition Rule (General) | Probability of A or B or both | Two events that can overlap |
| P(A ∪ B) = P(A) + P(B) | Addition Rule (Mutually Exclusive) | Probability of A or B | Events that cannot happen together |
| P(A ∩ B) = P(A) × P(B) | Multiplication Rule (Independent) | Probability of A and B both | Events that do not affect each other |
| P(A ∩ B) = P(A) × P(B \| A) | Multiplication Rule (Dependent) | Probability of A and B both | Events where A affects B |
| P(A \| B) = P(A ∩ B) / P(B) | Conditional Probability | Probability of A given B occurred | When one event is already known |
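To make the rules concrete, here is a small sketch using a standard 52-card deck (13 hearts, 4 kings, 1 king of hearts). It uses `fractions.Fraction` so the results stay exact. The variable names are illustrative.

```python
from fractions import Fraction

# Standard 52-card deck: 13 hearts, 4 kings, 1 king of hearts.
p_heart = Fraction(13, 52)
p_king = Fraction(4, 52)
p_heart_and_king = Fraction(1, 52)

# Complement rule: P(not heart) = 1 - P(heart)
p_not_heart = 1 - p_heart

# General addition rule: subtract the overlap so it is not counted twice.
p_heart_or_king = p_heart + p_king - p_heart_and_king

# Conditional probability: P(king | heart) = P(king and heart) / P(heart)
p_king_given_heart = p_heart_and_king / p_heart

# Suit and rank are independent, so the product rule reproduces the overlap.
is_independent = (p_heart * p_king == p_heart_and_king)

print(p_not_heart, p_heart_or_king, p_king_given_heart, is_independent)
```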
Bayes' Theorem
| Formula | Name | When to Use |
|---|---|---|
| P(A \| B) = [P(B \| A) × P(A)] / P(B) | Bayes' Theorem | Updating probability with new evidence; medical testing, spam filters |
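The medical-testing case can be sketched as follows. The prevalence, sensitivity, and false-positive rate are invented numbers, and the denominator P(B) is expanded with the law of total probability, which is a standard step not shown in the table above.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A|B) given P(A), P(B|A), and P(B|not A)."""
    # P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical test: 1% of people have the condition, the test detects it
# 95% of the time, and 5% of healthy people test positive anyway.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)

print(round(posterior, 3))  # 0.161
```

Even with a fairly accurate test, a positive result here implies only about a 16% chance of having the condition, because the condition is rare to begin with.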
Counting Methods
| Formula | Name | What It Calculates | When to Use |
|---|---|---|---|
| n! = n × (n−1) × (n−2) × ... × 1 | Factorial | Total arrangements of n items | Permutations and combinations |
| P(n,r) = n! / (n−r)! | Permutation | Arrangements where order matters | Ranked lists, sequences |
| C(n,r) = n! / [r!(n−r)!] | Combination | Selections where order does not matter | Committees, card hands, groups |
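Python's standard `math` module (version 3.8 or later) provides all three counting functions directly. The scenarios in the comments are illustrative.

```python
import math

# Ways to arrange 5 books on a shelf: 5!
arrangements = math.factorial(5)

# Ways to award gold, silver, and bronze among 10 runners (order matters).
podiums = math.perm(10, 3)

# Number of distinct 5-card hands from a 52-card deck (order does not matter).
hands = math.comb(52, 5)

# A permutation is a combination followed by ordering the r chosen items.
relation_holds = math.perm(10, 3) == math.comb(10, 3) * math.factorial(3)

print(arrangements, podiums, hands, relation_holds)
```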
Probability Distributions
Binomial Distribution
Use when counting the number of successes in a fixed number of independent trials with the same probability of success.
| Formula | What It Calculates |
|---|---|
| P(X = k) = C(n,k) × p^k × (1−p)^(n−k) | Probability of exactly k successes in n trials |
| μ = np | Mean of the binomial distribution |
| σ = √[np(1−p)] | Standard deviation of the binomial distribution |
Requirements: Fixed number of trials (n), two outcomes (success/failure), constant probability (p), independent trials.
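As a sketch, the binomial formulas translate almost directly into code. The function name `binomial_pmf` and the coin-flip scenario are assumptions made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical scenario: 10 flips of a fair coin.
n, p = 10, 0.5

prob_3_heads = binomial_pmf(3, n, p)  # 120 / 1024
mean = n * p                          # expected number of heads
sd = math.sqrt(n * p * (1 - p))       # standard deviation of the count

# The probabilities over all possible k should add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(prob_3_heads, mean, sd, total)
```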
Normal Distribution
The bell-shaped curve that describes many natural phenomena. Many inference procedures in introductory statistics, including z-tests, t-tests, and confidence intervals, rely on the normal distribution.
| Formula | What It Calculates |
|---|---|
| z = (x − μ) / σ | Convert any normal value to a standard normal score |
| x = μ + zσ | Convert a z-score back to the original scale |
Key properties: Mean = median = mode. About 68% of the data falls within 1 SD of the mean, 95% within 2 SD, and 99.7% within 3 SD (the 68-95-99.7 rule).
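The conversions and the 68-95-99.7 rule can be checked with `statistics.NormalDist` from the standard library (Python 3.8 or later). The mean of 100 and SD of 15 are hypothetical values.

```python
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical test-score distribution

z = (130 - mu) / sigma  # convert a raw score to a z-score
x = mu + z * sigma      # convert the z-score back to the original scale

standard = NormalDist()  # standard normal: mean 0, SD 1
within_1 = standard.cdf(1) - standard.cdf(-1)
within_2 = standard.cdf(2) - standard.cdf(-2)
within_3 = standard.cdf(3) - standard.cdf(-3)

print(z, x)
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

The exact areas are close to, but not exactly, 68%, 95%, and 99.7%; the rule is a rounded approximation.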
Poisson Distribution
Use when counting the number of events in a fixed interval of time or space when events occur independently at a constant average rate.
| Formula | What It Calculates |
|---|---|
| P(X = k) = (λ^k × e^(−λ)) / k! | Probability of exactly k events when average rate is λ |
| μ = λ | Mean equals the rate parameter |
| σ = √λ | Standard deviation of the Poisson distribution |
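A minimal sketch of the Poisson formulas follows. The function name `poisson_pmf` and the call-center scenario are invented for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Hypothetical scenario: a help desk receives 3 calls per hour on average.
lam = 3

prob_2_calls = poisson_pmf(2, lam)
mean = lam
sd = math.sqrt(lam)

# Probability of at most 2 calls: add the probabilities for k = 0, 1, 2.
prob_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))

print(round(prob_2_calls, 4), mean, round(sd, 4), round(prob_at_most_2, 4))
```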
Sampling Distributions
These formulas describe the behavior of sample statistics when you take repeated samples from a population.
| Formula | Name | What It Calculates | When to Use |
|---|---|---|---|
| μ_x̄ = μ | Mean of Sampling Distribution | Expected value of the sample mean | Center of all possible sample means |
| σ_x̄ = σ / √n | Standard Error of the Mean | Spread of the sampling distribution | Measures precision of the sample mean |
| σ_p̂ = √[p(1−p) / n] | Standard Error of a Proportion | Spread of sample proportions | Polls, surveys, proportion estimates |
Central Limit Theorem: For large enough n (typically n ≥ 30), the sampling distribution of x̄ is approximately normal regardless of the population shape.
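The two standard-error formulas can be sketched as small helper functions. The function names and inputs are hypothetical.

```python
import math

def standard_error_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def standard_error_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

se_36 = standard_error_mean(sigma=15, n=36)
se_144 = standard_error_mean(sigma=15, n=144)  # four times the sample size
se_poll = standard_error_proportion(p=0.5, n=100)

print(se_36, se_144, se_poll)
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.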
Confidence Intervals
Confidence intervals estimate a population parameter within a range.
| Formula | Name | When to Use |
|---|---|---|
| x̄ ± z*(σ / √n) | CI for Mean (σ known) | Population SD known; normal population or large sample |
| x̄ ± t*(s / √n) | CI for Mean (σ unknown) | Population SD unknown; t* uses df = n − 1 |
| p̂ ± z*√[p̂(1−p̂) / n] | CI for Proportion | Estimating a population proportion |
| (n−1)s² / χ²_upper, (n−1)s² / χ²_lower | CI for Variance | Estimating population variance (normal population; χ² critical values use df = n − 1) |
Common Z-Values for Confidence Levels
| Confidence Level | z* Value |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
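The z-based intervals can be sketched with the standard library, which also reproduces the z* values in the table. The function names and the sample numbers are assumptions for illustration. The standard library has no t distribution, so t* for the σ-unknown interval would still come from a table or a statistics package.

```python
import math
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical value, e.g. 0.95 gives about 1.960."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def mean_ci_known_sigma(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population SD is known."""
    margin = z_star(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def proportion_ci(p_hat, n, confidence=0.95):
    """Confidence interval for a population proportion."""
    margin = z_star(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: mean 50, known sigma 10, n = 100.
low, high = mean_ci_known_sigma(50, 10, 100)

# Hypothetical poll: 60% of 400 respondents say yes.
p_low, p_high = proportion_ci(0.6, 400)

print(round(z_star(0.90), 3), round(z_star(0.95), 3), round(z_star(0.99), 3))
print(round(low, 2), round(high, 2), round(p_low, 3), round(p_high, 3))
```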
Sample Size Formulas
| Formula | What It Determines | When to Use |
|---|---|---|
| n = (z*σ / E)² | Sample size for mean | Planning a study to estimate a mean with margin of error E |
| n = p̂(1−p̂)(z* / E)² | Sample size for proportion | Planning a survey with margin of error E |
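Both sample-size formulas are sketched below. Rounding up to the next whole number is a common convention (you cannot survey a fraction of a person) and is an assumption of this sketch, as are the function names and inputs. Using p̂ = 0.5 when no prior estimate is available gives the largest, most conservative sample size.

```python
import math

def sample_size_mean(z_star, sigma, margin):
    """Smallest n that estimates a mean within the given margin of error."""
    return math.ceil((z_star * sigma / margin) ** 2)

def sample_size_proportion(z_star, p_hat, margin):
    """Smallest n that estimates a proportion within the given margin."""
    return math.ceil(p_hat * (1 - p_hat) * (z_star / margin) ** 2)

# Hypothetical study: sigma = 15, margin of error 3, 95% confidence.
n_mean = sample_size_mean(1.96, 15, 3)

# Hypothetical poll: margin of error 3 percentage points, 95% confidence.
n_poll = sample_size_proportion(1.96, 0.5, 0.03)

print(n_mean, n_poll)
```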
Hypothesis Testing
Hypothesis testing determines whether sample data provides enough evidence to reject a claim about a population.
Test Statistics
| Formula | Name | When to Use |
|---|---|---|
| z = (x̄ − μ₀) / (σ / √n) | Z-Test for Mean | σ known, large sample |
| t = (x̄ − μ₀) / (s / √n) | T-Test for Mean | σ unknown (most common case) |
| z = (p̂ − p₀) / √[p₀(1−p₀) / n] | Z-Test for Proportion | Testing a population proportion |
| t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) | Two-Sample T-Test | Comparing means of two independent groups (unpooled variances) |
| t = d̄ / (s_d / √n) | Paired T-Test | Before/after measurements on same subjects |
| χ² = Σ[(O − E)² / E] | Chi-Square Test | Categorical data, goodness of fit, independence |
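Three of the test statistics above are sketched here with the standard library. The function names and the small datasets are hypothetical. The code computes only the statistic; turning a t or χ² value into a p-value needs a table or a statistics package.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """t = (x-bar - mu0) / (s / sqrt(n))"""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.mean(data) - mu0) / standard_error

def two_sample_t(a, b):
    """t = (x-bar1 - x-bar2) / sqrt(s1^2/n1 + s2^2/n2), unpooled form."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical measurements for two groups.
group_a = [12, 14, 16, 18, 20]
group_b = [10, 12, 14, 16, 18]

t_one = one_sample_t(group_a, mu0=14)
t_two = two_sample_t(group_a, group_b)

# Hypothetical die rolled 60 times; a fair die expects 10 of each face.
chi2 = chi_square([8, 12, 10, 10, 6, 14], [10] * 6)

print(round(t_one, 4), round(t_two, 4), round(chi2, 4))
```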
Hypothesis Testing Decision Table
| Step | Action |
|---|---|
| 1 | State null hypothesis (H₀) and alternative hypothesis (H₁) |
| 2 | Choose significance level (α), typically 0.05 |
| 3 | Calculate the test statistic using the appropriate formula |
| 4 | Find the p-value or compare to critical value |
| 5 | If p-value < α, reject H₀. If p-value ≥ α, fail to reject H₀ |
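The five steps can be walked through in code for a two-sided z-test. This is a sketch under stated assumptions: the function name, the sample numbers, and the choice of a two-sided alternative are all illustrative.

```python
import math
from statistics import NormalDist

def two_sided_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, reject_h0) for H0: mu = mu0 vs H1: mu != mu0."""
    # Step 3: test statistic
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    # Step 4: two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: decision
    return z, p_value, p_value < alpha

# Steps 1 and 2: H0: mu = 50, H1: mu != 50, alpha = 0.05.
# Hypothetical sample: mean 52, known sigma 10, n = 100.
z, p_value, reject_h0 = two_sided_z_test(x_bar=52, mu0=50, sigma=10, n=100)

print(z, round(p_value, 4), reject_h0)
```

Here z is 2.0 and the p-value is about 0.0455, which is below 0.05, so H₀ is rejected at the 5% level.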
Type I and Type II Errors
| Error Type | What Happens | Probability | Also Called |
|---|---|---|---|
| Type I | Reject H₀ when H₀ is true | α (significance level) | False positive |
| Type II | Fail to reject H₀ when H₁ is true | β | False negative |
| Power | Correctly reject H₀ when H₁ is true | 1 − β | Sensitivity |
Linear Regression
Linear regression models the relationship between two variables.
| Formula | Name | What It Calculates |
|---|---|---|
| ŷ = b₀ + b₁x | Regression Equation | Predicted value of y for a given x |
| b₁ = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / Σ(xᵢ − x̄)² | Slope | Change in y for each unit change in x |
| b₀ = ȳ − b₁x̄ | Y-Intercept | Predicted y when x equals zero |
| r = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / [(n−1) × sₓ × sᵧ] | Correlation Coefficient | Strength and direction of linear relationship (−1 to 1) |
| r² = (explained variation) / (total variation) | Coefficient of Determination | Proportion of y's variation explained by x |
| sₑ = √[Σ(yᵢ − ŷᵢ)² / (n−2)] | Standard Error of Estimate | Average distance of data points from the regression line |
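The regression formulas above can be sketched in plain Python. The function names and the five data points are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def correlation(xs, ys):
    """r = sum of cross-deviations / [(n - 1) * s_x * s_y]"""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return s_xy / ((n - 1) * s_x * s_y)

def standard_error_of_estimate(xs, ys, b0, b1):
    """Square root of the residual sum of squares divided by n - 2."""
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

# Hypothetical data: hours studied (x) and quiz score (y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
r = correlation(xs, ys)
s_e = standard_error_of_estimate(xs, ys, b0, b1)

print(round(b0, 4), round(b1, 4), round(r, 4), round(r**2, 4), round(s_e, 4))
```

For simple linear regression, squaring r gives the coefficient of determination, which is 0.6 for this data.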
Interpreting r Values
| Absolute Value of r | Interpretation |
|---|---|
| 0.00 to 0.19 | Very weak |
| 0.20 to 0.39 | Weak |
| 0.40 to 0.59 | Moderate |
| 0.60 to 0.79 | Strong |
| 0.80 to 1.00 | Very strong |
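These ranges are a common rule of thumb and describe the magnitude of r, so r = −0.70 is as strong as r = 0.70. Negative values indicate an inverse relationship. The sign of r matches the sign of the slope b₁.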
ANOVA (Analysis of Variance)
ANOVA tests whether the means of three or more groups are significantly different.
| Formula | Name | What It Calculates |
|---|---|---|
| F = MS_between / MS_within | F-Statistic | Ratio of between-group variance to within-group variance |
| SS_between = Σnⱼ(x̄ⱼ − x̄)² | Between-Group Sum of Squares | Variation of the group means (x̄ⱼ) around the grand mean (x̄) |
| SS_within = ΣΣ(xᵢⱼ − x̄ⱼ)² | Within-Group Sum of Squares | Variation of observations around their own group mean |
| MS = SS / df | Mean Square | Sum of squares divided by degrees of freedom (df_between = k − 1, df_within = N − k for k groups and N total observations) |
Decision rule: If F > F_critical (or p-value < α), at least one group mean differs significantly.
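A one-way ANOVA F-statistic can be sketched in a few lines. The function name and the three small groups are hypothetical. The degrees of freedom used here (k − 1 between groups, N − k within groups) are the standard ones for one-way ANOVA, and the critical value F_critical would still come from a table or a statistics package.

```python
def one_way_anova_f(groups):
    """F = MS_between / MS_within for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three study methods.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

print(round(f_stat, 4))
```

For these groups SS_between is 26 with 2 degrees of freedom and SS_within is 6 with 6 degrees of freedom, so F = 13 / 1 = 13.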
Quick Reference: Which Formula to Use
Use this decision table during exams to quickly identify the correct formula.
| You Want To... | Data Type | Use This |
|---|---|---|
| Describe center of data | Quantitative | Mean or median |
| Describe spread of data | Quantitative | Standard deviation or IQR |
| Find probability of an event | Categorical | Probability rules |
| Count successes in trials | Binary outcomes | Binomial distribution |
| Estimate a population mean | Quantitative sample | Confidence interval for mean |
| Estimate a population proportion | Categorical sample | Confidence interval for proportion |
| Test a claim about a mean | Quantitative, σ known | Z-test |
| Test a claim about a mean | Quantitative, σ unknown | T-test |
| Compare two group means | Two quantitative samples | Two-sample t-test |
| Compare three or more means | Multiple quantitative groups | ANOVA |
| Test relationship between two variables | Two quantitative variables | Linear regression / correlation |
| Test categorical association | Two categorical variables | Chi-square test |
How to Use This Cheat Sheet Effectively
During lectures: Keep this page open alongside your notes. When the professor introduces a new formula, find it on the cheat sheet to see where it fits in the bigger picture.
During homework: Use the "Which Formula to Use" table above to identify the right formula for each problem. The "When to Use" column tells you the conditions that must be met.
During exam prep: If your professor allows a formula sheet, use the tables in this article as a starting template and customize it with your professor's specific notation. If no formula sheet is allowed, use the cheat sheet to practice until the formulas are memorized.
For handwritten formula notes: Many students write formulas by hand for better retention. If you want a digital backup of your handwritten formula sheets, photograph them with Pixno to create searchable, organized digital copies. This is especially useful for math-heavy courses where symbols and notation are easier to write by hand than to type.
Symbol Reference
| Symbol | Meaning |
|---|---|
| x̄ (x-bar) | Sample mean |
| μ (mu) | Population mean |
| s | Sample standard deviation |
| σ (sigma) | Population standard deviation |
| n | Sample size |
| N | Population size |
| p̂ (p-hat) | Sample proportion |
| p | Population proportion |
| α (alpha) | Significance level |
| β (beta) | Probability of Type II error |
| λ (lambda) | Rate parameter (Poisson) |
| χ² (chi-square) | Chi-square statistic |
| df | Degrees of freedom |
| H₀ | Null hypothesis |
| H₁ or Hₐ | Alternative hypothesis |
| Σ | Summation |