Study Guide: AP Statistics (AP Stats): Describing Distributions (Shape, Outliers, Center, Spread – SOCS)
Source: https://www.fatskills.com/ap-statistics/chapter/ap-stats-ap-statistics-describing-distributions-shape-outliers-center-spread-socs

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

Describing distributions using Shape, Outliers, Center, and Spread (SOCS) is the foundation of exploratory data analysis in AP Statistics. The AP exam frequently tests your ability to analyze and interpret distributions, whether in multiple-choice questions or free-response (FRQ) problems. For example, a researcher might collect data on the lifespans of light bulbs to determine if a new manufacturing process improves durability. By describing the distribution (e.g., skewed right, mean vs. median, standard deviation), you can draw meaningful conclusions about the data’s behavior and make informed decisions.


Key Terms & Formulas

  • Shape: Describes the overall pattern of the distribution (e.g., symmetric, skewed left, skewed right, bimodal, uniform).
  • Example: A right-skewed distribution has a long tail on the right (e.g., income data).

  • Outliers: Data points that fall outside the overall pattern. Use the 1.5 × IQR rule:

  • Lower fence = Q1 – 1.5(IQR)
  • Upper fence = Q3 + 1.5(IQR)
  • Any point outside these fences is an outlier.
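The fence formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the homework-hours data are hypothetical, and the quartiles Q1 = 2 and Q3 = 4 are assumed as given rather than computed.

```python
def iqr_fences(q1, q3):
    """Return (lower_fence, upper_fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """Return the values in data that fall outside the fences."""
    lower, upper = iqr_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical homework hours; Q1 = 2 and Q3 = 4 are assumed inputs.
hours = [1, 1.5, 2, 2, 2.5, 3, 4, 4, 6, 10]
lower_fence, upper_fence = iqr_fences(2, 4)
outliers = find_outliers(hours, 2, 4)

print(lower_fence, upper_fence)  # -1.0 7.0
print(outliers)                  # [10]
```

Note that 6 hours is large compared with the rest of the data but sits inside the upper fence of 7, so only 10 is flagged.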

  • Center: Measures of central tendency.

  • Mean (x̄): Average of all data points. x̄ = (Σxᵢ) / n
  • Median: Middle value when data is ordered (resistant to outliers).
  • In skewed distributions, the median is a better measure of center than the mean.

  • Spread: Measures of variability.

  • Range = Max – Min (not resistant to outliers).
  • Interquartile Range (IQR) = Q3 – Q1 (resistant to outliers).
  • Standard Deviation (s): Typical distance of the data values from the mean. s = √[Σ(xᵢ – x̄)² / (n – 1)]
    • Use 1-Var Stats on the TI-84 to compute it (data in L1); the sample standard deviation is reported as Sx.
  • Variance = s² (standard deviation squared).
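As a rough check on the formulas for x̄, s, and s², the sketch below computes them by hand and compares the results with Python's standard `statistics` module. The dataset and the helper names are hypothetical, chosen so the arithmetic comes out evenly.

```python
import math
import statistics


def sample_mean(data):
    """x-bar = (sum of x_i) / n"""
    return sum(data) / len(data)


def sample_variance(data):
    """s^2 = sum((x_i - x-bar)^2) / (n - 1)"""
    x_bar = sample_mean(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)


def sample_sd(data):
    """s = square root of the sample variance."""
    return math.sqrt(sample_variance(data))


# Hypothetical data: sum is 35 over 7 values, so the mean is 5.
scores = [2, 4, 4, 5, 5, 7, 8]

print(sample_mean(scores))      # 5.0
print(sample_variance(scores))  # 4.0 (squared units)
print(sample_sd(scores))        # 2.0 (original units)

# The library versions also divide by n - 1, so they should agree.
print(math.isclose(sample_sd(scores), statistics.stdev(scores)))  # True
```

The squared deviations here sum to 24, and dividing by n – 1 = 6 gives a variance of 4, so s = 2.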

  • Five-Number Summary: Min, Q1, Median, Q3, Max.

  • Use STAT → CALC → 1-Var Stats on TI-84.
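For readers who want to see where the five numbers come from, here is a minimal Python sketch. It assumes the "median of each half" convention for quartiles (the overall median is left out of both halves when n is odd), which is the approach commonly taught in AP Statistics; other software may use different quartile rules and give slightly different Q1 and Q3. The function name and the data are hypothetical.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two data values.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # For odd n, skip the middle value so it is in neither half.
    upper = values[half + 1:] if n % 2 else values[half:]
    return (
        values[0],
        statistics.median(lower),
        statistics.median(values),
        statistics.median(upper),
        values[-1],
    )


# Hypothetical test scores (n = 9, so the median 78 is excluded from both halves).
scores = [62, 70, 71, 75, 78, 80, 84, 88, 95]
print(five_number_summary(scores))  # (62, 70.5, 78, 86.0, 95)
```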

  • Boxplot (Box-and-Whisker Plot): Visual representation of the five-number summary.

  • Use STAT PLOT → Boxplot on TI-84 (adjust window for outliers).

  • Histogram: Bar graph showing frequency/relative frequency of data in bins.

  • Use STAT PLOT → Histogram on TI-84 (adjust bin width with Xscl).

  • Stem-and-Leaf Plot: Quick way to display small datasets while preserving individual values.

  • Resistant vs. Non-Resistant Measures:

  • Resistant: Median, IQR (not affected by outliers).
  • Non-resistant: Mean, range, standard deviation (affected by outliers).


Step-by-Step / Process Flow

How to Describe a Distribution (SOCS) in an FRQ:


  1. State the Context
     • Example: “The distribution represents the number of hours students spend on homework per night.”

  2. Describe the Shape
     • Is it symmetric, skewed left, skewed right, bimodal, or uniform?
     • Example: “The distribution is skewed right, with most students studying 1–3 hours but a few studying 6+ hours.”

  3. Identify Outliers (if any)
     • Use the 1.5 × IQR rule or visually inspect the boxplot/histogram.
     • Example: “There appears to be an outlier at 10 hours, which is above the upper fence of 7 hours (Q3 + 1.5(IQR) = 4 + 1.5(2)).”

  4. Describe the Center
     • Report mean (x̄) or median, depending on shape.
     • Example: “The median is 2.5 hours, which is a better measure of center than the mean due to the right skew.”

  5. Describe the Spread
     • Report range, IQR, or standard deviation.
     • Example: “The IQR is 2 hours (Q3 = 4, Q1 = 2), and the standard deviation is 1.8 hours.”

  6. Summarize in Context
     • Example: “Most students study between 1 and 4 hours per night, but a few study much longer, pulling the mean above the median.”


Common Mistakes

  • Mistake: Using the mean to describe the center of a skewed distribution.
  • Correction: Use the median for skewed data because it’s resistant to outliers.
  • Why? The mean is pulled in the direction of the skew (e.g., right skew → mean > median).

  • Mistake: Forgetting to check for outliers before calculating spread.

  • Correction: Always use the 1.5 × IQR rule or inspect a boxplot.
  • Why? Outliers inflate the range and standard deviation, making them misleading.

  • Mistake: Confusing standard deviation (s) with variance (s²).

  • Correction: Standard deviation is in the original units (e.g., hours), while variance is in squared units (e.g., hours²).
  • Why? The AP exam expects s, not s², in most contexts.

  • Mistake: Describing shape without context (e.g., “the graph is skewed”).

  • Correction: Always name the direction of skew (left/right) and relate it to the data.
  • Why? The AP rubric awards points for specificity.

  • Mistake: Misinterpreting the IQR as “the middle 50% of the data.”

  • Correction: The IQR is a single number, Q3 – Q1, that measures the width of the interval containing the middle 50% of the data.
  • Why? The middle 50% of values lie between Q1 and Q3; the IQR tells you how far apart those two quartiles are.


AP Exam Insights

  • What’s Frequently Tested?
  • Comparing distributions (e.g., “Compare the centers and spreads of Group A and Group B”).
  • Interpreting boxplots/histograms (e.g., “Is the median greater than the mean? Why?”).
  • Effect of outliers (e.g., “How would removing the outlier affect the mean and standard deviation?”).
  • Calculator skills (e.g., 1-Var Stats, boxplots, histograms).

  • Tricky Distinctions:

  • Mean vs. Median: Always choose based on shape (symmetric → mean; skewed → median).
  • IQR vs. Standard Deviation: IQR is resistant; standard deviation is not.
  • Outliers vs. Extreme Values: Not all extreme values are outliers (a value must fall more than 1.5 × IQR below Q1 or above Q3).

  • Common FRQ Setups:

  • “Describe the distribution of [variable] using SOCS.”
  • “Explain why the median is a better measure of center than the mean for this dataset.”
  • “Identify any outliers and describe their effect on the mean and standard deviation.”

  • Calculator Pitfalls:

  • 1-Var Stats: Always check that the data are in L1 and that FreqList is left blank (each value counted once) unless you entered frequencies in a second list.
  • Boxplots: Adjust the window to see outliers (use ZoomStat).
  • Histograms: Set an appropriate bin width (Xscl) to avoid misleading graphs.


Quick Check Questions

  1. Multiple Choice:
    A dataset has a mean of 50 and a median of 45. Which of the following is most likely true about the shape of the distribution?
    (A) Symmetric
    (B) Skewed left
    (C) Skewed right
    (D) Bimodal
    (E) Uniform

Answer: (C) Skewed right
Explanation: When the mean > median, the distribution is typically skewed right.
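The mean-versus-median rule of thumb can be tried on small made-up datasets. The sketch below is only an illustration with hypothetical data and a hypothetical helper name; the comparison suggests a skew direction but does not prove it, so a graph should still be checked.

```python
import statistics


def skew_hint(data):
    """Compare mean and median and return a rough skew hint."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right"      # mean pulled toward a long right tail
    if mean < median:
        return "left"       # mean pulled toward a long left tail
    return "symmetric"      # mean equals median


# Hypothetical data with one large value stretching the right tail.
right_tail = [1, 2, 2, 3, 3, 3, 4, 5, 12]
# Mirror image of the same data (13 - x), stretching the left tail.
left_tail = [13 - x for x in right_tail]

print(skew_hint(right_tail))  # right
print(skew_hint(left_tail))   # left
```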


  2. FRQ Part:
    A boxplot (not reproduced here) shows the test scores of 50 students. Identify any outliers and explain how they would affect the mean and standard deviation if removed.

Answer:
- Outliers: Any points plotted individually beyond the whiskers (i.e., scores below Q1 – 1.5(IQR) or above Q3 + 1.5(IQR)).
- Effect: Removing a high outlier would decrease the mean, while removing a low outlier would increase it. In either case the standard deviation would decrease, because the remaining scores are less spread out.
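The direction of these effects can be checked numerically. The sketch below uses hypothetical test scores (the boxplot data are not available here) with one extreme score added at either end, then compares the mean and sample standard deviation with and without that score.

```python
import statistics


def mean_and_sd(data):
    """Return (mean, sample standard deviation)."""
    return statistics.mean(data), statistics.stdev(data)


# Hypothetical core scores, plus one extreme score at either end.
core = [70, 72, 75, 78, 80, 82, 85]
with_high = core + [100]  # 100 lies above the upper fence for this set
with_low = core + [30]    # 30 lies below the lower fence for this set

core_mean, core_sd = mean_and_sd(core)
high_mean, high_sd = mean_and_sd(with_high)
low_mean, low_sd = mean_and_sd(with_low)

# Removing the high outlier: mean goes down, SD goes down.
print(high_mean > core_mean, high_sd > core_sd)  # True True
# Removing the low outlier: mean goes up, SD still goes down.
print(low_mean < core_mean, low_sd > core_sd)    # True True
```

In both cases the standard deviation shrinks once the extreme score is removed; only the direction of the change in the mean depends on which side the outlier was on.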


  3. Multiple Choice:
    Which measure of spread is least affected by outliers?
    (A) Range
    (B) Standard deviation
    (C) Variance
    (D) IQR
    (E) Mean absolute deviation

Answer: (D) IQR
Explanation: The IQR is resistant to outliers because it only considers the middle 50% of data.


Last-Minute Cram Sheet

  1. SOCS: Shape, Outliers, Center, Spread – always describe in this order!
  2. Mean vs. Median: Symmetric → mean; skewed → median.
  3. Outlier Rule: 1.5 × IQR (Q1 – 1.5(IQR), Q3 + 1.5(IQR)).
  4. Spread Measures: IQR (resistant), standard deviation (not resistant).
  5. TI-84 Commands:
     • 1-Var Stats (L1) → mean, median, SD, 5-number summary.
     • STAT PLOT → Boxplot/Histogram (adjust window with ZoomStat).
  6. ⚠️ Always check shape before choosing center! (Skewed → median.)
  7. ⚠️ Outliers inflate range and standard deviation.
  8. ⚠️ IQR = Q3 – Q1 (not the middle 50% of values, but the range of them).
  9. ⚠️ Label axes and title graphs in FRQs!
  10. ⚠️ If a boxplot has a long whisker, check for outliers before concluding skew!