Fatskills
Practice. Master. Repeat.
Study Guide: Intro to Business Statistics: Descriptive Statistics - Measures of Shape, Skewness Kurtosis
Source: https://www.fatskills.com/business-analytics/chapter/intro-to-business-statistics-busstats-descriptive-statistics-measures-of-shape-skewness-kurtosis

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

Measures of shape, specifically skewness and kurtosis, describe how the distribution of a data set departs from a symmetric, normal-like shape: skewness measures asymmetry, and kurtosis measures how heavy the tails are. For example, a retail chain wants to know whether average daily sales exceed $10,000, but the sales data are not normally distributed. By calculating skewness and kurtosis, the chain can determine whether the data are heavily skewed or prone to extreme outliers, either of which can reduce the accuracy of its sales forecasting models.

Key Formulas & Symbols

  • Skewness (γ₁) = (1/n) · Σ(xᵢ − x̄)³ / σ³, where xᵢ = individual data point, x̄ = sample mean, n = sample size, σ = standard deviation of the sample (computed with divisor n).
  • Kurtosis (γ₂) = (1/n) · Σ(xᵢ − x̄)⁴ / σ⁴ − 3, where xᵢ = individual data point, x̄ = sample mean, n = sample size, σ = standard deviation of the sample (computed with divisor n). Subtracting 3 makes this the excess kurtosis, which is 0 for a normal distribution.
  • Population excess kurtosis (γ₂) = E[(X − μ)⁴] / σ⁴ − 3, where μ = population mean and σ = population standard deviation.
  • Coefficient of Kurtosis (β₂) = E[(X − μ)⁴] / σ⁴ = γ₂ + 3; a normal distribution has β₂ = 3.
  • Adjusted Skewness Coefficient (G₁) = [n / ((n − 1)(n − 2))] · Σ((xᵢ − x̄) / s)³ for sample skewness, where s = sample standard deviation (computed with divisor n − 1).
  • Adjusted Kurtosis Coefficient (G₂) = [n(n + 1) / ((n − 1)(n − 2)(n − 3))] · Σ((xᵢ − x̄) / s)⁴ − 3(n − 1)² / ((n − 2)(n − 3)) for sample excess kurtosis.
  • Pearson's Moment Coefficient of Skewness = √n · Σ(xᵢ − x̄)³ / (Σ(xᵢ − x̄)²)^(3/2) for sample skewness. This is the first skewness formula above written out in sums.
  • Pearson's Moment Coefficient of Kurtosis = n · Σ(xᵢ − x̄)⁴ / (Σ(xᵢ − x̄)²)² − 3 for sample excess kurtosis. This is the first kurtosis formula above written out in sums.
  • Kolmogorov-Smirnov Test Statistic (D) = max |F(x) − Fₙ(x)|, where F(x) = hypothesized cumulative distribution function, Fₙ(x) = empirical distribution function.
  • Anderson-Darling Test Statistic (A²) = −n − (1/n) · Σ (2i − 1) · [ln F(x_(i)) + ln(1 − F(x_(n+1−i)))], summed over i = 1, …, n, where F(x) = hypothesized cumulative distribution function and x_(1) ≤ … ≤ x_(n) are the data sorted in ascending order.
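
The moment-based and bias-adjusted coefficients can be compared side by side in a short script. The following is a minimal sketch using only the Python standard library; the function name `shape_measures` and the `daily_sales` figures are hypothetical and chosen only for illustration. It computes the moment-based coefficients with a divisor of n and the adjusted coefficients with the sample standard deviation (divisor n − 1).

```python
import math

def shape_measures(data):
    """Return moment-based and bias-adjusted skewness and excess kurtosis."""
    n = len(data)
    if n < 4:
        raise ValueError("need at least 4 observations for adjusted kurtosis")
    mean = sum(data) / n
    devs = [x - mean for x in data]
    # Sums of squared, cubed, and fourth-power deviations from the mean
    s2, s3, s4 = (sum(d ** k for d in devs) for k in (2, 3, 4))
    if s2 == 0:
        raise ValueError("all values are identical; shape is undefined")

    # Moment-based coefficients (standard deviation with divisor n)
    skewness = math.sqrt(n) * s3 / s2 ** 1.5
    excess_kurtosis = n * s4 / s2 ** 2 - 3.0

    # Bias-adjusted coefficients (standard deviation with divisor n - 1)
    s = math.sqrt(s2 / (n - 1))
    adj_skewness = n / ((n - 1) * (n - 2)) * sum((d / s) ** 3 for d in devs)
    adj_excess_kurtosis = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
        * sum((d / s) ** 4 for d in devs)
        - 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    )
    return {
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
        "adjusted_skewness": adj_skewness,
        "adjusted_excess_kurtosis": adj_excess_kurtosis,
    }

# Hypothetical daily sales in thousands of dollars (illustration only)
daily_sales = [2, 4, 4, 4, 5, 5, 7, 9]
result = shape_measures(daily_sales)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With small samples the adjusted and moment-based values can differ noticeably, so it is worth noting which version a textbook or software package reports.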

Step-by-Step Procedure

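  1. State hypotheses: Determine the null and alternative hypotheses (e.g., H₀: γ₁ = 0, H₁: γ₁ ≠ 0 for skewness; for a normality test, H₀: the data follow a normal distribution).
  2. Choose test: Select an appropriate test. The Kolmogorov-Smirnov and Anderson-Darling tests check overall fit to a normal distribution, so they respond to departures caused by skewness, kurtosis, or both.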
  3. Compute test statistic: Calculate the test statistic (e.g., D, A) using the given data.
  4. Find p-value or critical value: Determine the p-value or critical value for the test statistic using a statistical table or software.
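  5. Compare to α: Compare the p-value to the significance level α (e.g., α = 0.05), or compare the test statistic to the critical value for that α.
  6. Conclude: Reject the null hypothesis if the p-value is below α (equivalently, if the test statistic exceeds the critical value); otherwise, fail to reject it.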
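
The six steps can be sketched end to end for the Kolmogorov-Smirnov test. This is an illustration under stated assumptions, not a definitive implementation: the hypothesized mean and standard deviation are taken as given in advance (not estimated from the sample), the 5% critical value uses the large-sample approximation 1.358/√n, and the function names and both data sets are hypothetical.

```python
import math
from statistics import NormalDist

def ks_statistic(data, mu, sigma):
    """D = max |F(x) - Fn(x)| against a fully specified normal distribution."""
    xs = sorted(data)
    n = len(xs)
    dist = NormalDist(mu, sigma)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x, so check both sides
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_decision(data, mu, sigma):
    """Steps 3-6: statistic, approximate 5% critical value, comparison."""
    n = len(data)
    d = ks_statistic(data, mu, sigma)
    # Assumption: large-sample approximation for alpha = 0.05
    critical = 1.358 / math.sqrt(n)
    return d, critical, d > critical

n = 200
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Hypothetical sample 1: evenly spaced normal quantiles (mean 10, sd 2)
normal_like = [NormalDist(10, 2).inv_cdf(p) for p in probs]
# Hypothetical sample 2: right-skewed exponential quantiles (mean 1, sd 1)
skewed = [-math.log(1.0 - p) for p in probs]

d1, crit1, reject1 = ks_decision(normal_like, 10, 2)
d2, crit2, reject2 = ks_decision(skewed, 1, 1)
print(f"normal-like: D={d1:.4f}, critical={crit1:.4f}, reject H0: {reject1}")
print(f"skewed:      D={d2:.4f}, critical={crit2:.4f}, reject H0: {reject2}")
```

If the mean and standard deviation were estimated from the same sample, this critical value would no longer apply and a corrected table or software routine would be needed.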

Common Mistakes

  • Mistake: Misinterpreting the p-value as the probability that the null hypothesis is true.
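  • Correction: The p-value is the probability of observing data at least as extreme as the sample if the null hypothesis is true. A small p-value means the data are unlikely under H₀; it does not give the probability that H₀ is true.
  • Mistake: Failing to check the assumptions of the test (e.g., independent observations, a fully specified reference distribution).
  • Correction: Check the assumptions before applying the test to ensure the results are valid. For example, the standard Kolmogorov-Smirnov critical values assume the mean and standard deviation of the reference distribution are specified in advance, not estimated from the same sample.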
  • Mistake: Using the wrong test statistic or formula.
  • Correction: Verify the correct test statistic and formula for the specific test and data.
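
One way to verify a formula is to code it directly and check it on a case small enough to work by hand. The sketch below does this for the Anderson-Darling statistic, A² = −n − (1/n) · Σ (2i − 1) · [ln F(x_(i)) + ln(1 − F(x_(n+1−i)))], assuming the hypothesized normal mean and standard deviation are given. The function name `anderson_darling` is hypothetical.

```python
import math
from statistics import NormalDist

def anderson_darling(data, mu, sigma):
    """A^2 against a fully specified normal distribution.

    Note: math.log raises ValueError if a CDF value rounds to exactly 0 or 1,
    which can happen for observations far out in the tails.
    """
    xs = sorted(data)  # the formula requires ascending order
    n = len(xs)
    dist = NormalDist(mu, sigma)
    cdf = [dist.cdf(x) for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        # x_(i) is cdf[i - 1]; x_(n+1-i) is cdf[n - i] in 0-based indexing
        total += (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
    return -n - total / n

# Hand check with one observation at the mean: F = 0.5, so A^2 = 2*ln(2) - 1
single = anderson_darling([5.0], 5.0, 1.0)
print(single, 2 * math.log(2) - 1)

# Hand check with two observations at the 25th and 75th percentiles
z = NormalDist()
pair = anderson_darling([z.inv_cdf(0.25), z.inv_cdf(0.75)], 0.0, 1.0)
print(pair, -2 - math.log(0.25) - 3 * math.log(0.75))
```

A common slip this catches is pairing ln F(x_(i)) with ln(1 − F(x_(i))) instead of ln(1 − F(x_(n+1−i))), which gives a different value on the two-observation check.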

Quick Practice Problems

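  1. A company wants to know if its sales data are normally distributed. It calculates the skewness coefficient as 0.5. What is the p-value for a two-tailed test at α = 0.05?

Answer: 0.31 (a hypothetical value for this example; in practice the p-value comes from a statistical table or software). Since 0.31 > 0.05, fail to reject the null hypothesis of zero skewness.

Explanation: The skewness coefficient is converted to a test statistic, which determines the p-value for a two-tailed test. The p-value also depends on the sample size, which the problem does not state, so it cannot be computed from the coefficient alone.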

  2. A quality control team wants to check if its manufacturing process produces normally distributed measurements. The team uses the Anderson-Darling test and obtains a test statistic of 2.5. What is the p-value at α = 0.05?

Answer: 0.02 (a hypothetical value for this example; in practice the p-value comes from a statistical table or software). Since 0.02 < 0.05, reject the null hypothesis that the measurements are normally distributed.

Explanation: The Anderson-Darling test statistic determines the p-value. Large values of the statistic indicate poor fit, so the test rejects only in the upper tail and is not a two-tailed test.

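  3. A marketing team wants to know if its customer satisfaction scores are normally distributed. The team calculates the kurtosis coefficient as 3.5. What is the p-value for a two-tailed test at α = 0.05?

Answer: 0.12 (a hypothetical value for this example; in practice the p-value comes from a statistical table or software). Since 0.12 > 0.05, fail to reject the null hypothesis of normal kurtosis.

Explanation: The kurtosis coefficient is converted to a test statistic, which determines the p-value for a two-tailed test. If 3.5 is the coefficient of kurtosis without the −3 adjustment, the excess kurtosis is 0.5. As with skewness, the p-value also depends on the sample size.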
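
The problems above give p-values as hypothetical figures because a coefficient alone does not fix the p-value; the sample size matters as well. The sketch below illustrates this with a rough large-sample approximation, assuming that under normality the standard error of skewness is about √(6/n) and that of excess kurtosis about √(24/n). The function name `approx_two_sided_p` and the sample sizes 30 and 200 are hypothetical, and the kurtosis approximation in particular is crude for small samples.

```python
import math
from statistics import NormalDist

def approx_two_sided_p(estimate, n, kind):
    """Approximate two-tailed p-value for H0: coefficient = 0 under normality."""
    if kind == "skewness":
        se = math.sqrt(6.0 / n)    # large-sample approximation
    elif kind == "excess_kurtosis":
        se = math.sqrt(24.0 / n)   # large-sample approximation, rough for small n
    else:
        raise ValueError("kind must be 'skewness' or 'excess_kurtosis'")
    z = estimate / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Same skewness coefficient (0.5), two hypothetical sample sizes
p_small = approx_two_sided_p(0.5, 30, "skewness")
p_large = approx_two_sided_p(0.5, 200, "skewness")
print(f"skewness 0.5, n=30:  p = {p_small:.4f}")
print(f"skewness 0.5, n=200: p = {p_large:.4f}")

# Kurtosis coefficient 3.5 read as excess kurtosis 0.5, hypothetical n = 30
p_kurt = approx_two_sided_p(3.5 - 3.0, 30, "excess_kurtosis")
print(f"excess kurtosis 0.5, n=30: p = {p_kurt:.4f}")
```

The same skewness of 0.5 is not significant at α = 0.05 for the smaller sample but is for the larger one, which is why the sample size has to be stated before a p-value can be looked up.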

Last-Minute Cram Sheet

  1. Skewness (γ₁) = (1/n) · Σ(xᵢ − x̄)³ / σ³ for sample skewness.
  2. Kurtosis (γ₂) = (1/n) · Σ(xᵢ − x̄)⁴ / σ⁴ − 3 for sample excess kurtosis.
  3. Kolmogorov-Smirnov Test Statistic (D) = max |F(x) − Fₙ(x)| for normality test.
  4. Anderson-Darling Test Statistic (A²) = −n − (1/n) · Σ (2i − 1) · [ln F(x_(i)) + ln(1 − F(x_(n+1−i)))], summed over i = 1, …, n with the data sorted in ascending order, for normality test.
  5. p-value is NOT the probability that H₀ is true; it is the probability of observing data at least as extreme as the sample if H₀ is true.
  6. Assumptions of a normality test: independent observations measured on a continuous scale. Normality is the hypothesis being tested, not an assumption, and equal variances do not apply to a one-sample test.
  7. Use the correct test statistic and formula for the specific test and data.
  8. Check the assumptions of the test before applying it.
  9. Misinterpretation of p-value is a common mistake.
  10. Use α = 0.05 as the default significance level.