Fatskills
Practice. Master. Repeat.
Study Guide: Intro to Business Statistics: Introduction to Statistics - Data Sources: Primary vs. Secondary, Internal vs. External, Surveys, Experiments, Observational Studies
Source: https://www.fatskills.com/business-analytics/chapter/intro-to-business-statistics-busstats-introduction-to-statistics-data-sources-primary-vs-secondary-internal-vs-external-surveys-experiments-observational-studies

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

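Business decisions are only as good as the data behind them, so it matters where the data comes from and how it was collected. A retail chain wants to know if average daily sales exceed $10,000 to determine if they should expand their store hours. They pull data from their own sales records, which is an internal data source; because those records were originally kept for bookkeeping rather than for this question, many textbooks treat them as secondary data, whereas a survey or experiment run specifically for the decision would be primary data. They also consider data from industry reports, which is an external, secondary data source.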

Key Formulas & Symbols

  • Primary data: Data the researcher collects firsthand for the question at hand, for example through surveys, experiments, or observations.
  • Secondary data: Data originally collected by someone else or for another purpose, such as industry reports, academic journals, or government statistics.
  • Internal data: Data collected within the organization, such as sales records or customer feedback.
  • External data: Data collected from outside the organization, such as market research or competitor analysis.
  • Survey: A method of collecting data through questionnaires or interviews.
  • Experiment: A method of collecting data by manipulating one or more variables and measuring the effect.
  • Observational study: A method of collecting data by observing the behavior or characteristics of individuals or groups without manipulating any variables.
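  • Confidence interval: A range of values, computed from the sample, that is expected to contain the true population parameter at a stated confidence level (for example, 95%).
  • Margin of error: The largest amount by which the sample estimate is expected to differ from the true population parameter at a given confidence level; it is the half-width of the confidence interval.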
  • Sample size: The number of observations in the sample.
  • Population size: The total number of observations in the population.
  • Sampling error: The difference between the sample estimate and the true population parameter.
  • Bias: A systematic error in the data collection process that can lead to inaccurate results.
  • p-value: The probability of observing the data (or more extreme) if the null hypothesis is true.
  • Null hypothesis (H₀): A statement of no effect or no difference.
  • Alternative hypothesis (H₁): A statement of an effect or difference.
  • Type I error: Rejecting the null hypothesis when it is true. Its probability is the significance level α.
  • Type II error: Failing to reject the null hypothesis when it is false. Its probability is denoted β.
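
To make the confidence interval and margin of error definitions concrete, here is a minimal Python sketch of the large-sample Z formula x̄ ± z · s/√n. The function name z_interval and the example inputs (in particular n = 100) are illustrative assumptions, not values taken from this guide.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Return (margin_of_error, lower, upper) for a large-sample Z interval."""
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the mean: s / sqrt(n).
    standard_error = sample_sd / sqrt(n)
    margin = z * standard_error
    return margin, sample_mean - margin, sample_mean + margin


# Hypothetical inputs: mean 9,500, standard deviation 1,000, n = 100 days.
margin, lower, upper = z_interval(9500, 1000, 100)
print(f"margin of error = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

The margin of error is the half-width of the interval, so it shrinks as n grows. With a small sample and an unknown population standard deviation, a critical value from the t distribution would normally replace z.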

Step-by-Step Procedure

  1. State hypotheses: Clearly define the null and alternative hypotheses.
  2. Choose test: Select the appropriate statistical test based on the research question and data type.
  3. Compute test statistic: Calculate the test statistic using the sample data.
  4. Find p-value or critical value: Determine the p-value or critical value using the test statistic and a statistical table or software.
  5. Compare to α: Compare the p-value to the significance level (commonly α = 0.05), or compare the test statistic to the critical value.
  6. Conclude: Make a decision based on the results, either rejecting or failing to reject the null hypothesis.
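
The six steps above can be sketched in Python for a right-tailed one-sample t-test, using the summary figures from practice problem 2 (mean 4.2, s = 0.5, n = 30, hypothesized mean 4). This is a sketch under stated assumptions: the function names are hypothetical, and because the standard library has no t distribution, the tail probability is approximated by numerical integration of the t density. If SciPy is available, scipy.stats.t.sf computes the same tail probability directly.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on the density over [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    upper = 0.5 - area
    return upper if t >= 0 else 1 - upper


def one_sample_t_test(sample_mean, sample_sd, n, mu0, alpha=0.05):
    # Step 1: H0: mu = mu0 versus H1: mu > mu0 (right-tailed).
    # Step 2: sigma is unknown, so use a one-sample t-test with n - 1 df.
    df = n - 1
    # Step 3: compute the test statistic.
    t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))
    # Step 4: find the right-tailed p-value.
    p_value = t_upper_tail(t_stat, df)
    # Steps 5 and 6: compare to alpha and conclude.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return t_stat, p_value, decision


t_stat, p_value, decision = one_sample_t_test(4.2, 0.5, 30, mu0=4)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, decision: {decision}")
```

A two-tailed test would double the tail probability of |t| before comparing it to α.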

Common Mistakes

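  • Mistake: Using a Z-test when the population standard deviation is unknown, especially with a small sample.
  • Correction: Use a t-test when the population standard deviation is unknown and must be estimated by the sample standard deviation; this matters most with a small sample size (n < 30). The t distribution has heavier tails than the normal distribution, which accounts for the extra uncertainty in that estimate.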
  • Mistake: Misinterpreting the p-value as the probability that the null hypothesis is true.
  • Correction: The p-value is the probability of observing the data (or more extreme) if the null hypothesis is true. It does not provide information about the probability of the null hypothesis being true.
  • Mistake: Failing to check for assumptions of the statistical test.
  • Correction: Check for assumptions such as normality, equal variances, and independence before selecting a statistical test.
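
A small simulation can illustrate the p-value point above. The sketch below repeatedly samples from a population in which the null hypothesis is true and runs a two-sided Z-test with known σ each time. All inputs (mean 100, σ = 15, n = 30, 2,000 trials, the seed) are hypothetical values chosen for illustration. Even though H₀ is true in every trial, roughly a share α of the tests still produce p < α; that share is the Type I error rate, which is why a small p-value cannot be read as the probability that H₀ is true.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(trials=2000, n=30, mu0=100.0, sigma=15.0,
                      alpha=0.05, seed=1):
    """Fraction of two-sided Z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        # Two-sided p-value: chance of a |Z| at least this large under H0.
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1
    return rejections / trials


print(f"share of true nulls rejected: {type_i_error_rate():.3f}")
```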

Quick Practice Problems

  1. A retail chain wants to know if the average daily sales exceed $10,000. They collect data from their sales records and calculate a sample mean of $9,500 with a sample standard deviation of $1,000. What is the 95% confidence interval for the population mean? Answer: ($8,419.11, $10,580.89) Explanation: The confidence interval is calculated using the formula x̄ ± Z · (s/√n), where x̄ = sample mean, Z = critical value from the standard normal distribution (1.96 for 95% confidence), s = sample standard deviation, and n = sample size. Note that the problem as stated does not give n, which the formula requires.

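  2. A company wants to know if the average customer satisfaction rating is higher than 4. They collect data from a survey and calculate a sample mean of 4.2 with a sample standard deviation of 0.5. What is the p-value for the null hypothesis that the population mean is equal to 4? Answer: approximately 0.018 (one-tailed) Explanation: With a sample size of n = 30 and an unknown population standard deviation, use a t-test. The test statistic is t = (4.2 − 4) / (0.5/√30) ≈ 2.19 with 29 degrees of freedom, which gives a one-tailed p-value of about 0.018. Because this is below the significance level α = 0.05, the null hypothesis is rejected.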

  3. A marketing firm wants to know if the average response rate to an email campaign is higher than 2%. They collect data from a survey and calculate a sample mean of 2.5% with a sample standard deviation of 1%. What is the margin of error for the population mean? Answer: 0.25% Explanation: The margin of error is calculated using the formula Z · (s/√n), where Z = critical value from the standard normal distribution, s = sample standard deviation, and n = sample size. Note that the problem as stated gives neither n nor the confidence level.
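
Problem 3 reports a margin of error without a sample size, so one way to explore it is to invert the margin formula: n = (Z · s / E)², rounded up. The sketch below does this with the problem's figures in percentage points (s = 1, E = 0.25). The 95% confidence level and the function name are assumptions, since the problem does not state them.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sample_sd, margin, confidence=0.95):
    """Smallest n for which z * s / sqrt(n) does not exceed the target margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sample_sd / margin) ** 2)


# Figures from problem 3, in percentage points: s = 1, margin = 0.25.
print(required_sample_size(1.0, 0.25))
```

The same inversion is useful when planning a survey: choose the margin of error you can tolerate, then solve for the sample size needed to achieve it.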

Last-Minute Cram Sheet

  1. Primary data is collected firsthand for the question at hand.
  2. Secondary data was originally collected by someone else or for another purpose.
  3. Internal data is collected within the organization.
  4. External data is collected from outside the organization.
  5. A survey is a method of collecting data through questionnaires or interviews.
  6. An experiment is a method of collecting data by manipulating one or more variables and measuring the effect.
  7. An observational study is a method of collecting data by observing the behavior or characteristics of individuals or groups without manipulating any variables.
  8. The confidence interval is a range of values within which the true population parameter is likely to lie.
  9. The margin of error is the largest amount by which the sample estimate is expected to differ from the true population parameter at a given confidence level.
  10. The p-value is the probability of observing the data (or more extreme) if the null hypothesis is true.
  11. The p-value is NOT the probability that H₀ is true; it is the probability of observing the data (or more extreme) if H₀ is true.
  12. A Type I error is rejecting the null hypothesis when it is true; its probability is α.
  13. A Type II error is failing to reject the null hypothesis when it is false; its probability is β.
  14. A t-test is used when the population standard deviation is unknown, which matters most with a small sample size (n < 30).
  15. A Z-test is used when the population standard deviation is known; with a large sample (n ≥ 30) it is also a common approximation.