Statistics Part I Hypothesis testing

Cards (55)

  • Descriptive statistics often relate to central tendency and variability in data sets
  • Measures of central tendency describe where the collected data cluster on a graph and what this means for our results
  • Mode, median and mean are examples of measures of central tendency
  • Mode = most common value
  • Bimodal = two modes (two most common values)
  • Median= middle observation in data
  • The centre of mass of the data is the mean
  • Equation of the mean
    A) Add up all the data values
    B) Divide the sum by the number of data values
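
The two steps of the mean, together with the median and mode, can be sketched in Python as below. The `scores` values are made up for illustration, and the `statistics` module from the standard library is only one convenient way to check the hand calculation.

```python
import statistics

# Hypothetical data, chosen only for illustration.
scores = [2, 3, 3, 5, 7]

# Mean: A) add up all the values, B) divide by how many there are.
total = sum(scores)
mean = total / len(scores)

# Median: the middle observation once the data are sorted.
median = statistics.median(scores)

# Mode: the most common value.
mode = statistics.mode(scores)

# A bimodal data set has two most common values.
two_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, mode, two_modes)
```
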
  • Measures of spread (variability) include
    • Range
    • Inter-quartile range
    • Standard deviation
  • Standardised units of common measure
    A) Standard deviation
  • Standard deviation allows us to understand the spread and dispersion of our data, which helps in making predictions and drawing conclusions.
  • A small standard deviation means that the data points are close to the mean, indicating that the values in the dataset are consistent and similar to each other.
  • A large standard deviation tells us that the values in the dataset are more spread out and varied.
  • Standard deviation
    A) How far away/ spread the data is from the mean
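
As a rough illustration of small versus large standard deviation, the sketch below compares two made-up data sets that share the same mean. The names `consistent` and `varied` are hypothetical, and `statistics.stdev` is the sample standard deviation (it divides by N-1).

```python
import statistics

# Two hypothetical data sets with the same mean (10) but different spread.
consistent = [9, 10, 10, 10, 11]
varied = [2, 6, 10, 14, 18]

mean_consistent = statistics.mean(consistent)
mean_varied = statistics.mean(varied)

# Sample standard deviation: typical distance of the values from the mean.
sd_consistent = statistics.stdev(consistent)
sd_varied = statistics.stdev(varied)

print("means:", mean_consistent, mean_varied)
print("SDs:", round(sd_consistent, 2), round(sd_varied, 2))
```

Both sets centre on the same value, so only the standard deviation reveals that the second set is far less consistent.
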
  • A normal distribution symmetrically distributes data around the mean, forming a bell-shaped curve.
  • Larger sample size = the sample mean converges closer to the true population mean
  • The centre of a normal distribution (its mean) describes the average result found within the sample
  • The parameters of the standard normal distribution are?
    Mode, median, mean = 0 and SD = 1
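
Any data set can be put on this mean = 0, SD = 1 scale by converting each value to a z-score (the value minus the mean, divided by the standard deviation). A minimal sketch, assuming made-up `heights_cm` values and treating them as a whole population (hence `pstdev`):

```python
import statistics

# Hypothetical measurements, treated here as the whole population.
heights_cm = [150, 160, 170, 180, 190]

mean = statistics.mean(heights_cm)
sd = statistics.pstdev(heights_cm)  # population SD (divides by N)

# z-score: how many standard deviations each value lies from the mean.
z_scores = [(x - mean) / sd for x in heights_cm]

print([round(z, 2) for z in z_scores])
print(round(statistics.mean(z_scores), 6), round(statistics.pstdev(z_scores), 6))
```

Standardising changes the units, not the shape: the z-scores have mean 0 and SD 1, but they are only normally distributed if the original data were.
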
  • Standard deviation
    A) A larger standard deviation widens (flattens) the distribution
  • Residual errors are the differences between observed and predicted values in a model.
  • Extrapolation and prediction are used within linear regression models
  • Correlation describes the degree of association between two variables.
  • Linear regression is for making predictions about the value of one variable (dependent) based on the value of another (independent) in future data
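
Correlation, a fitted regression line, its residual errors and an extrapolated prediction can be sketched together as below. The `hours` and `score` data are invented, and the ordinary least-squares formulas are written out by hand rather than taken from a library.

```python
import math

# Hypothetical paired data: hours studied (independent), exam score (dependent).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(score) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, score))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in score)

# Correlation: degree of linear association, between -1 and 1.
r = sxy / math.sqrt(sxx * syy)

# Least-squares regression line: score = intercept + slope * hours.
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(x):
    """Predicted score for a given number of hours."""
    return intercept + slope * x


# Residual errors: observed value minus predicted value.
residuals = [y - predict(x) for x, y in zip(hours, score)]

# Extrapolation: predicting beyond the observed range of hours (1 to 5).
predicted_at_6 = predict(6)

print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted_at_6, 1))
```

Extrapolated predictions such as `predict(6)` assume the linear trend continues outside the observed data, which the data alone cannot confirm.
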
  • What does this calculation represent?
    A) Mean
  • The extreme bounds of the data relate to?
    The range
    A) Highest value minus lowest value
  • Focusing on the central part of the distribution relates to the interquartile range, the middle 50% of the data with 25% cut off at each end
    A) 25%
    B) 25%
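
The range and the interquartile range can be compared on a small made-up data set containing one extreme value. Quartile conventions differ between textbooks and software; this sketch simply uses the default method of `statistics.quantiles`, so other tools may give slightly different quartiles.

```python
import statistics

# Hypothetical data with one extreme value (40).
data = [2, 4, 4, 5, 7, 9, 10, 12, 13, 15, 40]

# Range: highest value minus lowest value (the extreme bounds).
data_range = max(data) - min(data)

# Quartiles cut the sorted data into four parts of 25% each.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print("range:", data_range)
print("Q1, Q3, IQR:", q1, q3, iqr)
```

The single extreme value stretches the range but leaves the interquartile range untouched, because the IQR ignores the lowest and highest 25%.
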
  • The Central Limit Theorem: as the sample size increases, the distribution of sample means approaches a normal distribution, regardless of the population's shape
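
A small simulation can make the Central Limit Theorem more concrete. This is only a sketch: the skewed `population`, the sample sizes (2 and 50) and the number of repeats are all arbitrary choices, and the "within one SD" check is a rough indicator of normality (about 68% for a normal distribution), not a formal test.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical, clearly non-normal (right-skewed) population.
population = [random.expovariate(1.0) for _ in range(20_000)]


def sample_means(sample_size, n_samples=1_000):
    """Draw n_samples random samples and return the mean of each one."""
    return [
        statistics.fmean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]


means_small = sample_means(2)
means_large = sample_means(50)

population_mean = statistics.fmean(population)
centre = statistics.fmean(means_large)
spread = statistics.stdev(means_large)

# Share of sample means lying within one SD of their own centre.
within_one_sd = sum(abs(m - centre) <= spread for m in means_large) / len(means_large)

print("population mean:", round(population_mean, 3))
print("mean of sample means (n=50):", round(centre, 3))
print("SD of sample means, n=2 vs n=50:",
      round(statistics.stdev(means_small), 3), round(spread, 3))
print("share within one SD (n=50):", round(within_one_sd, 3))
```

With larger samples the sample means bunch more tightly around the population mean, which also illustrates why a larger sample size brings the sample mean closer to the true value.
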
  • Variability and variance are both measures of how spread out a single variable's values are.
  • If the data is reliable but often wrong, we define it as high precision, high bias
  • If the data is unreliable and often wrong, we define it as low precision, high bias
  • If the data is reliable and mostly correct, we define it as high precision, low bias
  • If the data is unreliable but mostly correct, we define it as low precision, low bias
  • Precision describes the variability of the data; bias describes how far its central tendency sits from the true value
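
One way to picture precision and bias is to imagine two scales weighing the same known object. Everything below is hypothetical: the true weight, the readings and the names `scale_a` and `scale_b` are invented for illustration.

```python
import statistics

TRUE_WEIGHT_G = 100.0  # hypothetical known weight of the object

# Scale A: readings agree closely with each other but are all too high.
scale_a = [104.9, 105.0, 105.1, 105.0]
# Scale B: readings are scattered but centre on the true weight.
scale_b = [95.0, 105.0, 98.0, 102.0]


def bias(readings):
    """Bias: how far the mean reading sits from the true value."""
    return statistics.fmean(readings) - TRUE_WEIGHT_G


def spread(readings):
    """Precision is judged by variability: a smaller SD means higher precision."""
    return statistics.stdev(readings)


bias_a, sd_a = bias(scale_a), spread(scale_a)
bias_b, sd_b = bias(scale_b), spread(scale_b)

print("A (high precision, high bias):", round(bias_a, 2), round(sd_a, 2))
print("B (low precision, low bias):", round(bias_b, 2), round(sd_b, 2))
```
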
  • The sample mean can be defined as an unbiased estimate of the population mean
  • The sample standard deviation is often termed as a biased estimate of the population standard deviation as it tends to underestimate the true value, especially for small sample sizes.
  • We can reduce the bias in the sample standard deviation by increasing the sample size and by dividing by N-1 instead of N (Bessel's correction)
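
The effect of the N-1 correction can be checked by hand on a tiny made-up sample. The `sample` values are hypothetical; `statistics.pstdev` (divides by N) and `statistics.stdev` (divides by N-1) are used only to confirm the manual arithmetic.

```python
import math
import statistics

# Hypothetical small sample.
sample = [4, 8, 6, 2]

n = len(sample)
mean = sum(sample) / n

# Sum of squared deviations from the sample mean.
sum_sq = sum((x - mean) ** 2 for x in sample)

# Dividing by N tends to underestimate the population SD.
sd_n = math.sqrt(sum_sq / n)

# Dividing by N-1 gives a larger estimate.
sd_n_minus_1 = math.sqrt(sum_sq / (n - 1))

print(round(sd_n, 3), round(sd_n_minus_1, 3))
print(round(statistics.pstdev(sample), 3), round(statistics.stdev(sample), 3))
```

The gap between the two estimates shrinks as N grows, which is why the bias matters most for small samples.
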
  • When research only investigates an outcome variable, we often measure this through an observational design, in which no independent variable is manipulated
  • When we think there's an effect but there's actually not; what type of error is this?
    Type I error
  • When we think there's no effect but there's actually an effect; What type of error is this?
    Type II error
  • We think there's no effect (Retain Null) x There's actually no effect (Null is true) leads us to conclude a true negative
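
The four combinations of decision and reality can be laid out as a small lookup. The function name `classify_outcome` and its argument names are hypothetical, chosen only to mirror the cards above.

```python
def classify_outcome(reject_null: bool, effect_exists: bool) -> str:
    """Name the outcome of a hypothesis test given the decision and the truth."""
    if reject_null and effect_exists:
        return "true positive"
    if reject_null and not effect_exists:
        # We think there's an effect, but there actually isn't.
        return "Type I error"
    if not reject_null and effect_exists:
        # We think there's no effect, but there actually is.
        return "Type II error"
    # We retain the null and the null is true.
    return "true negative"


for decision in (True, False):
    for truth in (True, False):
        print(decision, truth, classify_outcome(decision, truth))
```
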