DESCRIPTIVE STATISTICS

Cards (33)

  • Skewness is an important measure of the shape of a distribution
  • Symmetric distributions have a skewness of zero, where the mean and median are equal
  • Moderately skewed left distributions have negative skewness, with the mean usually less than the median
  • Moderately skewed right distributions have positive skewness, with the mean usually greater than the median
  • Highly skewed right distributions have positive skewness, often above 1.0, with the mean usually greater than the median
  • Z-scores, also known as standardized values, indicate the number of standard deviations a data value is from the mean
  • An outlier is an unusually small or large value in a data set, often identified by z-scores less than -3 or greater than +3
  • The Empirical Rule applies to normally distributed data: approximately 68%, 95%, and 99.7% of values fall within one, two, and three standard deviations of the mean, respectively
  • The mode is the most frequently occurring score in a distribution.
  • The range is the difference between the highest and lowest scores in a distribution.
  • BOXPLOT A graphical summary of data based on a five-number summary.
  • CHEBYSHEV'S THEOREM A theorem that can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean.
  • COEFFICIENT OF VARIATION A measure of relative variability computed by dividing the standard deviation by the mean and multiplying by 100.
  • CORRELATION COEFFICIENT A measure of linear association between two variables that takes on values between -1 and +1. Values near +1 indicate a strong positive linear relationship; values near -1 indicate a strong negative linear relationship; and values near zero indicate the lack of a linear relationship.
  • COVARIANCE A measure of linear association between two variables. Positive values indicate a positive relationship; negative values indicate a negative relationship.
  • EMPIRICAL RULE A rule that can be used to compute the percentage of data values that must be within one, two, and three standard deviations of the mean for data that exhibit a bell-shaped distribution.
  • FIVE-NUMBER SUMMARY A technique that uses five numbers to summarize the data: smallest value, first quartile, median, third quartile, and largest value.
  • GEOMETRIC MEAN A measure of location that is calculated by finding the nth root of the product of n values.
  • INTERQUARTILE RANGE (IQR) A measure of variability, defined as the difference between the third and first quartiles.
  • MODE A measure of location, defined as the value that occurs with greatest frequency.
  • PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT A measure of the linear relationship between two variables.
  • PERCENTILE A value that provides information about how the data are spread over the interval from the smallest to the largest value.
  • MEAN A measure of central location computed by summing the data values and dividing by the number of observations.
  • MEDIAN A measure of central location provided by the value in the middle when the data are arranged in ascending order.
  • POINT ESTIMATOR A sample statistic, such as x̄, s², and s, used to estimate the corresponding population parameter.
  • POPULATION PARAMETER A numerical value used as a summary measure for a population (e.g., the population mean, μ, the population variance, σ², and the population standard deviation, σ).
  • pth PERCENTILE For a data set containing n observations, the pth percentile divides the data into two parts: approximately p% of the observations are less than the pth percentile and approximately (100 - p)% of the observations are greater than the pth percentile.
  • QUARTILES The 25th, 50th, and 75th percentiles, referred to as the first quartile, the second quartile (median), and third quartile, respectively. The quartiles can be used to divide a data set into four parts, with each part containing approximately 25% of the data.
  • RANGE A measure of variability, defined to be the largest value minus the smallest value.
  • SAMPLE STATISTIC A numerical value used as a summary measure for a sample (e.g., the sample mean, x̄, the sample variance, s², and the sample standard deviation, s).
  • WEIGHTED MEAN The mean obtained by assigning each observation a weight that reflects its importance.
  • STANDARD DEVIATION A measure of variability computed by taking the positive square root of the variance.
  • VARIANCE A measure of variability based on the squared deviations of the data values about the mean.
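
Several of the measures defined on these cards can be tried out on a small data set. The sketch below uses Python's standard `statistics` module. The values in `data`, `y`, and `weights` are made up purely for illustration, and the skewness line assumes the adjusted sample formula n / ((n - 1)(n - 2)) × Σ z³, which is one common convention rather than the only one.

```python
import math
import statistics as st

# Hypothetical sample values, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]
n = len(data)

# Measures of location
mean = st.mean(data)                  # sum of the values / number of observations
median = st.median(data)              # middle value of the sorted data
mode = st.mode(data)                  # value with the greatest frequency
geo_mean = st.geometric_mean(data)    # nth root of the product of the n values

# Weighted mean: each observation gets a (hypothetical) importance weight.
weights = [1, 1, 1, 1, 1, 1, 2]
weighted_mean = sum(w * x for w, x in zip(weights, data)) / sum(weights)

# Measures of variability
data_range = max(data) - min(data)    # largest value minus smallest value
variance = st.variance(data)          # sample variance (divides by n - 1)
std_dev = st.stdev(data)              # positive square root of the variance
cv = std_dev / mean * 100             # coefficient of variation, in percent

# Quartiles, IQR, and the five-number summary.
# statistics.quantiles defaults to the "exclusive" method; other
# software may place the quartiles slightly differently.
q1, q2, q3 = st.quantiles(data, n=4)
iqr = q3 - q1
five_number = (min(data), q1, q2, q3, max(data))

# z-scores and the |z| > 3 rule of thumb for outliers
z_scores = [(x - mean) / std_dev for x in data]
outliers = [x for x, z in zip(data, z_scores) if abs(z) > 3]

# Adjusted sample skewness (assumed convention, see the note above)
skewness = n / ((n - 1) * (n - 2)) * sum(z ** 3 for z in z_scores)

# Sample covariance and Pearson correlation with a second
# hypothetical variable y, computed directly from the definitions.
y = [1, 3, 2, 5, 6, 8, 10]
y_mean = st.mean(y)
covariance = sum((a - mean) * (b - y_mean) for a, b in zip(data, y)) / (n - 1)
correlation = covariance / (std_dev * st.stdev(y))


def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


print("mean, median, mode:", mean, median, mode)
print("five-number summary:", five_number)
print("skewness:", round(skewness, 3))
print("correlation:", round(correlation, 3))
print("Chebyshev, k = 2:", chebyshev_bound(2))
```

For this sample the mean (6) is greater than the median (5) and the skewness is positive, which matches the cards on right-skewed distributions. No value has a z-score beyond ±3, so the outlier list is empty.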