C7 Correlation

Cards (21)

  • Bivariate analysis is concerned with the examination of two variables simultaneously.
  • You cannot tell whether there is an association between two variables by examining the two frequency distributions, means, or variances. You must employ bivariate methods.
  • The correlation coefficient is a bivariate statistic that measures the degree of linear association between two quantitative variables.
  • Pearson's product-moment correlation coefficient (sometimes known as PPMCC or PCC) is a measure of the linear relationship between two variables that have been measured on interval or ratio scales. It can be computed for any such pair of variables, but the usual significance tests for it assume that both variables are approximately normally distributed.
  • Paired data. Two sets of observations are paired if each observation in one set has a special correspondence or connection with exactly one observation in the other data set. To analyze paired data, it is often useful to look at the difference in outcomes of each pair of observations.
  • A discrete unit of information is called a data point. Any single fact is a data point, broadly speaking. A data point can be quantitatively or graphically represented and is typically produced from a measurement or research in a statistical or analytical context.
  • Association is a statistical relationship between two variables. When two variables are associated, knowing the value of one tells you something about the value of the other. Such variables or measured quantities are said to be dependent.
  • A scatterplot reveals the presence of association between two variables. The stronger the linear relationship between two variables, the more the data points cluster along an imaginary straight line.
  • A scatterplot also will indicate the direction of the relationship.
  • In a positive (direct) association, the data points (and the ellipse that encloses them) go from the lower left corner to the upper right.
  • In a negative (inverse) association, the data points go from the upper left corner to the lower right.
  • The direction of a relationship is independent of its strength.
  • An outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement, an indication of novel data, or the result of experimental error; outliers of the last kind are sometimes excluded from the data set.
  • A relationship is said to be linear if a straight line accurately represents the constellation of data points.
  • A curvilinear relationship in statistics refers to a type of relationship between two variables where the change in one variable is associated with a non-linear change in another variable.
  • In mathematics and statistics, covariance is a measure of the relationship between two random variables. It evaluates the extent to which the variables change together. In other words, it is essentially a measure of how two variables vary jointly: it is positive when they tend to move in the same direction and negative when they tend to move in opposite directions.
  • Covariance formula: Cov = Σ(X − X̄)(Y − Ȳ) / (n − 1)
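  • Pearson r (defining formula): r = [Σ(X − X̄)(Y − Ȳ) / n] / (Sₓ Sᵧ) = Cov / (Sₓ Sᵧ). That is, r is simply the covariance divided by the product of the two standard deviations. The covariance and both standard deviations must use the same divisor (n here; n − 1 works equally well, because the divisor cancels).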
  • The value of r ranges from −1.00 to +1.00 (its magnitude from 0 to 1.00), regardless of the scales of the two variables.
  • When no linear relationship exists, r = 0; when a perfect linear relationship exists, r = +1.00 or −1.00; and intermediate degrees of association fall between these extremes.
  • A bivariate distribution is represented most completely by a scatter diagram (scatterplot).
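
The covariance and Pearson r cards above can be checked by hand with a few lines of Python. This is only a minimal sketch: the function names and the `hours`/`scores` data are made up for illustration, and the divisor n is assumed throughout (using n − 1 everywhere would give the same r).

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Covariance of paired data, using the divisor n (assumed here)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products of deviations from each mean
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def std_dev(values):
    """Standard deviation with the same divisor n as covariance()."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))


# Hypothetical paired data: hours studied and quiz score for five students
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)

# Rescaling a variable (hours to minutes) changes the covariance but not r
minutes = [h * 60 for h in hours]
r_rescaled = pearson_r(minutes, scores)

# A perfect inverse linear relationship gives r = -1
r_inverse = pearson_r(hours, [10 - h for h in hours])

print(round(r, 4), round(r_rescaled, 4), round(r_inverse, 4))
```

For this made-up data the covariance is 1.2 and r is about 0.77, a fairly strong positive association. Converting hours to minutes multiplies the covariance by 60 but leaves r unchanged, which is why r, unlike the covariance, can be compared across variables measured on different scales.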