ECOL 425


  • There are 2 types of studies:
    • Experimental
      • the researcher assigns treatments
      • can introduce artefacts: bias in measurements produced by unintended consequences of procedures
    • Observational
      • the researcher has no influence over treatments
      • used for detecting large-scale patterns
  • Experiments examine the causal relationship between a predictor (x) and a response (y) variable
  • The strength of an experiment is that the effect of the predictor (x) can be isolated from the effects of confounding variables
  • Good experiments are designed "a priori" and:
    • reduce bias
    • reduce sampling error (the difference between the sample result and the population result)
  • Bias is reduced through:
    • control groups
      • kept under similar conditions to the treated sample, but given no treatment
    • randomization
      • of which individuals receive each treatment
      • cannot occur in observational studies
    • blinding
      • concealing information about the treatment assigned
      • Single-blind = subjects unaware
      • Double-blind = researcher + subjects unaware
  • Sampling error is reduced by:
    • replication
      • necessary because individuals are unique
      • increased sample sizes decrease error and provide more information
      • to decide sample size, predetermine the level of precision OR power required
      • pseudoreplication = measurements are not independent, but are treated as if they were
    • balance
      • equal sample sizes in treatments
    • blocking
      • reduces variance by dividing individuals into groups (blocks) and randomizing within each block
      • ensures each group is representative of the population
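Randomizing within blocks while keeping treatments balanced can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the block names, the individual IDs and the treatment labels are all hypothetical and do not come from the course material.

```python
import random


def randomize_within_blocks(blocks, treatments, seed=0):
    """Assign treatments at random within each block (randomized block design).

    blocks: dict mapping a block name to the list of individual IDs in it.
    treatments: list of treatment labels. Each block size must be a multiple
    of len(treatments) so that every block stays balanced.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    assignment = {}
    for block, individuals in blocks.items():
        if len(individuals) % len(treatments) != 0:
            raise ValueError(f"block {block!r} cannot be balanced")
        reps = len(individuals) // len(treatments)
        labels = list(treatments) * reps  # equal numbers of each treatment
        rng.shuffle(labels)               # randomization happens inside the block
        for individual, label in zip(individuals, labels):
            assignment[individual] = (block, label)
    return assignment


# Hypothetical example: two blocks of four plots, two treatments
blocks = {
    "north": ["n1", "n2", "n3", "n4"],
    "south": ["s1", "s2", "s3", "s4"],
}
assignment = randomize_within_blocks(blocks, ["control", "fertilised"])
```

Every block ends up with the same number of individuals per treatment, so block differences cannot be confounded with the treatment effect.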
  • Confounding variables introduce bias.
    Can be minimised by:
    • pairing individuals with a control of similar characteristics
    • adjustment - categorising data based on the confounding variable and analysing the relationship between x and y within each category
  • ANOVA compares group means by comparing the variation among groups with the variation within groups.
    • Treatment means are fitted and the distance of each observation from its treatment mean is analysed
    • Variation within groups (SSE) is compared with variation among groups (SSA)
    • If the among-group mean square (MSA) is large relative to the residual mean square (MSE), i.e. F = MSA / MSE is large, the means are different
  • ANOVA is used when explanatory variables (x) are categorical
  • SSA measures the among group variation and has a df of k-1
  • SSE measures the within group variation and has a df of N-k
  • The total variation in ANOVA has a df of N-1
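As a worked illustration of the partition described above, the following sketch computes SSA, SSE, their degrees of freedom and the F ratio for a one-way ANOVA. The function name and the three small groups of numbers are made up for the example.

```python
def one_way_anova(groups):
    """Partition variation into among-group (SSA) and within-group (SSE) parts."""
    k = len(groups)                                # number of groups
    N = sum(len(g) for g in groups)                # total number of observations
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]

    # Among groups: distance of each group mean from the grand mean
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: distance of each observation from its own group mean
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_among, df_within = k - 1, N - k
    msa, mse = ssa / df_among, sse / df_within
    return {
        "SSA": ssa, "SSE": sse, "SST": ssa + sse,
        "df_among": df_among, "df_within": df_within, "df_total": N - 1,
        "F": msa / mse,
    }


# Hypothetical data: three treatments, three replicates each
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

For these numbers SSA = 26 on 2 df and SSE = 6 on 6 df, giving F = 13; the two df add up to the total df of N - 1 = 8.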
  • Assumptions of ANOVA:
    • Random sampling
    • Equal variance
    • Independence of errors
    • Normal distribution of errors
  • Factorial ANOVA tests the effects of 2+ factors and their interaction on a response (y) variable.
    • Reduces Type I error and accounts for variation from crossing variables
  • Factorial ANOVA compares variance of each effect to error variance using the mean square
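  • Factorial ANOVA table (factor A has a levels, factor B has b levels, n replicates per cell, N observations in total):
    Source                       SS      df                F
    All treatment combinations           ab - 1
    A                            SSA     a - 1             MSA / MSE
    B                            SSB     b - 1             MSB / MSE
    A x B interaction            SSAB    (a - 1)(b - 1)    MSAB / MSE
    Error                        SSE     ab(n - 1)
    Total                                N - 1
    Each mean square is MS = SS / df (e.g. MSA = SSA / df)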
  • An interaction means that the effect of one factor on a response variable (y) is not constant and depends on the other factor
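The factorial partition can be sketched for a balanced two-factor design as follows. This assumes equal replication in every cell; the function name and the 2 x 2 data set are invented for illustration.

```python
def factorial_anova(cells):
    """Two-way ANOVA for a balanced design.

    cells[i][j] holds the n replicate responses for level i of factor A
    and level j of factor B.
    """
    def mean(values):
        return sum(values) / len(values)

    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    N = a * b * n

    grand = mean([y for row in cells for cell in row for y in cell])
    a_means = [mean([y for cell in row for y in cell]) for row in cells]
    b_means = [mean([y for row in cells for y in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ssa = b * n * sum((m - grand) ** 2 for m in a_means)
    ssb = a * n * sum((m - grand) ** 2 for m in b_means)
    # Interaction: what is left of each cell mean after both main effects
    ssab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    sse = sum(
        (y - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )

    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1),
          "error": a * b * (n - 1), "total": N - 1}
    mse = sse / df["error"]
    f = {"A": (ssa / df["A"]) / mse,
         "B": (ssb / df["B"]) / mse,
         "AB": (ssab / df["AB"]) / mse}
    return {"SS": {"A": ssa, "B": ssb, "AB": ssab, "error": sse}, "df": df, "F": f}


# Hypothetical 2 x 2 design with 2 replicates per cell
table = factorial_anova([
    [[1, 3], [3, 5]],     # A1 at B1, A1 at B2
    [[5, 7], [11, 13]],   # A2 at B1, A2 at B2
])
```

Here the effect of B is 2 units at A1 but 6 units at A2, which is the kind of non-constant effect that the interaction term (SSAB) picks up.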
  • A contrast is an interpretation of a significant Multi-way ANOVA result.
    • compares groups of means (single df comparisons)
  • Contrast significance is judged by an F-test:
    $F = \frac{SS_{contrast}}{MS_{error}}$, with 1 df in the numerator and $k(n-1)$ df in the denominator
    • "a priori" means "before"
    • "a posteriori" means "after the fact"
  • There can only be k-1 orthogonal contrasts.
    • statistically independent comparisons = each comparison is made only once
    • two contrasts are orthogonal when the products of their corresponding coefficients sum to 0
  • Contrast Coefficient: numerical description of hypothesis tested.
    • grouped levels get same sign
    • contrasting levels get opposite sign
    • Excluded levels get 0
    • All coefficients in contrast must sum to 0
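The coefficient rules above can be checked mechanically. The sketch below assumes equal sample sizes in every level, and the four-level example (one control plus three treatments) is hypothetical.

```python
def is_valid_contrast(coefficients):
    """A contrast is valid when its coefficients sum to zero."""
    return sum(coefficients) == 0


def are_orthogonal(c1, c2):
    """Two contrasts are orthogonal when the sum of the products of
    their corresponding coefficients is zero (equal sample sizes assumed)."""
    return sum(x * y for x, y in zip(c1, c2)) == 0


# Hypothetical factor with k = 4 levels: control, t1, t2, t3
control_vs_treatments = [3, -1, -1, -1]  # control against the three grouped treatments
t1_vs_t2_t3 = [0, 2, -1, -1]             # control excluded (0), t1 against t2 and t3
t2_vs_t3 = [0, 0, 1, -1]                 # only t2 against t3

contrasts = [control_vs_treatments, t1_vs_t2_t3, t2_vs_t3]
all_valid = all(is_valid_contrast(c) for c in contrasts)
all_orthogonal = all(
    are_orthogonal(contrasts[i], contrasts[j])
    for i in range(len(contrasts)) for j in range(i + 1, len(contrasts))
)

# A valid contrast that is NOT orthogonal to the first one
control_vs_t1 = [1, -1, 0, 0]
clash = are_orthogonal(control_vs_treatments, control_vs_t1)
```

With k = 4 levels the set above contains k - 1 = 3 mutually orthogonal contrasts; adding `control_vs_t1` would reuse a comparison already made.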
  • Fixed effects influence the mean of y
  • Random effects influence the variance of y
    • Include numeric and factor levels
  • Nested sampling reduces random effects by accounting for variation contributed by each factor.
    Used for:
    • studies conducted at different spatial scales
    • repeated measurements from same individual
  • Split-plot analysis reduces fixed effects by splitting a sample into plots of different sizes and applying different treatments.
    • Each plot has own error variance
    • Ordered from largest plot with lowest replication to smallest plot with high replication
    • Error term = error(largest/medium/smallest plot)
  • Difference between PCA and RDA:
    • PCA
      • used for variable reduction and data visualisation
      • only x variables
      • unconstrained ordination analysis
      • captures overall data variation
      • produces PCs
    • RDA
      • used for regression analysis and relationship exploration
      • x and y variables
      • constrained ordination analysis
      • explains variation in y by looking at variation in x
      • includes a significance test
  • Similarities between PCA and RDA:
    • Use loading systems
    • multi-variate
    • useful for large datasets
  • Linear regression is a measure of how steeply the response (y) variable changes with a change in explanatory variable (x)
    • uses least squares regression (line of best fit)
    • both variables are continuous
    • applied mostly in observational studies
  • Maximum likelihood estimates parameters by choosing the values that maximise the probability of observing the data
  • Regression line is calculated by: $Y = a + bX$
  • The regression slope is calculated by:
    $b = \frac{\sum (X_i - \overline{X})(Y_i - \overline{Y})}{\sum (X_i - \overline{X})^{2}}$
  • The least squares regression line always goes through the means of x and y
    $a = \overline{Y} - b\overline{X}$
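The slope and intercept formulas translate directly into code. This is a small sketch with made-up data; the function name is an assumption of the example.

```python
def least_squares_line(x, y):
    """Return (a, b) for the least squares line Y = a + bX."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    # Slope: sum of cross-products over sum of squares of X
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    # Intercept chosen so the line passes through (x_bar, y_bar)
    a = y_bar - b * x_bar
    return a, b


# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares_line(x, y)
y_at_mean_x = a + b * (sum(x) / len(x))
```

For these values the slope is 0.6 and the intercept 2.2, and the fitted value at the mean of x equals the mean of y (4).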
  • The assumptions of linear regression:
    1. at each X, the mean of Y lies on the regression line
    2. at each X, the distribution of Y is normal
    3. at each X, Y variance is the same
    4. at each X, Y is a random sample from all possible Ys
  • Variance of residuals (MSresidual) quantifies the spread of the scatter above and below a regression line
    • df = n - 2, because two parameters (slope and intercept) are estimated
  • Outliers create non-normal distributions and affect estimates
    • Can affect slope calculations
    • Can cause equal variance assumption violations
  • Residual plots can be used to check the normality and equal variance of the residuals
  • Uncertainty in the estimate of the slope is measured with
    $SE_b = \sqrt{\frac{MS_{residual}}{\sum (X_i - \overline{X})^2}}$
  • Hypothesis testing with regression is used to evaluate whether the slope is equal to the null slope, $\beta_0$.
    • under the null, the df = n - 2
    • $t = \frac{b - \beta_0}{SE_b}$
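The standard error of the slope and the t statistic can be computed from the residuals, as in this sketch. The data and the function name are hypothetical, and the null slope is taken to be 0 by default.

```python
import math


def slope_t_test(x, y, beta_0=0.0):
    """Return (b, SE_b, t, df) for testing the slope against beta_0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar

    # Residual mean square: scatter around the fitted line on n - 2 df
    df = n - 2
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ms_residual = ss_residual / df

    se_b = math.sqrt(ms_residual / sxx)
    t = (b - beta_0) / se_b
    return b, se_b, t, df


# Hypothetical observations
slope, se_slope, t_stat, df_resid = slope_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

The resulting t would then be compared with a t distribution on n - 2 df; looking up the P-value is left out to keep the sketch free of external libraries.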
  • Regression takes the deviation between an observation $Y_i$ and the mean $\overline{Y}$ and breaks it into a:
    • Residual component: $Y_i - \hat{Y}_i$
    • Regression component: $\hat{Y}_i - \overline{Y}$
    • if H0 is true, both mean squares will be approximately equal
    • if H0 is not true, regression MS > residual MS
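The split into regression and residual components can be verified numerically. The following sketch uses invented data and a hypothetical function name, and compares the two mean squares as an F ratio.

```python
def regression_partition(x, y):
    """Split total variation in Y into regression and residual parts."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    fitted = [a + b * xi for xi in x]

    ss_regression = sum((fi - y_bar) ** 2 for fi in fitted)          # fitted minus mean
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # observed minus fitted
    ss_total = sum((yi - y_bar) ** 2 for yi in y)

    ms_regression = ss_regression / 1      # one df for the slope
    ms_residual = ss_residual / (n - 2)
    return {
        "SS_regression": ss_regression,
        "SS_residual": ss_residual,
        "SS_total": ss_total,
        "F": ms_regression / ms_residual,
    }


# Hypothetical observations
parts = regression_partition([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For these numbers the regression and residual sums of squares (3.6 and 2.4) add up to the total (6), and F = 4.5, which is the square of the t statistic for the slope computed from the same data.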