ECOL 425
Created by Iga M
There are 2 types of studies:
- Experimental: the researcher assigns treatments. Can introduce artefacts (bias in measurements produced by unintended consequences of procedures).
- Observational: no influence over treatments. Used for detecting large-scale patterns.
Experiments examine the causal relationship between a predictor (x) and response (y) variable.
Strength is the effect of the predictor (x) when isolated from the effects of confounding variables.
Good experiments are designed "a priori" and:
- reduce bias
- reduce sampling error (the difference between the sample result and the population result)
Bias is reduced through:
- control groups: same conditions as the sample, but no treatment
- randomization of individuals receiving treatment (cannot occur in observational studies; see the sketch after this card)
- blinding: concealing information about the treatment assigned
  - Single-blind = subjects unaware
  - Double-blind = researcher + subjects unaware
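A minimal sketch of randomized treatment assignment, with made-up subject IDs and treatment labels (illustration only, not a prescribed protocol):

```python
import random

# Hypothetical subjects and treatments (illustration only).
subjects = [f"subject_{i:02d}" for i in range(1, 13)]
treatments = ["control", "treatment"]

random.seed(425)  # fixed seed so the assignment is reproducible
random.shuffle(subjects)
# Alternate treatments down the shuffled list so group sizes stay balanced.
assignment = {s: treatments[i % 2] for i, s in enumerate(subjects)}

for s, t in sorted(assignment.items()):
    print(s, "->", t)
```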
Sampling error is reduced by:
- replication: necessary because individuals are unique
  - increased sample sizes decrease error and provide more information
  - to decide sample size: predetermine a level of precision OR power
  - pseudoreplication = measurements are not independent, but are recorded as if they were
- balance: equal sample sizes in treatments
- blocking: reduces variance by dividing individuals into groups and randomizing within a block (see the sketch after this card)
  - ensures each group is representative of the population
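A minimal sketch of randomizing within blocks, with made-up block and individual labels (illustration only):

```python
import random

# Hypothetical design: 3 blocks of 4 individuals, 2 treatments (illustration only).
blocks = {f"block_{b}": [f"ind_{b}{i}" for i in range(1, 5)] for b in "ABC"}
treatments = ["control", "treatment"]

random.seed(1)
for block, individuals in blocks.items():
    random.shuffle(individuals)  # randomize order within the block only
    for i, ind in enumerate(individuals):
        # alternate treatments within each block -> balanced within every block
        print(block, ind, "->", treatments[i % 2])
```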
Confounding variables introduce bias. Can be minimised by:
- pairing individuals with a control of similar characteristics
- adjustment: categorising data based on the confounding variable and analysing the relationship between x and y within each category
ANOVA compares group means by comparing variance between groups.
- Treatment means are fitted and each observation's distance from its treatment mean is analysed.
- Variation within groups is compared with variation among groups.
If the residual variation (SSE) is small relative to the among-treatment variation (SSA), the means are different.
ANOVA is used when explanatory variables (x) are categorical.
SSA measures the among-group variation and has df = k - 1.
SSE measures the within-group variation and has df = N - k.
The total variation in ANOVA has df = N - 1.
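A minimal sketch of these sums of squares and degrees of freedom, computed by hand with NumPy on made-up data (the group values are assumptions for illustration):

```python
import numpy as np

# Made-up data: k = 3 groups of 4 observations each (illustration only)
groups = [np.array([4.1, 3.9, 4.5, 4.2]),
          np.array([5.0, 5.3, 4.8, 5.1]),
          np.array([3.2, 3.5, 3.0, 3.4])]

k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

# SSA: among-group variation, df = k - 1
SSA = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSE: within-group variation, df = N - k
SSE = sum(((g - g.mean()) ** 2).sum() for g in groups)

MSA, MSE = SSA / (k - 1), SSE / (N - k)
F = MSA / MSE
print(f"SSA={SSA:.3f} (df={k - 1}), SSE={SSE:.3f} (df={N - k}), F={F:.2f}")
```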
Assumptions of ANOVA:
- Random sampling
- Equal variance
- Independence of errors
- Normal distribution of errors
Factorial ANOVA tests the effects of 2+ factors and their interaction on a response (y) variable.
- Reduces Type I error and accounts for variation from crossing variables.
Factorial ANOVA compares the variance of each effect to the error variance using the mean square.
Factorial ANOVA table (factor A with a levels, factor B with b levels, n replicates per cell):
Source      | SS   | df             | MS        | F
Factor A    | SSA  | a - 1          | SSA / df  | MSA / MSE
Factor B    | SSB  | b - 1          | SSB / df  | MSB / MSE
Interaction | SSAB | (a - 1)(b - 1) | SSAB / df | MSAB / MSE
Error       | SSE  | ab(n - 1)      | SSE / df  |
Total       |      | N - 1          |           |
(All treatment combinations together have df = ab - 1.)
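A minimal sketch of a factorial ANOVA in Python using statsmodels (the data frame, factor names, and effect sizes are made up for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Made-up balanced design: factor A (2 levels), factor B (3 levels), n = 4 per cell
rng = np.random.default_rng(0)
a_levels, b_levels, n = ["a1", "a2"], ["b1", "b2", "b3"], 4
rows = [(a, b, rng.normal(loc=(a == "a2") + (b == "b3"), scale=1.0))
        for a in a_levels for b in b_levels for _ in range(n)]
df = pd.DataFrame(rows, columns=["A", "B", "y"])

# Fit the factorial model y ~ A + B + A:B and build the ANOVA table
model = smf.ols("y ~ C(A) * C(B)", data=df).fit()
print(anova_lm(model))  # SS, df, and mean-square-based F for each effect
```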
An interaction means that the effect of one factor on a response variable (y) is not constant and depends on the other factor.
A contrast is an interpretation of a significant multi-way ANOVA result.
- compares groups of means (single-df comparisons)
Contrast significance is judged by an F-test:
F = \frac{SS_{contrast}}{k(n-1)}
"
a priori
" means "before"
"
a posteriori
" means "after the fact"
There can only be k - 1 orthogonal contrasts.
- statistically independent comparisons = each comparison is made only once
- two contrasts are orthogonal when the sum of the products of their coefficients = 0
Contrast coefficients: a numerical description of the hypothesis tested (see the sketch after this card). Rules:
- grouped levels get the same sign
- contrasting levels get opposite signs
- excluded levels get 0
- all coefficients in a contrast must sum to 0
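A minimal sketch checking these rules with NumPy; the contrasts are made-up examples over k = 4 treatment levels, giving the maximum of k - 1 = 3 orthogonal contrasts:

```python
import numpy as np

# Made-up contrasts over k = 4 levels (illustration only):
# c1: levels 1+2 vs levels 3+4; c2: level 1 vs level 2; c3: level 3 vs level 4
c1 = np.array([1, 1, -1, -1])
c2 = np.array([1, -1, 0, 0])
c3 = np.array([0, 0, 1, -1])

for c in (c1, c2, c3):
    assert c.sum() == 0           # coefficients in each contrast sum to 0
print(c1 @ c2, c1 @ c3, c2 @ c3)  # 0 0 0 -> every pair is orthogonal
```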
Fixed effects influence the mean of y; random effects influence the variance of y.
- Models can include both numeric variables and factor levels.
Nested sampling reduces random effects by accounting for the variation contributed by each factor. Used for:
- studies conducted at different spatial scales
- repeated measurements from the same individual
Split-plot analysis reduces fixed effects by splitting a sample into plots of different sizes and applying different treatments.
- Each plot has its own error variance.
- Ordered from the largest plot with the lowest replication to the smallest plot with the highest replication.
- Error term = error(largest / medium / smallest plot)
Differences between PCA and RDA:
- Purpose: PCA = variable reduction and data visualisation; RDA = regression analysis and relationship exploration
- Variables used: PCA = only x variables; RDA = x and y variables
- Ordination type: PCA = unconstrained; RDA = constrained ordination analysis
- What it does: PCA captures overall data variation; RDA explains variation in y by looking at variation in x
- Output: PCA = PCs; RDA = significance test
Similarities between PCA and RDA:
- use loading systems
- multivariate
- useful for large datasets
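A minimal PCA sketch with scikit-learn on made-up multivariate data (the dataset and the choice of two components are assumptions for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up dataset: 50 samples x 5 correlated variables (illustration only)
rng = np.random.default_rng(42)
base = rng.normal(size=(50, 2))
X = base @ rng.normal(size=(2, 5)) + rng.normal(scale=0.1, size=(50, 5))

pca = PCA(n_components=2)
scores = pca.fit_transform(X)         # sample positions on the PCs
print(pca.explained_variance_ratio_)  # share of overall variation captured
print(pca.components_)                # loadings: each variable's contribution to each PC
```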
Linear regression is a measure of how steeply the response (y) variable changes with a change in the explanatory variable (x).
- uses least squares regression (line of best fit)
- both variables are continuous
- applied mostly in observational studies
Maximum likelihood is applied for parameter estimation: parameters are chosen to maximise the probability of the observed data appearing.
Regression line is calculated by:
Y = a + bX
The regression slope is calculated by:
b = \frac{\sum (X_i - \overline{X})(Y_i - \overline{Y})}{\sum (X_i - \overline{X})^{2}}
The least squares regression line always goes through the means of x and y:
a = \overline{Y} - b\overline{X}
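A minimal sketch computing b and a directly from these formulas with NumPy (the X and Y values are made up for illustration):

```python
import numpy as np

# Made-up continuous data (illustration only)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

# b = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
# a = Ybar - b * Xbar, so the line passes through (Xbar, Ybar)
a = Y.mean() - b * X.mean()
print(f"Y = {a:.3f} + {b:.3f} X")
```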
The assumptions of linear regression:
- at each X, the mean of Y lies on the regression line
- at each X, the distribution of Y is normal
- at each X, the variance of Y is the same
- at each X, Y is a random sample from all possible Ys
Variance of the residuals (MS_residual) quantifies the spread of the scatter above and below the regression line.
- df = n - 2, because two parameters are estimated: the slope and the intercept
Outliers create non-normal distributions and affect estimates.
- Can affect slope calculations.
- Can cause violations of the equal-variance assumption.
Residual plots can show the normality and variance of the data.
Uncertainty of the estimate of the slope is measured with:
SE_b = \sqrt{\frac{MS_{residual}}{\sum (X_i - \overline{X})^2}}
Hypothesis testing with regression is used to evaluate whether the slope is equal to a null slope, β0.
- under the null, df = n - 2
t = \frac{b - \beta_0}{SE_b}
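A minimal sketch of the slope standard error and t-test, continuing the made-up X, Y data above (β0 = 0 is assumed as the null slope):

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
n = len(X)

b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()
residuals = Y - (a + b * X)
MS_residual = np.sum(residuals ** 2) / (n - 2)  # df = n - 2

SE_b = np.sqrt(MS_residual / np.sum((X - X.mean()) ** 2))
t = (b - 0) / SE_b                     # null: beta_0 = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)   # two-sided p-value
print(f"b={b:.3f}, SE_b={SE_b:.3f}, t={t:.2f}, p={p:.4f}")
```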
Regression takes the deviation between an observation Y and the mean Y and breaks it into a:
- Residual component: Y_i - \hat{Y}
- Regression component: \hat{Y} - \overline{Y}
- if H0 is true, both MS will be equal
- if H0 is not true, regression MS > residual MS
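A minimal sketch of this decomposition and the resulting F-ratio, again on the made-up X, Y data (illustration only):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
n = len(X)

b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()
Y_hat = a + b * X

SS_regression = np.sum((Y_hat - Y.mean()) ** 2)  # regression component
SS_residual = np.sum((Y - Y_hat) ** 2)           # residual component
MS_regression = SS_regression / 1                # regression df = 1
MS_residual = SS_residual / (n - 2)

F = MS_regression / MS_residual                  # large F -> reject H0
print(f"MS_regression={MS_regression:.3f}, MS_residual={MS_residual:.3f}, F={F:.2f}")
```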