How to be Scientifically lit
Statistics and Data Visualisation
Week 5
Created by shannon reilly
Cards (15)
What is the line of best fit defined by?
y = ax + b
What are the fitted values?
The prediction for y based on the observed value of x.
What are the residuals?
The vertical difference from the regression line to the recorded data point (observed y minus fitted y).
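The three cards above (line of best fit, fitted values, residuals) can be tied together in a short sketch. The data values and the helper name `fit_line` are hypothetical, invented only for illustration, and the fit uses the ordinary least-squares formulas, which is an assumption about how the line of best fit is obtained.

```python
# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx            # gradient
    b = mean_y - a * mean_x  # intercept
    return a, b


a, b = fit_line(xs, ys)

# Fitted values: the line's prediction at each observed x.
fitted = [a * x + b for x in xs]

# Residuals: observed y minus fitted y (the vertical distance to the line).
residuals = [y - f for y, f in zip(ys, fitted)]

print("gradient:", round(a, 3), "intercept:", round(b, 3))
print("residuals:", [round(r, 3) for r in residuals])
```

With a least-squares fit that includes an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the calculation.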
When can the outcome be misleading?
When the data don't have a linear relationship or contain anomalies.
When is a line of best fit not useful?
Typically, the more complex a graph is, the less useful a line of best fit is.
What are the 5 steps in linear regression?
1. Plot the data
2. Consider assumptions
3. Fit the regression
4. Diagnostic plots
5. Plot the results with uncertainties
Step 1 in more detail
In regression studies, it's important to consider third variables that may affect both the explanatory and the response variables.
Step 2 in more detail
Linearity of expected value, constant variance, independence, normally distributed residuals.
What is the correlation coefficient?
R. It lies between -1 and 1.
1 means a perfect correlation with a positive gradient and -1 means a perfect correlation with a negative gradient.
If y = 2x, what is the correlation between x and y?
1
If y = -0.1x, what is the correlation between x and y?
-1
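The two cards above can be checked numerically. This is a minimal sketch using the standard Pearson formula; the function name `pearson_r` and the x values are hypothetical choices for illustration. The point is that only the sign of the gradient matters for a perfectly linear relationship, not its size.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient R between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical x values, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]

r_pos = pearson_r(xs, [2 * x for x in xs])     # y = 2x: steep positive gradient
r_neg = pearson_r(xs, [-0.1 * x for x in xs])  # y = -0.1x: shallow negative gradient

print(round(r_pos, 6), round(r_neg, 6))
```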
If R = 0, what does that mean?
This means that there is no correlation and no linear relationship between x and y.
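It may help to see that R = 0 rules out a linear relationship specifically. In the sketch below, y is completely determined by x (y = x^2), yet R comes out as zero because the x values are symmetric about zero. The data and the helper `pearson_r` are hypothetical, chosen only to illustrate this.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient R between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: x symmetric about zero, y an exact quadratic in x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print("R for y = x^2 on symmetric x:", r)
```

This is one reason plotting the data comes first in the regression steps: a strong curved pattern would be invisible in R alone.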
What does R^2 mean?
This is the fraction of variance explained. If R^2 = 0.8, then 80% of the variance in y is explained by its linear relationship with x.
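The "fraction of variance explained" reading can be made concrete as R^2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squared deviations of y from its mean. The sketch below, on hypothetical data, assumes a least-squares straight-line fit and shows that this quantity matches the square of the correlation coefficient.

```python
import math

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Least-squares line y = a*x + b.
a = sxy / sxx
b = mean_y - a * mean_x

# Variance left unexplained by the line, and total variance in y.
ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
ss_tot = syy

r_squared = 1 - ss_res / ss_tot   # fraction of variance explained
r = sxy / math.sqrt(sxx * syy)    # correlation coefficient R

print("R^2 from sums of squares:", round(r_squared, 4))
print("R squared directly:      ", round(r ** 2, 4))
```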
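When is one-way analysis of variance (ANOVA) useful?
When a categorical explanatory variable has two or more levels.
When there are two categories, ANOVA contains an F-test analogous to the two-sample t-test: it gives the same result as the pooled (equal-variance) t-test, with F = t^2, but it is not identical to the unequal-variance (Welch) t-test.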
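The link between the two-category F-test and the two-sample t-test can be checked by hand. The sketch below computes the one-way ANOVA F statistic and the pooled (equal-variance) t statistic for two hypothetical groups; the function names and data are invented for illustration. Under the equal-variance assumption, F equals t squared.

```python
import math


def one_way_f(groups):
    """One-way ANOVA F statistic: between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def pooled_t(g1, g2):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((v - m1) ** 2 for v in g1)
    ss2 = sum((v - m2) ** 2 for v in g2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


# Hypothetical measurements for two categories.
group_a = [4.1, 5.0, 5.6, 4.8]
group_b = [6.2, 6.9, 5.8, 7.1]

f_stat = one_way_f([group_a, group_b])
t_stat = pooled_t(group_a, group_b)

print("F:", round(f_stat, 4), " t^2:", round(t_stat ** 2, 4))
```

With three or more groups there is no single t statistic to compare against, which is where ANOVA becomes the natural tool.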
What are the assumptions we make in ANOVA analysis?
The validity of our conclusions depends on the assumptions we make about our data:
Normally distributed residuals
Independent errors (the analysis is unreliable if errors are correlated)
Random sampling, which is closely related to independent errors
Homogeneity of variance (i.e. the residuals could all be sampled from the same normal distribution)
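As a rough first look at these assumptions, the residuals and per-group variances can be computed directly. This sketch uses hypothetical data for three levels of a categorical variable and only Python's standard `statistics` module. It is not a formal test, just the numbers one would inspect before making diagnostic plots.

```python
import statistics

# Hypothetical measurements for three levels of a categorical variable.
groups = {
    "low": [4.1, 5.0, 5.6, 4.8],
    "medium": [6.2, 6.9, 5.8, 7.1],
    "high": [8.0, 7.4, 8.8, 7.9],
}

# In one-way ANOVA the fitted value for each observation is its group mean,
# so the residual is the observation minus that mean.
residuals = {
    name: [v - statistics.mean(values) for v in values]
    for name, values in groups.items()
}

# Sample variance per group; broadly similar values are consistent with
# homogeneity of variance.
variances = {name: statistics.variance(values) for name, values in groups.items()}

for name in groups:
    print(name, [round(r, 3) for r in residuals[name]], round(variances[name], 3))
```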