Maths
statistics
Created by
Ellie
Cards (48)
primary data - data you collect yourself
secondary data - data collected by someone else
discrete data - counted data
continuous data - measured data
quantitative - numerical
qualitative - words
census - entire population
sample - part of the population
census
advantage - limited/no bias
disadvantage - time-consuming, expensive and difficult to collect
Sample
advantage - cheaper, less time-consuming, easier
disadvantage - could be biased
simple random sampling - every member of the population has an equal chance of being selected
Non-random sampling - sampling where the sample selection is based on factors other than just random chance; in other words, it is biased in nature
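Sampling types other than simple random sampling are - opportunity sampling, cluster sampling, quota sampling, systematic sampling, stratified sampling
Of these, only opportunity and quota sampling are truly non-random; cluster, systematic and stratified sampling all involve some random selection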
Cluster sampling
The population is split into smaller groups called clusters. One or more clusters are chosen at random and the sample is everyone in those clusters
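The cluster sampling card can be sketched in a few lines of Python. This is only an illustration: the function name `cluster_sample`, the `seed` parameter and the form-class data are made up for the example.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Choose k whole clusters at random and return everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)  # cluster names picked at random
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: pupils grouped into form classes (the clusters)
classes = {
    "7A": ["Ava", "Ben", "Cal"],
    "7B": ["Dee", "Eli"],
    "7C": ["Fay", "Gus", "Hal", "Ivy"],
}
sample = cluster_sample(classes, k=2, seed=1)
print(sample)  # everyone from the two chosen classes
```

Note that the randomness is in which clusters get picked, not in which people inside a cluster get picked.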
Opportunity sampling advantages
Easy to do
Quick
Cheap
Cluster sampling advantages
Representative if the chosen clusters are representative of the population
Time and cost efficient
Opportunity sampling disadvantage
Can be biased, depending on where you choose to stand and the time of day
Cluster sampling disadvantage
High sampling error
Complex
Opportunity sampling
A sampling type where you pick a location to stand and ask people at that location
Quota sampling
Where you select a set number (quota) of people of each type for your sample
Systematic sampling
The sample is chosen from the sample frame by picking a random starting point and then choosing the rest of the sample at regular intervals
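A minimal sketch of systematic sampling, assuming the interval is the frame size divided by the sample size (rounded down) and the random start falls inside the first interval. The name `systematic_sample` and the numbered frame are hypothetical, and the sketch assumes n is no bigger than the frame.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Random start, then every interval-th member of the sample frame."""
    rng = random.Random(seed)
    interval = len(frame) // n        # gap between chosen members
    start = rng.randrange(interval)   # random starting point in the first interval
    return [frame[start + i * interval] for i in range(n)]

# Hypothetical sample frame: 100 people numbered 1 to 100
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=3)
print(sample)  # ten numbers, each 10 apart
```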
Stratified sampling
The population is split into groups (strata) and the number sampled from each group matches that group's proportion of the entire population
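One way to read the stratified sampling card is as a proportion calculation: number sampled from a group = sample size x group size / population size. The sketch below assumes that reading; the function name and the year-group figures are invented for illustration.

```python
def stratified_counts(strata_sizes, sample_size):
    """How many to sample from each group so the sample matches the population."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }

# Hypothetical school of 300 pupils, sample of 30
year_groups = {"Year 7": 120, "Year 8": 100, "Year 9": 80}
counts = stratified_counts(year_groups, sample_size=30)
print(counts)  # {'Year 7': 12, 'Year 8': 10, 'Year 9': 8}
```

With less tidy numbers the rounded counts may not add up exactly to the sample size, so one group may need adjusting by 1.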
Quota advantages
Quick and easy
Cheap
Representative of target population
Quota disadvantages
Large potential for bias
Not generalisable to population
Systematic advantages
Easy to do
Easy to use for a large population
Systematic disadvantages
Potential for bias
How to conduct a random sample
Number the population in a list
Randomly select n members using a random number generator
Ignore repeats, continue until you have n unique numbers
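The steps on this card can be followed almost word for word in Python. This is a sketch rather than the only way to do it; the function name, the `seed` parameter and the pupil list are assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Number the list 1..N, draw random numbers, ignore repeats."""
    rng = random.Random(seed)
    chosen = []                                   # unique numbers picked so far
    while len(chosen) < n:
        number = rng.randint(1, len(population))  # random number from 1 to N
        if number not in chosen:                  # ignore repeats
            chosen.append(number)
    return [population[number - 1] for number in chosen]

# Hypothetical population of 20 pupils
pupils = [f"Pupil {i}" for i in range(1, 21)]
sample = simple_random_sample(pupils, n=5, seed=42)
print(sample)  # five different pupils
```

Python's built-in `random.sample(pupils, 5)` gives the same kind of result in one call; the loop is written out here only to mirror the card.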
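When getting the midpoint of an age class from a table you have to add 0.5 to the midpoint, because ages are rounded down (a class of 10-19 really runs from 10 up to 20, so its midpoint is 15, not 14.5)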
For discrete data
LQ = (n+1)/4
Median = (n+1)/2
UQ = 3(n+1)/4
For continuous data
LQ = n/4
Median = n/2
UQ = 3n/4
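The formulas on the two cards above give the position of each quartile in the ordered data, not its value. A small sketch, with a made-up helper name `quartile_positions`:

```python
def quartile_positions(n, continuous=False):
    """Positions of LQ, median and UQ among n ordered values."""
    base = n if continuous else n + 1   # n for continuous data, n+1 for discrete
    return {"LQ": base / 4, "Median": base / 2, "UQ": 3 * base / 4}

print(quartile_positions(11))                    # {'LQ': 3.0, 'Median': 6.0, 'UQ': 9.0}
print(quartile_positions(40, continuous=True))   # {'LQ': 10.0, 'Median': 20.0, 'UQ': 30.0}
```

So for 11 discrete values the lower quartile is the 3rd value, the median the 6th and the upper quartile the 9th.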
Cumulative frequency graph
Plot the cumulative frequency against the END POINTS
Start from 0
Join all points with a curve
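A sketch of how the points for a cumulative frequency graph could be built from a grouped table. The helper name and the class boundaries are hypothetical; `end_points` is assumed to list the lower bound of the first class followed by the upper bound of every class.

```python
from itertools import accumulate

def cumulative_points(end_points, frequencies):
    """(end point, cumulative frequency) pairs, starting from 0."""
    running = list(accumulate(frequencies))    # running totals
    return [(end_points[0], 0)] + list(zip(end_points[1:], running))

# Hypothetical classes 0-10, 10-20, 20-30 with frequencies 4, 7, 9
points = cumulative_points([0, 10, 20, 30], [4, 7, 9])
print(points)  # [(0, 0), (10, 4), (20, 11), (30, 20)]
```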
Histogram
Frequency = frequency density x class width
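The histogram formula can be used in either direction. A short sketch with invented numbers (a class of width 4 containing 12 items):

```python
def frequency_density(frequency, class_width):
    """Bar height for a histogram: frequency / class width."""
    return frequency / class_width

def frequency(density, class_width):
    """Area of the bar: frequency density x class width."""
    return density * class_width

height = frequency_density(12, 4)
print(height)                # 3.0
print(frequency(height, 4))  # 12.0
```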
Box plots
No skew - median is in the middle, mode = median = mean
Positive skew - median closer to LQ, mode < median < mean
Negative skew - median closer to UQ, mode > median > mean
When comparing data from box plots you need to comment on the median, the spread (IQR or range) and skewness, and also add context
Any number more than 1.5 IQRs away from the nearest quartile is an outlier
You still need to show any outliers on the box plot, marked with a cross (x) or a dot (.)
Anything more than 2 standard deviations away from the mean is also considered an outlier
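Both outlier rules can be checked with a few lines of Python. This is a sketch under assumptions: the quartiles are passed in rather than calculated, the population standard deviation (`statistics.pstdev`) is used, and the data set is invented.

```python
import statistics

def iqr_outliers(data, lq, uq):
    """Values more than 1.5 IQRs below the LQ or above the UQ."""
    iqr = uq - lq
    low, high = lq - 1.5 * iqr, uq + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def sd_outliers(data, k=2):
    """Values more than k standard deviations away from the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)   # assumption: population standard deviation
    return [x for x in data if abs(x - mean) > k * sd]

# Hypothetical data with LQ = 4 and UQ = 8 taken as given
data = [2, 4, 5, 5, 6, 7, 8, 30]
print(iqr_outliers(data, lq=4, uq=8))  # [30]
print(sd_outliers(data))               # [30]
```

Here the IQR is 4, so anything below -2 or above 14 is an outlier under the first rule.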
Regression line
In the form y = a + bx, the coefficient b tells you the change in y for each unit change in x
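The card does not say how a and b are found. The sketch below assumes the usual least squares method (b = Sxy/Sxx, a = mean of y - b x mean of x), with a made-up function name and data chosen to lie exactly on a line.

```python
def regression_line(xs, ys):
    """Least squares estimates of a and b in y = a + bx."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # gradient: change in y per unit change in x
    a = mean_y - b * mean_x    # intercept: value of y when x = 0
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x
a, b = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

With b = 2, every increase of 1 in x goes with an increase of 2 in y.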
Mutually exclusive events are two events that can't happen at the same time
A ∪ B
Everything in A or B (or both)
A ∩ B
The intersection of A and B
A'
Everything but A
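The three set cards above (union, intersection, complement) line up with Python's built-in set operators. The sets below are invented for illustration, with the whole numbers 1 to 10 assumed as "everything".

```python
universe = set(range(1, 11))   # assumed universal set: 1 to 10
A = {2, 4, 6, 8, 10}
B = {1, 2, 3, 4}
C = {5, 7, 9}

print(sorted(A | B))         # union: [1, 2, 3, 4, 6, 8, 10]
print(sorted(A & B))         # intersection: [2, 4]
print(sorted(universe - A))  # complement A': [1, 3, 5, 7, 9]
print(A & C == set())        # True, so A and C are mutually exclusive
```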
Discrete - typically integer values
E.g. shoe size, number of students
Continuous - typically fractions or d.p.
E.g. foot length, height, weight, time, temperature, age
Binomial Distribution is used when:
An experiment is repeated a given number of times
There are only 2 outcomes (fail/success)
The trials are independent from each other, so the probability of success is the same each time
Binomial PD - use for an exact value, e.g. P(X = 5)
Binomial CD - use for a cumulative value, e.g. P(X ≤ 5)
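The difference between the two calculator modes can be seen by working them out directly. This sketch uses the standard binomial formula; the function names `binomial_pd` and `binomial_cd` are made up to match the card, and n = 10, p = 0.5 are example values.

```python
from math import comb

def binomial_pd(x, n, p):
    """P(X = x): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cd(x, n, p):
    """P(X <= x): add up P(X = 0), P(X = 1), ..., P(X = x)."""
    return sum(binomial_pd(k, n, p) for k in range(x + 1))

# Hypothetical example: 10 fair coin flips, X = number of heads
print(binomial_pd(5, 10, 0.5))  # P(X = 5)  = 252/1024
print(binomial_cd(5, 10, 0.5))  # P(X <= 5) = 638/1024
```

So PD gives about 0.246 for exactly 5 heads, while CD gives about 0.623 for 5 or fewer.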