categorical data. To help you remember that the data are in named categories, think NOM, the French for 'name'. E.g. grouping people according to their favourite subject.
Ordinal data:
the results are points from a scale. To help you remember, ORDinal data means points in ORDer along a scale. E.g. putting subjects in order of liking.
Interval data:
data is measured in units of equal intervals. NO TRUE ZERO POINT. E.g. temperature in Celsius.
Nominal Data
Data is allocated to mutually exclusive categories and is discrete, as each item can only appear in one category.
Data is in the form of frequencies.
Category labels are names so there is no order.
Simplest level of data, e.g. Smoker OR Non-Smoker
Measure of Central Tendency = Mode
Presented by: Table, tallies, bar chart or pie chart
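As a rough illustration of the points above, nominal data can be tallied into frequencies and the mode read off as the most frequent category. The responses below are made up for the example, and `Counter` is just one convenient way to do the tally.

```python
from collections import Counter

# Hypothetical responses: each person falls into exactly one category
responses = ["Smoker", "Non-Smoker", "Non-Smoker", "Smoker", "Non-Smoker"]

# Nominal data is summarised as frequencies (a tally per category)
frequencies = Counter(responses)

# The mode is the category with the highest frequency
mode_category, mode_count = frequencies.most_common(1)[0]

print(dict(frequencies))          # {'Smoker': 2, 'Non-Smoker': 3}
print(mode_category, mode_count)  # Non-Smoker 3
```

The same frequencies are what a tally table, bar chart or pie chart would display.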
Ordinal Data
Data is ordered in some way.
Data is often in the form of a scale which consists of ratings/rankings.
Using a scale allows you to make statements about size of scores but the extent of comparison is limited because the intervals between units are not equal.
Better than nominal data but lacks precision because it is based on subjective opinion.
E.g. On a scale of 1 – 10 how much do you like psychology?
Measure of Central Tendency = Median
Measure of Dispersion = Range
Interval Data
Like ordinal data but based on numerical scales that include units of equal, precisely defined size.
Better than ordinal data because we use public scales of measurement that produce data based on accepted units of measurement.
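E.g. Temperature in Celsius. (Time and weight are also measured in equal units, but strictly they have a true zero point, which makes them ratio data.)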
The 20 degree difference between 10-30 Celsius and 50-70 Celsius is known to be equivalent because this scale has equal, precisely defined units. As a result we can add and subtract these values, but we cannot multiply or divide them: with no true zero point, 20 degrees Celsius is not 'twice as hot' as 10 degrees Celsius.
Measure of Central Tendency = Mean
Measure of Dispersion = Standard Deviation
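The point about adding and subtracting, but not multiplying or dividing, can be checked with a small sketch. It converts the same temperatures to Fahrenheit (a choice made for this example, not something from the notes): the two gaps stay equal to each other, but the ratio between two temperatures changes with the units, so it carries no real meaning.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Differences are meaningful: both gaps are 20 degrees Celsius...
gap_low_c = 30 - 10
gap_high_c = 70 - 50

# ...and the two gaps remain equal to each other after a change of units.
gap_low_f = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(10)
gap_high_f = celsius_to_fahrenheit(70) - celsius_to_fahrenheit(50)

# Ratios are not meaningful: "20 is twice 10" only holds in Celsius.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

print(gap_low_c == gap_high_c)     # True
print(gap_low_f == gap_high_f)     # True
print(ratio_c, round(ratio_f, 2))  # 2.0 1.36
```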
Descriptive statistics
The information collected in any study is called data.
This comes in two forms: qualitative and quantitative.
We are going to focus on the latter (numerical data), considering the various ways that we can analyse this data to draw meaningful conclusions.
This is known as descriptive statistics, which includes measures of central tendency, measures of dispersion and graphs.
Measures of central tendency
• These measures are ‘averages’.
• They give us the most typical values in a set of data.
• The average can be calculated in 3 different ways: mean, median, mode
Mean - add up all the scores and divide by the total number of scores
most informative, but can only be used with interval data, measured on a standardised scale
Mean is the most sensitive measure as it includes all scores/values in the data set within its calculation. It is therefore more representative of the data as a whole, but it can be distorted by extreme scores.
Median - middle value when data is ordered.
cannot use with nominal data but can use with ordinal and interval data
Median is less sensitive than the mean and is therefore not distorted by extreme scores
Mode - most common score or category
only measure of central tendency you can use with nominal data / categorical data
Mode is easy to calculate but not representative of the whole data set.
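The three averages above can be compared on one small data set. This is only a sketch using Python's built-in `statistics` module; the scores are invented, and 35 is included on purpose as an extreme score to show how much more the mean moves than the median.

```python
import statistics

# Hypothetical scores; 35 is a deliberately extreme value
scores = [4, 5, 5, 6, 7, 8, 35]

mean_score = statistics.mean(scores)      # add up and divide by how many
median_score = statistics.median(scores)  # middle value when ordered
mode_score = statistics.mode(scores)      # most common value

# The same data with the extreme score removed
trimmed = scores[:-1]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(f"{mean_score:.2f} {median_score:.2f} {mode_score}")  # 10.00 6.00 5
print(f"{mean_trimmed:.2f} {median_trimmed:.2f}")           # 5.83 5.50
```

Removing one extreme score pulls the mean from 10 down to about 5.83, while the median only shifts from 6 to 5.5.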
Measures of dispersion
Measures of dispersion can also be used to analyse data. They are based on the spread of scores: how far scores vary and differ from one another.
The measures of dispersion that you need to know about are: Range, Standard deviation
Range
tells us the difference between the top and bottom values in a set of data (highest value minus lowest value).
It is customary to add 1 as it allows for the fact that raw scores are sometimes rounded up or down when they are recorded.
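A minimal sketch of the range calculation described above, with the customary "+1" as an option. The function name `score_range` and the scores are hypothetical, chosen only for this example.

```python
def score_range(scores, add_one=True):
    """Difference between the top and bottom values, optionally plus 1."""
    spread = max(scores) - min(scores)
    return spread + 1 if add_one else spread

scores = [3, 7, 8, 12, 15]  # hypothetical scores

print(score_range(scores))                 # 13
print(score_range(scores, add_one=False))  # 12

# A single extreme score changes the range a great deal
print(score_range(scores + [60]))          # 58
```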
Range:
Strengths
Easy to calculate
Limitations
Affected by extreme values
Fails to account for the distribution of the numbers, i.e. whether the scores are closely distributed around the mean or more spread out
Standard deviation
Standard deviation is a more precise measure of dispersion and gives us a single value that tells us how far scores deviate (move away) from the mean.
A large SD means there is a large spread of data around the mean, suggesting that not all participants were affected in the same way by the independent variable (IV).
A small SD means the data is clustered closer to the mean, implying that all participants responded in a fairly similar way.
Example SD
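The sketch below works through a standard deviation step by step for two invented groups that share the same mean but differ in spread. It assumes the sample formula (dividing by n - 1); the notes do not say which version is intended, so treat that choice as an assumption.

```python
import math

def standard_deviation(scores):
    """Sample standard deviation (divides by n - 1; an assumption here)."""
    n = len(scores)
    mean = sum(scores) / n
    # How far each score deviates from the mean, squared
    squared_deviations = [(score - mean) ** 2 for score in scores]
    variance = sum(squared_deviations) / (n - 1)
    return math.sqrt(variance)

# Hypothetical groups: both have a mean of 10
clustered = [9, 10, 10, 11, 10]  # scores close to the mean
spread_out = [2, 6, 10, 14, 18]  # scores far from the mean

sd_clustered = standard_deviation(clustered)
sd_spread_out = standard_deviation(spread_out)

print(round(sd_clustered, 2))   # 0.71
print(round(sd_spread_out, 2))  # 6.32
```

The small SD for the first group reflects participants responding in a similar way; the large SD for the second reflects much more varied responses around the same mean.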
Standard deviation:
Strengths
Takes into account all scores.
More precise measure of dispersion than the range, and is not difficult to work out with a calculator.
Limitations
It may hide some of the characteristics of the data set, e.g. extreme scores
Ways to display quantitative data and data distribution
Tables (For pre-analysis raw data and summarised data, e.g. SD, mean, range, etc.)
Bar charts (For non-continuous data, columns don't touch)
Histograms (For continuous data, each column shows class interval)
Scattergrams (For correlational relationships)
Data distribution
If you measure certain variables, such as the height of all the people in a sixth form, the frequency of these measurements should form a bell-shaped curve. This is called a normal distribution, which is symmetrical.
Within a normal distribution, most people (or items) are located in the middle area with few at the extreme ends. The mean, median and mode all occupy the same midpoint of the curve.
Data distributions
Not all distributions have such a balanced symmetrical pattern.
Instead some may produce skewed distributions – this is when the distribution appears to lean to one side or another.
Positive skew is where the distribution is concentrated towards the left, resulting in a long tail on the right.
E.g. a very hard test where most students scored low marks and few people scored high – this would cause a positive skew.
Negative skew is caused by the opposite e.g. an easy test where many scored highly and few got low marks.
Which skew:
Mean first, then median, then mode (reading left to right along the curve)
Negative
Which skew:
Mode first, then median, then mean (reading left to right along the curve)
Positive
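The ordering of the averages can be used as a rough check on the direction of skew. The sketch below compares only the mean and the median, which is a rule of thumb rather than a formal test; the function name `skew_direction` and both sets of test marks are hypothetical.

```python
import statistics

def skew_direction(scores):
    """Rule-of-thumb skew check: compare the mean with the median."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    if mean > median:
        return "positive"  # long tail on the right drags the mean up
    if mean < median:
        return "negative"  # long tail on the left drags the mean down
    return "no skew indicated"

# Hypothetical marks: a hard test (mostly low) and an easy test (mostly high)
hard_test = [20, 25, 25, 30, 35, 40, 90]
easy_test = [10, 60, 65, 70, 75, 75, 80]

print(skew_direction(hard_test))  # positive
print(skew_direction(easy_test))  # negative
```

For the hard test the order from left to right is mode (25), median (30), mean (about 37.9); for the easy test it is mean (about 62.1), median (70), mode (75).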
Characteristics of Normal Distributions
Mean, median and mode are all at the same midpoint
Distribution is symmetrical about the midpoint
Dispersion of scores or measurements either side of the midpoint is consistent and can be expressed in standard deviations
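These characteristics can be illustrated with a simulation. The sketch below generates made-up heights from a normal distribution (the mean of 170 cm and SD of 10 cm are assumptions for the example only) and checks that the mean and median land close together. It also counts how many values fall within one standard deviation of the mean, which for a normal distribution is roughly 68%.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical heights in cm: mean 170 and SD 10 are illustrative assumptions
heights = [random.gauss(170, 10) for _ in range(10_000)]

mean_height = statistics.mean(heights)
median_height = statistics.median(heights)
sd_height = statistics.stdev(heights)

# Proportion of heights lying within one SD either side of the mean
within_one_sd = sum(
    1 for height in heights if abs(height - mean_height) <= sd_height
) / len(heights)

print(round(mean_height, 1), round(median_height, 1))
print(round(within_one_sd, 2))
```

The mode is left out because simulated continuous values almost never repeat exactly; on a histogram it would correspond to the tallest column, near the same midpoint.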