A way to get information from data and a tool for creating new understanding from a set of numbers
Types of statistics
Descriptive statistics
Inferential statistics
Descriptive statistics
Involves summarizing data, such as calculating averages, ranges, and frequencies
Techniques of descriptive statistics
Graphical techniques
Numerical techniques
Numerical techniques of descriptive statistics include mean, median, and range
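As a minimal sketch, the three numerical measures named above can be computed with Python's standard statistics module. The data values are made up for illustration and are not from these notes.

```python
import statistics

# Hypothetical observations, invented for illustration only
data = [4, 8, 15, 16, 23, 42]

mean = statistics.mean(data)        # arithmetic average
median = statistics.median(data)    # middle value of the sorted data
data_range = max(data) - min(data)  # highest value minus lowest value

print("mean:", mean)
print("median:", median)
print("range:", data_range)
```

With an even number of observations, the median is the average of the two middle values.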
Inferential statistics
Involves making inferences and drawing conclusions about a population based on a sample
Population
The entire group of items or individuals of interest, often very large
Sample
A smaller group drawn from the population
Parameter
Characteristic of the population
Statistic
Characteristic of the sample
In inferential statistics, we use a sample to draw conclusions about population parameters; those conclusions may or may not be true
Statistical inference is the process of making an estimate about a population based on a sample
We use statistics to make inferences (estimates or predictions) about parameters because sampling is easier and often less expensive than measuring the whole population, but the inference is not always correct
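The difference between a parameter and a statistic can be sketched with a simulated population. Everything here is an assumption made for illustration: the population is generated artificially, and the names mu and x_bar are simply conventional labels for the population mean and the sample mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 artificial measurements
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: a characteristic of the whole population
mu = statistics.mean(population)

# Statistic: the same characteristic computed from a sample of 100
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)

print(f"parameter (population mean): {mu:.2f}")
print(f"statistic (sample mean):     {x_bar:.2f}")
```

The sample mean is usually close to the population mean but rarely equal to it, which is why an inference from a sample is not always correct.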
Measures used in statistical inference
Confidence level
Significance level
Confidence level is the proportion of times that an estimating procedure will be correct, while significance level is the proportion of times that it will be wrong; the two add up to 1
The range is the difference between the highest and lowest values; for an interval estimate, the narrower the range, the more precise the estimate
The significance level is represented by the Greek letter alpha (α)
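One way to see how the two levels fit together is an interval estimate for a mean. The sketch below assumes α = 0.05, uses invented sample values, and takes the critical value from the standard normal distribution for simplicity (for a sample this small, a t value would normally be used instead).

```python
import math
import statistics

alpha = 0.05                  # significance level (assumed value)
confidence_level = 1 - alpha  # corresponding confidence level

# Hypothetical sample, invented for illustration only
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]

n = len(sample)
x_bar = statistics.mean(sample)  # sample mean
s = statistics.stdev(sample)     # sample standard deviation

# Critical value from the standard normal distribution (a simplification)
z = statistics.NormalDist().inv_cdf(1 - alpha / 2)

margin = z * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"confidence level: {confidence_level:.2f}")
print(f"interval estimate: {lower:.2f} to {upper:.2f}")
```

A smaller α gives a higher confidence level but a wider interval, so confidence is gained at the cost of precision.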
Classes
A series of non-overlapping intervals, each covering a range of observation values
Interpreting classes
Histogram (bars show how many observations fall in each class)
Sturges' formula
Number of classes = 1 + 3.3 log10(n), where n is the number of observations; used to determine how many classes should be defined
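The formula can be sketched as a small function. The function name sturges_classes is hypothetical, and rounding up to a whole number of classes is an assumption; the notes do not say how to round.

```python
import math

def sturges_classes(n: int) -> int:
    """Suggested number of classes for n observations: 1 + 3.3 * log10(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Round up so that the result is a whole number of classes (assumed rule)
    return math.ceil(1 + 3.3 * math.log10(n))

n_obs = 200  # hypothetical number of observations
k = sturges_classes(n_obs)
print(f"{n_obs} observations -> {k} classes")
```

The result is a starting point; the number of classes is often adjusted so that the class limits are convenient round numbers.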
Shapes of histograms
Symmetric
Positively skewed
Negatively skewed
Unimodal (one peak)
Bell-shaped
Cross-sectional data
Observations measured at the same point in time (regardless of the time)
Time series data
Observations measured at successive points in time (the order in time matters)
Modality
Unimodal (one peak) and bimodal (two peaks)
Frequency tables, relative frequency tables, and cross-classification (contingency) tables are used to identify patterns in data
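A minimal sketch of the three kinds of table, built with collections.Counter from the standard library. The records (a region paired with a yes/no answer) are invented for illustration.

```python
from collections import Counter

# Hypothetical records: (region, answer), invented for illustration only
records = [
    ("north", "yes"), ("north", "no"), ("north", "yes"),
    ("south", "no"), ("south", "no"), ("south", "yes"),
    ("north", "yes"), ("south", "no"),
]

# Frequency table: how often each answer occurs
frequency = Counter(answer for _, answer in records)

# Relative frequency table: each count as a proportion of the total
total = len(records)
relative = {answer: count / total for answer, count in frequency.items()}

# Cross-classification (contingency) table: counts per (region, answer) pair
contingency = Counter(records)

print(dict(frequency))
print(relative)
for (region, answer), count in sorted(contingency.items()):
    print(region, answer, count)
```

The contingency counts show a pattern that the single frequency table hides: the answers are evenly split overall, yet they differ by region.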
Interval data
A scatter diagram is used to understand how two interval variables are related