Statistics is a science that deals with the collection, organization, summarization, presentation and analysis of data.
Descriptive statistics aims at summarizing and presenting data in a form that makes them easier to analyze and interpret.
Inferential statistics aims at drawing conclusions and making decisions about a population based on evidence obtained from a sample.
Parametric statistics is a statistical approach that assumes the data are a random sample from a normal distribution and involves testing hypotheses about population parameters.
Nonparametric statistics is a statistical approach that assumes no particular underlying data distribution and typically involves testing hypotheses about a population median.
Ordinal scale of measurement: data can be arranged in order, but differences between values either cannot be found or are meaningless.
Ratio scale of measurement: differences are meaningful, there is a natural zero starting point, and ratios make sense.
Nominal scale of measurement: data consist of names, labels, or categories and cannot be arranged in order.
Interval scale of measurement: differences are meaningful, but there is no natural zero starting point and ratios are meaningless.
Nominal: cause of death (cancer, heart attack, accident, and others)
Nominal: gender (male, female)
Nominal: blood type
Nominal: civil status
Nominal: eye color
Ordinal: pain level (none, mild, moderate, severe)
Ordinal: grades (A, B, C, D, or F)
Interval: temperature (body temperatures of 36.7°C and 37.1°C)
Interval: years (1492 and 1776 can be arranged in order, and the difference of 284 years can be found and is meaningful)
Ratio: heights of students (heights of 180 cm and 90 cm for a high school student and a preschool pupil; 0 cm represents no height, and 180 cm is twice as tall as 90 cm)
Ratio: class times (times of 50 min and 100 min for a statistics class; 0 min represents no class time, and 100 min is twice as long as 50 min)
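The four scales differ in which operations are meaningful. The short Python sketch below illustrates this using the examples above; the blood type sample and the helper name pain_rank are made up for illustration and are not part of the original notes.

```python
from collections import Counter

# Nominal: labels can only be compared for equality or counted.
blood_types = ["A", "O", "B", "O", "AB", "O"]  # made-up sample, illustration only
blood_type_counts = Counter(blood_types)

# Ordinal: categories have an order, but the gaps between them are not numeric.
PAIN_LEVELS = ["none", "mild", "moderate", "severe"]  # lowest to highest

def pain_rank(level: str) -> int:
    """Return the position of a pain level in the ordered scale (hypothetical helper)."""
    return PAIN_LEVELS.index(level)

is_worse = pain_rank("severe") > pain_rank("mild")  # ordering is meaningful

# Interval: differences are meaningful, ratios are not (no natural zero).
year_gap = 1776 - 1492

# Ratio: a natural zero exists, so ratios are meaningful.
height_ratio = 180 / 90  # 180 cm is twice as tall as 90 cm
```

Dividing 1776 by 1492 would run without error, but the result would have no useful meaning; the scale of measurement is a property of the data, not something the code enforces.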
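Qualitative variable (categorical): consists of names or labels.
Quantitative variable (numerical): consists of numbers representing counts or measurements.
Discrete quantitative variable: has countable values.
Continuous quantitative variable: has measurable values on a continuous scale.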
Probability is the chance of an event occurring.
A probability distribution for a discrete random variable consists of values that the variable can assume, and the probabilities associated with the values.
When all outcomes are equally likely, the probability of an event happening, P(E), is equal to the number of ways it can happen, n(E), divided by the total number of outcomes in the sample space, n(S): P(E) = n(E) / n(S).
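As a minimal sketch of the ratio n(E) / n(S), the Python snippet below computes the probability of rolling an even number on a fair six-sided die. The die example and the function name classical_probability are assumptions chosen for illustration, and the calculation is only valid when every outcome is equally likely.

```python
from fractions import Fraction

def classical_probability(event: set, sample_space: set) -> Fraction:
    """Return P(E) = n(E) / n(S), assuming equally likely outcomes."""
    if not sample_space:
        raise ValueError("sample space must not be empty")
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

die = {1, 2, 3, 4, 5, 6}   # sample space S of a fair die (assumed example)
even = {2, 4, 6}           # event E: an even number is rolled
p_even = classical_probability(even, die)  # 3 favorable outcomes out of 6
```

Using Fraction keeps the result exact (1/2 here) instead of a rounded decimal.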
Discrete probability distributions can be presented as a graph, a table, or a formula.
A discrete probability distribution, P(X), must satisfy two requirements: the probability of each value in the sample space must be from 0 to 1, inclusive, and the sum of the probabilities of all values must be equal to 1.
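The two requirements can be checked mechanically. The sketch below assumes a distribution stored as a Python dictionary that maps each value to its probability; the function name and the sample distributions are hypothetical.

```python
import math

def is_valid_distribution(dist: dict) -> bool:
    """Check the two requirements for a discrete probability distribution."""
    probabilities = list(dist.values())
    # Requirement 1: every probability lies from 0 to 1, inclusive.
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    # Requirement 2: the probabilities sum to 1 (allowing for rounding error).
    sums_to_one = math.isclose(sum(probabilities), 1.0)
    return each_in_range and sums_to_one

valid = {0: 0.25, 1: 0.50, 2: 0.25}     # satisfies both requirements
invalid = {0: 0.40, 1: 0.40, 2: 0.40}   # probabilities sum to 1.2
```

math.isclose is used instead of == because sums of decimal probabilities are rarely exact in floating-point arithmetic.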
In a probability distribution, the probability of an event happening, P(X), is represented as a number from 0 to 1, inclusive.
The probability of an event happening, P(X), can also be represented as a fraction, where the numerator is the number of times the event has occurred and the denominator is the total number of trials observed.
Real-life applications of probability distributions include describing the number of customers in an office canteen over a certain 6-day period, and determining the probability of a certain number of cigarettes being smoked each day over 30 days.
Constructing a histogram for a probability distribution involves arranging the values in the distribution from lowest to highest along the horizontal axis, and drawing a bar above each value whose height equals the probability of that value.
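A probability histogram can be approximated in plain text: the values are sorted from lowest to highest and each gets a bar whose length is proportional to its probability. This is only a rough sketch; the function name text_histogram, the bar width, and the sample distribution (number of heads in two fair coin tosses) are assumptions for illustration.

```python
def text_histogram(dist: dict, width: int = 40) -> list:
    """Return one text bar per value, ordered from lowest to highest value."""
    lines = []
    for value in sorted(dist):
        # Bar length is proportional to the probability of this value.
        bar = "#" * round(dist[value] * width)
        lines.append(f"{value:>3} | {bar} {dist[value]:.2f}")
    return lines

# Assumed example: X = number of heads in two fair coin tosses.
heads = {2: 0.25, 0: 0.25, 1: 0.50}
histogram_lines = text_histogram(heads)
print("\n".join(histogram_lines))
```

A plotting library would draw vertical bars instead, but the idea is the same: one bar per value, sized by probability.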
Determining probabilities in a probability distribution involves finding the probability that an event happens, the probability that it does not happen, the probability that the variable falls between certain values, and the probability that the variable is more than a certain value.
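The four kinds of probability listed above can each be read off a distribution table. The sketch below uses a made-up distribution and hypothetical helper names; "between" is taken to include both endpoints, which is an assumption that should be adjusted to the wording of the actual problem.

```python
from fractions import Fraction

# Made-up distribution of a discrete random variable X (illustration only).
dist = {
    0: Fraction(1, 10),
    1: Fraction(2, 10),
    2: Fraction(4, 10),
    3: Fraction(2, 10),
    4: Fraction(1, 10),
}

def p_equal(dist: dict, x: int) -> Fraction:
    """P(X = x): probability that the event happens."""
    return dist.get(x, Fraction(0))

def p_not(dist: dict, x: int) -> Fraction:
    """P(X != x): probability that the event does not happen (complement)."""
    return 1 - p_equal(dist, x)

def p_between(dist: dict, low: int, high: int) -> Fraction:
    """P(low <= X <= high), endpoints included (assumed convention)."""
    return sum((p for v, p in dist.items() if low <= v <= high), Fraction(0))

def p_more_than(dist: dict, x: int) -> Fraction:
    """P(X > x): probability that X exceeds x."""
    return sum((p for v, p in dist.items() if v > x), Fraction(0))
```

For this distribution, p_equal(dist, 2) is 2/5, p_not(dist, 2) is 3/5, p_between(dist, 1, 3) is 4/5, and p_more_than(dist, 2) is 3/10.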
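A random variable is a variable whose numerical value is determined by the outcome of a random phenomenon.
Random variables can be classified as discrete or continuous.
The possible values of a random variable can be found by listing the outcomes in the sample space and assigning a numerical value to each outcome.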
The sample space of an experiment is the set of all possible outcomes of the experiment.
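As an illustration of how a sample space leads to the possible values of a random variable, the sketch below enumerates the outcomes of tossing a fair coin twice and lets X be the number of heads. The coin experiment and all names in the code are assumptions chosen for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: every possible outcome of two coin tosses (H = heads, T = tails).
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome: tuple) -> int:
    """Random variable X: assigns to each outcome its number of heads."""
    return outcome.count("H")

# Possible values of X, found by applying X to every outcome.
possible_values = sorted({number_of_heads(o) for o in sample_space})

# Probability distribution of X, assuming all outcomes are equally likely.
counts = Counter(number_of_heads(o) for o in sample_space)
distribution = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}
```

Here the sample space has four outcomes, X can take the values 0, 1, and 2, and the resulting probabilities are 1/4, 1/2, and 1/4.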
Continuous random variable: Takes on values on a continuous scale, represents measured data.
Discrete random variable: Set of all possible outcomes is countable, represents count data.