The science of collecting, organizing, presenting, analyzing, and interpreting numerical data to assist in making more effective decisions
Statistic
A characteristic of a sample (mean, standard deviation, variance, or any other measure based on sample data)
Why study statistics
To scientifically measure conditions of any given problem and assess existing relationship(s)
To show the laws underlying facts and events that cannot be determined by individual observations
To reveal cause and effect relations that otherwise may remain unknown
To uncover ambiguous trends and behavior in related conditions
Uses and applications of statistics
Marketing
Accounting
Business
Quality control
Politics
Sports
Health administration
Education
Descriptive statistics
The method of organizing, summarizing, and providing a description of the sample data in an informative way
Descriptive statistics
Presenting data as percentages, ranks, standard units, frequency distributions, measures of location, and measures of dispersion
Inferential statistics
Used to infer the truth or falsity of a hypothesis, make a decision, estimate, prediction, or generalization about a population based on a sample
Population (N)
A collection of all possible individuals, objects, elements, or measurements of interest
Sample (n)
A portion, or part, of the population of interest
Population and sample
N = 296 3rd year nursing students
n = 72 students selected to participate in the survey
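A minimal sketch of drawing such a sample. The notes do not say how the 72 students were selected, so simple random sampling without replacement is an assumption here, and the numeric student IDs are hypothetical stand-ins for a real roster.

```python
import random

# Hypothetical roster: IDs 1..296 stand in for the population (N = 296).
population = list(range(1, 297))

# Fixed seed only so the sketch is reproducible.
rng = random.Random(0)

# Assumed method: simple random sampling without replacement (n = 72).
sample = rng.sample(population, k=72)

print(len(population), len(sample))
```

Because `sample` draws without replacement, no student can be selected twice.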
Non-parametric statistics
The branch of statistics wherein the gathered data to be analyzed are not required to fit a normal distribution
Non-parametric statistics
Use data that are often ordinal, rely on ranking or order rather than on numeric values, and can be applied without estimating the mean, standard deviation, or any other related parameters
Parametric statistics
The branch of statistics concerned with data that are measurable on interval or ratio scales and drawn from a sample of appropriate size, so that arithmetic operations apply and parameters such as the mean of the distribution can be defined
Parametric vs non-parametric
If your measurement scale is nominal or ordinal, use non-parametric statistics; if it is interval or ratio, use parametric statistics
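The rule of thumb above can be sketched as a small lookup. The function name `suggested_family` is hypothetical, and the sketch encodes only the scale-based rule stated in these notes, not other considerations such as sample size or distribution shape.

```python
def suggested_family(scale: str) -> str:
    """Return the family of methods the scale-based rule of thumb points to."""
    scale = scale.strip().lower()
    if scale in {"nominal", "ordinal"}:
        return "non-parametric"
    if scale in {"interval", "ratio"}:
        return "parametric"
    raise ValueError(f"unknown measurement scale: {scale!r}")


print(suggested_family("ordinal"))
print(suggested_family("ratio"))
```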
Qualitative or attribute variable
The characteristic or variable being studied is nonnumeric
Qualitative variables
Gender, religious affiliation, type of automobile owned, place of birth, hair color
Quantitative variable
The variable can be reported numerically
Discrete variable
Can assume only certain values, usually with "gaps" between them; typically results from counting
Discrete variables
Number of chairs in a classroom, number of cars exiting the University main gate over an hour, number of students in each section of graduate Statistics class, number of children in a family
Continuous variable
Can assume any value within a specific range
Continuous variables
Time it takes to fly from Cebu to Manila, air pressure in a tire, weight of a shipment of grains, distance between Cebu and Bohol, balance in your checking account, minutes remaining in class
Datum
A single piece of information
Data
Multiple pieces of information: known facts, figures, observations, statistics, records, and reports
Ungrouped data
Raw, unorganized information
Grouped data
Data presented in a frequency distribution table, organized, or processed data
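As an illustration, raw values can be grouped into a frequency distribution roughly as follows. The values are borrowed from the first median example in these notes; the class width of 10 is an assumption made only for this sketch.

```python
from collections import Counter

# Ungrouped (raw) data.
raw = [27, 23, 29, 5, 25, 24, 22, 30, 23, 9, 26, 22, 24, 13, 26]
width = 10  # assumed class width

# Map each value to the lower limit of its class, then count per class.
counts = Counter((x // width) * width for x in raw)

# Grouped data: (lower limit, upper limit, frequency) per class.
table = [(low, low + width - 1, counts[low]) for low in sorted(counts)]
for low, high, freq in table:
    print(f"{low:>2}-{high:<2}  {freq}")
```

The frequencies add up to the number of raw observations, which is a quick check that no value was lost in grouping.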
Data
Singular: datum, plural: data (multiple pieces of information, such as known facts, figures, observations, statistics, records, and reports)
Nominal level data
The "lowest" level of data measurement; no measurement is involved, only counts, and the categories are mutually exclusive and exhaustive with no natural or logical order
Nominal level data examples
hair color
gender
religious affiliation
Ordinal level data
May be arranged in some order, but differences between data values cannot be determined or are meaningless, one category is "higher" or "better" than the next one, mutually exclusive and exhaustive categories ranked according to the particular trait they possess
Interval level data
Includes all the characteristics of the ordinal level, plus the difference between values is a constant size, no natural zero point
Ratio level data
The "highest" level of data measurement, has all the characteristics of the interval level, plus the zero (0) point is meaningful and the ratio between two numbers is meaningful
Ratio level data examples
Money
Units of production
Weight
Income
Number of students
Measure of central location
Also called a measure of central tendency; any measure indicating the center of a set of data arranged in ascending or descending order. The most commonly used are the mean, median, and mode
Mean
The arithmetic mean of a set of values, the quantity commonly called the mean or the average, the sample mean is an unbiased estimator for the population mean
Properties of the arithmetic mean
Every set of interval-level and ratio-level data has a mean
All the values are included in computing the mean
A set of data has only one mean, the mean is unique
The mean is a useful measure for comparing two or more populations
The arithmetic mean is the only measure of location for which the sum of the deviations of each value from the mean is always zero
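The zero-sum property of deviations can be checked numerically. This sketch reuses the 15-value series from the mode example in these notes and uses exact fractions so that floating-point rounding does not blur the result; with ordinary floats the sum may differ from zero by a tiny rounding error.

```python
from fractions import Fraction

values = [84, 85, 92, 78, 65, 69, 79, 69, 78, 93, 91, 68, 75, 80, 67]

# Exact arithmetic mean: 1173 / 15 = 391/5 = 78.2
mean = Fraction(sum(values), len(values))

# Deviation of each value from the mean, then their sum.
deviations = [x - mean for x in values]
total = sum(deviations)

print(float(mean), total)
```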
Weighted mean
An average in which each quantity to be averaged is assigned a weight, data elements with high weight contribute more to the weighted mean than do elements with low weight
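A minimal sketch of the computation, taking the weighted mean as the sum of each value times its weight, divided by the sum of the weights. The function name `weighted_mean` and the scores and weights below are hypothetical, chosen only for illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical scores and their weights.
scores = [80, 90, 70]
weights = [3, 2, 5]

result = weighted_mean(scores, weights)      # (240 + 180 + 350) / 10
plain = sum(scores) / len(scores)            # unweighted mean for comparison
print(result, plain)
```

Here the score of 70 carries the largest weight, so the weighted mean (77.0) falls below the unweighted mean (80.0).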
Median
The middle value of the data arranged in order when the number of observations is odd, or the arithmetic mean of the two middle values when the number of observations is even; it divides the ordered group into two equal parts
Median examples
Median of 27, 23, 29, 5, 25, 24, 22, 30, 23, 9, 26, 22, 24, 13, 26 is 24
Median of 14, 5, 8, 3, 8, 18, 21, 25, 24, 10, 3, 4, 15, 28 is 12 (arithmetic mean of 10 and 14)
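Both examples can be checked with Python's standard `statistics` module, which sorts the data and applies the odd and even rules described in the definition. The variable names are illustrative only.

```python
import statistics

odd_series = [27, 23, 29, 5, 25, 24, 22, 30, 23, 9, 26, 22, 24, 13, 26]
even_series = [14, 5, 8, 3, 8, 18, 21, 25, 24, 10, 3, 4, 15, 28]

# 15 observations: the median is the 8th value of the sorted data.
median_odd = statistics.median(odd_series)

# 14 observations: the median is the mean of the 7th and 8th sorted values.
median_even = statistics.median(even_series)

print(sorted(even_series)[6:8], median_odd, median_even)
```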
Mode
The value that appears most frequently, especially useful in describing nominal and ordinal levels of measurement
Mode examples
In the series 84, 85, 92, 78, 65, 69, 79, 69, 78, 93, 91, 68, 75, 80, 67, the modes are 69 and 78, the median is 78, and the mean is 78.2
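The three measures for this series can be reproduced with the standard `statistics` module. `statistics.multimode` (available in Python 3.8 and later) returns every value tied for the highest frequency, which suits a series with two modes; the results are sorted here only for readability.

```python
import statistics

series = [84, 85, 92, 78, 65, 69, 79, 69, 78, 93, 91, 68, 75, 80, 67]

modes = sorted(statistics.multimode(series))  # all most-frequent values
median = statistics.median(series)            # 8th of the 15 sorted values
mean = statistics.mean(series)                # 1173 / 15

print(modes, median, mean)
```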