Statistical techniques for summarizing and presenting data in a form that will make them easier to analyze and interpret
Counts (ratios, range, rates, median, standard deviation), proportions, tables, graphs, summary measures, etc.
Reject or fail to reject the null hypothesis
positivity or negativity of an occurrence
Inferential Statistics
Concerned with making estimates, predictions, generalizations, and conclusions about a target population based on information from a sample
Descriptive and Inferential statistics CAN be used simultaneously.
Choice of numeric or graphic descriptive statistics is dependent on type of distribution of data
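As a rough illustration of how the two branches can be used together, the sketch below first summarizes a sample (descriptive) and then estimates the population mean from it (inferential). The blood pressure values are hypothetical, and the 95% interval assumes a normal approximation with z = 1.96; for a sample this small a t critical value would usually be preferred.

```python
import math
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) of 8 patients.
sample = [118, 122, 125, 130, 131, 135, 140, 147]

# Descriptive statistics: summarize the sample itself.
n = len(sample)
mean = statistics.mean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)              # sample standard deviation (n - 1)
value_range = max(sample) - min(sample)

# Inferential statistics: estimate the population mean from the sample.
# Assumption: normal approximation (z = 1.96), used here only for simplicity.
standard_error = sd / math.sqrt(n)
ci_low = mean - 1.96 * standard_error
ci_high = mean + 1.96 * standard_error

print(f"mean={mean}, median={median}, sd={sd:.2f}, range={value_range}")
print(f"approximate 95% CI for the population mean: {ci_low:.1f} to {ci_high:.1f}")
```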
Levels of Variables
Quantitative
Ratio
Interval
Qualitative
Ordinal
Nominal
Qualitative Variable
variables whose categories are simply used as labels to distinguish one group from another
Numerical representations of the categories are for labeling/coding only and not for comparison (greater or less)
E.g. Religion, place of residence, disease status
Dichotomous: two labels (gender)
Trichotomous: three labels
Multinomous: 4 or more labels
Quantitative Variable
Discrete
Can assume only integral values or whole numbers
Continuous
Can attain any value including fractions or decimals
Continuous Data
Where there is an infinite number of possible values (e.g., blood pressure measurements)
Means and standard deviations may be used
Discrete Data
Where there are only a few possible values (e.g., number of children; categorical data such as sex are summarized the same way)
Percentages of people for each value may be considered
Quantitative Data are either in discrete or continuous form.
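The difference in how the two kinds of data are summarized can be sketched as follows: percentages per value for the discrete variable, and a mean with a standard deviation for the continuous one. The variable names and values are hypothetical and only serve the illustration.

```python
from collections import Counter
import statistics

# Hypothetical data for 8 respondents.
children = [0, 2, 1, 2, 3, 2, 0, 1]                               # discrete
weights_kg = [61.5, 70.2, 58.9, 80.4, 66.0, 73.1, 55.7, 68.2]     # continuous

# Discrete data: percentage of people for each value.
counts = Counter(children)
percentages = {
    value: 100 * count / len(children)
    for value, count in sorted(counts.items())
}

# Continuous data: mean and standard deviation.
mean_weight = statistics.mean(weights_kg)
sd_weight = statistics.stdev(weights_kg)

for value, pct in percentages.items():
    print(f"{value} children: {pct:.1f}%")
print(f"weight: mean={mean_weight:.2f} kg, sd={sd_weight:.2f} kg")
```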
Nominal
A classificatory scale where the categories are used as labels only (they do not represent quantity)
Numbers or names that represent a set of mutually exclusive and exhaustive classes to which individuals or objects (attributes) may be assigned
E.g. Sex (Male and Female), Race, Blood Groups, seat belts in car, psych diagnosis, patient ID no.
Ordinal
Categories can be ordered or ranked; however, the distance between any two categories cannot be clearly quantified
E.g. Age groups (infant, child, teenager, adult), Likert scales (strongly disagree, disagree, agree, strongly agree)
Interval
Distances between all adjacent classes are equal
Conceptually, these scales are infinite, in that they have neither beginning nor ending
Zero point is arbitrary and does not mean absence of the characteristic (only a point of reference)
E.g. Temperature (in °C or °F), IQ
Ratio
A meaningful (true) zero point exists; zero means absence of the characteristic
Ratio of two numbers can be meaningfully computed and interpreted
E.g. Weight, Blood Pressure, Height, Doctor visits, number of DMF teeth
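One way to see why ratios are meaningful only on a ratio scale is to change the unit of measurement and check whether the ratio survives. The sketch below assumes temperature is recorded in degrees Celsius (an interval scale) and weight in kilograms (a ratio scale); the helper functions are written for this illustration only.

```python
def celsius_to_kelvin(celsius):
    # Kelvin has a true zero; Celsius has an arbitrary one.
    return celsius + 273.15

def kg_to_lb(kg):
    # A pure change of unit: zero kilograms is still zero pounds.
    return kg * 2.20462

# Interval scale: the "ratio" depends on where zero was placed.
ratio_celsius = 20 / 10                                        # 2.0
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)   # about 1.035

# Interval scale: differences are still meaningful.
diff_celsius = 20 - 10
diff_kelvin = celsius_to_kelvin(20) - celsius_to_kelvin(10)

# Ratio scale: the ratio is the same in any unit.
ratio_kg = 80 / 40
ratio_lb = kg_to_lb(80) / kg_to_lb(40)

print(ratio_celsius, round(ratio_kelvin, 3))   # 20 °C is not "twice as hot" as 10 °C
print(ratio_kg, round(ratio_lb, 3))            # 80 kg is twice as heavy as 40 kg
```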
Variable: Cause-and-Effect Relationship
Has quality or quantity:
Dependent Variable - something that might be affected by the change in the independent variable
What is observed?
What is measured?
The data collected during the investigation
Independent Variable - something that is changed by the scientist
What is tested?
What is manipulated?
Controlled Variable
A variable that is not changed
Often includes the instrument used to measure the dependent variable
Also called constants
Allow for a “fair test”
E.g. Duration of the experiment, experimental technique, species, sample volume, etc.
Graphical Method
Simple to “read” and appeal to more people, especially those who are not numerically inclined
Horizontal line (abscissa/X-axis)
Basis of classification
Vertical line (ordinate/Y-axis)
Enumerative data (e.g., number of observations, percentages or rates)
Pie Chart
Shows the percentage of the total number of observations falling into each category
Qualitative variable
Examples:
civil status
gender
religion
blood type
Bar Graph
Used to portray numerical measurements across categories of a qualitative variable or a discrete quantitative variable
Bars should be of equal width, and gaps should separate them to show discontinuity
can be horizontal or vertical
good to present nominal and ordinal
Horizontal - qualitative
Vertical - quantitative
Component Bar Diagram
Shows the percentages of two or more variables within each nominal (qualitative) category; each bar is divided into several ordinal components.
Histogram
Presents frequency distribution of continuous quantitative variable
Class intervals are joined on the horizontal axis and plotted against their corresponding frequencies on the vertical axis, represented by bars (a frequency polygon is the barless version)
Variables establish continuity (E.g., salary)
establish association and relationship of variables (same with frequency polygon)
used when the variables share clusters and related to each other
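The frequency distribution behind a histogram can be sketched by counting values into equal-width class intervals. The function name, the salary values (in thousands), and the choice of class width are all assumptions made for this example; each interval includes its lower limit and excludes its upper limit.

```python
def frequency_distribution(values, start, width, n_classes):
    """Count values into n_classes equal-width intervals [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:      # ignore values outside the classes
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]

# Hypothetical monthly salaries, in thousands.
salaries = [12, 15, 18, 21, 22, 25, 27, 29, 31, 33, 34, 38, 41, 47]
table = frequency_distribution(salaries, start=10, width=10, n_classes=4)

# Text rendering: adjacent classes share a boundary, so there are no gaps.
for lower, upper, freq in table:
    print(f"{lower}-{upper} | {'#' * freq}")
```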
Line Graph
Portrays trends over time; these could be trends of disease, mortality rates, % immunized, annual family income, etc.
Time series
The ends of the line should not be connected to 0
Frequency Polygon
The ends are connected to 0 on the horizontal axis
Presents the frequency distribution of a continuous quantitative variable (same as a histogram)
Looks like a line graph; however, the plots of the first and last class intervals are joined to the horizontal axis
Midpoints of each class interval are connected against their corresponding frequencies
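The points of a frequency polygon can be derived from a table of class intervals: one point per class midpoint, plus a zero-frequency point one class width before the first midpoint and one after the last, which is what closes the polygon on the horizontal axis. The function name and the class table below are hypothetical.

```python
def frequency_polygon_points(class_table):
    """class_table: list of (lower, upper, frequency) with equal class widths."""
    width = class_table[0][1] - class_table[0][0]
    # One point per class: (midpoint, frequency).
    points = [((lower + upper) / 2, freq) for lower, upper, freq in class_table]
    first_mid = points[0][0]
    last_mid = points[-1][0]
    # Close the polygon by joining both ends to zero on the horizontal axis.
    return [(first_mid - width, 0)] + points + [(last_mid + width, 0)]

# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
class_table = [(10, 20, 3), (20, 30, 5), (30, 40, 4), (40, 50, 2)]
points = frequency_polygon_points(class_table)

for midpoint, freq in points:
    print(midpoint, freq)
```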
Stem-and-Leaf Plot
Used when you have many numerical data values and want to group them into clusters
Similar to histogram
one way to organize data
used by PRC before all the technology
Looks like a histogram, depicts not only the frequencies but also the range, mode, median and shape of distribution
For quantitative variables
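A stem-and-leaf plot can be built by hand or with a few lines of code. The sketch below assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; the scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} with tens as stems and units as leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical exam scores.
scores = [67, 72, 75, 78, 81, 83, 83, 86, 89, 90, 94]
plot = stem_and_leaf(scores)

# Print every stem in the range, including empty ones, to preserve the shape.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Read sideways, the rows resemble a histogram, while the individual values (and therefore the range, mode, and median) remain recoverable from the leaves.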
Box Plot
Also called a "box-and-whisker plot"
Useful for describing a large quantitative data set, including its center, spread, shape, tail length, and outliers
Can be presented either horizontally or vertically
Systematic reviews
Can only be used to compare up to 50 variables
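The quantities a box plot displays can be computed directly. The sketch below uses one common convention that these notes do not specify: quartiles from Python's `statistics.quantiles` with the inclusive method, and outliers defined as values more than 1.5 times the interquartile range beyond the quartiles. Other textbooks and software compute quartiles slightly differently. The data are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Return the quartiles, whisker ends, and outliers for a box plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1                        # spread of the middle 50%
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),      # whiskers stop at the last non-outlier
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical lengths of hospital stay, in days.
lengths_of_stay = [2, 4, 5, 7, 8, 9, 10, 12, 13, 30]
summary = box_plot_summary(lengths_of_stay)
print(summary)
```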
Scatter Plot
Presents relationships between two quantitative variables
One variable plotted on the x-axis and the other on the y-axis
Plotted points that fall along a straight line indicate a linear relationship between x and y
Widely scattered points indicate NO relationship between x and y
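The visual impression from a scatter plot can be backed by a number. The sketch below computes Pearson's correlation coefficient r, a measure these notes do not name but that is commonly paired with scatter plots: r near +1 or -1 means the points lie close to a straight line, and r near 0 means no linear relationship. The data sets are hypothetical, and a low r does not rule out a curved relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_linear = [12, 14, 16, 18, 20]     # points fall exactly on y = 10 + 2x
y_scattered = [4, 1, 5, 2, 3]       # no clear pattern

r_linear = pearson_r(x, y_linear)
r_scattered = pearson_r(x, y_scattered)
print(f"linear: r={r_linear:.2f}, scattered: r={r_scattered:.2f}")
```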
Guidelines in Graph Construction
Title should be self-explanatory
Can be placed above or below the chart, but observe consistency in position
Table → title (above)
Graphical Illustrations → footnote (below)
Scales should have good proportioning, not too compressed or wide
For multiple trend lines or curves, identify them using labels or a legend
Use of color to emphasize or differentiate various items