Data and Information

Cards (28)

  • Descriptive Statistics
    • Statistical techniques for summarizing and presenting data in a form that will make them easier to analyze and interpret
    • Counts (ratios, range, rates, median, standard deviation), proportions, tables, graphs, summary measures, etc.
    • positivity or negativity of an occurrence
  • Inferential Statistics
    • Concerned with making estimates, predictions, generalizations and conclusions about a target population based on information from a sample
    • Includes hypothesis testing (reject or fail to reject the null hypothesis)
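The difference between the two branches can be sketched in a few lines of Python. This is only an illustration: the blood pressure readings are hypothetical, and the confidence interval uses a normal approximation (for a sample this small, a t-based interval would usually be preferred).

```python
import math
import statistics

# Hypothetical sample of systolic blood pressure readings (mmHg).
sample = [118, 122, 125, 130, 131, 135, 140, 142, 150, 157]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)            # sample standard deviation (n - 1)
data_range = max(sample) - min(sample)

# Inferential statistics: estimate the population mean from the sample.
# Normal-approximation 95% confidence interval (an assumption of this sketch).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * sd / math.sqrt(len(sample))
ci = (mean - margin, mean + margin)

print(f"mean={mean:.1f} median={median:.1f} sd={sd:.1f} range={data_range}")
print(f"approx. 95% CI for population mean: {ci[0]:.1f} to {ci[1]:.1f}")
```

The first print line only describes the ten readings; the second makes a statement about the population they were drawn from.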
  • Descriptive and Inferential statistics CAN be used simultaneously.
  • Choice of numeric or graphic descriptive statistics is dependent on type of distribution of data
  • Levels of Variables
    1. Quantitative
       • Ratio
       • Interval
    2. Qualitative
       • Ordinal
       • Nominal
  • Qualitative Variable
    • variables whose categories are simply used as labels to distinguish one group from another
    • Numerical representations of the categories are for labeling/coding only, not for comparison (greater or less)
    • E.g. Religion, place of residence, disease status
    • Dichotomous: two labels (gender)
    • Trichotomous: three labels
    • Multinomous: 4 or more labels
  • Quantitative Variable
    1. Discrete
       • Can assume only integral values or whole numbers
    2. Continuous
       • Can attain any value, including fractions or decimals
  • Continuous Data
    • Where there is an infinite number of possible values (e.g., blood pressure measurements)
    • Means and standard deviations may be used
  • Discrete Data
    • Where there are only a few possible values (e.g., sex)
    • Percentages of people for each value may be considered
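A minimal sketch of the two summaries described above, using only the Python standard library. The records are hypothetical and the variable names are illustrative.

```python
from collections import Counter
import statistics

# Hypothetical records: (sex, systolic blood pressure in mmHg).
records = [("F", 118.5), ("M", 131.0), ("F", 124.2), ("M", 140.8), ("F", 110.5)]

# Discrete data (few possible values): report the percentage per value.
sex_counts = Counter(sex for sex, _ in records)
sex_percent = {value: 100 * n / len(records) for value, n in sex_counts.items()}

# Continuous data: report the mean and standard deviation.
pressures = [bp for _, bp in records]
bp_mean = statistics.mean(pressures)
bp_sd = statistics.stdev(pressures)

print(sex_percent)
print(f"mean={bp_mean:.1f}, sd={bp_sd:.1f}")
```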
  • Quantitative Data are either in discrete or continuous form.
  • Nominal
    • A classificatory scale where the categories are used as labels only (does not represent quantity)
    • Numbers or names that represent a set of mutually exclusive and exhaustive classes to which individuals or objects (attributes) may be assigned
    • E.g. Sex (Male and Female), Race, Blood Groups, seat belts in car, psych diagnosis, patient ID no.
  • Ordinal
    • Categories can be ordered or ranked; however, the distance between any two categories cannot be clearly quantified
    • E.g. Age groups (infant, child, teenager, adult), Likert scales (strongly disagree, disagree, agree, strongly agree)
  • Interval
    • Distances between all adjacent classes are equal
    • Conceptually, these scales are infinite, in that they have neither beginning nor ending
    • Zero point is arbitrary and does not mean absence of the characteristic (only a point of reference)
    • E.g. Temperature (in °C or °F), IQ
  • Ratio
    • A meaningful zero point exists; zero means absence of the characteristic
    • Ratio of two numbers can be meaningfully computed and interpreted
    • E.g. Weight, Blood Pressure, Height, Doctor visits, number of DMF teeth 
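One way to see why ratios are meaningful only on a ratio scale is to change the unit of measurement and check whether the ratio survives. The sketch below uses standard unit conversions; the specific temperatures and weights are arbitrary illustrative values.

```python
def c_to_f(celsius: float) -> float:
    """Convert Celsius to Fahrenheit (both are interval scales)."""
    return celsius * 9 / 5 + 32

def kg_to_lb(kg: float) -> float:
    """Convert kilograms to pounds (both are ratio scales)."""
    return kg * 2.20462

# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
ratio_c = 20 / 10                       # 2.0 in Celsius
ratio_f = c_to_f(20) / c_to_f(10)       # 68 / 50 = 1.36 in Fahrenheit

# Differences stay comparable: equal steps in one unit are equal in the other.
step_low = c_to_f(20) - c_to_f(10)
step_high = c_to_f(30) - c_to_f(20)

# Ratio scale: zero means "none", so the ratio survives a change of unit.
ratio_kg = 80 / 40
ratio_lb = kg_to_lb(80) / kg_to_lb(40)

print(ratio_c, ratio_f)      # the two ratios disagree
print(step_low, step_high)   # the two steps agree
print(ratio_kg, ratio_lb)    # the two ratios agree
```

So "20 °C is twice as hot as 10 °C" is not a meaningful statement, while "80 kg is twice as heavy as 40 kg" is.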
  • Variable: Cause-and-Effect Relationship
    • Has quality or quantity:
    • Dependent Variable - something that might be affected by the change in the independent variable
      • What is observed?
      • What is measured?
      • The data collected during the investigation
    • Independent Variable - something that is changed by the scientist
      • What is tested?
      • What is manipulated?
  • Controlled Variable
    • A variable that is not changed
    • The instrument used to measure the dependent variable is always a controlled variable
    • Also called constants
    • Allow for a “fair test”
    • E.g. Duration of the experiment, experimental technique, species, sample volume, etc.
  • Graphical Method
    • Simple to “read” and appeal to more people, especially those who are not numerically inclined
    • Horizontal line (abscissa/X-axis)
      • Basis of classification
    • Vertical line (ordinate/Y-axis)
      • Enumerative data (e.g., number of observations, percentages or rates)
  • Pie Chart
    • Shows the percentage of the total number of observations falling into each category
    • Used for qualitative variables
    • Examples: civil status, gender, religion, blood type
  • Bar Graph
    • Used to portray numerical measurements across categories of a qualitative variable or a discrete quantitative variable
    • Bars should be of equal width, with gaps separating them to show discontinuity
    • Can be horizontal or vertical
    • Good for presenting nominal and ordinal data
    • Horizontal - qualitative
    • Vertical - quantitative
  • Component Bar Diagram
    • Shows, as percentages, how each category of a nominal (qualitative) variable is divided among the categories of a second variable (e.g., several ordinal levels)
  • Histogram
    • Presents the frequency distribution of a continuous quantitative variable
    • Class intervals are placed side by side on the horizontal axis, with adjoining bars showing the corresponding frequencies on the vertical axis (a frequency polygon is the barless version)
    • The variable is continuous (e.g., salary)
    • Establishes association and relationship of variables (same as the frequency polygon)
    • Used when the values form clusters and are related to each other
  • Line Graph
    • Portrays trends over time; these could be trends of disease, mortality rates, % immunized, annual family income, etc.
    • Time series
    • The line should not be connected to 0
  • Frequency Polygon
    • Connected to 0 at both ends
    • Presents the frequency distribution of a continuous quantitative variable (same as the histogram)
    • Looks like a line graph; however, the plots of the first and last class intervals are joined to the horizontal axis
    • Midpoints of each class interval are connected against its corresponding frequencies
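The construction shared by the histogram and the frequency polygon can be sketched as follows. The salary values, the class width of 10 and the starting limit of 10 are assumptions made for illustration; each class here includes its lower limit and excludes its upper limit.

```python
# Hypothetical monthly salaries, in thousands.
salaries = [12, 15, 17, 21, 22, 24, 25, 28, 31, 33, 34, 38, 41, 47]

width = 10      # class width (an assumption for this sketch)
start = 10      # lower limit of the first class interval

# Frequency distribution: count the values falling in each class interval.
n_classes = (max(salaries) - start) // width + 1
freq = [0] * n_classes
for value in salaries:
    freq[(value - start) // width] += 1

# Histogram bars: one (lower limit, upper limit, frequency) per class.
bars = [(start + i * width, start + (i + 1) * width, f) for i, f in enumerate(freq)]

# Frequency polygon: class midpoints against frequencies, closed to zero
# with one extra empty class at each end.
midpoints = [lower + width / 2 for lower, _, _ in bars]
polygon = (
    [(midpoints[0] - width, 0)]
    + list(zip(midpoints, freq))
    + [(midpoints[-1] + width, 0)]
)

for lower, upper, f in bars:
    print(f"{lower}-{upper}: {'#' * f}")
print(polygon)
```

The printed rows of `#` act as a sideways text histogram, and `polygon` holds the points a plotting tool would connect, including the two zero-frequency end points on the horizontal axis.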
  • Stem-and-Leaf Plot
    • Used when you are given many numerical data points and just want to classify them into clusters
    • Similar to histogram
    • one way to organize data
    • used by PRC before all the technology
    • Looks like a histogram, depicts not only the frequencies but also the range, mode, median and shape of distribution
    • For quantitative variables
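A stem-and-leaf plot is simple enough to build as text. The sketch below assumes non-negative whole numbers, uses the tens digit as the stem and the units digit as the leaf, and works on hypothetical exam scores; `stem_and_leaf` is an illustrative helper, not a standard library function.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group integers by tens digit (stem) and units digit (leaf)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    # Include empty stems so gaps in the data remain visible.
    return {s: groups.get(s, []) for s in range(min(groups), max(groups) + 1)}

# Hypothetical exam scores.
scores = [67, 72, 75, 75, 78, 81, 83, 84, 84, 84, 89, 91, 95]
plot = stem_and_leaf(scores)

for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
```

Because every original value can be read back from its stem and leaf, the range, mode, median and shape of the distribution are all visible in the printed rows.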
  • Box Plot
    • Also called a "box-and-whisker plot"
    • Useful for describing a large set of quantitative data, including the center, spread, shape, tail length and outliers
    • Can be drawn either horizontally or vertically
    • Systematic reviews
    • Can only be used to compare up to 50 variables
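The numbers behind a box plot can be computed directly. This sketch uses hypothetical lengths of hospital stay and flags outliers with Tukey's 1.5 × IQR fences, which is a common convention rather than something stated in these notes; other quartile methods give slightly different values.

```python
import statistics

# Hypothetical lengths of hospital stay, in days.
stays = [2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 21]

# Quartiles via the "inclusive" method (min and max are treated as the
# 0th and 100th percentiles).
q1, q2, q3 = statistics.quantiles(stays, n=4, method="inclusive")
iqr = q3 - q1

# Tukey's fences (an assumed convention for this sketch).
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in stays if x < low_fence or x > high_fence]

# Whiskers extend to the most extreme values inside the fences.
inside = [x for x in stays if low_fence <= x <= high_fence]
summary = {
    "whisker_low": min(inside), "q1": q1, "median": q2,
    "q3": q3, "whisker_high": max(inside), "outliers": outliers,
}
print(summary)
```

The box spans `q1` to `q3` (spread), the line inside it is the median (center), and unequal whisker lengths or flagged outliers hint at the shape and tail length of the distribution.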
  • Scatter Plot
    • Presents relationships between two quantitative variables
    • One variable plotted on the x-axis and the other on the y-axis
    • Plotted points that fall along a straight line indicate a linear relationship between x and y
    • Widely scattered points indicate NO relationship between x and y
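What the eye reads off a scatter plot can be quantified with the Pearson correlation coefficient, which these notes do not name but which is the usual companion statistic. The data below are hypothetical, and `pearson_r` is a hand-written helper for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data.
height_cm = [150, 155, 160, 165, 170, 175]
on_a_line = [2 * h - 250 for h in height_cm]    # exactly linear in height
scattered = [3, 1, 5, 5, 1, 3]                  # no linear trend

r_line = pearson_r(height_cm, on_a_line)
r_scatter = pearson_r(height_cm, scattered)
print(round(r_line, 3), round(r_scatter, 3))
```

A coefficient near 1 or -1 corresponds to points lying close to a straight line, and a coefficient near 0 corresponds to no linear relationship; a curved pattern can still give a value near 0, so the plot itself should always be inspected.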
  • Guidelines in Graph Construction
    1. Title should be self-explanatory
    2. The title can be placed above or below the chart, but observe consistency in position
    3. Scales should have good proportioning, not too compressed or wide
    4. For multiple trend lines or curves, identify them using labels or a legend
    5. Use of color to emphasize or differentiate various items
  • Guidelines in Graph Construction
    • Title should be self-explanatory
    • The title can be placed above or below the chart, but observe consistency in position
    • Table → title (above)
    • Graphical Illustrations → footnote (below)
    • Scales should have good proportioning, not too compressed or wide
    • For multiple trend lines or curves, identify them using labels or a legend
    • Use of color to emphasize or differentiate various items