Data- Raw material of statistics. Values that the variables can assume
Data consist of two kinds of numbers: those obtained by measuring and those obtained by counting.
Data set- Collection of data values
Data Value or Datum- Each value in the data set
Sources of data- Where suitable data are obtained to serve as the raw material for our investigations
Data can come from:
•Routinely kept records
•Personal interviews
•Surveys, questionnaires
•Experiments
•External sources
Statistics -Science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data (including drawing inferences about a population from a sample)
Biostatistics -Statistics applied to data derived from the biological sciences, medicine, and public health
Purpose of statistics: To investigate and evaluate the nature and meaning of information contained in numbers (data)
Descriptive Statistics- Consists of the collection, organization, summarization, and presentation of data
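As a minimal sketch of what summarizing a data set can look like, the snippet below computes a few common descriptive measures with Python's standard statistics module. The blood pressure readings are hypothetical values made up for illustration, not data from these notes.

```python
import statistics

# Hypothetical data set: systolic blood pressure readings (mmHg) for 8 subjects.
readings = [118, 122, 130, 125, 118, 140, 135, 128]

# Organize and summarize the data values.
summary = {
    "n": len(readings),
    "min": min(readings),
    "max": max(readings),
    "mean": statistics.mean(readings),
    "median": statistics.median(readings),
    "mode": statistics.mode(readings),
    "stdev": statistics.stdev(readings),  # sample standard deviation
}

# Present the summary.
for name, value in summary.items():
    print(f"{name}: {value}")
```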
Inferential Statistics -Consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions; relies on probability
Population (collection of entities) -Consists of all subjects (humans, animals, machines, places) being studied in which we have an interest at a particular time
Population of values:
•Finite –consists of a fixed number of values that can, in principle, be counted
•Infinite –consists of an endless succession of values that cannot be counted
Sample -Group of subjects or entities selected from a population
Variable -A characteristic or attribute that assumes different values
Kinds of variables:
Qualitative
Quantitative
Random
Random variable -Values arise by chance; individual values cannot be predicted exactly in advance
Discrete variables (countable) -Assume values that can be counted (0, 1, 2, 3)
Continuous variables (infinite) -Can assume an infinite number of values in an interval between any two specific values. Often include fractions and decimals
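Boundaries- The limits of the interval that a recorded (rounded) continuous value represents. Given to 1 additional decimal place and always end with the digit 5; e.g., a height recorded as 73 inches has boundaries 72.5 and 73.5 inches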
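The boundary rule can be sketched in a few lines. This is only an illustration: the function name boundaries is hypothetical, and it assumes the recorded value is passed as a string so that the number of decimal places is known exactly. Each boundary is the recorded value plus or minus half of the last recorded unit, which is why it carries one extra decimal place and ends in 5.

```python
from decimal import Decimal

def boundaries(recorded):
    """Return (lower, upper) boundaries for a recorded value given as a string.

    Hypothetical helper for illustration only.
    """
    value = Decimal(recorded)
    places = -value.as_tuple().exponent            # decimal places in the recorded value
    half_unit = Decimal(5).scaleb(-(places + 1))   # 0.5 for whole numbers, 0.05 for tenths
    return value - half_unit, value + half_unit

print(boundaries("73"))    # whole number: boundaries have 1 decimal place
print(boundaries("8.6"))   # 1 decimal place: boundaries have 2 decimal places
```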
Nominal Level -Names; No rank or order can be placed on the data
Ordinal Level -Can be placed into categories; can be ordered or ranked. Precise differences between ranks do not exist
Interval Level -Can be ranked, and precise differences between units DO exist. One property lacking: a true zero value
Ratio Level -Has all the properties of interval measurement (precise differences between units), plus a true zero value, so true ratios between values exist
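A small worked example may help show why a true zero matters for ratios. Temperature is used here as an assumed illustration (it is not named in these notes): Celsius is an interval scale because 0 °C is an arbitrary point, while Kelvin has a true zero. The ratio of two Celsius readings is therefore not a meaningful "times as hot" statement.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin (a scale with a true zero)."""
    return celsius + 273.15

# Naive ratio on the interval scale: looks like "twice as hot".
celsius_ratio = 20 / 10

# Ratio on the scale with a true zero: only about 3.5% more.
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(celsius_ratio)
print(round(kelvin_ratio, 3))
```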
Systematic samples- Numbering each subject of the population then selecting every kth subject
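A minimal sketch of systematic sampling follows. The function name and the numbered population are hypothetical; picking a random starting point among the first k subjects is a common convention assumed here, not something stated in these notes.

```python
import random

def systematic_sample(population, k, rng=None):
    """Select every kth subject from a numbered population (illustrative helper)."""
    if rng is None:
        rng = random.Random()
    start = rng.randrange(k)      # assumed: random start among the first k subjects
    return population[start::k]   # then every kth subject after that

# Hypothetical population of 100 subjects numbered 1..100.
subjects = list(range(1, 101))
sample = systematic_sample(subjects, k=10, rng=random.Random(42))
print(sample)
```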
Random samples- Selected by using chance methods or random numbers (e.g., computer-generated random numbers or a table of random numbers)
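Drawing a random sample with computer-generated random numbers might look like the sketch below. The population of 500 numbered subjects, the sample size, and the seed are all assumptions chosen for illustration; the seed only makes the example repeatable.

```python
import random

# Seeded generator so the example is repeatable (the seed value is arbitrary).
rng = random.Random(2024)

# Hypothetical population: subjects numbered 1..500.
population_ids = list(range(1, 501))

# Draw 25 distinct subjects; every subject has the same chance of selection.
random_sample = rng.sample(population_ids, k=25)
print(sorted(random_sample))
```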
Stratified samples- Obtained by dividing the population into groups (strata) according to some characteristic then sampling from each group randomly
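One possible sketch of stratified sampling: group the subjects by a characteristic, then sample randomly within each group. The helper name, the subject records, and the choice of an equal number per stratum are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(subjects, stratum_of, per_stratum, rng):
    """Randomly sample up to per_stratum subjects from each stratum (illustrative)."""
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)   # divide population into strata
    sample = []
    for members in strata.values():
        size = min(per_stratum, len(members))
        sample.extend(rng.sample(members, size))       # random sample within stratum
    return sample

# Hypothetical subjects: (id, sex), 20 of each sex.
subjects = [(i, "F" if i % 2 else "M") for i in range(1, 41)]
picked = stratified_sample(subjects, lambda s: s[1], per_stratum=5, rng=random.Random(7))
print(picked)
```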
Cluster Samples- Population is divided into groups called clusters by some means. Researcher randomly selects some of these clusters and uses all members of the selected clusters as subjects
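Cluster sampling can be sketched as follows. Unlike stratified sampling, the random step picks whole clusters, and every member of a chosen cluster becomes a subject. The hospital wards and member labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose clusters, then take ALL members of each (illustrative)."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: patients grouped by hospital ward.
clusters = {
    "Ward A": ["A1", "A2", "A3"],
    "Ward B": ["B1", "B2"],
    "Ward C": ["C1", "C2", "C3", "C4"],
    "Ward D": ["D1", "D2"],
}
chosen, members = cluster_sample(clusters, n_clusters=2, rng=random.Random(1))
print(chosen)
print(members)
```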
Convenience sampling –uses subjects that are convenient to reach
Sequential sampling –studies one group after another
Double sampling –employs an initial sample, then a follow-up sample
Multi-stage sampling –taking samples in stages; using smaller and smaller units at each stage
Observational studies -Researcher merely observes what is happening or what has happened in the past and tries to draw conclusions based on these observations
Experimental studies -Researcher manipulates one of the variables and tries to determine how the manipulation influences other variables
Quasi-experimental study –researcher manipulates variables without the random assignment of participants
Independent variables -Also known as explanatory variables. Variables that are manipulated by the researcher
Dependent variables -Also known as resultant or outcome variables. Variables being studied to see if they have changed significantly due to the manipulation of the independent variable (IV)
Experimental study groups:
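•Treatment group –receives the manipulation being studied
•Control group –does not receive the manipulation; serves as the comparison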
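In a true experiment, participants are typically assigned to these groups at random, which is the step a quasi-experimental study lacks. A minimal sketch, assuming an even split and hypothetical participant labels:

```python
import random

def assign_groups(participants, rng):
    """Randomly split participants into treatment and control groups (illustrative)."""
    shuffled = list(participants)   # copy so the original order is left untouched
    rng.shuffle(shuffled)           # chance alone decides each participant's group
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participants P1..P10.
participants = [f"P{i}" for i in range(1, 11)]
groups = assign_groups(participants, random.Random(3))
print(groups)
```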
Hawthorne effect -Behavioral change due to awareness of being observed
Confounding variable -Variable, often unforeseen, that influences the dependent (outcome) variable but was not separated from the independent variable