Variable - a measure of a single characteristic that can vary
Variation - is seen not only in the presence or absence of disease but also in the stages or extent of disease
2 TYPES OF MEASUREMENT ERROR
• Systematic error - type of variation that can distort data systematically in one direction
Can introduce bias
• Random error - type of variation that is random, with no consistent direction
Does not introduce bias
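The difference between the two kinds of error can be sketched in Python. This is only an illustration with made-up numbers: the true value, the offset, the noise level, and the helper name simulate_readings are all assumptions, not part of the notes.

```python
import random
import statistics

TRUE_VALUE = 120.0  # hypothetical true blood pressure (mm Hg)

def simulate_readings(n, offset, noise_sd, seed=0):
    """Return n readings: true value + fixed offset + random noise."""
    rng = random.Random(seed)
    return [TRUE_VALUE + offset + rng.gauss(0.0, noise_sd) for _ in range(n)]

# Random error only: noise scatters readings in both directions.
random_only = simulate_readings(10_000, offset=0.0, noise_sd=5.0)

# Systematic error: every reading is pushed 4 units in the same direction.
systematic = simulate_readings(10_000, offset=4.0, noise_sd=5.0)

# Bias = how far the average reading sits from the true value.
random_bias = statistics.mean(random_only) - TRUE_VALUE
systematic_bias = statistics.mean(systematic) - TRUE_VALUE

print(f"bias with random error only: {random_bias:.2f}")
print(f"bias with systematic error:  {systematic_bias:.2f}")
```

With many readings the random errors tend to cancel, so the average stays near the true value, while the systematic offset remains in the average no matter how many readings are taken.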
Biological difference - difference in genes, nutrition, race, environmental exposures, sex, age
STATISTICS AND VARIABLES
Quantitative data - characterized using a defined, continuous measurement scale
Qualitative data - described by features, generally in words rather than numbers
TYPES OF VARIABLES
nominal variable
dichotomous (binary) variable
ordinal (ranked) variable
continuous (dimensional) variable
ratio variable
nominal variables - naming or categoric variables that are not based on measurement scales or rank order
dichotomous (binary) variables - "cut into two" variables with only two levels
ordinal (ranked) variables - data can be characterized in terms of three or more qualitative values that have a clearly implied direction from better to worse
continuous (dimensional) variables - data are measured on continuous measurement scales.
ratio variables - continuous variables measured on a scale that has a true 0 point
Dichotomous, nominal and ordinal variables are referred to as discrete variables because the number of possible values they can take is countable
Risks and proportions - two important types of measurement in medicine that share some characteristics of a discrete variable and some characteristics of a continuous variable
Unit of observation - is the person or thing from which the data originated.
frequency distribution - shows the values of the variable along one axis and the frequency of the value along the other axis.
Types of frequency distribution
Real frequency distribution
theoretical frequency distribution
Real frequency distribution - a frequency distribution obtained from actual data or a sample
Theoretical frequency distribution - a frequency distribution calculated using assumptions about the population from which the sample was obtained.
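A real frequency distribution can be tabulated directly from data. The sketch below uses a hypothetical sample (number of clinic visits per patient) and prints each value next to a bar showing its frequency; the data are invented for illustration.

```python
from collections import Counter

# Hypothetical sample: number of clinic visits per patient
visits = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2, 1, 0]

# Count how often each value occurs (value -> frequency)
frequency = Counter(visits)

# Values along one axis, frequency (as a bar of asterisks) along the other
for value in sorted(frequency):
    print(f"{value}: {'*' * frequency[value]}")
```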
normal distribution - also called the "Gaussian distribution"
named after Johann Karl Gauss
looks something like a bell shape seen from the side
Parameters of a Frequency Distribution
measures of central tendency
Measures of dispersion
measures of central tendency
mode - most commonly observed value
median - the middle observation when data have been arranged in order from the lowest value to the highest value.
Mean - is the average value, or the sum (Σ) of all the observed values (x_i) divided by the total number of observations (N)
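The three measures of central tendency can be checked on a small, made-up data set using Python's standard statistics module. The values below are hypothetical.

```python
import statistics

# Hypothetical observations, already arranged from lowest to highest
values = [2, 3, 3, 5, 7, 10]

# Mode: the most commonly observed value
mode = statistics.mode(values)

# Median: the middle observation; with an even count,
# statistics.median averages the two middle values (3 and 5)
median = statistics.median(values)

# Mean: sum of all observed values divided by the number of observations
mean = sum(values) / len(values)

print(mode, median, mean)  # 3 4.0 5.0
```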
variance - the fundamental measure of dispersion
Standard deviation - the square root of the variance
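A short sketch of both measures of dispersion on hypothetical data. The notes do not say whether the variance is divided by N or by N - 1, so both versions are shown; which one applies is an assumption to check against the course text.

```python
import math
import statistics

# Hypothetical observations; their mean is 5
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: sum of squared deviations from the mean, divided by N
population_variance = statistics.pvariance(values)

# Sample variance: same sum of squares, divided by N - 1
sample_variance = statistics.variance(values)

# Standard deviation is the square root of the variance
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)  # 4 2.0
print(round(sample_variance, 3), round(sample_sd, 3))
```

The sum of squared deviations here is 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7.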
Skewness - A horizontal stretching of a frequency distribution to one side or the other, so that one tail of observations is longer and has more observations than the other tail
Kurtosis - is characterized by a vertical stretching or flattening of the frequency distribution
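Skewness and kurtosis can be quantified from the moments of the data. The sketch below uses one common moment-based definition (third and fourth central moments scaled by the variance); this choice of formula and the helper names are assumptions, since the notes give no formula. Under this definition a normal distribution has skewness 0 and kurtosis 3.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean) ** k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Positive when the longer tail is on the right, negative on the left."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Larger values mean a more peaked, heavier-tailed distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]         # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]  # hypothetical data with a long right tail

print(skewness(symmetric))      # 0.0 for perfectly symmetric data
print(skewness(right_skewed))   # positive: stretched to the right
print(kurtosis(symmetric))
```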
Hypothesis testing of one mean
deals with only one sample or group
Hypothesis testing of two means
deals with only two samples or groups
T-test
used to compare the means of a continuous variable in two research samples
Two sample t-test
used if the two research samples come from two different groups
Paired t-test
used if the two research samples come from the same group
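The two t statistics can be computed by hand with the standard library. This is a sketch under stated assumptions: the two-sample version uses the pooled-variance (Student) form, which the notes do not specify, and all data and function names are hypothetical. The resulting t value would still need to be compared with a t table at the given degrees of freedom.

```python
import math
import statistics

def two_sample_t(sample_a, sample_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (N - 1)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_error
    return t, df

def paired_t(before, after):
    """t statistic for paired measurements taken on the same subjects."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    std_error = statistics.stdev(differences) / math.sqrt(n)
    t = statistics.mean(differences) / std_error
    return t, n - 1

# Two different groups (hypothetical scores)
group_a = [5, 6, 7, 8, 9]
group_b = [3, 4, 5, 6, 7]
t_two, df_two = two_sample_t(group_a, group_b)

# Same group measured twice (hypothetical blood pressures)
before = [140, 150, 138, 145, 152]
after = [135, 147, 136, 140, 148]
t_paired, df_paired = paired_t(before, after)

print(f"two-sample: t = {t_two:.3f}, df = {df_two}")
print(f"paired:     t = {t_paired:.3f}, df = {df_paired}")
```

The paired version works on the within-subject differences, which is why it applies only when both sets of measurements come from the same subjects.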
Central limit theorem
the sampling distribution of the sample mean will approximate a normal distribution
the larger the sample size, the closer it is to a normal distribution
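A small simulation can illustrate the central limit theorem. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1, chosen only for illustration) and compares the means of many small samples with the means of many larger samples. The function names, sample sizes, and seed are arbitrary choices.

```python
import random
import statistics

def sample_means(sample_size, n_samples=2000, seed=0):
    """Draw n_samples samples from a skewed population; return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric distribution."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(x - mean) ** 2 for x in values])
    m3 = statistics.fmean([(x - mean) ** 3 for x in values])
    return m3 / m2 ** 1.5

small = sample_means(sample_size=2)
large = sample_means(sample_size=50)

print(f"n=2:  skewness of sample means = {skewness(small):.2f}")
print(f"n=50: skewness of sample means = {skewness(large):.2f}")
print(f"n=50: average of sample means  = {statistics.fmean(large):.2f}")
```

As the sample size grows, the distribution of sample means becomes less skewed and narrower, even though the underlying population is far from normal.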
Population variance is known
z test is used
sampling distribution of mean is normally distributed
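When the population variance is known, a one-sample z test can be sketched as follows. The numbers and the function name z_test are hypothetical, and the two-sided p-value is one common convention that the notes do not specify.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, population_mean, population_sd, n):
    """One-sample z statistic and two-sided p-value (known population SD)."""
    std_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / std_error
    # Probability of a z at least this extreme in either tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical example: sample of 100 with mean 103,
# tested against a population mean of 100 with known SD of 15
z, p_value = z_test(sample_mean=103, population_mean=100,
                    population_sd=15, n=100)

print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Here the standard error is 15 / 10 = 1.5, so z = 3 / 1.5 = 2.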