experimental studies have one or more controlled factors that are manipulated to see how they affect the dependent variable
observational studies record factors of interest and their outcomes but do not change any factors (natural variation)
categorical data describes a property of individuals that can be used to group them, e.g. hair colour, eye colour, sex
ordinal (ranked) data can be put in an order, but the gaps between ranks are not measured. body condition scoring falls under this category
discrete data exists as whole numbers, such as counts of animals, age in whole years, number of times an animal was sick
continuous data can take any value within a range, including decimals or fractions, e.g. weight, length, exact age, temperature
mode is the most common number/category. it is the average most commonly used for categorical data
range is a measure of the spread of data (largest - smallest)
The interquartile range (IQR) is found by ordering the numbers and finding the values a quarter (lower quartile) and three quarters (upper quartile) of the way along. Their difference is the IQR
Standard deviation is a measure of spread that looks at how far individual values differ from the overall mean.
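The measures above (mode, range, IQR, standard deviation) can be sketched with Python's standard `statistics` module. The weights are invented for illustration, and the quartiles use the module's default method; other textbooks and packages use slightly different quartile conventions, so the IQR may differ a little between tools.

```python
import statistics

# Hypothetical body weights (kg) for nine dogs; invented for illustration.
weights = [18, 20, 21, 22, 22, 25, 27, 30, 34]

mode = statistics.mode(weights)              # most common value
data_range = max(weights) - min(weights)     # largest - smallest

# quantiles(..., n=4) returns the three cut points that split the
# ordered data into quarters (default "exclusive" method).
q1, median, q3 = statistics.quantiles(weights, n=4)
iqr = q3 - q1                                # upper quartile - lower quartile

sd = statistics.stdev(weights)               # sample standard deviation

print(f"mode={mode}, range={data_range}, IQR={iqr}, SD={sd:.2f}")
```

The range depends only on the two most extreme values, whereas the IQR ignores the top and bottom quarters, which makes it less sensitive to outliers.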
a group of individuals in a given area is called a population. A random sample can be measured to represent the population as a whole. We can then use an answer from the sample (called an estimate) to infer the corresponding value for the population (called a parameter)
due to random chance, estimates from samples vary. estimates from small samples vary more. This means that the bigger the sample, the more likely the estimate is to be close to the true population value
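A small simulation can illustrate why estimates from small samples vary more. This is only a sketch: the population is simulated (normally distributed weights with invented mean and spread), and `spread_of_sample_means` is a hypothetical helper written for this note.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, n_samples=1000, seed=1):
    """Draw repeated random samples and return the SD of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

# Simulated population of 10,000 body weights (kg); values are invented.
pop_rng = random.Random(0)
population = [pop_rng.gauss(25, 5) for _ in range(10_000)]

spread_small = spread_of_sample_means(population, sample_size=5)
spread_large = spread_of_sample_means(population, sample_size=50)

# Estimates (sample means) from samples of 5 scatter more widely
# than estimates from samples of 50.
print(f"n=5: {spread_small:.2f}   n=50: {spread_large:.2f}")
```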
standard error is a measure of how much estimates from samples of a given size vary. It is less directly useful clinically than a confidence interval
A confidence interval is a statement of the precision (uncertainty) of an estimate. It gives limits, at a stated percentage confidence (commonly 95%), within which the true population value is likely to lie
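As a rough sketch, the standard error of a mean and an approximate 95% confidence interval can be computed as below. The function name `mean_confidence_interval` and the data are illustrative assumptions. The multiplier 1.96 comes from the normal approximation; for small samples a t-distribution multiplier would usually be used instead and would give a somewhat wider interval.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, standard error, lower limit, upper limit).

    Uses the normal approximation (z = 1.96 for about 95% confidence);
    this is an assumption that is least accurate for small samples.
    """
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, se, mean - z * se, mean + z * se

# Hypothetical body weights (kg); invented for illustration.
sample = [18, 20, 21, 22, 22, 25, 27, 30, 34]
mean, se, ci_low, ci_high = mean_confidence_interval(sample)
print(f"mean={mean:.1f}, SE={se:.2f}, 95% CI {ci_low:.1f} to {ci_high:.1f}")
```

Because the standard error divides by the square root of the sample size, a larger sample gives a narrower interval for the same individual variation.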
the null hypothesis is the concept you are trying to disprove or nullify.
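a p value is the probability of finding a difference at least as large as the one observed between two samples if the null hypothesis were true. By convention, a p value less than 0.05 is taken as sufficient evidence to reject the null hypothesis; it does not prove that the null hypothesis is false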
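One way to see what a p value measures is a permutation test, sketched below. This is an illustrative technique chosen for this note, not necessarily the test a given study would use. If the null hypothesis is true, the group labels are interchangeable, so the labels are shuffled many times and the proportion of shuffles giving a difference at least as large as the observed one is counted. The function name and data are assumptions.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign individuals to groups at random
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical litter sizes under two diets; invented for illustration.
diet_a = [8, 9, 10, 10, 11, 12]
diet_b = [11, 12, 13, 13, 14, 15]
p = permutation_p_value(diet_a, diet_b)
print(f"p = {p:.3f}")
```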
if a null hypothesis is true and there is no difference between samples, but the statistician decides there is a difference (reject the null) this is a type 1 error (reject null when null is actually true)
if the null hypothesis is false and there is a difference between samples, but the statistician decides there is no evidence of a difference (don't reject null) then this is a type 2 error
A powerful study has a low chance of type 2 error. power is influenced by sample size, difference between populations (more marked, more obvious) and individual variation (big variation between individuals increases random variation)
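The three influences on power listed above can be explored with a simulation. This is a sketch under stated assumptions: both populations are simulated as normal, the test is a simple z-style comparison of means at the 0.05 level (a t test would be more appropriate for small samples), and `estimate_power` is a hypothetical helper.

```python
import math
import random
import statistics

def estimate_power(n_per_group, true_difference, sd=1.0, n_trials=500, seed=0):
    """Proportion of simulated studies that reject the null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(true_difference, sd) for _ in range(n_per_group)]
        se = math.sqrt(
            statistics.variance(a) / n_per_group
            + statistics.variance(b) / n_per_group
        )
        z = (statistics.fmean(b) - statistics.fmean(a)) / se
        if abs(z) > 1.96:  # approximate two-sided 0.05 threshold
            rejections += 1
    return rejections / n_trials

print("small sample :", estimate_power(5, 1.0))
print("large sample :", estimate_power(50, 1.0))
print("small effect :", estimate_power(20, 0.2))
print("noisy animals:", estimate_power(20, 1.0, sd=3.0))
```

Since the null hypothesis is false in every simulated study here, each failure to reject is a type 2 error, so the type 2 error rate is one minus the estimated power.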
a result can be statistically significant without being clinically important
a deduction is an inference drawn from the general to the particular, e.g. all dogs are mammals (general), so this dog is a mammal (particular)
an induction is an inference drawn from the particular to the general, e.g. a distemper vaccine protects a dog (particular), so the vaccine protects dogs (general)
a theory is a supposition that explains something, but cannot be proved
A hypothesis is a theory of low testedness. It must be testable, repeatable and objective, and is largely inductive
Koch's postulates state that an organism is causal if it is present in all cases of the disease, does not occur in other diseases, and can be isolated, cultured and reintroduced to cause disease in a healthy animal
statistically significant relationships are not always causal. relationships can arise through confounding by other factors. think fan ventilation of a pig house and respiratory disease in pigs. Pig stocking density is a confounder: it is linked to both the use of fans and the risk of respiratory disease
concomitant variation is a graded difference in (disease) frequency
the main measurements for quantifying disease are morbidity and mortality
meaningful disease measures require definition of:
numerator - cases, deaths etc
denominator - population at risk
populations can be contiguous (most human populations, nomadic cattle, wildlife i.e. they can move and interact) or separated
morbidity has two main measures:
prevalence (number affected out of the at-risk population, % value)
incidence (cumulative incidence (risk) or incidence rate, measures average risk)
cumulative incidence measures the number of animals which become diseased in a period divided by the number of healthy animals at the BEGINNING of that period. It is statistically easy to handle but doesn't allow for the addition of animals and can only be calculated for the first occurrence of disease
incidence rate is the number of new cases of disease that occur in a population divided by the sum of the time at risk contributed by ALL individuals present during that period (e.g. animal-years at risk). Because it allows for animals joining or leaving, it can be used to predict the speed of development of disease in a population
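The three morbidity measures can be written as simple ratios. The herd figures below are invented, and the incidence rate denominator is taken as animal-time at risk (here animal-years), which is the usual convention and an assumption of this sketch.

```python
def prevalence(existing_cases, population_at_risk):
    """Proportion of the at-risk population affected at a given time."""
    return existing_cases / population_at_risk

def cumulative_incidence(new_cases, healthy_at_start):
    """New cases in a period / animals healthy at the START of the period."""
    return new_cases / healthy_at_start

def incidence_rate(new_cases, animal_time_at_risk):
    """New cases per unit of animal-time at risk (e.g. per animal-year)."""
    return new_cases / animal_time_at_risk

# Hypothetical herd of 200 cattle followed for one year.
herd_prevalence = prevalence(existing_cases=20, population_at_risk=200)

# 180 animals were healthy at the start; 18 became diseased during the year.
herd_risk = cumulative_incidence(new_cases=18, healthy_at_start=180)

# Assume each new case was at risk for half a year on average:
# 162 unaffected animals x 1 year + 18 cases x 0.5 year = 171 animal-years.
herd_rate = incidence_rate(new_cases=18, animal_time_at_risk=171.0)

print(f"prevalence={herd_prevalence:.1%}, risk={herd_risk:.1%}, "
      f"rate={herd_rate:.3f} cases per animal-year")
```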
a baseline prevalence is increased by incidence and decreased by deaths or cures
case fatality = number of deaths / number of diseased animals
survival = ( number of cases - number of deaths ) / number of cases
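A minimal sketch of the two formulas above, using invented outbreak figures. The function names are illustrative.

```python
def case_fatality(deaths, cases):
    """Number of deaths / number of diseased animals."""
    return deaths / cases

def survival(deaths, cases):
    """(Number of cases - number of deaths) / number of cases."""
    return (cases - deaths) / cases

# Hypothetical outbreak: 40 diseased animals, 10 of which died.
cf = case_fatality(deaths=10, cases=40)
sv = survival(deaths=10, cases=40)
print(f"case fatality={cf:.0%}, survival={sv:.0%}")
```

With the same numerator and denominator definitions, survival is simply one minus case fatality.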
serological epidemiology can involve measurement of any variable in serum composition. Conventionally this is antibody level
antibodies can be assayed either with a single serial dilution (common, complement fixation tests, agglutination tests) or multiple serial dilutions (experimental, often to assess vaccine potency)
diagnostic tests are evaluated for their validity and reliability (repeatability)
sensitivity is the probability that a diseased animal tests positive
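Sensitivity can be estimated from test results in animals known to be diseased. The counts below are invented, and `sensitivity` is an illustrative helper name.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of diseased animals that test positive."""
    diseased = true_positives + false_negatives
    return true_positives / diseased

# Hypothetical validation study: 100 diseased animals, 90 test positive.
se = sensitivity(true_positives=90, false_negatives=10)
print(f"sensitivity={se:.0%}")
```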