Statistics homework

Cards (96)

  • The number of deaths due to cancer has grown over the years in the Netherlands. In 1970, it was reported that 25 217 people died of cancer. In 2002, the number of deaths reported was 37 975. A politician uses these figures to claim that no progress has been made in treating cancer; in fact, the politician claims that treatment has gotten worse.
  • Variable
    A better summary measure than the number of deaths for examining the effectiveness of treatment of cancer in the Netherlands
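One candidate for such a summary measure is the death rate per 100 000 inhabitants, which adjusts the raw counts for population size. The sketch below uses the death counts from the card; the population figures are rounded placeholders assumed for illustration only (they are not given in the source), and `deaths_per_100k` is a hypothetical helper name.

```python
def deaths_per_100k(deaths, population):
    """Return the number of deaths per 100 000 inhabitants."""
    return deaths * 100_000 / population

# Death counts as stated on the card.
deaths_1970 = 25_217
deaths_2002 = 37_975

# ASSUMPTION: rounded placeholder populations, not taken from the source.
population_1970 = 13_000_000
population_2002 = 16_000_000

rate_1970 = deaths_per_100k(deaths_1970, population_1970)
rate_2002 = deaths_per_100k(deaths_2002, population_2002)

print(f"1970: {rate_1970:.1f} deaths per 100 000")
print(f"2002: {rate_2002:.1f} deaths per 100 000")
```

A crude rate like this still ignores changes in the age structure of the population, so it is only a first step beyond raw counts.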
  • There are indications that an increase in the quantity of calcium in one's diet can help reduce blood pressure. In an experiment, additional calcium was added to the diet of an experimental group, while a placebo was added to a control group's diet. Each participant's systolic blood pressure while resting was measured before the experiment, and again 12 weeks into the experiment.
  • Within-subjects design
    Each participant's score after is compared with their score before
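A minimal sketch of the within-subjects comparison: subtract each participant's "before" reading from their "after" reading, then summarize the differences. The blood pressure values below are hypothetical and only illustrate the arithmetic.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), one pair per participant.
before = [120, 132, 118, 140, 125]
after = [115, 130, 119, 133, 121]

# Change per participant: negative values mean blood pressure went down.
differences = [a - b for b, a in zip(before, after)]

mean_change = statistics.mean(differences)

print(differences)
print(mean_change)
```

Because each person serves as their own baseline, differences between participants in resting blood pressure drop out of the comparison.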
  • A student wants to know if the students in his study year are about the same age. To answer this question, he asks six different students their age.
  • Mean
    The average of a set of numbers: their sum divided by how many numbers there are
  • Median
    The middle value in a set of numbers when they are arranged in order
  • Standard deviation
    A measure of the spread of a set of numbers around the mean
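The three measures above can be computed with Python's standard `statistics` module. The six ages are hypothetical (the card does not list them), and `statistics.stdev` is the sample standard deviation, which divides by n - 1.

```python
import statistics

# Hypothetical ages of six students (not from the source).
ages = [19, 20, 20, 21, 22, 30]

mean_age = statistics.mean(ages)      # sum divided by count
median_age = statistics.median(ages)  # middle of the sorted values
sd_age = statistics.stdev(ages)       # sample standard deviation (n - 1)

print(mean_age, median_age, round(sd_age, 2))
```

With one much older student in the sample, the mean is pulled above the median, which is one reason to report both.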
  • The Questionnaire of Study Habits and Attitudes (QSHA) is a psychological test designed to measure student motivation, study habits, and attitudes. A university gives the QSHA to a sample of 18 female first-year students.
  • Histogram
    A graphical representation of the distribution of numerical data
  • Outlier
    An observation that is numerically distant from the rest of the data
  • Five-number summary
    The minimum, first quartile, median, third quartile, and maximum of a set of numbers
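  • 1.5 x IQR rule
    A rule for identifying outliers based on the interquartile range (IQR): an observation is a suspected outlier if it lies more than 1.5 x IQR below the first quartile or above the third quartile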
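A sketch of the five-number summary and the 1.5 x IQR rule in Python. The data are hypothetical, the function names are illustrative, and the quartiles use the "inclusive" interpolation method of `statistics.quantiles`; textbooks and SPSS may use a different quartile convention and give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

def iqr_outliers(values):
    """Return the values flagged as suspected outliers by the 1.5 x IQR rule."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical data (not from the source).
scores = [19, 20, 20, 21, 22, 30]

print(five_number_summary(scores))
print(iqr_outliers(scores))
```

A boxplot draws exactly these quantities, with points beyond the fences usually plotted individually.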
  • Boxplot
    A graphical representation of the five-number summary of a set of numbers
  • SPSS (Statistical Package for the Social Sciences) is a fully-featured statistical software package. SPSS is designed to make statistical calculations easy.
  • Data view
    The SPSS window where you can view and edit data, like a spreadsheet
  • Variable view
    The SPSS window where you can define variables, rename variables, and change their properties
  • SPSS files always end in the extension .sav
  • In the dataset "1 - IQ.sav", you will find (hypothetical) IQ scores from 12 children, along with their sex and age.
  • In the column for the variable sex, there are two scores: 1 and 2. This is a coding of the sex for each case. The code 1 stands for female, and the code 2 stands for male.
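The coding described above is a lookup from numeric codes to labels. A small Python sketch of the same idea follows; the list of codes is hypothetical, and `SEX_LABELS` is an illustrative name, not something SPSS defines.

```python
# Value labels as described on the card: 1 = female, 2 = male.
SEX_LABELS = {1: "female", 2: "male"}

# Hypothetical codes for four cases (not from the data file).
codes = [1, 2, 2, 1]

# Translate each stored code into its readable label.
labels = [SEX_LABELS[code] for code in codes]

print(labels)
```

Storing codes and keeping the labels separately is what the "Values" column in Variable view does for a variable.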
  • Measure
    A column in the SPSS variable view that determines what type of variable the row represents
  • To determine the coding in use in a data set, we need to use SPSS's "Variable view"
  • Examining variable properties in SPSS Variable view
    1. Click on "Variable view" on the bottom of the screen
    2. Look at the row corresponding to a variable
    3. Look under the "Values" column
    4. Click on the "Values" cell and the three dots to examine the coding
    5. Add new codings for data sets whose codings are not explicit
  • Label column in SPSS Variable view
    Add descriptions to variables
  • Measure column in SPSS Variable view
    Determines what type of variable the row represents (Nominal, Scale, etc.)
  • Computing summary statistics of IQ scores in SPSS
    1. Click Analyze → Descriptive Statistics → Descriptives
    2. Select the IQ variable
    3. Click the arrow button to add IQ to the list of variables
    4. Click "Options..." to modify the descriptive statistics computed
    5. Click "OK" to compute the descriptive statistics
  • The output from SPSS analyses can be saved or printed for later use
  • SPSS output files are saved with the .spo extension (.spv in SPSS 16 and later)
  • To save SPSS data, click File → Save or File → Save As...
  • A body temperature of 37°C has been considered "normal" for the past century
  • The data file "1 - Temperature.sav" contains measurements of body temperatures of many adult men and women
  • Making a histogram for the body temperature variable in SPSS
    1. Ignore the Gender variable for now
    2. Make one histogram for body temperature
  • "Normal" temperature
    The typical or average temperature
  • Measuring how much temperatures vary
    Using summary statistics like range, variance, standard deviation
  • The "normal" temperature of 37°C may not accurately represent the distribution of body temperatures in the data
  • Making histograms and boxplots to compare body temperatures between men and women
    1. Create a histogram with Gender as the panel variable
    2. Create a boxplot with Temp as the variable and Gender as the category axis
  • The "normal" temperatures may not be the same for men and women
  • What IQ scores ensure that a child will not receive CBT?
  • Tufte's principles
    Principles described by Edward Tufte to follow when creating graphical displays of data to clearly communicate the intended information
  • Tufte's principles exercise
    1. Describe the conclusion drawn from the plot
    2. Describe any difficulties in interpreting the plot
    3. Describe how to make the plot better
    4. Sketch a better plot and describe conclusions