questionnaire design issue 2

Cards (36)

  • scales and instruments
    • measurement instruments - studying info not directly available to conscious awareness
    • attitude scales - feelings, perceptions, likes, dislikes, interests and preferences
    • psychometric - psychological assessment - intelligence, personality
  • types of scales - likert scale - attitude measure
    • likert scale developed in an attempt to improve the levels of measurement in social research
    • a 5 point scale to express magnitude of agreement or disagreement
    • respondents are requested to state their level of agreement with a series of statements
  • optimal responding vs non contingent responding
    • optimal responding: the ideal or desired behaviour. implies thoughtful, accurate and contextually appropriate answers that avoid biases such as social desirability or acquiescence. it aims to capture the true opinions, attitudes and behaviours of the respondents
    • non contingent responding: tendency to give consistent answers across different questions, regardless of the content or context of each question. it suggests that individuals respond in a stable and uniform manner throughout the questionnaire, not unduly influenced by question wording, order effects or other external factors. however, extreme non-contingency might indicate rigidity in responses
  • thurstone scale
    • developed by thurstone in 1928, as a means of measuring attitudes towards religion
    • equal-appearing intervals: each statement represents a different scale value for the attitude (from highly favourable towards religion, through neutral, to highly unfavourable), as determined by a panel of judges
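
A rough sketch of the equal-appearing-interval idea. It assumes judges rate each statement on an 11-point favourability scale, that a statement's scale value is the median of those ratings, and that a respondent's score is the mean scale value of the statements they endorse. The statements, ratings and function names are made up for illustration.

```python
from statistics import median

def scale_values(judge_ratings):
    """Median judge rating per statement (assumed 11-point favourability scale)."""
    return {statement: median(ratings) for statement, ratings in judge_ratings.items()}

def respondent_score(endorsed, values):
    """Mean scale value of the statements the respondent agreed with."""
    if not endorsed:
        raise ValueError("respondent endorsed no statements")
    return sum(values[statement] for statement in endorsed) / len(endorsed)

# Hypothetical ratings from three judges for three statements.
judges = {
    "A": [10, 11, 10],  # judged highly favourable
    "B": [6, 5, 6],     # judged roughly neutral
    "C": [1, 2, 2],     # judged highly unfavourable
}
values = scale_values(judges)                  # A: 10, B: 6, C: 2
print(respondent_score(["A", "B"], values))    # 8.0
```
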
  • thurstone scale - limitations
    • subjectivity in statement selection: the choice of statements can impact the validity and reliability of the scale
    • true interval scale: achieving true interval scaling is challenging - assumes the distances between adjacent scale points are equal
    • limited sensitivity: limited ability to differentiate between individuals with subtle differences in attitudes
    • assumption of unidimensionality: in reality, attitudes and opinions may be more complex and multidimensional, which can limit the scale's ability to capture the full range of variation
  • thurstone scale-limitations cont
    • scoring complexity: can be complex and may involve sophisticated statistical techniques such as factor analysis
    • response bias: reluctance to use extreme categories
    • reliance on expert judgement: the creation of thurstone scales often involves expert judgement in the selection and ranking of statements. critics argue that this reliance on expert opinion may introduce biases and limit the scale's generalisability
  • guttman scale
    • unidimensional scale with cumulative property: statements are ordered so that a person who accepts a particular item will also accept all previous items
    • premise is that if a person agrees with an extreme indicator of the variable in question, they will also agree with a less extreme indicator
  • guttman scale - to determine if a hierarchical pattern exists among responses
    1. do you drink?
    2. do you smoke marijuana?
    3. do you use cocaine?
    if a person answers yes to 3, they would be expected to also answer yes to 2 and 1
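
One way to check whether responses follow the expected hierarchy is a coefficient of reproducibility: 1 minus the proportion of responses that deviate from the ideal cumulative pattern. The sketch below assumes one common error-counting convention (comparing each respondent with the ideal pattern implied by their total score); the data and function name are hypothetical. Values of about 0.90 or above are conventionally read as supporting a cumulative scale.

```python
def reproducibility(response_patterns):
    """Coefficient of reproducibility = 1 - errors / total responses.

    Each pattern is a list of 1 (yes) / 0 (no) answers, with items ordered
    from least extreme to most extreme.
    """
    errors = 0
    total = 0
    for pattern in response_patterns:
        score = sum(pattern)
        # Ideal cumulative pattern for this score: yes to the first `score` items only.
        ideal = [1] * score + [0] * (len(pattern) - score)
        errors += sum(actual != expected for actual, expected in zip(pattern, ideal))
        total += len(pattern)
    return 1 - errors / total

# Items in order: drink, marijuana, cocaine (hypothetical respondents).
patterns = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 1],  # breaks the hierarchy: yes to cocaine, no to the other two
]
print(round(reproducibility(patterns), 3))  # 2 errors in 15 responses -> 0.867
```
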
  • guttman scale limitations
    • unrealistic assumption of unidimensionality: that all items are measuring a single latent trait. The previous example shows that this is not true
    • stringent requirements for scalability: all respondents must endorse all items below their own position on the scale. limited flexibility in item placement
    • the guttman scaling procedure assumes a fixed hierarchy of items; this lack of flexibility can be a limitation when researchers want to add or remove items from the scale or consider alternative items
  • guttman scale
    • difficulty in scale development: requires a rigorous process of item development, testing and refinement. achieving a perfect hierarchical arrangement of items can be challenging
  • guttman scale limitations
    • scoring complexity: the scoring and analysis of guttman scales can be complex, especially when dealing with large datasets
  • guttman scale limitations
    • limited measurement precision of the underlying trait: the scale provides ordinal data but may not offer a precise measurement of the interval between different levels on the scale
  • guttman scale limitations
    • guttman scales may struggle to detect intermediate positions on the latent trait. in other words, the scale might not effectively capture nuances or variations in respondents' attitudes or behaviours
  • semantic differential scale
    • type of rating scale designed to measure the connotative meaning of an object, event or concept, from which the attitude towards it is derived
    • measures someone's attitude in more depth - encourages the participant to think about the topic more deeply
  • semantic differential scale limitations
    • relatively simplistic representation of attitudes or concepts: may not capture the full complexity and nuances of the underlying construct being measured
    • limited number of scale points (typically 5-7): may not provide enough granularity to accurately reflect the subtleties of respondents' attitudes
    • interpretation subjectivity: different individuals may interpret the scale points differently, leading to potential variations in how respondents understand and use the scale
    • assumption of linearity: assumes the distance between adjacent points is equal
  • semantic differential scale limitations
    • culture and language sensitivity: the choice of words to represent the scale endpoints may not have the same connotations or meanings in different cultural or linguistic contexts
    • limited contextual information: semantic differential scales provide a numerical score but may lack detailed contextual information about why respondents chose a particular position on the scale
    • inability to capture changes over time
    • difficulty in developing anchors
  • semantic differential scale limitations cont
    • selecting appropriate and balanced scale anchors can be challenging
    • potential response bias: respondents may exhibit response bias, such as a tendency to use only certain points on the scale or to avoid extreme categories
  • psychometric tests
    • tests and questionnaires are often referred to as 'psychometric' because psychological theories of human behaviour and its measurement have been used in their construction
    • used to measure a person's capacities, work style or values. employers need this sort of information when they want to recruit a new employee or understand the potential and development needs of an existing one
  • psychometric tests
    when developing a new psychometric measure e.g. for work performance
    1. psychologists first carefully define what it is they want to measure
    • involves researching evidence on work performance to identify which personal factors are related to quality of functioning in a particular area
  • 3 diff categories of psychometric tests
    1. normative tests - where data exists which tells us the range of scores expected from the population under consideration e.g. IQ scores
    2. criterion referenced tests - tests commonly used in education where a candidate has to meet some pre-arranged standard
    3. idiographic tests - tests used in therapy to track an individual's progress over time
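
The three categories differ mainly in what a raw score is compared against. A small sketch, with made-up numbers and hypothetical function names: a normative score is placed relative to population norms (here on the familiar IQ-style scale with mean 100 and SD 15), a criterion-referenced score is compared with a fixed pass mark, and an idiographic score is compared with the same person's own baseline.

```python
def normative_score(raw, norm_mean, norm_sd):
    """Convert a raw score to an IQ-style scale (mean 100, SD 15) using population norms."""
    z = (raw - norm_mean) / norm_sd
    return 100 + 15 * z

def meets_criterion(raw, pass_mark):
    """Criterion-referenced: only the pre-arranged standard matters."""
    return raw >= pass_mark

def change_from_baseline(scores):
    """Idiographic: each score is compared with the person's own first score."""
    baseline = scores[0]
    return [score - baseline for score in scores]

print(normative_score(30, norm_mean=24, norm_sd=4))  # 122.5
print(meets_criterion(30, pass_mark=32))             # False
print(change_from_baseline([18, 15, 11]))            # [0, -3, -7]
```
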
  • applications of personality traits
    • criminal psychologists might employ questionnaires to measure impulsivity and its relation to crime
    • health psychologists might measure people's optimism in relation to their response to a cancer diagnosis
    • occupational psychologists often employ personality tests to predict job performance and job suitability
    • all require standardisation - must be administered and scored the same way every time
  • Construction of Psychometric tests
    • Rigorous construction procedures including several pilot stages
    • Large number of questions (test items) typically >40
    • Item analysis:
      – individual items
      – combined effects of test items
      – filter redundant / non-equivalent questions out
    • Evaluation to ensure:
      – Reliability (same scores over time)
      – Validity (does it measure what it is supposed to?)
      – Appropriate convergent and discriminant validity (compared to other measures ..?)
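
One common way to carry out the item-analysis step is the corrected item-total correlation: each item is correlated with the sum of the remaining items, and items with low or negative values become candidates for removal. This is an illustrative technique rather than the specific procedure the notes have in mind; the data and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def corrected_item_total(scores):
    """Correlate each item with the total of the other items.

    scores: one row per respondent, one column per item.
    """
    n_items = len(scores[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical pilot data: the fourth item does not behave like the others.
pilot = [
    [5, 4, 5, 2],
    [4, 4, 4, 3],
    [2, 1, 2, 4],
    [1, 2, 1, 3],
    [3, 3, 3, 1],
]
for number, r in enumerate(corrected_item_total(pilot), start=1):
    print(f"item {number}: r = {r:.2f}")
```

In this made-up data the fourth item correlates negatively with the rest, so it would be flagged for rewording, reverse-scoring or removal.
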
  • test-retest reliability: if a person retakes the test or takes a similar test within a short time after first testing, does he or she receive approximately the same score?
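
Test-retest reliability is usually summarised as the correlation between scores from the two sittings. A minimal sketch with made-up scores for five participants; the variable and function names are illustrative.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five people on two occasions.
first_sitting = [12, 15, 9, 20, 17]
second_sitting = [13, 14, 10, 19, 18]

r = pearson(first_sitting, second_sitting)
print(f"test-retest r = {r:.2f}")  # close to 1 means scores were stable
```
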
  • reliability: split-half method: the test is split into two halves, both completed by the same participants, and scores on the two halves are correlated
  • alternative-forms method: two equivalent versions of test developed and given to same participants on two occasions
  • internal reliability
    • determines the internal consistency or average correlation of items in a questionnaire to assess its internal reliability
    • cronbach's alpha, written with the greek letter alpha (α)
    • should range between 0 and 1 (if negative, check that you have reverse-scored the correct items)
    • an α > .70 is conventionally taken as acceptable
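
A short sketch of the standard formula for cronbach's alpha, α = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The data and function name are hypothetical, and items are assumed to be reverse-scored already.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table of scores.

    scores: one row per respondent, one column per item (already reverse-scored).
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: five respondents, three items.
data = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 1, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(data)
print(f"alpha = {alpha:.2f}")  # 0.95 for this made-up data, above the .70 guideline
```
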
  • face validity - does the test seem valid according to common sense
  • validity cont
    • criterion validity: the extent to which a measure relates to an outcome
  • external validity: how well the findings of a study generalise to other situations or populations
  • content validity: a test should sample the full range of a behaviour represented by the theoretical concept being tested (not just one part of it, e.g. difficulty concentrating, remembering details and making decisions)
  • construct validity
    • does it truly represent the theoretical construct it was developed to assess
  • ecological validity: are the results representative of the results that would be obtained from studying that behaviour in the natural environment
  • some threats to internal validity
    • mortality (attrition): affects longitudinal studies - participants drop out before the study is completed
    • maturation: participants change independently of your study, through factors such as tiredness, boredom and hunger
  • threats to internal validity of experiments
    • regression effect: tendency of participants with extreme scores on a first measure to score closer to the mean on a second testing
    • related to the fact that extreme scores tend to be partly due to random error, so on second testing performance will be closer to the mean. you might wrongly conclude that good students did worse (less effort) and bad students did better (benefitted from interventions)
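
The regression effect can be illustrated with a small simulation. The sketch assumes each observed score is a stable true score plus independent random error on each testing, with no intervention at all; the means, standard deviations, seed and function name are arbitrary choices for illustration.

```python
import random

def simulate_regression_effect(n=1000, seed=1):
    """Compare extreme groups (chosen on test 1) across two error-prone testings."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    first = [t + rng.gauss(0, 10) for t in true_scores]   # test 1 = true score + error
    second = [t + rng.gauss(0, 10) for t in true_scores]  # test 2 = true score + new error

    ranked = sorted(range(n), key=lambda i: first[i])
    cut = n // 10
    bottom, top = ranked[:cut], ranked[-cut:]  # lowest and highest 10% on test 1

    def group_mean(indices, scores):
        return sum(scores[i] for i in indices) / len(indices)

    return {
        "top_first": group_mean(top, first),
        "top_second": group_mean(top, second),
        "bottom_first": group_mean(bottom, first),
        "bottom_second": group_mean(bottom, second),
    }

for label, value in simulate_regression_effect().items():
    print(f"{label}: {value:.1f}")
```

Nothing changes between the two testings in this simulation, yet the top group's mean drops and the bottom group's mean rises on the second test, which is the pattern that can be mistaken for a real effect.
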