2 = Attention

Cards (38)

  • Attention has a very broad meaning   =>  There are likely to be multiple psychological and/or perceptual mechanisms behind it. As such, attention is rather difficult to define and research.
  • Definition of Visual Attention:
    The selection of some visual stimulus or set of visual stimuli at the expense of others for further visual and cognitive analysis and often for the control of behaviour.
    Being alert is different: it involves no choosing/selecting of what to focus on or attend to, just a general increase in readiness.
  • Multiple types of attention:

    External, Internal, Overt, Covert, Divided, Sustained
  • External attention:

    Attending to stimuli in the world
  • Internal attention:
    Attending to one line of thought over another or selecting one response over another
  • Overt attention:

    Directing a sense organ towards a stimulus, like pointing your eyes or turning your head
  • Covert Attention:
    Attending without directing sensory organs towards a stimulus
  • Divided Attention: 

    Splitting attention between two different stimuli
  • Sustained Attention:
    Continuously monitoring some stimulus
  • Why do we need attention?
    Translated into information measures, each eye takes in roughly 1 GB per second.
    This amount of incoming sensory information requires multiple mechanisms for efficient processing (e.g. spatial summation or spatial filters). Even then, it is still necessary to select information, to prevent overload and avoid hitting the capacity limits for transmission, processing and storage.
  • Define spatial summation:
    Spatial summation in the context of attention involves the brain’s ability to integrate multiple sensory inputs from different locations to form a coherent perception. This process allows us to focus on important stimuli while filtering out irrelevant information.
    For example, when in a noisy room, your brain uses spatial summation to combine auditory signals from various sources. This helps you concentrate on a specific conversation despite the background noise. 
  • Define spatial filters (attention):
    Spatial filters refer to the brain’s ability to selectively process information from specific regions of the visual field while ignoring irrelevant areas. This mechanism helps us focus on important stimuli and filter out distractions.
    For example, when you’re looking for a friend in a crowded room, your brain uses spatial filters to prioritize visual information from the area where you expect to find them, while ignoring other parts of the room. This selective attention allows you to efficiently locate your friend despite the surrounding noise and movement (a computational sketch follows below).
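
    A minimal sketch of the spatial-filter idea (my own illustration, not from the lecture; the array sizes and parameters are arbitrary): attention is modelled as a Gaussian window that preserves the signal near the attended location and attenuates everything else.

```python
import numpy as np

def spatial_attention_filter(image, cx, cy, sigma):
    """Weight an image by a Gaussian 'attention window' centred at (cx, cy).

    Pixels near the attended location keep their strength; pixels far
    away are attenuated, mimicking selective processing of one region
    of the visual field.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    window = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return image * window

# Toy 'visual field': random contrast values in a 100 x 100 array.
rng = np.random.default_rng(0)
scene = rng.random((100, 100))

# Attend to the region around column 70, row 30.
filtered = spatial_attention_filter(scene, cx=70, cy=30, sigma=10)
print(filtered[30, 70], filtered[0, 0])  # attended pixel kept, far corner ~0
```
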
  • What is selected?
    Salient information – selected for bottom-up or top-down salience
    Bottom-up: salient aspects of the stimulus capture attention and drive the system from ‘the bottom-up’ - something that happens that we then attend to.
    Top-down: internally determined (e.g. through motivation) salient content is selected from ‘the top-down’ - think of something, then attending to it as a result of the thought.
  • What is salience?
    Salience refers to the quality of a stimulus that makes it stand out and capture attention.
  • How is it selected?
    Location-based (spatial) attention (e.g. look at the back)
    Feature-based attention (e.g. white spots on bunnies)
    Object-based attention (e.g. follow the ball)
  • Initial models of (spatial) attention (focusing on a specific location)
    "Spotlight" model:
    Attention is restricted in space and moves from one point to the next. Areas within the spotlight receive extra processing (Posner, 1980).
    "Zoom lens" model:
    The attended region can grow or shrink depending on the size of the area to be processed (Eriksen and Yeh, 1985)
    But what about the features (e.g. colour, motion direction, orientation)? A toy sketch of the zoom-lens trade-off follows below.
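
    A toy illustration of the zoom-lens trade-off (my own sketch; the fixed-resource assumption and all numbers are illustrative, not from Eriksen and Yeh): if the total attentional resource is fixed, enlarging the attended region lowers the processing gain available at any single location.

```python
import numpy as np

def zoom_lens_gain(sigma, size=201):
    """Gain profile of a 1-D attentional 'zoom lens' with a fixed budget.

    The Gaussian window is normalised to sum to 1, so widening the
    attended region (larger sigma) spreads the same resource more
    thinly, lowering the peak gain at the centre.
    """
    x = np.arange(size) - size // 2
    window = np.exp(-x ** 2 / (2 * sigma ** 2))
    return window / window.sum()

for sigma in (5, 15, 45):
    print(f"sigma={sigma:>2}: peak gain = {zoom_lens_gain(sigma).max():.4f}")
# The peak gain shrinks as the lens 'zooms out' over a larger area.
```
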
  • Finding the location of a feature: Visual Search (Treisman & Gelade, 1980) – requires attention?
    Feature search, Conjunction search, Shape (i.e., spatial configuration) search


    Visual search is where you have to find a particular ‘target’ element amongst many other elements (distractors).

    This is the task that Treisman and Gelade designed and used to study ATTENTION to features. The findings from their studies led to the development of the feature integration theory (FIT).
  • Visual Search: Feature Search
    Feature search: the target is defined by the presence of a single feature (e.g. find the red bar amongst blue bars). It does not matter how many other “distractors” there are – the target is identified incredibly quickly.
  • Visual Search: Conjunction search
    Conjunction search: the target is defined by the conjunction (co-occurrence) of two or more features. Reaction time increases with the number of distractors, most likely because each candidate item has to be checked to confirm the identification.
  • Visual search: Shape (i.e., spatial configuration) search:
    Shape search: the target and distractors contain the same basic, spatial features
  • What have we learned from visual search?
    The visual search literature provided the methods to quantify attentional efficiency.
    The efficiency of visual search is the average increase in RT for each item added to the display.
    Measured in terms of search slope, or ms/item 
    The larger the search slope (more ms/item), the less efficient the search.
    Some searches are efficient and have small slopes (parallel – can look at all and identify quickly).
    Some searches are inefficient and have large slopes (serial – you have to go from item to item to make sure each is the right one). A small simulation of both cases follows below.
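
    A small simulation of the search-slope measure (my own sketch; the base reaction time, slopes and noise level are invented for illustration): generate mean RTs for several set sizes under a flat "feature" search and a steep "conjunction" search, then recover the slope in ms/item with a linear fit.

```python
import numpy as np

rng = np.random.default_rng(1)
set_sizes = np.array([4, 8, 16, 32])

def mean_rts(base_ms, slope_ms_per_item, n_trials=200):
    """Mean simulated reaction times: RT = base + slope * set size + noise."""
    return np.array([
        (base_ms + slope_ms_per_item * n + rng.normal(0, 20, n_trials)).mean()
        for n in set_sizes
    ])

# Illustrative values: ~0 ms/item for parallel feature search,
# ~25 ms/item for serial-like conjunction search.
searches = {"feature": mean_rts(450, 0.5), "conjunction": mean_rts(450, 25.0)}

for name, rts in searches.items():
    slope, intercept = np.polyfit(set_sizes, rts, 1)  # search slope in ms/item
    print(f"{name:>11}: slope = {slope:5.1f} ms/item (intercept = {intercept:.0f} ms)")
```
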
  • Define ecological validity:
    Ecological validity refers to the extent to which the findings of a research study can be generalized to real-world settings
  • The methods of visual search also provided a way to investigate one of the key questions of perception: how are objects represented?
    Colour, motion, and orientation are represented by somewhat specialised brain regions (e.g. motion in the dorsal stream; orientation and colour in the ventral stream)
    So how do we combine these features when perceiving the bar?
    The challenge of tying different attributes of visual stimuli, which are handled by somewhat different brain circuits, to the appropriate object so we perceive a unified object
  • Terms of reference for the brain:
    Superior/Dorsal = top of the brain
    Inferior/Ventral = bottom of the brain
    Anterior/Rostral = front of the brain
    Posterior/Caudal = back of the brain
    Towards the Midline of the brain = medial
    Away from the Midline of the brain = lateral
  • Anne Treisman: Feature integration theory (Treisman & Gelade, 1980)

    This theory holds that a limited set of basic features can be processed in parallel pre-attentively, but that other properties, including the correct binding of features to objects, require attention.
  • Define the pre-attentive stage in the Feature Integration Theory (FIT)
    Pre-attentive stage: The processing of a stimulus that occurs before selective attention is deployed to that stimulus
  • Feature Integration Theory: Details
    Sensory “features” (colour, size, orientation, etc.) coded in parallel by specialized modules
    These specialised Modules form two kinds of “maps”
    Feature maps (Colour maps, orientation maps etc.)
    Master map of locations
  • Role of attention - Feature Integration Theory (FIT)

    Moves within the location map
    Selects whatever features are linked to that location
    Features of other objects are excluded
    Attended features are then entered into the current temporary object representation
  • Feature Integration Theory (FIT) maps explained:
    1. Feature Maps: These are separate maps in the brain that register basic visual features like colour, orientation, and motion. Each feature is processed independently and in parallel.
    2. Master Map of Locations: This map contains all the locations where features have been detected. It acts as a central reference point, integrating information from the various feature maps
    So the sequence runs: visual scene (stimulus), then the feature maps, then the master map of locations (with the attentional spotlight), and finally object perception and recognition (a schematic sketch follows below).
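
    A schematic sketch of that pipeline (my own illustration, loosely in the spirit of FIT and later saliency-map models; the feature extraction is deliberately crude and every parameter is arbitrary): separate feature maps are computed in parallel, then combined into a master map whose peak marks the location that captures attention.

```python
import numpy as np

def feature_maps(image_rgb):
    """Crude 'feature maps' computed in parallel from an RGB image array."""
    r, g, b = image_rgb[..., 0], image_rgb[..., 1], image_rgb[..., 2]
    intensity = (r + g + b) / 3
    red_green = r - g                      # colour-opponent map
    gy, gx = np.gradient(intensity)        # local contrast as a stand-in
    contrast = np.hypot(gx, gy)            # for an orientation/edge map
    return {"intensity": intensity, "red_green": red_green, "contrast": contrast}

def master_map(maps):
    """Combine feature maps into a master map of locations.

    Each map contributes its deviation from the map mean, so a location
    that is distinctive on any single feature dimension becomes salient.
    """
    total = sum(np.abs(m - m.mean()) / (np.abs(m - m.mean()).max() + 1e-9)
                for m in maps.values())
    return total / len(maps)

# Toy scene: uniform grey background with one red patch.
scene = np.full((50, 50, 3), 0.5)
scene[20:25, 30:35] = [1.0, 0.0, 0.0]      # unique colour => bottom-up salience

salience = master_map(feature_maps(scene))
row, col = np.unravel_index(salience.argmax(), salience.shape)
print("Attention drawn to location:", (row, col))  # inside/near the red patch
```
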
  • … But search is not that simple...
    The visual search paradigm produced data that were not consistent with the feature integration theory
    Search is very fast only when the objects look 3-D, even though they consist of the same elements (Enns & Rensink, 1991) – suggesting that 3-D structure is given some form of early priority in the visual system.
  • Can we easily define a feature?
    Faster detection of presence than absence - but what is the “feature”, when the two shapes are the same?
    Search asymmetries exist because the visual system is highly sensitive to only certain types of information – and because attentional guidance is connected to memory representations
    This study suggests that the visual system seeks out discontinuities in curvature and uses them, among other cues, to obtain a 3-D representation of the 2-D retinal image
  • Feature Integration Theory viewed from the contemporary perspective
    FIT is a comprehensive theory of visual processing rather than just of attention:
    it starts with the earliest stages of sensory encoding (e.g., parallel vs. serial processing)
    and ends with the nature of the internal representations implicated in object recognition (Quinlan, 2003)
    The theory was too simple in its assumption that feature binding/object formation always requires attention
    However, cross-dimensional feature binding is indeed an attention-demanding process
    Location does play a key role in attentional selection
  • Cross-modal perception and attention
    Integration of information from different senses provides more reliable and coherent information about the world – and cross modal cues are highly effective drivers of attention

    Multisensory neurons in the superior colliculus of the cat (a structure that directs rapid eye movements) respond to stimuli from more than one sense
    The receptive fields of these multisensory neurons overlap across visual, auditory and somatosensory space. A sketch of reliability-weighted cue combination follows below.
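
    One standard way to formalise why cross-modal integration yields more reliable information is reliability-weighted (maximum-likelihood) cue combination; this sketch and its numbers are illustrative, not taken from the lecture.

```python
def combine_cues(mu_v, var_v, mu_a, var_a):
    """Reliability-weighted (maximum-likelihood) combination of two cues.

    Each cue is weighted by its inverse variance; the combined estimate
    sits between the cues and is *more* reliable than either alone.
    """
    w_v = (1 / var_v) / (1 / var_v + 1 / var_a)
    mu = w_v * mu_v + (1 - w_v) * mu_a
    var = 1 / (1 / var_v + 1 / var_a)
    return mu, var

# Illustrative numbers: vision localises an event at 10 deg (variance 1),
# audition at 14 deg (variance 4). The estimate is pulled towards vision,
# and the combined variance (0.8) is lower than either cue's alone.
mu, var = combine_cues(10.0, 1.0, 14.0, 4.0)
print(f"combined estimate = {mu:.1f} deg, variance = {var:.2f}")
```
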
  • Social cues to attention (Kuhn & Land, 2006)
    Body language and eye gaze cues are highly salient drivers of attention – which can also be used for misdirection
  • Conclusions
    Attention is a multi-faceted phenomenon and hence a vast area of study in cognitive psychology, vision science and neuroscience
    Attentional selection is key to information processing in complex everyday environments
    It is a great challenge to encompass all the different attentional mechanisms (spatial, feature, object-based) within a single theory – it would necessarily have to become a theory of visual perception as a whole
  •  Handling objects under visual control is a common and frequent task for humans, which requires close interaction between hand and eye movements, and between visual and somatosensory information. This is also a cross-modal interaction.
  • An interesting case in point is the cross-modal adaptation between the visual system and the proprioceptive system in the perception of one’s own running speed (Pelah & Barlow, 1996). The least obvious, but arguably most important, function of multi-sensory interaction is to generate an internal representation of a coherent world: we do not perceive there to be separate visual and acoustic spaces around us but a unitary and physically coherent environment.
  • Motion sickness explained:

    The tendency to create a unified representation is so strong that some cases of cue conflict can even produce physiologically aversive reactions: when vestibular information about our own movement conflicts with the visual information, we experience motion sickness. Because multi-sensory integration is so important, there is large interest from the engineering disciplines in understanding its basic underlying mechanisms.