Our behavior will reflect the amount of reinforcers received
The proportion of behavior should equal or match the proportion of reinforcers available for that behavior
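The matching relation above can be checked with a little arithmetic. The sketch below assumes strict matching (no bias or sensitivity terms); the function name `matching_proportions` and the reinforcer rates are made up for illustration only.

```python
def matching_proportions(reinforcer_rates):
    """Predicted share of behavior per option under strict matching.

    Each option's share of responding equals its share of the total
    reinforcers: B_i / sum(B) = r_i / sum(r).
    """
    total = sum(reinforcer_rates)
    if total <= 0:
        raise ValueError("at least one option must deliver reinforcers")
    return [rate / total for rate in reinforcer_rates]


# Hypothetical example: option A yields 40 reinforcers per hour, option B yields 10.
shares = matching_proportions([40, 10])
print(shares)  # option A should take 80% of responding, option B 20%
```

With these made-up numbers, 80% of responses are predicted to go to the option that supplies 80% of the reinforcers.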
Instrumental Response
Thorndike and Skinner viewed instrumental conditioning as producing repetitions of the same response
Instrumental Reinforcer
Relevance or Belongingness
You can reinforce opening your mouth, or mouth movements, but you cannot reinforce a true yawn
Quantity and Quality of a Reinforcer
Both matter; the effect of a reinforcer depends on whether what you are getting now is relatively better or worse than what you got before
Response-Reinforcer Relation
Effects of delay between response and reinforcer
Each group was compared against performance in a yoked control group that got the same number and rate of pellet delivery but without any response dependency
Exploitation
Always only doing one thing or marrying one person because it pays out
Exploration
Keeping options open or responses variable or still dating around to see if something pays out even more
Variability can be increased by reinforcement
In the absence of explicit reinforcement of variability, responding becomes more stereotyped
Instinctive Drift
When you try to reinforce a behavior, it might naturally drift onto something that is more natural for the animal to do
Rats were first trained to run down a runway for either a small (2 pellets) or large (22 pellets) reward; then, in phase 2, half of the rats in each condition were rewarded with the quantity used in the other condition
Premack Principle
Reinforcement occurs when an instrumental act allows access to a more preferred behavior; punishment occurs when an instrumental act is followed by a less preferred behavior
Contiguity
How close in time the reinforcer follows the response; TEMPORAL relation
Master person gets reinforced for every 2 hours of studying, yoked person gets reinforced independently of their own behavior
Conditioned Reinforcers
Immediately follow the correct response with a CS that has previously been paired with the reinforcers
Marking Stimuli
Delivering something new to the animal right after a response to mark that response; they differ from secondary reinforcers because they are paired EQUALLY with correct and incorrect responses
Contingency
The probability you will get something by doing an action MINUS the probability you will get something by not doing the action
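The contingency definition above is a simple difference of two probabilities. A minimal sketch, assuming both probabilities are already known; the function name `delta_p` and the example values are hypothetical.

```python
def delta_p(p_outcome_given_response, p_outcome_given_no_response):
    """Contingency as P(outcome | response) minus P(outcome | no response)."""
    for p in (p_outcome_given_response, p_outcome_given_no_response):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_outcome_given_response - p_outcome_given_no_response


# Hypothetical values for illustration only.
print(delta_p(0.5, 0.0))  # positive: the outcome depends on responding
print(delta_p(0.5, 0.5))  # zero: the outcome is just as likely without responding
print(delta_p(0.0, 0.5))  # negative: responding prevents the outcome
```

The middle case is what happens when the outcome is delivered just as often without the response: the action no longer makes any difference to getting the outcome.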
Belief/Contingency
Action A-> goal X
Instrumental actions should be sensitive to changes in the causal relation (contingency) between action and outcome, and sensitive to changes in the value of the reinforcer
Responses that persist under an omission contingency (where making the response cancels the outcome) are driven by a CS-US association, not an R-O association
Contingency degradation reduces the causal relationship between the action and the outcome
S-R
Thorndike's Law of Effect; Pavlovian: do you elicit a response to it?
S-O
Two Process Theory, how it affects instrumental behavior
Outcome Expectancy (R-O) and Two Process Theories (Original and Revised) make different predictions after devaluation with a single response and outcome, two responses and outcomes, and a single manipulandum with two responses and outcomes
Specific Pavlovian-to-Instrumental Transfer
Cannot occur when you get O1 and O2 in the same context (S): anything that comes out of that S will increase both behaviors equally, so there is no way to get a specific increase in one of them
If instrumental performance is no longer sensitive to devaluation, it has become a HABIT due to overtraining strengthening the S-R association
Overtraining
Alters performance after outcome devaluation
If behavior is S-R driven
The responding DOES NOT change after devaluation because there is no O in the association
By contrast, if behavior is goal-directed (R-O driven), you aren't going to press the lever, for example, if it gives you an outcome you no longer want
Devaluation
You keep giving the outcome (e.g., the food) until the animal will no longer eat it
Habit
When instrumental performance is no longer sensitive to devaluation, it has become a HABIT
S-R association
Has been strengthened over time, so now it just seems like a habit to do those things
S will generate the R without much confusion about why you are doing it
Devaluation logic
An action that is sensitive to the value of the outcome will decrease when the value of the outcome has been reduced
Overtraining
Results in the S-R association controlling responding
S-R associations
Elicit responses independent of the current value of the outcome