Operant conditioning uses reinforcement to encourage behaviour to continue.
Operant conditioning is learning through trial and error using reinforcement and punishment.
Skinner (1904-1990) was an American psychologist widely regarded as one of the founding figures of behaviourism. He suggested that reinforcement was the primary factor in shaping behaviour.
Skinner developed the ‘Skinner Box’ to show the effects of reinforcement on the behaviour of rats.
Rats were placed in the Skinner box and, through trial and error, discovered the lever which released food pellets. They then learnt that by pressing the lever food could be released. This is an example of positive reinforcement.
Positive reinforcement strengthens a behaviour by following it with a pleasant consequence, such as a reward.
Negative reinforcement encourages a behaviour to continue by taking away something unpleasant.
Negative reinforcement strengthens behaviour because it stops or removes an unpleasant experience.
Punishment is the opposite of reinforcement since it is designed to weaken or eliminate a response rather than increase it.
Primary reinforcement is a type of reinforcement that satisfies an individual’s basic needs, such as food, water or sex (essential for survival).
Secondary reinforcement is a type of reinforcement which becomes associated with the primary reinforcement. It is anything the individual or animal has to learn to regard as positive through experience (e.g. money).
The law of reinforcement suggests a positive reward/reinforcement (e.g. food or praise) increases the chance of learning a behaviour.
The law of contiguity suggests that we associate things that occur close to each other in time and space (e.g. thunder and lightning).
Schedules of reinforcement can be used in the learning process.
Continuous reinforcement gives a reward after every response the animal makes. For example, a rat will receive a pellet of food after every lever press.
Partial reinforcement gives a reward after only some responses. Skinner identified four schedules of partial reinforcement.
A fixed ratio schedule is when a reward is given after a set number of responses. For example, a food pellet after every 8 presses on the lever.
A variable ratio schedule is when a reward is given after a number of responses that varies around an average. For example, food on average after every 8 presses, but sometimes after the 6th press and sometimes after the 10th press.
A fixed interval schedule is when a reward is given for the first response after a set interval of time. For example, food for the first lever press after each 5-minute interval.
A variable interval schedule is when a reward is given for the first response after an interval of time that varies around an average. For example, food becomes available about every two minutes (sometimes after 1.5 minutes, sometimes after 2.5 minutes).
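The four partial schedules above can be sketched as small Python functions. This is an illustrative model only, not Skinner's procedure: the function names, the `spread` parameter and the use of seconds for time are all assumptions made for the example. Each function returns a `respond(t)` function that is called once per lever press at time `t` and returns True if that press earns a reward.

```python
import random

def fixed_ratio(n):
    """Reward every n-th response (hypothetical helper)."""
    count = 0
    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean, spread, rng):
    """Reward after a number of responses that varies around `mean`.
    `spread` (an assumption) sets how far the number can vary."""
    count = 0
    target = rng.randint(mean - spread, mean + spread)
    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(mean - spread, mean + spread)
            return True
        return False
    return respond

def fixed_interval(seconds):
    """Reward the first response once `seconds` have passed since the last reward."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False
    return respond

def variable_interval(mean, spread, rng):
    """Reward the first response after a waiting time that varies around `mean`."""
    last = 0.0
    wait = rng.uniform(mean - spread, mean + spread)
    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(mean - spread, mean + spread)
            return True
        return False
    return respond

# Fixed ratio: a pellet after every 8 presses, over 24 presses.
fr8 = fixed_ratio(8)
print(sum(fr8(t) for t in range(1, 25)))      # 3

# Fixed interval: 5 minutes (300 s), with one press every 60 s for 15 minutes.
fi5 = fixed_interval(300)
print(sum(fi5(t) for t in range(60, 901, 60)))  # 3

# Variable ratio: around 8 presses, sometimes as few as 6 or as many as 10.
vr8 = variable_ratio(8, 2, random.Random(0))
print(sum(vr8(t) for t in range(1, 101)))
```

In this sketch, continuous reinforcement is simply `fixed_ratio(1)`, since every press is rewarded.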
Thorndike (1898) thought that learning happens by trial and error.