Operant conditioning is a learning process based on reinforcement and punishment. As a reminder, in terms of operant conditioning, reinforcement always means that the behavior is strengthened (more likely to occur again), and punishment always means that the behavior is weakened (less likely to occur again). As discussed in the previous article about operant conditioning, people tend to learn more effectively through reinforcement than through punishment. How much impact a reinforcement strategy has depends on the schedule of reinforcement used; that is, on the timing and frequency with which the reinforcer is delivered.
Schedules of reinforcement can be divided into two broad categories: continuous reinforcement and partial reinforcement. A continuous reinforcement schedule reinforces the desired behavior each and every time it occurs. It is very advantageous during the initial learning process and tends to shape a behavior quickly and effectively. The problem, as you might imagine, is that it’s extremely time-consuming (and draining on other resources) and difficult to maintain. Once reinforcement starts to be missed, each unreinforced response weakens the behavior, and it may eventually disappear entirely. This process is known as operant extinction, and it is the main reason why continuous schedules of reinforcement need to be switched to partial reinforcement strategies in order to maintain the learned behavior.
Partial, or intermittent, reinforcement involves reinforcing the desired behavior only part of the time. Behavior maintained this way is much more resistant to extinction, but it takes longer to establish if partial reinforcement is used from the start (compared to a continuous schedule). There are four types of partial reinforcement schedules, distinguished by whether reinforcement depends on elapsed time (interval) or on the number of responses (ratio), and by whether that requirement is fixed or variable. They are fixed interval, variable interval, fixed ratio, and variable ratio.
Fixed interval schedules involve reinforcing a behavior after a specific amount of time has elapsed. A person who is paid hourly or receives a monthly stipend, regardless of how hard they actually worked, is being reinforced on a fixed interval schedule. It is predictable and steady; they know that at the end of every hour or month they will have earned a certain amount of money. Fixed interval schedules are fairly easy to maintain, but they have relatively low operant strength compared to the alternatives, which means that the person is more likely to quit or reduce responding. For example, if someone stopped being paid for their work, they would likely stop working very quickly due to the lack of reward.
Variable interval schedules are similar to fixed interval schedules, except that the reinforcer arrives at unpredictable intervals rather than regular ones. Reinforcement is still contingent on the passage of time, but each interval may vary from a few minutes to several days or months. Because the person cannot predict the timing of the reinforcer, they are likely to behave in a relatively steady manner, hoping the reinforcer will be coming soon. Fishing is a great example of a variable interval schedule. You may catch your first fish moments after casting the line, but it could be hours until you catch your second. If you’re set on catching fish that day, you’ll continue to wait with your line in the water until you are sufficiently reinforced (that is, until you catch your desired number of fish).
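The difference between the two interval schedules can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names, the one-response-per-minute pattern, and the specific waits (a fixed 10 minutes, or a random 2 to 18 minutes) are hypothetical and not taken from any study. It also assumes the clock restarts each time a reinforcer is delivered.

```python
import itertools
import random


def interval_schedule(response_times, waits):
    """Return the response times that earn a reinforcer.

    response_times: increasing times at which the behavior occurs.
    waits: iterator giving how long must pass before the next
           reinforcer becomes available (hypothetical model).
    """
    reinforced = []
    available_at = next(waits)
    for t in response_times:
        # Only the first response after the wait has elapsed is reinforced.
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next(waits)  # clock restarts after delivery
    return reinforced


def variable_waits(rng, low, high):
    """Yield unpredictable waits drawn uniformly between low and high."""
    while True:
        yield rng.uniform(low, high)


responses = range(0, 60)  # assumed: one response per minute for an hour

# Fixed interval: a reinforcer becomes available every 10 minutes.
fixed_interval = interval_schedule(responses, itertools.repeat(10))

# Variable interval: waits average about 10 minutes but cannot be predicted.
variable_interval = interval_schedule(
    responses, variable_waits(random.Random(0), 2, 18)
)

print(fixed_interval)     # [10, 20, 30, 40, 50]
print(variable_interval)  # irregular spacing between reinforced responses
```

In both cases, responding more often does not earn more reinforcers; only the passage of time makes the next one available. The variable version simply removes the ability to predict when that will be.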
Fixed ratio schedules reinforce a behavior after a set number of responses. Rather than being contingent on time, ratio schedules are based on the actual activity of the individual. While this schedule tends to lead to a high rate of response, it can lead to burnout and/or lower quality work. For example, let’s say a parent offers to pay their child $5 each time they empty the dishwasher. It’s likely that the child will be motivated to complete this chore, but in an attempt to gain their reward as quickly and easily as possible, they’re also likely to rush through it and perhaps break a dish in the process. Similarly, a child who is rewarded for every 10 books read is likely to breeze through each book at the risk of not fully comprehending the story or gaining the benefits of mindful reading.
Variable ratio schedules are also based on actual input from the individual, but rather than being fixed, the required number of responses varies randomly. The response rate is very high and steady because the individual cannot tell how many responses are needed before reinforcement will occur. Consider how you feel while playing a slot machine or checking your Facebook account. Every attempt comes with an exciting rush associated with the possibility of reinforcement. No matter how many times you receive a disappointing lack of reinforcement, deep down you know it’s coming eventually, so you continue to play or check for notifications. Not surprisingly, therefore, this schedule is the one most associated with behavioral addictions and the one most resistant to operant extinction.
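The two ratio schedules can be sketched in the same way. Again, this is a rough illustration, not a model from the literature: the function names are hypothetical, the fixed requirement of 10 responses echoes the "every 10 books" example, and the variable requirement (a random 1 to 19 responses, averaging about 10) is an assumption chosen for comparison.

```python
import itertools
import random


def ratio_schedule(n_responses, requirements):
    """Return the 1-based response numbers that earn a reinforcer.

    requirements: iterator giving how many responses are needed
                  for each successive reinforcer (hypothetical model).
    """
    reinforced = []
    needed = next(requirements)
    count = 0
    for response_number in range(1, n_responses + 1):
        count += 1
        if count >= needed:
            reinforced.append(response_number)
            count = 0  # start counting toward the next reinforcer
            needed = next(requirements)
    return reinforced


def variable_requirements(rng, low, high):
    """Yield an unpredictable response requirement between low and high."""
    while True:
        yield rng.randint(low, high)


# Fixed ratio: every 10th response is reinforced.
fixed_ratio = ratio_schedule(30, itertools.repeat(10))

# Variable ratio: the requirement changes after every reinforcer.
variable_ratio = ratio_schedule(
    200, variable_requirements(random.Random(0), 1, 19)
)

print(fixed_ratio)     # [10, 20, 30]
print(variable_ratio)  # reinforced responses at unpredictable counts
```

Unlike the interval schedules, here each additional response brings the next reinforcer closer, which fits the high response rates described above. Under the variable version, any single response might be the one that pays off.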