Misconception 3: Reconsolidation Is an Enhancement of Extinction

This post was originally published in Neuropsychotherapist magazine January 2015

and written by
Bruce Ecker
Coherence Psychology Institute

Misconception 3: Erasure Is Brought About During the Reconsolidation Window by a Process of Extinction: Reconsolidation Is an Enhancement of Extinction

Reconsolidation and extinction are different phenomena, with distinctly different effects, but misconceptions have developed for reasons described in this section. In the process that has been known for a century as extinction, the target learning is not revised or erased, but only suppressed temporarily by new counterlearning, and the new learning is encoded in its own memory circuitry that is anatomically separate from, and in competition with, the circuits of the target learning. Later, however, the target learning wins that competition and reemerges into full expression (Bouton, 2004; Foa & McNally, 1996; Milner et al., 1998). In contrast, during the reconsolidation process, a target learning is destabilized and rendered susceptible to being revised fundamentally by new learning, which can either weaken it, strengthen it, alter its details, or fully nullify and erase it, and these changes are lasting, as described earlier. Researchers have determined that reconsolidation and extinction are distinct and even possibly mutually exclusive processes at the behavioral, neural, and molecular levels (Duvarci & Nader, 2004; Duvarci, Mamou, & Nader, 2006; Merlo, Milton, Goozée, Theobald, & Everitt, 2014).

“Reconsolidation cannot be reduced down to facilitated extinction” was the conclusion of Duvarci and Nader (2004, p. 9269). Despite those signature differences in process and effects, confusion about the relationship between reconsolidation and extinction nevertheless arises because, to a degree, they share certain operational and procedural patterns:

First, while the nullification learning that contradicts and erases a destabilized target learning can have any convenient procedural design, in many studies it has had the same design as a conventional extinction training—a series of numerous identical counterlearning/unreinforced trials—so it can be confused with and mislabeled as an extinction training, even though extinction is not actually involved.

Second, extinction, like reconsolidation, begins with the two-step sequence of reactivation and nonreinforcement (that is, a recueing of the target learning followed by nonoccurrence of what the target learning expects to happen, such as playing an audio tone without also delivering the mild electric shock that had previously been paired with the tone). It can be confusing and difficult to see how reconsolidation and extinction are two separate phenomena if they share the same initiating sequence of reactivation and nonreinforcement.

The main aim of this section is to dispel those two confusions, as well as to review various research findings that clarify the nature and relationship of reconsolidation and extinction and their differential triggering.

In addition, the discussion will explore these questions: Are the empirical findings on reconsolidation and extinction understandable entirely, or only partially, in terms of the mismatch requirement and mismatch relativity (MRMR)? Does it prove instructive to consider how the findings would have to be understood in order for them to be entirely consistent with MRMR? The heuristic exploration of those questions in this section extends significantly the degree to which the mismatch/prediction error requirement has been applied, to date, to the interpretation of experimental findings.

Extinction-like procedure used for nullification learning

As noted earlier, for inducing erasure, some reconsolidation researchers have used a format of nullification learning during the reconsolidation window that has the same procedural structure as classical extinction training: a series of numerous identical counterlearning (nonreinforcement) experiences.

The result of this procedure is not extinction (temporary suppression of the target learning), but rather the permanent erasure of the target learning, such that even strong recueing (reinstatement) cannot reevoke the target learning into expression. Nevertheless, these researchers have unfortunately labeled this procedure as “extinction” by naming it with such phrases as the “memory retrieval-extinction procedure,” “extinction-induced erasure,” “extinction during reconsolidation,” or other phrases containing “extinction” (e.g., Baker, McNally, & Richardson, 2013; Clem & Huganir, 2010; Liu et al., 2014; Monfils et al., 2009; Quirk et al., 2010; Schiller et al., 2010; Steinfurth et al., 2014; Xue et al., 2012).

Such labeling is a source of much misunderstanding of reconsolidation and extinction.

We are faced with these empirical facts: When a repetitive counterlearning procedure is applied to a target learning that is in a stable state when the procedure begins, the result is extinction—the target learning is suppressed but is intact and later reemerges into expression. However, when the same repetitive counterlearning procedure is applied to a target learning that is already in a destabilized/deconsolidated state, the result is erasure—the target learning’s encoding is rewritten according to this new counterlearning, permanently nullifying the content of the target learning (Monfils et al., 2009; Schiller et al., 2010). Thus a particular learning procedure (repetitive counterlearning) can have extremely different neurological and behavioral effects depending on whether or not it is carried out during the reconsolidation window. So, any label for the erasure procedure that includes the term “extinction” is a misnomer that invites the misconception that reconsolidation utilizes and enhances the process of extinction. The use of repetitive counterlearning during the reconsolidation window could more appropriately be labeled “nullification learning,” “update learning,” or “erasure learning,” rather than “extinction training,” to avoid conceptual errors and confusion. However, the “extinction” labeling has already become standard among researchers who use this particular procedure and is probably here to stay.
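
The two empirical cases just described can be condensed into a small lookup. What follows is only an illustrative sketch in Python, not a model taken from the cited studies; the function name and the state labels "stable" and "destabilized" are hypothetical and simply restate the contrast drawn above.

```python
def outcome_of_repetitive_counterlearning(target_state: str) -> str:
    """Return the reported outcome of repetitive counterlearning.

    target_state is the state of the target learning when the procedure
    begins: "stable" or "destabilized" (hypothetical labels).
    """
    outcomes = {
        # Stable target learning: suppressed but intact, later reemerges.
        "stable": "extinction",
        # Destabilized target learning (inside the reconsolidation window):
        # its encoding is rewritten by the counterlearning.
        "destabilized": "erasure",
    }
    if target_state not in outcomes:
        raise ValueError(f"unknown target state: {target_state!r}")
    return outcomes[target_state]


print(outcome_of_repetitive_counterlearning("stable"))
print(outcome_of_repetitive_counterlearning("destabilized"))
```

The point of the sketch is that the procedure itself is held constant; only the state of the target learning at the start of the procedure selects the outcome.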

The great significance and usefulness of the reconsolidation window lies in the fact that, during that window, to unlearn is to erase, regardless of the specific form of the unlearning or nullification experience. The repetitive counterlearning procedure is a convenient protocol under the highly simplified conditions of laboratory studies but is not suitable in general for nullification of the far more complex emotional learnings encountered in real-life psychotherapy.

There is a potentially unlimited number of formats in which nullification learning can occur in psychotherapy (for many examples, see Ecker et al., 2012).

The triggering of reconsolidation versus extinction

As already described, both reconsolidation and extinction begin with the two-step sequence of reactivation and nonreinforcement, that is, a recueing of the target learning followed by nonoccurrence of what the target learning expects to happen. What, then, determines whether reconsolidation or extinction is the result? We know that the experience of mismatch (prediction error) is what triggers destabilization and reconsolidation, as discussed earlier.

This seems to imply that when memory reactivation plus nonreinforcement create a mismatch experience, reconsolidation is triggered, whereas when reactivation plus nonreinforcement occur without creating a mismatch experience, the extinction process begins. Therefore, in order to understand what causes the triggering of reconsolidation versus extinction, it may be necessary to understand why reactivation with nonreinforcement creates mismatch in some circumstances but not in others. With that question in mind, it is instructive to examine a range of instances where reconsolidation or extinction was induced. Many observations of the triggering of reconsolidation versus extinction have been made in animal studies through reactivating a CS-US target learning by presenting the unreinforced CS only, and then promptly applying a chemical agent that disrupts nonconsolidated or deconsolidated learnings but has no effect on stable, consolidated memory circuits. The findings of many such studies can be summarized in terms of how the effect of reactivating a target learning with CS-only presentations depends on their time structure. In the studies summarized here, the target learning was formed by two or more CS-US pairings with 100% reinforcement.

Reconsolidation is triggered. After a single brief CS presentation, there is no extinction learning and the target learning is still at full strength (Nader et al., 2000; Eisenberg, Kobilo, Berman, & Dudai, 2003; Jarome et al., 2012; Merlo et al., 2014; Pedreira et al., 2004). In this case, prompt application of a chemical agent that blocks consolidation (and reconsolidation) disrupts the target learning, which is found to be significantly weakened or completely erased 24 hr later, indicating that the target learning was destabilized (deconsolidated) by the CS presentation, triggering the reconsolidation process. This implies that unreinforced reactivation by a single CS presentation does create a mismatch experience.

Extinction is triggered. After a single prolonged CS or a series of many short CS presentations, an influential extinction learning exists and largely suppresses the target learning, so the behavioral expression of the target learning is significantly diminished. In this case, prompt application of a consolidation-blocking chemical agent disrupts the newly formed, not yet consolidated extinction learning, and this restores the target learning to full strength (see, e.g., Eisenberg et al., 2003). The return of the target learning to full strength implies that the target learning was unaffected by the chemical agent and therefore was not in a destabilized state when the chemical agent was administered after the single prolonged CS or the series of many short CSs. This in turn implies that although the first of the many short CS presentations must have created a mismatch experience (as in the single brief CS situation), mismatch must have been terminated promptly by the ensuing CSs, restabilizing the target learning, despite the fact that each CS in itself would seem to be a nonreinforced reactivation that should maintain mismatch. These logical inferences have been corroborated by several studies, described below.

Neither reconsolidation nor extinction is triggered. For an intermediate number of unreinforced CS presentations, the target learning remains at full strength and a variety of chemical interventions that either disrupt or enhance reconsolidation or extinction have no effect on subsequent target learning expression. This is understood to mean that neither reconsolidation nor extinction is underway (Flavell & Lee, 2013; Merlo et al., 2014; Sevenster, Beckers, & Kindt, 2014). Merlo et al. (2014) commented, “In the continuum of possible retrieval conditions, reconsolidation and extinction processes are mutually exclusive, separated by an insensitive phase where the amount of CS exposure terminates the labilization of the original memory, but is insufficient to trigger the formation of the extinction memory” (p. 2429).

Of the many studies that have reported the kinds of findings summarized above, very few also addressed the question of why one, or a few, or many nonreinforced CS reactivations have the observed effects of triggering or not triggering the destabilization and reconsolidation of a target learning. The discussion now turns to an examination of why reactivation with nonreinforcement creates mismatch and triggers destabilization in some circumstances and not in others.

MRMR model of triggering reconsolidation or extinction

The critical role of mismatch in triggering reconsolidation was first reported by Pedreira et al. (2004), as noted earlier. A mismatch exists when there is a significant discrepancy between what is expected and what is actually experienced. Thus reconsolidation, with all of its complex cellular and molecular machinery, is an experience-driven phenomenon.
A growing number of experimental observations require a view of mismatch as being a fluid, dynamical quality of experience that can vary on a moment-to-moment basis with the passage of time and with new experiences (see, e.g., Jarome et al., 2012; Merlo et al., 2014; Sevenster et al., 2014). The following paragraphs apply that dynamical view of mismatch and offer the proposal that the reconsolidation/extinction dichotomy may be largely or completely governed by the mismatch requirement and mismatch relativity (MRMR), as defined earlier. To explore this proposal and show that MRMR potentially could be responsible for a wide range of reconsolidation and extinction phenomenology, what follows is a discussion of how several significant research findings can be understood as being entirely MRMR effects. The discussion shows specifically how reactivation with nonreinforcement creates mismatch, triggering destabilization, in some circumstances and not in others.

Single CS-only presentation

First consider the simplified case in which the sequence of reactivation and mismatch (nonreinforcement) occurs only once and is not repeated. For example, if a conditioned stimulus (CS, such as a blue light, an audio tone, or a particular physical environment) has previously been paired repeatedly with a mild electrical shock (unconditioned stimulus, US) just before the CS turns off, what happens subsequently if the CS turns on and then turns off unreinforced (no shock), just once? The CS turning on immediately reactivates the target learning, generating the expectation of receiving a shock. Several studies have shown that the time period from CS onset to CS offset (with no US) controls whether reconsolidation or extinction occurs, and that whichever process occurs is triggered by CS offset and does not begin before CS offset (Kirtley & Thomas, 2010; Lee, Milton, & Everitt, 2006; Mamiya et al., 2009; Pedreira & Maldonado, 2003; Pedreira et al., 2004; Pérez-Cuesta & Maldonado, 2009; Suzuki et al., 2004).

In these studies, the CS onset-to-offset time that originally created the target learning (with CS-US pairing) was short, typically in the 1- to 5-min range. Subsequently, CS onset and offset with no shock induced reconsolidation if the CS onset-to-offset time was less than about one hour (destabilizing the target learning, making it revisable by new learning during the next five hours), but it induced extinction if the CS onset-to-offset time was more than about an hour (that is, the target learning remained stable and a separate counterlearning formed in competition with the target learning). To my knowledge, researchers have not proposed or identified a mechanism that explains the observations that short versus long periods of CS onset-to-offset induce reconsolidation or extinction, respectively. If MRMR (mismatch requirement and mismatch relativity) are the cause, they would operate as follows. Consider first the case where, after the target learning was formed by a series of CS-US pairings (100% reinforcement), there is a single unreinforced CS reexposure with short duration of onset-to-offset (about equal to the CS onset-to-offset time in the original CS-US training), triggering reconsolidation. As noted earlier in describing the study by Nader et al. (2000), the absence of the expected US creates a decisive US mismatch that destabilizes the target learning. Next, consider the case where a single short CS-only presentation occurs after the target learning has been formed by a partial reinforcement schedule. Partial reinforcement results in the subject expecting not that the US will always occur following the CS, but only that it might occur. In this case MRMR predicts that a single short CS-only presentation would not constitute a decisive mismatch and would therefore not induce destabilization. 
The target learning would remain stable, as was found to be the case by Sevenster et al. (2014), who used a 50% reinforcement schedule to create a fear learning and showed that a single, short CS-only presentation did not induce destabilization. (Learnings created by partial reinforcement require significantly more extinction trials to suppress, as compared with learnings created by continuous reinforcement, because the initial trials are not experienced as a decisive mismatch or prediction error. This is the “partial reinforcement extinction effect”; see, e.g., Pittenger & Pavlik, 1988.)

A special case of learnings formed by partial reinforcement is single-trial learning, which again results in the expectation that the US might occur following the CS, not that it will always occur. Here too, a subsequent single CS-only reexposure does not create a decisive US mismatch. This was the case in a study of conditioned taste aversion in rats reported by Eisenberg, Kobilo, Berman, and Dudai (2003). A single training trial produced lasting avoidance behavior, but a single CS-only reexposure did not destabilize the target learning (as evidenced by no disruption of the target learning from anisomycin administered immediately after the CS reexposure). When Eisenberg et al. used a series of two CS-US pairings to create the target learning instead of one pairing, creating strong US-expectancy due to the 100% reinforcement, the same single CS-only presentation now did trigger destabilization, allowing disruption by anisomycin, implying that now a mismatch was created. The foregoing examples and those below illustrate that the principle of mismatch relativity emphasizes a detailed consideration of all features of the target learning, in order to predict accurately whether or not a given reactivation procedure creates a decisive mismatch/prediction error experience in relation to the target learning in question. Mismatch relativity also alerts us to understand that any given successful experimental destabilization procedure reveals not the inherent, fundamental properties of the brain’s reconsolidation process, but only a way of creating mismatch relative to the particular features of the target learning created by the researchers. Next we have to consider why, according to MRMR, a single long-duration, unreinforced CS causes extinction rather than reconsolidation. For example, Pedreira et al. (2004) found that a 2-hr CS presentation failed to destabilize the target learning into reconsolidation and instead produced an extinction learning. 
The original target learning began with a 5-min exposure to the CS (the training chamber), and then the US, a simulated predator, was presented every 3 min, 15 times. The fact that a 2-hr unreinforced CS reexposure did not trigger reconsolidation means, according to MRMR, that the 2-hr CS did not function as a mismatch of the target learning’s 5-min exposure before the US began to appear. Why it did not function as a mismatch can be inferred from mismatch relativity: Relative to the original learning experience with its 5-min CS exposure, a 2-hr CS reexposure presumably was an experience that qualitatively differed subjectively from the original learning to such a degree that the 2-hr reexposure experience registered as a contextually unrelated experience, not as a mismatch or even as a reminder of the 5-min experience encoded in the target learning. Thus the experience of mismatch, which would have occurred with CS offset for some time after the 5-min point, no longer occurred with CS offset at the 2-hr point.

Due to the relativity of mismatch, an experience that is too greatly dissimilar to the original learning experience does not function as a reminder or mismatch of it, so the target learning does not destabilize, which causes the new learning driven by the unreinforced CS to form separately as an extinction learning. This example suggests the possibility that the presence or absence of mismatch can change over time during CS presentation, which will figure importantly in the analysis of multitrial extinction below. The general principle of mismatch relativity is that experience B is a mismatch of expected experience A if B resembles A enough to register as a reminder and repetition of A, while also containing saliently discrepant or novel features relative to those of A. Testable predictions arise from the MRMR interpretation above. For example, the original learning could be created by a 2-hr CS with the US occurring in the final minutes, with a repetition of that CS-US beginning 30 min later, and so on three or four times. Mismatch relativity predicts that now a 2-hr CS-only reexposure would serve as a reminder and mismatch and would achieve destabilization; and perhaps now a 5-min CS reexposure without US would fail to do so because the dissimilarity might be too great for the short reexposure to serve as a reminder of the extremely long duration in the target learning. If the mismatch requirement and mismatch relativity govern whether reconsolidation or extinction occurs, then there is no absolute time duration of unreinforced CS reexposure that defines the boundary between the two phenomena. Rather, the time boundary (the largest and smallest unreinforced CS reexposure durations that function as a mismatch and trigger reconsolidation) would depend on the original learning’s CS duration. 
That predicted dependency of the reconsolidation/extinction time boundary on the time structure of the original training serves as another test of the MRMR model and could be directly measured by extending existing studies to vary the reinforced CS duration in the original learning while measuring the maximum and minimum unreinforced CS reexposure durations that trigger reconsolidation.

Multiple CS-only trials

The foregoing paragraphs addressed reconsolidation and extinction having a shared initiating sequence in the case of single-trial, nonreinforced CS reexposure.

Next, consider the case of a series of numerous identical counterlearning experiences of reactivation and nonreinforcement, that is, the classical extinction procedure. It is well known that the multitrial extinction procedure does not destabilize or erase the target learning, yet, as discussed above, a single short CS-only trial does do so (for a target learning created by multiple CS-US presentations). This raises the question: Given that the first CS-only presentation mismatches and destabilizes, how does the state of the target memory evolve with each successive CS-only presentation, such that there is no destabilization and no erasure resulting from the series? It will be assumed in what follows that the target learning was formed originally by a series of CS-US pairings having the same time structure as in the subsequent extinction procedure. This assumption allows for an unambiguous delineation of the logic of MRMR in this instance, but it does not limit the relevance of MRMR to only these assumed conditions as a special case. The question requiring an answer is this: Why does the standard extinction procedure fail to destabilize and then erase the target learning, given that the first CS-without-US in the series mismatches and destabilizes the target learning and the ensuing series of CS-without-US experiences could be expected to then function as a nullification learning that erases the target learning? MRMR implies that because the result of multiple-trial counterlearning is extinction rather than erasure, it must be the case that multiple-trial counterlearning does not sustain a mismatch experience long enough for erasure to occur.

The question therefore becomes: Why does multiple-trial counterlearning not sustain a mismatch that keeps the target learning destabilized and allows erasure to occur, even though every unreinforced trial in the series seems to be a mismatch of the expected CS-US pairing? The answer to that question has emerged from several recent studies (Jarome et al., 2012; Merlo et al., 2014; Sevenster et al., 2014). Jarome et al. (2012) paired sound and footshock to create a learned fear of the sound in rats, and then, 1 day later, applied anisomycin immediately following either a single unreinforced CS or two unreinforced CSs that were separated by 1 hr. (Longer periods were also tested.) On the next day, tests of fear in response to the CS showed that after single-CS reexposure, the fear learning had been largely disrupted and erased by anisomycin, indicating destabilization had occurred, but after the two-CS exposure there was no reduction in fear due to anisomycin. This implies that the second CS rapidly changed the neurological condition of the target learning, either returning the target learning to stability (according to the standard interpretation of anisomycin’s effect) or, alternatively and more conjecturally, launching the updating/erasure process and thereby altering the prevailing molecular mechanisms such that even though destabilization persisted, anisomycin no longer caused disruption (T. J. Jarome, personal communication, 24 November, 2014). Sevenster et al. (2014) also demonstrated rapid changes in target memory condition caused by successive nonoccurrences of the US when it was expected according to the original training. A fear learning was created in human subjects by pairing an image with a wrist shock, and the effects of 0, 1, and 2 nonreinforcements by CS-only presentations were studied. 
Whether the target learning was destabilized was determined by administering propranolol, which disrupts destabilized CS-US fear learnings in humans (Kindt et al., 2009; Soeter & Kindt, 2011). This revealed that a single nonreinforcement functioned as a mismatch and destabilized the target learning, launching reconsolidation, but 0 and 2 nonreinforcements did not. This indicates again, as in Jarome et al. (2012), that a target learning destabilized by an initial unreinforced CS presentation is restabilized by the second unreinforced CS presentation. Here, however, the time interval from first to second CS was 40 s rather than 1 hr. Importantly, in addition to measuring the level of fear in response to each unreinforced CS presentation, during each unreinforced CS presentation Sevenster et al. (2014) also measured subjects’ subjective rating of their US-expectancy, that is, the felt level of anticipation that the shock would occur at the end of the current 7-s CS image presentation. US-expectancy was rated by subjects on a scale from –5 (certainty of not happening) to 0 (uncertain) to +5 (certainty of happening). This revealed that as the first nonreinforcement was about to happen, average US-expectancy was strong at +3.8, which created a mismatch experience when the US did not occur, but as the second nonreinforcement was about to happen, average US-expectancy had decreased sharply to 0.9, close to the “uncertain” level and presumably too low to create a mismatch experience when the US did not occur. The first US nonoccurrence had created new learning that reduced the US-expectancy created by the original training, and it was this reduced US-expectancy that then encountered the second US nonoccurrence. 
The direct implication is that immediately after the first nonoccurrence of the US when the US would be expected on the basis of the original learning, subjects were in the experience of mismatch, so the target learning was found to be destabilized, but immediately after the second nonoccurrence of the US when it would be expected according to the original learning, subjects were not in an experience of mismatch, so the target learning was found to be stable. Thus the presence or absence of a mismatch experience evidently switches destabilization on or off, respectively, in real time. By comparing their measurements of fear and US-expectancy, Sevenster et al. also showed that the sharp drop in self-reported US-expectancy was not accompanied by a decrease in physiologically measured fear. This means that with accumulating unreinforced CS presentations, US-expectancy began to decrease, evidently returning the target learning to stability, before there had been enough counterlearning to initiate the formation of an extinction learning. This is consistent with other studies indicating that reconsolidation and extinction are mutually exclusive phenomena (e.g., Duvarci & Nader, 2004; Duvarci et al., 2006; Merlo et al., 2014).

Thus after two US nonoccurrences, the target learning was stable and neither reconsolidation nor extinction was occurring.
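
The on/off reading of the Sevenster et al. (2014) ratings can be made concrete with a minimal sketch. It assumes, purely for illustration, that a mismatch is experienced whenever the rated US-expectancy just before a US nonoccurrence exceeds a fixed cutoff. The cutoff of 2.0 and the function names are hypothetical and are not taken from the study; only the two average ratings (+3.8 and 0.9) come from the text above.

```python
# Illustrative sketch only: the cutoff is an assumed value, not a finding.
MISMATCH_THRESHOLD = 2.0  # hypothetical cutoff on the -5..+5 US-expectancy scale


def is_mismatch(us_expectancy: float, us_occurred: bool,
                threshold: float = MISMATCH_THRESHOLD) -> bool:
    """Return True if a US nonoccurrence violates a strong expectation."""
    return (not us_occurred) and us_expectancy > threshold


def destabilized_after_each_trial(expectancies):
    """Assumed stable/destabilized state after each unreinforced CS."""
    return [is_mismatch(e, us_occurred=False) for e in expectancies]


# Average ratings reported just before the first and second nonreinforcement.
reported = [3.8, 0.9]
states = destabilized_after_each_trial(reported)
print(states)  # [True, False]
```

Under this assumed cutoff, the first nonreinforcement counts as a mismatch (destabilized) and the second does not (stable), which is the pattern the propranolol results indicated. The sketch models only US omission and says nothing about where the real boundary between mismatch and no mismatch lies.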

Observations by Merlo et al. (2014) provide further corroboration that accumulating unreinforced CSs switch off reconsolidation before extinction is in effect. After 1, 4, 7, and 10 presentations of an unreinforced CS, Merlo et al. tested a conditioned fear learning in rats for susceptibility to alteration by various chemical agents applied locally in the basolateral amygdala (BLA). After the fourth CS presentation, the target learning was no longer chemically alterable, meaning that it was no longer in a destabilized state in the BLA. Furthermore, there were no behavioral or molecular markers of extinction, so neither reconsolidation nor extinction was occurring. Merlo et al. infer from these findings that the target learning’s state (stable or unstable) may be reset on a moment-to-moment basis as CS-only presentations accumulate.

In light of the studies just reviewed, there is now growing evidence indicating why the multiple-trial counterlearning of conventional extinction training does not sustain mismatch or destabilization and does not erase the target learning: A target learning’s state of destabilization and erasability evidently is maintained by the ongoing presence of the experience of mismatch or prediction error and can quickly terminate if the experience of mismatch or prediction error terminates. Thus the mismatch requirement first identified by Pedreira et al. (2004) functions as a dynamic on/off switch. The destabilized state can be toggled on/off or off/on as mismatch is subjectively present/absent or absent/present, respectively. (Destabilization lasts for a time window of about five hours, as described earlier, if, once destabilized, the target learning is not further recued by additional experiences.) In this picture of dynamic mismatch bipolarity, the principle of mismatch relativity governs how each successive unreinforced CS affects the target learning.
In other words, the target learning consists of expectations that can be revised by an individual CS in the series if that CS deviates from the expectations extant just prior to that CS. The evolving expectational content of the target learning must be considered in detail in order to understand the effect of each successive CS. In short, the studies by Jarome et al. (2012), Merlo et al. (2014) and Sevenster et al. (2014) indicate that MRMR principles determine the effects of the multiple-trial extinction procedure, as follows. With a target learning created by CS-US pairings with continuous (100%) reinforcement, the subject has the expectation that the US always accompanies the CS. The first CS-without-US presentation is therefore a decisive mismatch (that is, the nonoccurrence of the US creates strong surprise and a felt inability to anticipate accurately) because the learned expectation that the US always accompanies the CS has now encountered the mismatching current perception that the US does not always accompany the CS. This has two effects. First, this strong mismatch abruptly destabilizes the target learning.

Second, the nonoccurrence of the expected US creates new learning that the US does not always accompany the CS. This new learning persists and results in a sharply reduced US-expectancy during the second unreinforced CS presentation. The second US nonoccurrence is therefore not experienced as a mismatch, because now there is no surprise or prediction error felt. Rather, there is now an experience that this US nonoccurrence is in accord with the expectation that the US might or might not happen. This termination of mismatch terminates destabilization, because destabilization is dynamically maintained in real time by the persisting experience or context of mismatch. The target learning shifts into a stable state. (Whether the new learning created by the first US nonoccurrence immediately updates the target learning’s model of the CS-US association is not yet known, though molecular findings by Monfils et al., 2009, and Jarome et al., 2012, seem to imply that the destabilization event does not also launch updating. Possibly, updating is launched only if mismatch saliently persists after destabilization occurs.) In that way, the multitrial extinction procedure destabilizes and then quickly restabilizes the target learning before erasure can occur.

With the third unreinforced CS, presumably there would no longer be any surprise or mismatch whatsoever. With the target memory in a stable state as CS repetitions continue, the target learning remains intact and the new learning created by the ongoing series of harmless CS presentations forms separately. That is the MRMR account of standard multitrial extinction. Standard multitrial extinction training was converted into an effective erasure procedure in studies by Monfils et al. (2009) and Schiller et al. (2010), as described in a previous section, simply by increasing the time interval between the first and second CS-only presentations.

Why that seemingly minor alteration of extra time in the first interval could make such a qualitative and drastic difference in outcome becomes apparent by applying the MRMR model and examining the timing difference in terms of its mismatch effects. That exercise is carried out here next for the Monfils et al. study, as this section’s final and most intricate example of applying the MRMR model. The Schiller et al. study, which had human subjects, is described in the Appendix of this article. In the procedure that Monfils et al. (2009) used with rats, the original fear acquisition consisted of three CS-US (tone-shock) pairings every 3 min, with CS duration of 20 s, ending with a half-second shock. On the next day, the interval between the 19 CS-only presentations was also 3 min, except for a longer initial interval between CS1 and CS2 of 10 min or 1 hr, both of which resulted in long-term erasure of the learned fear, which could not be reevoked later by either the CS or the US. The control group did not have the longer initial interval, making the procedure a conventional extinction training, and for these rats the learned fear was later reevoked. The functioning of the erasure procedure is understood as follows according to the MRMR model. It can be reliably assumed, based on many other studies as described earlier, that CS1 created a US mismatch that quickly destabilized the target learning.

Therefore, after CS1 the target learning was open to being updated by any variations in the procedure relative to the original training. An immediate and salient variation was the appearance of CS2 defining a 10-min or 1-hr interval since CS1, far longer than the 3-min interval expected based on the original acquisition training. The already destabilized target learning was updated according to that longer interval, so the timing expectation going forward was now that after each CS there would be either 3 min or the longer time (10 min or 1 hr). The longer interval defined by CS2 also was a mismatch of timing expectations, and that second mismatch experience, coming while the target learning was already destabilized, would only have made the destabilized state more robust. However, as discussed earlier, CS2 would not create a US mismatch as CS1 had done. Thus CS2 ended the US mismatch while creating a timing mismatch. Did the target learning restabilize due to the termination of US mismatch, or did it remain destabilized due to the timing mismatch? One indication comes from Jarome et al. (2012), who largely replicated this situation with two CSs 1 hr apart, as described earlier. Anisomycin applied immediately after the second CS did not reduce fear in response to another CS 1 day later.

That is usually understood as meaning that the target learning was stable, because anisomycin disrupts a destabilized memory. However, while anisomycin disrupts a memory that is newly destabilized but not undergoing updating, its effect on a memory during the updating process is not known. On the cellular and molecular level, the process that destabilizes the target learning and the process that updates/erases it appear to be two distinct though coupled processes (Jarome et al., 2012; Lee et al., 2008). Updating occurs through a molecular mechanism that potentially alters the molecular processes involved in the memory’s dynamical progression. Anisomycin is a protein synthesis blocker.

If the updating/erasure mode eliminates the protein synthesis that a nonupdating memory requires for restabilization, then anisomycin would not have a disruptive effect on a destabilized memory that is undergoing updating, as Timothy J. Jarome (personal communication, 24 November, 2014) has pointed out. Only further research can settle the question of whether the target learning in Monfils et al. (2009) was stable or unstable after CS2, so here the MRMR account must branch to follow both possibilities. If CS2 caused restabilization due to elimination of US mismatch, the fact that erasure then resulted from CS3 to CS19 implies that CS3 must have destabilized the target learning yet again. That in turn implies that a new mismatch experience was created by CS3, which in turn directs us to identify the procedural elements that created that mismatch. CS3 occurred 3 min after CS2, which created another timing mismatch because an interval of 10 min or 1 hr was expected after the updating driven by the longer interval from CS1 to CS2. This timing mismatch created by CS3 onset would have redestabilized the target learning. (This is a prediction of MRMR that could be tested by extending the Jarome et al. study to include a CS3 that occurs 3 min after CS2, and conducting molecular tests for destabilization promptly after CS3.) Having been destabilized by CS3, the target learning would then be updated by the 3-min interval from CS3 to CS4, as well as by CS4 itself as an experience that the CS is harmless. The condition required for erasure to be occurring is having the target learning in a destabilized state concurrent with a fresh or freshly remembered experience that contradicts and disconfirms the target learning’s expectations or model of how the world functions. Erasure of the CS-US association may have been underway following CS3, and more so when the destabilized target learning encountered CS4. 
The next 3-min interval from CS4 to CS5 would have been as expected, ending the experience of mismatch, which may have terminated the destabilized state and, with it, the erasure process also. That would imply that erasure was fully accomplished by CS1 through CS4, and that CS5 through CS19 were not needed, which could be tested by repeating the Monfils et al. (2009) experiment without CS5 through CS19 and seeing whether or not the results are unchanged. If, on the other hand, CS2 maintained prior destabilization by creating a timing mismatch, it is probable that CS2 began the erasure process. Then, after CS2, the effects of the procedure’s time intervals and CSs would be the same as described in the previous paragraph (with the exception that CS3 would now maintain rather than reinitiate destabilization). Thus the question of whether or not CS2 restabilizes the target learning does not influence the outcome, according to the MRMR model.
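
The timing structure analysed above for Monfils et al. (2009) can be laid out explicitly. The sketch below is only a bookkeeping aid under stated assumptions: it builds the list of intervals between the 19 CS-only presentations and flags which intervals differ from the 3-min acquisition interval. The function and constant names are hypothetical, and the comparison is made against the fixed acquisition interval only; it does not attempt to model the evolving timing expectation discussed in the text.

```python
ACQUISITION_INTERVAL_MIN = 3  # interval between CS-US pairings during acquisition
N_CS = 19                     # CS-only presentations on the following day


def cs_intervals(first_interval_min, n_cs=N_CS):
    """Intervals (in minutes) between successive CS-only presentations."""
    # n_cs presentations define n_cs - 1 intervals; only the first may differ.
    return [first_interval_min] + [ACQUISITION_INTERVAL_MIN] * (n_cs - 2)


def deviating_intervals(intervals):
    """1-based positions of intervals that differ from the acquisition interval."""
    return [i for i, gap in enumerate(intervals, start=1)
            if gap != ACQUISITION_INTERVAL_MIN]


erasure_10min = cs_intervals(10)                   # CS1-to-CS2 gap of 10 min
erasure_1hr = cs_intervals(60)                     # CS1-to-CS2 gap of 1 hr
control = cs_intervals(ACQUISITION_INTERVAL_MIN)   # conventional extinction

print(len(erasure_10min))                  # 18
print(deviating_intervals(erasure_10min))  # [1]
print(deviating_intervals(control))        # []
```

Seen this way, the erasure and control schedules differ in a single interval, the one between CS1 and CS2, which is why the account above concentrates on the mismatch effects of CS2 and of the CSs that immediately follow it.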

There is an additional possibility for how the updating process could affect the unfolding dynamics of the target learning. Engagement of the updating/erasure process possibly could maintain the destabilized state even without an ongoing experience of mismatch. In order for the adaptive process of updating to proceed, destabilization must be in effect during the new learning that is driving the updating (otherwise what occurs is not updating but a separate encoding of new learning, as in extinction). Therefore, because the adaptive success of updating depends on destabilization, it is likely that whenever new learning during destabilization is driving updating, the destabilized state is maintained directly by molecular signals from the updating/erasure process and is no longer dependent on an ongoing experience of mismatch, so that updating will not be prematurely terminated by an absence of mismatch causing a return to stability. This could be termed maintenance of destabilization by updating, or MDU. Presumably, at the point where no further encoding, reencoding, or de-encoding is occurring for updating, the molecular signals driving MDU cease, and the updated target learning then returns to stability promptly. It is well established that a target learning returns to stability after about five hours if there has been destabilization but no updating (such as by a single short CS-only presentation; Duvarci & Nader, 2004; Pedreira et al., 2002; Pedreira & Maldonado, 2003; Walker et al., 2003), but if updating has also occurred, it is possible that restabilization occurs through a different molecular process with a different temporal characteristic.
If MDU is included in the MRMR framework, the picture becomes one of memory mismatch initiating and maintaining destabilization until memory updating is occurring, from which point destabilization is maintained directly by the updating process and continues until updating terminates either due to saturation of encoding or cessation of new learning input. The MRMR account of the erasure procedure used by Schiller et al. (2010; see Appendix) more strongly requires and implies MDU.
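
The logic of MRMR with and without MDU can be written as a small state-update rule. This is a hedged sketch, not a model drawn from any of the cited studies: the `Step` fields, the `run` function, and the discrete-step framing are all assumptions introduced for illustration, and the roughly five-hour passive restabilization window is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Step:
    mismatch: bool           # is a mismatch (prediction error) experienced?
    new_learning: bool       # is contradicting new learning arriving?
    saturated: bool = False  # has encoding of the update saturated?


def run(steps, mdu=True):
    """Return the assumed destabilized (True) or stable (False) state per step."""
    destabilized = False
    history = []
    for s in steps:
        # Updating requires an already destabilized memory plus usable input.
        updating = destabilized and s.new_learning and not s.saturated
        # Mismatch opens or holds the window; with MDU, updating also holds it.
        destabilized = s.mismatch or (mdu and updating)
        history.append(destabilized)
    return history


# Hypothetical sequence: one mismatch, then new learning without further
# mismatch, then no further new learning input.
steps = [Step(True, False), Step(False, True), Step(False, True), Step(False, False)]

print(run(steps))             # [True, True, True, False]
print(run(steps, mdu=False))  # [True, False, False, False]
```

With MDU switched on, the destabilized state outlasts the mismatch for as long as updating continues and ends when new learning input stops; with MDU switched off, the state follows mismatch alone and closes as soon as mismatch ends, as in the account of multitrial extinction given earlier.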

Obviously, further studies are needed to test these possibilities and clarify how the stability status of the target learning evolves with each successive CS presentation in various procedural configurations. The above analysis of results of Monfils et al. (2009) illustrates how assuming the results of experimental procedures to be governed by MRMR principles can illuminate previously unrecognized dynamics and resolve dilemmas of interpretation and apparent inconsistencies between studies. The foregoing MRMR accounts are offered heuristically, to indicate the kinds of phenomenology that are brought into consideration by the MRMR framework. The MRMR model has the systemic implication that the neural and molecular processes of reconsolidation or extinction are under the direct control of brain regions and circuits that assess, detect, and signal mismatch (prediction error) occurring between learned expectations and currently experienced temporal, spatial, and/or somatosensory perceptions (as well as, in the human clinical context, attributed meanings). A direct indication of that supervening role of mismatch detection can be seen in the findings of Reichelt, Exton-McGuinness, and Lee (2013) and Sevenster et al. (2014).

The latter showed, as described above, that the switching off of reconsolidation during a series of unreinforced CSs (reminders) can be directly attributed to a sharp decline in US-expectancy and corresponding termination of the experience of mismatch. Reichelt et al. demonstrated that a successful mismatch procedure for destabilizing goal-tracking memory in rats, allowing chemical disruption, became ineffective as a result of impairment in the ventral tegmental area, a brain region that is believed to be critical for generating prediction error signals but is not a site of memories undergoing reconsolidation. Understanding how mismatch signals are generated and how they supervene upon the machinery of reconsolidation and extinction may prove to be particularly fruitful for arriving at dexterous control of these phenomena.

For a discussion of prediction error signal generation and ideas for future research, see Exton-McGuinness et al. (2015).

In summary, from the MRMR perspective, the triggering of reconsolidation versus extinction by any particular reactivation procedure is to be understood in terms of the presence or absence of a mismatch (prediction error) experience at each point of the procedure. In addition to identifying what may control the reconsolidation/extinction dichotomy, the MRMR model provides a new, fundamental understanding of classical extinction by identifying why repetitive counter-learning creates a separate learning in competition with the target learning, rather than erasing the target learning. The MRMR account potentially unifies a broad range of reconsolidation and extinction phenomena.