5 Questioning Foundations
The first four chapters introduced cognitive psychology’s foundations. The first chapter presented the computer metaphor adopted by cognitive psychologists. The second chapter discussed how cognitive psychologists use experimental methods to infer human information processes. The third chapter defined cognitive psychology’s philosophy of science, functional analysis. The fourth chapter demonstrated how cognitive psychology’s hypotheses, methods, and philosophy allow cognitive psychologists to propose many competing theories. Competing theories lead to many debates about the nature of human cognition, debates that I explore in Chapter 5. Each debate challenges cognitive psychology’s foundations. By understanding these debates, we can sharpen our understanding of—and concerns about—the discipline’s foundations.
5.1 Questioning Foundational Assumptions
What are cognitive psychology’s foundations? Cognitive psychologists often debate cognitive psychology’s foundational assumptions (Dawson, 2013). The rise of connectionism in the mid-1980s challenged the digital computer metaphor. Embodied cognition challenged cognitive psychology’s dismissal of the environment and the body. Cognitive neuroscience challenged cognitive psychology’s functionalism. We can express such challenges as questions about foundations. Does cognitive psychology need the computer metaphor? Does cognition require rules? Do people think? Can we reduce cognition to brain operations?
Chapter 5 explores such challenges. Each section poses a different question, a different challenge to a foundational assumption. I begin by considering a different notion of information processing, connectionism. Next, I discuss a different challenge, embodied cognition. Then I explore cognitive neuroscience’s role in cognitive psychology and address more general questions about the “textbook” presentation of cognitive psychology. I end the chapter by considering the question in the book’s title: what is cognitive psychology?
5.2 Do We Need the Computer Metaphor?
Chapter 1 launched our discussion of cognitive psychology by introducing the computer metaphor: the assumption that cognition involves symbol manipulation like the information processing operations used by digital computers. The computer metaphor leads directly to cognitive psychology’s experimental methodologies (Chapter 2) and explanatory practices (Chapter 3). Chapter 4 revealed that the computer metaphor permits cognitive psychologists to develop diverse cognitive theories by making different architectural assumptions. Most examples in Chapter 4 involved specific assumptions (replacing parallel processing with serial processing or proposing new formats for symbols).
Importantly, radically different cognitive theories arise when cognitive psychologists propose more consequential architectural challenges. Not all cognitive psychologists endorse the computer metaphor. Connectionists abandon the metaphor, propose alternative brain-like theories, and move cognitive psychology in very different directions. I briefly mentioned connectionism earlier (Sections 3.13, 4.2, 4.7), but now I consider connectionism in more detail as a challenge to traditional cognitivism. Sections 5.2 through 5.4 explore connectionism’s challenges to the computer metaphor and the evidence that connectionists need to support radically different theories.
Our discussion of connectionism begins with a core question: why did cognitive psychology adopt the digital computer metaphor? When cognitive psychology arose in the 1950s, the digital computer offered the best example of information processing. However, other information processing examples also existed.
For instance, analog computers appeared decades before digital computers. Analog computers do not use rules to manipulate symbols. Instead, they vary continuously changeable physical properties (e.g., a mechanical or electrical value) to model variables for solving problems. Some researchers suggested that neurons were a kind of analog computer (von Neumann, 1958). Thus, even in the beginning, the digital computer metaphor had plausible alternatives.
Connectionism arose from exploring alternatives to the digital computer metaphor. Connectionists hypothesize that networks of simpler processors, operating in parallel, process information. We often describe connectionist models as neuronally inspired or biologically plausible. Analogous to neurons, connectionist processors send signals to one another through weighted connections. Processors receive signals sent from other processors; we call the total signal received the net input. Processors convert net input into internal activity ranging in value between 0 and 1.
A connectionist network, a system of such processors, converts a stimulus into a response. Processors called input units represent stimuli, and processors called output units represent responses (see Figure 5-1). The network distributes the knowledge for converting a stimulus into a response among all of the weighted connections, a distributed representation. Connectionists replace the digital computer metaphor with the idea that parallel distributed processing (PDP) networks carry out human cognition.
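To make these ideas concrete, consider a minimal Python sketch of a single processor. The logistic (sigmoid) activation function used below is an illustrative assumption; it is one common way of squashing net input into activity between 0 and 1.

```python
import math

def unit_activity(signals, weights):
    """Convert incoming signals into internal activity between 0 and 1."""
    net_input = sum(s * w for s, w in zip(signals, weights))  # weighted sum of signals
    return 1.0 / (1.0 + math.exp(-net_input))                 # logistic (sigmoid) squashing

# Three sending processors, each with its own connection weight.
print(unit_activity([1.0, 0.0, 1.0], [0.8, -0.5, 0.3]))  # about 0.75
```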
Using networks to model human cognition has a long history. Warren McCulloch and Walter Pitts (1943) proposed the first artificial neural networks, using mathematical logic to describe neural processing. When the net input for a McCulloch-Pitts neuron exceeds a threshold, the neuron generates an activity of 1. Otherwise, the neuron generates an activity of 0. McCulloch and Pitts mapped the binary activity of their neurons onto the logical notions of “true” or “false.” McCulloch and Pitts then established the power of networks by creating a universal Turing machine (Section 1.5) from a network of McCulloch-Pitts neurons: “To psychology, however defined, specification of the net would contribute all that could be achieved in that field” (p. 37).
Figure 5-1 A perceptron consisting of 12 input units connected to one output unit via weighted connections.
Networks of McCulloch-Pitts neurons can solve difficult problems but have one drawback: such neurons do not learn. Instead, the network’s designer must predetermine a neuron’s connection weights and threshold. Experience neither creates nor modifies thresholds or connection weights. Frank Rosenblatt (1958, 1962) introduced learning to networks with his perceptron. A perceptron has multiple input units and a single output unit. Perceptrons have weighted connections from each input unit to the output unit (Figure 5-1). The activity of a perceptron’s output unit, like a McCulloch-Pitts neuron, is either 0 or 1.
Perceptrons differ from a McCulloch-Pitts neuron because perceptrons learn. When the perceptron receives a stimulus, the output unit determines its net input and then responds. A learning rule computes response error by comparing the output unit’s response to the desired response. Rosenblatt’s learning rule then alters connection weights to decrease response error. His perceptron convergence theorem guaranteed that his learning rule could teach a perceptron to solve a problem—if the perceptron could represent a solution (Rosenblatt, 1962). We will see below that perceptrons cannot learn to solve every problem.
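The following sketch illustrates learning in the spirit of Rosenblatt’s rule: compare the output unit’s response with the desired response and nudge the connection weights (and the threshold) to reduce the error. The learning rate and the treatment of the threshold as a trainable value are assumptions made for the sketch, not details taken from Rosenblatt (1962).

```python
def train_perceptron(patterns, targets, epochs=25, rate=0.1):
    """Teach a single output unit by repeatedly correcting its response errors."""
    weights = [0.0] * len(patterns[0])
    threshold = 0.0
    for _ in range(epochs):
        for stimulus, desired in zip(patterns, targets):
            net_input = sum(s * w for s, w in zip(stimulus, weights))
            response = 1 if net_input > threshold else 0
            error = desired - response                        # response error
            weights = [w + rate * error * s for w, s in zip(weights, stimulus)]
            threshold -= rate * error                         # make the unit easier or harder to fire
    return weights, threshold

# Learning the logical operator AND, the example discussed next (see Figure 5-2).
print(train_perceptron([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 0, 0, 1]))
```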
One example perceptron generates the logical operator AND. Using two input units, we present two stimuli to the perceptron; each stimulus has a value of either 0 or 1. The AND perceptron outputs a value of 1 if both stimuli equal 1. Otherwise, the perceptron responds with 0. The top of Figure 5-2 plots the pattern space for the AND perceptron. Each circle represents one of the four possible stimuli (pairs of input values). The input unit values give the coordinates of each circle: (0, 0), (0, 1), (1, 0), and (1, 1). The colour of each circle represents the perceptron’s response to each pattern. A white circle indicates a response of 0, and a black circle indicates a response of 1. The bottom part of Figure 5-2 shows a perceptron trained to generate correct AND responses. Each connection weight has a value of 1, and the threshold of the output unit has a value of 1.5.
How does the perceptron’s structure generate AND? Each input unit sends activity to the output unit, but the perceptron first multiplies the activity by the input unit’s connection weight. To compute net input, the perceptron’s output unit sums up the two weighted signals. When we activate both input units with 1, the perceptron receives a net input of 2. A net input of 2 causes the output unit to respond with 1, because the net input exceeds the threshold of 1.5. For the three other possible patterns, the net input equals either 0 or 1, does not exceed the threshold of 1.5, and causes the output unit to respond with 0.
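We can verify this arithmetic with a few lines of code, using the connection weights of 1 and the threshold of 1.5 from Figure 5-2:

```python
weights, threshold = (1.0, 1.0), 1.5

for stimulus in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    net_input = sum(s * w for s, w in zip(stimulus, weights))
    response = 1 if net_input > threshold else 0
    print(stimulus, "net input:", net_input, "response:", response)
# Only (1, 1) generates a net input (2.0) that exceeds 1.5, so only it produces a response of 1.
```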
Rosenblatt’s perceptron led to growing interest in training networks and motivated a formal analysis of what perceptrons could and could not learn to do (Minsky & Papert, 1969). Minsky and Papert proved that perceptrons cannot learn to solve many problems that people can learn to solve; therefore, perceptrons cannot model human cognition.
Figure 5-2 The top part of the figure provides the pattern space for AND. The bottom part of the figure illustrates the structure of a perceptron trained to generate the correct responses for AND.
Minsky and Papert (1969) established the negative assessment of perceptrons by proving that they can only learn linearly separable problems. We call a problem linearly separable if a single, straight cut through a pattern space separates all patterns associated with a response of 1 from all patterns associated with a response of 0. The pattern space for AND (Figure 5-2) is linearly separable because the figure’s one dashed line separates the “on” pattern from the three “off” patterns.
Perceptrons cannot solve linearly non-separable problems. Figure 5-3 provides the pattern space for another logical operator, exclusive or (called XOR). Like AND, XOR performs a logical judgment about two stimuli, generating a response of 1 when the two stimuli (each input unit value) differ from one another. XOR returns a value of 1 when one stimulus is 1 and the other is 0. If both stimuli are 1, or if both are 0, then XOR returns a 0.
Figure 5-3 illustrates that XOR is a linearly non-separable problem because a single straight cut cannot separate both black circles from both white circles. XOR’s pattern space requires two cuts. A perceptron can never solve XOR, because the output unit can make one cut, or the other, but not both. How can we increase the power of a perceptron? We could include intermediate processors between input and output units, called hidden units. The multi-layer perceptron presented earlier (Figure 4-6) possesses one layer of hidden units.
Hidden units add power by detecting more complex features, such as correlations between different input unit activities. Each hidden unit makes its own straight cut through a pattern space (Lippmann, 1989). To solve XOR, we require two such cuts; we can solve XOR using a multi-layer perceptron with two hidden units; each hidden unit provides one required cut (Rumelhart et al., 1986).
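The following sketch hand-wires such a network using threshold units. The particular weights and thresholds are illustrative assumptions (they are not taken from Rumelhart et al., 1986), but they show how two cuts combine to solve XOR:

```python
def step(net_input, threshold):
    return 1 if net_input > threshold else 0

def xor_network(x1, x2):
    h1 = step(x1 + x2, 0.5)    # first cut: at least one input is on
    h2 = step(x1 + x2, 1.5)    # second cut: both inputs are on
    return step(h1 - h2, 0.5)  # respond 1 only for patterns lying between the two cuts

for stimulus in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(stimulus, "->", xor_network(*stimulus))  # prints 0, 1, 1, 0
```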
By the late 1960s, researchers realized that replacing the computer metaphor with connectionist processing required more powerful networks such as multi-layer perceptrons. However, connectionists did not yet know how to train networks with hidden units. As a result, connectionist research languished (Medler, 1998; Papert, 1988). Connectionist research lay largely dormant until the mid-1980s, when a learning rule for training multi-layer perceptrons, called backpropagation of error, appeared in the journal Nature (Rumelhart et al., 1986). Backpropagation of error led to an explosion of connectionist models of cognitive phenomena.
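A compact sketch of backpropagation of error, training a small multi-layer perceptron on XOR, appears below. The network size (two hidden units), learning rate, and number of training sweeps are assumptions chosen to keep the example short; Rumelhart et al.’s (1986) formulation is more general.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # stimuli
T = np.array([[0], [1], [1], [0]], dtype=float)               # desired XOR responses

W1, b1 = rng.normal(0, 1, (2, 2)), np.zeros(2)   # input-to-hidden weights and biases
W2, b2 = rng.normal(0, 1, (2, 1)), np.zeros(1)   # hidden-to-output weights and biases
sigmoid = lambda net: 1.0 / (1.0 + np.exp(-net))
rate = 0.5

for sweep in range(10000):
    H = sigmoid(X @ W1 + b1)                      # forward pass: hidden unit activities
    O = sigmoid(H @ W2 + b2)                      # forward pass: output unit activities
    err_out = (O - T) * O * (1 - O)               # error signal at the output unit
    err_hid = (err_out @ W2.T) * H * (1 - H)      # error propagated backward to hidden units
    W2 -= rate * H.T @ err_out;  b2 -= rate * err_out.sum(axis=0)
    W1 -= rate * X.T @ err_hid;  b1 -= rate * err_hid.sum(axis=0)

# Responses are typically near 0, 1, 1, 0 (training can occasionally stall in a local minimum).
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
```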
Figure 5-3 The pattern space for the linearly non-separable logical operator XOR (Input 1, Input 2).
Some famous examples of connectionist models included networks for converting present tense verbs into past tense verbs (Rumelhart & McClelland, 1986a), for simulating recognition memory (Ratcliff, 1990), for learning categories (Kruschke, 1992), and for generating the Stroop effect (Cohen et al., 1991). We often see connectionist networks in the modern cognitive literature. Backpropagation of error launched a connectionist revolution.
Although the connectionist revolution arose directly from the ability to train multi-layer perceptrons, broader issues made cognitive psychology eager for change (Bechtel & Abrahamsen, 2002; Dreyfus, 1972; Dreyfus & Dreyfus, 1988; Fodor & Pylyshyn, 1988; Medler, 1998; Rumelhart & McClelland, 1986b). Researchers argued that fatal flaws existed in models inspired by the computer metaphor, which used slow serial processing and failed to recognize that brains differed from digital computers. The connectionist revolution offered new ideas to researchers dissatisfied by slow progress in developing sophisticated simulations of cognition. “Almost everyone who is discontent with current cognitive psychology and current ‘information processing’ models of the mind has rushed to embrace ‘the Connectionist alternative’” (Fodor & Pylyshyn, 1988, p. 3). Connectionism offered a paradigm shift for cognitive psychology (Schneider, 1987).
The rise of connectionism shows that the computer metaphor does not provide the only view of information processing for cognitive psychology. However, replacing that metaphor with connectionism does not abandon the information processing hypothesis. Connectionists still treat cognition as information processing (Churchland et al., 1990). But they also believe that cognitive information processing differs from the information processing performed by computers. “These dissimilarities do not imply that brains are not computers, but only that brains are not serial digital computers” (Churchland et al., 1990, p. 48, their italics).
How can we make sense of a discipline possessing two views that appeal to paradigmatically different notions of information processing? Computer metaphor models and connectionist networks do have many similarities (Dawson, 1998, 2013). Both perspectives agree that cognition is information processing but disagree about its basic information processing properties. Thus, the two views provide different proposals about the cognitive architecture.
In the next section, I show that the two views of information processing share greater similarities than we might expect. I examine the connectionist claim that cognition does not require the rule-governed manipulation of symbols, a claim that proves surprisingly hard to defend.
5.3 Does Cognition Require Rules?
In Section 5.2, I considered the connectionist claim that human information processing differs from computer information processing. I described connectionism as exploring brain-like processing in which simple, neuron-like processors send signals to other processors. However, in challenging the computer metaphor, the connectionist revolution encouraged more focused reactions to traditional cognitive architectures. In Section 5.3, I explore one example: the claim that cognition is not rule-governed symbol manipulation and, indeed, requires neither rules nor symbols. I also note that connectionists often fail to use appropriate evidence—interpretations of network structures—to support cognitive theories with no rules or symbols. I end by showing that, when we collect appropriate evidence, we blur the differences between traditional and connectionist cognitive psychology.
Multi-layer perceptrons offered cognitive psychology a paradigm shift (Schneider, 1987). Why might we describe connectionism as a paradigm shift? When we examine a production system (Figure 4-11), we see a set of explicit rules (the productions) and explicit symbols (in working memory). In contrast, when we examine a connectionist network (Figure 4-6), we see neither rules nor symbols. “One thing that connectionist networks have in common with brains is that if you open them up and peer inside, all you can see is a big pile of goo” (Mozer & Smolensky, 1989, p. 3). We describe connectionist networks as offering a paradigm shift for traditional cognitive psychology because networks process information without apparent rules or symbols.
To criticize the computer metaphor, connectionists trained networks to solve problems that other (traditional) researchers believed required rules and symbols. If connectionist networks solved such problems, then connectionists would claim that the networks offered alternative, rule-less and symbol-less, accounts. For example, Rumelhart and McClelland (1986a) trained one network to convert present tense verbs into past tense verbs. We saw in Section 4.7 that Chomsky proposed that mastering a language involves acquiring grammatical rules. Rumelhart and McClelland proposed an alternative: “We suggest that lawful behavior and judgments may be produced by a mechanism in which there is no explicit representation of the rule” (p. 217).
Rumelhart and McClelland’s (1986a) network learned the past tense task, and its changes in performance during learning mirrored the development of handling verb tenses by children. As a result, the model became very influential by providing a radically different account of language learning. Rumelhart and McClelland could describe the network as if it used rules, but no rules were actually represented. “The child need not figure out what the rules are, nor even that there are rules” (p. 267).
However, we can see problems with that conclusion (Pinker & Prince, 1988). Rumelhart and McClelland (1986a) claim that the network does not use rules, but they do not supply evidence about how the network operates to support that claim. Instead, the claim emerges from an uncritical assumption of qualitative differences between networks and rule-governed systems. Rumelhart and McClelland do not actually report the network structure to reveal the alternative (rule-less) nature of its processing. Without such evidence, we cannot establish a qualitative difference between the network and other rule-based models.
Rumelhart and McClelland’s (1986a) failure to examine the internal structure of the past tense verb network should not surprise us. We encounter great difficulties when we attempt to understand the internal workings of connectionist networks (Hecht-Nielsen, 1987; Mozer & Smolensky, 1989). Some researchers argue that failing to understand network structure limits networks’ ability to contribute new cognitive theories (McCloskey, 1991; Seidenberg, 1993). However, various techniques do exist for understanding how networks operate (Berkeley et al., 1995; Dawson, 2018; Dawson et al., 2020; Hanson & Burr, 1990). Importantly, when we use such techniques to interpret networks, the distinction between networks and rule-based models becomes less clear.
Consider a network trained to perform logical judgments. Propositional logic is a system of rules for manipulating symbols. Therefore, logical reasoning provides a prototypical example of rule-governed thinking (Johnson-Laird, 1983; Leighton & Sternberg, 2004; Wason, 1966; Wason & Johnson-Laird, 1972), a position challenged by connectionist networks. Bechtel and Abrahamsen (1991) trained a multi-layer perceptron to classify logical arguments and to indicate argument validity. They hypothesized that “connectionist networks encode knowledge without explicitly employing propositions” (p. 147). Thus, if a network could solve the logic problem, then logical reasoning need not require using rules to manipulate symbols. After successfully training the network, Bechtel and Abrahamsen claimed, “propositionally encoded knowledge might not be the most basic form of knowledge” (p. 174). However, similar to the study by Rumelhart and McClelland (1986a), Bechtel and Abrahamsen did not examine their network’s internal structure.
Other researchers interpreted a different network trained on the Bechtel and Abrahamsen (1991) logic problem (Berkeley et al., 1995). The interpretation discovered the features detected by each hidden unit and determined how the network combined detected features to solve the logic problem. The analysis revealed standard rules of logic represented by network structure. Berkeley et al.’s network solved the logic problem by discovering, and using, rules.
Other examples also show that network interpretations reveal striking similarities between connectionist models and more traditional information processing. For instance, cognitive neuroscientists use dissociations as evidence to relate psychological processes to brain structure. A dissociation occurs when damage to a brain area produces a specific cognitive or behavioural deficit. Researchers combine such evidence with the locality assumption that the brain consists of functionally localized areas (Farah, 1994). Under the locality assumption, cognitive neuroscientists use dissociations to infer that specific brain areas bring to life specific cognitive or behavioural processes.
Farah (1994) used connectionist networks to challenge the locality assumption. She demonstrated that lesions to networks produce dissociations and used her evidence against the locality assumption. Farah argued that networks are distributed systems, not local systems. Therefore, networks do not conform to the locality assumption. If lesions to (non-local) networks produce dissociations, then we cannot claim that dissociations must be caused by damaging localized brain functions.
However, Farah (1994) did not examine the internal structure of her networks to support her argument. She did not confirm that her lesions did not remove a localized structure. A different study lesioned connectionist networks but also interpreted the structure of ablated processors (Medler et al., 2005). When the lesioned networks produced dissociations, the interpretation revealed that local network structure had been removed.
Musical cognition provides another example. Many researchers train connectionist networks to perform various musical tasks (Griffith & Todd, 1999; Todd & Loy, 1991). Most researchers who conduct such research assume that networks capture musical properties that we cannot express using formal rules (Bharucha, 1999). However, when we interpret the internal structures of musical networks, we discover many formal musical properties (Dawson, 2018; Dawson et al., 2020).
A final example concerns a benchmark problem in the machine learning literature. Schlimmer’s (1987) mushroom problem requires a system to learn to classify over 8,000 mushrooms as being edible or poisonous. We describe each mushroom as a set of 21 different features. One study trained a connectionist network with 10 hidden units to classify Schlimmer’s mushrooms (Dawson et al., 2000). Dawson et al. analyzed the network’s internal structure and related it to alternative rule-based models of mushroom classification.
A production system provided one rule-based model for comparison. Dawson et al. (2000) created a set of nine different productions for correctly classifying mushrooms. A set of mushroom features defined each production’s condition. A classification (poisonous versus edible) defined a production’s action. For instance, one production was “if (odour = anise) OR (odour = almond) → edible.” Another production was “if (odour ≠ anise) AND (odour ≠ almond) AND (odour ≠ none) → poisonous.”
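To illustrate, the two quoted productions can be expressed as condition-action rules in a few lines of code. The feature names and the dictionary representation of a mushroom are assumptions made for the sketch; Dawson et al. (2000) report the full set of nine productions.

```python
def classify(mushroom):
    odour = mushroom["odour"]
    if odour in ("anise", "almond"):                  # production: anise or almond -> edible
        return "edible"
    if odour not in ("anise", "almond", "none"):      # production: any other odour -> poisonous
        return "poisonous"
    return None  # odourless mushrooms require the remaining productions

print(classify({"odour": "almond"}))  # edible
print(classify({"odour": "foul"}))    # poisonous
```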
What relationship holds between the production system and the network? Dawson et al. (2000) analyzed their network in two different ways. The first analysis determined the features detected by each hidden unit. The second analysis assigned similar mushrooms to groups, defining similarity in terms of the activity produced by a mushroom in the hidden units. If two different mushrooms produced similar activity patterns, then Dawson et al. assigned the mushrooms to the same group. Otherwise, they assigned them to different groups. Dawson et al. required only 12 different groups to summarize the entire stimulus set.
Dawson et al. (2000) combined their two analyses to translate the network into the production system. They summarized each group of mushrooms in terms of the average activity produced in each hidden unit by group members. Furthermore, on the basis of the first analysis, they translated each average activity into a set of mushroom features. Dawson et al. translated each set of mushroom features into the condition of one of the productions. For instance, when the hidden units produced activities associated with one group of mushrooms, the hidden units detected the features representing the condition for a particular production. Furthermore, the hidden unit activities caused an output unit to generate the production’s action (e.g., to classify a mushroom as edible). In other words, we can describe the Dawson et al. model as a connectionist network, but we can also describe it as a production system, blurring the distinction between connectionist and rule-based processing.
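As an illustrative sketch only, we could group stimuli by the similarity of the hidden unit activities they produce, as below. The k-means clustering and the placeholder activity matrix are assumptions made for the sketch; Dawson et al. (2000) describe their own grouping procedure, which yielded 12 groups.

```python
import numpy as np
from sklearn.cluster import KMeans

# One row per mushroom, one column per hidden unit; placeholder values stand in
# for the activities that a trained network would actually produce.
hidden_activities = np.random.default_rng(1).random((8000, 10))

groups = KMeans(n_clusters=12, n_init=10, random_state=0).fit_predict(hidden_activities)

# The mean activity pattern of each group can then be translated into detected
# mushroom features, and from there into the condition of a production.
group_means = np.vstack([hidden_activities[groups == g].mean(axis=0) for g in range(12)])
print(group_means.shape)  # (12, 10)
```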
The two accounts of the mushroom network introduce the notion of sub-symbolic networks (Smolensky, 1988). Smolensky called networks sub-symbolic because we can explain them by appealing to finely detailed properties: the signals that processors send through weighted connections. Furthermore, the finely detailed properties of sub-symbolic networks (e.g., processor activities) need not represent rules or symbols. If we view networks as being sub-symbolic, then we also believe that higher-level symbolic explanations of networks (e.g., appealing to rules) serve only as approximations. From Smolensky’s perspective, when Dawson et al. (2000) analyzed the details of hidden unit responses, they revealed the mushroom network’s sub-symbolic properties. In contrast, when they translated hidden unit activities into productions, they provided a symbolic approximation of network processing. We can interpret the network “as if” it represents a production system, but in so doing we ignore the fine details of network operations.
However, we need not adopt Smolensky’s (1988) perspective. We have no reason to believe that one of the accounts offered by Dawson et al. (2000) is more accurate, or more informative, than another. Each account depends on the other, and both accounts provide insight into the network. “The picture that emerges is of a symbiosis between the symbolic and subsymbolic paradigms” (Smolensky, 1988, p. 19). To understand a network completely, we require both sub-symbolic and symbolic accounts, implying that connectionism does not eliminate rules from cognitive explanations.
5.4 Can Connectionist Networks Provide Cognitive Theories?
In Sections 5.2 and 5.3, I explored two architectural challenges by connectionists to rule-based cognitivism. I now examine connectionism’s explanatory role in cognitive psychology. Connectionist researchers almost always develop computer simulations of cognitive phenomena, because the trained network itself is connectionism’s object of study. In Chapter 3, I detailed rule-based cognitivism’s explanatory mission: subsumed functional analyses. In Section 5.4, I explore connectionism’s philosophy of science by asking whether connectionist networks can serve as theories or explanations.
Simulations as theories. Cognitive psychology aims to explain cognition. Psychological theories take many forms and have many purposes. Some researchers express theories mathematically (Bock & Jones, 1968; Estes, 1975; Restle & Greeno, 1970). Other researchers express theories as interactions between mechanistic components, interactions for producing predictable behaviour (Eichenbaum, 2002; Martinez & Kesner, 1998). Mechanistic components can also belong to an information processing architecture (Anderson, 1983; Carruthers, 2006; VanLehn, 1991). Researchers often express architectural theories as computer simulations (Dutton & Briggs, 1971; Dutton & Starbuck, 1971; Feigenbaum & Feldman, 1963; Simon & Newell, 1958).
Expressing theories as computer programs provides many advantages (Lewandowsky, 1993). Simon and Newell (1958, pp. 7–8) boldly predicted “that within ten years most theories in psychology will take the form of computer programs, or of qualitative statements about the characteristics of computer programs.” Simon and Newell (1958) based their prediction upon their own experience with the computer metaphor. A different kind of simulation, the artificial neural network, also arose at the same time (Rosenblatt, 1958, 1962; Widrow & Hoff, 1960). We saw in Section 5.2 that psychology’s interest in networks exploded when researchers discovered how to train multi-layer perceptrons (McClelland & Rumelhart, 1986; Rumelhart & McClelland, 1986b).
However, multi-layer perceptrons have many practical limitations. The brain has many layers of intermediate neurons, layers that can deliver the brain’s enormous computational power (Bengio, 2009). However, when we add many layers of hidden units to multi-layer perceptrons, the networks become very difficult to train with backpropagation of error. Recently, connectionists have discovered new rules to train networks with many hidden layers, called deep belief networks (Bengio et al., 2013; Hinton, 2007; Hinton et al., 2006; Hinton & Salakhutdinov, 2006; Larochelle et al., 2012; LeCun et al., 2015). Deep learning rules train deep belief networks to accomplish tasks far more complex than we can teach multi-layer perceptrons when we use traditional learning rules.
For instance, deep belief networks learn to solve complex classification problems related to language, image, and sound (Hinton, 2007; Hinton et al., 2006; Mohamed et al., 2012; Sarikaya et al., 2014). Deep belief networks have applications in agriculture, biology, chemistry, and medicine (Ching et al., 2018; Gawehn et al., 2016; Goh et al., 2017; Kamilaris & Prenafeta-Boldu, 2018; Shen et al., 2017). “Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years” (LeCun et al., 2015, p. 436).
Yet, as deep belief networks revolutionize machine learning, we rarely see them used by cognitive psychologists. In Section 5.4, I discuss why researchers have difficulty converting deep learning networks into cognitive theories.
Bonini’s paradox. Although computer simulations offer many advantages, they also face disadvantages (Lewandowsky, 1993). We call one important disadvantage Bonini’s paradox (Dutton & Briggs, 1971). A computer simulation encounters Bonini’s paradox when we have at least as much difficulty explaining the simulation as we do explaining the phenomenon that we want to model. Researchers “can easily construct a computer model more complicated than the real thing. Since science is to make things simpler, such results can be demoralizing as well as self-defeating” (Dutton & Briggs, 1971, p. 103).
Bonini’s paradox applies to any computer simulation but frequently plagues connectionist networks. I noted in Section 5.3 our difficulties understanding the internal structures of trained networks (Hecht-Nielsen, 1987; Mozer & Smolensky, 1989). As a result, connectionists rarely interpret or report network structure. Connectionism replaces one unknown (human performance on a cognitive task) with two unknowns (human and network performance on a cognitive task).
Bonini’s paradox causes connectionists problems because a network can provide a cognitive theory only if researchers can describe precisely how the network converts stimuli into responses. We cannot merely claim that networks offer new theories because we believe that networks differ qualitatively from rule-based models. We must provide details about the alternative theory that a network provides. Otherwise, we merely practise “gee whiz connectionism” (Dawson, 2009).
When connectionists fail to interpret network structure, connectionist networks fail to provide cognitive theories. McCloskey (1991) argues that networks serve as neither theories nor simulations of theories. Seidenberg (1993, p. 229) admits that “connectionist models do not clarify theoretical ideas, they obscure them.” To address such criticisms, connectionists must develop techniques for understanding exactly how networks convert stimuli into responses. We saw earlier (Section 5.3) that such techniques do exist, and when we interpret networks, they can provide theoretical contributions.
Consider one recent example in which networks learn to solve musical problems (Dawson, 2018). After training, Dawson interprets network structure by inspecting connection weights, by plotting distributions of hidden unit activities, and by performing multivariate analyses of processor activities. He provides a detailed account of each musical network. His interpretations reveal that networks represent formal musical properties, but the properties differ from those used in traditional music theory. For example, music theory assumes that we create Western music from a set of 12 different pitch classes (C, C#, D, etc.) (Forte, 1973). However, the hidden units of Dawson’s networks use smaller sets of pitch classes. His networks treat pitch classes differentiated in traditional music theory as being identical.
Dawson (2018) shows that interpreted connectionist networks can inform cognitive theory. His networks inform music theory by introducing different notions of pitch class. In turn, his results raise questions for experimentally studying musical cognition. Does human cognition use musical representations similar to the representations found in the networks? Importantly, not all types of networks can produce cognitive theory. Bonini’s paradox causes Dawson deliberately to avoid deep belief networks (Turner et al., 2018). Deep belief networks have not yet penetrated cognitive psychology because few methods exist for interpreting deep networks (Erhan et al., 2010).
Deep belief networks accomplish incredible feats but do so as black boxes. Researchers cannot explain how complex networks make decisions. However, growing legal pressure might motivate researchers to develop methods for interpreting deep belief networks (Deeks, 2019). When courts challenge decisions made by networks (e.g., rejecting a bank loan), judges demand that banks explain exactly how the networks made those decisions. Uninterpretable networks lose companies money! Interest in explainable artificial intelligence, or XAI, has grown in response to such concerns (Arrieta et al., 2020). XAI researchers aim to develop more easily understood new systems or to develop new approaches for understanding existing technologies such as deep belief networks. Perhaps, once researchers achieve the goals of XAI, we will see deep belief networks playing a larger role in cognitive psychology.
5.5 Do People Think?
In Sections 5.2 through 5.4, I explored architectural debates about cognitive psychology’s foundations related to connectionism’s challenge to the computer metaphor. Connectionism’s challenges arose from claiming that human information processing differs significantly from the processing carried out by digital computers. However, connectionists did not propose the only alternative to the computer metaphor. An alternative position, called embodied cognition, critiques both traditional and connectionist cognitive psychology. We encountered embodied cognition briefly in Section 3.13, and I now explore it in more detail.
Embodied cognitive psychologists argue that both rule-based and network-based theories pay too little attention to the roles of the world and of agents’ bodies in human cognition. These psychologists note that both traditional and connectionist theories appeal to sense-think-act processing. Embodied cognitive psychologists intend to replace sense-think-act theories with theories based upon sense-act processing.
To replace sense-think-act processing with sense-act processing is to propose an alternative architecture for cognition. As was the case with connectionism, the alternative architecture for embodied cognition leads to new debates about cognitive foundations. I now consider some debates arising from embodied cognition’s alternative architecture. In Section 5.5, I explore one immediate consequence of removing “thinking” from the sense-think-act cycle: when we assume that human cognition involves only sense-act processing, do we claim that people do not think? In Section 5.6, I consider a radical implication of embodied cognition’s emphasis on the environment: the mind extends outside the skull and into the world, a world that becomes part of cognition. In Section 5.7, I study one question raised in Chapter 1: can machines think? However, I now reconsider that question by recognizing that embodied cognition defines “thinking” quite differently from traditional cognitive psychology. To begin our exploration of debates arising from embodied cognition, let us ask why embodied cognitive psychologists prefer sense-act processing over sense-think-act processing.
During a baseball game, a batter hits a fly ball to the outfield. Seeing the hit, an outfielder runs across the field to catch the ball. How does she know where to go? Perhaps the outfielder solves this problem by thinking. She mentally models the ball’s trajectory, using some initial variables, and uses the model to predict where to run to catch the ball (Saxberg, 1987a, 1987b). Alternatively, perhaps when the outfielder starts to run, she simply watches the ball. She runs in the direction that makes the ball’s trajectory look like a straight line (McBeath et al., 1995). If the trajectory does not look straight, then the outfielder changes direction, eventually reaching the location where the ball can be caught.
The two approaches differ dramatically from one another. The first appeals to thinking and mental representation, whereas the second does not. Instead, the second approach involves only sensing (watching the ball) and acting (running across the field). If the second answer works, then we might be able to solve some complex problems without thinking. We might then ask: which other problems can we solve without thinking? Do people need to think at all? I now explore such questions by considering sense-act theories in cognitive psychology.
Experimental psychology arose in the 19th century to study consciousness scientifically. Psychology changed when behaviorism arrived; behaviorists argued that scientific psychology could study observable phenomena only (Watson, 1913). Behaviorism provided a stimulus-response psychology. Cognitive psychology reacted to behaviorism by adopting a competing view, the sense-think-act cycle. According to that cycle, organisms first sense information from the environment. Then they think by manipulating sensed (and represented) information. The purpose of thinking is to plan—to hypothesize actions that might achieve desirable outcomes and reject actions that do not. Finally, sense-think-act processing converts a chosen plan into action on the world. We call the sense-think-act cycle the classical sandwich, because thinking is necessarily sandwiched between sensing and acting (Hurley, 2001). In that cycle, direct connections do not exist between sensing and acting. We must think before we act.
In reacting to behaviorism, cognitive psychology appealed to information processing concepts as intervening variables (Bruner, 1990; Sperry, 1993). However, the “sense” and “act” components of the classical sandwich seemed to be too behaviorist by being too strongly linked to “stimulus” and “response.” As a result, cognitive psychologists overemphasized thinking and underemphasized both sensing and acting. “One problem with psychology’s attempt at cognitive theory has been our persistence in thinking about cognition without bringing in perceptual and motor processes” (Newell, 1990, p. 15). Which problems arise when cognitive psychologists overemphasize thinking?
A first problem is that, when we overemphasize thinking, our theories become overly complex (Braitenberg, 1984). Researchers who emphasize thinking assume that complicated actions result from intricate thought processes. Thus, cognitive psychologists explain complex behaviour by proposing complex thought processes. However, complex behaviour might arise from much simpler processes.
Consider the parable of the ant (Simon, 1969). Imagine explaining the winding path that an ant takes along a beach. If we focus exclusively on thinking, then we explain the path’s shape via complex internal processes. However, we can adopt a simpler sense-act theory: the complex path emerges from the ant’s simple reactions to obstacles. As Simon noted, “Viewed as a geometric figure, the ant’s path is irregular, complex, hard to describe. But its complexity is really a complexity in the surface of the beach, not a complexity in the ant” (p. 24).
A second problem emerges when our theories overemphasize thinking and planning. Information processing creates plans by constructing and updating a mental model of the world. We plan our actions by manipulating our mental models. However, modelling and planning require excessive time and resources (Ford & Pylyshyn, 1996; Pylyshyn, 1987). Consider the famous robot Shakey, which navigated through an environment, pushing objects to new locations to accomplish assigned tasks (Nilsson, 1984). Shakey sent sensor readings to a computer, which created a model of the robot’s world. The computer used the model to plan actions and sent the plan back to Shakey’s robot body for execution. Shakey exemplified the sense-think-act cycle. Unfortunately, the robot performed behaviours extremely slowly. Shakey required several hours to complete tasks (Moravec, 1999), spending much time idling in place while the computer modelled and planned. Shakey’s “thinking” required a great deal of time, making Shakey’s actions uselessly slow.
This slow performance inspired an alternative, behaviour-based robotics, to speed up behaviour by removing thinking (Brooks, 1991, 1999). “Models of the world simply get in the way. It turns out to be better to use the world as its own model” (Brooks, 1991, p. 139). Behaviour-based robotics replaced the sense-think-act cycle with a sense-act cycle. By removing thinking, behaviour-based robotics resembles behaviorism, and behaviour-based roboticists only consider stimuli and responses, using “highly reactive architectures with no reasoning systems, no manipulable representations, no symbols, and totally decentralized computation” (Brooks, 1999, p. 170). Behaviour-based robots sense and react; they do not think and plan.
In cognitive psychology, ideas from behaviour-based robotics appear to inspire an approach called embodied cognition (Calvo & Gomila, 2008; Chemero, 2009; Clark, 1997, 1999, 2008; Dawson et al., 2010; Lakoff & Johnson, 1999; Rowlands, 2010; Shapiro, 2014, 2019; Varela et al., 1991; Wilson, 2002).
Shapiro (2019) identifies three characteristics of embodied theories. The first characteristic is conceptualization. The concepts that an organism uses to interact with the environment depend on the form of the organism’s body. If different agents possess different bodies, then their understanding of or engagement with the world also differs. We find conceptualization in biology’s notion of the umwelt (Uexküll, 1957, 2001) and in psychology’s related idea of affordance (Gibson, 1979). Gibson called a possible action offered by the world to an organism an affordance, which depends on the shape of the world and the nature of the organism’s body. A smooth, vertical wall does not afford “climbing” to a human, but the same wall affords “climbing” to a housefly.
The second characteristic is replacement (Shapiro, 2019). The environment can aid or replace cognitive resources. For instance, a student who takes lecture notes replaces her internal memory with an environmental record. When we use the environment to support cognition, the environment provides cognitive scaffolding (Clark, 1997).
The third characteristic is constitution (Shapiro, 2019). An organism’s body and environment have more than causal effects on cognition. Instead, the body and the world belong to cognition. The constitution hypothesis leads to a radical proposal, the extended mind hypothesis, which I consider in Section 5.6.
Shapiro (2019) observes that his three characteristics appear to different degrees in different embodied theories. For example, when theories exhibit different degrees of replacement, some are more comfortable than others with the existence of mental representations. Some embodied theories are hybrid because they explain some cognition with the sense-think-act cycle and other cognition with sense-act processing. If the world scaffolds cognition, then some thinking moves from inside the head to outside in the world—but not all thinking can move to the world.
Many embodied theories of language, social cognition, and mathematical reasoning propose that we use our own bodies to scaffold cognition (Dove, 2014; Fischer & Zwaan, 2008; Gallese & Goldman, 1998; Gallese et al., 2004; Gallese & Sinigaglia, 2011; Lakoff & Johnson, 1999; Lakoff & Núñez, 2000). For example, children often develop number concepts by counting with their fingers (Dehaene, 2011). Fingers offer the affordance of “countable” (Chrisomalis, 2013). Finger-based representations provide prototypical examples of replacing symbols with bodily representations (Bender & Beller, 2012; Fischer & Brugger, 2011; Tschentscher et al., 2012; Wasner et al., 2014).
However, not all theories in embodied cognition propose representations. Radical embodied cognitive scientists propose explicitly anti-representational theories (Anderson et al., 2012; Chemero, 2000, 2009; de Oliveira et al., 2019). Radical embodied cognitivists believe that cognitive psychologists err when appealing to representations. Radical embodied cognitive science avoids making such a mistake by eliminating representation and by explaining all of cognition using sense-act theories.
We can find many sense-act accounts of diverse cognitive phenomena (Shapiro, 2014). By understanding the parable of the ant, and by paying more attention to the roles of the environment and bodies in cognition, embodied cognitive psychologists can propose new and important theories. However, radical embodied cognition might not succeed in eliminating mental representations. Focusing only on sensing and acting moves embodied cognition closer to behaviorism and to the challenges that it failed to meet.
Gestalt psychologists challenged behaviorism by discovering insight—the sudden and unexpected experience of a solution to a problem (Köhler, 1925/2018; Wertheimer & Asch, 1945). Can sense-act theories explain insight? The stimulus-response theories of behaviorism provided inadequate accounts of human language (Chomsky, 1959). Do the sense-act theories of radically embodied cognition offer more explanatory power? Do people think? We have no compelling evidence to throw away the representations proposed by cognitive psychologists. However, we also have no reason to believe that every cognitive phenomenon has a representational explanation.
Embodied cognition reveals that we can explain many interesting phenomena via rich interactions between bodies and environments. Embodied cognitive researchers “get to work providing non-representational explanations of cognitive phenomena both convincing and sufficiently rich in their implications to guide further research” (Chemero, 2000, p. 646). Embodied cognition adopts a fruitful strategy likely to show how much cognition requires thinking—and how much does not.
5.6 Where Is the Mind?
Where is the mind? According to modern, materialist psychology, the mind resides inside the skull, because brains cause minds, and skulls contain brains (Searle, 1980, 1984). However, embodied cognition challenges this answer. Embodied cognition uses feedback to link or couple organisms to their environments (Ashby, 1956, 1960; Grey Walter, 1963; Wiener, 1948). Organisms act to change the world, and changes in the world influence future actions. Feedback means that we can use the world for cognitive scaffolding (Clark, 1997, 2008; Scribner & Tobach, 1997).
We can easily propose many concrete examples of cognitive scaffolding. When we write a reminder to ourselves, we use the environment to scaffold memory. Children gain insight into calculating the areas of irregular figures by cutting cardboard models with scissors (Wertheimer & Asch, 1945). When a player rearranges her tiles while playing Scrabble, the rearranged tiles scaffold word retrieval (Kirsh, 1995).
We can find more abstract examples of scaffolding. In education, Vygotsky (1986) used the term zone of proximal development for the difference between a child’s ability to solve problems without aid and his ability to solve problems when provided with support or assistance. Vygotsky championed educational techniques for bridging the gap using scaffolding, which involves social and cultural factors and includes language.
Cognitive scaffolding typifies an important characteristic of embodied cognition, replacement, which occurs when environmental scaffolds replace internal cognitive resources (Shapiro, 2019). Replacement frees cognitive resources, reducing “the loads on individual brains by locating those brains in complex webs of linguistic, social, political, and institutional constraints” (Clark, 1997, p. 180).
Replacement, however, also leads to questions about the mind’s location. If I scaffold my memory with written notes, then do they make up part of my memory? If I find rules to calculate area by manipulating cardboard models, then do the cardboard models make up part of my mathematical reasoning? “If, as we confront some task, a part of the world functions as a process which, were it done in the head, we would have no hesitation in recognizing as part of the cognitive process, then that part of the world is (so we claim) part of the cognitive process” (Clark & Chalmers, 1998, p. 8).
Questions about the mind’s location intensify after realizing that scaffolding consists of more than using the environment to store information. The organism’s body—its embodiment—determines the rich interaction between an organism and its world (Gibson, 1979). Embodiment defines possible actions; Gibson claims that “it is often neglected that the words animal and environment make an inseparable pair” (p. 8). The rich interactions involved in scaffolding lead to a controversial property of embodied cognition, constitution (Shapiro, 2019). If the environment can replace cognitive resources, and if embodiment and environment determine how we experience the world, then the environment does more than supply information. Constitution claims that the world belongs to cognition and does not merely provide information to cognition.
Constitution alters the definition of “mind” or “self” and questions the mind’s location (Bateson, 1972). If the environment belongs to cognition, then the mind extends into the world. What does the extended mind imply? “But what about ‘me’? Suppose I am a blind man, and I use a stick. I go tap, tap, tap. Where do I start? Is my mental system bounded at the handle of the stick? Is it bounded by my skin?” (Bateson, 1972, p. 465). Embodied cognition takes Bateson’s questions seriously by proposing the extended mind hypothesis (Clark, 1997, 1999, 2003, 2008; Clark & Chalmers, 1998; Menary, 2008, 2010; Noë, 2009; Rupert, 2009; Wilson, 2004, 2005). According to this hypothesis, no boundary exists between the mind and the world. “It is the human brain plus these chunks of external scaffolding that finally constitutes the smart, rational inference engine we call mind” (Clark, 1997, p. 180).
The extended mind hypothesis also permits more elaborate notions of mind, such as cooperative cognition, which occurs when several agents share an environment (Hutchins, 1995). More than one cognitive agent can manipulate the world, which also scaffolds the information processing of other group members. As a result, “organized groups may have cognitive properties that differ from those of the individuals who constitute the group” (Hutchins, 1995, p. 228).
Hutchins (1995) uses his idea to extend the parable of the ant (Simon, 1969). Hutchins proposes watching generations of ants at work at a beach after a storm. Later generations will appear to be smarter because they behave more efficiently. But, as Hutchins notes, “the environment is not the same. Generations of ants have left their marks on the beach, and now a dumb ant has been made to appear smart through its simple interaction with the residua of the history of its ancestor’s actions” (p. 169).
Collective cognition also appears outside cognitive psychology, for example in entomology’s concept of the superorganism, which imparts intelligence to the colony rather than to the individual (Wheeler, 1911). The superorganism describes how colonies create elaborate structures, such as nests, achievements that we cannot predict from the capabilities of individual colony members.
Furthermore, the superorganism’s intelligence emerges from cognitive scaffolding called stigmergy (Grasse, 1959; Theraulaz & Bonabeau, 1999). Stigmergy proposes that members of insect colonies do not themselves coordinate nest-building behaviour. Instead, the nest controls its own construction by stimulating insect behaviour. The nest-as-stimulus elicits particular insect actions for changing the nest in particular ways. Once changed, the nest becomes a different stimulus and elicits different nest-building actions.
The success of exploring the extended mind or collective cognition in other fields fuels new interest in these ideas within cognitive psychology. Such interest flourishes in embodied cognition, which has very different ideas about the mind and the role of the environment. However, the extended mind hypothesis faces intense criticism (Adams & Aizawa, 2008; Menary, 2010; Robbins & Aydede, 2009). Adams and Aizawa argue that embodied cognitivists do not define principled differences between cognitive and non-cognitive processing: “What the advocates of extended cognition need, but, we argue, do not have, is a plausible theory of the difference between the cognitive and the non-cognitive that does justice to the subject matter of cognitive psychology” (p. 11). They worry that the extended mind means that anything is cognitive.
Where is the mind? A psychology more open to cognitive contributions from the world, contributions also depending on the body, raises important questions about where cognition occurs. Such questions mean that researchers must reconsider how we need to study cognition. “We need a greater understanding of the ways in which the institutional setting, norms and values of the work group and, more broadly, cultural understandings of labor contribute to the reorganization of work tasks in a given community” (Scribner & Tobach, 1997, p. 373).
5.7 Can Machines Think?
Can machines think? The answer depends on our definition of “thinking.” For instance, if we believe the information processing hypothesis—if cognition is rule-governed symbol manipulation—then other symbol-manipulating devices, such as digital computers, can think (Section 1.6). However, the answer also depends on our definition of “machine.” Traditional cognitive psychologists, connectionists, and embodied cognitive psychologists propose different notions of “machine.”
The possibility of thinking machines begins with the mechanical view of human bodies. In the 17th century, philosophers described humans as machines (Descartes, 1637/1960; Hobbes, 1651/1967). Philosophers described 18th-century clockwork automata as “living machines” offering support to mechanistic philosophies (Grenville, 2001; Wood, 2002). Some 18th-century philosophers claimed that thought itself is mechanical (La Mettrie, 1750). In the 19th century, elaborations of the mechanical view heralded modern notions of machine intelligence. George Boole (1854/2003) invented mathematical logic because he wanted to study thought mathematically. He equated thinking with performing logical operations and introduced his mathematical ideas in a book titled The Laws of Thought.
Researchers soon realized that machines could perform Boole’s logical operations and invented various devices for solving problems of logic. The first such device, called the logical piano (Jevons, 1870), inspired more powerful logic machines (Marquand, 1885). Marquand even designed an electromagnetic logic machine (Mays, 1953). Boole’s logic also set the stage for the 20th century’s information age. Alan Turing’s (1936) universal machine had far more power than did the 19th-century logic machines. Claude Shannon (1938) represented electric circuits as Boolean operators. The digital computer’s invention depended on Turing’s and Shannon’s insights (Goldstine, 1993).
Digital computers appeared to be capable of thinking (Section 1.6). Early computers performed intelligent tasks, such as playing games, generating logical proofs, or solving problems (Feigenbaum & Feldman, 1963; Newell et al., 1958; Newell & Simon, 1956; Samuel, 1959; Simon & Newell, 1958). Turing (1950), convinced of the inevitability of machine intelligence, developed a test of computer intelligence. Books aimed at the general public described computers as thinking machines (Adler, 1961; Bell, 1962; Berkeley, 1949; Wiener, 1950). Advances in artificial intelligence strengthened the widespread belief in thinking machines. The age of intelligent machines spanned the late 1960s to the late 1980s (Kurzweil, 1990). During the age of intelligent machines, researchers developed numerous expert systems (Feigenbaum & McCorduck, 1983; Kurzweil, 1990). An expert system is a computer program that solves problems with an ability equal to, if not greater than, that of a human expert. Expert systems appeared in diverse domains, such as finance, manufacturing control, and medical diagnosis.
However, expert systems could only solve very narrowly defined problems and did not deliver general intelligence. “An overall pattern had begun to take shape . . . : an early, dramatic success based on the easy performance of simple tasks, or low-quality work on complex tasks, and then diminishing returns, disenchantment, and, in some cases, pessimism” (Dreyfus, 1992, p. 99). The pattern noted by Dreyfus produced harsh criticisms of the computer metaphor (Dreyfus, 1972, 1992; Winograd & Flores, 1987). AI researcher Terry Winograd (1972, 1983) pioneered computer programs for understanding language. However, by the late 1980s, he held little hope for his enterprise: “Our position . . . is that computers cannot understand language” (Winograd & Flores, 1987, p. 107).
Rising pessimism led many to argue that the computer metaphor does not provide an adequate account of human thinking. The Chinese room argument provides one influential example (Searle, 1980, 1984, 1990). In Searle’s thought experiment, we write a question in Mandarin symbols, pass the question into a room through a slot, and then receive an answer, again written in Mandarin symbols. Clearly, the room understands the symbols because the room provides intelligible answers to written questions. But concerns arise when we look inside to see the mechanisms in the room. We see boxes of Mandarin symbols and instructions for converting sequences of symbols passed into the room into new sequences (answers). Inside we also see a native English speaker who does not understand Mandarin symbols, but she can follow the room’s instructions and answer the questions even though she does not understand what the symbols mean.
The Chinese room contains the core elements of an information processing theory: the rule-governed manipulation of symbols. However, the room’s components possess no true understanding of Mandarin. Therefore, Searle uses the Chinese room to argue that the computer metaphor cannot explain intelligent acts, such as understanding language. “Understanding a language, or indeed, having mental states at all, involves more than just having a bunch of formal symbols” (Searle, 1984, p. 33). If computers cannot produce intelligence, then what kind of machine can? Searle answers “the brain” because brains cause minds and—given the Chinese room argument—must do so by doing more than running a computer program. Thus, “anything else that caused minds would have to have causal powers at least equivalent to those of the brain” (p. 40). In other words, the possibility of machine intelligence depends on the nature of the machine itself.
Concerns about the computer metaphor produced alternative approaches to information processing, including a growing interest in neural accounts of cognitive processes. I have already discussed one related topic, the rise of connectionism, in Section 5.2. Unsurprisingly, connectionism arose at the same time that serious criticisms of the computer metaphor appeared. Figure 5-4 illustrates such a trend by plotting the number of times that four different terms (“expert systems,” “neuroscience,” “connectionism,” and “cognitive neuroscience”) appear in books curated by Google in the period from 1970 to 2008. I obtained the plotted results using the Google nGram viewer. The graph shows the dramatic rise, and the equally dramatic fall, of the term “expert system.” As its usage decreases, usage of the other three terms increases, reflecting a shift toward more brain-based approaches to mental phenomena.
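Readers who want to generate a similar plot can do so from data exported from the nGram viewer. The following sketch assumes a hypothetical CSV export (ngram_export.csv) with a year column and one column of relative frequencies per term; it illustrates the general procedure rather than the exact steps used to produce Figure 5-4.

```python
# Minimal sketch for plotting nGram-style term frequencies, assuming the
# yearly frequencies were exported to a hypothetical file "ngram_export.csv"
# with columns: year, expert systems, neuroscience, connectionism,
# cognitive neuroscience.
import pandas as pd
import matplotlib.pyplot as plt

terms = ["expert systems", "neuroscience", "connectionism", "cognitive neuroscience"]

ngrams = pd.read_csv("ngram_export.csv")
ngrams = ngrams[(ngrams["year"] >= 1970) & (ngrams["year"] <= 2008)]

for term in terms:
    plt.plot(ngrams["year"], ngrams[term], label=term)

plt.xlabel("Year")
plt.ylabel("Relative frequency in Google Books")
plt.legend()
plt.show()
```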
Searle (1990) modified the Chinese room argument to challenge connectionism by proposing a Chinese gym filled with many native English speakers, each performing the same function as a neuron or a network processor. No one inside the gym understands Chinese. The gym can answer the same questions answered by the room. Searle concluded that, because no understanding of Chinese exists inside the gym, the gym refutes connectionism just as the room refutes the computer metaphor.
Searle’s Chinese gym provoked a connectionist rebuttal (Churchland & Churchland, 1990). The Churchlands noted that “no neuron in my brain understands English, although my whole brain does” (p. 37). They provided an example of the whole system reply to the Chinese room argument. According to that reply, the whole room (or the whole gym), not its components, understands Chinese. The whole system reply also encourages us to consider which kinds of whole systems can understand. For instance, some could argue that an intelligent Chinese room must take the form not of a room but of a humanoid. To understand a language, computers might require bodies similar to those of humans, because our physical interactions with our world define our semantics (Dreyfus, 1967).
Figure 5-4 The results from Google’s nGram viewer for the term “expert system” and for three other brain-based terms for the period from 1970 to 2008.
We frequently see roboticists claiming that humanoid intelligence depends on humanoid embodiment. The early successes in behaviour-based robotics involved machines with insect-like embodiment (Brooks, 1999, 2002). However, robots with such embodiment can achieve at best insect-level intelligence (Moravec, 1999). We must develop alternative embodiments if we want to emulate human intelligence. Such criticisms have driven recent advances in behaviour-based robotics.
For example, Brooks developed a humanoid robot called Cog (Brooks, 1997; Brooks et al., 1999; Brooks & Stein, 1994). Cog began as a torso with a single jointed arm and a head. Brooks aimed to make Cog’s interaction with the world as human-like as possible. For instance, its visual system consisted of cameras to emulate human saccadic eye movements. Brooks developed Cog to explore ideas central to embodied cognition: “All human representations are ultimately grounded in sensory motor patterns. Thus to build an artificial system with similar grounding to a human system it is necessary to build a robot with human form” (1997, p. 968).
Social robotics provides another example of emphasizing human embodiment. The embodiment of a social robot facilitates and modulates social interactions with humans (Breazeal, 2002, 2003, 2004; Breazeal et al., 2009). Social robots typically have human or animal embodiments to permit dynamic interactions with humans via verbal or non-verbal behaviours (Breazeal et al., 2016). For instance, one robot uses variations of head position, mouth shape, direction of gaze, opening of eyelids, or raising of eyebrows to influence social interactions or to communicate internal states. The robot’s behaviours influence people because humans have high sensitivity to such social signals (Breazeal et al., 2016). Socially intelligent robots require appropriate embodiment.
Can machines think? Early adopters of the computer metaphor answered affirmatively. However, the failure of expert systems to produce general intelligence produced new ideas. Some argue that machines can think if their inner workings resemble those of the brain. Others argue that, for machines to think like humans, they must have humanoid embodiment. Ideas about intelligent machines have evolved as different approaches to human information processing have emerged.
Interestingly, Searle dismisses such approaches. His Chinese gym argument attacks connectionists (Searle, 1990). As for humanoid embodiment, Searle (1984) argues that a robot cannot understand language using a computer “brain.” Searle believes that only biological brains can think. We will see in Section 5.8 that many other researchers share Searle’s belief.
5.8 What Is the “Cognitive” in Cognitive Neuroscience?
Sections 5.2 through 5.7 explored debates arising from challenges to the computer metaphor. Both connectionism and embodied cognition reject similarities between human and computer information processing. The new views of information processing provided by connectionism and embodied cognition represent architectural challenges to traditional cognitive psychology. However, the new views also lead to philosophical challenges. Both connectionism’s appeal to brain-like processing and embodied cognition’s appeal to physical embodiment react to cognitive psychology’s functionalism because the new views place particular emphasis on physical substrates or mechanisms.
I now explore a particular rejection of functionalism, cognitive neuroscience. Since the 1990s, the so-called decade of the brain, researchers have increasingly proposed cognitive theories related to brain areas or neural functions. Such theories reject functionalism. How might non-functionalist theories contribute to cognitive psychology? In Section 5.8, I argue that cognitive neuroscientists should contribute to cognitive psychology by supporting functional analyses and not by reducing cognition to brain operations. In Section 5.9, I elaborate the argument in Section 5.8 by considering recent philosophical criticisms of cognitive neuroscience. I begin by relating cognitive psychology and cognitive neuroscience using many-to-one relationships.
Theories in cognitive psychology reveal many-to-one relationships. Different algorithms can produce the same behaviour. Different architectures can perform the same algorithm. Different physical systems can create the same architecture. We call the last relationship, from the physical to the architectural, multiple realization (Polger & Shapiro, 2016). Multiple realization recognizes that different physical substrates can produce identical functions (Putnam, 1975a, 1975b). “We could be made of Swiss cheese and it wouldn’t matter” (Putnam, 1975b, p. 291). Cognitive psychologists endorse multiple realization by appealing to functions instead of physical states (Polger, 2012). Multiple realization permits computer simulations of cognitive theories. Provided that we use the correct functions, we need not concern ourselves with the physical differences between computers and brains.
Although early cognitive psychology adopted functionalism, modern cognitive psychology appeals more and more to cognitive neuroscience. The rise of cognitive neuroscience began in the 1990s with the so-called neuro-turn in the social sciences (Cooter, 2014; Pedersen, 2011; Vidal & Ortega, 2017). The neuro-turn grew with increased funding for brain research during the “decade of the brain” (Jones & Mendell, 1999). Cognitive psychology’s neuro-turn appears in many textbooks, which display brain imaging pictures and include chapters about cognitive neuroscience (Section 5.10). Modern texts also include cognitive neuroscience in definitions of cognitive psychology (Section 5.11). Given the functionalism of cognitive psychology, how can cognitive psychology include cognitive neuroscience? What is the “cognitive” in cognitive neuroscience (Figdor, 2013)?
Cognitive neuroscientists assume that brains cause minds and explore their assumption scientifically. “Cognitive neuroscience is an experimental investigation that aims to discover empirical truths concerning the neural foundations of human faculties and the neural processes that accompany their exercise” (Bennett & Hacker, 2013, p. 238). Cognitive neuroscientists adopt three additional assumptions to guide their investigations (Frisch, 2014). The first is localizationism: assuming an association between mental functions and localized brain areas. The second is internalism: assuming that neural mechanisms produce localized mental functions. And the third is isolationism: assuming that we can use biological causes to explain how mental functions are generated by local brain regions.
The history of the three assumptions dates back to the early 18th-century hypothesis associating specific mental functions with specific, local brain regions. Gall’s now discredited phrenology popularized localizationism (Simpson, 2005). Proper scientific support for localized brain functions arose in the middle of the 19th century. Paul Broca and Carl Wernicke discovered that damage to different regions of the brain produced different types of aphasia (Bennett & Hacker, 2013). Thus, we find historical links between cognitive neuroscience and the study of the relationships between brain injuries and mental deficits.
We can also link cognitive neuroscience to more recent invasive studies of animal brains. David Hubel and Torsten Wiesel (1959, 1962) recorded the responses of visual neurons in cats and pioneered techniques used to provide a modern, detailed, functional map of the primate visual system (van Essen et al., 1992). The modern map has 32 different cortical areas, each understood as detecting different visual features. Thus, invasive explorations of the visual system also provide evidence to associate local brain areas with particular psychological processes.
Modern cognitive neuroscience distinguishes itself from older traditions by using non-invasive techniques to study the normal brain (Kok, 2020). Cognitive neuroscience’s older techniques measured the brain’s electrical activity and included the electroencephalogram (EEG), a technique developed in the 1920s (Stone & Hughes, 2013). An EEG measures electrical activity in different parts of the brain via electrodes attached to the scalp. A related technique, the event-related potential (ERP), uses the EEG to measure brain responses to specific sensory, cognitive, or motor events. We measure ERPs by averaging multiple EEG recordings time-locked to the same event (Cox & Evarts, 1961; Dawson, 1954). More modern techniques create brain images from magnetic fields. Researchers invented magnetic resonance imaging (MRI) in the early 1970s (Lauterbur, 1973). Functional MRI (fMRI), developed in the early 1990s (Bandettini, 2012), measures brain activity by detecting changes associated with cerebral blood flow. fMRI’s key measure is the blood-oxygen-level-dependent (BOLD) signal.
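The logic of deriving an ERP from the EEG is simple enough to sketch in a few lines. In the illustration below, the recording, sampling rate, and event times are made-up stand-ins rather than data from any study cited here; the point is only that averaging many event-locked epochs cancels activity unrelated to the event.

```python
# Sketch of ERP computation: average EEG epochs time-locked to repeated events.
# The recording, sampling rate, and event times below are illustrative only.
import numpy as np

fs = 500                                              # assumed sampling rate (Hz)
eeg = np.random.randn(60 * fs)                        # stand-in for one EEG channel
event_onsets = np.arange(fs, len(eeg) - fs, 2 * fs)   # stand-in stimulus onsets

pre, post = int(0.1 * fs), int(0.5 * fs)              # 100 ms before to 500 ms after
epochs = np.array([eeg[t - pre:t + post] for t in event_onsets])

# Averaging across epochs cancels activity unrelated to the event,
# leaving the event-related potential.
erp = epochs.mean(axis=0)
```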
How do cognitive neuroscientists use such modern methods? We can describe one common approach as a new version of the subtractive method established by Donders (Sections 3.9 and 4.2). The modern subtractive method measures brain activity during a “reference” task as well as during a “target” task (Cabeza & Nyberg, 1997, 2000). Researchers presume that the target task differs from the reference task by a single cognitive process of interest. By subtracting reference task brain activity from target task brain activity, cognitive neuroscientists can correlate the cognitive process of interest with brain regions exhibiting higher activity (“activations”) or lower activity (“deactivations”).
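A minimal sketch conveys the subtractive logic. The arrays, scan counts, and the arbitrary threshold below are illustrative assumptions, and the sketch omits the statistical testing and corrections that real imaging analyses require.

```python
# Sketch of the subtractive method: average activity per voxel in each task,
# subtract, and flag large differences. All numbers here are illustrative.
import numpy as np

n_scans, n_voxels = 50, 10_000
target = np.random.randn(n_scans, n_voxels)       # activity during the target task
reference = np.random.randn(n_scans, n_voxels)    # activity during the reference task

difference = target.mean(axis=0) - reference.mean(axis=0)

activations = np.where(difference > 0.5)[0]       # voxels more active in the target task
deactivations = np.where(difference < -0.5)[0]    # voxels less active in the target task
```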
One imaging study of the Stroop effect provides an example of cognitive neuroscience’s subtractive method (Langenecker et al., 2004). The study compared the neural processing of older adults with that of younger adults. Older adults tended to produce larger Stroop effects. Langenecker et al. hypothesized an association between the brain’s frontal lobes and age-related differences in Stroop effects, and they used three different Stroop task conditions. The first, the congruent condition, had the names of colours printed in correct ink colours. The second, the incongruent condition, had the names of colours printed in incorrect ink colours. The third, the neutral condition, had non-colour words printed in various ink colours. Langenecker et al. examined brain activations using variations of the subtraction method. In the first variation, they treated the incongruent condition as the target condition and the neutral condition as the reference condition. In the second, they treated the incongruent condition as the target condition and the congruent condition as the reference condition. And in the third, they treated the congruent condition as the target condition and the neutral condition as the reference condition.
Langenecker et al. (2004) found results to support the hypothesis that older adults needed to recruit more inhibitory mechanisms in order to perform the task. For instance, the incongruent-congruent comparison revealed higher activations in frontal lobes for older adults compared with younger adults. Langenecker et al. obtained similar results for the other subtractive comparisons. They concluded that frontal lobes (in particular, the left inferior frontal gyrus) control inhibition.
Brain imaging studies like the one conducted by Langenecker et al. (2004) provide insights into the relationship between the brain and cognition. One paper reviewed the results of 275 different imaging studies conducted between 1988 and 1998 (Cabeza & Nyberg, 2000). The review covered many cognitive domains (attention, perception, imagery, language, and several different aspects of memory). Cabeza and Nyberg found that specific regions of the brain demonstrate consistent activation patterns for each cognitive domain.
In spite of cognitive neuroscience’s apparent success, researchers express persistent and growing concerns about its utility. Some concerns focus on brain imaging methodology. Others question the inferences that we can make about relations between the brain and cognition. Let us briefly consider some criticisms of cognitive neuroscience.
One extensive literature review challenges cognitive neuroscience’s localizationism (Uttal, 2011). Uttal argues that brain imaging results fail to demonstrate functional localization. Instead, the results provide overwhelming evidence of distributed representations and processing. As Uttal notes, “Brain imaging meta-studies show that when the results of a number of experiments are pooled, the typical result is to show activations over most of the brain rather than convergence on a single location” (p. 365).
Uttal (2011) also questions the subtractive method. Even when it reveals no activation differences between tasks, different microscopic processes can still occur, because different processes might produce the same measured activity. Worries about the subtractive method, coupled with concerns about localizationism, lead Uttal to promote a new approach emphasizing distribution, interconnectedness, poly-functionality, and microscopic processing.
Uttal (2011) also charges that cognitive neuroscientists appeal to poorly defined cognitive terms: “The reference of such terms as learning, emotion, perception, and so on [is] . . . not precisely defined by either these words or the experimental context in which they arise. There is a major disconnect between our understanding of what cognitive processes are and the brain measures we try to connect to them” (pp. 367–368). Similar concerns underlie criticisms of how cognitive neuroscientists present their work to the public (Figdor, 2013). Other scholars question whether cognitive neuroscience adds any understanding to pre-existing cognitive concepts (Vidal & Ortega, 2017). The public sees cognitive neuroscience as offering a naive reductionism. When relating some cognitive function X and some brain area Y, cognitive neuroscientists imply that we can explain X with Y or that we can reduce X to Y: a false implication. Knowing (from brain imaging studies) that the medial temporal lobe relates to consolidating memories (Squire & Wixted, 2011) is quite different from knowing how the brain performs consolidation. Knowing where is not the same as knowing how.
A related issue concerns relating cognitive functions and cognitive neuroscience. Functions precede localizationism. For instance, creating target and reference tasks depends on a pre-existing functional analysis of cognitive processing. Furthermore, cognitive neuroscience cannot use neural observations to generate functional descriptions. “Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: It just cannot be done” (Marr, 1982, p. 27).
Bennett and Hacker (2003, 2013) offer one final concern about how cognitive neuroscience relates the brain to cognition. They recognize that the brain causes psychological states but argue that we make a logical error when we attribute psychological states to the brain. They claim that we can only ascribe psychological states to whole organisms and not to parts of whole organisms (e.g., the brain). In Section 5.9, I examine Bennett and Hacker’s critique in more detail.
Given the various concerns raised above, what can cognitive neuroscience contribute to cognitive psychology? Cognitive neuroscience encounters problems when trying to reduce cognition to brain processes. We can view cognitive neuroscience more pragmatically by claiming that it contributes to cognitive psychology’s functional analyses. Cognitive psychologists cannot directly observe cognitive processes, so they must infer functions from empirical observations. Cognitive neuroscience offers new observations to support functional analysis.
Cognitive neuroscientists frequently use ERPs to study attention (Woodman, 2010). ERPs can measure changes in brain activity from one millisecond to the next. We can reliably associate different ERP components with different kinds of attentional processes, such as shifting attention during a visual search. ERPs offer a further advantage: they measure attentional processing without requiring participants to shift attention away from the task in order to respond. We can also combine ERPs with measures offering better spatial resolution, such as fMRI (Heinze et al., 1994; Hopfinger et al., 2000; Mangun et al., 1998).
Functional analysis requires more than merely identifying potential functions. When we perform functional analysis, we decompose higher-order functions into organized systems of sub-functions. Cognitive neuroscience can help to guide functional decomposition, as illustrated by cognitive neuroscience’s study of human memory. Some of the earliest evidence for analyzing memory into different subsystems arose from studies of memory deficits associated with brain lesions (Scoville & Milner, 1957; Squire, 2009). More recent brain imaging studies associate different patterns of brain activity with different kinds of memory (Cabeza & Nyberg, 2000; Milner et al., 1998; Squire & Wixted, 2011). Cognitive neuroscience’s ability to guide functional decomposition does not require localizationism. We need not correlate a cognitive function with a local brain region. We only need to observe reliable differences in activation patterns, realizing that multiple brain regions might produce the differences.
Cognitive neuroscience can also aid functional decomposition by exploring domains that cognitive psychologists might not ordinarily study. For instance, we will see in Section 5.10 that cognitive psychologists rarely study emotion. However, cognitive neuroscientists have studied the relationship between emotion and memory (Eichenbaum, 2002). Thus, cognitive neuroscience can expand the typical domain of a functional analysis conducted by cognitive psychologists.
We cannot complete a functional analysis without evidence of primitive sub-functions. Because biological mechanisms bring to life the primitives of human or animal cognition, cognitive neuroscience can help to subsume a functional analysis. In all likelihood, cognitive neuroscience’s support for the subsumption of functional primitives requires a microscopic perspective of the sort favoured by Uttal (2011). We can find one example in current accounts of Hebbian learning. In his theory of cell assemblies, Hebb (1949) proposed that, if two neurons generated action potentials at the same time, then the excitatory connection between them would strengthen. His idea inspired many models of associative learning (Hinton & Anderson, 1981; Milner, 1957; Rochester et al., 1956).
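A standard textbook formulation of Hebb’s proposal (Hebb himself gave no equation) makes the rule concrete: the weight of a connection grows in proportion to the product of pre- and post-synaptic activity. The activity vectors and learning rate in the sketch below are arbitrary illustrations.

```python
# Hebbian learning in its standard textbook form: strengthen connections
# between units that are active at the same time. Values are illustrative.
import numpy as np

def hebbian_update(weights, pre, post, learning_rate=0.1):
    """Increase each weight in proportion to its pre- and post-synaptic co-activity."""
    return weights + learning_rate * np.outer(post, pre)

pre = np.array([1.0, 0.0, 1.0])    # presynaptic activity
post = np.array([1.0, 1.0])        # postsynaptic activity
weights = np.zeros((2, 3))

weights = hebbian_update(weights, pre, post)
# Only the connections whose presynaptic units were active (columns 0 and 2)
# have been strengthened; the connection from the silent unit is unchanged.
```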
We can subsume Hebbian learning. The biological phenomenon of long-term potentiation, dependent on N-methyl-D-aspartate (NMDA) receptors in hippocampal neurons, provides a biological account of Hebb’s learning rule (Bliss & Lomo, 1973; Brown, 1990; Lynch, 1986; Martinez & Derrick, 1996). NMDA receptors only permit ions to pass through membranes when both pre- and post-synaptic activity occur at the same time. Furthermore, blocking NMDA receptor sites prevents the hippocampus from establishing memories, indicating that NMDA receptors relate to learning (Morris et al., 1986). NMDA receptors explain Hebbian learning (Brown & Milner, 2003; Klein, 1999; van Hemmen & Senn, 2002).
Many controversies emerge when we treat the goal of cognitive neuroscience as reducing psychological functions to neural processes. We find fewer controversies when we view the goal as providing additional observations to support functional analysis. The methods of cognitive neuroscience complement the methods of experimental cognitive psychologists, aiding the broader effort of performing functional analyses of cognition.
5.9 Do Brains Think?
Do brains think? Almost every modern cognitive psychologist would answer “Yes.” However, other scholars would not (Bennett et al., 2007; Bennett & Hacker, 2003, 2013). Bennett and Hacker argue that brains do not think and that cognitive neuroscientists should not attribute psychological properties to the brain. Bennett and Hacker argue that brains do not think because thinking is a characteristic of whole organisms and not of their parts: “We deny that it makes sense to say that the brain is conscious, feel[s] sensations, perceives, thinks, knows or wants anything—for these are attributes of animals, not of their brains” (2013, p. 242).
Bennett and Hacker (2013) take inspiration from Ludwig Wittgenstein’s Philosophical Investigations, drawing from a particular passage in that book: “Only of a human being and of what resembles (behaves like) a living human being can one say: it has sensations; it sees, is blind; hears, is deaf; is conscious or unconscious” (1953, §281). Bennett and Hacker argue that claiming brains think provides an example of the mereological fallacy. We commit that fallacy when we attribute a property to a part that we should attribute to the whole organism.
Bennett and Hacker’s (2013) position seems to return to behaviorism. For instance, on what basis would they attribute consciousness to a whole organism? “The concept of consciousness is bound up with the behavioral grounds for ascribing consciousness to the animal” (p. 245). They say that we can only attribute psychological states to organisms when organisms display proper behaviour. Brains do not behave, and therefore we cannot say that brains have psychological properties.
The mereological fallacy challenges widely accepted views of brains and minds. Not surprisingly, the fallacy faces considerable criticism (Churchland, 2005; Dennett, 2007; Searle, 2007), some of which arises from the computer metaphor. Churchland (2005, p. 470) describes digital computers as devices “deliberately built to engage in the ‘rule-governed manipulation of complex symbols.’” Computer engineers explain how computers work by appealing to information processing. To computer designers, computers store, manipulate, and retrieve information by following formal rules. “Such talk now makes perfect sense, at least to computer scientists” (p. 470).
Bennett and Hacker respond by applying the mereological fallacy to digital computers:
It is true that we do, in casual parlance, say that computers remember, that they search their memory, that they calculate, and sometimes, when they take a long time, we jocularly say that they are thinking things over. But this is merely a façon de parler. It is not a literal application of the terms “remember,” “calculate” and “think.” Computers are devices designed to fulfil certain functions for us. We can store information in a computer, as we can in a filing cabinet. But filing cabinets cannot remember anything, and neither can computers. We use computers to produce the results of a calculation—just as we used to use a slide rule or a cylindrical mechanical calculator. Those results are produced without anyone or anything literally calculating—as is evident in the case of a slide rule or a mechanical calculator. (2013, p. 248)
However, their criticism of digital computers abandons the behaviorism that they apply to the brain by admitting that computers generate the right behaviour: “Computers are devices designed to fulfil certain functions for us.” Bennett and Hacker dismiss behavioural evidence, however, because computers do not generate behaviour in the right way:
Computers were not built to “engage in the rule-governed manipulation of symbols,” they were built to produce results that will coincide with rule-governed, correct manipulation of symbols. Further, computers can no more follow a rule than can a mechanical calculator. A machine can execute operations that accord with the rule, provided all the causal links built into it function as designed and assuming that the design ensures the generation of a regularity in accordance with the chosen rule or rules. But for something to constitute following a rule, the mere production of a regularity in accordance with the rule is not sufficient. (2013, p. 256)
What, in addition to behaviour, do Bennett and Hacker require to support the claim that computers follow rules? As they suggest,
A being can be said to be following a rule only in the context of a complex practice involving actual and potential activities of justifying, noticing mistakes and correcting them by reference to the rule, criticizing deviations from the rule, and, if called upon, explaining an action as being in accordance with the rule and teaching others what counts as following a rule. (2013, p. 256)
Similarly, “In order literally to calculate, one must have a grasp of a wide range of concepts, follow a multitude of rules that one must know, and understand a variety of operations. Computers do not and cannot” (p. 248).
Thus, when the internal causal links of a computer cause it to perform calculations, it does not perform “true” calculation. True calculation requires additional, semantic properties: grasping concepts or understanding operations.
Bennett and Hacker’s (2013) response to Churchland restates the Chinese room problem (Section 5.7). When Bennett and Hacker look inside computers, they fail to see the expected calculation processes. Similarly, when they look inside the brain, they fail to see the expected thinking processes. However, they have incorrect expectations. Functional analysis decomposes behaviour into an organized system of primitives, which themselves neither resemble nor reveal the whole system’s behaviour. Although we might describe a whole brain as understanding English, we cannot describe individual neurons in the same way (Churchland & Churchland, 1990). When we look inside a system that we can explain using functional analysis, we should not expect to see the whole behaviour. We should see instead the primitives that bring the whole behaviour into being.
Churchland’s (2005) rebuttal damages Bennett and Hacker’s (2013) position because we can explain computers using functional analysis, which dictates a computer’s design (Kidder, 1981). When a designer explains how she engineers computer behaviour, the explanation takes a different form from the one that Bennett and Hacker demand. Churchland’s critique demonstrates that ascribing properties to parts need not be problematic. Cognitivists hypothesize that we can explain thinking, and ultimately brain function, in the same way that we explain computers. If that hypothesis is true, then the mereological fallacy applies to neither computers nor brains.
One further puzzle created by Bennett and Hacker’s position concerns the purpose of cognitive neuroscience. Bennett and Hacker admit that we require brain processes for thinking to occur, and they describe cognitive neuroscience as aiming “to illuminate those mechanisms in the brain that must function normally in order for us to be able to exercise our psychological faculties, such as perception and memory” (2013, p. 1). However, the mereological fallacy dictates that we cannot accomplish the stated aim by ascribing psychological states to brain states. How, then, can cognitive neuroscience relate brain function to psychological faculties?
Functional analysis provides an answer. We explain information processing systems at different levels (e.g., computational, algorithmic, architectural, and implementational; see Section 1.7) (Dawson, 2013; Marr, 1982; Pylyshyn, 1984). We can describe electric circuits physically or as computing a complex Boolean function (Shannon, 1938). However, we expect differences between explaining a system at one level and explaining the same system at another level. The biological account of centre-surround cells in the lateral geniculate nucleus differs dramatically from the mathematical derivation of a difference of Gaussians function.
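To show what the mathematical description looks like, the sketch below builds a one-dimensional difference of Gaussians and applies it to a step edge. The filter widths and the test signal are arbitrary illustrations rather than a model of any particular cell, but the behaviour mirrors a centre-surround receptive field: a strong response at the edge and little response to uniform input.

```python
# Illustrative one-dimensional difference of Gaussians (DoG) filter applied
# to a step edge. Filter widths and the test signal are arbitrary choices.
import numpy as np

def gaussian(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

x = np.arange(-10, 11)
dog = gaussian(x, sigma=1.0) - gaussian(x, sigma=2.0)    # narrow centre, wide surround

signal = np.concatenate([np.zeros(50), np.ones(50)])     # a step edge at position 50
response = np.convolve(signal, dog, mode="same")
# `response` stays near zero in the uniform regions and deviates strongly
# only around position 50, where the intensity changes.
```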
Nevertheless, we can relate different levels to one another. Both centre-surround cells and differences of Gaussians describe edge detection. The empirical successes of cognitivism show that we can sensibly claim that brains think. However, we must expect differences between accounting for a system at the implementational level and accounting for a system at the algorithmic level. The brain mechanisms for thinking differ from thinking itself.
5.10 Which Topics Are Important to Cognitive Psychology?
In Sections 5.2 through 5.9, I discussed architectural challenges to traditional cognitive psychology from connectionism, architectural challenges from embodied cognition, and philosophical issues raised by cognitive neuroscience. In the final two sections of this chapter, I step back to consider broadly the nature of cognitive psychology. How can we define cognitive psychology? In Section 5.10, I attempt to define it by exploring how textbooks have presented the discipline in different decades. We will see that the definition of cognitive psychology seems to change over time. In Section 5.11, I propose a more general, but hopefully more lasting, definition by moving away from cognitive psychology’s topics and by moving toward its methods. I begin by exploring how cognitive psychology textbooks present the discipline as the study of specific topics and by observing changes in such topics over decades.
Which topics are important to cognitive psychology? Let us explore how textbooks have introduced students to the discipline over the past several decades. Thomas Verner Moore (1939) wrote the first book, titled simply Cognitive Psychology. Moore discussed topics often seen in modern texts: perception, imagery, memory, judgment, and reasoning. However, his book had little impact (Knapp, 1985; Surprenant & Neath, 1997). Surprenant and Neath note that other important books, aligned with more prominent schools of psychology (functionalism and behaviorism), overshadowed Moore’s text (Hilgard & Marquis, 1940; Hull et al., 1940; McGeoch, 1942; Woodworth, 1938). Moore did not spark the cognitive revolution.
Important texts appeared after the cognitive revolution. Cognition and Thought (Reitman, 1965) introduced information processing to psychologists and included an appendix on how to program computers to simulate psychological models. Cognitive Psychology (Neisser, 1967), usually described as the field’s founding text, defined cognitive psychology as the study of “all the processes by which the sensory input is transformed, reduced, elaborated, stored, recovered and used” (p. 4).
Cognitive psychology texts became more common in the 1970s. An Introduction to Cognitive Psychology (Manis, 1971) discussed topics ranging from learning and memory to cognitive consistency and social judgment. Cognitive Psychology: The Study of Knowing, Learning and Thinking (Anderson, 1975) placed cognitive psychology into an idiosyncratic context of cybernetics, systems theory, and control theory. The variety of topics covered by early texts suggests that a unified understanding of cognitive psychology had not yet emerged. However, by the late 1970s, cognitive psychology texts had become more standardized and adopted an organization still seen in modern books (Reynolds & Flagg, 1977; Solso, 1979).
For example, Cognitive Psychology (Reynolds & Flagg, 1977) begins by placing the cognitive approach in a historical context. Early chapters discuss peripheral processes (sensory memory, pattern recognition), middle chapters describe memory, and final chapters focus on language. The cognitive psychology textbooks of the 1980s elaborate Reynolds and Flagg’s organization by adding later chapters on higher-order processing, such as problem solving, reasoning, and judgment and decision making (Anderson, 1980, 1985; Dodd & White, 1980; Reed, 1982, 1988). From the 1990s on, such organization becomes the norm, although more modern texts also include an early chapter on neuroscience.
To gain insight into the contents of cognitive psychology textbooks, as well as into the changes in contents over time, let us explore the contents of several books. We can examine chapter titles and lengths, and classify each chapter as covering a general topic, adopting a methodology similar to earlier analyses of cognitive psychology texts (Lewandowsky & Dunbar, 1983; Marek & Griggs, 2001). I consider two texts from the 1960s (Neisser, 1967; Reitman, 1965); five from the 1970s (Bourne et al., 1979; Lachman et al., 1979; Manis, 1971; Reynolds & Flagg, 1977; Solso, 1979); six from the 1980s (Anderson, 1980, 1985; Dodd & White, 1980; Reed, 1982, 1988; Solso, 1988); and six from the 1990s (Haberlandt, 1994; Kellogg, 1995; Martindale, 1991; Medin & Ross, 1992; Reed, 1996; Solso, 1995). The remaining 12 appeared in 2000 or later (Anderson, 2000, 2020; Braisby & Gellatly, 2012; Eysenck & Keane, 2020; Farmer & Matlin, 2019; Goldstein, 2011, 2015; Groome, 2014; McBride & Cutting, 2019; Reisberg, 2013, 2018; Sinnett et al., 2016).
I process each book as follows. First, I record the title of each chapter and the total number of pages. Second, I examine the contents of each chapter and then classify the chapter as presenting one of the 22 finer-detailed topics in the left column of Table 5-1. When I could conceivably assign a chapter to more than one category, I use only one category (e.g., I code “language acquisition” as “language,” not as “learning”); I use a consistent coding. Third, I calculate the total number of pages for each category by summing up the number of pages for all chapters belonging to that category.
After I code each book, I collapse topics into a coarser set of categories by combining several finer categories to define a more general category. For instance, category 5 in the coarser scheme (“Problem Solving, Reasoning”) was created by combining four finer categories (“Problem Solving,” “Reasoning,” “Judgment, Decision Making,” “Intelligence, Creativity”). The layout of Table 5-1 shows the finer categories that I combined into each coarser category. The coarser coding scheme is very similar to the one used by Marek and Griggs (2001). The processing represents each textbook in terms of the number of pages devoted to both the finer and the coarser topics listed in Table 5-1. I convert the number of pages into proportions by dividing each by the total number of pages in a text.
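The sketch below illustrates this bookkeeping for a single hypothetical textbook. The chapter titles, page counts, and the partial fine-to-coarse mapping are stand-ins for the full scheme in Table 5-1, not data from any of the texts analyzed here.

```python
# Sketch of converting chapter page counts into topic proportions for one
# hypothetical textbook. Titles, page counts, and the mapping are illustrative.
chapters = [
    {"title": "Pattern Recognition", "pages": 32, "finer": "Perception"},
    {"title": "Short-Term Memory",   "pages": 40, "finer": "Primary Memory"},
    {"title": "Long-Term Memory",    "pages": 48, "finer": "Secondary Memory"},
    {"title": "Language",            "pages": 35, "finer": "Language"},
]

coarser_of = {                       # partial illustration of Table 5-1's mapping
    "Perception": "Perception, Attention, Consciousness",
    "Primary Memory": "Memory",
    "Secondary Memory": "Memory",
    "Language": "Language",
}

total_pages = sum(chapter["pages"] for chapter in chapters)
proportions = {}
for chapter in chapters:
    coarse = coarser_of[chapter["finer"]]
    proportions[coarse] = proportions.get(coarse, 0) + chapter["pages"] / total_pages

# `proportions` now holds the share of the book devoted to each coarser topic,
# here roughly 0.21 for perception, 0.57 for memory, and 0.23 for language.
```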
This textbook representation permits us to see easily which topics are important to cognitive psychology, which topics are more important than others, and how topic coverage changes over time. For example, Figure 5-5 presents a treemap of the text contents from Table 5-1’s coarser topics. We can create the treemap by averaging the proportion of pages devoted to a topic over the books representing four different time periods: the 1970s, 1980s, 1990s, and 2000s (any book published after 2000). A treemap represents proportions hierarchically. Figure 5-5’s first level in the hierarchy represents book topics; the second level organizes topics by book eras.
Table 5-1 The finer and coarser topic categories used to classify textbook chapters.

Finer Topics | Coarser Topics |
---|---|
1. Foundations, History | 1. Foundations, History |
2. Neuroscience | 2. Neuroscience, Physiology |
3. Sensation | 3. Perception, Attention, Consciousness |
4. Attention | |
5. Perception | |
6. Consciousness | |
7. Primary Memory | 4. Memory |
8. Secondary Memory | |
9. Levels of Processing | |
10. Representational Format | |
11. Mental Imagery | |
12. Problem Solving | 5. Problem Solving, Reasoning |
13. Reasoning | |
14. Judgment, Decision Making | |
15. Intelligence, Creativity | |
16. Language | 6. Language |
17. Development | 7. Development |
18. Learning | 8. Other |
19. Emotion | |
20. Social | |
21. Models, Simulation | |
22. Other | |
In Figure 5-5, I represent the upper part of the hierarchy (the set of eight topics from the coarse coding scheme) using a large rectangle of a uniform colour. For instance, the large white rectangle at the top left of the treemap represents the topic “Memory.” The rectangle’s size represents the proportion of pages in texts that we can classify as covering “Memory.” I represent the hierarchy’s next level (book era) by dividing a large rectangle into components. For instance, I divide the large white rectangle for the topic “Memory” into four smaller rectangles. I label each smaller rectangle by era; each rectangle’s size represents a topic’s coverage by a subset of books (i.e., all books belonging to the same era).
Figure 5-5 A treemap of how cognitive psychology textbooks cover the coarse set of topics from Table 5-1. The books belong to the 1970s (1970–1979), the 1980s (1980–1989), the 1990s (1990–1999), or the 2000s (2000–2020).
When we inspect the large rectangles in Figure 5-5, we find that the topic “Memory” receives the most coverage, because the rectangle for “Memory” has the largest area. The next most covered topics are “Perception,” “Language,” and “Thinking,” each of which has roughly equal coverage. We find less coverage for “Foundations” and “Other.” “Neuroscience” has little coverage, followed by “Development.”
We can also see, from Figure 5-5, how topic coverage changes over the four different eras. “Memory” and “Perception” receive roughly equal coverage across all four eras, but “Language” receives less coverage beginning in 2000. “Neuroscience” receives most of its coverage after 2000 and little coverage prior to the 1990s. The coverage of “Development” has decreased since the 1980s. “Thinking” receives more treatment after the 1980s. The 1980s stand out for having less coverage of “Foundations.”
We should also consider topics absent from Figure 5-5 (unless they belong to “Other”). Social cognition has its own textbooks (Fiske & Taylor, 2020; Kunda, 1999) and receives little coverage in cognitive psychology texts. The same is true for comparative cognition, a field absent from Figure 5-5 but covered by its own texts (Menzel & Fischer, 2012; Olmstead & Kuhlmeier, 2015; Shettleworth, 2013). Figure 5-5 reveals a discipline that focuses on cognition in individual adult humans. The figure also reveals that the coverage of unique topics (“Other”) decreases after 2000, suggesting a growing uniformity of topic coverage in modern texts.
We can consider the similarities between individual texts in more detail by computing correlations between texts. We can calculate each correlation using the two texts’ page proportions for the finer set of topics from Table 5-1. (This analysis excluded an extreme outlier, the Reitman [1965] text, which produced negative correlations with all other texts.)
We can use our correlations to conduct a multi-dimensional scaling (MDS) analysis. MDS, a statistical tool, positions different objects in a map. MDS places similar objects near one another and dissimilar objects farther apart. Figure 5-6 plots a three-dimensional MDS solution derived from the textbook correlations. The MDS solution provides an excellent fit to the data: the distances among the books in Figure 5-6 correlate 0.975 with the original correlations. What does Figure 5-6 reveal about the relationships among individual textbooks? Each dimension of the graph represents a different combination of topics.
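The sketch below shows one way to carry out such an analysis in code. The topic-proportion matrix is random placeholder data, and converting correlations into dissimilarities as 1 - r is a common choice rather than a record of the analysis behind Figure 5-6.

```python
# Sketch of the correlation-and-scaling analysis: correlate textbook topic
# profiles, convert to dissimilarities, and scale into three dimensions.
# The profile matrix and the 1 - r conversion are illustrative assumptions.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
profiles = rng.random((30, 22))            # stand-in for books x 22 finer topics

correlations = np.corrcoef(profiles)       # book-by-book correlations
dissimilarities = 1 - correlations         # similar books -> small distances

mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
coordinates = mds.fit_transform(dissimilarities)
# Each row of `coordinates` places one textbook in a three-dimensional space
# like the one plotted in Figure 5-6; nearby books cover similar topics.
```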
The first dimension (“Perception/Neuroscience vs Language/Memory”) arranges books in terms of their combined treatment of four different topics. Books having a more positive position along this dimension (Braisby & Gellatly, 2012; Goldstein, 2011; Groome, 2014) have more coverage of both perception and neuroscience and less coverage of both language and memory. In contrast, books having a more negative position along this dimension (Anderson, 2020; Haberlandt, 1994; Reed, 1996) have less coverage of both perception and neuroscience and more coverage of both language and memory.
We can provide a similar account for the second dimension (“Language/Neuroscience vs Memory/Problem Solving”), which provides the y axis of the top plot, and the x axis of the bottom plot, of Figure 5-6. Books with a more positive position along this dimension (Dodd & White, 1980; Goldstein, 2011, 2015) have more coverage of both language and neuroscience and less coverage of both memory and problem solving. In contrast, books with a more negative position along this dimension (Lachman et al., 1979; Manis, 1971; Neisser, 1967; Reynolds & Flagg, 1977) have less coverage of both language and neuroscience and more coverage of both memory and problem solving.
Figure 5-6 A plot of the three-dimensional MDS solution for correlations among textbooks based upon the 22 finer topics from Table 5-1. The top plot uses the first and second dimensions as the coordinates of the books. The bottom plot uses the second and third dimensions as the coordinates of the books.
We can also provide a similar account for the third dimension (“Neuroscience/Memory vs Problem Solving/Language”), which provides the y axis of the bottom plot of Figure 5-6. Books with a more positive position along this dimension (McBride & Cutting, 2019; Reisberg, 2013, 2018) have more coverage of both neuroscience and memory and less coverage of both problem solving and language. In contrast, books with a more negative position along this dimension (Dodd & White, 1980; Goldstein, 2011; Manis, 1971) have less coverage of both neuroscience and memory and more coverage of both problem solving and language.
Which topics are important to cognitive psychology? Figure 5-5 indicates that the core topics are memory, followed by thinking, perception, and language. However, a topic’s importance changes over time. For instance, modern texts, but not earlier texts, have high coverage of neuroscience. Figure 5-6 indicates that different textbooks emphasize different topic combinations. Books with more coverage of neuroscience and perception, or of neuroscience and memory, have less coverage of language and problem solving.
In summary, cognitive psychology texts cover similar topics but still differ noticeably from one another. The average correlation used for the MDS analysis is 0.431. A correlation of that size is large enough to suggest a strong commonality of topics in texts through the decades, yet small enough to suggest great variability of topic coverage. Some of this variation suggests that the definition of cognitive psychology has changed over time, as I discuss in Section 5.11.
5.11 What Is Cognitive Psychology?
What is cognitive psychology? To answer that question, we might consider textbook definitions. The definition of cognitive psychology has evolved over the decades. Moore (1939, p. v) provided the first textbook definition of cognitive psychology: “Cognitive psychology is the branch of general psychology which studies the way in which the human mind receives impressions from the external world and interprets the impressions thus received.” Moore’s definition highlights a common theme of cognitive psychology’s later reaction to behaviorism: replacing the passive responder with the active information processing agent.
Neisser (1967) defines cognitive psychology in two parts. He first restates Moore’s (1939) definition but introduces information processing ideas: “The term ‘cognition’ refers to all the processes by which the sensory input is transformed, reduced, elaborated, stored, recovered, and used” (p. 4). Neisser then lists cognitive psychology’s prototypical topics: “Such terms as sensation, perception, imagery, retention, recall, problem-solving, and thinking, among many others, refer to hypothetical stages or aspects of cognition” (p. 4). Let us call his definition the information processing definition.
We find that definition in many cognitive psychology textbooks published after Neisser’s (1967) (Anderson, 1980; Haberlandt, 1994; Reed, 1982, 1988, 1996; Reynolds & Flagg, 1977). For example, Reynolds and Flagg write that “cognitive psychology is defined partly by what it does (the information processing approach to be described shortly) and by its subject matter, the higher mental processes” (p. 11).
Importantly, an alternative, broader definition appears in more modern texts. It places less emphasis on information processing and more emphasis on research methods or approaches. Let us call this the methodological definition. One example is provided by Eysenck (2020, p. 37):
Cognitive psychology used to be unified by an approach based on an analogy between the mind and the computer. This information-processing approach viewed the mind as a general-purpose, symbol processing system of limited capacity. Today there are four main approaches to human cognition: cognitive psychology, cognitive neuropsychology, cognitive neuroscience, and computational cognitive science. These four approaches are increasingly combined to provide an enriched understanding of human cognition.
We can find Eysenck’s methodological definition in several modern textbooks (Farmer & Matlin, 2019; Groome, 2014; McBride & Cutting, 2019). That definition includes other fields’ contributions to studying cognition. Cognitive neuropsychology studies deficits in cognitive performance associated with brain injuries. Cognitive neuroscience uses brain imaging techniques to explore the relationship between normal brain function and cognition. Computational cognitive science produces computer models of cognitive phenomena. The methodological definition modernizes cognitive psychology by including the latest methodologies.
Unfortunately, the methodological definition fails to consider any overarching approach to cognition. Defining cognitive psychology only in terms of methodological approaches seems to be too inclusive. For example, Skinner (1957) studied a core cognitive topic, language, but his theory was not cognitive (Chomsky, 1959), and Skinner (1977) was not a cognitive psychologist. Modern simulations, such as deep belief networks, successfully solve many tasks involving images or language (LeCun et al., 2015), but they do not produce cognitive theory. The cognitive neuroscience promoted by some (Bennett & Hacker, 2003, 2013) provides details about brain processes but excludes psychological terms. Such examples conform to the methodological definition but do not belong to cognitive psychology.
A better definition must include the nature of cognition. It must also include the kind of explanations that cognitive psychologists seek. For instance, cognitive psychology is the branch of general psychology which explains psychological phenomena by using functional analysis to describe information processing. This definition appeals to a theory about cognition (information processing) and refers to the type of explanation sought (i.e., functional analyses). Neither the information processing hypothesis nor the practice of conducting functional analysis restricts the variety of theories or topics characterized by the definition.
Furthermore, by emphasizing the kind of explanation that cognitive psychologists seek, the definition includes Eysenck’s (2020) four approaches in cognitive psychology only when they contribute to functional analysis. For instance, the definition includes computer simulations only when researchers compare them with human performance in the search for strong equivalence. Similarly, the definition includes studies from cognitive neuropsychology or cognitive neuroscience only when they guide functional decomposition or provide evidence for causal subsumption.
Some might argue that this candidate definition excludes too much from cognitive psychology. But cognitive psychology proceeds by adopting a strong theory—the information processing hypothesis—and then by exploring the topics that the hypothesis can explain (Pylyshyn, 1980, 1984). “It is no less true of cognitive science than of other fields that we start off with the clear cases and work out, modifying our view of the domain of the theory as we find where the theory works” (Pylyshyn, 1980, p. 119; italics added). The theoretical perspective persists while cognitive psychology’s topics and methods evolve.