“Chapter 7: Classifying Extended Tetrachords” in “Connectionist Representations of Tonal Music”
7
Classifying Extended Tetrachords
7.1 Extended Tetrachords
7.1.1 Extended Chords
Chapter 6 described a multilayer perceptron for classifying four different types of tetrachords, and detailed its internal structure. In this chapter, I turn to a more complicated musical problem, one that involves a larger set of different types of tetrachords. Because this problem is more complex, the multilayer perceptron that solves it requires more hidden units. However, these hidden units also organize inputs into a variety of strange circles that assist in interpreting the network’s internal structure. The four tetrachords explored in Chapter 6 were all examples of added note tetrachords. That is, each tetrachord started as a triad built from three different notes that belonged to a musical scale. I created a tetrachord by adding a fourth note, which also belonged to the scale, to the triad.
Figure 7-1 Musical notation for 12 different types of tetrachords, each using C as the root note.
A different approach to building tetrachords produces a greater variety of chord types. One begins with a triad formula. For instance, if one takes the first, third, and fifth notes of the C major scale (C, E, G), the result is the C major triad. Therefore, the formula for the C major triad is 1-3-5. Adding the seventh note of the scale, B, produces the C major seventh tetrachord, which follows the formula 1-3-5-7.
More chords can be created by manipulating formulae similar to the one provided in the previous paragraph. For instance, one could flatten the third and the seventh notes in the formula 1-3-5-7. This produces the formula 1-♭3-5-♭7; if C is the root then this formula produces the set of notes [C, E♭, G, B♭], which defines the C minor seventh tetrachord. Note that the flattened third and seventh notes do not belong to the C major scale.
In jazz, one often finds extended chords that use formulae that add notes that fall beyond the octave range of a major scale. For example, if one adds the D that is an octave higher than the second note in the C major scale to the C major triad, then one produces the Cadd9 tetrachord [C, E, G, D]. The formula for this chord is 1-3-5-9.
Figure 7-1 provides the musical notation, and the musical chord symbol, for 12 different types of tetrachords. Each of these example tetrachords uses C as the root note of the chord. We saw four of these tetrachord types earlier in Chapter 6. The other eight are new; Table 7-1 provides the formula for each.
Table 7-1. The names and formulas for twelve different types of tetrachords.
Tetrachord type | Formula | Example and notation | Forte number |
Major seventh | 1-3-5-7 | Cmaj7 | 4-20(12) |
Dominant seventh | 1-3-5-♭7 | C7 | 4-27 |
Minor, major seventh | 1-♭3-5-7 | Cm(maj7) | 4-19 |
Sixth | 1-3-5-6 | C6 | 4-26(12) |
Minor sixth | 1-♭3-5-6 | Cm6 | 4-27 |
Seventh, flat five | 1-3-♭5-♭7 | C7flat5 | 4-25(6) |
Minor seventh | 1-♭3-5-♭7 | Cm7 | 4-26(12) |
Augmented seventh | 1-3-♯5-♭7 | Caug7 | 4-24(12) |
Diminished seventh | 1-♭3-♭5-♭♭7 | Cdim7 | 4-28(3) |
Added ninth | 1-3-5-9 | Cadd9 | 4-22 |
Minor added ninth | 1-♭3-5-9 | Cm(add9) | 4-14 |
Seventh, suspended fourth | 1-4-5-♭7 | C7sus4 | 4-23(12) |
Note. An example of each chord is provided in Figure 7-1. The final two columns provide the notation of an example chord that belongs to the type, as well as the classification number for the chord type from Forte’s (1973) set theory.
The formulae provided in Table 7-1 work in the context of any major scale. The numbers in each formula refer to a note’s position in a particular scale. That is, 1 is the first note in a particular scale, 3 is the third note in a particular scale, and so on. This means that there are 12 different versions of each of the chord types listed in Table 7-1: one for each of the 12 possible major scales.
When I use these formulae to create tetrachords in different keys, some interesting relationships between chords arise. Consider the 6 chord whose formula is 1-3-5-6. In the context of the C major scale this produces the C6 chord whose notes are [C, E, G, A]. Now consider applying the formula for the minor seventh tetrachord (1-♭3-5-♭7) in the context of the A major scale. This produces the Am7 chord whose notes are [A, C, E, G]. Note that these notes are identical to those of C6; musically speaking, Am7 is identical to an inversion of C6. Similarly, the dominant seventh chord is the inversion of a minor sixth tetrachord in a different key.
In other words, the same set of four pitch-classes can have more than one chord name. If I train a network to identify tetrachord types, then it must generate both of these chord names for one set of four input pitch-classes. Table 7-1 also provides the Forte numbers of each of these chord types. Forte numbers are a system for classifying different musical entities that is derived from mathematical set theory (Forte, 1973). Note that different tetrachord names for the same set of input pitch-classes have the same Forte number in Table 7-1, indicating that the chords have the same basic structure in spite of the fact that they have different names.
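The way a formula turns into a set of pitch-classes can be sketched in a few lines of Python. This is a minimal illustration and not code from the original study: the names `MAJOR_SCALE`, `degree_to_semitones`, and `chord_pitch_classes` are hypothetical, flats are written as an ASCII `b`, and all black-key note names are spelled with sharps.

```python
# Semitone offsets of the seven major-scale degrees above the root (degree 1 = 0).
MAJOR_SCALE = {1: 0, 2: 2, 3: 4, 4: 5, 5: 7, 6: 9, 7: 11}
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def degree_to_semitones(token):
    """Convert one formula token such as '3', 'b3', 'bb7', '#5', or '9' to semitones."""
    accidentals = token.rstrip("0123456789")
    degree = int(token[len(accidentals):])
    shift = accidentals.count("#") - accidentals.count("b")
    # Degrees above 7 (e.g. the 9 in add9) fold back into a single octave,
    # which is what a pitch-class representation does.
    base = MAJOR_SCALE[(degree - 1) % 7 + 1]
    return (base + shift) % 12


def chord_pitch_classes(root, formula):
    """Return the set of pitch-classes (0 = C ... 11 = B) for a root and a formula."""
    root_pc = NOTE_NAMES.index(root)
    return frozenset((root_pc + degree_to_semitones(t)) % 12
                     for t in formula.split("-"))


c6 = chord_pitch_classes("C", "1-3-5-6")
am7 = chord_pitch_classes("A", "1-b3-5-b7")
print(sorted(c6), sorted(am7), c6 == am7)
```

Under these assumptions C6 and Am7 reduce to the same pitch-class set (C, E, G, A), which is the coincidence of chord names described above.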
When I train a multilayer perceptron to classify the 12 different types of tetrachords in Table 7-1, I will again use pitch-class representation. Because of this, notes in extended chords like the added ninth chord are moved back into the range of a single octave. It is therefore useful to represent the various tetrachords in a visual format. One can illustrate a tetrachord in a circle of minor seconds by drawing in four spokes that represent the four pitch-classes present in a particular chord. Drawing such a diagram will illustrate a particular chord in the context of a specific major key. However, this diagram represents the structure of a tetrachord type for any key: if one rigidly rotates the spokes to a different position in the circle, then it will provide the notes for the same type of tetrachord, but relative to some other musical key. Figure 7-2 provides pitch-class diagrams for the 12 tetrachords from Figure 7-1.
Figure 7-2 Pitch-class diagrams of the 12 tetrachords from the score in Figure 7-1.
All of the chords presented in Figure 7-2 are defined with respect to the C major scale. The structure of the spokes in the diagrams provides an interesting perspective on the similarities and differences between various tetrachord types. For instance, it is immediately apparent that both the diminished tetrachord and the seventh flattened fifth tetrachord include two pairs of notes that belong to the same circle of tritones, because both diagrams include two long spokes that bisect the circle. Similarly, one can see the similarity in spoke structure between the minor seventh and the sixth tetrachords, as well as between the seventh and the minor sixth tetrachords. In the next section, we will describe training a multilayer perceptron to identify these 12 different types of tetrachords.
7.2 Classifying Extended Tetrachords
7.2.1 Task
Our goal is to train an artificial neural network, when presented with four notes that define a tetrachord, to identify the type of tetrachord, ignoring the tetrachord’s key. The difference between the current network and those described in Chapter 6 is that the current network learns to classify input chords into 12 different categories instead of only four. After being trained on this task the multilayer perceptron typically turns one output unit “on” to identify tetrachord type, and turns the remaining 11 output units “off,” when presented a tetrachord. The exception to this occurs when two different tetrachord types (e.g., 6 and m7) apply to the same four input pitch-classes. In this situation, the network turns on both of the appropriate output units, and turns the remaining ten output units off.
Figure 7-3 The architecture of the multilayer perceptron trained to identify 12 different types of tetrachords.
7.2.2 Network Architecture
Figure 7-3 illustrates the architecture of the current network. It uses 12 input units to represent input pitch-classes. It requires 12 output units to identify all of the tetrachord types from Figure 7-1. The network requires seven hidden units to find a solution to the extended tetrachord problem. All of the output units and all of the hidden units in the network are value units.
7.2.3 Training Set
The training set consists of 144 stimuli: the 12 different tetrachords that can be created in the context of a particular major scale (see Figure 7-1), created for each of the 12 different major scales. Each is encoded as an input pattern in which four input units are activated with a value of one, and the remaining eight input units are all activated with a value of zero. I pair each input pattern with an output pattern that indicates the tetrachord type to which the input pattern belongs. I train the network to turn on the output units that represent the input pattern’s type(s), and to turn all other output units off.
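The encoding just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the original training files: the interval tuples are the Table 7-1 formulae converted to semitones above the root (with the augmented seventh assumed to use a raised fifth, as in Caug7), and the dictionary `TYPES`, the function `build_training_set`, and the order of the output units are all hypothetical.

```python
# Semitones above the root for each tetrachord type (assumed from Table 7-1).
TYPES = {
    "maj7": (0, 4, 7, 11), "7": (0, 4, 7, 10), "m(maj7)": (0, 3, 7, 11),
    "6": (0, 4, 7, 9), "m6": (0, 3, 7, 9), "7b5": (0, 4, 6, 10),
    "m7": (0, 3, 7, 10), "aug7": (0, 4, 8, 10), "dim7": (0, 3, 6, 9),
    "add9": (0, 2, 4, 7), "m(add9)": (0, 2, 3, 7), "7sus4": (0, 5, 7, 10),
}


def build_training_set():
    """Return (type names, list of (type, root, input vector, target vector))."""
    names = list(TYPES)
    # First pass: record every type name that can produce a given pitch-class set.
    labels = {}
    for name, intervals in TYPES.items():
        for root in range(12):
            pcs = frozenset((root + i) % 12 for i in intervals)
            labels.setdefault(pcs, set()).add(name)
    # Second pass: one pattern per (type, root) pair, 12 x 12 = 144 in total.
    patterns = []
    for name, intervals in TYPES.items():
        for root in range(12):
            pcs = frozenset((root + i) % 12 for i in intervals)
            inputs = [1 if pc in pcs else 0 for pc in range(12)]
            targets = [1 if n in labels[pcs] else 0 for n in names]
            patterns.append((name, root, inputs, targets))
    return names, patterns


names, patterns = build_training_set()
print(len(patterns), "patterns;",
      sum(1 for p in patterns if sum(p[3]) == 2), "have two output units on")
```

In this sketch the only patterns with two target units on are the sixth and minor seventh chords, which share pitch-class sets as noted in Section 7.1.1.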
7.2.4 Training
The multilayer perceptron in Figure 7-3 is trained with the generalized delta rule developed for networks of value units (Dawson & Schopflocher, 1992) using the Rumelhart software program (Dawson, 2005). During a single epoch of training each pattern is presented to the network once; the order of pattern presentation is randomized before each epoch.
All connection weights in the network are set to random values between −0.1 and 0.1 before training begins. In the network described in detail below, each µ is initialized to zero but is then modified by training. A learning rate of 0.01 is employed. Training proceeds until the network generates a “hit” for every output unit for each of the 144 patterns in the training set. Again, a “hit” is defined as activity of 0.9 or higher when the desired response is one or as activity of 0.1 or lower when the desired response is zero. A network that contains seven hidden value units solves this problem readily, typically converging after between 7000 and 10,000 epochs of training. The example network described in more detail in the next section converges after 7236 epochs of training.
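The stopping rule can be written down directly. The sketch below only restates the “hit” criterion from the paragraph above; the function names `is_hit` and `has_converged` are hypothetical and are not part of the Rumelhart program.

```python
HIT_ON = 0.9   # minimum activity counted as a hit when the target is one
HIT_OFF = 0.1  # maximum activity counted as a hit when the target is zero


def is_hit(activity, target):
    """Return True if one output unit's activity counts as a hit for its target."""
    if target == 1:
        return activity >= HIT_ON
    return activity <= HIT_OFF


def has_converged(outputs, targets):
    """Return True when every output unit is a hit for every pattern.

    `outputs` and `targets` are lists of equal-length rows, one row per pattern.
    """
    return all(
        is_hit(activity, target)
        for out_row, target_row in zip(outputs, targets)
        for activity, target in zip(out_row, target_row)
    )


print(has_converged([[0.95, 0.02], [0.05, 0.91]], [[1, 0], [0, 1]]))
print(has_converged([[0.95, 0.02], [0.30, 0.91]], [[1, 0], [0, 1]]))
```

Training would stop at the first epoch for which `has_converged` holds across all 144 patterns.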
7.3 Interpreting the Extended Tetrachord Network
This section provides an analysis of the connection weight structure of each of the hidden units in the trained network. This analysis reveals a number of interesting musical regularities in this network’s structure. However, it is very detailed. The reader who is less interested in these details, and more interested in a general summary of these results, will find that summary in Section 7.4.
7.3.1 Jittered Density Plots
The extended tetrachord network is the most complicated one that we have encountered in this book. This is because it has seven hidden units, making it very difficult to orient an interpretation by graphing the hidden unit space. For this reason, I will begin to interpret the network by examining two different characteristics of each hidden unit: the weights of the connections that feed into a hidden unit and the activity produced by the hidden unit when it is presented each of the 144 input patterns.
With respect to patterns of connectivity, each of the hidden units organizes input pitch-classes into some of the strange circles from Chapter 6. This is particularly helpful for interpreting this more complicated network. Instead of considering the effect of the 12 different pitch-classes on the hidden unit, we can consider smaller sets of pitch-classes that are treated as being equivalent. For example, we will see that an account of Hidden Unit 1’s role in the network can be achieved by considering input pitch-classes as belonging to one of the two circles of major seconds, or as belonging to one of the six circles of tritones.
With respect to hidden unit activity, I will take advantage of a characteristic that is frequently exhibited by value units (Berkeley et al., 1995), although in some cases it may be found in other types of processors (Berkeley & Gunay, 2004). When the activities of a hidden value unit are graphed using a jittered density plot, this plot is often organized into different bands. Each band contains a subset of input patterns that share certain properties which, when identified, help in understanding the features being detected by the hidden unit. Let us describe the general use of banded jittered density plots in more detail before using them to interpret the extended tetrachord network.
A jittered density plot can be thought of as a one-dimensional scatter plot. Consider producing a jittered density plot for the activities generated by one hidden unit to each of the patterns of a training set. Each pattern is represented by one dot in the plot. The position of the dot along the x-axis of the graph provides the activity produced in the hidden unit by that pattern. The position of the dot along the y-axis is a random number that has no meaning; this random jittering reduces, as far as possible, the overlap between different dots in the plot.
An example jittered density plot for Hidden Unit 1 of the current network is provided in Figure 7-4 below. Note that the x-axis ranges from zero to one, because this is the activity range of a value unit. There are 144 different dots in this plot, one for each of the 144 tetrachords in the training set.
Figure 7-4 The jittered density plot for Hidden Unit 1 in the extended tetrachord network.
Berkeley et al. (1995) discovered that in many cases the jittered density plots of hidden value unit activities organize themselves into distinct bands. This is true of the jittered density plot in Figure 7-4. It is organized into three different bands: in Band A, 24 of the input patterns generate zero activity in this unit; in Band B, 48 of the patterns generate activity that ranges between 0.11 and 0.20, and in Band C, the remaining 72 patterns generate activity between 0.99 and 1.
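A jittered density plot and its bands are easy to mimic in code. The sketch below is only illustrative: the activities are synthetic values spread over the three ranges reported for Figure 7-4, not the network’s real activities, and the gap threshold `min_gap` and the function names are assumptions made for this example.

```python
import random


def jitter_points(activities, seed=0):
    """Pair each activity (x) with a meaningless random height (y)."""
    rng = random.Random(seed)
    return [(a, rng.random()) for a in activities]


def find_bands(activities, min_gap=0.05):
    """Group sorted activities into bands separated by gaps wider than min_gap."""
    bands = []
    for a in sorted(activities):
        if bands and a - bands[-1][-1] <= min_gap:
            bands[-1].append(a)
        else:
            bands.append([a])
    return bands


# Synthetic stand-in for Hidden Unit 1: 24 patterns at zero, 48 spread over
# 0.11-0.20, and 72 spread over 0.99-1.00.
activities = (
    [0.0] * 24
    + [0.11 + 0.09 * i / 47 for i in range(48)]
    + [0.99 + 0.01 * i / 71 for i in range(72)]
)

points = jitter_points(activities)
for band in find_bands(activities):
    print(f"{len(band):3d} patterns between {band[0]:.2f} and {band[-1]:.2f}")
```

Once the bands are separated in this way, the patterns in each band can be pulled out and inspected for shared features.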
Berkeley et al. (1995) discovered that patterns that belong to the same band in a jittered density plot share certain properties. By examining the characteristics of just the subset of patterns that fall into one band, one can interpret the features they share and use these features to determine the unit’s function in the network (Dawson et al., 1997; Dawson et al., 2000b; Dawson & Piercey, 2001). Figure 7-4 demonstrates that distinct banding is present when the activities of one of the extended tetrachord network’s hidden units are graphed in a jittered density plot. Fortunately for us, banding is present for almost all of the hidden units of this network. I will take advantage of this banding by taking just those input patterns that fall into a particular band and determining what features these tetrachords have in common. Furthermore, this interpretation is informed by our understanding of the strange circles found in the connection weights in each hidden unit. Together these two properties will lead to a detailed understanding of the internal structure of the extended tetrachord network. As was the case earlier in Section 6.6, I will discuss the hidden units out of order so that the units that are easier to interpret are described before those that are more complicated.
7.3.2 Hidden Unit 1
Figure 7-5 provides a graph of the connection weights that feed into Hidden Unit 1 from the 12 input pitch-class units, combined with the jittered density plot from Figure 7-4. It is obvious from Figure 7-5 that this hidden unit organizes input signals in terms of strange circles.
First, all of the positive weights come from pitch-classes that belong to one of the circles of major seconds, and all of the negative weights come from pitch-classes that belong to the other circle of major seconds. Second, if one examines the set of six positive weights, then it becomes apparent that there is some variation in strength. This variation occurs because this hidden unit assigns the identical weight to pairs of pitch-classes that belong to the same circle of tritones. This variation in weights permits the hidden unit to distinguish one circle of tritones from another. This is also true of the six negative weights.
What does this hidden unit detect? To begin, let us note that at the end of training this unit’s µ has a value of −0.01, indicating that it turns on when it receives a near-zero net input. With this fact in mind, and recognizing that Hidden Unit 1 appears to use equivalence classes involving circles of major seconds and circles of tritones, let us consider the patterns that fall into each of the three bands of the jittered density plot.
First, consider the subset of patterns that belong to Band A in Figure 7-5. There are only two types of tetrachords in this subset: all of the aug7 chords and all of the 7♭5 chords. What do these tetrachords have in common? Each chord includes four pitch-classes that all belong to only one of the circles of major seconds. As a result, all four of the signals sent to Hidden Unit 1 by one of these chords pass through weights that all have the same sign. These signals cannot cancel one another out; the hidden unit will receive either an extreme positive or an extreme negative net input which causes it to turn off because of its near zero µ.
Now consider the subset of patterns that fall into Band C in Figure 7-5. These patterns consist of all the 6, 7sus4, dim7, m (add9), m7, and maj7 tetrachords. What does this large collection of different types of chords have in common?
Figure 7-5 The connection weights and the jittered density plot for Hidden Unit 1.
First, all of these tetrachords have two pitch-classes that belong to one circle of major seconds and two others that belong to the other circle of major seconds. This permits the signals sent from these chords to cancel each other out, producing a near-zero net input, and turning Hidden Unit 1 on. Second, the tetrachords that belong to this band (with the exception of the º7 chords, which are a special case) include pitch-classes that each belong to a different circle of tritones. As a result, four different circles of tritones are represented in each chord. For any chord, which four circles of tritones are represented is important: two of the sampled circles have negative weights, while the other two have positive weights.
As a result, one finds in these tetrachords two specific patterns of tritone sampling. These are illustrated in Figure 7-6A and Figure 7-6B. In these figures, each tritone circle is a line that bisects the pitch-class diagram; there are six in each figure. Tritone circles that have a pitch-class that belongs to the pitch-classes of these tetrachords are represented as solid lines; dashed lines indicate tritone circles that are not represented. In the first pattern exhibited by the chords that belong to Band C (Figure 7-6A), the tetrachords contain pitch-classes from four adjacent tritone circles. Note that because weights in the network are organized by circles of major seconds, two negative and two positive weights are involved in these chords, producing zero net input. The same is true for the second pattern (Figure 7-6B): a tetrachord contains pitch-classes from two adjacent tritone circles, not from the next, and then contains pitch-classes from the next two adjacent tritone circles. Only the diminished seventh (º7) tetrachords fail to exhibit this pattern, but this is because they represent a special case of Figure 7-6B: they sample two circles of tritones twice, and these two samples are from circles that are 90° apart in the diagram (see Figure 7-2).
The importance of which circles of tritones are represented by a tetrachord’s pitch-classes emerges when we consider the final band of patterns that produce weak activity in Hidden Unit 1 (Band B, Figure 7-5). This band includes all of the remaining types of tetrachords (7, add9, m (maj7), m6). Half of these chords fall into this band because they represent three different circles of tritones, not four. In other words, they contain one pitch-class each from two different circles of tritones, and contain two pitch-classes from a third. As a result, the input signals do not cancel one another out.
However, the remaining tetrachords that belong to this band sample pitch-classes from four different tritone circles. Why do these chords not turn Hidden Unit 1 on? The answer to this question is that they represent these tritone circles following a different pattern than the two discussed above. As shown in Figure 7-6C, they contain pitch-classes from three adjacent tritone circles, skip the next, and then contain a pitch-class from the next. This pattern of sampling produces an unbalanced signal, generating weak activity in this hidden unit.
Figure 7-6 Three patterns of tritone sampling for tetrachords. A and B are patterns that turn Hidden Unit 1 on; C is a pattern that generates weak activity in Hidden Unit 1.
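The circle memberships used in this account of Hidden Unit 1 can be tallied mechanically. The sketch below is a bookkeeping aid rather than a model of the unit: it counts, for each chord type built on C, how its pitch-classes split across the two circles of major seconds (pitch-class number modulo 2) and which circles of tritones it samples (pitch-class number modulo 6). The interval tuples are assumed from Table 7-1, and all names in the code are hypothetical.

```python
# Semitones above the root for each tetrachord type (assumed from Table 7-1).
TYPES = {
    "maj7": (0, 4, 7, 11), "7": (0, 4, 7, 10), "m(maj7)": (0, 3, 7, 11),
    "6": (0, 4, 7, 9), "m6": (0, 3, 7, 9), "7b5": (0, 4, 6, 10),
    "m7": (0, 3, 7, 10), "aug7": (0, 4, 8, 10), "dim7": (0, 3, 6, 9),
    "add9": (0, 2, 4, 7), "m(add9)": (0, 2, 3, 7), "7sus4": (0, 5, 7, 10),
}


def whole_tone_split(pcs):
    """Return how many pitch-classes fall in each circle of major seconds,
    larger count first (this split is the same for every transposition)."""
    even = sum(1 for pc in pcs if pc % 2 == 0)
    return tuple(sorted((even, len(pcs) - even), reverse=True))


def tritone_circles(pcs):
    """Return the sorted circles of tritones (0-5) that the chord samples."""
    return sorted({pc % 6 for pc in pcs})


for name, intervals in TYPES.items():
    print(f"{name:8s} split {whole_tone_split(intervals)} "
          f"tritone circles {tritone_circles(intervals)}")
```

In this tally the 4-0 splits are the Band A types, the 2-2 splits are the Band C types, and the 3-1 splits are the Band B types, which lines up with the cancellation argument given above.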
7.3.3 Hidden Unit 2
Figure 7-7 provides the connection weights and the jittered density plot for Hidden Unit 2 of the extended tetrachord network. This hidden unit organizes input pitch-classes into circles of minor thirds, assigning a weight of 0.79 to those pitch-classes that belong to the first circle, a weight of −0.07 to those pitch-classes that belong to the second, and a weight of −0.50 to those pitch-classes that belong to the third. At the end of training, the value of µ for this unit is −0.13.
This jittered density plot is similar to the one for Hidden Unit 1, as it is organized into three distinct bands. The first is near zero, the second is between 0.2 and 0.4, and the third is between 0.8 and 1.0. The bands for Hidden Unit 2 are slightly more dispersed than those observed for Hidden Unit 1.
Figure 7-7 The connection weights and the jittered density plot for Hidden Unit 2.
Let us first consider the patterns that belong to Band C in Figure 7-7. There are 52 such patterns, representing 7sus4, add9, aug7, dim7, m (add9), m (maj7), and maj7 tetrachords. Interestingly, the band does not capture all instances of each chord type: it captures four instances of the diminished seventh chord and eight instances of each of the other chord types. Whatever property belongs to the chords in this band does not characterize all 12 instances of each chord type.
What properties do the tetrachords that belong to this band share? All of these tetrachords (except the diminished sevenths, which are a special case) select pitch-classes from each of the three circles of minor thirds. That is, they select one pitch-class from each of two of these circles, and select two pitch-classes from the third circle. Furthermore, 24 of the tetrachords in Band C include one pitch-class associated with a weight of 0.79, a second associated with a weight of −0.07, and two pitch-classes associated with a weight of −0.50. This results in a net input of about −0.30, which is close enough to µ to produce activity of about 0.90. Another 24 of the tetrachords include two pitch-classes associated with a weight of −0.07, and two others associated with each of the other two weights. This produces a net input of 0.14, resulting in activity of just over 0.80.
The diminished seventh chords that fall in this band are a special case, because they are composed of all four pitch-classes that are associated with a weight of −0.07, which all belong to the same circle of minor thirds. These four weights sum to −0.28, a net input that produces activity of 0.88 in Hidden Unit 2.
Why do we only find subsets of different tetrachord types in this band? The structure of the four diminished seventh tetrachords provides an answer to this question. The other eight diminished seventh chords are composed of four pitch-classes that all belong to one of the other two circles of minor thirds. When I sum these weights, the resulting net input is too extreme to produce high activity in Hidden Unit 2. This removes them from this band.
A similar story holds for the other types of tetrachords in this band. Recall that the band captures eight instances of each type, but four other instances do not belong to the band. This is because the specific set of weights for Hidden Unit 2 is such that these subsets of tetrachords generate an extreme net input that removes them from the band. For example, Gmaj7, Cmaj7, Fmaj7, and G♯maj7 are similar to all of the other major seventh chords in that they include two pitch-classes from one circle of minor thirds and one from each of the other two. However, given the weights for Hidden Unit 2, their particular combination of notes produces a net input that removes them from the band.
In particular, each of these chords includes two pitch-classes from the circle of minor thirds assigned a weight of 0.79 by this unit, and one pitch-class from each of the other two circles. As a result, these four major seventh chords generate a net input of one, which turns Hidden Unit 2 off. This separates these four tetrachords from the other eight that fall in the high band. A similar account holds for all of the other chords that belong to a tetrachord type captured by the band, but which are not part of the band.
Let us next consider the band of patterns that produce weak activity (ranging between 0.2 and 0.4) in Hidden Unit 2 (Band B). There are 24 such patterns, representing m6, 6, m7, 7, and 7♭5 tetrachords. Again, the band does not capture all instances of each chord type. All of the chords that fall in this band share one property: they do not include a pitch-class from one of the three circles of minor thirds. Either they include three pitch-classes from one circle and a fourth from one other, or they include two pitch-classes from one circle and two others from another. In either case, the weights associated with these sets of pitch-classes cannot cancel each other out; these chords produce net inputs of either −0.72 or 0.58.
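The net inputs quoted for Hidden Unit 2 can be checked with a short calculation. The sketch below uses the three rounded weights and the value of µ reported above. The activation function is an assumption: it uses the Gaussian form G(net) = exp(−π(net − µ)²) that Dawson and Schopflocher (1992) describe for value units, which this section does not restate. Because the weights are rounded to two decimals, the results only approximate the activities reported in the text.

```python
import math

# Rounded weights for the three circles of minor thirds, and the unit's mu.
WEIGHTS = (0.79, -0.07, -0.50)
MU = -0.13


def net_input(counts):
    """Net input when `counts` pitch-classes come from each of the three circles."""
    return sum(n * w for n, w in zip(counts, WEIGHTS))


def value_unit_activity(net, mu=MU):
    """Assumed Gaussian activation of a value unit, peaking at 1 when net == mu."""
    return math.exp(-math.pi * (net - mu) ** 2)


cases = {
    "Band C: one 0.79, one -0.07, two -0.50": (1, 1, 2),
    "Band C: one 0.79, two -0.07, one -0.50": (1, 2, 1),
    "Band C: four -0.07 (diminished seventh)": (0, 4, 0),
    "Excluded: two 0.79, one of each other": (2, 1, 1),
    "Band B: three -0.07, one -0.50": (0, 3, 1),
    "Band B: two 0.79, two -0.50": (2, 0, 2),
}

for label, counts in cases.items():
    net = net_input(counts)
    print(f"{label:42s} net = {net:+.2f}  activity = {value_unit_activity(net):.2f}")
```

Under this assumed activation function, net inputs near µ give activity close to one, the Band B combinations land in the weak range, and a net input near one turns the unit off.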
From the discussion above, it appears that high activity in Hidden Unit 2 indicates that it detects a tetrachord characterized by one of two different patterns. One pattern involves four pitch-classes associated with a particular combination of connection weights (one strong positive, one weak negative, two strong negatives). The second pattern involves four pitch-classes each of which is associated with a weak negative connection weight.
The patterns that belong to Band A in Figure 7-7 produce zero activity in Hidden Unit 2 because they fail to exhibit either of these combinations of weights. As a result, the 68 patterns that belong to this band represent all 12 different types of tetrachords in the training set.
When banding in the jittered density plots of value units was first discovered (Berkeley et al., 1995), it was noted that patterns falling in a band of near-zero activity did not share any defining positive feature. Instead, they shared a negative feature: they all lacked the features that the hidden unit detects, and which produce higher activity. As a result, in many cases a detailed interpretation of the features of patterns that belong to a “zero band” is neither informative nor possible. Band A in Figure 7-7 is an example of this situation.
7.3.4 Hidden Unit 4
Band C for the jittered density plot of Hidden Unit 2 (Figure 7-7) indicates that this unit generates high activity to a number of different types of tetrachords. However, for each of these different types, it generates this high activity to only eight of the 12 possible instances. What does the network do to the four instances of each chord type omitted from this band in Hidden Unit 2? They are the only chords that produce high activity in Hidden Unit 4!
Figure 7-8 provides the connection weights and the jittered density plot for Hidden Unit 4. Examining the weights indicates that this hidden unit, like Hidden Unit 2, organizes input pitch-classes into circles of minor thirds, assigning a weak negative weight to those pitch-classes that belong to the first circle, a more negative weight to those pitch-classes that belong to the second, and a strong positive weight to those pitch-classes that belong to the third. At the end of training, the value of µ for this unit is −0.06. The weights also indicate that pitch-classes are organized into equivalence classes based upon circles of tritones: pitch-classes that are in the same circle of tritones are assigned identical weights. Indeed, this organization is cleaner than the organization in terms of circles of minor thirds, because there is some variation of weight values assigned to pitch-classes in the same circle of minor thirds.
The lower part of Figure 7-8 indicates that the jittered density plot of Hidden Unit 4 is organized into two fairly broad bands: patterns that belong to Band A generate activity that ranges between 0.00 and 0.50, while patterns that belong to Band B generate activity that ranges between 0.80 and 1.00. I consider these as different bands because there is a large space in the graph between them.
Band B in Figure 7-8 consists of 24 patterns, representing four instances each of 7sus4, add9, aug7, m (add9), m (maj7), and maj7 tetrachords. Importantly, these are exactly the same types of tetrachords found in Band C of Hidden Unit 2, with one exception: Band B does not include any diminished seventh chords. More importantly, the four instances of each type of tetrachord found in Band B are precisely the four instances not found in Band C of Hidden Unit 2.
What do all of the tetrachords in Band B have in common? Each chord includes two pitch-classes associated with a small negative weight, one pitch-class associated with a strong negative weight, and one pitch-class associated with a strong positive weight. Variation in the weights (for instance, the small negative weight could be either −0.12 or −0.33) produces variation in net input, which is why Band B is wide. On average, a pattern that belongs to this band generates a net input of −0.18, which is close enough to µ to produce strong activity in Hidden Unit 4.
Why does this band capture a different subset of tetrachord instances when Hidden Unit 4 and Hidden Unit 2 organize input pitch-classes according to the same strange circles? Compare the weights in Figure 7-8 to those in Figure 7-7. Note that different weight values are assigned to the same strange circles in the two hidden units. For instance, Hidden Unit 2 assigns a strong positive weight to pitch-classes that belong to the first circle of minor thirds, while Hidden Unit 4 assigns a weak negative weight to the same pitch-classes. These differences cause some instances of a tetrachord type to generate strong activity in one hidden unit, but also to generate weak activity in the other.
Figure 7-8 The connection weights and the jittered density plot for Hidden Unit 4.
What about Band A in Figure 7-8? None of these patterns is defined with the same combination of pitch-classes (two small negative weights, one large negative weight, and one large positive weight) that signals membership in Band B. Of course, some other combinations of weights produce moderate Hidden Unit 4 activity, but none is as optimal as the Band B combination. High activity in Hidden Unit 4 represents the detection of this particular combination, which serves to capture 24 tetrachords that (musically) should have been in Band C of Hidden Unit 2, but were not.
7.3.5 Hidden Unit 7
Figure 7-9 presents the connection weights and the jittered density plot for Hidden Unit 7 of the extended tetrachord network. Importantly, at the end of training the value of µ for this hidden unit was −0.02. Thus in order for this unit to generate high activity, the four signals being sent to it from input units must cancel each other out to provide a near-zero net input.
The connection weights for this hidden unit indicate that it organizes input pitch-classes into equivalence classes defined by the four circles of major thirds. All pitch-classes that belong to the first circle of major thirds have a strong negative weight; those that belong to the second have a weak positive weight; those that belong to the third have a strong positive weight; and those that belong to the fourth have a weak negative weight.
In addition to organizing pitch-classes in terms of circles of major thirds, the connection weight values of Hidden Unit 7 provide an interesting balancing of pairs of pitch-classes. Pairs of pitch-classes that are a major second (e.g., A, B), a tritone (e.g., A, D♯), or a minor seventh (e.g., A, G) apart are balanced, because they are assigned weights that are equal in magnitude but opposite in sign. Pairs of pitch-classes separated by any other musical interval will not cancel each other’s signal out because of differences in magnitude or sign of their respective connection weights.
As was the case for Hidden Units 1 and 2, the jittered density plot for Hidden Unit 7 is organized into three distinct bands. Two of these bands (Band A and Band B in Figure 7-9) are associated with low activity in Hidden Unit 7, while patterns that belong to Band C turn Hidden Unit 7 on. Band C in Hidden Unit 7’s jittered density plot contains 36 input patterns that comprise all 12 instances of just three different types of tetrachords: 7flat5, 7sus4, and dim7. What do these three different types of chords have in common?
Figure 7-9 The connection weights and the jittered density plot for Hidden Unit 7.
All three of these different types of tetrachords include four pitch-classes that are completely balanced because pairs of these pitch-classes are separated by a major second, a tritone, or a minor seventh. For instance, a diminished seventh chord is composed of two pairs of pitch-classes that are both a tritone apart (Figure 7-2). The two pitch-classes in each pair cancel each other’s signal out, producing a net input of zero, which turns Hidden Unit 7 on. Similarly, a 7sus4 chord can be described as two pairs of pitch-classes with each pair separated by a major second (Figure 7-2). As well, a 7flat5 can be described either as two pairs of pitch-classes with each pair separated by a major second, or as two pairs of pitch-classes with each pair separated by a tritone (Figure 7-2). These different descriptions amount to the same effect: two balanced signals from each pair of pitch-classes, generating near-zero net input and turning Hidden Unit 7 on.
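The balancing argument above can be checked numerically. The following is a minimal sketch, not the trained network: the magnitudes `STRONG` and `WEAK` are hypothetical stand-ins for the weights plotted in Figure 7-9, the assignment of A to the first circle of major thirds is an assumption, the chord formulas are written as semitone offsets from the root, and the activation is assumed to be a Gaussian of the form exp(−π(net − µ)²).

```python
import math

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Hypothetical magnitudes; the trained values are shown only in Figure 7-9.
STRONG, WEAK = 1.0, 0.3

# One weight per circle of major thirds (pitch-class index mod 4), in the order
# described in the text: strong negative, weak positive, strong positive, weak negative.
CIRCLE_WEIGHTS = [-STRONG, WEAK, STRONG, -WEAK]
WEIGHTS = [CIRCLE_WEIGHTS[i % 4] for i in range(12)]

MU = -0.02  # value of mu reported for Hidden Unit 7

# Chord formulas as semitone offsets from the root.
CHORDS = {
    "dim7": (0, 3, 6, 9),
    "7sus4": (0, 5, 7, 10),
    "7flat5": (0, 4, 6, 10),
    "maj7": (0, 4, 7, 11),  # included as an unbalanced comparison
}

def gaussian(net, mu):
    """Assumed Gaussian activation: 1.0 when net equals mu, falling off on either side."""
    return math.exp(-math.pi * (net - mu) ** 2)

def net_input(root, offsets):
    """Sum the weights of the four pitch-classes in the chord built on `root`."""
    return sum(WEIGHTS[(root + o) % 12] for o in offsets)

def activities(chord):
    """Activity produced by all 12 transpositions of one chord type."""
    return [gaussian(net_input(root, CHORDS[chord]), MU) for root in range(12)]

for chord in CHORDS:
    acts = activities(chord)
    print(chord, round(min(acts), 3), round(max(acts), 3))
```

With weights of this shape, any two pitch-classes that are 2, 6, or 10 semitones apart receive weights of equal magnitude and opposite sign, so every transposition of the dim7, 7sus4, and 7flat5 formulas sums to a net input of zero, while the maj7 formula never does.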
The balancing described above is not true of any of the patterns that belong to the other two bands in Figure 7-9. Band B consists of 24 different input patterns comprised of six instances each of 6, aug7, m (add9), and m7 tetrachords. Each of these patterns generates small activity in Hidden Unit 7 (ranging between 0.09 and 0.19) because each is partially balanced in the sense described above. That is, each of these tetrachords contains one pair of pitch-classes that balance because they are separated by a major second, a tritone, or a minor seventh. However, the other pair of tones is not balanced. Interestingly for each of these chords the balanced pair of pitch-classes always involves one weight that is an extreme negative and one that is an extreme positive. They produce some activity in Hidden Unit 7 because the unbalanced pitch-classes involve smaller weights, making net input slightly less than for the remaining tetrachords.
The remaining tetrachords all belong to Band A in Figure 7-9 and all fail to exhibit the kind of balancing discussed above. Eighty-four different patterns belong to this band. Sixty are completely unbalanced tetrachords: none of their pitch-classes is separated by a major second, a tritone, or a minor seventh. The remaining 24 are the “cousins” of those that belong to Band B. That is, one of their pitch-class pairs is balanced, but the other is not. The difference between these 24 patterns and the 24 that belong to Band B is that they all involve balancing of a weakly negative and a weakly positive weight. Their unbalanced weights are therefore both either extremely positive or extremely negative, so these chords generate an extreme net input, which turns Hidden Unit 7 off.
7.3.6 Hidden Unit 6
Figure 7-10 provides the connection weights and the jittered density plot for Hidden Unit 6. At the end of training, this hidden unit has a value of µ equal to 0.07. The weights presented in this figure indicate that this hidden unit groups pitch-class inputs into equivalence classes based upon the six different circles of tritones. That is, pairs of pitch-classes that are a tritone apart have the same connection weight.
Figure 7-10 The connection weights and the jittered density plot for Hidden Unit 6.
It is also obvious from the weights illustrated in Figure 7-10 that this hidden unit appears to balance, or nearly balance, adjacent triplets of pitch-classes. For instance, consider the first three pitch-classes (A, A♯, B). The pattern of weights assigned to these three inputs seems nearly identical in magnitude but opposite in sign to the pattern of weights assigned to the next three pitch-classes (C, C♯, D) or to the last three pitch-classes (F♯, G, G♯).
Table 7-2 below provides a more accurate indication of which pairs of input pitch-classes cancel each other out given the particular connection weights in Figure 7-10. It is created by only turning on two of the input units that feed into Hidden Unit 6 at a time. The resulting net input is simply the sum of the weights associated with each of the activated input units.
Table 7-2 The activity produced in Hidden Unit 6 by all possible pairs of different input pitch-classes.
Pitch-class | A | A# | B | C | C# | D | D# | E | F | F# | G | G# |
A | — | 0.03 | 0.00 | 0.58 | 0.05 | 0.00 | 1.00 | 0.03 | 0.00 | 0.58 | 0.05 | 0.00 |
A# | 0.03 | — | 0.00 | 0.28 | 0.98 | 0.27 | 0.03 | 0.00 | 0.00 | 0.29 | 0.98 | 0.27 |
B | 0.00 | 0.00 | — | 0.01 | 0.26 | 0.99 | 0.00 | 0.00 | 0.00 | 0.01 | 0.26 | 0.99 |
C | 0.58 | 0.28 | 0.01 | — | 0.00 | 0.00 | 0.58 | 0.28 | 0.01 | 0.10 | 0.00 | 0.00 |
C# | 0.05 | 0.98 | 0.26 | 0.00 | — | 0.00 | 0.05 | 0.98 | 0.26 | 0.00 | 0.00 | 0.00 |
D | 0.00 | 0.27 | 0.99 | 0.00 | 0.00 | — | 0.00 | 0.28 | 0.99 | 0.00 | 0.00 | 0.00 |
D# | 1.00 | 0.03 | 0.00 | 0.58 | 0.05 | 0.00 | — | 0.03 | 0.00 | 0.58 | 0.05 | 0.00 |
E | 0.03 | 0.00 | 0.00 | 0.28 | 0.98 | 0.28 | 0.03 | — | 0.00 | 0.28 | 0.98 | 0.28 |
F | 0.00 | 0.00 | 0.00 | 0.01 | 0.26 | 0.99 | 0.00 | 0.00 | — | 0.01 | 0.26 | 0.99 |
F# | 0.58 | 0.29 | 0.01 | 0.10 | 0.00 | 0.00 | 0.58 | 0.28 | 0.01 | — | 0.00 | 0.00 |
G | 0.05 | 0.98 | 0.26 | 0.00 | 0.00 | 0.00 | 0.05 | 0.98 | 0.26 | 0.00 | — | 0.00 |
G# | 0.00 | 0.27 | 0.99 | 0.00 | 0.00 | 0.00 | 0.00 | 0.28 | 0.99 | 0.00 | 0.00 | — |
Note. For a particular activity in the table, the pitch-class label for the row provides one member of the pair, and the pitch-class label for the column provides the other member. Pairs that cancel each other’s signal out, producing high activity in the hidden unit, are indicated by the dark grey cells. Pairs that weakly cancel each other out, producing moderate activity, are indicated by the lighter grey cells.
The net input for each pair can be fed into a Gaussian activation function (with µ = 0.07) to determine the activity produced in Hidden Unit 6. This activity is reported in each cell in Table 7-2. In this table, the column label indicates one of the activated input units, and the row label indicates the other. (Pairs that correspond to the diagonal of the matrix were not presented, because in this multilayer perceptron it is not possible to send two signals simultaneously from one input unit.) If the Gaussian activity produced is 0.90 or higher, then this indicates that the signals from the two input units cancel each other out, turning Hidden Unit 6 on. The input pairs that cancel each other out have dark grey cells in Table 7-2.
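The procedure used to build a table of this kind can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of Table 7-2: the weight values are hypothetical (chosen only so that pitch-classes a tritone apart share a weight), the function names are invented for this example, and the Gaussian is assumed to take the form exp(−π(net − µ)²).

```python
import math

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def gaussian(net, mu):
    """Assumed Gaussian activation function."""
    return math.exp(-math.pi * (net - mu) ** 2)

def pair_activity_table(weights, mu):
    """Entry [i][j] is the activity when only input units i and j are turned on.

    The diagonal is left as None because one input unit cannot send two signals.
    """
    n = len(weights)
    table = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                table[i][j] = gaussian(weights[i] + weights[j], mu)
    return table

def cancelling_pairs(table, threshold=0.90):
    """List each unordered pair once if its activity meets the threshold."""
    n = len(table)
    return [
        (PITCH_CLASSES[i], PITCH_CLASSES[j])
        for i in range(n)
        for j in range(i + 1, n)
        if table[i][j] >= threshold
    ]

# Hypothetical weights for A..D, repeated for D#..G# so that tritone partners match.
half = [0.0, 0.6, -1.2, 0.3, -0.6, 1.2]
weights = half + half

table = pair_activity_table(weights, mu=0.07)
print(cancelling_pairs(table))
```

Because each unordered pair is visited once, the count returned by `cancelling_pairs` is not doubled by the symmetry of the table.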
If tritone balancing were the only kind of balancing evident in Table 7-2, then only six different pairs of pitch-classes would cancel each other’s signal out. An inspection of Table 7-2 indicates that nine different pairs of inputs cancel one another out. (Note that the table is symmetric, and that each pair occurs twice in the table.) Each is highlighted in a dark grey cell in the table. In addition, four other pairs of pitch-classes nearly cancel one another out, because they produce activity of 0.58. The weaker activity produced by these pairs of inputs is highlighted with lighter grey cells in the table. The pattern of grey cells in Table 7-2 is very regular, consistent with the regular pattern of alternating connection weights in Figure 7-10. In general, pitch-class pairs that are separated by a minor third or by a major sixth cancel one another out. There are two caveats to add to this general description. First, in some instances (e.g., A paired with C) the two weights are different enough in magnitude that they do not completely cancel one another out, but cancel each other out enough to produce moderate activity. Second, even though A and D♯ are a tritone apart, their connection weights are so close to zero that this pair produces high activity in Hidden Unit 6 too.
With this understanding of the connection weight structure in Figure 7-10, let us now consider the nature of the bands in the jittered density plot for Hidden Unit 6.
The jittered density plot for Hidden Unit 6 reveals five different bands. Excluding Band A (which again appears to be a “zero loading” band with no interpretable structure), these bands share one interesting qualitative characteristic: all of the tetrachords that belong to the same band are missing a pair of pitch-classes. Patterns in Band E are missing both A and D♯; patterns in Band D are missing both D and G♯. Each of these missing pairs defines a tritone circle (i.e., [A, D♯] or [D, G♯]). Patterns in Band C are all missing both A♯ and G, which are separated by a minor third. The two patterns that belong to Band B (C♯m6 and Gm6) are missing A and D♯, B and F, and C and F♯. Each of these pairs defines a tritone circle.
Quantitatively all of the bands in the Figure 7-10 jittered density plot can be explained in terms of the balancing of adjacent pitch-classes. Let us use the connection weights in Figure 7-10 to identify four different sets of three pitch-classes: let Subset 1 be [A, A♯, B], let Subset 2 be [C, C♯, D], let Subset 3 be [D♯, E, F], and let Subset 4 be [F♯, G, G♯]. Our previous discussion of the connection weights for each of these subsets (see Figure 7-10 and Table 7-2) suggested that if the same pattern of input activity is present in two of these subsets, then their activities will all cancel out, producing high activity. For instance, imagine an input pattern that includes both A and B as pitch-classes. This corresponds to the pattern of activity [1, 0, 1] in Subset 1. If this same pattern of activity is present in Subset 2 or Subset 4, then the signals from the two different subsets will cancel out, producing high activity in Hidden Unit 6. However, if this same pattern of activity is present in Subset 3, the activities will not cancel out, because these two subsets of pitch-classes have the same connection weights.
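The subset argument can be made concrete with an idealized sketch. The weights below are hypothetical: a single made-up pattern of three weights is assigned unchanged to Subsets 1 and 3 and with reversed sign to Subsets 2 and 4, which is a cleaner arrangement than the trained weights in Figure 7-10. The helper names are invented for this example.

```python
SUBSETS = {
    1: ["A", "A#", "B"],
    2: ["C", "C#", "D"],
    3: ["D#", "E", "F"],
    4: ["F#", "G", "G#"],
}

# Hypothetical pattern of three weights and the sign each subset applies to it.
PATTERN = (0.4, -0.9, 1.1)
SIGN = {1: 1, 2: -1, 3: 1, 4: -1}

WEIGHTS = {
    pc: SIGN[k] * w
    for k, pcs in SUBSETS.items()
    for pc, w in zip(pcs, PATTERN)
}

def subset_contributions(notes):
    """Net input contributed by each subset for the pitch-classes in `notes`."""
    return {
        k: sum(WEIGHTS[pc] for pc in pcs if pc in notes)
        for k, pcs in SUBSETS.items()
    }

def net_input(notes):
    """Total net input: the sum of the four subset contributions."""
    return sum(subset_contributions(notes).values())

# Activity pattern [1, 0, 1] in Subsets 1 and 2: the contributions cancel.
print(subset_contributions({"A", "B", "C", "D"}), net_input({"A", "B", "C", "D"}))

# The same pattern in Subsets 1 and 3: the contributions add instead.
print(subset_contributions({"A", "B", "D#", "F"}), net_input({"A", "B", "D#", "F"}))
```

In this idealized case the first input pattern produces a net input of zero, while the second produces twice the contribution of a single subset.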
We can analyze each input pattern that belongs to a band in terms of the patterns of activity present in each of the four pitch-class subsets for that pattern. We can perform this analysis both qualitatively (e.g., is the pattern of activity in Subset 1 the same as the pattern in Subset 2) and quantitatively (e.g., what is the contribution to net input from Subset 1 or from Subset 2). These analyses indicate that band membership can be explained by patterns of activity balancing across the four different subsets. For example, consider Band E in Figure 7-10. It consists of 14 different tetrachords, including dim7, aug7, m7, and 6 chords. All but two of these input patterns are completely balanced in the sense that they have the same pattern of activity in both Subsets 1 and 2, and have the same pattern of activity in both Subsets 3 and 4. This yields net inputs near 0.07 and therefore high activity in Hidden Unit 6.
The only exceptions to this are the two augmented seventh chords (F♯aug7 and Caug7) found in Band E. These two tetrachords have identical patterns of activity in Subsets 1 and 3, which do not balance, and which produce a net input of 1.09 from each subset. However, they also have patterns of activity that produce a net input of −0.39 from a third subset, and a net input of −1.66 from the fourth. When all four net input components are combined, the final net input for both chords is 0.13, producing Hidden Unit 6 activity of 0.99.
Band D in Figure 7-10 contains 20 different input patterns, representing a variety of different types of tetrachord [dim7, aug7, m7, 6, m6, add9, and m (maj7)]. Of these 20 patterns, 12 are similar to those described for Band E: Subsets 1 and 2 have the same pattern, as do Subsets 3 and 4. However, because the tetrachords in this band include A and D♯, these two pitch-classes do not completely cancel out corresponding pitch-classes in the other subsets (see Table 7-2). As a result, the net inputs for these patterns are slightly larger, producing slightly lower Hidden Unit 6 activities. This is true even when the patterns of activities in complementary subsets are identical.
The remaining eight patterns in Band D have less balance between subsets but still produce small enough net inputs to generate high Hidden Unit 6 activity. Two of these chords are augmented sevenths that include either an A or a D♯ (Aaug7, D♯aug7). Their patterns of activity across subsets are similar to the two augmented seventh chords in Band E, but their net input is slightly more extreme (around −0.17) because A or D♯ are involved with weaker balance (Table 7-2). The remaining six input patterns in this band involve balance between two of the subsets, but the other two are not balanced. Again, the weights of the particular pitch-classes involved are such that net input is low enough to generate strong activity in Hidden Unit 6.
The remaining bands in the Figure 7-10 jittered density plot involve less balance between subsets and more extreme net inputs, decreasing Hidden Unit 6 activity even further. For instance, Band C consists of eight tetrachords, half of which are sixths and half of which are minor sevenths. None of the subsets balances any of the others for any of these input patterns. However, each of these eight tetrachords has one subset that has a zero net input. The net inputs from the remaining three subsets sum to either −0.34 or 0.34, producing activity of about 0.60.
Band B consists of only two tetrachords, C♯m6 and Gm6. Both of these tetrachords have the same pattern of activity in Subsets 1 and 3, producing net input of 1.09 in each. This is the same situation we observed for the two augmented seventh chords that belong to Band E. The difference emerges in terms of the net inputs produced for these two minor sixth chords for the other two subsets, which are −1.66 and −0.94 respectively. In sum, these two chords generate a net input of −0.42, which results in only moderate Hidden Unit 6 activity.
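The arithmetic behind these two cases can be retraced from the per-subset net inputs reported in the text. The sketch below assumes that the Gaussian activation takes the form exp(−π(net − µ)²); the variable names are invented for this example.

```python
import math

MU_HIDDEN_UNIT_6 = 0.07  # value of mu reported for Hidden Unit 6

def gaussian(net, mu):
    """Assumed Gaussian activation function."""
    return math.exp(-math.pi * (net - mu) ** 2)

# Per-subset net inputs as reported in the text.
cases = {
    "Band E augmented sevenths": [1.09, 1.09, -0.39, -1.66],
    "Band B minor sixths": [1.09, 1.09, -1.66, -0.94],
}

for label, parts in cases.items():
    net = sum(parts)
    activity = gaussian(net, MU_HIDDEN_UNIT_6)
    print(label, round(net, 2), round(activity, 2))
```

Under this assumed activation function, the first case gives a net input of 0.13 and an activity of about 0.99, in agreement with the values reported above, while the second gives a net input of −0.42 and an activity a little under 0.5, which is consistent with the description of moderate activity.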
The remaining 98 tetrachords belong to Band A. These are instances of nine of the 12 different types of tetrachord, including all of the 7flat5, 7sus4, m (add9), and maj7 chords. Only the 6, m7, and dim7 tetrachords are not found in this band. In general, there is less and less balance among the four subsets of inputs as one inspects the chords that belong to this band. When balance does occur, it is typically between only two of the subsets; the remaining two subsets are so unbalanced that extreme net input is the result. The net inputs found for the patterns in this band range from −4.11 to 3.43. There is substantial variability in this range, and sometimes net input is small (e.g., around −0.57). This explains why this band is moderately broad in Figure 7-10.
7.3.7 Hidden Unit 5
Figure 7-11 provides the connection weights and the jittered density plot for Hidden Unit 5. At the end of training, its µ is equal to −0.03.
Figure 7-11 The connection weights and the jittered density plot for Hidden Unit 5.
Unlike the previous hidden units that I have analyzed, Hidden Unit 5 does not appear to organize pitch-classes into equivalence classes based upon musical intervals. Instead, it exhibits tritone balance: pairs of pitch-classes that are a tritone apart have weights that are equal in magnitude but opposite in sign.
Although it is less evident than was the case in Figure 7-10, Figure 7-11 indicates that Hidden Unit 5 is also structured to produce balance between patterns of activity defined over subsets of three adjacent input pitch-classes. Again, let Subset 1 be [A, A♯, B], let Subset 2 be [C, C♯, D], let Subset 3 be [D♯, E, F], and let Subset 4 be [F♯, G, G♯]. An inspection of Figure 7-11’s connection weights indicates that two pairs of these subsets appear to balance one another: Subset 3 balances Subset 1, while Subset 4 balances Subset 2.
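Tritone balance and the balance between subsets are closely related, because Subset 3 is Subset 1 shifted by a tritone and Subset 4 is Subset 2 shifted by a tritone. The following sketch illustrates this with hypothetical weights (the trained values appear only in Figure 7-11) and an assumed Gaussian of the form exp(−π(net − µ)²).

```python
import math

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]
MU = -0.03  # value of mu reported for Hidden Unit 5

# Hypothetical weights for A..D; each tritone partner (D#..G#) gets the negated weight.
first_half = [-1.4, 0.8, -0.3, 1.1, -0.9, 0.5]
weights = first_half + [-w for w in first_half]

def gaussian(net, mu):
    """Assumed Gaussian activation function."""
    return math.exp(-math.pi * (net - mu) ** 2)

def subset(k):
    """Weights for Subset k (1 to 4), each covering three adjacent pitch-classes."""
    return weights[3 * (k - 1): 3 * k]

# Net input when only a pitch-class and its tritone partner are turned on.
tritone_nets = [weights[i] + weights[i + 6] for i in range(6)]

print(tritone_nets)
print(subset(1), subset(3))
print(subset(2), subset(4))
print(round(gaussian(0.0, MU), 3))
```

With weights of this form, every tritone pair yields a net input of zero, and the weights of Subset 3 and Subset 4 are the negations of the weights of Subset 1 and Subset 2, so identical activity patterns in those subset pairs also cancel.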
A quantitative examination of this pattern of connection weights reveals a tremendous amount of balancing or near balancing within its structure. As was done with Hidden Unit 6, I present every possible pair of input pitch-classes to this hidden unit. The net input for each pair is the sum of the weights of the two pitch-classes. I then compute the activity produced in Hidden Unit 5 by passing each of these net inputs through a Gaussian activation function (with µ = −0.03). Table 7-3 presents the results.
Table 7-3 indicates that there is a great deal more balancing possible with the set of connection weights for Hidden Unit 5 than there was for Hidden Unit 6. There are 19 different pairs of pitch-classes that generate activity of 0.9 or higher, indicating near-perfect balance. These cells are highlighted in grey in the table. (Again, this table is symmetric, so that each of these pairs is represented twice.) An additional 28 different pairs of pitch-classes nearly balance, and generate activity that ranges between 0.5 and 0.9.
With this degree of balancing and near balancing between pairs of connection weights, and with the potential for balancing between pairs of subsets of input patterns, it is perhaps not surprising that the jittered density plot in Figure 7-11 exhibits a large number of fairly narrow bands. In order to understand the nature of this banding, we can examine Hidden Unit 5 in terms of the relationships between patterns of activity among the four different subsets of input pitch-classes. Again, this analysis is both qualitative (do the subsets have the same input pattern) and quantitative (what is the net input generated by each subset).
Perhaps not surprisingly, the account of banding for Hidden Unit 5 is very similar to the account detailed for Hidden Unit 6 in the preceding section. For the 40 input patterns that belong to Band G, two different situations emerge. In one, the pattern for both Subsets 1 and 3 is identical, as is the pattern for both Subsets 2 and 4. As a result, near-perfect balance is achieved and Hidden Unit 5 turns on. In the other, the patterns in the various subsets do not balance. However, specific pairs of pitch-classes—from the large number available given Table 7-3—are combined to balance, again turning this hidden unit on.
Table 7-3 The activity produced in Hidden Unit 5 by all possible pairs of different input pitch-classes.
Pitch-class | A | A# | B | C | C# | D | D# | E | F | F# | G | G# |
A | — | 0.72 | 0.37 | 0.14 | 1.00 | 0.23 | 1.00 | 0.21 | 0.51 | 0.85 | 0.03 | 0.68 |
A# | 0.72 | — | 0.93 | 0.98 | 0.18 | 1.00 | 0.18 | 1.00 | 0.81 | 0.47 | 0.71 | 0.65 |
B | 0.37 | 0.93 | — | 0.73 | 0.45 | 0.88 | 0.46 | 0.86 | 1.00 | 0.83 | 0.36 | 0.95 |
C | 0.14 | 0.98 | 0.73 | — | 0.79 | 0.55 | 0.80 | 0.53 | 0.87 | 1.00 | 0.13 | 0.97 |
C# | 1.00 | 0.18 | 0.45 | 0.79 | — | 0.63 | 0.03 | 0.66 | 0.32 | 0.11 | 1.00 | 0.20 |
D | 0.23 | 1.00 | 0.88 | 0.55 | 0.63 | — | 0.64 | 0.69 | 0.97 | 0.95 | 0.22 | 1.00 |
D# | 1.00 | 0.18 | 0.46 | 0.80 | 0.03 | 0.64 | — | 0.67 | 0.33 | 0.12 | 1.00 | 0.21 |
E | 0.21 | 1.00 | 0.86 | 0.53 | 0.66 | 0.69 | 0.67 | — | 0.95 | 0.96 | 0.21 | 1.00 |
F | 0.51 | 0.81 | 1.00 | 0.87 | 0.32 | 0.97 | 0.33 | 0.95 | — | 0.69 | 0.50 | 0.85 |
F# | 0.85 | 0.47 | 0.83 | 1.00 | 0.11 | 0.95 | 0.12 | 0.96 | 0.69 | — | 0.83 | 0.51 |
G | 0.03 | 0.71 | 0.36 | 0.13 | 1.00 | 0.22 | 1.00 | 0.21 | 0.50 | 0.83 | — | 0.67 |
G# | 0.68 | 0.65 | 0.95 | 0.97 | 0.20 | 1.00 | 0.21 | 1.00 | 0.85 | 0.51 | 0.67 | — |
Note. Pairs that cancel each other’s signal out, producing high activity in the hidden unit, are indicated by the grey cells.
Proceeding through the various bands associated with less activity in Hidden Unit 5, the general story that emerges is the same as that for Hidden Unit 6: there is a growing imbalance between the various pitch-classes that are combined in the patterns that belong to a band, producing more extreme net inputs and lower activity in Hidden Unit 5.
There are some interesting parallels between the contents of some of the bands in Figure 7-10 and the contents of some of the bands in Figure 7-11. For example, Band E for Hidden Unit 6 contains only two augmented seventh chords; another two augmented seventh chords are the only members of Band B for that unit. For Hidden Unit 5, Band G contains only two m (maj7) chords; the two patterns that belong to Band E of the Figure 7-11 jittered density plot are also chords of this type.
Another similarity is that almost all of the bands for Hidden Unit 5 include a diversity of tetrachord types. Indeed, this property seems to be true of almost all of the bands for each of the hidden units for the extended tetrachord network. This property—as well as a detailed listing of the tetrachord types in each band—will be the subject of Section 7.4 later in this chapter.
One difference between the bands for Hidden Unit 5 and the bands for Hidden Unit 6 is that the former set does not contain patterns defined by the absence of specific pairs of pitch-classes. This property is a consequence of the specific patterns of connection weights, and their possible balances, associated with each hidden unit.
7.3.8 Hidden Unit 3
Figure 7-12 provides the connection weights and the jittered density plot for Hidden Unit 3. The connection weights in Figure 7-12 indicate that, like Hidden Unit 5, it exhibits tritone balance. Furthermore, if we consider the weights in terms of the same four subsets that have been applied to the previous two hidden units, Subset 1 balances Subset 3, and Subset 2 balances Subset 4. This pattern was previously observed in Figure 7-11.
At the end of training, the value of µ for this hidden unit is −0.01. I again compute the activity generated by every possible pair of input pitch-classes. The results are shown below in Table 7-4. Table 7-4 indicates that there are 15 different pairs of inputs that cancel one another out perfectly (each pair is represented twice in the table). These pairs produce activity of 0.99 or higher, and have their corresponding cells highlighted in grey in the table. In addition to these pairs, there are 30 other pairs that when combined nearly balance each other’s signal, producing activity in Hidden Unit 3 that ranges between 0.5 and 0.9.
It is particularly interesting to compare the pattern of connection weights in Hidden Unit 3 (Figure 7-12) to those for Hidden Unit 5 (Figure 7-11). At first glance, the two patterns seem very similar. However, a closer inspection reveals important differences between the two. Consider the weights for Subset 1 (A, A♯, B) in Figure 7-12, which has a strong negative followed by a moderate positive and a weak negative. This pattern is also evident in Figure 7-11—but for Subset 4 (F♯, G, G♯). Similarly the pattern for Subset 2 in Figure 7-12 is found instead for Subset 1 in Figure 7-11; the pattern for Subset 3 in Figure 7-12 is found for Subset 2 in Figure 7-11; and the pattern for Subset 4 in Figure 7-12 is found for Subset 3 in Figure 7-11. In short, it would appear that both Hidden Units 3 and 5 use the same patterns of connection weights (defined over the four subsets of input units), but assign these same patterns to different subsets of input pitch-classes.
Figure 7-12 The connection weights and the jittered density plot for Hidden Unit 3.
This raises the question: What is the relationship between the responses of Hidden Units 3 and 5 to the set of input patterns, given that there are both interesting similarities and differences between their patterns of connection weights?
Table 7-4 The activity produced in Hidden Unit 3 by all possible pairs of different input pitch-classes.
Pitch-class | A | A# | B | C | C# | D | D# | E | F | F# | G | G# |
A | — | 0.87 | 0.44 | 0.86 | 0.46 | 0.83 | 1.00 | 0.17 | 1.00 | 0.17 | 0.99 | 0.76 |
A# | 0.87 | — | 0.83 | 0.05 | 0.81 | 0.43 | 0.16 | 1.00 | 0.19 | 1.00 | 0.20 | 0.52 |
B | 0.44 | 0.83 | — | 0.82 | 0.51 | 0.88 | 0.99 | 0.21 | 1.00 | 0.20 | 1.00 | 0.81 |
C | 0.86 | 0.05 | 0.82 | — | 0.80 | 0.42 | 0.15 | 1.00 | 0.19 | 1.00 | 0.19 | 0.50 |
C# | 0.46 | 0.81 | 0.51 | 0.80 | — | 0.89 | 0.99 | 0.22 | 1.00 | 0.21 | 1.00 | 0.82 |
D | 0.83 | 0.43 | 0.88 | 0.42 | 0.89 | — | 0.73 | 0.55 | 0.79 | 0.53 | 0.80 | 1.00 |
D# | 1.00 | 0.16 | 0.99 | 0.15 | 0.99 | 0.73 | — | 0.89 | 0.42 | 0.88 | 0.44 | 0.81 |
E | 0.17 | 1.00 | 0.21 | 1.00 | 0.22 | 0.55 | 0.89 | — | 0.85 | 0.06 | 0.84 | 0.46 |
F | 1.00 | 0.19 | 1.00 | 0.19 | 1.00 | 0.79 | 0.42 | 0.85 | — | 0.83 | 0.49 | 0.86 |
F# | 0.17 | 1.00 | 0.20 | 1.00 | 0.21 | 0.53 | 0.88 | 0.06 | 0.83 | — | 0.82 | 0.45 |
G | 0.99 | 0.20 | 1.00 | 0.19 | 1.00 | 0.80 | 0.44 | 0.84 | 0.49 | 0.82 | — | 0.87 |
G# | 0.76 | 0.52 | 0.81 | 0.50 | 0.82 | 1.00 | 0.81 | 0.46 | 0.86 | 0.45 | 0.87 | — |
Note. Pairs that cancel each other’s signal out, producing high activity in the hidden unit, are indicated by the grey cells.
To answer this question, we can correlate the activities that these two hidden units produce across the entire set of input patterns. We can also correlate the activities of each of these units with the activities of Hidden Unit 6, because, like the other two, it is sensitive to tritones and groups input signals into four different subsets.
Table 7-5 provides the resulting correlations. This table reveals very low correlations between the activities of different hidden units. This means that while there are definite similarities among these units in terms of general patterns of connectivity, their connection weights are arranged in different orders. As a result, the tetrachords that cause high activity in one hidden unit do not do so for the other two. This will be important in our consideration of coarse coding in the next section of this chapter.
Table 7-5 Correlations among activities of three hidden units to the 144 input patterns.
Hidden unit | HID3 | HID5 | HID6 |
HID3 | 1.00 | | |
HID5 | 0.11 | 1.00 | |
HID6 | 0.00 | 0.02 | 1.00 |
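A correlation table of this kind can be computed with a few lines of code. The sketch below is only an illustration: it assumes that a Pearson correlation is intended, the function names are invented for this example, and the activity values are hypothetical placeholders for five input patterns, whereas the analysis reported here uses all 144.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of activities."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(activities):
    """Correlate every pair of hidden units; keys are (unit, unit) tuples."""
    names = list(activities)
    return {(p, q): pearson(activities[p], activities[q]) for p in names for q in names}

# Hypothetical activities of three hidden units to the same five input patterns.
activities = {
    "HID3": [0.9, 0.1, 0.5, 0.2, 0.8],
    "HID5": [0.2, 0.7, 0.9, 0.1, 0.4],
    "HID6": [0.5, 0.5, 0.1, 0.9, 0.3],
}

matrix = correlation_matrix(activities)
for (p, q), r in matrix.items():
    print(p, q, round(r, 2))
```

A correlation near zero between two units indicates that knowing which patterns turn one unit on says little about which patterns turn the other on.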
Given that Hidden Unit 3 uses patterns of connection weights similar to those found in Hidden Units 5 and 6, but assigns the weights to different subsets of inputs, I expect that an account of the various bands in Figure 7-12 would be very similar to the accounts provided earlier for both Hidden Unit 5 and Hidden Unit 6. This is indeed the case. For the 38 input patterns that belong to Band G, two different situations emerge. For 24 of the patterns in this band, the pattern of activity for both Subsets 1 and 3 is identical, as is the pattern for both Subsets 2 and 4. As a result, near-perfect balance is achieved and Hidden Unit 3 turns on. In the remaining 14 patterns, the patterns of activity in the various subsets do not balance. However, specific pairs of pitch-classes (for the potential, see Table 7-4) combine to balance, again turning this hidden unit on.
Proceeding through the various bands associated with less activity in Hidden Unit 3, the general story that emerges is the same as for Hidden Units 5 and 6: there is a growing imbalance between the various pitch-classes that are combined in the patterns that belong to a band, producing more extreme net inputs and lower activity in Hidden Unit 3.
Of course, each of the bands in Figure 7-12 picks out a variety of different tetrachords. A detailed list of those for the various bands in Hidden Unit 3’s jittered density plot is presented in the next section’s discussion of coarse coding in the extended tetrachord network.
7.4 Bands and Coarse Coding
7.4.1 Hidden Unit Structure
Section 7.3 presented a detailed examination of the structure of each of the seven hidden units in the extended tetrachord network. This revealed many details about the connection weight structure of each hidden unit, as well as the types of tetrachords that produced varying degrees of activity in each hidden unit. These details reveal three general points. First, the connection weight structure of each hidden unit is highly regular, and this structure relates to musical intervals. Five of the hidden units assign input pitch-classes to equivalence classes based upon strange circles. Hidden Unit 1 groups pitch-classes using the two circles of major seconds and the six circles of tritones. Hidden Unit 2 assigns pitch-classes to equivalence classes based upon the three circles of minor thirds. Hidden Unit 4 organizes pitch-classes using both the three circles of minor thirds and the six circles of tritones. Hidden Unit 6 uses equivalence classes based on circles of tritones. Hidden Unit 7 assigns pitch-classes to equivalence classes based upon the four circles of major thirds. The remaining two hidden units (Hidden Units 3 and 5) employ tritone balance, assigning weights that are equal in magnitude but opposite in sign to pitch-classes separated by a tritone.
Second, the connection weight structure of each hidden unit produces distinct banding when hidden unit activities are graphed in a jittered density plot. The hidden units that organize pitch-classes using strange circles exhibit either two or three distinct bands, while the hidden units that employ tritone balance exhibit five or six distinct bands. For all hidden units these bands emerge because signals from different pairs of pitch-classes are assigned connection weights that produce balance or near balance for some pairs but not others.
Third, almost all of the bands in each hidden unit’s jittered density plot are heterogeneous. That is, almost every band includes instances of more than one type of tetrachord. This is apparent in Table 7-6, which lists each tetrachord type found in each band of the seven jittered density plots.
Table 7-6 The types of tetrachords found in each band in each jittered density plot that was presented in Section 7.3.
Hidden unit | Band | Tetrachords in band |
1 | A | aug7, 7flat5 |
1 | B | 7, m6, m(maj7), add9 |
1 | C | 6, m7, maj7, dim7, 7sus4, m(add9) |
2 | A | 6, 7, m6, m7, maj7, dim7, aug7, m(maj7), 7flat5, 7sus4, add9, m(add9) |
2 | B | 6, 7, m6, m7, 7flat5 |
2 | C | maj7, dim7, aug7, m(maj7), 7sus4, add9, m(add9) |
3 | A | 6, m7, maj7, 7sus4, add9, m(add9) |
3 | B | 7, m6, m(maj7), add9 |
3 | C | 6, m7, maj7, aug7, 7sus4, m(add9) |
3 | D | 7, m6, m(maj7), add9 |
3 | E | 6, m7, maj7, dim7, aug7, 7flat5 |
4 | A | 6, 7, m6, m7, maj7, dim7, aug7, m(maj7), 7flat5, 7sus4, add9, m(add9) |
4 | B | 6, 7, m7, maj7, 7flat5, 7sus4, m(add9) |
4 | C | maj7, aug7, m(maj7), 7sus4, add9, m(add9) |
5 | A | 6, 7, m6, m7, maj7, m(maj7), 7sus4, add9, m(add9) |
5 | B | 7, m6 |
5 | C | aug7, 7sus4, add9, m(add9) |
5 | D | 7, m7, maj7, aug7, 7sus4, m(add9) |
5 | E | m(maj7) |
5 | F | 7, m6, m(maj7), add9 |
5 | G | 6, m7, maj7, dim7, aug7, m(maj7), 7flat5 |
6 | A | 7, m6, maj7, aug7, m(maj7), 7flat5, 7sus4, add9, m(add9) |
6 | B | 7, m6, dim7, aug7 |
6 | C | 7, m(maj7), add9 |
6 | D | m6 |
6 | E | 6, m7 |
6 | F | 6, m6, m7, dim7, aug7, m(maj7), add9 |
6 | G | 6, m7, dim7, aug7 |
7 | A | 6, 7, m6, m7, maj7, aug7, m(maj7), add9, m(add9) |
7 | B | 6, m7, aug7, m(add9) |
7 | C | dim7, 7flat5, 7sus4 |
Table 7-6 indicates that only four of the 31 different bands are pure in the sense that they pick out only one type of tetrachord. Hidden Unit 5 Band B contains only m6 chords (which are identical to the 7 chords that it also contains). Hidden Unit 5 Band E contains only m (maj7) chords. Hidden Unit 6 Band D only contains m6 chords; and Hidden Unit 6 Band E contains only 6 chords (which are identical to the m7 chords that it also contains). Every other band contains at least two different types of tetrachords. Many of these bands contain six or more different types of tetrachords.
Table 7-6 also reveals similarities and differences between the contents of different bands. For instance, several types of chords appear to be similar to one another because they are frequently seen together in the same band: tetrachords that belong to the four types 7, add9, m(maj7), and m6 are found together in eight of the 31 different bands in Table 7-6.
There are also substantial differences between individual hidden units in terms of their sensitivity to such groups of chords. For instance, bands that contain 7, add9, m(maj7), and m6 chords are associated with different levels of activity in different hidden units (compare Hidden Unit 1 Band B to Hidden Unit 5 Band F). In addition, some bands that contain these four chord types contain other tetrachord types too, and these additional types typically differ from band to band. For instance, Hidden Unit 7 Band A groups these four chord types along with instances of 6, maj7, aug7, m(add9), and m7 chords. In contrast, Hidden Unit 4 Band A includes these four types along with instances of 6, m7, aug7, 7♭5, dim7, m(add9), maj7, and 7sus4 tetrachords. Furthermore, these four chord types are not always found in the same band. For example, Hidden Unit 6 Band C contains 7, add9, and m(maj7) chords, but does not contain any m6 tetrachords.
7.4.2 Bands and Coarse Coding
The general summary of network structure provided above indicates two key facts: the hidden units of the extended tetrachord network are highly structured, but individual hidden units do not detect the presence or absence of particular tetrachord types. How then do the output units of the extended tetrachord network process hidden unit activity to identify an input pattern’s chord type?
The general answer to this question is that the extended tetrachord network is another example of coarse coding, a concept that was introduced in Chapter 5. In coarse coding, individual hidden units serve as inaccurate detectors of input pattern properties. However, when these different inaccurate representations are combined, an accurate classification emerges, particularly if each hidden unit views the inputs from a different perspective. Fortunately, the summary of band contents in Table 7-6 helps provide an explanation of coarse coding in this particular network.
Imagine presenting one input chord to the trained extended tetrachord network and only observing the activity that it produces in each of the seven hidden units. The chord produces the following activity pattern, given in ascending order of hidden units from H1 to H7: [0.14, 0.02, 0.01, 0.99, 0.32, 0.05, 0.00]. Given this activity pattern, what type of tetrachord was presented?
To answer this question, let us re-label each hidden unit activity value with the jittered density plot band to which that activity value corresponds. When this is done, the set of hidden unit bands to which the pattern belongs (in the same order as before) is: [B, A, A, C, C, A, A]. With this pattern of bands in hand, let us take Table 7-6 and delete any bands that are not present in this set. The result is presented as Table 7-7:
Table 7-7 The types of tetrachords found in each band to which the first single pattern presented to the network belongs.
Hidden unit | Band | Tetrachords in band |
1 | B | 7, m6, m(maj7), add9 |
2 | A | 6, 7, m6, m7, maj7, dim7, aug7, m(maj7), 7flat5, 7sus4, add9, m(add9) |
3 | A | 6, m7, maj7, 7sus4, add9, m(add9) |
4 | C | maj7, aug7, m(maj7), 7sus4, add9, m(add9) |
5 | C | 7, m6, m(maj7), add9 |
6 | A | 7, m6, maj7, aug7, m(maj7), 7flat5, 7sus4, add9, m(add9) |
7 | A | 6, 7, m6, m7, maj7, aug7, m(maj7), add9, m(add9) |
Note. The only chord type that is found in each band is the added ninth (add9), which is indicated in bold font.
Note that each band in Table 7-7 is inaccurate, in the sense that it contains four or more types of tetrachords. However, only one tetrachord type is present in all seven of these bands: add9. [The 7, m6, and m(maj7) chords are all absent from Hidden Unit 3 Band A, while none of the other chords (apart from add9) are present in Hidden Unit 1 Band B.] This means that the tetrachord presented to the network must have been an add9.
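The band intersection just described can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the computation the network itself performs. The band contents are transcribed from Table 7-7, and the names `bands_table_7_7` and `intersect_bands` are hypothetical labels introduced for this sketch.

```python
# Chord types listed in Table 7-7, keyed by (hidden unit, band label).
# The variable and function names are illustrative, not from the original study.
ALL_TYPES = {
    "6", "7", "m6", "m7", "maj7", "dim7", "aug7",
    "m(maj7)", "7flat5", "7sus4", "add9", "m(add9)",
}

bands_table_7_7 = {
    (1, "B"): {"7", "m6", "m(maj7)", "add9"},
    (2, "A"): set(ALL_TYPES),  # this band contains all 12 chord types
    (3, "A"): {"6", "m7", "maj7", "7sus4", "add9", "m(add9)"},
    (4, "C"): {"maj7", "aug7", "m(maj7)", "7sus4", "add9", "m(add9)"},
    (5, "C"): {"7", "m6", "m(maj7)", "add9"},
    (6, "A"): {"7", "m6", "maj7", "aug7", "m(maj7)", "7flat5",
               "7sus4", "add9", "m(add9)"},
    (7, "A"): {"6", "7", "m6", "m7", "maj7", "aug7", "m(maj7)",
               "add9", "m(add9)"},
}


def intersect_bands(bands):
    """Return the chord types that appear in every one of the given bands."""
    return set.intersection(*bands.values())


candidates = intersect_bands(bands_table_7_7)
print(candidates)  # {'add9'}
```

Each band on its own leaves at least four candidate chord types, but the intersection across all seven hidden units leaves exactly one.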
Consider a second example, a chord that produces the following pattern of hidden unit activity: [0.16, 0.24, 0.20, 0.00, 0.12, 0.14, 0.00]. In terms of band labels, this pattern is equivalent to [B, B, B, A, A, B, A].
Table 7-8 The types of tetrachords found in each band to which the second single pattern presented to the network belongs.
Unit | Band | Tetrachords in band | ||||||||||
7 | add9 | m(maj7) | m6 | 6 | 7flat5 | m7 | dim7 | m(add9) | maj7 | 7sus4 | ||
1 | B | x | x | x | x | |||||||
2 | B | x | x | x | x | x | ||||||
3 | B | x | x | x | x | |||||||
4 | A | x | x | x | x | x | x | x | x | x | x | x |
5 | A | x | x | x | x | x | x | x | x | x | ||
6 | B | x | x | x | ||||||||
7 | A | x | x | x | x | x | x | x | x |
Note. The only type of chord found in each band is the dominant seventh (7), which is indicated in bold font.
Table 7-8 represents each of these bands in terms of their component tetrachord types. This table indicates that the only tetrachord type found in every band is the dominant seventh. Therefore, the stimulus presented to the network was a 7 chord.
The two examples of coarse coding illustrated in Tables 7-7 and 7-8 were chosen deliberately. We noted earlier that in terms of band contents add9 and 7 chords were similar because they are often in the same band. However, the coarse coding examples show that there are indeed differences between the two chord types; there are some bands where we find one but not the other. The band intersection technique takes advantage of this property, which explains how the messy contents of the 31 hidden unit bands permit identification of the type of chord presented to the extended tetrachord network.
The output units of the extended tetrachord network do not themselves literally identify chord types by determining intersections between sets of features captured by different hidden unit activities. Instead, the output units operate geometrically: hidden unit activities provide coordinates that arrange particular types of tetrachords along a plane, and the output units then carve this plane out of the hidden unit space (e.g., Figures 6-31 and 6-32). Functionally speaking, however, this geometric process of identifying tetrachord types is equivalent to finding intersections between bands. Tetrachord types that belong to the same band will have nearly the same coordinate along one of the dimensions of the hidden unit space. One type of tetrachord is separated from the other types in this dimension by being located at a different coordinate from the others in one or more of the other dimensions.
7.5 Summary and Implications
Chapter 7 has provided an account of a network trained on a third version of a chord classification task, identifying members of a set of extended tetrachords. This task is more complicated than those tasks described in Chapter 6 because the training set contained a broader variety of chord types. As a result, a more complex multilayer perceptron, one that used seven hidden value units, was required for this problem.
Interestingly, when I interpreted this more complex network, I discovered some properties that were very similar to those discovered in the networks discussed in Chapter 6. In particular, many of this network’s hidden units employed strange circles to solve this problem. Recall that a strange circle 1) picks out pitch-classes that belong to a particular interval cycle and 2) assigns all of the pitch-classes within this subset nearly identical connection weights. In other words, the hidden units use interval cycles to define equivalences between different pitches. Hidden Unit 1 did this with circles of major seconds, Hidden Units 2 and 4 organized inputs using circles of minor thirds, Hidden Unit 6 applied circles of tritones, and Hidden Unit 7 illustrated circles of major thirds. The other two hidden units balanced particular subsets of intervals, but not in a way that reflected perfect use of strange circles.
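To make the notion of an interval cycle concrete, the following minimal Python sketch partitions the 12 pitch-classes into the cycles generated by each interval mentioned above. It does not reproduce any of the network's connection weights; it only shows which pitch-classes would be treated as equivalent if every member of a cycle received the same weight. The function name `interval_cycles` and the choice to spell pitch-classes with sharps are assumptions made for this illustration.

```python
# Pitch-class names, spelled with sharps (an arbitrary choice for this sketch).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]


def interval_cycles(step):
    """Partition the 12 pitch-classes into cycles formed by repeatedly
    moving up `step` semitones (modulo 12)."""
    seen = set()
    cycles = []
    for start in range(12):
        if start in seen:
            continue
        cycle = []
        pc = start
        while pc not in seen:
            seen.add(pc)
            cycle.append(PITCH_CLASSES[pc])
            pc = (pc + step) % 12
        cycles.append(cycle)
    return cycles


for name, step in [("major seconds", 2), ("minor thirds", 3),
                   ("major thirds", 4), ("tritones", 6)]:
    cycles = interval_cycles(step)
    print(f"circles of {name}: {len(cycles)} cycles -> {cycles}")
```

Under this scheme, circles of major seconds yield two equivalence classes, minor thirds three, major thirds four, and tritones six. In every case the number of classes is smaller than 12.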
The interpretation of the extended tetrachord network also provided an opportunity to explore the use of distributed representations. The jittered density plots for each hidden unit revealed that different levels of activity in each unit picked out a particular subset of input patterns. Musically speaking, each of these subsets was very hard to understand. However, by combining the subsets picked out by each of the seven hidden units, and looking for the chord types that belonged to each, it became clear how the network used this distributed representation to solve the extended tetrachord classification problem.
The extended tetrachord network reveals two interesting properties that also appeared in Chapter 6. First, its hidden units have a marked tendency to use a construct from post-tonal music theory—interval cycles—to classify entities that belong to tonal music. Furthermore, it uses these cycles in a strange way, by creating equivalence classes so that pitch-classes that belong to the same interval cycle are all treated as being the same pitch-class. In other words, rather than using the 12 pitch-classes that are the foundation of Western music theory (be it tonal or atonal), this network’s hidden units operate in a musical world in which there are fewer than 12 pitch-classes. This is indeed an alien music theory.
This network also reveals an interesting example of an alternative representation for musical cognition, the distributed representation. Is it possible that when human cognition processes musical stimuli similar coarse codes are employed? One way to answer this question would be to explore the kinds of errors made by humans when learning to perform chord classification, or when classifying chords (post-learning) under additional attentional demands. We could compare these to network errors made during learning, or network errors made after particular hidden units are ablated from the network. Similar patterns of errors might point toward the discovery of a coarse code for human musical cognition.