Nature’s Phonemes
By understanding the different evolutionary roles for vision and audition, we just saw that audition is the appropriate modality to harness for language: sound is nature’s standard event stream, and language therefore wants to utilize sound to make sure language utterances get received. But what kinds of sounds, more specifically, should language use to best harness our brains? The sounds of nature, of course. But the natural world has a large portfolio of sounds it can make, and people are good at mimicking a fair share of these sounds, mostly with their mouths, but sometimes with the help of their hands and underarms. Saying that a well‑designed language will use sounds from nature is like saying one had “a sandwich” in a deli. Which sounds from nature? Wind blowing, water splashing, trees falling (when someone is around), leaves rustling, thunder, animal vocalizations, knuckle cracks, eggs breaking? Where is language to begin?
Although nature’s sounds are all over the map, there’s order to the cacophony. Most events we hear are built out of just three fundamental building blocks: hits, slides, and rings.
Hits happen whenever a solid object bumps into another object. When you walk, your feet hit the ground. When you knock, your knuckles hit the door. A tennis match is a game of hits–ball hits racket, ball hits net, ball hits ground. Hits make a distinctive sound. They happen suddenly, and the auditory signal consists of an almost instantaneous explosive burst of energy emanating from the impact.
Slides are the other common kind of physical interaction between solid objects. Slides occur whenever there is a long duration of friction contact between surfaces. If you drag your finger down the page of this book, you’re making a slide. If you push a box along the floor, that’s a slide. The auditory structure of slides differs from that of hits: Rather than a nearly instantaneous release of energy, slides have a non‑sudden start and a white‑noise‑like sound that can last for a more extended period of time. Slides are less common than hits. First, they require a special circumstance, the extended interaction of two surfaces; hits, on the other hand, are what perception scientists call “generic,” because no special coincidences are needed to carry off a hit. Second, when slides do happen their friction tends to significantly lower the energy in the event, and therefore they commonly occur at the tail ends of events. Third, whereas a long sequence of hits is possible (with intervening rings, as discussed in a moment)–as when a ping pong ball bounces lower and lower, for instance–a long sequence of distinct slides is not typically possible; something would have to stop one slide to allow another one to start, but any such interference with a slide is likely to involve a hit.
Hits and slides are the only physical interactions among solid objects that we regularly experience, and they are certainly the primary ones our ancestors would have experienced. We are land mammals. Splashes, involving a solid and a liquid, are neither hits nor slides, and although they could shape the auditory system of otters, seals, and whales, they’re unlikely to be of central significance to our auditory system.
With the two kinds of solid‑object physical interaction out of the way, we are left with the final fundamental constituent of these natural events: rings. A ring is what happens to a solid object after a physical interaction, that is, after a hit or a slide. When a solid object is physically impinged upon, it vibrates and wobbles, and although one can almost never see these vibrations, one can hear them. You can tell from the sound whether your pen is tapping your desk, your computer, or your coffee mug, because the same pen hit leads to different rings; you may also be able to tell that it is the same pen hitting the three different objects.
Different objects ring in distinct “timbres,” a word (pronounced “TAM‑ber”) that refers to the overall perceptual nature of the sound. For example, a piano C and a violin C have the same pitch, or frequency, but they differ in the quality or texture of their sound, and timbre refers to this. Most objects have very short‑lived rings–unlike the long‑drawn‑out ring of a gong–but they do ring, and once you set your mind to noticing, you’ll be amazed to hear these rings everywhere. And it is not just hits that ring, but slides as well. The vibrations that occur when any two objects hit each other will have many similarities to the vibrations resulting from the same two objects sliding together, so that we can tell that a coffee mug is being dragged along the desk because the ring possesses certain features also found in the ring of a pinged coffee mug.
Hits, slides, and rings are, therefore, nature’s primary phonemes (see Figure 3). They are a consequence of how solid physical objects interact and vibrate. Although these three kinds of sound are special in the lexicon of nature, there is nothing requiring language to carve sounds at these joints. Dog woofs, cat calls, horse neighs, whale song, and bird song do not carve at these joints. Neither does the auditory communication of a fax machine. But if a language is to be designed to harness the human auditory system, then it will be built out of the sounds of hits, slides, and rings.
Figure 3. The three principal constituents of physical events: (a) hits, (b) slides, and (c) rings. They sound suspiciously similar to plosives, fricatives, and sonorant phonemes in human languages.
Are human languages built out of these constituents? Yes. In fact, the most fundamental universal of human speech is that phonemes, the “atoms” of speech, come in three primary types, and these types match nature’s phonemes! Language’s hits, slides, and rings are, respectively, plosives, fricatives, and sonorants.
Plosives (like b, p, d, t, g, and k) are found in every language, and consist of sudden, explosive, high‑energy inceptions. Plosives sound like hits (even embedding their explosive hitlike starts in the name). Figure 4a shows the time‑varying frequency distribution for the sound made when I hit my desk with a small plastic cup, and one can see that the hit begins with a sharp vertical line indicating the presence of a wide range of frequencies at the instant of the collision. That same figure shows, on the right, the same kind of plot when I made a “k” sound. Again one can see the sharp edge at the beginning of the sound, characteristic of a hit. (Also note that, in English, at least, one finds many plosive‑filled words with meanings related to hits: bam, bang, bash, blam, bop, bonk, bump, clack, clang, clink, clap, clatter, click, crack, crush, hit, klunk, knock, pat, plunk, pop, pound, pow, punch, push, rap, rattle, tap, and thump.)
Languages have a second principal kind of consonant called the fricative, such as s, sh, th, f, v, and z. They are extended and noisy, and sound like slides. (In fact, the very word “fricative” captures the friction nature of a slide.) And just as slides are rarer than hits, fricatives are less common than plosives. All languages have plosives, whereas many languages (especially in Australia) do not have fricatives. Figure 4b, on the left, shows the frequencies of sound emanating from a small cup that I slid on my desk, and one can see that there is no longer a crisp start to the sound as there was for hits. There is also a longer duration of sound, all of it with a wide range of frequencies. On the right of Figure 4b is the same kind of plot, this one generated when I made a “sh” sound. One sees the signature features of a slide in fricatives. (Also note that in English, at least, one finds many fricative‑filled words with meanings related to slides: fizzle, hiss, rustle, scratch, scrunch, shuffle, sizzle, slash, slice, slip, swoosh, whiff, whiffle, and zip.)
The third principal phoneme type used across human languages is the sonorant, including vowels like a, e, i, o, u, but also sonorant consonants like l, r, y, w, m, and n. Each of these phonemes has strongly periodic vibrations, and has a complex spectral shape. Sonorants sound like rings. Figure 4c, left, shows the ringing after tapping my coffee mug. Only certain frequencies occur during the quickly decaying ring, and these frequency bands are characteristic of the shape and material properties of my mug. To the right of that in Figure 4c is the signal of me saying “ka.” (The plosive “k” sound corresponds to the tap.) As with the coffee mug, there are certain frequency bands that are more active, and these patterns are what characterize the sound as an “a.”
Lo and behold! The principal three classes of phonemes in human speech sound just like nature’s three classes of phonemes. We speak in hits, slides, and rings!
Before getting overly excited by the realization that language’s phonemes are like nature’s phonemes, we must, however, address a worry: How else could we speak? What if human vocalization can’t help but sound like hits, slides, and rings? If that were the case, then the observations made in this section would have little significance for harnessing; culture would not need to design language to sound like hits, slides, and rings, because our mouths would make these sounds by default. We take this up next.
Figure 4. Illustration that plosives, fricatives, and sonorants sound like hits, slides, and rings, respectively. These plots show the frequencies on the y‑axis, and time on the x‑axis. Comparison of (a) hits and plosives, (b) slides and fricatives, and (c) rings and sonorants.
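The acoustic signatures described in this section (a sudden broadband burst for a hit, an extended noisy sound with a gradual start for a slide, and a few dominant frequency bands for a ring) can be imitated with synthetic signals. The Python sketch below is only an illustration under stated assumptions, not the analysis behind Figure 4: the sample rate, decay constants, resonance frequencies, the 0.5 thresholds, and every function name are invented for the example, and each sound is modeled in isolation even though a real hit is normally followed by a ring.

```python
import math
import random

SAMPLE_RATE = 4000  # samples per second (assumed for this sketch)
N = 512             # 128 ms of signal

def make_hit(rng, decay_s=0.003):
    # Hypothetical hit: broadband noise that dies away within a few ms.
    tau = decay_s * SAMPLE_RATE
    return [rng.uniform(-1, 1) * math.exp(-n / tau) for n in range(N)]

def make_slide(rng, ramp_s=0.040):
    # Hypothetical slide: noise that fades in gradually, then persists.
    ramp = ramp_s * SAMPLE_RATE
    return [rng.uniform(-1, 1) * min(1.0, n / ramp) for n in range(N)]

def make_ring(freqs=(500.0, 1250.0), decay_s=0.030):
    # Hypothetical ring: decaying sinusoids standing in for resonances.
    out = []
    for n in range(N):
        t = n / SAMPLE_RATE
        tone = sum(math.sin(2 * math.pi * f * t) for f in freqs)
        out.append(tone * math.exp(-t / decay_s))
    return out

def early_energy_fraction(signal, window_s=0.010):
    # Share of the total energy that arrives in the first 10 ms.
    k = int(window_s * SAMPLE_RATE)
    total = sum(x * x for x in signal)
    return sum(x * x for x in signal[:k]) / total

def power_spectrum(signal):
    # Plain discrete Fourier transform; slow but dependency-free.
    size = len(signal)
    cos_t = [math.cos(2 * math.pi * i / size) for i in range(size)]
    sin_t = [math.sin(2 * math.pi * i / size) for i in range(size)]
    spectrum = []
    for k in range(1, size // 2):
        re = sum(x * cos_t[(k * i) % size] for i, x in enumerate(signal))
        im = sum(x * sin_t[(k * i) % size] for i, x in enumerate(signal))
        spectrum.append(re * re + im * im)
    return spectrum

def spectral_peakiness(signal, top_bins=8):
    # Share of spectral power held by the few strongest frequency bins.
    spectrum = power_spectrum(signal)
    strongest = sorted(spectrum, reverse=True)[:top_bins]
    return sum(strongest) / sum(spectrum)

def classify(signal):
    # Thresholds of 0.5 are arbitrary choices for this sketch.
    if spectral_peakiness(signal) > 0.5:
        return "ring"   # energy concentrated in a few frequency bands
    if early_energy_fraction(signal) > 0.5:
        return "hit"    # broadband and over almost at once
    return "slide"      # broadband, gradual, and extended

if __name__ == "__main__":
    rng = random.Random(0)
    sounds = {"hit": make_hit(rng), "slide": make_slide(rng), "ring": make_ring()}
    for name, signal in sounds.items():
        print(name, "->", classify(signal))
```

Two crude measurements carry the whole distinction here: how much of the energy arrives at the very start, and how much of the spectrum sits in a handful of frequency bins. Recordings of real objects or real speech would be far messier than these toy signals, so the sketch should be read as a restatement of the three signatures rather than as a working phoneme classifier.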