Public Affairs; The Responsibility Of Television
Transcript
This transcript was received from a third party and/or generated by a computer. Its accuracy has not been verified. If this transcript has significant errors that should be corrected, let us know, so we can add it to FIX IT+.
Oliver Strauss, a lot of things have happened to the human voice in the last 30 years. I don't mean physically; we still produce vocal sounds the way we always have. I mean in what is done to the voice, and I wonder if we could consider some of those alterations and changes: what they derived from and toward what they point. I take it you have in mind the fact that we interpose electronic machinery and tape recorders and telephone channels and radio stations between the speaker and the hearer, or even just a public address system. Is this what you have in mind? Yes, that is a good starting point. This would have begun, say, 30 years ago or so with the introduction of amplifying the human voice so that one voice could be heard by thousands and thousands of people. Well, I would put the commencement of this really back to the time of Alexander Graham Bell. But we'll see.
You mean to skip to a slightly more modern frame of reference? A more modern frame of reference and a wider-spread application. Well, something that is very much along this line was reported in the papers very recently. This had to do with the wartime conversations by radio telephone across the Atlantic between Winston Churchill and Franklin D. Roosevelt. The paper's report (I wonder if you saw it) stated that the Germans learned how to unscramble a secrecy scrambler used by them on speech communication. Well, just a moment. First, what do you mean by scrambling? Well, I think most of us who are familiar with James Bond movies and other spy stories are familiar with the fact that, to preserve secrecy and yet have telephonic communication, there are some horrors that one can do to the voice to make it totally unintelligible but which are reversible on the other end.
To restore the intelligibility, however, you have to have some tricky code or trick device which exactly undoes the scrambling of the egg. If it's not accurate, the speech will still be unintelligible. These in general are called scramblers and are used for secrecy purposes. And it turns out, then, that the Nazis had their own unscrambler for the... what was the machine which Churchill and Roosevelt used? It's a machine that was developed by the Signal Corps during the war and was called a vocoder, short for vocal or voice coder. It had approximately 10 channels that chopped up the frequencies existing in ordinary speech into 10 narrow bands and coded them, so that there was a special code for the loudness of each of these 10 frequency bands, and then reassembled them at the other end and recreated a certain kind of speech which was intelligible. Though I imagine rather Donald Duck sounding. No, not at all bad; it could
be made to be Donald Duck. In fact, as perhaps a first example of what can be done to the human voice, it was able to take the pitch channel out of the voice entirely. Pitch rises up and goes down (exaggerated and artificial as I did it) but is quite important in conveying a certain kind of emotional information, and a simple monotone rasp could be substituted for it. Speech under those circumstances lost a good deal of its emotional content, which derives from the tone of the voice and the pitch. Wouldn't that require a good deal of care in the wording or phrasing of the message? Well, really, yes and no. We have been coding speech for a good deal longer than Alexander Graham Bell. In fact, the invention of writing goes back some millennia B.C., and after all we do remove the tonal quality in writing
hieroglyphics or Sanskrit or Chinese ideographs or indeed our written English. So this is not such a cosmic thing to do; it is simply one example of the change that can be made, and was indeed made, by the wartime vocoder, which did not reproduce this pitch very well. There was another reason, of course, for using that besides secrecy, which was broached (breached, I think, is a better word) by the Germans, to the extent that they circulated the original English and translation copies of some of the Churchill-Roosevelt conversations to the Wehrmacht general staff and other politicos in Germany. The other aspect of this, of course, is that speech is a very, very redundant sort of communication device. By redundant I mean that if we say "ah,"
we are actually repeating the same thing several hundred times if that "ah" sound lasts for a second, because that is how many times the waveform which makes up speech is repeated in a second. Consequently, in principle it should be possible to transmit speech with very many fewer wiggles, or fewer dots and dashes, or fewer communicational elements than the actual high-fidelity speech which we are accustomed to in air-to-air communication, face to face. So one of the things that these devices can do is to get perhaps 10 conversations over a telephone line which, normally treated as a high-fidelity line, would only carry one. Much of the speech coding is done in order to conserve wire and radio channels.
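As a rough illustration of the vocoder idea described above (chop the speech band into about 10 narrow bands and send only a loudness value for each), the following Python sketch measures the energy per band for one short frame. It is a minimal sketch under stated assumptions, not the wartime device: the function name `band_energies`, the 8000 Hz sample rate, the 160-sample frame, and the equal-width bands up to 3000 Hz are all hypothetical choices, and the pitch channel and the resynthesis at the receiving end are left out.

```python
import math

def band_energies(frame, sample_rate, n_bands=10, f_max=3000.0):
    """Return the energy found in each of n_bands equal-width bands (0 to f_max Hz).

    Hypothetical helper: band layout and frame size are illustrative assumptions.
    """
    n = len(frame)
    energies = [0.0] * n_bands
    band_width = f_max / n_bands
    # Naive discrete Fourier transform; bin k sits at k * sample_rate / n Hz.
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if freq >= f_max:
            break
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        # Add this bin's power to the band that contains its frequency.
        energies[int(freq // band_width)] += re * re + im * im
    return energies

sample_rate = 8000  # assumed sampling rate in Hz
# A 20 ms frame of a 450 Hz tone stands in for a sustained speech sound.
frame = [math.sin(2 * math.pi * 450 * t / sample_rate) for t in range(160)]
energies = band_energies(frame, sample_rate)
loudest_band = energies.index(max(energies))  # band 1 covers 300 to 600 Hz
```

Ten numbers per frame are far fewer than the 160 samples they summarize, which is the economy the speaker has in mind when he talks about carrying several conversations on one line.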
So we can get more transatlantic conversations on the cable. There are two or three principal ways of doing this, each of them involving some kind of processing of the speech. One is this: you may have noticed that there are pauses, such as the one I just made. A very interesting trick which the telephone company now does on transatlantic cables is to sample the speech perhaps 10,000 times a second, pick up all the little pauses, and jam somebody else's conversation into the in-betweens of ours. This can only be done if you have many conversations and are sampling them all, and have a rather fancy computer to assign conversations to the empty spaces of the other conversations. It sounds incredible, and yet it is indeed done and has approximately doubled the
capacity of existing telephone cable communication across the Atlantic. Well, I'd heard about this increase of capacity but had never had any inkling as to how it was done. Well, this is only one of the several devices they use. It is one which is involved in actually processing the speech, as opposed to processing the radio waves running over the wire or over the channel. So this is one very important way of processing speech. But there are of course quite a few others, and some of them we are aware of when we hear them and some of them we are not. Well, this leads me directly back to something we were talking about earlier, and that is the non-verbal communication of information that goes all the way back to primitive times, when men (I suppose we can call them men) growled like animals to convey the fact that this is
my hunting territory, you stay away. And it is preserved in every language, certainly including our own, right to the present day, because we convey emotional information, which is a very large part of our communication, with tones rather than words. Yes, I know what you mean here, but I think I'm going to disagree with you, at least in part, as to our capacity to do very much with this electronically. Technically, this is not one of the areas, in my opinion, where we can do much, and it's because of this. We can make speech sound like Donald Duck, or we can by technical means make a male voice sound more like a female voice or more like a child's voice, and we can make it sound as though you were talking through a pillow. But we are so coded up in our heads
that it is extremely difficult to hide the emotional factors, which stem not just from how loud or soft we talk or how high-pitched or low-pitched our voice may be, but from all kinds of things such as the pace of speech. Is it fast? Is it slow? Are the words distinctly articulated, or are we slurring them as when we are drunk? We conclude a good deal about sobriety or drunkenness. "You know, but I'm only going Woodward street." The speech gets thick and confused, and this is a signal which gets through almost any speech which is still intelligible. So to distort the emotional feeling in the spoken word is still something with which I don't think we would fool many people, no matter how we process the speech. We can, however, increase the intelligibility of speech by
some of the devices that are available to us, and I think these are of some interest. Well, this whole business of intelligibility of course is almost central to this whole conversation. You can't communicate without intelligibility. Well, that's true enough, but now let's be more specific about it. Obviously the "perfect" (in quotation marks) speech system is one which sounds as though the person you are talking to or listening to were in the same room, as we are now. This involves an absence of any processing or any resulting change of quality, and putting out into the receiver everything that was originally transmitted by the speaker. There are, however, some rather wild things you can do to make speech more
intelligible, let's say to an airplane pilot, or still more a helicopter pilot, who is subjected to a terribly high level of surrounding noise, or somebody operating a jackhammer, or people who have walkie-talkies on an aircraft maintenance line where jet engines are warming up. These are extremely important because many of our environments are noisy, such as aircraft, and it's a little bit similar to some of the problems that the deaf have. This leads to a very interesting observation about speech. Most speech is a remarkable device for piercing noise, largely because it repeats itself. I suppose you know this redundancy: that the sound, after all, says the same thing in terms of the wiggles of the air waves from 50 to several hundred times in the course of a second, or actually in the course of the length of time it
takes to make a vowel. These are the loud sounds, these vowel sounds. The soft sounds in speech are still important: the hisses, the S's and the F's and the P's and the B's and the T's. These in general are not as loud as the vowels, but they carry more information than the vowels do. For example, if you have the word "bad," you can make more words with changes of the initial and final B and D than you can with changes of the vowel. Of course you can change "bad" to "bed" and "bed" to "bid." But there are many other things you can do with the initial and final letters, which however are softer. So if we had a device to make the soft letters loud and the loud letters (I use "letters" for what should properly be called phonemes, the meaningful sounds), to make the vowels
soft, then we would increase the intelligibility of speech, and it would pierce noise and other adverse situations, and perhaps deafness too, better than unprocessed speech. This is indeed a possibility. Now, there is something called infinite clipping, which is a little too complicated to go into in great detail but which does just this: it takes the soft sounds, the ones indeed that carry the largest fraction of information, and amplifies them, and quiets down the loud sounds. The net result is not only intelligible speech but speech which is reasonably pleasing to listen to; it's not grating on the ear too badly if it's properly done. So this is another kind of processing. We've talked about coding and scrambling for secrecy, or to get more speech over a given channel, or more different conversations over a
given cable, let us say. We've spoken about some of the emotional elements which are hard to disguise. Now here we have pure intelligibility, which can be gained by improving the ratio of loudness of the different elements of speech. Are there mechanical or electronic means for this clipping and this increase of intelligibility that could be applied to hearing aids for the extremely deaf? It has been suggested, and it would probably be a rather marked improvement, because in addition to cutting off the extreme loudness of the loud sounds and other loud noises of speech, it would have the good effect that if someone dropped a china plate on the floor next to a deaf person, he wouldn't go through the roof with the loud report of the breaking china. It would also quiet
the ambient random sounds that are so objectionable to people using hearing aids, which are impersonal things: today, for the most part, they simply make everything louder without discrimination as to whether it's speech or the report of a gun. There is another way besides clipping. That's called automatic volume control, or automatic gain control, which has been exploited to some extent, though to a very limited extent. How would these techniques affect the learning of a language? It seems to me that in attempting to learn a foreign language we frequently find key words less than intelligible. Well, I think you are striking out on new ground here, because as far as I know this has not been done. But if I am following you, you run into the same experience I have, namely that
a rapid speaker, or an ordinary speaker, in a foreign language drops his voice very often on familiar elements, either of sound within a word or occasionally several words, and one of the hardest things to do when you're in an unfamiliar language is to register these softly spoken sound elements or softly spoken words. And I think you have something here which would be worth exploring: that this automatic gain control or infinitely clipped speech might very well be quite a good educational aid in the familiarization of a newcomer to a language. Let me give you the example that I had in mind when I brought this up. When I was first attempting to learn to speak German, I found great difficulty in deciphering sentences as spoken conversationally by native speakers, frequently because of the German rule that a sentence containing a conjugated form of a verb and a past participle or an infinitive
has the infinitive at the end of the sentence. Being at the end of the sentence, if it were declarative, it was dropped in pitch and frequently inaudible. Well, one thing I know we couldn't do would be to cut up a German sentence and put it into English word order. It would take a computer to carry around with you, and even then I think that it might have some difficulty. But in terms of the dropping of the voice within a word or with a whole word or phrase, this is something I believe language teachers might well become aware of. I'm afraid we can't, during this conversation, invent the whole device. But it really isn't necessary, since it has been invented, or several such devices have been invented. It would be an interesting application project for some of the speech processing. I'd like to bring up another aspect of how speech recording and processing is a bit of a problem. We've had a lot of
talk in the press and in governmental circles in the last year or so, the last few years indeed, about wiretapping and about the use of recorded speech as evidence. Working in a broadcasting station, we're familiar every day with cutting out the coughs and other extraneous noises on the broadcast by simply taking a scissors, cutting a piece out of a recording tape, and splicing the two ends together. Now there is no more cough. The same problem comes up in the law. How do we know that this tape represents the truth, the whole truth, and nothing but the truth, namely that nothing has been cut out and nothing has been added to the original utterance of the speaker whom we wish to accuse or vindicate, as the case may be? And I have a feeling that this is why tape recording and other kinds of recordings, but particularly tape, which is the convenient way today,
has not had more standing in court. It is possible to add or subtract. You never quite feel sure that it's all there. Well, you can always inspect the tape physically to see if there are splices that have been made, but then of course there's a way around that: we can make another tape without splices from the original tape. Now, I think there is a way to cure this defect, and I personally have never heard of it being done. Let's invent something as we go, because this is a very real legal problem. Supposing we have some relatively complicated series of dots and dashes or other signal, which is repetitive but too complicated for somebody to dream up arbitrarily. And while we're recording the voice frequencies on the tape, which don't go above perhaps three, four, five thousand cycles per second for ordinary
recording, we also record, up at 20,000 or 30,000 cycles, this complicated pattern. I mean, this is beyond the range of human hearing but nevertheless intrinsic in the tape. Now if someone cut out a piece, it would be astronomically unlikely that the ends you would put together would match. Oh, I see what you mean. There'd be a kind of a gap, like someone drawing a line, and then it would jump to another place. The continuity of the line corresponds to the continuity of the code. And of course anything added would not have the same code on it at all. It either wouldn't have the code at all or would have something else on it that didn't match. So it would seem to me that there is another field where speech processing could indeed serve the law better than it does now. Well, another aspect of speech processing, and one that
we need to explore, I think, on this program: we've talked about increasing intelligibility by clipping as well as volume compression in general, and scrambling and all sorts of things, but we haven't touched on one thing, and that is the speed of hearing and comprehension. It's possible, I know, in working with a tape recorder to take a tape that's been recorded at seven and a half inches per second, play it back at 15 inches per second, and, provided the speaker is speaking at a reasonably normal rate, still be able to understand his words. Doubling the speed, yes. It's less intelligible, of course, but isn't there work being done on speeding up playback without changing pitch? Of course, when you change the speed of the tape, such as on the machinery we're using now, you jump the pitch up by an octave.
Yes, doubling the speed of speech raises the pitch an octave. It sounds rather peculiar. Yes, there is work being done on it now, and in fact the technical problems of speeding up speech without changing the pitch have been very nicely overcome. There is very good evidence that instead of the normal rate of talking, which in our conversational mode of speech might be 125 to 175 words a minute, we could double that to 250 to 350 words per minute, and still it would be intelligible. This would have great advantage, for instance, in talking books for the blind. Ocularly, you read anything from 200 to 1,000 words per minute, so that our brain is perfectly capable of taking in the language through the eyes at this rate. There's no defect in the central part of the central nervous system, and
preliminary evidence indicates that we can understand this time-compressed, speeded-up speech at least to something over 250 words a minute, maybe faster with practice. With all the radio broadcasting and news reports and so on that go on, it might be a little bit fearsome to consider this possibility. But on the other hand, for the blind, for instance, who have talking books, there is indeed a project going on of that sort now. And for certain kinds of things which in themselves are not like a play, where you wish to have pace and time and emotion, but things like news reports and factual reports, it has been suggested that this be tried not simply with laboratory groups but over a broadcast station, to see whether a 10-minute news broadcast could be comfortably absorbed in five minutes. And there seems to be a real place for this too. It's a rather exciting new possibility
for which techniques have been developed. This is when your aim is primarily to convey information? Yes. Rather than to create a dramatic effect? Yes. Well, how was this done? Can you explain, without being too technical, how you speed this up without changing pitch? Well, there are two ways basically of doing it. One involves chopping small pieces out of speech. The simplest way to explain would be this. I have referred twice now to the fact that speech is very redundant. Yes. That is, we don't say it once, we say it 50 times; speaking of the speech sounds, we say it many, many times. The shortest things are the sounds where you sort of click your tongue against the teeth or the top of your mouth, like a T, and even that lasts for a number of thousandths of a second, a number of milliseconds. Now supposing we were
to sample the speech 100 times a second and cut out every alternate one-hundredth of a second, and then put the slices together again. This would not change the pitch, and it would double the speed of the speech. This is basically the most straightforward way to do it. Technically it's a little complicated, but in concept it's the simplest. This would then shorten your T's to a still shorter thing, and they would still be T's, and it would shorten your A's and O's and I's, the vowel sounds. And this is one practical way to do it. Now there are other ways, but it actually would permit one to speed speech up to perhaps three times its normal rate, or one third of the whole duration. The question is, where do you begin to get a traffic jam, either in the ear or in the brain behind it?
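The slicing scheme just described (cut the speech into hundredths of a second, discard every alternate slice, and join what is left) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function name `compress_by_dropping`, the 8000 Hz sample rate, and the 200 Hz test tone are hypothetical, and the joins are left abrupt, whereas the speaker notes that a practical version is technically more complicated.

```python
import math

def compress_by_dropping(samples, sample_rate, slice_seconds=0.01):
    """Keep one slice of slice_seconds, drop the next, and so on.

    Hypothetical helper: each kept slice plays at its original rate,
    so the pitch within a slice is unchanged while the total duration halves.
    """
    slice_len = int(round(sample_rate * slice_seconds))
    kept = []
    # Step two slices at a time and keep only the first of each pair.
    for start in range(0, len(samples), 2 * slice_len):
        kept.extend(samples[start:start + slice_len])
    return kept

sample_rate = 8000  # assumed sampling rate in Hz
# One second of a 200 Hz tone stands in for a sustained vowel.
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(sample_rate)]
shortened = compress_by_dropping(tone, sample_rate)

print(len(tone), len(shortened))  # 8000 4000
```

Doubling the tape speed would instead squeeze every cycle into half the time and raise the pitch an octave; here the cycles that survive are untouched, and only their number is halved.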
And I'm not familiar enough to know how fast it could go, but I suspect that a limit somewhere around 400 words per minute might be achieved in trained people, maybe faster. Well, I was going to say this would require some practice, I'm sure. I think it would. It's a little bit imposing to think of what would happen if political speakers began to compress a one-hour political speech into 15 minutes. But there are rather better applications for it than that. Well, I think we've covered a good deal of information on this subject. Thank you very much, Oliver Strauss.
Series
Public Affairs
Episode
The Responsibility Of Television
Producing Organization
WGBH Educational Foundation
Contributing Organization
WGBH (Boston, Massachusetts)
AAPB ID
cpb-aacip/15-89d51w58
Description
Hallock Hoffman
Created Date
1967-06-21
Topics
Film and Television
Public Affairs
Media type
Sound
Duration
00:28:05
Credits
Producing Organization: WGBH Educational Foundation
Production Unit: Radio
AAPB Contributor Holdings
WGBH
Identifier: 67-3021-00-00-001 (WGBH Item ID)
Format: 1/4 inch audio tape
Generation: Dub
Duration: 00:28:16
Citations
Chicago: “Public Affairs; The Responsibility Of Television,” 1967-06-21, WGBH, American Archive of Public Broadcasting (GBH and the Library of Congress), Boston, MA and Washington, DC, accessed October 23, 2024, http://americanarchive.org/catalog/cpb-aacip-15-89d51w58.
MLA: “Public Affairs; The Responsibility Of Television.” 1967-06-21. WGBH, American Archive of Public Broadcasting (GBH and the Library of Congress), Boston, MA and Washington, DC. Web. October 23, 2024. <http://americanarchive.org/catalog/cpb-aacip-15-89d51w58>.
APA: Public Affairs; The Responsibility Of Television. Boston, MA: WGBH, American Archive of Public Broadcasting (GBH and the Library of Congress), Boston, MA and Washington, DC. Retrieved from http://americanarchive.org/catalog/cpb-aacip-15-89d51w58