- [Instructor] Next, let's take a look at the text-to-speech function, which we call twice in this app: once to convert Spanish text into spoken Spanish, and once to convert English text into spoken English. In both cases we're going to take advantage of the Watson SDK's TextToSpeechV1 class, and when you create an object of this class, like the other objects we showed you up above, you will need your API key as an argument. We're calling this variable tts, which is shorthand for text to speech, and we're going to use that object to create the audio. The stream of bytes that it gives us back we will simply write into a file on the local system, which we will then play back in the app.

We use a with statement here to open the specified filename argument for writing in binary mode, because audio is binary data. We call that object audio_file, and you can see here that the call to audio_file.write is going to output this content. We use the text-to-speech object's synthesize method to actually create the audio.
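The binary-write pattern just described can be sketched on its own: the file is opened in mode "wb" rather than "w" because the audio arrives as raw bytes. The byte string here is only a placeholder standing in for the audio content the service returns.

```python
# Placeholder bytes standing in for audio returned by the service
# (b"RIFF" is the four-byte magic number that opens a real WAV file).
audio_bytes = b"RIFF"

# Open the target file in binary mode, since audio is binary data,
# and write the byte stream out so the app can play it back later.
with open("speech.wav", "wb") as audio_file:
    audio_file.write(audio_bytes)
```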
We give it the text that we want it to speak, and we also give it the format in which we want to receive the audio back. In this case the media type is audio/wav, which means we're going to get a WAV file back from the Watson service. We also pass the voice that we want it to use to synthesize that speech. As you may recall, we specified a couple of different voices up above: one was the U.S. English voice called Allison Voice, and one was the Spanish voice for the U.S. called Sofia Voice.

Now, this method call will generate the actual audio, and what we get back, once again, is a detailed response object from Watson. Inside that detailed response object we can get the result that was returned to us. The result object's content property contains the actual bytes of the audio that we want to write out to disk. So we get that content and simply write it out into a file that we can then play back in a subsequent function call.