If you were on the internet in these past few months, chances are you saw Google's real-time translation Pixel Buds: a technology quite like the Babel fish in The Hitchhiker's Guide to the Galaxy, which can translate any sentient speech for its wearer, thus enabling them to communicate with virtually every being. The Google Pixel Buds come at a price of course, so why not build our own?! That's what Danielle and I thought at the latest hackference. We went on to create a Nexmo Babel fish that lets two people talk on the phone, with each party hearing a translated version of what the other party says.

In this blog post, we will go over how this Babel fish system works step by step, starting with the required setup and configuration. Then, we will set up a Nexmo number for handling incoming calls. Following this, we will implement a Python server which will receive speech via a WebSocket and route the incoming speech from the Nexmo number to the Microsoft Translator Speech API. We will use the Translator Speech API to handle the transcription and translation. On top of this, we will implement logic to manage a bi-directional dialogue and to instruct the Nexmo number to speak the translations. For ease of implementation, both parties will have to call our service's Nexmo number. Below, you can see a high-level system diagram of how an instance of speech from either side gets processed. Note that throughout this tutorial I will use the example of a German/British English conversation.

Diagram that shows how a message passes through the system. A German caller speaks a message in German, which Nexmo passes through to a Python server. The Python server sends the German audio to the Microsoft Speech API. The Speech API responds by sending the English translation as text to the Python server. The Python server then sends a request to Nexmo to speak the English message to the British caller. At this point the British caller hears the translated message in English.

If you would prefer to just see the code, it is available on GitHub here. You will need to have either Python 2.x or 3.x and the HTTP tunnelling software ngrok installed to be able to follow along. We will list all the commands you need to install everything else as you follow along.

Let's get started with our DIY Babel fish solution by setting up a virtual environment for this project using Virtualenv. Virtualenv allows us to isolate the dependencies of this project from our other projects. Go ahead and create a directory for this project and copy the following list of dependencies into a file in your project directory named requirements.txt:

    nexmo

To create and activate your virtual environment, run the following commands in your terminal:

    virtualenv venv # sets up the environment
    source venv/bin/activate # activates the environment
    pip install -r requirements.txt # installs our dependencies
    # if you are running python3 please run the following instead

At this point, please start ngrok in a separate terminal window by running the command below. ngrok will allow us to expose our localhost at port 5000 to incoming requests. You will need to keep ngrok running in the background for this to work. You can read more about connecting ngrok with Nexmo here. Once you run the above command, your terminal should look similar to the screenshot below. You will need the forwarding URL when configuring your Nexmo application and number in the next steps.

Screenshot of ngrok running in a terminal and displaying a forwarding URL of the form “”.
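To make the message flow from the system diagram a little more concrete, here is a minimal sketch of the routing logic for a two-party call. It is not the code from the project. The `Caller` class, `route_translation`, `handle_utterance`, and the `translate` and `speak` callables are all hypothetical names introduced for illustration. In a real server, `translate` would wrap the request to the Microsoft Translator Speech API and `speak` would wrap the request asking Nexmo to speak text to a caller; here they are injected as plain functions so that the sketch runs without any network access.

```python
from dataclasses import dataclass


@dataclass
class Caller:
    """One party on the call (hypothetical structure, for illustration only)."""
    number: str    # the caller's phone number
    language: str  # the language the caller speaks, e.g. "de-DE"


def route_translation(speaker, callers):
    """Return (listener, source_language, target_language) for a two-party call."""
    others = [c for c in callers if c.number != speaker.number]
    if len(others) != 1:
        raise ValueError("expected exactly one other party on the call")
    listener = others[0]
    return listener, speaker.language, listener.language


def handle_utterance(speaker, audio, callers, translate, speak):
    """Translate one utterance and have it spoken to the other party.

    `translate(audio, source_language, target_language)` and
    `speak(number, text, language)` are stand-ins (assumptions) for the
    speech-translation request and the text-to-speech request.
    """
    listener, source_language, target_language = route_translation(speaker, callers)
    text = translate(audio, source_language, target_language)
    speak(listener.number, text, target_language)
    return text


if __name__ == "__main__":
    german = Caller(number="491234", language="de-DE")
    british = Caller(number="447700", language="en-GB")

    # Stub implementations so the sketch runs offline.
    def stub_translate(audio, source_language, target_language):
        return "<{} -> {} translation>".format(source_language, target_language)

    def stub_speak(number, text, language):
        print("speak to {} in {}: {}".format(number, language, text))

    handle_utterance(german, b"raw audio bytes", [german, british],
                     stub_translate, stub_speak)
```

Passing `translate` and `speak` in as arguments keeps the bi-directional logic independent of either service: the same function handles the German caller and the British caller, and only the speaker changes.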