“Alexa what’s for Lunch” — an Alexa skill for lazy students of the FAU (pt.1)
At my university, there are two dining halls that serve food around noon. Their daily menu can be viewed online. Many of my friends and I check the menu daily, so I decided to automate the process and save some precious time.
Edit: This post documents the development of the Alexa skill “Mensa Spion”. Once it’s ready for public use, I will include an installation guide for interested members of FAU ;)
Voice-controlled applications consist of two parts:
- the voice user interface, which defines the interaction dialogue between Alexa and the user. This is done by defining so-called intents, which are invoked by a set of specified utterances
- a backend service, which contains the application logic. As developers, we need to provide this service ourselves. It can do anything from answering questions about a topic to switching on a light bulb.
With the updated Alexa Developer Console, the backend service can be deployed serverlessly. So instead of maintaining a server that answers the requests of the Alexa application, we can implement Lambda cloud functions. Amazon Web Services takes care of the resource allocation, and with the generous Free Tier program you will likely not even pay for it. This option can be chosen by selecting an “Alexa-hosted skill” template.
Setting up the dialogue interface (voice — UI)
The first step is to register as an Alexa Developer in the Developer Console.
On the console start page, choose “Create new skill”, then select the “Start from scratch” option with an Alexa-hosted backend. You can choose between a preset Node.js or Python template; we will use the Python version.
For users to activate your skill, set an invocation name. We name the skill “Mensa Spion” (German for “dining hall spy”). The skill can now be activated by saying: “Alexa, start the Mensa Spion”.
Next, we have to define the interaction dialogue of the application. An Alexa skill is structured into intents and utterances.
An intent is something the user wants (intends) to do.
Utterances are the expressions they might use to activate those intents.
For our simple Mensa Spion skill, we define one intent called getTodaysMenu. To invoke this intent, we define a couple of phrases and keywords which will activate it.
This might, for example, be “What’s for lunch today” or “What is on the menu in the dining hall”.
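In the console’s JSON editor, the resulting interaction model looks roughly like this (a trimmed sketch — the template also adds built-in intents such as AMAZON.HelpIntent, which are omitted here):

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "mensa spion",
      "intents": [
        {
          "name": "getTodaysMenu",
          "slots": [],
          "samples": [
            "whats for lunch today",
            "what is on the menu in the dining hall"
          ]
        }
      ]
    }
  }
}
```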
When the Mensa Spion skill is started, you can now activate this intent by using any of the defined phrases. You can add any number of utterances and intents. To check that there are no conflicting phrases and that your model works as desired, you can launch the test mode after hitting the “deploy” button.
Alexa will now react to the utterances you defined and send a request to the backend for every detected intent. Let’s see how we can react to these events on the backend side.
Designing the Backend (using AWS Lambda)
The backend service is where the Alexa skill gets its intelligence from. When an intent is triggered by the defined voice patterns, a request is sent to the backend. This can be implemented in many ways. If you want full customisation, you can even use a Raspberry Pi to host an HTTP server which listens to incoming requests. You just have to specify the address of your server under the sidebar tab called “Endpoint”. The easiest way, however, is to fill in the Lambda cloud functions generated by the template.
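To make the self-hosted option concrete, here is a minimal sketch of such an endpoint using only Python’s standard library. It parses the JSON that the Alexa service POSTs and routes on the intent name. The `route_intent` helper and the static replies are my own illustration; a production endpoint additionally needs HTTPS with a valid certificate and verification of Alexa’s request signature, both omitted here:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route_intent(intent_name):
    """Map a recognised intent name to a spoken reply (static for now)."""
    if intent_name == "getTodaysMenu":
        return "Heute gibt es Chili-Cheese-Burger."
    # Fallback covers the launch request and unknown intents.
    return "Dies ist der Mensa Spion! Frag mich was!"


class AlexaEndpoint(BaseHTTPRequestHandler):
    """Answers the JSON requests the Alexa service POSTs to the endpoint."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        alexa_request = json.loads(self.rfile.read(length))["request"]
        intent_name = alexa_request.get("intent", {}).get("name", "")
        body = json.dumps({
            "version": "1.0",
            "response": {
                "outputSpeech": {
                    "type": "PlainText",
                    "text": route_intent(intent_name),
                },
            },
        }).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)


# To run on the Pi (behind HTTPS!):
# HTTPServer(("", 8080), AlexaEndpoint).serve_forever()
```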
In the Alexa console, switch to the tab “Code”. If you chose the Python option upon skill creation, you should see an editor with an open file called “lambda_function.py”.
First, let’s change the skill launch response. In the code, look for the LaunchRequestHandler and replace the default text with something suitable; in the case of our Mensa Spion: “Dies ist der Mensa Spion! Frag mich was!” (“This is the Mensa Spion! Ask me something!”).
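In the template, the handler builds its reply with `response_builder.speak(text).ask(text)`. Stripped of the ASK SDK, the JSON this produces has roughly the following shape — a dependency-free sketch, where the reprompt and the open session allow the user to follow up with an intent:

```python
def launch_response(text="Dies ist der Mensa Spion! Frag mich was!"):
    """Shape of the JSON produced by response_builder.speak(text).ask(text)."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            # The reprompt is played if the user stays silent.
            "reprompt": {
                "outputSpeech": {"type": "PlainText", "text": text},
            },
            # Keep the session open so the user can ask for the menu next.
            "shouldEndSession": False,
        },
    }
```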
For every intent that was defined in the skill, a handler has to be defined and added to the skill builder. In our case, we need to define what happens for intents of the ‘getTodaysMenu’ type that we defined earlier.
The code below returns a voice response saying that Chili-Cheese Burgers are the dish of the day:
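The handler mirrors the LaunchRequestHandler pattern: a can_handle check on the intent name plus a handle method that returns the spoken reply. A minimal sketch of that logic, using the raw response JSON instead of the SDK’s response_builder:

```python
def can_handle(request):
    """True when the incoming request carries our getTodaysMenu intent."""
    return (request.get("type") == "IntentRequest"
            and request.get("intent", {}).get("name") == "getTodaysMenu")


def handle(request):
    """Return the Alexa response JSON for today's (static) menu."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": "Heute gibt es Chili-Cheese-Burger.",
            },
            # Close the session: the question has been answered.
            "shouldEndSession": True,
        },
    }
```

In the actual lambda_function.py, this pair of methods lives in a handler class derived from the SDK’s AbstractRequestHandler, which is then registered on the skill builder at the bottom of the file (roughly `sb.add_request_handler(GetTodaysMenuHandler())` — the class name is my own choice).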
For the sake of this post, we will always return the same dish. In the next post, we will see how we can pull some data from a website every time the lambda function is triggered.
Testing the skill
After building the interaction model and deploying the Lambda backend, let’s test the skill to verify it is working. To do so, click on the “Test” tab in the Alexa console. Alternatively, if you have linked your Echo to the developer account, you can just go ahead and test from your device.
Start the skill by saying its invocation name: “Alexa, start the Mensa Spion”. Alexa will answer with the launch response we defined in the LaunchRequestHandler. Then we can trigger the getTodaysMenu intent by saying any of the utterances we defined earlier. Alexa will answer with the (currently static) response we defined in the handler.
Alexa skills consist of two components: the voice-based user interface and the backend. We design the user interface by defining intents with corresponding utterances. For every intent we define, we create a handler in the backend, which processes the request and returns a spoken response.
Thanks for reading, and stay tuned for part 2, where we use web scraping to obtain up-to-date menu information.