Automate Alexa voice skill testing

Hi,

I am new to automation.

I would like to automate the testing of an Alexa voice skill.

These skills are written using the Python SDK.

Any suggestions on which tool I can use?


Hi Archi,
In UiPath Studio, voice calls can be automated by simply integrating with the respective activity package.


Thanks for the response @vijitha, but an Alexa skill is not a voice call.

You can think of it as an app that listens to your question and responds with either a question or an answer.

For example: "Alexa, what’s the weekend forecast for New Delhi?"


I was also assuming you were talking about the skills themselves, not the voice part; this is all done at an API level. The voice component is probably only useful as an end-to-end sanity check, I guess.
If I were trying to do this, I would explore using the ASK command-line tool first, and then decide whether the Amazon skill API is a good surface to test with. I am assuming the skill API is going to be useful for enabling and disabling the skill as part of the test environment, though.

You will also want to cover voice commands using a WAV file later, to verify that publishing to production works, as a simple daily smoke test. This sounds like a totally cool project; do let us know what you learn.

Hi @conrad.braam

The voice component is important because people speak the same word in different dialects and accents, and Alexa may understand the same word differently as a result.
For example, the model may expect the question to contain “here”, but if Alexa hears “hear” or “ear” it will look for that word in the model and, on not finding a matching question, it might exit the skill or ask the same question again.
We have to check and provide as many samples as possible to Alexa to train its model.
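To make the “here” / “hear” / “ear” example concrete, here is a minimal sketch of how candidate sample utterances could be generated for review. The `HOMOPHONES` table and the `utterance_variants` helper are made up for illustration; they are not part of any Alexa SDK, and the word groups would have to come from what you actually observe Alexa mishearing.

```python
from itertools import product

# Hypothetical homophone groups; extend with words your own skill mishears.
HOMOPHONES = {
    "here": ["here", "hear", "ear"],
}

def utterance_variants(utterance, homophones=HOMOPHONES):
    """Return every variant of `utterance` with homophones swapped in."""
    words = utterance.lower().split()
    # For each word, list its alternatives (or just the word itself).
    choices = [homophones.get(word, [word]) for word in words]
    return [" ".join(combo) for combo in product(*choices)]

if __name__ == "__main__":
    for variant in utterance_variants("what is the weather here"):
        print(variant)
```

The generated variants are only candidates; someone still has to decide which ones belong in the model as sample utterances.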

I just wanted automation to sanity-check the dialog flow.

When I was doing this in 2019, I used the ASK CLI to automate it.
I created the script once and replayed it after every deployment.
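The “write the script once, replay it after every deployment” idea can be sketched roughly like this. It is only an illustration of the pattern, not the ASK CLI itself: `replay_dialog`, `fake_skill` and the script format are hypothetical names, and in a real setup the `send` callable would wrap whatever you use to talk to the skill (the ASK CLI, the skill API, and so on).

```python
from typing import Callable, List, Tuple

# Each step is (what the user says, text the reply is expected to contain).
DialogScript = List[Tuple[str, str]]

def replay_dialog(script: DialogScript, send: Callable[[str], str]) -> List[str]:
    """Replay `script` through `send` and return a list of failure messages."""
    failures = []
    for step, (utterance, expected) in enumerate(script, start=1):
        reply = send(utterance)
        if expected.lower() not in reply.lower():
            failures.append(
                f"step {step}: said {utterance!r}, expected {expected!r} in {reply!r}"
            )
    return failures

def fake_skill(utterance: str) -> str:
    """Stand-in for a real skill so the sketch runs without any Alexa account."""
    if "forecast" in utterance:
        return "Which city would you like the forecast for?"
    if "new delhi" in utterance.lower():
        return "The weekend in New Delhi looks sunny."
    return "Sorry, I didn't get that."

if __name__ == "__main__":
    script = [
        ("what's the weekend forecast", "which city"),
        ("New Delhi", "sunny"),
    ]
    problems = replay_dialog(script, fake_skill)
    print("dialog OK" if not problems else "\n".join(problems))
```

Keeping the script as plain data means the same file can be replayed unchanged after each deployment, and a failed step tells you exactly where the dialog flow broke.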

I didn’t try it, but I think we could record the questions as WAV files and replay them to Alexa from a music player instead of speaking.

Speaking from personal experience, before the ask-cli testing I was talking with Alexa for 8 to 9 hours after every small deployment. After that my throat would be sore for a few hours, rendering me speechless, which I didn’t like a bit :laughing:


I don’t have any IoT, Alexa or similar devices, so my knowledge of the platform is limited to what I hear in IoT podcasts. I’m also assuming you are developing skills, not developing the engine.

Yeah, I would never be talking to the mic on the device as a way of testing it, so your approach of going with the CLI tool first sounds like the best validation from a development perspective. As for the speech recognition of your skill keywords, that is really a thing that Amazon control, and I would have thought it is a constraint on the product, as in that’s not a thing your team worries about. Your team would worry about the default skill word choices, but surely that’s a static analysis exercise where a linguist will advise? It must have been fun speaking to a robot for a whole day solidly, I sympathize.

As for WAV files, I’m going to assume you know that MP3 is the worst place to start, because MP3 has compression artifacts. If it was me, I would be skipping WAV files and looking for Ogg/Vorbis-capable tooling for composing snippets and playing them to get better fidelity. I am guessing the Alexa has a 3.5mm mic input jack, and that’s a good place to start: make a custom audio cable from parts from an electronics shop and plug it into a USB audio dongle, so that you have an audio output on your PC that you can play sounds to without losing the built-in sound ports… I mean, these are all suggestions; it’s not something I have seriously done in this kind of configuration.


Yes, we are developing a skill.

We take a linguist’s advice when we can; otherwise we like to talk to Alexa and try the words we think are homophones.

Your advice about Ogg files is worth a try!

Next time I get to test a voice skill I am definitely trying it!

I’m only mentioning lossless audio because the Alexa has a stonkingly powerful DSP chip, so if you play it an MP3 file, it’s often going to actually detect the compression artefacts. I’m assuming you want to mix noise into the samples; it’s worth looking to see if ffmpeg can do that for you as a command-line tool if you do intend to “test” the phrases for detectability.
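If you want to try noise mixing before working out the ffmpeg options, a rough stand-in using only the Python standard library might look like the sketch below. To be clear, this is not ffmpeg and not anything Alexa-specific: `mix_white_noise` is a made-up helper, it assumes 16-bit PCM WAV input, and plain white noise is only a crude substitute for real background sound.

```python
import random
import struct
import wave

def mix_white_noise(in_path, out_path, noise_level=0.05, seed=0):
    """Write a copy of a 16-bit PCM WAV file with white noise added.

    noise_level is the noise peak as a fraction of full scale (0.0 to 1.0).
    """
    rng = random.Random(seed)  # seeded so test runs are repeatable
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = src.readframes(params.nframes)

    count = len(frames) // 2  # number of 16-bit samples across all channels
    samples = struct.unpack(f"<{count}h", frames)
    peak = int(noise_level * 32767)
    noisy = [
        # Clamp to the 16-bit range so loud passages clip instead of wrapping.
        max(-32768, min(32767, s + rng.randint(-peak, peak)))
        for s in samples
    ]
    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(struct.pack(f"<{count}h", *noisy))
```

Running the same phrase through a few different `noise_level` values would give a set of files to play back and see at what point the phrase stops being recognised.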

I used this rig to explore something once; any old USB audio dongle should do. The attached circuit has a gain/volume control built in, but I would add an external 47K logarithmic potentiometer (47K Log 24mm [VR012], £1.08 from Bitsbox Electronic Components) to let you drop the input level for your experiments. You don’t need a DC blocking cap most of the time for audio experiments like this, though I did check for DC offsets before I plugged it in. Hazard disclaimer: if you build this, be aware that the Alexa mic ground might not be the same as the USB ground; that’s probably when you will want a 220nF DC blocking capacitor on both lines, or else you will overload something.
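As a rough check on the 220nF value: a series blocking capacitor and the resistance it feeds form a first-order high-pass filter with corner frequency 1 / (2πRC). The sketch below assumes the capacitor sees roughly the 47K of the potentiometer as its load, which is a simplification; the real input impedance of the device is not known here.

```python
import math

def highpass_cutoff_hz(resistance_ohms, capacitance_farads):
    """Corner frequency of a first-order RC high-pass filter: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * capacitance_farads)

if __name__ == "__main__":
    # 220 nF blocking cap into an assumed 47 kOhm load.
    print(f"{highpass_cutoff_hz(47e3, 220e-9):.1f} Hz")
```

Under that assumption the corner comes out at about 15 Hz, well below the speech band, so the capacitor should block DC without thinning out the voice samples.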
