TL;DR: The Alexa skill ecosystem is evolving, with different outcomes for large brands vs. independent developers.
When you ask Alexa for something, that request can be processed in several ways. First Party responses are those created by the Alexa team itself and are in general what you get when you ask, “what’s the weather” or “what time is it”. Third Party responses are those handled by independent developers who write “skills” (think apps). These are generally what you get when you mention a skill or brand name, such as “ask Uber to get me a ride”.
The skill ecosystem can be thought of as similar to Apple’s App Store. Developers create skills, get them approved (certified) and then users request them. Until recently all skills were free to use but could include the ability to charge for things while running the skill. You could charge for an extra sleep sound, or an extra life in a game, or for freemium features. Recently Amazon added the ability to require a payment just to use a skill.
Another similarity to the App Store is that discoverability is a challenge. With several hundred thousand apps or skills, it’s hard to get noticed. Amazon will promote your skill once it becomes popular but that first step is up to you.
A significant difference from the App Store is that most Alexa users are unaware that skills exist or that there actually is an Amazon Skill store. The vast majority of skills receive almost no user reviews.
In the attempt to make first vs third party responses transparent to the end user, Amazon forgot to educate users that there was such a thing as skills. Most consumers either don’t understand skills or don’t even know they exist. Most people view the Alexa device as a black box that answers their questions. The intricacies of first vs third party responses are irrelevant to them.
This became more of an issue (or perhaps less of one depending on your point of view) when Amazon added the ability to run a skill without first installing it. It used to be that to run one of the Sleep Sounds skills (a very popular category) you had to first enable the skill in the Skill store. Later it became possible to just say “Alexa tell sleep sounds to play” and it would transparently enable the skill. While this is good at reducing friction in the short term, it contributed to the lack of end user education about skills.
It also gave Amazon more control over the ecosystem because there are now several hundred Sleep Sounds skills from which Amazon gets to select when a user asks for sleep sounds. Amazon further muddied the waters via what is called a Name Free Interaction. Rather than saying “Alexa ask Domino’s to order me a pizza” you could just say “Alexa order me a pizza” which, invisibly to you, would talk to the Domino’s skill. This was great but opened the door to a question: suppose Pizza Hut also had a skill, which one would get the pizza order? The answer is that skill selection is an Amazon black box.
I have a skill that answers questions about Premier League Football (British soccer for Americans). You can say “Alexa ask Premier League about upcoming fixtures”. I also have a Name Free Interaction that allows you to say, “Alexa what are the upcoming premier league games”. However, it’s entirely up to Amazon whether they send that request to my skill, to another soccer skill, or just handle it themselves.
Until last year Amazon had something called the Developer Rewards program in which they would pay developers a nominal amount if their skill had become popular. The idea was that great skills helped the entire ecosystem and so developers should be rewarded. My own soccer skill got over a million invocations and I used to get $300-$400 a month from Amazon for making Alexa more helpful for all these users. The program was, however, cancelled last year, so unless your skill charged for things you made no money from it. This left many of us wondering why we were investing time into skill development.
To replace Developer Rewards Amazon added In-Skill-Purchases. From within a skill, you could make purchases, similar to in-game purchases on your phone. Just like Apple, Amazon takes a percentage of the sale. They reduced their cut from 30% to 20% when the Developer Rewards program was eliminated. They also added the ability to make purchases of physical items from the Amazon retail store from within a skill. As an example, you can purchase various team scarves from within my soccer skill.
Some skill developers have used these new features to make significant revenue. A hypnosis skill lets you purchase hypnosis sessions for smoking cessation, weight loss, etc. Leaving aside the question of whether being hypnotized by the voice assistant is a Good Thing, this skill’s developer makes a nice income. However, for many skill developers this emphasis on monetization flies in the face of the “Delight The Customer” mantra that Amazon espouses.
Just this week Alexa announced a set of ChatGPT/Large-Language-Model inspired features to make interactions more natural and context aware. You might not even need to say the “Alexa” word because the device could detect your body language and tell that you were speaking to the device. This will be game changing in many ways, not the least of which is that it should eliminate the accidental activations of the system when another person (or Zoom call or TV show) says “Alexa”. Basically, the notion of having to say the Alexa word or say a skill name seems to be going away. Is this a Good Thing?
For most users of the device, it probably is a very good thing. It should reduce some of the friction currently impeding voice interactions. If the system gains more context, it may reduce the incidence of Smart Home errors. For example, I have smart lights in my bedroom and the last thing I say at night is “Alexa turn lights off”. However, the system seems to find “on” and “off” hard to distinguish and so often believes I’ve asked for the lights to be turned on. If the system had access to the smart home context and knew that the lights were already on it could (perhaps) draw the conclusion that it was more likely that I was saying “lights off”.
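To make the “lights off” idea concrete, here is a minimal sketch of how device state could be used to re-weight an ambiguous recognition result. This is purely illustrative and is not how Alexa actually works: the function name `disambiguate`, the score format, and the `prior_redundant` value are all assumptions made up for this example.

```python
def disambiguate(asr_scores, light_is_on, prior_redundant=0.1):
    """Combine speech-recognition scores with the current light state.

    asr_scores: dict with keys "on" and "off" holding the recogniser's
        confidence for each command (hypothetical format).
    light_is_on: current state as reported by the smart home hub.
    prior_redundant: assumed probability that a user asks for the state
        the light is already in (an illustrative guess, not a measured value).
    """
    # A command that changes nothing (e.g. "on" when already on) is unlikely.
    priors = {
        "on": prior_redundant if light_is_on else 1 - prior_redundant,
        "off": 1 - prior_redundant if light_is_on else prior_redundant,
    }
    # Weight each recognition score by how plausible the command is.
    weighted = {cmd: asr_scores[cmd] * priors[cmd] for cmd in ("on", "off")}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("recognition scores must not all be zero")
    # Normalise so the two values sum to 1.
    posterior = {cmd: w / total for cmd, w in weighted.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# The recogniser slightly prefers "on", but the lights are already on,
# so the context-aware choice is "off".
command, scores = disambiguate({"on": 0.55, "off": 0.45}, light_is_on=True)
print(command, scores)
```

The point of the sketch is only that a small amount of context can flip a close call; a real system would have far more signals to draw on than a single on/off state.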
While I’m not sure how Domino’s and Pizza Hut are going to fight it out for the “I want a pizza” utterance, I’m hard pressed to see how this helps the independent developer. Just as the Apple App Store fills in gaps in the native capabilities of the phone, skills do things that Alexa can’t do by itself. As Alexa gains the ability to do more things, that almost certainly reduces the room for other skills. We must convince Alexa to pick our skill when a user says something that our skill could handle. I know there are lots of places other than my skill where one could find soccer scores. On the other hand, my skill does offer several custom metrics and graphs that really aren’t available elsewhere. The trick, as always, is discovery. If a developer created a skill that searched local pizza shop prices to find the best deal, would Alexa ever select that skill? That’s hard to say. I think small developers are going to have to learn new tricks to survive.
Top comments (2)
Nice article, Brian.
For me, Alexa is useless, as for some reason, I find it stupid to talk to a device in English, not my mother tongue.
Also, if it has problems understanding a native speaker (On vs. Off) it would probably be terrible at recognizing my thick accent.
For years, Google has supported Polish. Recently, ChatGPT showed the world that languages are not the problem anymore.
I would expect AWS to add support for many more languages, then we can talk. Until that moment comes, my Echo Dot will remain unpacked in the closet.
Better to have Amazon process our queries using an LLM than to have a rigid query model defined by the "skill" developer. It is much easier to just ask Alexa to order a pizza than to enable the Domino’s skill and have to make sure to use the correct 'intent' keywords.
I find Alexa skills dull and even though I have myself created a few Alexa skills just for trying out my dev skills and to score some free t-shirts, I myself never used them. So glad to hear that Alexa skills are evolving. Thanks for sharing!