Our acronym-hopping journey from B2C, B2B to B2C2B and everything in between.

Part 1: Vitamins

When launching a #voicefirst startup, a key consideration in a successful go-to-market strategy is how to balance where the market is today on the hype cycle against where it will be in 3–5 years. Many VCs and entrepreneurs are currently trying to solve this equation, as today’s customers are most probably not like tomorrow’s customers.

Like many in the voice space, we began our journey with lucid dreams of building a consumer-facing solution on Alexa that could claim universal adoption and adoration. However, our enthusiastic attempt to develop a profitable consumer voice product on the Amazon Alexa platform quickly proved problematic.

The challenges of consumer voice products

There is much buzz around voice right now, and Amazon has done a good job of pioneering the space through the popularization of the Alexa voice assistant. Google, pursuing a fast follower strategy, seems to be one step behind in releases, but as a result, one step ahead in quality.

The developer and enthusiast communities alike have assumed many things about how this space would shape up. Analogies from the early app store days, while popular, have proven to be false equivalencies in many respects, most notably in terms of early economic opportunity.

One thing is clear: It’s not the app store

In its first 30 days, the Apple App Store saw around 60 million downloads and generated roughly $30 million in sales, most of which went to developers. By contrast, the Alexa Skill Store only officially introduced in-Skill monetization capabilities in 2017, two years after its launch. There is still no clear path to an ad model in voice. The question looming over it remains: how can we force people to listen to ads on something that must obey their verbal commands?

Perhaps it is telling that not only has Amazon not published any similar revenue statistics, it also seemingly touts the exorbitant number of third-party Skills (70,000+) as its hero metric. While this may sound impressive at first glance, it is actually one of the bigger downsides of trying to run a business on Alexa, as it severely impairs your ability to be discovered.

Enables (their version of the download) and revenue numbers are not publicly available. What we do know, however, is that retention rates and general engagement are quite low on these platforms.

The Discoverability Crisis and Cannibalization

It should be noted that while Steve Jobs clearly saw third parties as vital to iPhone adoption, today’s voice assistant manufacturers still do not have as clear a vision of third parties’ role in these marketplaces. This is partially due to the function of voice as a convenience rather than a utility: a vitamin, not a painkiller. In the current state of voice, there is less functionality that private individuals or companies are willing to pay for. Its main uses, besides setting alarms and timers and taking notes, are centered around accessing general information, which no one is willing to pay for anymore when the free alternatives are so numerous.

Big brands are simply “ticking the box”

Most big brands nowadays are simply “ticking this box” in order to stake out a position or to benefit from the ancillary brand value their presence on these channels may provide. Domino’s, Uber, Kayak and a few other brands do have commercial Skills/Actions which allow you to purchase their services.

However, a quick scan of their reviews suggests that there is still a lot of room for development in accuracy and user experience. Until these nagging issues are fixed, voice projects will likely remain experiments rather than investments in a new, strategic channel for most companies.

It’s also worth mentioning that services which can already be ordered quite seamlessly over mobile are unlikely to be consumed in greater volumes thanks to voice; voice orders would more likely cannibalize existing mobile transactions instead.

So what’s the buzz?

The lower barrier to entry for developing on voice, compared with mobile, is contributing to the mass of lower-quality experiences packing these marketplaces. It has resulted in thousands of enthusiastic amateurs and hobbyists from across the globe developing semi-redundant Skills/Actions. Notably, the most popular third-party Skills/Actions are ambient sounds and noises or simple interactive games.

The way the Developer Rewards program began, paying an undisclosed monthly sum to third parties as an incentive, is also indicative of how this marketplace has been approached from its inception.

Despite the best efforts of many tech giants to advertise this future, most people still do not have a smart home. The smart-home angle has been touted ad nauseam on these assistants, but it is still more of a marketing use case than an actual general-population one.

So how do people make money on Voice Assistant platforms?

This is still unclear. We have seen some VC investments but few acquisitions or revenue numbers. Games have historically helped pioneer the adoption of new platforms. Game developers such as Volley have made the switch from mobile to voice and have benefitted from being early to market. That said, playing voice-first games can be slightly awkward, as they lack the visual stimuli and feedback which hook users and make games more navigable and satisfying. An acqui-hire by Headspace may be the most notable exit in voice.

In-skill monetization and subscriptions have been introduced on both major platforms; however, without compelling offerings, these potential revenue streams remain untapped and uninteresting to most customers.

Accelerate commercial activity

The emerging theme seems to be that the greater convenience of accessing information over voice isn’t something anyone is willing to pay for. So as V-commerce matures, the winners will probably be those able to reduce friction in the buying process for goods or services. That could mean reordering commodity items, which is a current focus, or something a bit more involved such as booking appointments.

Even the research that precedes buying larger-ticket items, where the purchase is then finalized over touch, holds much potential. Most of these activities are best achieved over multi-modal interfaces, where users can issue commands over voice and have results returned to them visually. The main players recognize this, which is why their newer devices ship with screens.

As brands get better at understanding which moments in the path to purchase voice can improve, we will see their presence mature. For startups and the developer ecosystem, however, it seems most will become (or remain) white-glove services supporting these efforts.