This year is Home Assistant’s Year of the Voice. It is our goal for 2023 to let users control Home Assistant in their own language. Today we’re presenting Chapter 2, our second milestone in building towards this goal.
In Chapter 1, we focused on intents – what the user wants to do. Today, the Home Assistant community has translated common smart home commands and responses into 45 languages, closing in on the 62 languages that Home Assistant supports.
For Chapter 2, we’ve expanded beyond text to now include audio; specifically, turning audio (speech) into text, and text back into speech. With this functionality, Home Assistant’s Assist feature is now able to provide a full voice interface for users to interact with.
A voice assistant also needs hardware, so today we’re launching ESPHome support for Assist. To top it off, we’re launching the World’s Most Private Voice Assistant. Keep reading to see what that entails.
To watch the video presentation of this blog post, including live demos, check the recording of our live stream.
Composing Voice Assistants
The new Assist Pipeline integration allows you to configure all components that make up a voice assistant in a single place.
For voice commands, pipelines start with audio. A speech-to-text system determines the words the user speaks, which are then forwarded to a conversation agent. The intent is extracted from the text by the agent and executed by Home Assistant. At this point, “turn on the light” would cause your light to turn on 💡. The last part of the pipeline is text-to-speech, where the agent’s response is spoken back to you. This may be a simple confirmation (“Turned on light”) or the answer to a question, such as “Which lights are on?”
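To make the flow concrete, here is a minimal sketch of the three stages chained together. It is an illustration only and not the Assist Pipeline API: the `Pipeline` class, the `fake_*` functions, and the `lights` dictionary are all hypothetical stand-ins, and the "audio" is just UTF-8 bytes so the example runs without any speech engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical stand-in for one voice assistant (not Home Assistant code)."""

    speech_to_text: Callable[[bytes], str]
    conversation_agent: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def run(self, audio: bytes) -> bytes:
        # Stage 1: determine the words the user spoke.
        text = self.speech_to_text(audio)
        # Stage 2: the agent extracts the intent, acts on it, and answers.
        response = self.conversation_agent(text)
        # Stage 3: speak the agent's response back to the user.
        return self.text_to_speech(response)


# Toy device state standing in for the smart home.
lights = {"light": False}


def fake_stt(audio: bytes) -> str:
    # Assumption: the "audio" is already text encoded as UTF-8 bytes.
    return audio.decode("utf-8")


def fake_agent(text: str) -> str:
    # Recognizes exactly one command; a real agent matches many sentences.
    if text.strip().lower() == "turn on the light":
        lights["light"] = True
        return "Turned on light"
    return "Sorry, I didn't understand that"


def fake_tts(text: str) -> bytes:
    # Assumption: the "speech" is just the response text as bytes.
    return text.encode("utf-8")


pipeline = Pipeline(fake_stt, fake_agent, fake_tts)
spoken = pipeline.run(b"turn on the light")
```

Because each stage is just a callable, swapping one service for another (a different speech-to-text engine, say) does not affect the other two, which is the idea behind mixing and matching voice services.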
Screenshot of the new Assist configuration in Home Assistant.
With the new Voice Assistant settings page, users can create multiple assistants, mixing and matching voice services. Want a U.S. English assistant that responds with a British accent? No problem. What about a second assistant that listens for Dutch, German, or French voice commands? Or maybe you want to throw ChatGPT into the mix. Create as many assistants as you want, and use them from the Assist dialog as well as from voice assistant hardware for Home Assistant.
Interacting with many different services means that many different things can go wrong. To help users figure out what went wrong, we’ve built extensive debug tooling for voice assistants into Home Assistant. You can always inspect the last 10 interactions per voice assistant.
Screenshot of the new Assist debug tool.
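Keeping only the last 10 interactions per assistant is a bounded-history pattern. The sketch below shows one common way to get that behavior in Python. It is an assumption for illustration, not how Home Assistant stores its debug data; `history`, `record_run`, and `MAX_RUNS` are hypothetical names.

```python
from collections import defaultdict, deque

MAX_RUNS = 10  # matches the "last 10 interactions" limit described above

# One bounded queue per assistant; the oldest run is dropped automatically
# once the queue is full.
history = defaultdict(lambda: deque(maxlen=MAX_RUNS))


def record_run(assistant: str, events: list[str]) -> None:
    """Store the events of one interaction for later inspection."""
    history[assistant].append(events)


# Record 12 interactions; only the most recent 10 are kept.
for i in range(12):
    record_run("Home Assistant", [f"run {i}"])
```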
Voice Assistant powered by Home Assistant Cloud
The Home Assistant Cloud subscription, in addition to an end-to-end encrypted remote connection, includes state-of-the-art speech-to-text and text-to-speech services. This allows your voice assistant to speak 130+ languages (including dialects like Peruvian Spanish) and respond extremely quickly.
As a subscriber, you can start using voice in Home Assistant right away. You will not need any extra hardware or software to get started.
In addition to getting high-quality speech-to-text and text-to-speech for your voice assistants, you will also be supporting the development of Home Assistant itself.
Join Home Assistant Cloud today
The fully local voice assistant
With Home Assista