Conversational AI is a grand term. Many AI systems that operate through voice or text are given this name, but few of them are capable of interactions that we can really call conversation.
Most conversational AI systems are limited to a small handful of scenarios or cannot follow an interaction longer than a question and its answer. In response, we’ve learnt to dumb conversation down: to speak slowly and awkwardly to AI customer service, to bark commands at our voice assistants at home, or to expect amusement without functionality from more socially oriented bots.
We’re building Alana to address the gaps in existing systems, paving the way to more fruitful, uplifting and expansive conversations.
Here are seven things we do differently.
1. Nuanced memory
Using anonymisation and encryption, Alana keeps memories of each person it talks with and uses these to build a model of the user. So Alana learns your likes and dislikes and remembers information about you.
Alana can use this information to provide more relevant browsing and recommendations. Deeper than this, Alana figures out the types of decisions we are likely to prefer in conversations. So Alana develops an understanding of the individual user that feels far more satisfying.
These memories also feed Alana’s wider understanding of humankind: as Alana has more conversations with more people, the system grows to understand more about human behaviour.
2. Engages with the real diversity of human conversation
When you speak to Alexa or Siri, you have to say something precise and well-formed in order to be understood. You might be confined to issuing a command with particular phrasing or to asking only yes-no questions. If the user strays from an imagined script, then in most cases the AI won’t understand (“Computer says no”).
But real conversations don’t always follow a fluent path. Sometimes we pause, we change our minds about what we are saying, we stutter, or we go back to something we said before. Alana is expert at dealing with these quirks of real-world language use.
This means Alana can handle messy, non-linear, unscripted conversations, like the ones we have human-to-human. It can decipher confusion and clarify with questions. It can follow a thread of thought and handle open-ended questions.
3. Long-form, discursive conversations
Most AI technology is based on the model of a single-interaction command and response. The next thing you say to Siri after it has responded is effectively a new interaction. Alana, by contrast, can hold long-form conversations and follow their threads.
Alana discovers new ways to continue engaging the user, probing them about their interests and offering new information that will be of interest. And the conversation can cover multiple topics. For example, Alana might be talking about movies, then delve into detail about a particular soundtrack and the musicians on it, before going back to the main discussion about movies.
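One way to picture the movies-to-soundtrack-and-back example is as a stack of open topics. The sketch below is purely illustrative and is not a description of how Alana works internally; the `TopicTracker` class and its method names are hypothetical, chosen only to show how a system could digress and then return to the main discussion.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TopicTracker:
    """Hypothetical helper: a stack of open topics, most recent on top."""

    stack: list[str] = field(default_factory=list)

    def start(self, topic: str) -> None:
        # Begin a new main topic, discarding anything left open.
        self.stack = [topic]

    def digress(self, subtopic: str) -> None:
        # Push a side topic while remembering the topic it branched from.
        self.stack.append(subtopic)

    def resume(self) -> str:
        # Close the current digression and return to the topic beneath it.
        # The main topic itself is never popped.
        if len(self.stack) > 1:
            self.stack.pop()
        return self.current

    @property
    def current(self) -> str:
        # The topic being discussed right now, or "" if none has started.
        return self.stack[-1] if self.stack else ""


if __name__ == "__main__":
    tracker = TopicTracker()
    tracker.start("movies")
    tracker.digress("soundtrack")
    tracker.digress("musicians")
    print(tracker.current)   # the deepest digression
    print(tracker.resume())  # one level back
    print(tracker.resume())  # back at the main topic
```

A stack is only one possible design. It suits conversations that return to earlier topics in reverse order, which is the pattern in the example above.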
As humans, we gather information from multiple sources and combine these to form new understandings. Alana can do this too. It can link information across, for example, different news articles, making connections and intelligent decisions based on this information.
4. Proactive suggestions
Our system is much more than a passive servant. Alana doesn’t wait for you to ask for something, but predicts what you’re likely to be interested in, and also asks for more information to help you define what you want. So you don’t have to go to Alana with a completely clear picture. Helping you find out what you’re looking for is part of what Alana does best.
Alana suggests new conversational directions, which can get you out of your echo chamber. It might suggest a news story about someone you’re interested in, but from a news source you wouldn’t normally consult, giving you a new take.
In some client use cases, Alana must be especially proactive in particular areas to make sure the user gets all the relevant information. This behaviour can be hand-crafted to meet the demands of the client.
5. Recognises emotion
We talk the most to the people we trust. That trust is built when someone can hear not only the words we speak, but how we say them and what we are feeling. Alana can gauge whether you are pleased or angry and respond appropriately.
6. Greater than the sum of its parts
Currently, conversational AI systems come in two flavours. They are either rule-based, meaning they are programmed to interact around a specific subject area but not much else, or they use deep learning, meaning they can chat on any subject but have little controllable functionality.
Where are the AI assistants that can help us with practical information, and at the same time provide companionship or show understanding? So far, AI does one or the other, not both.
Alana combines both of these approaches in one conversational AI, which can be specialised in specific tasks while still being able to chat on any topic.
And, in contrast to deep learning methods which are currently available elsewhere, the team can control the parameters of what Alana says, to eliminate nonsensical or inappropriate speech.
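The general idea of combining the two flavours can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Alana's actual architecture: `opening_hours_rule`, `open_domain_generator`, `BLOCKED_TERMS` and `respond` are all hypothetical names, the "generator" is a trivial stand-in for a learned model, and the word blocklist is a deliberately crude placeholder for whatever controls a real system would apply.

```python
from typing import Callable, Optional, Sequence


def opening_hours_rule(utterance: str) -> Optional[str]:
    # Hypothetical rule-based skill: answers only inside its narrow domain.
    if "opening hours" in utterance.lower():
        return "We are open from 9am to 5pm."
    return None


def open_domain_generator(utterance: str) -> str:
    # Stand-in for a learned open-domain model (assumption, not a real model).
    return f"Tell me more about {utterance.strip('?!. ')}."


# Assumed blocklist, used here only to illustrate output control.
BLOCKED_TERMS = {"idiot", "stupid"}


def is_acceptable(reply: str) -> bool:
    # Reject empty replies and replies containing a blocked word.
    words = {w.strip(".,!?").lower() for w in reply.split()}
    return bool(reply.strip()) and not (words & BLOCKED_TERMS)


def respond(
    utterance: str,
    rules: Sequence[Callable[[str], Optional[str]]] = (opening_hours_rule,),
    generator: Callable[[str], str] = open_domain_generator,
    fallback: str = "Sorry, could you rephrase that?",
) -> str:
    # 1. Task-specific rules get the first chance to answer.
    for rule in rules:
        reply = rule(utterance)
        if reply is not None:
            return reply
    # 2. Otherwise fall back to open-domain chat, filtered before it is spoken.
    reply = generator(utterance)
    return reply if is_acceptable(reply) else fallback


if __name__ == "__main__":
    print(respond("What are your opening hours?"))
    print(respond("jazz"))
```

Trying the rules first keeps task answers predictable, and filtering the generated reply before it is returned is one simple way to keep open-ended chat within set limits.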
7. Internationally acclaimed AI and computer scientists
There is a reason why Alana is ahead of the rest. It is the manifestation of the life’s work of the scientists on the team, some of whom have been working in the field for nearly 30 years and who together have over 100 years of expertise. Alana was born out of decades of laboratory research, carried out purely from a thirst for knowledge.
Built in close partnership with the Interaction Lab at Heriot-Watt University, Alana absorbs the collective innovations of world-class scientists from across the industry. So Alana is constantly growing and incorporating the very latest research.
Participating in real-world research projects throughout Alana’s development has adapted the technology to specific use cases and refined it into the product that is now available.
As a finalist in the Alexa Prize in 2017 and 2018, Alana spent two years in American homes learning to hold long-form conversations for information browsing and social interaction. Alana is now part of a healthcare robot for the elderly, a robot for the general public in a shopping centre, and a receptionist at Heriot-Watt University, and has been integrated into an animated character on smartphones.
There is no single code to crack for AI, and there can be no shortcuts. It takes a careful combination of multiple parts and years of finessing to create conversation that feels right.
Taken together, Alana’s features take conversational AI beyond any other technology that is commercially available.