Most conversational AI applications today are PoCs or pilot projects. They’re often rule-based assistants that handle simple FAQs. Mady Mantha, senior technical evangelist at Rasa, spoke at our recent deep learning conference about the key elements needed to build and scale enterprise-grade assistants for mission-critical applications. She also touched upon recent advances in NLU and dialogue, and demonstrated how to take conversational AI applications from PoC to production. We’re sharing some of the presentation here, but you’ll want to watch the video below to access all of the resources.
Terminology: understanding NLU and Dialogue Management
NLU: The goal of NLU is to extract structured information from messages. The most important parts of the message are what we call intents and entities.
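To make the idea of structured output concrete, here is a toy sketch in Python. It illustrates only the *shape* of what an NLU component produces (an intent plus a list of entities); the keyword matching, intent names and city list are invented for illustration and have nothing to do with how Rasa's trained models actually work.

```python
def naive_nlu(message: str) -> dict:
    """Toy NLU step: shows the shape of structured NLU output
    (intent + entities), not how a trained model works."""
    keyword_intents = {"rain": "ask_weather", "book": "book_flight"}  # hypothetical intents
    known_cities = ["Berlin", "Paris", "London"]                      # hypothetical entity values

    # Pick the first intent whose keyword appears in the message, else fall back.
    intent = next((i for kw, i in keyword_intents.items() if kw in message.lower()),
                  "fallback")
    # Extract any known city mentioned in the message as a "city" entity.
    entities = [{"entity": "city", "value": c} for c in known_cities if c in message]
    return {"text": message, "intent": intent, "entities": entities}

print(naive_nlu("Will it rain in Berlin tomorrow?"))
# {'text': 'Will it rain in Berlin tomorrow?', 'intent': 'ask_weather',
#  'entities': [{'entity': 'city', 'value': 'Berlin'}]}
```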
Dialogue Management: Dialogue management looks at the user’s message and the extracted intents and entities, and figures out what response to send to the user. As Mady explains, “It looks at everything that’s been said before, looks at what the user has just said or asked, and decides what’s the next action the bot can reliably take. It could be sending a message back to the user, it could be calling a REST API and then sending a message, it could be querying a database, etc.”
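A minimal, hypothetical next-action chooser can make this concrete. The action names and rules below are invented for illustration; a real dialogue policy would be learned from conversation data rather than hand-written:

```python
def next_action(parsed: dict, slots: dict) -> str:
    """Toy dialogue manager: choose the bot's next action from the parsed
    intent/entities plus conversation state. Action names are hypothetical."""
    if parsed["intent"] == "ask_weather":
        # Use a city from the current message, or one remembered from earlier turns.
        city = next((e["value"] for e in parsed["entities"] if e["entity"] == "city"),
                    slots.get("city"))
        if city:
            return "action_query_weather_api"  # e.g. call a REST API, then reply
        return "utter_ask_city"                # slot missing: ask a follow-up question
    return "utter_default"

parsed = {"intent": "ask_weather", "entities": []}
print(next_action(parsed, {}))                  # utter_ask_city
print(next_action(parsed, {"city": "Berlin"}))  # action_query_weather_api
```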
Rasa creates a DIET for conversational AI
For a really powerful system that can handle all of this, Rasa built DIET: Dual Intent and Entity Transformer. It’s a new open-source, multi-task transformer architecture for joint NLU that handles both intents and entities. It does intent classification and entity extraction together: it consumes one message and outputs the intents and entities at the same time.
DIET is a transformer-based architecture for NLU. It can use any pre-trained language model in a plug and play fashion.
Thus you can use BERT, you can use ConveRT, you can use GloVe.
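In Rasa, this plug-and-play choice is made in the model configuration’s pipeline. The fragment below is a hypothetical example; the component names and options follow Rasa’s documentation, but check them against the version you’re running:

```yaml
# config.yml (hypothetical example pipeline)
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: LanguageModelFeaturizer      # dense features from a pre-trained model
    model_name: bert                   # swap in another supported model here
  - name: CountVectorsFeaturizer      # sparse features
  - name: DIETClassifier              # joint intent classification + entity extraction
    epochs: 100
```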
The modular architecture fits into a typical software development workflow, which is important for encouraging citizen developers as Rasa works to democratise conversational AI.
DIET outperforms the current SOTA even without any pre-trained embeddings
Mady notes that during their experiments, “we found that DIET improves upon the current state of the art without any pre-trained embeddings, with just sparse features.”
DIET outperforms BERT
It’s six times faster to train and more accurate than anything Rasa has ever released. Large-scale pre-trained models generalise well across tasks, but they’re also very computationally expensive; DIET outperforms these models while being very lightweight and fitting into a typical software development workflow. DIET is a joint NLU architecture that does intent classification and entity extraction, and it makes the first piece of conversational AI, NLU, a lot more performant and a lot more robust.
Transformer embedding dialogue policy
The second part of conversational AI is dialogue management.
After DIET (or whatever NLU architecture you end up using) extracts the intents and entities, the dialogue management engine decides how to respond. Mady explains, “With systems that don’t go beyond rules, FAQs and one-turn or single-turn interactions, it’s going to be a challenge to really respond to conversations that are not linear, that are actually multi-turn: natural-sounding dialogues in the way that humans speak. Because in real-time conversations, thousands of users often deviate from a linear path.”
It’s going to be a challenge when you try to build a production-ready assistant, because scripted conversations often match only 20% or so of what actually happens in production. If you want to build an assistant that automates maybe 30% of conversations, then you have to handle sub-dialogues, chitchat, user corrections and the other things that happen in natural conversation.
You also need a way to make iterative improvements and retrain your model so that it mirrors what actually happens in production and works reliably there.
The Transformer Embedding Dialogue policy can untangle sub-dialogues
Rasa wanted to build something that didn’t have to use a state machine; something that could handle chitchat, interjections, nonlinear conversation and unexpected user behaviour.
To handle this in a better way, and to avoid bottlenecks, instead of using a typical recurrent neural network the Rasa research team came up with the TED policy: the Transformer Embedding Dialogue policy, which uses a transformer-based architecture. It relies on self-attention, a machine learning mechanism whereby the model learns which conversation turns to pay attention to and which turns to ignore in order to complete the task at hand, for example learning to ignore jokes. And it will start paying attention again when the user starts cooperating again.
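The weighting step at the heart of self-attention can be sketched in a few lines of pure Python. This is a minimal single-head version with no learned projections (in the real TED policy, the query/key transformations are learned from data), so the "turns" here are just toy feature vectors:

```python
import math

def attention_weights(query, turns):
    """Scaled dot-product attention weights of one query turn over all turns.
    Minimal single-head sketch; no learned projections (a real policy
    learns these from training conversations)."""
    d = len(query)
    # Similarity of the query to each turn, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, turn)) / math.sqrt(d) for turn in turns]
    # Softmax-normalise the scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Two earlier "turns" as toy vectors: one on-task, one an off-topic joke.
on_task, joke = [1.0, 0.0], [0.0, 1.0]
w = attention_weights(query=[1.0, 0.0], turns=[on_task, joke])
print(w)  # the on-task turn gets the larger weight
```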
TED uses all of the training data and conversation examples and is able to better generalise from them. It looks at previous conversations and trains on them, and when it sees something unexpected, something it hasn’t seen before, it is better equipped to make a reasonably confident prediction about what should happen next. TED uses not rules but data and machine learning, specifically self-attention, to handle sub-dialogues and multi-turn interactions.
Research is not enough in conversational AI
Research isn’t enough. You also need proper software engineering best practices: things like CI/CD, test-driven development, purposeful integrations with third-party services to add skills to your system, and integrated version control.
Mady explained, “We wanted to make the deployment aspect of that easier by abstracting away some of the complexity involved. So we have a one-line deploy script; if you run that script, it installs a containerized conversational AI application for you. It installs Kubernetes and a Helm chart with everything. If you just use the one-line deploy script, you could have a containerized application within five minutes.
It allows you to instead focus on improving your chatbot, right, and not getting bogged down with the deployment, while still being able to use the best technologies, like Kubernetes and containerized applications.”
How to build conversational AI in just a few steps
You want to start by building a minimum viable assistant:
- First, build something that handles the basic happy paths. It can cover all of the FAQs that people typically ask your customer service representatives or online webchat portal. Collect all those FAQs and add them to the chatbot.
- Once you have the basic, most important things covered, improve it by talking to the assistant yourself.
- Test it out yourself, make sure that it’s answering those questions correctly. If it’s not, go back and fix them.
- Write unit and functional tests, just like you would for any other software application, to improve your chatbot and make it more robust.
- Model better interactions
- Ship it to test users
- Ship it to real users so they can start interacting with it and talking to it.
- Use all of those real conversations to continue to improve your chatbot.
- Fix any mis-annotated text, tweak responses, and then add additional capabilities based on what users are asking for.
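For the testing step above, Rasa supports writing conversations as test stories. The fragment below is a hypothetical example: the story name, intent, entity and action are invented, and the format follows Rasa’s test-story documentation, so check it against your Rasa version:

```yaml
# tests/test_stories.yml (hypothetical example test story)
stories:
- story: weather happy path
  steps:
  - user: |
      will it rain in [Berlin](city)?
    intent: ask_weather
  - action: action_query_weather_api
```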