Conversational AI systems have been around for several decades, and we usually know them as chatbots. Nowadays the word “chatbot” is overused, and it is losing its meaning as a productivity tool. Yet chatbots, and conversational AI in general, have the potential to be among the most revolutionary productivity tools ever invented. With that in mind, we need to understand the current state of chatbots, their future, and their fundamental bottlenecks, if there are any.
First, let’s live for a moment in a future where someone has invented a “true” conversational AI. The first thing that comes to mind is probably an AI from one of your favorite sci-fi movies. The computer talks to you and solves your problems immediately. It outsmarts you in every way! Even if the machine doesn’t outsmart you, you expect it to hold an engaging and meaningful conversation with you.
Back in the 1960s, Joseph Weizenbaum created a chatbot named ELIZA. It was revolutionary for its time. The list then grew with landmark chatbots such as PARRY, Jabberwacky, and Mitsuku. Some of these chatbots were built well before deep learning became widely known, yet they reportedly performed surprisingly well in Turing-test-style evaluations. With recent advances in deep learning, we have entered an era in which the AIs we saw in movies are close to becoming reality. Interesting conversational AI applications that have emerged recently include Google Meena and IBM’s Project Debater.
When we look at this amazing evolution of chatbots from projects like Eliza in the 1960s to projects like Google Meena in 2020, a question comes to my mind.
Now let us dig deeper. The first thing to note is that all of the chatbot applications cited above are specific use cases built to imitate certain types of human conversation. They cannot hold fluent conversations outside the scope of their intended use cases. The technology has not reached a stage where a generic AI can react intelligently in any given scenario. You can consider every deployment of conversational AI today as a specific use case, and every use case has a defined scope. The question we raised before can now be rephrased as follows.
First, it is clear that building a use case is a manual process. At least to a certain degree, it will remain manual even when the technology becomes highly sophisticated. Every practical business use case that we automate through conversational AI has its own knowledge base, its own way of reasoning, and its own way of communicating. Even if we replace the conversational automation with a human agent, we have to teach that agent certain rules and other things specific to the business process. This “teaching”, or knowledge transfer, has to be done manually by another person. In effect, this is how we configure a human agent for a specific use case, except that we do it through natural-language instructions.
Before you build any conversational process flow, you have to represent it. If you are communicating with a human agent, you do that with natural-language instructions, plus illustrations if necessary. You will probably need a couple of hours to communicate the process flow completely (assuming he or she listens carefully and correctly understands what you mean). This is how we build a conversational use case with a human agent. Looking at the chatbot applications mentioned before, to develop rule-based bots like ELIZA or PARRY, developers have to express the use-case details as hard-coded rules. To build a conversational use case with deep learning technology (such as Meena), developers express the use-case details indirectly, through training data.
You can also express a use case through data and rules together, and in a way this is what we do on modern bot-building platforms such as Google Dialogflow or Microsoft LUIS. Developers express their view of the use case through hard-coded rules and data. In Dialogflow, for example, we use training data for intent recognition, which is how we express what should be recognized. We use Dialogflow’s “context tags” to control the chat flow; writing them amounts to writing hard-coded rules that express how the conversational transitions should happen. On some platforms (e.g., Rasa), the chat flow is expressed entirely as data rather than through a mix of data and hard-coded rules.
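To make the three ways of expressing a use case more concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the patterns are not ELIZA’s actual script, the word-overlap “classifier” is only a toy stand-in for a trained intent model, and the transition table merely imitates the idea of context-controlled chat flow. It does not use the API of Dialogflow, LUIS, Rasa, or any other platform.

```python
import re

# 1) Use case expressed as hard-coded rules (pattern -> response template).
#    Hypothetical rules in the spirit of rule-based bots, not ELIZA's real script.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]


def rule_based_reply(text):
    """Return the response of the first matching rule, or a generic fallback."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return "Please tell me more."


# 2) Use case expressed as data: example utterances per intent.
#    Intent names and examples are assumptions made up for this sketch.
INTENT_EXAMPLES = {
    "order_pizza": ["i want to order a pizza", "can i get a pizza"],
    "confirm": ["yes please", "that is correct"],
    "cancel": ["no thanks", "cancel my order"],
}


def tokens(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def classify_intent(text):
    """Toy stand-in for a trained model: pick the intent whose examples
    share the most words with the input; None if nothing overlaps."""
    words = tokens(text)
    best_intent, best_score = None, 0
    for intent, examples in INTENT_EXAMPLES.items():
        score = max(len(words & tokens(example)) for example in examples)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


# 3) Chat flow expressed as rules: (current context, intent) -> (reply, next context).
TRANSITIONS = {
    ("start", "order_pizza"): ("One large pizza. Shall I place the order?", "awaiting_confirmation"),
    ("awaiting_confirmation", "confirm"): ("Your order is placed.", "start"),
    ("awaiting_confirmation", "cancel"): ("Order cancelled.", "start"),
}


def step(context, text):
    """Advance the conversation by one turn; stay in place if no rule applies."""
    intent = classify_intent(text)
    fallback = ("Sorry, I did not understand that.", context)
    return TRANSITIONS.get((context, intent), fallback)


if __name__ == "__main__":
    print(rule_based_reply("I need a holiday."))
    context = "start"
    for utterance in ["I would like to order a pizza", "yes please"]:
        reply, context = step(context, utterance)
        print(reply)
```

Even in this tiny example, the developer has to write the rules, the example utterances, and the transitions by hand, which is exactly the modeling effort discussed here.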
To answer our original question: we believe that “modeling a chatbot” for a given use case takes a lot of time and effort, and that this can be viewed as a major bottleneck of the chatbot design process. Bad chatbots result from the imbalance between the effort developers have to invest and the end result they get. Recent advances in deep learning have certainly made the modeling process more efficient but, as much real-world evidence suggests, there is still a huge gap to fill. All of the wonderful, professionally built conversational AI applications have gone through a massive modeling process focused on a particular use case. Perhaps we can say that the “human instructing another human” scenario is the ceiling, the ultimate benchmark of conversational “modeling efficiency”, where the developer conveys the process flow conveniently and efficiently to the other party. Measured against that benchmark, we can clearly see that modeling effort is a fundamental bottleneck of conversational AI design.
Many emerging technologies promise to make conversational AI more human-like and reliable. Still, we will not see a truly human-like chatbot on a website any time soon unless we address this fundamental bottleneck of conversation modeling. Natural Language Processing techniques are maturing, and technologies for Natural Language Generation are emerging as well. But their contribution to the expressive power of conversational modeling has not been nearly as dramatic as the “wow” factor of their performance in their own applications.
We at Cognius are trying to solve exactly this problem by creating a framework in which developers can conveniently represent the process flows of conversational scenarios and build conversational AI use cases efficiently. We understand that it is a massive challenge, but we are committed to solving conversational modeling, one step at a time! For this purpose, we created our Conversational Modeling Language (CML), based on our patented framework called the “3 Block Concept”. I will conclude this article here and leave a detailed discussion of CML for another article.