End-to-End Learning for Child Chatbots

End-to-end learning has the potential to replace the traditional pipeline-oriented structure of a chatbot. Instead of feeding data through a series of pre-defined, discrete components, in end-to-end learning a single deep neural network takes in an input and returns an output. In a chatbot’s case, the model is given a string (the user’s query) and is expected to return a single response.

This approach has several benefits. The first is perhaps simplicity: the developer does not have to select components to put in the pipeline, which makes it easier to get started. More importantly, it decreases the amount of domain-specific knowledge one needs in order to get decent results. In the traditional pipeline model, components can only be selected well if one understands what’s going on behind the scenes for the specific application. For a child-focused chatbot, one would likely have to modify the default pipeline provided by frameworks such as Rasa until it works well, and to do that one would have to understand how each component works and what makes the most sense for responding to children specifically. With end-to-end learning, that modification doesn’t have to be done manually.
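To make the contrast concrete, here is a minimal sketch of the two architectures. The component functions and the end-to-end stub are hypothetical stand-ins, not Rasa’s actual API: in a real system each would be a trained model.

```python
def tokenize(text):
    # pipeline component 1: split the raw query into tokens
    return text.lower().split()

def classify_intent(tokens):
    # pipeline component 2: stand-in for a trained intent classifier
    return "greet" if "hello" in tokens else "unknown"

def select_response(intent):
    # pipeline component 3: map the predicted intent to a canned reply
    responses = {"greet": "Hi there!", "unknown": "Sorry, can you rephrase?"}
    return responses[intent]

def pipeline_chatbot(text):
    # traditional pipeline: each hand-picked component feeds the next
    return select_response(classify_intent(tokenize(text)))

def end_to_end_chatbot(text):
    # end-to-end: a single learned model maps query -> response directly
    # (a trivial stub here, standing in for a trained neural network)
    return "Hi there!" if "hello" in text.lower() else "Sorry, can you rephrase?"

print(pipeline_chatbot("Hello robot"))    # -> Hi there!
print(end_to_end_chatbot("Hello robot"))  # -> Hi there!
```

The point of the sketch is where the design effort goes: in the pipeline version, the developer chooses and tunes every component; in the end-to-end version, all of that is folded into one trained model.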

In addition, despite its simplicity, end-to-end learning can still achieve good results with limited training time, which makes it a practical substitute for traditional pipelines in many situations.

However, there are many drawbacks, some of which apply specifically to chatbots. The first is that end-to-end models require a lot of data to train effectively. While finding and parsing conversational data is rarely easy in general, it is doubly hard for children’s conversational data. As I mentioned in a previous post, training data for conversations involving young children (that can be used ethically for this kind of research and development) is hard to come by, which makes end-to-end learning difficult to rely on as a catch-all solution.

In addition, many chatbot functions intrinsically require some human-written logic somewhere in the pipeline. For instance, when a user asks “How is the weather in Osaka?”, an end-to-end model cannot be expected to answer satisfactorily. In a traditional model, the entity “Osaka” is extracted at some point in the pipeline, and the developer then decides which API to call to get the weather. While some level of end-to-end learning is theoretically possible in these cases, once hand-written logic is required it is no longer end-to-end learning by definition.
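A sketch of what that hand-written step looks like, under stated assumptions: the regex stands in for a trained entity extractor, and `get_weather` is a hypothetical wrapper around whatever weather API the developer chooses.

```python
import re

def get_weather(city):
    # stub standing in for a real external weather API call;
    # this is exactly the part no end-to-end model can learn from dialogue data
    return "sunny"

def handle_weather(query):
    # slot extraction: pull the city out of the query
    # (a real pipeline would use a trained entity extractor, not a regex)
    match = re.search(r"weather in (\w+)", query, re.IGNORECASE)
    if not match:
        return None
    city = match.group(1)
    forecast = get_weather(city)
    return f"The weather in {city} is {forecast}."

print(handle_weather("How is the weather in Osaka?"))
# -> The weather in Osaka is sunny.
```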

However, end-to-end and traditional models are not totally incompatible. One solution for child chatbots specifically is a hybrid system in which the traditional model tries first; if its confidence in the user’s intent is too low, it passes the job to the end-to-end model, which produces the final output. This allows pre-coded replies that use custom logic, such as the weather question above, while ensuring that responses deemed essential are always available, and it still adds the flexibility of an end-to-end model.

One aspect of a child-centric chatbot which I have not touched on enough yet is the intrinsic authority a chatbot holds over a child. If an adult interacts with a chatbot and the bot makes a mistake in interpretation, the adult will most likely just assume the bot broke. A child, however, could take away the wrong message and assume the bot is right. This means that when the traditional model’s confidence is low and it is unsure whether it is correct, it is good to have an end-to-end fallback that stands a better chance of returning something at least somewhat correct.
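The hybrid dispatch logic itself is simple; a minimal sketch follows. Both models are hypothetical stubs, and the confidence threshold of 0.6 is an arbitrary placeholder that would need tuning in practice.

```python
def traditional_model(query):
    # stand-in for a pipeline: returns (response, intent confidence)
    known = {"hello": ("Hi there!", 0.95)}
    return known.get(query.lower(), ("", 0.1))

def end_to_end_model(query):
    # stand-in for a trained end-to-end neural model
    return "Hmm, tell me more about that!"

def hybrid_respond(query, threshold=0.6):
    # the traditional model tries first; if its intent confidence
    # falls below the threshold, the end-to-end model takes over
    response, confidence = traditional_model(query)
    if confidence >= threshold:
        return response
    return end_to_end_model(query)

print(hybrid_respond("hello"))                    # high confidence: pipeline answers
print(hybrid_respond("what is a cloud made of"))  # low confidence: end-to-end fallback
```

The design choice worth noting is that the threshold is the safety dial: raising it sends more queries to the end-to-end fallback, which matters precisely because a child may trust a confidently wrong pipeline answer.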