End-to-End Learning for Child Chatbots

End-to-end learning has the potential to replace the traditional pipeline-oriented structure of a chatbot. Instead of feeding data through pre-defined discrete components in a pipeline, in end-to-end learning a deep neural network simply takes in an input and returns an output. In a chatbot's case, the model is given a string (the user's query) and is expected to return a single response.

This approach has several benefits. The first is perhaps simplicity: the developer does not have to select components to put in the pipeline, which makes it easier to get started. More importantly, it reduces the amount of domain-specific knowledge one needs in order to get decent results. In the traditional pipeline model, selecting components can only be done well if one understands what is going on behind the scenes for the specific application. For a child-focused chatbot, one would likely have to modify the default pipeline provided by frameworks such as Rasa until it works well, and to do that, one would have to understand how each component works and what makes the most sense for responding to children specifically. With end-to-end learning, that modification doesn't have to be done manually.

In addition, despite its simplicity, end-to-end learning can still achieve good results with limited training time, which makes it practical to substitute for traditional pipelines in many situations.

However, there are also significant drawbacks, some of which apply specifically to chatbots. First, end-to-end models require a lot of data to train effectively. Finding and parsing conversational data is rarely easy, and it is doubly difficult for children's conversational data. As I mentioned in a previous post, training data for conversations involving young children (that can be used ethically for this kind of research and development) is hard to come by, which makes end-to-end learning harder to rely on as a catch-all solution.

In addition, many chatbot functions intrinsically require some level of human direction coded somewhere into the pipeline. For instance, when someone asks "How is the weather in Osaka?", an end-to-end model cannot be expected to answer satisfactorily. In a traditional model, the entity "Osaka" is extracted at some point in the pipeline, and the developer then writes the code that decides which API is used to fetch the weather. While some of this could theoretically be learned as well, the result is no longer end-to-end learning by definition.
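To make the weather example concrete, here is a minimal sketch of that hand-coded step written as a Rasa custom action. The action name, the "location" entity, and the weather endpoint are placeholders I am assuming for illustration, not part of any real project.

# Sketch of the human-directed step a traditional pipeline needs.
# "action_tell_weather", the "location" entity, and the weather URL are
# illustrative assumptions, not a real configuration.
from typing import Any, Dict, List, Text

import requests
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionTellWeather(Action):
    def name(self) -> Text:
        return "action_tell_weather"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        # The NLU pipeline is assumed to have extracted a "location" entity
        # (e.g. "Osaka") from the user's message.
        city = next(tracker.get_latest_entity_values("location"), None)
        if city is None:
            dispatcher.utter_message(text="Which city do you want the weather for?")
            return []

        # Hypothetical endpoint; a real bot would call an actual weather API.
        forecast = requests.get(
            "https://example.com/weather", params={"city": city}
        ).json().get("summary", "unknown")

        dispatcher.utter_message(text=f"The weather in {city} looks {forecast}.")
        return []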

However, end-to-end and traditional models are not totally incompatible. One solution for child chatbots specifically is a hybrid system in which the traditional model tries first; if its confidence in the user's intent is too low, it passes the job to the end-to-end model, which produces the final result. This allows pre-coded replies that rely on custom code, such as the aforementioned weather question, to be used, and it ensures that certain responses deemed essential will be returned in any case, while still adding the flexibility of an end-to-end model.

One aspect of a child-centric chatbot that I have not touched on enough yet is the intrinsic authority a chatbot has over a child. If an adult interacts with a chatbot and the bot makes a mistake in interpretation, the adult will most likely just assume the bot broke. A child, however, could take away the wrong message and assume the bot is right. This means that in low-confidence situations, where the traditional model is unsure whether it is correct, it is good to have an end-to-end fallback that might stand a better chance of returning something at least somewhat correct.
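As a rough sketch of this hybrid routing (not a real Rasa component), the snippet below tries the intent-based model first and only hands off to an end-to-end model when confidence is too low. The 0.6 threshold and both model interfaces are assumptions made purely for illustration.

# Hybrid routing sketch: traditional pipeline first, end-to-end fallback.
# The threshold and the two model interfaces are illustrative assumptions.

CONFIDENCE_THRESHOLD = 0.6


def respond(message: str, pipeline_bot, end_to_end_model) -> str:
    # The traditional NLU step is assumed to return an intent name and a
    # confidence score for the user's message.
    intent, confidence = pipeline_bot.parse(message)

    if confidence >= CONFIDENCE_THRESHOLD:
        # High confidence: use the hand-authored reply, including anything
        # that needs custom code (like the weather action above).
        return pipeline_bot.reply(intent, message)

    # Low confidence: fall back to the end-to-end model, which may still
    # return something at least somewhat sensible for the child.
    return end_to_end_model.generate(message)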

Children's Conversational Training Data for Machine Learning

While I have written quite a bit about the potential uses of a chatbot in educating young children, I am not the first person to have the idea. Indeed, the limitations in this specific application do not seem to be primarily a lack of ideas, but rather practical factors.

One such limitation, at least for smaller entities and startups creating chatbots, is the lack of publicly available annotated conversations (training data) from young children. Such data is essential for training NLP tools to correctly identify the meaning behind early childhood language. Without it, any chatbot geared towards young children would not be very useful: without understanding the purpose of the child's words, it would fail to give an adequate response no matter how well thought out that response itself is.

While there is plenty of children's conversational data lying around the internet, several factors make much of it inapplicable for practical use. First are university ethics guidelines, which usually require that conversational data from children be collected specifically for research, as opposed to being sold to researchers as an afterthought. Such data must then be cleaned up and/or transcribed, which is again harder when young children's speech is messy or unintelligible. In addition, with children, small age differences have big implications for speech, so it is essential that any dataset include metadata such as the child's age (or be limited to a small age bracket altogether). Gender could potentially be relevant as well.

“A surprisingly small number of corpora have been produced which specifically contain child and/or teenage language”

Children Online: a survey of child language and CMC corpora (Baron, Rayson, Greenwood, Walkerdine and Rashid)

Even accounting for these challenges, one study finds that a "surprisingly small number of corpora have been produced which specifically contain child and/or teenage language." It is worth noting that the study's focus was skewed by its specific application, the protection of children online, and by the authors' position at a British university: datasets that were otherwise quite usable but consisted mostly of American speakers had that listed as a drawback, when for my purposes it might actually be a good thing to have a chatbot most fluent in a relatively generic American vernacular. On the flip side, the study may not have emphasized enough the lack of datasets focused on younger children (many were broadly K-12 or covered only the late teens).

One corpus that I found separately, but which was also mentioned in the study, is CHILDES, a database of transcribed conversations involving children, primarily age 5 and younger. It stood out to me for the breadth of its data and the precise age ranges attached to the conversations, and I do not consider the low number of British English speakers to be as much of a problem as the researchers did. I will certainly explore this corpus further and start training with it.
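As a first step, CHILDES transcripts (in the CHAT format) can be read programmatically. Below is a rough sketch using the pylangacq library; the corpus path is a placeholder for whichever collection you download from CHILDES, and the exact method names may vary between library versions, so treat it as a starting point rather than a recipe.

# Sketch of loading CHILDES CHAT transcripts with pylangacq
# (pip install pylangacq). The path below is a placeholder.
import pylangacq

# read_chat accepts .cha files or a .zip of a whole corpus.
reader = pylangacq.read_chat("path/to/childes_corpus.zip")

print("Files loaded:", reader.n_files())

# CHAT tags every utterance with its speaker, so the child's speech ("CHI")
# can be separated from the adults' when building training data.
child_words = reader.words(participants="CHI")
print("Child word tokens:", len(child_words))
print("Sample:", child_words[:20])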

Looking Behind the Surface for Child-Oriented Chatbots

Previously, I mentioned how a chatbot designed for children has to treat its interactions fundamentally differently than one made for adults. The exigence of communication between an adult and a bot is, in most cases, "I need help" or "I was redirected here instead of human support." The exigence of most child-bot communication is different: a child can't reasonably be expected to try to get anything out of what he or she probably sees as a conversation with a robotic friend. This makes the job of a child-oriented chatbot all the more challenging when it attempts to detect or otherwise account for a child's emotional issues.

Of course, this applies somewhat to ordinary chatbots as well. One previous example was Woebot, a chatbot aimed at psychological health. Its website mentions that Woebot establishes "a bond with users that appears to be non-inferior to the bond created between human therapists and patients." This implies that, at least in part, Woebot gauges emotion from patients explicitly stating their feelings, as would happen in a therapist-patient relationship. Indeed, the exigence of the bot is that it is downloaded specifically for the purpose of mental health.

Child-oriented chatbots wouldn't have this same luxury. Even setting aside the fact that not many children I know could adequately express their feelings if they wanted to, a chatbot that adopts the persona of a friend or mentor has a harder time establishing a need for the child to express feelings, since children would only talk to the bot casually. While a chatbot can always just ask "how are you feeling?", this most likely wouldn't yield accurate results all of the time (imagine being asked this question yourself). Instead, a chatbot would have to infer emotions from the language used.

Given adequately labelled data, natural language models can identify both stress levels and emotion in text. However, it's unclear whether the same method used in the study can be applied to the language of young children, especially since, with a smaller vocabulary (and therefore fewer emotionally charged word choices), much of a human's ability to interpret the emotions of young children (mine, anyway) relies on non-verbal cues and vocal inflections that can't be fed into a text-based chatbot.
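For the text-only part of the problem, here is a toy sketch of emotion classification given labelled utterances. The four example utterances and their labels are invented placeholders, and a TF-IDF plus logistic regression baseline is only one simple option; whether any of this transfers to young children's language is exactly the open question above.

# Toy emotion classifier from text alone; the labelled examples are
# invented placeholders, not real child data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "i miss my mommy",
    "we went to the park and it was so fun",
    "i don't wanna talk to you",
    "can we play the dinosaur game again",
]
train_labels = ["sad", "happy", "upset", "happy"]

# TF-IDF over word unigrams/bigrams feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

print(model.predict(["i fell down and it hurt"]))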

Connecting a Rasa Chatbot to Facebook Messenger

Recently, I was curious about what kinds of messaging apps and interfaces would ultimately work with a Rasa chatbot. While any front end would technically work as long as it has an API through which text can be passed to the Rasa chatbot server, other chatbot features are not as simple as shuttling plain text across APIs. One example is buttons, which let users click an option to direct the conversation further. A chat application such as Facebook Messenger, with its own Rasa-provided channel connector, supports these buttons without manual coding on the part of the developer. However, for an application like Discord, which supports buttons beneath messages in the form of "reactions" but doesn't have its own dedicated channel connector, it takes more work to get buttons to behave as they should and to connect to the chatbot appropriately. This post is about connecting to Facebook Messenger: even though the process is simpler than creating a custom channel connector (for Discord, for example), there are still some specific things to keep in mind, and a resource like this would have saved me quite a bit of time. An alternate explanation, the one I followed, is provided by Rasa here.
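For context on why the dedicated connector matters, here is a rough sketch of what a front end without one has to do itself: post the user's text to Rasa's generic REST channel (this assumes "rest:" is enabled in credentials.yml and the server is running on localhost:5005) and then decide how to render any buttons that come back, for example as Discord reactions.

# Minimal client for Rasa's REST channel; assumes "rest:" is enabled in
# credentials.yml and the server is running locally on port 5005.
import requests

RASA_REST_URL = "http://localhost:5005/webhooks/rest/webhook"


def send_to_rasa(sender_id: str, text: str) -> None:
    response = requests.post(
        RASA_REST_URL,
        json={"sender": sender_id, "message": text},
    )
    for message in response.json():
        if "text" in message:
            print("BOT:", message["text"])
        # Buttons arrive as plain data; the front end has to map each one to
        # something clickable (a reaction, a quick reply, etc.) and send the
        # button's payload back as the user's next message when clicked.
        for button in message.get("buttons", []):
            print("  option:", button["title"], "->", button["payload"])


send_to_rasa("test-user", "hello")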

The first step is installing ngrok. At least during development, this is necessary to forward your Rasa bot running on localhost to a publicly accessible web address. In addition, ngrok gives you an https address alongside Rasa's default http connection (it provides both http and https addresses), which is important since Messenger only works with https connections. To install ngrok, simply download it here and run the commands listed on the page. For the port number to forward, use whichever port Rasa uses for webhooks, which is normally 5005. If your chatbot is in production and has a non-localhost web location, you can skip this step and instead just check that yourrasahostname.com:5005 returns something.

Next, create a Facebook page. As I learned, it has to be published and publicly available for Messenger to work at all, even for testers during the development phase of the chatbot. It can be named anything, and Messenger should be enabled on the page by default. Then, go to Facebook for Developers and click to add a new app. Find Messenger in the products section and set it up. Scrolling down the settings, the first section you should see is access tokens. Click "add or remove pages" and add the page you just made. It will most likely give you a warning concerning authorization or approval; while this matters if you want your chatbot to be publicly accessible, it isn't an issue if you are just testing like me, so give the app the permissions it needs. Click the generate token button and record the token, since it is needed later. Open a new tab with the settings of your app and record the app secret found there. You can close the app settings after that.

Locate your Rasa project's credentials.yml file. Add the following to the end, with verify set to any string you like, secret set to the app secret you recorded, and page-access-token set to the token you generated.

facebook:
  verify: "rasa-bot"
  secret: "7e238b451c238ad8375923vm27542f9"
  page-access-token: "EBXS9I0Uvj53BAJ0fNl4yzz81KiYnsiZC8x29fZBGsdfWwceITcOu6RkDuVqf53CWefsdfEGGfWFxB3EchxJZCtvlL3SFwerfSDFEtZBUGMgfHWFUohJZCJej5BdjJjw3eoJojeJJA1FlXLU0CEUIHppsRQTq96L9I5UagAD43dgfwOe"

Go back to the Messenger settings for your app. The next section you will see on the same settings page is labelled webhooks. For the callback URL, put

https://[yourngrokurl]/webhooks/facebook/webhook

and for the verify token, put whatever string you chose for "verify" in your credentials.yml file. After starting up your Rasa bot, you should be able to talk with it via Messenger (visit your page for the link). Without going through the approval process, you can also add testers through the roles page so that they can use your bot as well.

Building a Docker Instance for a Rasa Chatbot

I recently started developing my chatbot with Rasa, a framework that provides a pipeline combining existing Natural Language Processing (NLP) technologies with Rasa Core, which determines how the chatbot should respond. The installation of Rasa itself is relatively straightforward. However, in the course of my development I want to use a tool called Rasa X, which lets me generate and annotate training data for my chatbot by chatting with it, without having to input the data manually into the training files. Setting up Rasa X to train and run a chatbot is more complicated, in that the default installation process will most likely result in errors, as it did for me. It requires several tweaks and specific instructions that can only be found through trial and error, and it works better in a Linux environment. To ensure that I wouldn't have to reproduce these steps every single time I want to set up Rasa, and for the aforementioned Linux reason, I decided to set up a Docker container, substituting these steps with a command or two instead. Here's how I did it:

First, I pulled a base Docker image for Python on Linux and ran it (note that you have to install Docker Desktop and WSL 2 beforehand if you haven't done so).

docker pull jupyter/base-notebook:python-3.8.6

docker run -p 5005:5005 -p 5002:5002 -p 80:80 -p 8888:8888 --name rasa -e GRANT_SUDO=yes --user root -e JUPYTER_ENABLE_LAB=yes -v %cd%:/home/jovyan jupyter/base-notebook:python-3.8.6

Then, I ran the following commands inside the container's CLI prompt to install specific versions of packages that work together. The first two lines are critical, since installing rasa-x with the default dependencies would otherwise lead to a lot of library conflicts. The commands also install spaCy, an open-source natural language processing library that will be used with Rasa.

pip3 install --upgrade pip==20.2
conda install ujson==1.35 -y
pip3 install rasa-x==0.39.3 --extra-index-url https://pypi.rasa.com/simple
pip3 install spacy==3.0.6 PyDictionary bs4 lxml mathparse discord click==7.1.1
spacy download en_core_web_md

Finally, all that's left to do is push your own container to Docker Hub, replacing julianweng/cory with [yourdockerusername]/[anyprojectname]. On your host (Windows CMD), run the following.

docker commit rasa julianweng/cory:v1.0 
docker tag julianweng/cory:v1.0 julianweng/cory:latest
docker push julianweng/cory:v1.0
docker push julianweng/cory:latest

The final image that I produced can be found at https://hub.docker.com/repository/docker/julianweng/cory. With it, you can quickly set up a Rasa project with Rasa X and all the needed dependencies. If you just want to use the Docker image and get started with Rasa, cd to the directory you want your bot to be in and run these commands instead.

docker run -d -p 5005:5005 -p 5002:5002 -p 80:80 -p 8888:8888 --name rasa -e GRANT_SUDO=yes --user root -e JUPYTER_ENABLE_LAB=yes -v %cd%:/home/jovyan julianweng/cory

rasa init --no-prompt