Blog

End to End Learning for Child Chatbots

End-to-end learning has the potential to replace the traditional pipeline-oriented structure of a chatbot. Instead of feeding the data through pre-defined, discrete components in a pipeline, in end-to-end learning a deep neural network simply takes in an input and returns an output. In a chatbot’s case, the model is given a string (the query) and is expected to return a single response.

This approach has several benefits. The first is perhaps simplicity: the developer does not have to select components to put in the pipeline. This makes it easier to get started. More importantly, it decreases the amount of domain-specific knowledge one needs in order to get decent results. In the traditional pipeline model, selecting components can only be done well if one understands what’s going on behind the scenes for the specific application. For a chatbot aimed at children, one would likely have to modify the default pipeline given by frameworks such as Rasa until it works well. To do that, one would have to understand how each component works and what makes the most sense for responding to children specifically. With end-to-end learning, that modification doesn’t have to be done manually.

In addition, despite its simplicity, end-to-end learning can still achieve good results with limited training time. This makes it practical to substitute for traditional pipelines in many situations.

However, there are many drawbacks, some of which relate specifically to chatbots. The first is that end-to-end learning models require a lot of data to train effectively. Finding and parsing conversational data is not easy in general, and it is doubly hard for children’s conversational data. As I mentioned in a previous post, training data for conversations involving young children (that can be used ethically for this kind of research/development) is hard to come by, which makes end-to-end learning harder to use as a catch-all solution.

In addition, many chatbot functions intrinsically require some level of human direction coded somewhere into the pipeline. For instance, when one asks “How is the weather in Osaka?”, an end-to-end model cannot be expected to answer the question satisfactorily. In a traditional model, the variable “Osaka” is extracted at some point in the pipeline, and the developer then has to code which API is used to get the weather. While some level of end-to-end learning would theoretically be possible in these cases, it’s not really end-to-end learning by definition.

However, end-to-end and traditional models are not totally incompatible. One solution for child chatbots specifically is a hybrid system where the traditional model tries first. If its confidence in the user’s intent is too low, it passes the job on to the end-to-end model, which outputs the final result. This would allow pre-coded replies that rely on custom code, such as the aforementioned weather question, to be used, and would ensure that certain responses deemed essential are returned in any case, while adding the flexibility of an end-to-end model. One aspect of a child-centric chatbot which I have not touched upon enough yet is the intrinsic authority of a chatbot over a child. If an adult interacts with a chatbot and the bot makes a mistake in interpretation, the adult would most likely just assume the bot broke. A child, however, could take the wrong message and assume the bot is right. This means that in situations with a low confidence level, where the traditional model is unsure whether it is correct, it is good to have a fallback option in the end-to-end model that might stand a better chance of returning something at least somewhat correct.
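The hybrid idea above can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not a tested design: the function names, the toy classifier, and the 0.7 threshold are all hypothetical stand-ins, and a real threshold would need tuning on actual conversations.

```python
from typing import Callable, Dict, Tuple

# Hypothetical threshold; a real value would need tuning on actual conversations.
CONFIDENCE_THRESHOLD = 0.7


def hybrid_reply(
    message: str,
    classify_intent: Callable[[str], Tuple[str, float]],
    scripted_replies: Dict[str, Callable[[str], str]],
    end_to_end_reply: Callable[[str], str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> str:
    """Try the traditional pipeline first; fall back to the end-to-end model."""
    intent, confidence = classify_intent(message)
    handler = scripted_replies.get(intent)
    if handler is not None and confidence >= threshold:
        # Confident enough: use the pre-coded reply (e.g. one that calls a weather API).
        return handler(message)
    # Low confidence or unknown intent: hand the raw text to the end-to-end model.
    return end_to_end_reply(message)


# Toy stand-ins (assumptions for illustration) so the sketch runs on its own.
def toy_classifier(message: str) -> Tuple[str, float]:
    if "weather" in message.lower():
        return "ask_weather", 0.95
    return "unknown", 0.20


def toy_end_to_end(message: str) -> str:
    return "generated: " + message


replies = {"ask_weather": lambda message: "scripted: weather lookup"}

print(hybrid_reply("How is the weather in Osaka?", toy_classifier, replies, toy_end_to_end))
print(hybrid_reply("Why is the sky blue?", toy_classifier, replies, toy_end_to_end))
```

Keeping the scripted handlers in a dictionary means the essential, hand-written replies always win when the classifier is confident, and everything else falls through to the learned model.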

Children Conversational Training Data for Machine Learning

While I have written quite a bit about the potential uses of a chatbot in educating young children, I am not the first person to ever get the idea. Indeed, the limitations in this specific application do not seem to be idea-based primarily, but instead based on other practical factors.

One such limitation, at the very least a limitation for smaller entities and startups creating chatbots, is a lack of publicly available annotated conversations (training data) by young children. Such data is essential to train NLP tools to correctly identify the meaning behind early childhood language. Without the data, any chatbot geared towards young children would not be very useful: without understanding the purpose of the child’s words, it would fail to give an adequate response no matter how well thought out that response is itself.

While there is plenty of children’s conversational data lying around the internet, several factors make much of it unusable in practice. First are university ethics guidelines, which usually state that conversational data from children must be collected specifically for research, as opposed to simply being sold to researchers as an afterthought. Then, such data must be cleaned up and/or transcribed, which is again harder given how messy or unintelligible children’s speech can be. In addition, with children, small age differences have big implications for speech. Hence, it’s essential that any dataset has metadata including child age (or is limited to a small age bracket altogether). Gender could potentially be relevant as well.

“A surprisingly small number of corpora have been produced which specifically contain child and/or teenage language”

Children Online: a survey of child language and CMC corpora (Baron, Rayson, Greenwood, Walkerdine and Rashid)

Even accounting for these challenges, one study finds that a “surprisingly small number of corpora have been produced which specifically contain child and/or teenage language.” It is worth noting that this study’s focus was skewed by its specific application, the “protection of children online”, and by its authors’ position at a British university. Datasets that were otherwise valid but consisted mostly of American speakers had that listed as a con, when in reality it might be a good thing to have a chatbot most fluent in a relatively generic, American vernacular. On the flip side, the study might not have emphasized enough the lack of datasets focused on younger children (many were broadly K-12 or only late teen).

One corpus that I found separately but that was also mentioned in the study is CHILDES, a database of conversations with children primarily 5 and younger. It stood out to me for the breadth of its data and the precise age range of its conversations, and I did not find the low number of British English speakers to be as much of an initial problem as the researchers did. I will certainly explore this corpus further and start training with it.

Looking Behind the Surface for Child-Oriented Chatbots

Previously, I mentioned how a chatbot designed for children has to treat its interactions fundamentally differently than one made for adults. The exigence of most communication between adults and chatbots, usually “I need help” or “I was re-directed here instead of human support”, is different from that of most child-robot communication, where a child can’t reasonably be expected to be trying to get anything out of what he or she probably sees as a conversation with a robotic friend. However, this makes the job of a child-oriented chatbot all the more challenging when attempting to deal with or otherwise account for the emotional issues of a child.

Of course, this somewhat applies to normal chatbots. One previous example was Woebot, aimed at psychological health. However, the website mentions that Woebot establishes “a bond with users that appears to be non-inferior to the bond created between human therapists and patients.” This implies that at least in part, Woebot gauges emotion due to the patient explicitly stating his/her emotions as would happen in a therapist / patient relationship. Indeed, the exigence of the bot is being downloaded specifically for the purposes of mental health.

Child-oriented chatbots wouldn’t have this same luxury. Even disregarding the fact that not many children I know could adequately express their feelings even if they wanted to, if a chatbot adopts the persona of a friend or mentor, it would be more difficult to establish a need to express feelings, since children would only talk to the bot casually. While a chatbot can always just ask “how are you feeling?”, this most likely wouldn’t yield accurate results all of the time (imagine asking this question yourself). Instead, a chatbot would have to infer emotions from the language used.

Given adequately labelled data, natural language models can identify both stress levels and emotion in text. However, it’s unclear if the same method used in the study can be used for the language of young children, especially since, with a smaller vocabulary (meaning fewer emotionally charged words), a lot of the human ability to interpret the emotions of young children (for me anyways) relies on non-verbal cues and vocal inflections that can’t be fed into a chatbot.

Connecting a Rasa Chatbot to Facebook Messenger

Recently, I was curious about what kind of messaging apps/interfaces would ultimately work with a Rasa chatbot. While any front-end would technically work as long as it has an API through which text can be fed directly to the Rasa chatbot server, there are other chatbot features that are not as simple as passing plain text across APIs. One example is the button, which lets users click to select an option that directs the conversation further. A chat application such as Facebook Messenger, which has its own Rasa-created channel connector, supports these buttons without manual coding on the part of the developer. However, for an application like Discord, which supports buttons beneath messages in the form of “reactions” but doesn’t have its own dedicated channel connector, it takes more work to get buttons to behave as they should and connect to the chatbot appropriately. This post is about connecting to Facebook Messenger because, even though the process is simpler than creating a custom channel connector (for Discord, for example), there are still some specific things to keep in mind, and a resource like this would have saved me quite a bit of time. An alternate explanation, the one I followed, is provided by Rasa here.
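To make the button idea a little more concrete, here is a small Python sketch of the data a button carries. It assumes the title/payload shape that Rasa uses for buttons, where the payload is an intent name prefixed with a slash. The helper `make_buttons` and the intent names are made up for illustration.

```python
from typing import Dict, List, Tuple


def make_buttons(options: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Build button dicts: 'title' is the label shown, 'payload' is what the bot receives on click."""
    return [{"title": title, "payload": "/" + intent} for title, intent in options]


# Hypothetical options for a child-friendly prompt; the intent names are invented.
buttons = make_buttons(
    [("Read a story", "request_story"), ("Practice letters", "practice_letters")]
)
message = {"text": "What would you like to do?", "buttons": buttons}
print(message)
```

In a Rasa custom action, a list like this can be passed as the `buttons` argument of `dispatcher.utter_message`, and a channel connector such as the Messenger one then decides how to render it.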

The first step is installing ngrok. At least during development, this is necessary to forward the Rasa bot on your localhost to an accessible web address. In addition, ngrok gives you both http and https addresses, so your connection can use https instead of Rasa’s default http, which is important since Messenger only works with https connections. To install ngrok, simply download it here and run the commands listed on the page. For the port to forward, use whichever port Rasa uses for webhooks, which is normally 5005 (for example, ngrok http 5005). If your chatbot is in production and has a non-localhost web location, you can skip this step and instead just check if yourrasahostname.com:5005 returns anything.

Next, create a Facebook page. As I learned, it has to be published and publicly available for Messenger to work at all, even for testers in the development phase of the chatbot. It can be named anything, and by default, Messenger should be enabled on the page. Then, go to Facebook for Developers and click add a new app. Find Messenger in the products section and set it up. Scrolling down the settings, the first section you should see is access tokens. Click add or remove pages and add the page you just made. It will most likely give you a warning concerning authorization or approval. While this would matter for making your chatbot publicly accessible, if you are just testing it out like me, it isn’t an issue for now. Ignore the warning and give it the permissions it needs. Click on the generate token button and record the token, since it is needed later. Open a new tab for the settings of your app and record the app secret found there. You can close the app settings after that.

Locate your app’s credentials.yml file. Add the following to the end, with verify equalling any string you like, secret being the string you got from app secret, and page-access-token being the token you got from generate token.

facebook:
  verify: "rasa-bot"
  secret: "7e238b451c238ad8375923vm27542f9"
  page-access-token: "EBXS9I0Uvj53BAJ0fNl4yzz81KiYnsiZC8x29fZBGsdfWwceITcOu6RkDuVqf53CWefsdfEGGfWFxB3EchxJZCtvlL3SFwerfSDFEtZBUGMgfHWFUohJZCJej5BdjJjw3eoJojeJJA1FlXLU0CEUIHppsRQTq96L9I5UagAD43dgfwOe"

Go back to the settings for Messenger specifically. The next section you will see on the same Messenger settings page is labelled webhooks. In callback url, put

https://[yourngrokurl]/webhooks/facebook/webhook

and for the verify token, put whatever string you put for “verify” in your credentials.yml file. After starting up your Rasa bot, you should be able to talk with your bot via Messenger (visit your page for the link). Without going through the approval process, you can still add testers through the roles page so that they can use your bot as well.
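To show what the verify token is actually for, here is a minimal Python sketch of the check a webhook endpoint performs when Facebook verifies the callback url. Rasa’s Facebook connector already handles this for you, so this is purely illustrative. The function name `check_verification` is hypothetical, and I am assuming the verification request carries the query parameters hub.mode, hub.verify_token and hub.challenge.

```python
import hmac
from typing import Mapping, Optional


def check_verification(query: Mapping[str, str], expected_token: str) -> Optional[str]:
    """Return the challenge to echo back if the token matches, else None."""
    if query.get("hub.mode") != "subscribe":
        return None
    supplied = query.get("hub.verify_token", "")
    # Constant-time comparison avoids leaking the token through timing.
    if hmac.compare_digest(supplied.encode(), expected_token.encode()):
        return query.get("hub.challenge", "")
    return None


# Example request parameters, using the "verify" string from credentials.yml.
good = {"hub.mode": "subscribe", "hub.verify_token": "rasa-bot", "hub.challenge": "12345"}
bad = {"hub.mode": "subscribe", "hub.verify_token": "wrong", "hub.challenge": "12345"}

print(check_verification(good, "rasa-bot"))
print(check_verification(bad, "rasa-bot"))
```

If the string in the Messenger settings and the one in credentials.yml differ, the check fails and the webhook is never registered, which is why the two have to match exactly.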

Building a Docker Instance for a Rasa Chatbot

I recently started the development of my chatbot through Rasa, a framework that provides a pipeline combining existing Natural Language Processing (NLP) technologies and Rasa Core, which determines how the chatbot should respond. The installation of Rasa itself is relatively straightforward. However, in the course of my development, I want to use a tool called Rasa X, which lets me generate and annotate training data for my chatbot by chatting with it, without having to input the data manually into a file. Setting up Rasa X to train and run a chatbot is more complicated, in that the default installation process would most likely result in errors, as it did for me. It requires several tweaks and specific instructions that can only be found through trial and error, and it works better in a Linux environment. To ensure that I wouldn’t have to reproduce these steps every single time I want to set up Rasa, and for the aforementioned Linux reason, I decided to set up a Docker container, substituting these steps with a command or two instead. Here’s how I did it:

First, I pulled a base Docker image for Python on Linux and ran it (note that you have to install Docker Desktop and WSL 2 beforehand if you haven’t done so).

docker pull jupyter/base-notebook:python-3.8.6

docker run -p 5005:5005 -p 5002:5002 -p 80:80 -p 8888:8888 --name rasa -e GRANT_SUDO=yes --user root -e JUPYTER_ENABLE_LAB=yes -v %cd%:/home/jovyan jupyter/base-notebook:python-3.8.6

Then, I ran the following commands inside the CLI prompt of the container to install specific versions of packages that work. The first two lines are critical, since installing rasa-x normally with the default dependencies would otherwise lead to a lot of library conflicts. The commands also install spaCy, an open source natural language processing library that will be used with Rasa.

pip3 install --upgrade pip==20.2
conda install ujson==1.35 -y
pip3 install rasa-x==0.39.3 --extra-index-url https://pypi.rasa.com/simple
pip3 install spacy==3.0.6 PyDictionary bs4 lxml mathparse discord click==7.1.1
spacy download en_core_web_md

Finally, all that’s left to do is push your own container to Docker Hub, replacing julianweng/cory with [yourdockerusername]/[anyprojectname]. On your host (Windows CMD), run the following.

docker commit rasa julianweng/cory:v1.0 
docker tag julianweng/cory:v1.0 julianweng/cory:latest
docker push julianweng/cory:v1.0
docker push julianweng/cory:latest

The final product that I produced can be found at https://hub.docker.com/repository/docker/julianweng/cory. With this, you can quickly set up a Rasa project with Rasa X and all needed dependencies. If you just want to use the Docker image and get started with Rasa, cd to the directory you want your bot to be in and run these commands instead.

docker run -d -p 5005:5005 -p 5002:5002 -p 80:80 -p 8888:8888 --name rasa -e GRANT_SUDO=yes --user root -e JUPYTER_ENABLE_LAB=yes -v %cd%:/home/jovyan julianweng/cory

rasa init --no-prompt

The Simple Things of Education

Sometimes, nurturing and educating a child can seem more like an enigma or a theoretical ideal than an achievable goal. This is felt on an individual level, when thousands of books, magazines, and commentators float around offering contradictory advice, but it can also seem that way on a societal level. It is widely reported that the socioeconomic status (SES) of a child is the top predictor of his or her educational outcome. Coupled with how intrinsic inequality in SES is in our society, it could seem impossible to fix educational deficiencies on a societal level.

However, despite these massive and seemingly overwhelming correlations, there are many simple actions that a parent can take with a child that have huge impacts on life outcomes. A post-WWII study in Britain, tracking some 14,000 babies born in 1946, revealed several basic actions that led to disproportionately successful outcomes for their recipients. These include talking and listening to kids, teaching letters and numbers, reading to kids, and maintaining a regular bedtime.

Perhaps some of the given suggestions, such as taking children on excursions outside, could only be achieved through more active parenting. However, many of these can be supplemented with external help. A chatbot (such as the one I am developing) can easily remind children of bedtimes using pre-existing notification concepts. With a little more development, it can incorporate the teaching of letters and read stories. Using contextual AI, it has a good shot at talking and listening in a manner roughly similar to an adult’s.

Much of this sentiment is echoed in the development of the new mentor robot Moxie. Already, it functions much like an actual human both physically and in social interactions. However, if we focus all our attention on visibly sci-fi concepts such as these, we risk losing the idea of new technology allowing us to reduce inequities and improve educational outcomes for everybody through increased accessibility. Indeed, first developed as a specialist solution for kids with special needs, Moxie is now on the general market for a price of $1699, within reach only of kids whose parents can afford to spend that kind of money on unproven technologies that have a history of failing.

The Personality of the Teacher

It is commonly thought that the personality of a teacher has a great impact on his or her ability to teach, for good reason. Personality inevitably affects everyday interactions between teacher and student. Just as with every other job or action, certain personality traits would work better for a teacher than others.

One attempt to quantify personality is the Big Five personality domains for teachers: openness to experiences, conscientiousness, extraversion, agreeableness, and emotional stability. Openness deals with an appreciation for new things: novel ideas, untried experiences, curiosity. Conscientiousness measures self-discipline and control of impulses. Extraversion is what you might expect: the inclination to interact with the external, social world. Agreeableness measures how concerned one is with the opinions of others. Emotional stability is how well one reacts to stress and other negative emotions.

A 2019 study published in Educational Psychology Review analyzed the correlation between these domains and two educational outcomes, one of them being teacher effectiveness (Kim). It found that every domain except agreeableness is positively associated with teacher effectiveness. Teachers that were more open to new experiences, more conscientious, more extroverted, and more emotionally stable tended to be better teachers.

This tendency might also extend to learning assistants such as chatbots. A 2018 study found that the personality of a chatbot has a “significant positive effect on the user experience of chatbot interfaces” (Smestad). However, this use of personality measures the level of personality (as in, does the bot have personality) instead of the type of personality. Regardless, this could still line up with the previously mentioned findings in human teachers, since it is usually considered that extroverted, curious people can appear more “personable”. Another 2018 article differentiates user preference based on the purpose of the chatbot. It found that people prefer “slow types” of personality, submission and compliance, for bots based around counseling, as most education-related bots would be (Kang). This preference for compliant bots resembles agreeableness, although agreeableness was the one domain that the study above did not find to be positively associated with teaching effectiveness.

Chatbots in Childhood Education

Advanced chatbots, using conversational, contextual AI, have versatile and promising applications. Chatbots have many advantages inherent within an automated platform, including promptness of response, scalability and accessibility. Recent advances have also increased the amount of personalization a chatbot can offer. This is especially evident when compared to existing communication methods such as mass emails and push notifications sent out to every user of an app.

This has been demonstrated in traditionally human fields such as mental health care. A company called Woebot created a chatbot that provides “continuous emotional support” to its users. It forges a personalized connection to the user to glean useful information for human specialists and to help deal with symptoms of stress and anxiety. This allows for human-like support that isn’t limited by doctor availability or cost.

Higher education institutions have also begun to use chatbots. Georgia State University rolled out a bot that helped students with enrolling and getting to college, decreasing the number of admitted students who didn’t show up by 19%. Response levels from students were much higher with the bot than with email reminders. Chatbot platform Acquire identifies several different functions a chatbot can perform in education: providing information about school, administrative support, offering reminders and assistance, tutoring, and engaging students.

A chatbot built for younger students would most likely focus on the third, fourth, and fifth functions due to the nature of elementary schooling. As mentioned in a previous post, simple nudges and reminders are especially essential for younger children. In addition, children are less likely to be able to navigate information sources on their own. Communication-based chatbots are naturally easier interfaces for anyone, especially children, to use. Being able to answer basic questions and guide exploration would be a major benefit in and of itself in the education of young children.

Of course, a potential chatbot would have to be tailored for younger children specifically. It has to use grade-appropriate wording, while ideally selectively using new vocabulary to promote linguistic growth. There’s also a higher barrier to reach in terms of chatbot personality: a child would have little intrinsic motivation to keep on talking to a chatbot if it doesn’t act like a human. These are challenges that must be addressed in any chat bot dedicated towards childhood education.

What Changes a Child’s Attention Span

A child’s attention span, or how long he or she can spend on a task before getting distracted, increases with age. In a 1990 study of young children at play, researchers found that a child’s duration of “focused attention” on the toys correlated with the age of the child. Older children focused more on problem solving and were less distracted by other physical movements. This suggests that the increase in attention span was due not only to the intrinsic development of the older child, but also to the increased complexity of his or her activities.

In addition, the conditions in which these activities are presented also impact attention span. When frequently asked to “stay on task”, preschoolers paid less attention to distractions and more attention to the task itself. Simply repeating instructions to focus has a great effect on a child’s attention span, and can be implemented in any classroom or environment.

Children’s attention spans are also dependent upon how big each activity is. Sites such as parents.com claim that breaking a task into small pieces can keep children engaged more. This is corroborated in a 2010 study, which showed that young children allocated attention similarly to adults with small arrays of information. This means that children don’t have smaller attention spans because of inefficiencies in their memory allocation, but simply because they don’t have as much working memory to work with.

When dealing with younger children, it’s important to note the “why” behind general principles of education. It’s easy to assume that the reason for breaking up activities for children is simply to make each activity fit into smaller attention spans. However, this would miss activities that are short in duration but too complex to keep a child’s attention regardless. These nuances must be kept in consideration.

The Winners and Losers in Childhood Development

In a 2013 study at Stanford University, 48 infants of diverse socioeconomic status (SES) were tracked from 18 to 24 months of age and measured in language proficiency. Researchers found that at 18 months, children of higher SES were already significantly better off in vocabulary and language processing efficiency, and by 24 months, there developed “a 6-month gap between SES groups in processing skills critical to language development” (Fernald 2013).

This could easily be linked to the 1995 Hart and Risley study, which found that children from professional families heard significantly more words per hour on average (2153 words) than those from working-class (1251) or welfare-recipient (616) families, leading to the former having larger vocabularies (Hart 1995). The authors reach the stark conclusion that the “most important aspect of children’s language experience is quantity”, though they note that children in professional families were encouraged more and discouraged less than their counterparts in less well-off families.

This gap is not just in vocabulary. According to an article published in Psychophysiology, “SES disparities in neurocognitive functioning have been shown across the domains of language, EF, memory, and social-emotional processing on both the behavioral and neurobiological levels” (Ursache 2017). The authors note a couple of possible reasons why these disparities could exist. As previously mentioned, language stimulation is a major distinction between households of different SES. In addition, the stresses of poverty could lead to “inconsistent, unpredictable, and non responsive parenting behaviors”, harming emotional development, and also lead to less time and energy being spent on supportive parenting.

Even though educational systems cannot solve the root issue of socioeconomic disparity, our preliminary understanding still points towards certain behaviors that would help solve the problem. In the classroom, educators can introduce more new words in their day to day speech as well as help parents to do the same (Colker 2014). Attention needs to be paid towards consistency and responsiveness in educator-child interactions, especially if these qualities are lacking in those of the parent and child.