How Mia Learns Kids’ Interests
Conversational AI through Natural Language Understanding, Machine Learning, and Open Learner Models
By Darren Cambridge
When making book recommendations, Mia looks carefully at each child’s interests. There are good reasons for that: Reading comprehension increases when young people read books that interest them, and having a clear understanding of their interests increases their psychological well-being and sense of purpose.
But how does Mia know what a child’s interests are? Most reading software, if it considers interests at all, simply asks readers to select broad topics or genres from predetermined lists. At Mia Learning, we treat interests as both broader and more nuanced. They are broader in the sense that they include not only the general themes or topics of books, but also characteristics such as the types of characters the books contain or the writing style the author uses. They are more nuanced in the sense that they include finer-grained topics.
Mia can’t ask a child to pick from a list. That would be a crazy-boring, hour-long conversation, at least. Capturing interests effectively requires a more sophisticated approach. Mia uses two sources of information: what kids say and what they do. Just as a teacher, librarian, or parent would, she listens and observes. However, since she relies on artificial intelligence (AI), these two activities take different forms.
Understanding what kids say about their interests
The first conversation kids have with Secret Agent Mia is the Briefing. Mia asks the children about themselves as readers, building a casefile for each child that she will then use to fulfill her mission of finding great books on their behalf. One of the first questions Mia asks is what interests them: What do they love to read, read about, or learn about? A child might say something like:
“I’m really interested in horses, and Quasimodo amphipod, because Ms. Hare told me about them, and also, I usually love books that make me and my friends laugh.”
To understand this statement, Mia needs to parse it and identify key phrases that are interest clues, such as “horses” and “make me and my friends laugh.” She then needs to determine which of the many topics, genres, and book characteristics she knows about those phrases express. Here, “horses” suggests the child may be interested in books about “animals.” It may also suggest the child might be interested in “animal fantasies,” a genre in which horses are frequently featured. “Make me and my friends laugh” may suggest a preference for books with a “funny tone” and for “humorous stories” that are focused on making the reader laugh.
These terms are part of the set that professional librarians use to classify books in Mia’s database. They arrange the terms in a hierarchy, so that Mia knows “horses” are a type of “animal.” To identify these key phrases and make the associations, Mia uses what natural language understanding (NLU) engineers call Named Entity Recognition (NER). NER is a process through which software determines that phrases within a text refer to a certain type of thing, or entity. For example, the phrase “horses” refers to the entity “animals.” The simplest way to identify a named entity is to search the text for verbatim matches against a predetermined set of words or phrases: the reference terms for each category, plus words or phrases that mean basically the same thing, called either paraphrases or synonyms. Mia tries this first.
Because horses are a common type of animal, frequently featured in children’s books, “horse” is a reference term within the Animals entity, and a variety of ways of saying it (e.g., “animals you ride,” “ponies,” and “stallions”) are paraphrases. The system also automatically takes into account differences in number (“horses” is equivalent to “horse”) and verb tense (“made” is equivalent to “make”). Mia Learning worked with a former Google engineer to develop software to crowdsource the list of paraphrases kids might use, which our content experts refined and our engineers enhanced by adding semantically similar words from general-purpose lexical databases.
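To make the lookup step concrete, here is a minimal sketch in Python of matching reference terms and paraphrases after normalizing number and tense. It is an illustration rather than Mia’s actual code: the tiny GAZETTEER, the IRREGULAR table, and the function names are all hypothetical, and the plural handling is deliberately crude.

```python
import re

# Hypothetical mini-gazetteer: entity -> reference term -> paraphrases.
GAZETTEER = {
    "animals": {"horse": ["pony", "stallion", "animals you ride"]},
    "funny tone": {"funny": ["laugh", "hilarious"]},
}

# Hypothetical table of irregular forms (number and verb tense).
IRREGULAR = {"ponies": "pony", "made": "make"}


def normalize(token):
    """Lowercase a word and crudely reduce it to a base form."""
    token = token.lower()
    if token in IRREGULAR:
        return IRREGULAR[token]
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]  # "horses" -> "horse"
    return token


def tokenize(text):
    """Split text into normalized alphabetic tokens."""
    return [normalize(t) for t in re.findall(r"[A-Za-z]+", text)]


def build_index(gazetteer):
    """Map each normalized phrase to its (entity, reference term)."""
    index = {}
    for entity, terms in gazetteer.items():
        for reference, paraphrases in terms.items():
            for phrase in [reference, *paraphrases]:
                index[tuple(tokenize(phrase))] = (entity, reference)
    return index


def find_interests(text, index):
    """Scan left to right, preferring the longest matching phrase."""
    tokens = tokenize(text)
    longest = max(len(key) for key in index)
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(longest, len(tokens) - i), 0, -1):
            hit = index.get(tuple(tokens[i:i + n]))
            if hit:
                found.append(hit)
                i += n
                break
        else:
            i += 1  # no phrase starts at this token
    return found


INDEX = build_index(GAZETTEER)
statement = ("I'm really interested in horses, and Quasimodo amphipod, "
             "and I love books that make me and my friends laugh.")
print(find_interests(statement, INDEX))
# [('animals', 'horse'), ('funny tone', 'funny')]
```

Note that the sketch finds “horses” and “laugh” but returns nothing for “Quasimodo amphipod”: a lookup can only recognize phrases that someone has already put on the list.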
This approach, which computer scientists have been using for decades, is limited. For example, the Quasimodo amphipod is a type of crustacean discovered just this year. Given that the name was coined so recently, it’s not surprising that the phrase “Quasimodo amphipod” is not a reference term or synonym listed within the “animals” entity. Generating and maintaining an exhaustive list of every possible animal and every possible way to refer to it would be a Sisyphean task. Mia needs to have a sense of what the names of animals look like in general, not just to know many such names.
The name of the broader, but still quite obscure, category “amphipod” might be a term on Mia’s list, but she also needs to be able to understand that in this instance, “Quasimodo” refers to a specific type of amphipod rather than the hunchback character from Hugo’s famous novel. Similarly, she needs to know “Hare” refers to a person, not a rabbit. In other words, Mia needs to be able to recognize phrases as instances of entities based on the context of the child’s larger statement about interests.
To enable Mia to identify unusual interests, taking context into account, Mia Learning uses machine learning. We have developed neural networks that extend Mia’s ability to do named entity recognition. (More specifically, Mia uses deep convolutional neural networks with residual embedding.) We trained these networks, which build on a general statistical model of English usage, using a large set of real-world texts in which people talk about books. Our team annotated all the phrases within these texts that correspond to the types of interests for which Mia listens. Our machine learning technology used some of the annotated documents to build a statistical model and tested its ability to correctly identify the named entities in the remaining ones. It then went through thousands of iterations of adjusting and testing the model to maximize its accuracy.
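The train-then-test loop described above can be sketched without any neural network at all. The Python below only illustrates the bookkeeping: holding out some annotated documents and scoring predicted entity spans against the human annotations. The function names, the 20 percent hold-out, and the (start, end, label) span format are assumptions made for this example, not details of Mia’s pipeline.

```python
import random


def split_documents(docs, held_out_fraction=0.2, seed=0):
    """Shuffle annotated documents and hold some out for testing."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    held_out = round(len(shuffled) * held_out_fraction)
    cut = len(shuffled) - held_out
    return shuffled[:cut], shuffled[cut:]


def score(gold_spans, predicted_spans):
    """Compare predicted entity spans with annotated (gold) spans.

    Each span is a (start, end, label) tuple. A prediction counts as
    correct only if all three parts match an annotation exactly.
    """
    gold, predicted = set(gold_spans), set(predicted_spans)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations for one held-out sentence.
gold = [(25, 31, "animals"), (60, 65, "funny tone")]
predicted = [(25, 31, "animals"), (40, 44, "person")]
print(score(gold, predicted))  # (0.5, 0.5, 0.5)
```

In each training iteration, a model would be adjusted on the first group of documents and then scored on the held-out group, so that the reported accuracy reflects text the model has not already seen.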
Determining Interests from What Kids Do
In addition to listening to what kids say about their interests, Mia observes what they do. Anytime a child does something using the app that may be relevant to understanding the child as a reader, Mia records it as an experience (using the Experience API, also known as xAPI, format). Many of these experiences provide clues about children’s interests. For example, suppose the child from the statement we have been dissecting, whom Mia knows to be interested in animals and humorous stories, chooses C.S. Lewis’s novel The Horse and His Boy from Mia’s recommendations. This action suggests that the child may be interested in books with qualities similar to those of The Horse and His Boy. Although the child may not yet be able to say so, they are likely to be interested in the genre “fantasy fiction” and in books with “courageous” characters. If the child then reports a highly satisfactory experience with the book while reflecting on their reading with Mia, this is an even stronger indicator of these interests.
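As a rough illustration of how experience records could turn into interest clues, the Python sketch below builds records shaped like xAPI statements (an actor, a verb, and an object) and adds up evidence for the traits of each book involved. The example.com identifiers, the verb names, the BOOK_TRAITS table, and the evidence weights are all invented for this example; Mia’s real records and scoring are not described here.

```python
from collections import defaultdict

BOOK_ID = "https://example.com/books/the-horse-and-his-boy"

# Hypothetical catalog: book -> traits assigned by librarians.
BOOK_TRAITS = {BOOK_ID: ["fantasy fiction", "courageous characters"]}

# Assumed weights: enjoying a book is stronger evidence than choosing it.
EVIDENCE_WEIGHT = {"chose": 1.0, "enjoyed": 2.0}


def make_statement(child_id, verb, book_id):
    """Build a record shaped like an xAPI statement."""
    return {
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": "https://example.com", "name": child_id},
        },
        "verb": {
            "id": f"https://example.com/verbs/{verb}",
            "display": {"en-US": verb},
        },
        "object": {"objectType": "Activity", "id": book_id},
    }


def infer_interests(statements):
    """Sum evidence for each trait of the books a child interacted with."""
    scores = defaultdict(float)
    for statement in statements:
        verb = statement["verb"]["display"]["en-US"]
        weight = EVIDENCE_WEIGHT.get(verb, 0.0)
        for trait in BOOK_TRAITS.get(statement["object"]["id"], []):
            scores[trait] += weight
    return dict(scores)


history = [
    make_statement("reader-17", "chose", BOOK_ID),
    make_statement("reader-17", "enjoyed", BOOK_ID),
]
print(infer_interests(history))
# {'fantasy fiction': 3.0, 'courageous characters': 3.0}
```

The scores are only hypotheses: a high number for “fantasy fiction” means the evidence points that way, not that the child has said so.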
Refining Interests Through Reflection
Through the process of observation of relevant experiences, Mia learns more and more about children’s interests as they choose books, read them, and reflect on the experience. However, Mia needs each child’s help to ensure that her profile of their interests is accurate. In some cases, Mia’s hypotheses about a child’s interests based on their experience records may be off base. For example, the horse-loving child could have just liked this particular work of fantasy fiction because of its themes or characters and not have much interest in other books from that genre. A child’s interests are also likely to change over time. The early preference for humorous stories might fade as the child increasingly chooses to read for purposes other than entertainment, such as to learn skills related to leathercraft, a new hobby.
To ensure that her understanding of interests is accurate and up-to-date, Mia periodically discusses what she thinks she knows about a child’s interests with the child. Children can correct Mia’s mistaken conjectures, disavow stated interests that have faded over time, and share new ones. Researchers call this approach an open learner model.
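One way to picture an open learner model is as a profile the child can inspect and edit. The Python sketch below is a simplified illustration with invented names (InterestProfile and its methods); it tracks whether each interest was inferred, stated, or rejected, and is not a description of how Mia stores her casefiles.

```python
from dataclasses import dataclass, field


@dataclass
class InterestProfile:
    """A child's interests, each tagged with where the belief came from."""
    status: dict = field(default_factory=dict)  # interest -> status

    def hypothesize(self, interest):
        """Record a guess based on behavior, unless the child already spoke."""
        self.status.setdefault(interest, "inferred")

    def state(self, interest):
        """Record an interest the child shared or confirmed directly."""
        self.status[interest] = "stated"

    def reject(self, interest):
        """Record that the child corrected a guess or disavowed an interest."""
        self.status[interest] = "rejected"

    def active(self):
        """Interests that should still influence recommendations."""
        return sorted(i for i, s in self.status.items() if s != "rejected")


profile = InterestProfile()
profile.state("humorous stories")      # said during the Briefing
profile.hypothesize("fantasy fiction")  # guessed from a book choice

# Later review conversation: the child corrects and updates the profile.
profile.reject("fantasy fiction")   # liked one book, not the genre
profile.reject("humorous stories")  # interest has faded
profile.state("leathercraft")       # new hobby
print(profile.active())  # ['leathercraft']
```

Keeping rejected interests in the record, rather than deleting them, means a later observation does not silently reintroduce a guess the child has already corrected.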
Through ongoing observation and discussion, Mia develops an increasingly sophisticated picture of a child’s interests that enables her to make increasingly effective book recommendations and to provide increasingly personalized coaching. The child also benefits from this process directly as their understanding of their own interests sharpens through reflection and their interests grow broader and deeper through exposure to new books. I hope this post has helped you understand how Mia uses cutting-edge AI technologies to make this possible.