For this feature, Facebook engineers were faced with an obstacle from the very beginning: ML systems need to be trained with relevant examples, but those examples were scarce in this case. Content related to blood donation was common enough to justify a new feature but still relatively rare, with thousands of pieces of relevant text scattered across billions of user-generated posts, and even those examples hadn’t been collected into a usable data set. An AI system would need enough recall to find those (comparatively) few posts and enough precision to avoid misinterpretations, which could lead to unwanted messages and negative reactions that might limit the feature’s impact.
It was, in other words, the AI version of a chicken-and-egg dilemma — a feature that called for a system trained using examples that weren’t readily available.
Solving this challenge required the AI engineers working on the feature to take a different approach.
Using AI to train better AI
To break down the steps that led to the blood donations feature’s final AI model, let’s circle back to the foundation of virtually every kind of AI: training.
Training on data is how AI learns, but it’s more than that: for systems with no other input, training often determines their very structure. While humans might use flash cards and textbooks to learn, those are just training aids that build on people’s preexisting knowledge and ability to put new information into context. AI models, on the other hand, have no innate powers of learning and self-improvement. Most train only on the information they’re given, and that information has to be as specific and as purpose-oriented as possible, because machines have neither the versatility nor the humanlike agency to learn from general data. AI does what it’s trained to do, and in the case of blood donation-related content, there weren’t enough useful examples to properly train the system.
Our engineers’ response to this problem was a hand-tuned approach to AI: they started by building a simple ML model that could begin to find additional training data, however imperfect. That first model led to a series of further models, each one finding the equivalent of better flash cards and textbooks and passing them on to the next system, with the ultimate goal of an AI system that could reliably identify when someone was expressing interest in giving blood.
One reason this task was so challenging, including the work of finding examples for training purposes, was that text related to blood donation isn’t limited to a single kind of interaction. Relevant content ranges from posts in Groups that mention a potential donor’s location to freeform text posts that specify nothing beyond an interest in blood donation. The AI system had to reliably identify the right posts and comments across all of these forms.
The training process started with the system simply looking for posts that contained relevant keywords, such as “blood” and “drive.” That initial set of posts was then used to train a more sophisticated AI system that could recognize relevant posts even when they didn’t contain those exact words. That system, in turn, was used to spot more example posts, which were then used to train an even more effective AI system. Creating and deploying this series of fine-tuned ML systems took three days, whereas a more standard approach would have taken months.
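To make the bootstrapping idea concrete, here is a minimal sketch of that kind of loop, assuming a generic off-the-shelf text classifier from scikit-learn and made-up helper names such as keyword_match and train_classifier. It illustrates the general technique of seeding labels with keywords and retraining in rounds, not Facebook’s actual pipeline or models.

```python
# Minimal sketch of the bootstrapping loop: seed labels from keywords,
# train a simple classifier, use it to relabel the candidate pool, repeat.
# A generic scikit-learn text classifier stands in for the production models;
# all names and thresholds here are illustrative.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

SEED_KEYWORDS = {"blood", "drive", "donor", "donate"}

def keyword_match(post: str) -> bool:
    """Round 0: crude labeling based on seed keywords."""
    words = set(post.lower().split())
    return len(SEED_KEYWORDS & words) >= 2

def train_classifier(posts, labels):
    """Train a simple text model on the current labels."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(posts, labels)
    return model

def bootstrap(candidate_posts, rounds=3, threshold=0.8):
    """Iteratively improve labels and models over a pool of candidate posts.

    Assumes the pool contains a mix of relevant and irrelevant text, so each
    round sees both positive and negative examples."""
    labels = [int(keyword_match(p)) for p in candidate_posts]
    model = None
    for _ in range(rounds):
        model = train_classifier(candidate_posts, labels)
        # Posts the new model scores highly become the positive examples
        # for the next, more capable round of training.
        scores = model.predict_proba(candidate_posts)[:, 1]
        labels = [int(score >= threshold) for score in scores]
    return model
```

In a loop like this, the step between rounds is where hand curation matters most: reviewing the newly surfaced examples before retraining keeps the cascade from simply reinforcing its own early mistakes.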
When we later expanded the feature to work in more countries, our engineers adjusted the system to accommodate additional languages and different character sets. In Pakistan, for example, the AI was trained to recognize posts that were related to blood donation and written either in Urdu characters or in Anglicized Urdu. Training the system to understand the original posts rather than text that had been converted to Latin characters meant it was better able to spot nuances that indicated interest in blood donation.
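As a rough illustration of what handling both scripts might involve, the sketch below detects whether a post contains Arabic-script (Urdu) characters and checks it against script-appropriate seed terms, instead of transliterating everything to Latin characters first. The keyword lists, Unicode ranges, and function name are illustrative assumptions, not Facebook’s actual vocabulary or logic.

```python
# Hypothetical sketch: keep posts in their original script and match them
# against script-specific seed terms, rather than converting Urdu text to
# Latin characters before processing.

import re

# Arabic-script ranges cover the characters used to write Urdu.
URDU_SCRIPT = re.compile(r"[\u0600-\u06FF\u0750-\u077F]")

# Illustrative seed terms only; a real vocabulary would be far larger.
SEEDS_URDU = {"خون", "عطیہ"}                    # "blood", "donation" in Urdu script
SEEDS_ROMAN_URDU = {"khoon", "khun", "atiya"}   # common romanized spellings
SEEDS_ENGLISH = {"blood", "donate", "donor"}

def seed_match(post: str) -> bool:
    """Check a post against the seed list appropriate to its script."""
    if URDU_SCRIPT.search(post):
        return any(term in post for term in SEEDS_URDU)
    tokens = set(post.lower().split())
    return bool((SEEDS_ROMAN_URDU | SEEDS_ENGLISH) & tokens)
```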
A case study in curated machine learning
Counterintuitive as it might seem, part of our approach to improving accuracy was to allow certain types of false positives into the system during training. Because training data was scarce, it was important to assess the kinds of false positives that were specific to content related to blood donation. For example, a person who posts “I donated blood today and I’m happy I did it” is clearly talking about a relevant activity, but it wouldn’t be appropriate to invite someone to give again so soon after donating. Though the system wouldn’t act on this kind of example, it was useful training data for the AI model.
And since no AI system is perfect, our engineers adjusted the model to include another class of examples, called near-positive negatives. These were cases in which the content was in the same overall domain (related to organ donation, for example) but not directly related to blood donation or requests for it. They helped improve the final AI system’s understanding of the exceptions and edge cases that reveal what is and isn’t related to blood donation.
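One way to picture this curation is as a small labeling scheme in which “already donated” posts and near-positive negatives stay in the training data but never trigger an invitation at serving time. The sketch below uses made-up label names and policy logic to illustrate that separation; it is an interpretation of the approach described above, not the production system.

```python
# Hypothetical label scheme: some examples exist only to teach the model
# where the domain's edges are, and are never acted on at serving time.

from dataclasses import dataclass

ACTIONABLE = "wants_to_donate"                  # e.g., asking where to give blood
ALREADY_DONATED = "recently_donated"            # e.g., "I donated blood today..."
NEAR_POSITIVE_NEG = "related_but_out_of_scope"  # e.g., organ-donation posts
UNRELATED = "unrelated"

@dataclass
class Example:
    text: str
    label: str

def relevance_targets(examples):
    """For training: posts about having just donated still count as relevant,
    while near-positive negatives stay in the pool as hard negatives that
    sharpen the model's sense of what falls outside blood donation."""
    texts = [ex.text for ex in examples]
    targets = [int(ex.label in {ACTIONABLE, ALREADY_DONATED}) for ex in examples]
    return texts, targets

def should_invite(example: Example) -> bool:
    """At serving time, only posts expressing interest in donating lead to an
    invitation, so recent donors are not prompted to give again."""
    return example.label == ACTIONABLE
```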