One thing that surprised me about using AI on children’s drawings was just how difficult it is to get a model to predict a good character segmentation map that’s suitable for animation. One reason is that many characters are drawn in a “hollow” manner; part or all of the body is outlined by a stroke, but the inside is unmarked. Because both the inside and outside of the character have the same color and texture, you can’t rely on texture cues to infer which pixels belong to the character. This is a fundamental difference between sketches and photographs. We’re still experimenting with different combinations of models, but as of yet, we haven’t found anything that consistently performs as well as a non-deep-learning–based segmentation pipeline.
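To make the “hollow” character problem concrete, here is a minimal sketch of one classical, non-learned way to recover a mask from an outline: flood-fill the background inward from the image border and treat every pixel the fill cannot reach as part of the character. The interview does not describe the actual pipeline, so the function name, the binary-grid input, and the flood-fill approach itself are all assumptions made for illustration.

```python
from collections import deque


def segment_hollow_figure(stroke):
    """Return a mask (1 = character) from a binary stroke image.

    Hypothetical helper, not the project's actual pipeline.
    `stroke` is a 2D list where 1 marks an ink pixel and 0 marks blank paper.
    Blank pixels reachable from the image border are background; everything
    else (the outline plus the blank area it encloses) is the character.
    """
    height, width = len(stroke), len(stroke[0])
    outside = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the fill with every blank pixel on the image border.
    for r in range(height):
        for c in range(width):
            on_border = r in (0, height - 1) or c in (0, width - 1)
            if on_border and not stroke[r][c]:
                outside[r][c] = True
                queue.append((r, c))

    # Breadth-first fill through 4-connected blank pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width:
                if not outside[nr][nc] and not stroke[nr][nc]:
                    outside[nr][nc] = True
                    queue.append((nr, nc))

    return [[0 if outside[r][c] else 1 for c in range(width)]
            for r in range(height)]
```

A sketch like this only works when the outline is closed. A single gap in the stroke lets the fill leak into the body, which is one reason real drawings need extra cleanup before a step like this.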
Are certain kinds of animations more difficult than others to pull off?
JS: One rule of character animation is that the style and quality of the motion should match those of the character. Because these characters are mostly drawn in a flat 2D manner, we flatten the motion capture data down to 2D prior to retargeting it onto the character. This works better for some motions than others. Motion that lies along a single axis, like a boxer throwing a punch, or two axes, like a dancer doing the Charleston, both work well. But if the motion takes place in all three spatial dimensions, like Neo dodging bullets in The Matrix, that won’t look great after it’s applied to the character.
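As a rough illustration of the flattening step, the sketch below drops one axis from each 3D joint position (an orthographic projection) and measures how much of the frame-to-frame movement lay along the dropped axis. The interview does not say how the flattening is actually done, so the projection choice, the data layout, and both function names are assumptions.

```python
def flatten_to_2d(frames, drop_axis=2):
    """Project 3D joint positions to 2D by discarding one axis.

    Hypothetical helper. `frames` is a list of frames, each a list of
    (x, y, z) joint positions. Returns frames of 2-tuples.
    """
    keep = [i for i in range(3) if i != drop_axis]
    return [[(joint[keep[0]], joint[keep[1]]) for joint in frame]
            for frame in frames]


def dropped_motion_fraction(frames, drop_axis=2):
    """Fraction of squared frame-to-frame joint displacement that lies
    along the dropped axis: 0.0 means nothing is lost by flattening,
    1.0 means all of the motion disappears.
    """
    lost = 0.0
    total = 0.0
    for prev, cur in zip(frames, frames[1:]):
        for p, q in zip(prev, cur):
            delta = [q[i] - p[i] for i in range(3)]
            lost += delta[drop_axis] ** 2
            total += sum(d * d for d in delta)
    return lost / total if total else 0.0
```

Under these assumptions, a punch thrown along the x axis keeps all of its movement after flattening, while a motion that leans toward or away from the camera loses whatever part of it lay along the dropped axis.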
Have you been able to see any children’s reactions to this work?
JS: I have! Some parents sent me reaction videos of their children initially viewing the animations — it’s great seeing them smile and laugh as they watch the output. I also saw a wonderful video of a 2-year-old excitedly emulating the motion of the character on-screen.
Honestly, seeing those reactions has been a highlight of this project and helped convince me it was worthwhile to turn this work into a public demo for everyone to try.
Someday, projects like this may lead to new animation tools for kids and adults, but can you tell us how this shift from real-life images to less conventional images could apply to other types of images?
JS: That’s a great question. I don’t want to extrapolate too broadly about the shift from real-life images to abstract representations, since this project is only a single example. I will mention, though, that even though we focused on the domain of children’s drawings, our demo does a good job of animating people in clip art images. I’m very curious what other sorts of less conventional images people will try to animate and what they will do with the outputs.
Does broadening the scope of the model to human figures drawn by both children and adults introduce a new set of challenges? What are they?
JS: It really depends on the sophistication with which the adults draw. Figures drawn by adults who are amateur artists are similar enough to children’s sketches. However, if the drawings are done with proper perspective and foreshortening, the animations may not work well or be appealing. A different animation pipeline (likely involving creating a 3D representation of the character) might be necessary.
Where do you see this project going in the future? Is there any way that people can add their own sounds or voice-overs and create more complete animations, for example?
JS: In many ways, this is just a first step into the domain of animating children’s sketches. There are more complicated types of animation that require more complicated analysis of the character. Facial animation, hallucinating undrawn parts of the character, and adjusting motion based on the character’s personality all fall into this category. However, these approaches require more data, and there aren’t many large-scale, easily usable, annotated data sets of children’s drawings of human figures. We hope that people who use this demo will be willing to share their children’s drawings so that we can build this much-needed data set and share it with other researchers.
Even without more data, though, I think there are a lot of ways to extend this work. Adding sounds is one way. Another is letting people use their own bodies instead of preexisting motion capture clips to drive character motions. Personally, I think it would be very cool to extend this work by creating a video game where you draw the characters. Maybe it’s an action-adventure game, or maybe it’s a puzzle game where the characters need to be drawn in certain ways to complete the puzzles.
How does this project advance AI research? What broader effects do you hope your work will achieve?
JS: At Meta AI, I’m surrounded by brilliant people who are pushing the boundaries of what AI can do. However, I think it’s also important for us to focus on finding the best ways to apply those AI advances. For me, this project is about identifying ways we can combine those recent AI gains with more traditional animation techniques to delight and bring joy to people.
I hope our project inspires others to rethink what’s possible and to create the tools they wish had existed when they were younger. I also hope people are encouraged to share their children’s drawings with us so that we can further develop this demo and allow other researchers to consider how their work might be applied to the domain of children’s drawings.