Talk to any artificial intelligence researcher and they’ll tell you that, while A.I. may be capable of complex acts like driving cars and spotting tiny details on X-ray scans, it is still way behind when it comes to the generalized abilities of even a 3-year-old kid. This is sometimes called Moravec’s paradox: the seemingly hard stuff is easy for an A.I., while the seemingly easy stuff is hard.
But what if you could teach an A.I. to learn like a kid? And what kind of training data would you need to feed into a neural network to carry out the experiment? Researchers from New York University recently set out to explore these questions using a dataset of video footage taken from head-mounted cameras worn regularly by kids during their first three years of life.
This SAYcam data was collected by psychologist Jess Sullivan and colleagues and described in a paper published earlier this year. The kids recorded their GoPro-style experiences for one to two hours per week as they went about their daily lives. The researchers gathered the footage to create a “large, naturalistic, longitudinal dataset of infant and child-perspective videos” for use by psychologists, linguists, and computer scientists.
Training an A.I. to view the world like a kid
The New York University researchers then took this video data and used it to train a neural network.
“The goal was to address a nature vs. nurture-type question,” Emin Orhan, lead researcher on the project, told Digital Trends in an email. “Given this visual experience that children receive in their early development, can we learn high-level visual categories (such as table, chair, cat, car, etc.) using generic learning algorithms, or does this ability require some kind of innate knowledge in children that cannot be learned by applying generic learning methods to the early visual experience that children receive?”
The A.I. did show some learning by, for example, recognizing a cat that was frequently featured in the video. While the researchers didn’t create anything close to a kid version of Artificial General Intelligence, the research nonetheless highlights how certain visual features can be learned simply by watching naturalistic data. There’s still more work to be done, though.
“We found that, by and large, it is possible to learn pretty sophisticated high-level visual concepts in this way without assuming any innate knowledge,” Orhan explained. “But understanding precisely what these machine learning models trained with the headcam data are capable of doing, and what exactly is still missing in these models compared to the visual abilities of children, is still [a] work in progress.”
A paper describing the research is available to read online.