Hotdogs, Emoji and Faces: How AI Is Learning to See
If you’ve noticed a recent surge in companies saying they’re using artificial intelligence, you aren’t alone. Even simple-sounding apps are suddenly “built on AI”, such as the hotdog identifier Not Hotdog or Google’s Allo chat app, which turns selfies into custom emoji.
If you’re the skeptical type, you’re probably distrustful of these claims. Buzzwords like “big data” and “cloud processing” were used by pretty much every startup in their heyday, too. But some of these whimsical apps are surprisingly legit.
Not Hotdog, for instance, is built on top of Google’s machine learning library, TensorFlow, plus the open source neural network library Keras. Over several months, its developers tested multiple frameworks, trained their AI on edge cases, and figured out how to run the resulting model using only mobile phone processors.
Small applications like Not Hotdog offer a window into the larger challenge of accurately tagging thousands of distinct objects, which we’ve spent years on at GumGum. Eye color, for example, is tricky for machines. A convolutional neural network learns to recognize eyes by breaking images of faces into pieces, examining each piece separately, then merging the results so the image can be considered as a whole. Do the pieces have the characteristics of a face? If so, which piece is the eye, and what color is it? Once the network has the answer, we can overlay a glow of the correct color and size on a person’s iris in an image.
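To make the "break into pieces, examine each, then merge" idea concrete, here is a toy sketch in plain Python. It is not GumGum's or Not Hotdog's code: the tiny image, the hand-written edge filter and the function names `convolve2d` and `max_pool` are all assumptions made for illustration. A real convolutional network learns its filters from many labeled images and stacks many such layers.

```python
# Toy illustration only: a hand-written filter stands in for the weights
# a real convolutional network would learn from labeled training images.

def convolve2d(image, kernel):
    """Slide the kernel over the image and score each small patch separately."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the k x k patch whose top-left corner is (r, c)
            score = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(score)
        feature_map.append(row)
    return feature_map


def max_pool(feature_map, size=2):
    """Merge neighboring patch scores by keeping the strongest in each block."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            block = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(block))
        pooled.append(row)
    return pooled


# Hypothetical 5x5 grayscale image: dark (0) on the left, bright (1) on the right
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hand-picked filter that responds to a dark-to-bright vertical edge
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, edge_kernel)  # one score per patch
pooled = max_pool(feature_map)                # merged, coarser summary
print(pooled)  # [[2, 0], [2, 0]]
```

The high scores in the left column of the pooled result mark where the dark-to-bright edge sits. In a trained network, later layers would combine many such responses to answer questions like "is this an eye?" rather than "is this an edge?".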
While each neural network is its own unique model, some details are consistent. Neural networks, for instance, loosely mimic the way our visual cortex processes visual information, so some of the underlying mathematics is shared across applications. Several of the most important ideas can be learned from The Visionary’s micro-course on machine learning and computer vision.
The web is going visual: YouTube reports a billion hours of video watched every day. Instagram gets 95 million uploads per day. Mary Meeker’s Internet Trends 2017 report notes that images and video will soon be the core tech for search, augmented reality, social networking and advertising. All of these uses depend on computer vision, as do important real-world applications like autonomous vehicles.
So the profusion of computer vision apps and developers working on them actually couldn’t come at a better time. For more of the developments in computer vision, subscribe to our newsletter.
This article originally appeared on Engadget.
Illustrations by Sergio Membrillas