Great Moments In Computer Vision
Teaching computers how to ‘see’ has been a goal of researchers since electronic thinking machines first filled entire basements in the 1950s. But the story of how computer vision developed starts with the first time humans succeeded at capturing light on metal plates, and follows through the digital revolution and the contemporary resurgence of artificial intelligence (AI). Today, computer vision shapes everything from facial recognition technology to driverless cars. Here’s how we got where we are now:
Le Daguerreotype (1839)
In a Paris salon in January 1839, Louis-Jacques-Mandé Daguerre amazed the French Académie des Sciences by demonstrating a way to permanently capture images. The Daguerreotype camera exposed sheets of copper coated with silver iodide to light; the plates were then developed using mercury vapor and fixed with a salt solution. Voilà! Photography was born.
Film is introduced (1888)
In 1888, George Eastman introduced the Kodak, a simple box camera with a fixed lens. The contraption’s key innovation: It saved images to celluloid film, rather than metal plates. Owners could capture up to 100 images, but needed to send the device back to the factory to process pictures and reload the camera.
Instant photos (1947)
After Edwin Land unveiled the Polaroid Model 95 in February 1947, eager shutterbugs no longer had to wait for images to be processed before they could see them. Polaroids remained the fastest way to capture and view images until the first commercial digital cameras appeared in the late 1980s. The phrase “shake it like a Polaroid picture” was later immortalized in the Outkast song, “Hey Ya!”
AI goes to summer camp (1956)
In the mid-1950s, legendary scientists Marvin Minsky, Claude Shannon, John McCarthy, and Nathaniel Rochester proposed a two-month summer project at Dartmouth to study “how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.” This was the first known use of the phrase “artificial intelligence.”
Birth of the pixel (1957)
In the spring of 1957, National Bureau of Standards scientist Russell Kirsch took a photo of his infant son Walden and scanned it into the Standards Eastern Automatic Computer (SEAC). To fit the image into SEAC’s limited memory, he divided the picture into a 176 × 176 grid, a total of 30,976 one-bit pixels, and scanned it multiple times. The five-centimeter-square photo was the first digital image ever created; it would eventually lead to CAT scans, satellite imagery, and digital photography.
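For readers who want to see what a grid of one-bit pixels looks like in practice, here is a minimal Python sketch. It is not Kirsch’s actual procedure: the synthetic brightness ramp, the three threshold values, and the function names `scan_one_bit` and `combine_scans` are all assumptions made for illustration. It shows one plausible reason to scan the same picture several times: each pass yields only black or white, but adding passes made at different thresholds recovers a few shades of gray.

```python
GRID = 176  # grid size quoted in the article: 176 x 176 pixels


def scan_one_bit(image, threshold):
    """Return a 1-bit scan: 1 where brightness >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def combine_scans(scans):
    """Add several 1-bit scans pixel by pixel to approximate gray levels."""
    height, width = len(scans[0]), len(scans[0][0])
    return [
        [sum(scan[y][x] for scan in scans) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical input: a left-to-right brightness ramp from 0.0 to 1.0,
# standing in for the photograph.
image = [[x / (GRID - 1) for x in range(GRID)] for _ in range(GRID)]

total_pixels = GRID * GRID  # 176 * 176 = 30,976, matching the article

# Three passes at assumed thresholds; each pass alone is pure black and white.
scans = [scan_one_bit(image, t) for t in (0.25, 0.5, 0.75)]
gray = combine_scans(scans)  # values 0..3, i.e. four gray levels
```

Each individual scan costs one bit per pixel, which is what makes the 30,976-pixel figure equal to 30,976 bits of storage per pass.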
Playing with blocks (1963)
In May 1963, MIT graduate student Larry Roberts submitted a PhD thesis outlining how machines could perceive solid three-dimensional objects by breaking them down into simple two-dimensional figures. Roberts’ “Blocks World” provided the basis for future computer vision research. He would later go on to oversee the development of ARPANET, the forerunner to today’s internet.
The “Summer Vision Project” (1966)
Ten years after the first AI brainstorming session, MIT professor Seymour Papert gave his undergrads a summer assignment: Work on a visual system that could divide pictures into “likely objects, likely background areas, and chaos.” This was one of the earliest attempts to apply artificial intelligence to pattern recognition.
The charge-coupled device is invented (1969)
Around the time ARPANET went live in the fall of 1969, Bell Labs scientists Willard S. Boyle and George E. Smith were busy inventing the charge-coupled device (CCD). The CCD, which converts photons into electrical signals, quickly became the preferred technology for capturing high-quality digital images. In October 2009, the pair shared the Nobel Prize in Physics for their invention.
First digital camera (1975)
In December 1975, Eastman Kodak engineer Steven Sasson MacGyvered together the world’s first digital camera using discarded parts from a Super 8 movie camera, a voltmeter, a finicky 100×100-pixel CCD, and a half-dozen circuit boards. The eight-pound unit took 23 seconds to capture a black-and-white image 0.01 megapixels (10,000 pixels) in size. The images were recorded to cassette tape and displayed on a black-and-white TV.
Convolutional neural networks (1980s)
Inspired by the animal visual cortex, which uses both simple and complex brain cells to process images, convolutional neural networks (CNNs) rely on stacked layers of neurons with overlapping fields of view to identify and isolate patterns. Among the more notable examples are Kunihiko Fukushima’s Neocognitron and Yann LeCun’s LeNet. CNNs are essential components of image, speech, and handwriting recognition systems.
DARPA Grand Challenge (2005)
Relying heavily on computer vision to identify landscape features and avoid obstacles, the first fully autonomous car completed a 132-mile trek through the Nevada desert in October 2005, capturing the $2 million prize offered by DARPA. The winning vehicle, built by a team from Stanford University, completed the journey in a blistering six hours and 53 minutes.
Neural networks get their game on (2005)
Training neural networks used to be both resource intensive and extremely slow. That changed in 2005, after Microsoft’s Dave Steinkraus and Patrice Simard, along with Nvidia’s Ian Buck, published a paper describing how to do it using off-the-shelf graphics processing units (GPUs) found in game consoles. The result: Training that was multiple times faster, much cheaper, and more accurate.
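To make the pattern-spotting idea behind convolutional neural networks (described above) a little more concrete, here is a minimal Python sketch of the sliding-window operation at their core. It is an illustration only, not code from the Neocognitron or LeNet: the 4×4 image, the hand-picked 2×2 kernel, and the function name `convolve2d` are assumptions. In a real CNN the kernel values are learned during training rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the elementwise products.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    k_height, k_width = len(kernel), len(kernel[0])
    out_height = len(image) - k_height + 1
    out_width = len(image[0]) - k_width + 1
    output = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            total = 0
            for i in range(k_height):
                for j in range(k_width):
                    total += image[y + i][x + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds where brightness rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
# feature_map is [[0, 2, 0], [0, 2, 0], [0, 2, 0]]:
# the only strong response sits on the vertical edge.
```

Neighboring windows share pixels, which is the sense in which each output neuron’s view of the image overlaps with the next one’s.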
Generative Adversarial Networks face off (2014)
In 2014, a team of researchers at the University of Montréal introduced the idea that machines can learn to produce realistic data by having two neural networks compete against each other. One network attempts to generate fake data that looks like the real thing; the other network tries to discriminate the fake from the real. Over time, both networks improve, until the generator produces data so real the discriminator can’t tell the difference. Generative Adversarial Networks (GANs) are considered by many to be the next great breakthrough in computer vision.
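The tug-of-war between the two networks can be expressed as a pair of scores, sketched below in Python. This is not a working GAN, and it trains nothing: it only computes the standard cross-entropy losses from the discriminator’s guesses. The function names and the sample probabilities are assumptions for illustration, and the generator loss shown is the commonly used “non-saturating” variant rather than the only possible choice.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Loss for the discriminator.

    d_real: its estimated probability that a real sample is real.
    d_fake: its estimated probability that a generated sample is real.
    The loss is low when d_real is near 1 and d_fake is near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Loss for the generator (non-saturating form).

    The loss is low when the discriminator is fooled, i.e. d_fake near 1.
    """
    return -math.log(d_fake)


# Assumed example values, early in training: the discriminator easily
# spots fakes, so its loss is low and the generator's loss is high.
early_d = discriminator_loss(d_real=0.9, d_fake=0.1)
early_g = generator_loss(d_fake=0.1)

# Assumed example values, late in training: the discriminator can only
# guess (probability 0.5 either way), so the generator's loss has fallen.
late_d = discriminator_loss(d_real=0.5, d_fake=0.5)
late_g = generator_loss(d_fake=0.5)
```

Each network is updated to lower its own loss, which pushes the other’s up; the standoff at 0.5 corresponds to a discriminator that can no longer tell the difference.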
Machines 1, humans 0 (2015)
Since 2010, the ImageNet Large Scale Visual Recognition Challenge has tested how well computers identify images, with humans as the yardstick. In 2015, the machines won, with Google’s and Microsoft’s neural networks producing fewer recognition errors than their biped competitors. At around the same time, Facebook announced that its DeepFace facial recognition algorithm identifies the correct person 97.35 percent of the time, putting it more or less on par with people. Puny humans: who needs ’em?
Amazon drones get ‘eyes’ (2016)
As part of its unmanned aerial vehicle (UAV) delivery program, the giant e-tailer is planning to add the ability to ‘see’ obstacles and landing areas to its Amazon Prime Air drones. Using computer vision, Amazon hopes to enable its UAVs to distinguish between a grassy area and a swimming pool, or between a tree and the reflection of a tree in a window.
Self-driving cars hit the road (2017)
Volvo announced it will put 100 self-driving XC90 SUVs on the road in Gothenburg, Sweden, this year. The Drive Me project is the next step in Volvo’s plan to sell fully autonomous “death-proof” vehicles direct to customers by 2021. Ford also revealed it will road test 100 self-driving autos in Europe in 2017. GM, BMW, and Google’s Waymo subsidiary are all ramping up their tests of vision-guided cars this year; Elon Musk has declared that a Tesla running Autopilot will drive itself from Los Angeles to New York before the end of 2017.
Feds face consequences of facial recognition (2017)
Congressional hearings in March 2017 put a spotlight on the FBI’s facial recognition database, but not a positive one. Members of the House Committee on Oversight and Government Reform found the agency failed to issue a privacy impact assessment about the database, which contains the images of approximately 50 percent of American adults. They also found high levels of inaccuracy and racial bias, and called for stricter enforcement and more regulation of how such images are collected and used.
Illustrations by Major Savage