Since first appearing in 2018, deepfake technology has evolved from hobbyist experimentation to an effective and potentially dangerous tool. Here’s what it is and how it’s used. Don’t believe every video you see!
“Deep learning is revolutionizing so many fields, from robotics to medicine and everything in between,” said Obama, who joined the class by video conference.
After speaking a bit more on the virtues of artificial intelligence, Obama made an important revelation: “In fact, this entire speech and video are not real and were created using deep learning and artificial intelligence.”
Amini’s Obama video was, in fact, a deepfake: an AI-doctored video in which the facial movements of an actor are transferred to a target. Since first appearing in 2018, deepfake technology has evolved from hobbyist experimentation to an effective and dangerous tool. Deepfakes have been used against celebrities and politicians and have become a threat to the very fabric of truth.
How Do Deepfakes Work?
Deepfake applications work in various ways. Some transfer the facial movements of an actor to a target video, as in the video described at the beginning of this article or the Obama deepfake that comedian Jordan Peele created to warn about the threat of fake news.
Other deepfakes map the face of a target person onto other videos, such as one that puts Nicolas Cage’s face on characters from different movies.
Like most contemporary AI-based applications, deepfakes use deep neural networks (that’s where the “deep” in deepfake comes from), a type of AI algorithm that is especially good at finding patterns and correlations in large sets of data. Neural networks have proven to be especially good at computer vision, the branch of computer science and AI that handles visual data.
Deepfakes use a special type of neural-network structure called an “autoencoder.” Autoencoders are composed of two parts: an encoder, which compresses an image into a small amount of data, and a decoder, which decompresses that data back into the original image. The mechanism is similar to that of image and video codecs such as JPEG and MPEG.
But unlike classical encoder/decoder software, which works on groups of pixels, the autoencoder operates on the features found in images, such as shapes, objects, and textures. A well-trained autoencoder can go beyond compression and decompression and perform other tasks, such as generating new images or removing noise from grainy images. When trained on images of faces, an autoencoder learns the features of the face: the eyes, nose, mouth, eyebrows, and so on.
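To make the encoder/decoder idea concrete, here is a minimal sketch in plain Python. It is a toy, not how any deepfake tool is built: the class name TinyAutoencoder, the four-number "images," and the training settings are all assumptions chosen for illustration. Real autoencoders use deep, nonlinear networks trained on actual pixels. The toy only shows the principle: squeeze the input into a smaller code, rebuild it, and adjust the weights to shrink the reconstruction error.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyAutoencoder:
    """Toy linear autoencoder: n_inputs -> n_code -> n_inputs (illustrative only)."""

    def __init__(self, n_inputs, n_code, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.enc = init(n_code, n_inputs)   # encoder weights: compress
        self.dec = init(n_inputs, n_code)   # decoder weights: decompress

    def encode(self, x):
        return matvec(self.enc, x)

    def decode(self, code):
        return matvec(self.dec, code)

    def loss(self, data):
        """Squared reconstruction error per sample, averaged over the data."""
        total = 0.0
        for x in data:
            rebuilt = self.decode(self.encode(x))
            total += sum((r - t) ** 2 for r, t in zip(rebuilt, x))
        return total / len(data)

    def train_step(self, x, lr):
        """One gradient-descent step on 0.5 * squared error for one sample."""
        code = self.encode(x)
        err = [r - t for r, t in zip(self.decode(code), x)]
        # Gradient flowing back into the code (uses the decoder before its update).
        grad_code = [
            sum(self.dec[i][j] * err[i] for i in range(len(err)))
            for j in range(len(code))
        ]
        for i in range(len(err)):
            for j in range(len(code)):
                self.dec[i][j] -= lr * err[i] * code[j]
        for j in range(len(code)):
            for k in range(len(x)):
                self.enc[j][k] -= lr * grad_code[j] * x[k]


def make_toy_data(n_samples, seed=1):
    """Four-number 'images' that really contain only two independent values."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        a, b = rng.random(), rng.random()
        data.append([a, b, a, b])
    return data


def train_demo(epochs=300, lr=0.02):
    data = make_toy_data(20)
    model = TinyAutoencoder(n_inputs=4, n_code=2)
    before = model.loss(data)
    for _ in range(epochs):
        for x in data:
            model.train_step(x, lr)
    return model, before, model.loss(data)


if __name__ == "__main__":
    model, before, after = train_demo()
    print("reconstruction error before training:", before)
    print("reconstruction error after training: ", after)
```

Because each toy input repeats two values, a two-number code is enough to describe it, which is the sense in which an autoencoder learns the features of its data rather than memorizing raw values.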
Deepfake applications use two autoencoders—one trained on the face of the actor and the other trained on the face of the target. The application swaps the inputs and outputs of the two autoencoders to transfer the facial movements of the actor to the target.
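The swap itself is mostly a matter of routing. The sketch below assumes one common arrangement, which is not spelled out in this article: a single encoder shared by both faces, plus one decoder per person. Everything in it is a hand-written stand-in for a trained network, including the names shared_encoder, make_decoder, and swap_face and the convention that a "frame" is a short list of numbers. It shows only which piece feeds which.

```python
from typing import Callable, List

Vector = List[float]

# Toy convention (an assumption for illustration): a "frame" is a flat vector
# whose first IDENTITY_DIMS values say who the person is and whose remaining
# values describe the expression in that frame.
IDENTITY_DIMS = 2


def shared_encoder(frame: Vector) -> Vector:
    """Stand-in for a trained encoder: keep only the expression part."""
    return list(frame[IDENTITY_DIMS:])


def make_decoder(identity: Vector) -> Callable[[Vector], Vector]:
    """Stand-in for a decoder trained on one person's face."""
    def decode(code: Vector) -> Vector:
        # Rebuild a frame by attaching this decoder's own identity to the code.
        return list(identity) + list(code)
    return decode


ACTOR_ID = [1.0, 0.0]
TARGET_ID = [0.0, 1.0]
decode_actor = make_decoder(ACTOR_ID)
decode_target = make_decoder(TARGET_ID)


def reconstruct(frame: Vector, decoder: Callable[[Vector], Vector]) -> Vector:
    """Normal use during training: encode a face, decode it as the same person."""
    return decoder(shared_encoder(frame))


def swap_face(actor_frame: Vector) -> Vector:
    """The swap: encode the actor's frame, then decode it with the target's decoder."""
    return decode_target(shared_encoder(actor_frame))


if __name__ == "__main__":
    actor_frame = ACTOR_ID + [0.8, 0.1, 0.3]   # the actor making some expression
    print("actor frame:", actor_frame)
    print("swapped:    ", swap_face(actor_frame))
```

In this toy, the swapped frame carries the target's identity values and the actor's expression values, which mirrors the goal described above: the target's face performing the actor's movements.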
What Makes Deepfakes Special?
Deepfake technology isn’t the only kind that can swap faces in videos. In fact, the VFX (visual effects) industry has been doing this for decades. But before deepfakes, the capability was limited to deep-pocketed movie studios with access to plentiful technical resources.
Deepfakes have democratized the capability to swap faces in videos. The technology is now available to anyone who has a computer with a decent processor and a strong graphics card (such as the Nvidia GeForce GTX 1080) or who can spend a few hundred dollars to rent cloud computing and GPU resources.
That said, creating deepfakes is neither trivial nor fully automated. The technology is gradually getting better, but creating a decent deepfake still requires a lot of time and manual work.
First, you have to gather many photos of the faces of the target and the actor, and those photos must show each face from different angles. The process usually involves grabbing thousands of frames from videos that feature the target and actor and cropping them to contain only the faces. New deepfake tools such as Faceswap can do part of the legwork by automating the frame extraction and cropping, but they still require manual tweaking.
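The data-gathering step can be pictured with a small sketch in plain Python. It does not use Faceswap or any real video library. The frame is a nested list of pixel values, and the face box is assumed to come from some face detector that is not shown. The helper names sample_indices, expand_box, and crop are hypothetical.

```python
from typing import List, Sequence, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def sample_indices(total_frames: int, every_n: int) -> List[int]:
    """Pick every n-th frame so near-identical neighbors are skipped."""
    return list(range(0, total_frames, every_n))


def expand_box(box: Box, frame_w: int, frame_h: int, margin_pct: int = 20) -> Box:
    """Grow a face box by a margin, clamped so it stays inside the frame."""
    left, top, right, bottom = box
    pad_x = (right - left) * margin_pct // 100
    pad_y = (bottom - top) * margin_pct // 100
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(frame_w, right + pad_x),
        min(frame_h, bottom + pad_y),
    )


def crop(frame: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the boxed region out of a frame stored as rows of pixel values."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in frame[top:bottom]]


if __name__ == "__main__":
    width, height = 20, 16
    # Fake grayscale frame: each pixel value encodes its own position.
    frame = [[y * width + x for x in range(width)] for y in range(height)]
    detected = (10, 4, 20, 14)   # hypothetical detector output
    box = expand_box(detected, width, height)
    face = crop(frame, box)
    print("frames to keep:", sample_indices(10, 4))
    print("expanded box:", box)
    print("crop size:", len(face[0]), "x", len(face))
```

The manual tweaking mentioned above typically lives around steps like these: deciding how densely to sample, how much margin to leave, and which crops to discard.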
Training the AI model and creating the deepfake can take anywhere from several days to two weeks, depending on your hardware configuration and the quality of your training data.
The Dangers of Deepfakes
Creating fun educational videos and custom casts for your favorite movies are not the only uses of deepfakes. AI-doctored videos have a darker side that has become much more prominent than their positive and benign uses.
Shortly after the first deepfake program was released, Reddit became flooded with fake pornography videos that featured celebrities and politicians. In tandem with deepfakes, the development of other AI-powered technologies has made it possible to fake not only the face but also the voice of virtually anyone.