Using open-source software and some pocket change, a researcher created plausible images and audio of actor Tom Hanks
There are many photos of Tom Hanks, but none like the images of the leading everyman shown at the Black Hat computer security conference Wednesday: They were made by machine-learning algorithms, not a camera.
Philip Tully, a data scientist at security company FireEye, generated the hoax Hankses to test how easily open-source software from artificial intelligence labs could be adapted to misinformation campaigns. His conclusion: “People with not a lot of experience can take these machine-learning models and do pretty powerful things with them,” he says.
Seen at full resolution, FireEye’s fake Hanks images have flaws like unnatural neck folds and skin textures. But they accurately reproduce the familiar details of the actor’s face, like his brow furrows and green-grey eyes, which gaze coolly at the viewer. At the scale of a social network thumbnail, the AI-made images could easily pass as real.
To make them, Tully needed only to gather a few hundred images of Hanks online and spend less than $100 to tune open-source face-generation software to his chosen subject. Armed with the tweaked software, he cranks out Hanks. Tully also used other open-source AI software to attempt to mimic the actor’s voice from three YouTube clips, with less impressive results.
By demonstrating just how cheaply and easily a person can generate passable fake photos, the FireEye project could add weight to concerns that AI technology capable of producing convincing images or speech will magnify online disinformation. Those techniques and their output are often called deepfakes, a term taken from the name of a Reddit account that late in 2017 posted pornographic videos modified to include the faces of Hollywood actresses.
Most deepfakes observed in the wilds of the internet are low quality and created for pornographic or entertainment purposes. So far, the best-documented malicious use of deepfakes is harassment of women. Corporate projects or media productions can create slicker output, including videos, on bigger budgets. FireEye’s researchers wanted to show how someone could piggyback on sophisticated AI research with minimal resources or AI expertise. Members of Congress from both parties have raised concerns that deepfakes could be bent for political interference.
Tully’s deepfake experiments took advantage of the way academic and corporate AI research groups openly publish their latest advances and often release their code. He used a technique known as fine-tuning in which a machine-learning model built at great expense with a large data set of examples is adapted to a specific task with a much smaller pool of examples.
To make the fake Hanks, Tully adapted a face-generation model released by Nvidia last year. The chip company made its software by processing millions of example faces over several days on a cluster of powerful graphics processors. Tully adapted it into a Hanks-generator in less than a day on a single graphics processor rented in the cloud. Separately, he cloned Hanks’ voice in minutes using only his laptop, three 30-second audio clips, and a grad student's open-source recreation of a Google voice-synthesis project.
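For readers who want to see the idea behind fine-tuning in miniature, here is a rough sketch. It is not Tully's code or Nvidia's model: a toy linear model stands in for the face generator, and every name and number in it (the two "tasks," the data set sizes, the step counts, the learning rate) is an assumption made up for illustration. It only shows why starting from a model trained on lots of general data can beat starting from nothing when the specific data is scarce and the training budget is small.

```python
import random


def predict(weights, bias, features):
    """Linear model: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def make_examples(count, true_weights, true_bias, rng, noise=0.1):
    """Generate noisy (features, target) pairs for a hypothetical task."""
    examples = []
    for _ in range(count):
        features = [rng.uniform(-1.0, 1.0) for _ in true_weights]
        target = predict(true_weights, true_bias, features) + rng.gauss(0.0, noise)
        examples.append((features, target))
    return examples


def mean_squared_error(weights, bias, examples):
    total = sum((predict(weights, bias, x) - y) ** 2 for x, y in examples)
    return total / len(examples)


def train(weights, bias, examples, steps, learning_rate=0.1):
    """Plain full-batch gradient descent, starting from the given parameters."""
    weights = list(weights)  # copy so the caller's starting point is untouched
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, target in examples:
            error = predict(weights, bias, features) - target
            for i, x in enumerate(features):
                grad_w[i] += 2.0 * error * x / len(examples)
            grad_b += 2.0 * error / len(examples)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def compare_fine_tuning_to_scratch(seed=0):
    rng = random.Random(seed)
    # Made-up tasks: the specific task is a small variation on the general one.
    general_task = ([2.0, -1.0, 0.5, 3.0], 0.0)
    specific_task = ([2.2, -0.9, 0.5, 2.8], 0.3)

    big_dataset = make_examples(2000, *general_task, rng)   # the expensive part
    small_dataset = make_examples(20, *specific_task, rng)  # the cheap part
    held_out = make_examples(500, *specific_task, rng)      # for evaluation only

    zeros = [0.0] * 4
    # Long training run on the large general data set.
    pretrained = train(zeros, 0.0, big_dataset, steps=150)
    # Fine-tuning: a short run on the small data set, starting from pretrained.
    fine_tuned = train(*pretrained, small_dataset, steps=10)
    # Baseline: the same short run on the same small data set, starting from zero.
    from_scratch = train(zeros, 0.0, small_dataset, steps=10)

    return {
        "fine_tuned": mean_squared_error(*fine_tuned, held_out),
        "from_scratch": mean_squared_error(*from_scratch, held_out),
    }


errors = compare_fine_tuning_to_scratch()
print("held-out error, fine-tuned:  ", errors["fine_tuned"])
print("held-out error, from scratch:", errors["from_scratch"])
```

In this toy setup the fine-tuned model ends up with the lower held-out error, because the short run only has to close the small gap between the general task and the specific one. Real generative models are vastly larger and are trained with different objectives, so the sketch conveys the shape of the technique rather than its scale.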