Unveiling the Truth: Effective Techniques for Recognizing Synthetic Videos Crafted by AI Systems Today

With the arrival of OpenAI’s SORA text-to-video model, we’re staring at an inescapable future full of AI-generated video. But the technology hasn’t yet been perfected, so here are some tips for spotting AI-generated video (for now).

Spotting AI-Generated Content Can Be a Challenge

At first glance, you’d be forgiven for passing AI-generated video off as the real deal. It’s only when you start to look a little deeper that you might start to notice something is amiss.

All of the examples we’ll talk about in this article pertain to OpenAI’s SORA text-to-video model, announced in February 2024. It’s arguably the most advanced model of its kind so far, converting text prompts into moving images. Things have come a long way since the infamous Will Smith eating spaghetti Reddit post surfaced in early 2023. At the time of writing in March 2024, SORA is still in closed testing.

Spotting AI-generated photos and videos is more of an art than an exact science. There are ways to tell if a photo has been AI-generated, but they don’t work consistently. Tools designed to detect AI content are often unreliable, even when it comes to text.

The aim here is to highlight some of the ways you can pick out AI-generated content, at least for now. Remember that models are always evolving, so these traits will become harder to spot. Sometimes the choice of subject and context of the video can make all the difference.

Watch for Subtle Changes and “Ghosts”

Looking for subtle changes is one way to spot a convincing AI fake, but it’s not exactly easy. One example of OpenAI’s SORA depicted a woman walking down a neon-lit Tokyo street. The scene is impressive for a text-to-video tool, so impressive that you might have missed the wardrobe change towards the end of the footage.

In the opening scene, the woman wears a red dress with a full-length cardigan and a leather jacket. The cardigan is a bit strange in the way that it seems to blend into the jacket, but I’m not exactly Mr Fashion so I’ll give it a pass:

[Image: OpenAI]

Now take a look at the same clothes in the close-up and you’ll see that the dress now has dark patches on it and that the leather jacket has a much larger off-center lapel:

[Image: OpenAI]

This is so subtle that most people would need to watch the footage multiple times to spot it. The scene is dense, filled with reflections and background actors, which helps distract you from the gaffe.

Something else to watch out for is ghosts: objects phasing in and out of existence. OpenAI’s video of a California gold rush town provides a good example of this. Take a look at this rather nebulous-looking figure, which your brain probably interprets as a man with a horse:

[Image: OpenAI]

Two seconds later, the figure has disappeared entirely. If you watch the video, you’ll see this figure blend right into the dirt as if it were a ghost:

[Image: OpenAI]
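If you’d rather not rely on your eyes alone, the comparison between an earlier and a later frame can be roughed out in code. The sketch below is only an illustration: the `region_change` helper, the tiny hand-written frames, and the rectangle coordinates are all made up for the example, and a real workflow would first need to extract frames from the video. A big number only tells you where to look. It can’t tell a ghost apart from someone simply walking out of shot.

```python
from typing import List

Frame = List[List[int]]  # rows of 8-bit grayscale values (assumed format)


def region_change(a: Frame, b: Frame, top: int, left: int,
                  height: int, width: int) -> float:
    """Mean absolute brightness difference inside one rectangle of two frames."""
    total = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            total += abs(a[y][x] - b[y][x])
    return total / (height * width)


# Two hypothetical 4x4 frames: a dark "figure" in the lower right of the
# earlier frame has faded into the background by the later one.
earlier = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [200, 200, 40, 40],
    [200, 200, 40, 40],
]
later = [[200] * 4 for _ in range(4)]

print(region_change(earlier, later, top=2, left=2, height=2, width=2))  # 160.0
print(region_change(earlier, later, top=0, left=0, height=2, width=2))  # 0.0
```

The lower-right region changes a lot while the upper-left doesn’t change at all, which is the kind of localized difference worth rewatching.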

AI Struggles with Fingers, Glasses, and Finer Elements

One of the biggest problems for generative AI models is extremities and fine lines. In particular, have a good hard look at hands, held objects, spectacles, and the way things interact with human features (like hats, helmets, or even hair).

Video can make this sort of error easier to spot compared to AI-generated photography because these features can change from one scene to the next.

Fingers and hand placement are particularly difficult for AI to pull off. Generative models have a tendency to produce hands with more or fewer fingers than you’d expect. Sometimes things don’t look quite right: fingers are very thin, or there are too many knuckles. Held objects exhibit the same wonkiness, at times appearing like the human in the frame has absorbed whatever it is they are holding.

Look for glasses that don’t seem to be symmetrical or that merge into faces. In a video, they may even phase in and out of view and change between scenes. The same is true of arms and legs. Just take a look at this SORA video of people in Lagos, Nigeria:

[Image: OpenAI]

Can you take your third arm off my leg, please?

Look Closely at Objects in the Background of an Image

Background details are often a dead giveaway when it comes to AI-generated video, even more so than photos. A good fake depends on the subject being convincing enough to distract you from the fact that the background isn’t quite behaving the way it should.

Take a look at the Tokyo night scene video again. This scene is so dense that it’s easy to just take everything at face value, but look closely at the people walking in the background, particularly those to the left of the subject:

[Image: OpenAI]

Some of this movement just doesn’t look right. At one point, a person seems to duplicate themselves. Later, what looks like a group of people seems to merge into a single object, as if they’re all wearing the same skirt or overcoat. In some areas, the walking animations are odd too.

Keep an eye out for suspect background activity to spot AI-generated video. Sometimes you’ll notice natural objects like trees, fields, or forests interacting in strange ways. Perspectives can seem off, and moving objects sometimes don’t quite line up with the path portrayed in the animation.

Another example is OpenAI’s Big Sur coastline drone shot. Have you ever seen a wave that looks that straight in nature?

Lighting and the “AI Aura”

Unnatural lighting is something we’ve seen a lot in AI-generated photos, and it’s arguably more of a “feel” than an objectively identifiable trait. If lighting feels particularly flat and unnatural in instances where you’d expect more variance, that can signal that it might not be real.

For example, camera imperfections like blooming, highlight blowout (where highlights are lost due to too much light entering the lens), or shadow roll-off (where shadow detail is lost due to the absence of light) may simply be absent.

Everything can look a bit like a highly produced music video, or like video games in the days before realistic lighting and raytracing. Subjects may look perfectly lit in instances where you’d expect them not to be.
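To put a rough number on that “feel,” you could measure how much of a frame is blown out or crushed to black. The sketch below is a minimal illustration, not a detector: the `clipping_fractions` name, the thresholds of 5 and 250, and the eight-pixel sample frame are all assumptions made for the example. Well-exposed real footage can also show very little clipping, so a low figure is at most a hint.

```python
from typing import Iterable, Tuple


def clipping_fractions(
    luma: Iterable[int], shadow_max: int = 5, highlight_min: int = 250
) -> Tuple[float, float]:
    """Return (shadow_fraction, highlight_fraction) for 8-bit brightness values.

    shadow_max and highlight_min are illustrative thresholds, not standards.
    """
    values = list(luma)
    if not values:
        raise ValueError("frame has no pixels")
    shadows = sum(1 for v in values if v <= shadow_max)
    highlights = sum(1 for v in values if v >= highlight_min)
    return shadows / len(values), highlights / len(values)


# Hypothetical 8-pixel "frame": two crushed shadows and one blown highlight.
frame = [0, 3, 40, 90, 128, 170, 210, 255]
shadow_frac, highlight_frac = clipping_fractions(frame)
print(f"shadows: {shadow_frac:.3f}, highlights: {highlight_frac:.3f}")
# prints "shadows: 0.250, highlights: 0.125"
```

A night scene full of neon signs with almost no clipped highlights would be the kind of result that deserves a second look.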

The Uncanny Valley Effect

The uncanny valley effect is a term used to describe the mixing of human and inhuman traits in a manner that makes the viewer feel uncomfortable. Androids or human-like robots are oft-cited examples since they give the outward appearance of being human but are inescapably inhuman at the same time.

More often than not, the uncanny valley effect simply comes down to a feeling. You can sense something isn’t quite right, but you can’t put your finger on exactly what it is. This effect often rears its head in AI-generated photos and videos, and one place I experienced it is in SORA’s spaceman video.

Ignoring for a second that the spaceman in question is wearing a knitted space helmet, there’s something about this face that sends a shiver down my spine:

[Image: OpenAI]

And there’s a similarly ghoulish grandmother failing to blow out her birthday candles, which looks far worse in motion:

[Image: OpenAI]

Watch Out for Nonsense

This seems like the easiest red flag to spot, but sometimes your brain just gives things a pass. The aforementioned spaceman video is a good example of this. There’s a brief scene of a door, or a handle, or a lever, or something that just doesn’t make sense:

[Image: OpenAI]

What is this thing? Why is the animation seemingly played in reverse? The knitted helmet I can excuse, but this thing has puzzled me since the moment I saw it.

The same goes for movements. The SORA cat in bed video is impressive, but the movement isn’t right. Cat owners will recognize that the behavior is strange and unnatural. It feels like there’s a mismatch between the behavior of the subject and the context of the situation. Over time, this will improve.

Garbled text is another good example of what AI generative processes often get wrong. The Japanese characters in SORA’s Tokyo night scene video are a jumble, and so is some of the road and shop signage. Choosing a scene where most viewers can’t tell real Japanese from a poor imitation was a smart choice on OpenAI’s part.

Train Yourself to Better Spot This Content

The best way to train yourself to spot AI-generated content is to study it yourself. Most generative models have active communities both on the web and on social media platforms like Reddit. Find some and take a look at what people are coming up with.

On top of this, you could generate your own images using a tool like Stable Diffusion. At the time of writing, OpenAI’s SORA isn’t available for public use, so you’ll have to wait before diving in for yourself.

AI-generated video is impressive, fascinating, and terrifying in equal measure. Over time, these tips will likely become less relevant as models overcome their weaknesses. So buckle up, because you haven’t seen anything yet.
