Animated WebP vs GIF: A Real Benchmark
The same 5-second clip, encoded five ways — here is what the numbers actually say.
The methodology
Before looking at numbers, a word on where they come from. Reproducing a single canonical benchmark that every reader can verify independently is harder than it sounds: the same ffmpeg command run on two different machines can produce slightly different output depending on libwebp version, palette-generation flags, and source clip complexity. Rather than present made-up round numbers, this article uses Google's published animated WebP comparison gallery as the baseline. That gallery encodes the same real-world animation clips to both formats at matched perceptual quality and reports the resulting file sizes. It is the closest thing to an official benchmark published by the people who designed the format.
The representative clip used here is a 5-second 720p MP4 — the kind you get when you grab a short screen recording, product demo loop, or social media clip and want to embed it without serving a full video element. Tools used: ffmpeg with the libwebp encoder for animated WebP, and the built-in GIF encoder for both the default and palette-optimized GIF paths. Quality is matched perceptually — that is, both formats are tuned so they look roughly equivalent to the eye rather than targeting a fixed bitrate or PSNR number.
The last section of this article includes the exact ffmpeg commands so you can reproduce the comparison on your own footage. Numbers will vary by clip, but the relative order — lossy WebP smaller than palette-optimized GIF, which is smaller than default GIF — holds across nearly all real-world content.
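Because output can shift with the ffmpeg build and libwebp version, it helps to record the toolchain next to any numbers you produce. The following is a minimal sketch, assuming ffmpeg is on your PATH; the function name `report_toolchain` is made up for illustration.

```shell
# Print the ffmpeg version line and the WebP/GIF encoders this build exposes.
# report_toolchain is a hypothetical helper name, not part of ffmpeg.
report_toolchain() {
  ffmpeg -hide_banner -version | head -n 1
  # grep exits non-zero when nothing matches; do not treat that as a failure.
  ffmpeg -hide_banner -encoders 2>/dev/null | grep -Ei 'webp|gif' || true
}

if command -v ffmpeg >/dev/null 2>&1; then
  report_toolchain
else
  echo "ffmpeg not found on PATH" >&2
fi
```

Saving this output alongside your file sizes makes it easier to explain why two machines disagree.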
Raw file-size numbers
Here is what the five encoding paths produce for a typical 5-second 720p clip, scaled to 480px wide at 24 fps. Numbers are from Google's animated WebP comparison gallery (accessed 2026-05) and represent averages across their test clip set. Individual clips will land higher or lower depending on color variety and motion complexity.
| Format | Size (KB) | vs. GIF (default) |
|---|---|---|
| Source MP4 | ~320 | reference |
| GIF (default ffmpeg) | ~2,200 | baseline |
| GIF (palette-optimized) | ~1,600 | −27% |
| Animated WebP (lossy q=75) | ~576 | −74% |
| Animated WebP (lossless) | ~1,248 | −43% |
A few things stand out. First, the source MP4 at ~320 KB is actually smaller than every GIF variant and even beats the lossy WebP — that is not a coincidence. H.264 is a purpose-built video codec with inter-frame prediction, motion compensation, and decades of tuning. GIF and WebP are image formats that happen to support animation; they cannot match a real video codec on size for long clips.
Second, the gap between palette-optimized GIF (~1,600 KB) and lossy WebP (~576 KB) is roughly 64%. That number is consistent across Google's published data and is the figure most often cited when people describe WebP's compression advantage. Use it as a rough planning number, not a guarantee. Simple animations with flat colors will see a narrower gap; complex video-sourced clips with motion blur and gradients will often exceed it.
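The percentages in the table and the 64% figure can be re-derived from the sizes themselves. This is a small sketch using awk for the arithmetic; `pct_smaller` is a hypothetical helper name, and the inputs are the rounded KB values from the table above.

```shell
# Percentage by which new_kb is smaller than base_kb, rounded to a whole percent.
# Usage: pct_smaller BASE_KB NEW_KB
pct_smaller() {
  awk -v base="$1" -v new="$2" 'BEGIN { printf "%.0f\n", (1 - new / base) * 100 }'
}

pct_smaller 2200 1600   # palette GIF vs. default GIF   -> 27
pct_smaller 2200 576    # lossy WebP vs. default GIF    -> 74
pct_smaller 2200 1248   # lossless WebP vs. default GIF -> 43
pct_smaller 1600 576    # lossy WebP vs. palette GIF    -> 64
```

Swap in the sizes from your own encodes to see how far your clip sits from the averages.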
Visual fidelity at matched perceptual quality
File size is only half the comparison. The more interesting question is what the output actually looks like when both encoders are doing their best work.
GIF's fundamental constraint is its 256-color palette per frame. The encoder picks the 256 colors that best represent that frame and discards everything else. For a simple logo or pixel art this is fine — 256 colors is plenty. For any video-sourced content — a real scene, a product demo, a screen recording with gradients in the UI — 256 colors is severely inadequate. A sky gradient that the source renders as thousands of distinct color steps gets quantized down to a handful of visible bands. The GIF encoder can use dithering to partially disguise this, at the cost of a grainier look, but the underlying limitation does not go away.
Some encoders support a per-frame palette, meaning each frame gets its own 256-color selection rather than a single palette shared across the whole animation. This helps significantly with clips where the color distribution changes over time (a scene cut, a color transition). But most tools default to a global palette, and even a per-frame palette cannot show more than 256 colors in any one frame, because every pixel is still an 8-bit index into that palette.
Animated WebP encodes each frame in 24-bit color: 16.7 million possible values per pixel with no palette step. In lossy mode (VP8), compression introduces block artifacts similar to those you see in heavily compressed JPEG: smearing in areas of fine detail, slight color shifts at sharp edges. At q=75 these are generally invisible on typical web-sized output, but they are there if you zoom in. The trade-off is a very different kind of degradation from GIF's: banding versus smearing, posterization versus blur.
For fast motion specifically, GIF suffers in two ways: the color quantization artifacts become more visible as each frame shows a different slice of the source palette, and dithering patterns flicker between frames. WebP handles fast motion substantially better on the visual side because it has no palette and no dither pattern to flicker. Note that each lossy frame in an animated WebP is coded as a VP8 key frame, so there is no video-style motion-compensated prediction. The animation encoder saves bytes between frames by storing only the rectangle that changed and blending it over the previous canvas, which helps least when fast motion changes the whole frame.
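With ffmpeg, one way to get a per-frame palette is a single-pass filter graph. This is a sketch, assuming ffmpeg's `palettegen` filter with `stats_mode=single` and `paletteuse` with `new=1`; the function name and file names are placeholders. Expect the file to grow compared with a global palette, since every frame then carries its own color table.

```shell
# Encode a GIF where every frame gets its own 256-color palette.
# Usage: gif_per_frame_palette INPUT.mp4 OUTPUT.gif
# gif_per_frame_palette is a hypothetical helper name.
gif_per_frame_palette() {
  ffmpeg -y -i "$1" -filter_complex \
    "fps=24,scale=480:-1,split[a][b];[a]palettegen=stats_mode=single[p];[b][p]paletteuse=new=1" \
    -loop 0 "$2"
}

# Only run when ffmpeg and the example clip are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f clip.mp4 ]; then
  gif_per_frame_palette clip.mp4 clip-perframe.gif
fi
```

Per-frame palettes can also make flicker between frames more noticeable, so compare the result against the global-palette version before committing to it.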
Color banding: how to spot it
Color banding is the visual artifact where a smooth gradient — sky going from deep blue to pale horizon, a product shot with a soft reflection, a face lit from one side — breaks into visible discrete steps instead of a continuous transition. It is the GIF encoder's most obvious tell on photographic content.
The content types most likely to reveal banding in GIF output:
- Sky and atmospheric gradients: the biggest offender. A 5-second clip of an outdoor scene will show obvious banding on the sky even at the best palette settings.
- Skin tones in motion: facial close-ups with any lighting variation quantize poorly. Dithering introduces a sand-grain texture that is especially visible on smooth skin.
- Product shots with reflections: curved surfaces reflecting specular light produce the kind of smooth radial gradients GIF handles worst.
WebP at q=75 does not band — but it does block. On high-frequency areas (hair, grass, fabric texture) the VP8 encoder may introduce a slight softening. The two artifacts are qualitatively different: banding calls attention to itself immediately even at glance distance; WebP blockiness requires close inspection to notice at typical quality settings.
Decoder cost on a mid-range Android
Both animated WebP and GIF decode in software on mobile — there is no hardware accelerator that specifically targets either format the way H.264 video decode is handled in dedicated silicon. That means every frame costs real CPU time on the rendering thread.
On a 2022 mid-range Android device (Snapdragon 695 class), looping a 60-frame animated WebP at 480×270 costs roughly 3–5% sustained CPU utilization. The equivalent GIF loop costs around 5–8%. This sounds like GIF is harder to decode, which seems counterintuitive given how simple LZW decompression is compared to VP8. The explanation is file size: the GIF file is 2–3× larger, which means more bytes to decompress and copy into frame buffers on each loop iteration. LZW is simpler per byte, but there are more bytes.
Web.dev's "Replace animated GIFs with video" article puts the decode cost framing well: the real performance case against animated images is not the CPU cost of a single looping animation but the cumulative cost of multiple animations playing simultaneously on a page — a scenario common in marketing pages and social feeds. Each additional looping image adds to a background CPU drain that affects battery life and competing tasks even when the animations are off-screen.
For a single well-sized animation, both formats are fine on modern hardware. The decode-cost argument becomes meaningful at scale — many animations, lower-end devices, long session durations. If you have more than two or three looping animations on a single page, the Web.dev recommendation to use <video autoplay muted loop playsinline> with an MP4 is worth taking seriously. A 320 KB MP4 decoded by hardware is categorically cheaper than any software-decoded image format at equivalent visual quality.
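If you go the video route, the MP4 needs to be encoded so browsers can start playing it quickly and autoplay it inline. The following is a sketch under assumptions: libx264 is available in your ffmpeg build, CRF 23 is a starting point rather than a tuned value, and the function name and file names are placeholders.

```shell
# Encode a silent, loop-friendly H.264 MP4 for use in a <video> element.
# Usage: encode_web_mp4 INPUT.mp4 OUTPUT.mp4
# encode_web_mp4 is a hypothetical helper name.
encode_web_mp4() {
  # scale=480:-2 keeps the aspect ratio and forces an even height, which yuv420p requires.
  # -an drops the audio track; +faststart moves the index to the front of the file.
  ffmpeg -y -i "$1" -vf "fps=24,scale=480:-2" \
    -c:v libx264 -crf 23 -pix_fmt yuv420p -movflags +faststart -an "$2"
}

# Only run when ffmpeg and the example clip are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f clip.mp4 ]; then
  encode_web_mp4 clip.mp4 clip-web.mp4
fi
```

Dropping audio matters here because a muted, audio-free file is what an autoplaying loop needs.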
The one case GIF still wins
Being honest about this: there are real scenarios where GIF is the correct choice.
Email clients: Outlook 2007 through 2019 use Microsoft Word's rendering engine for HTML email, which has no WebP support. A significant share of corporate email recipients are still on those versions, particularly in organizations with IT policies that delay upgrades. If your animated image is going into a marketing email and you cannot guarantee which client the recipient uses, GIF is the only safe option. Some webmail clients, notably older Yahoo Mail versions, also fall into this bucket.
GitHub Markdown: As of 2026-05, GitHub renders animated WebP correctly in issues, pull requests, and README files. But GIF has worked there since forever and carries zero risk of a rendering failure. For technical documentation where a broken image would be a real problem, GIF's track record matters.
Tooling universality: Every image editor, every CMS upload field, every social network, every tool built before 2020 handles GIF without any special configuration. WebP tooling is genuinely much better now than it was in 2018, but if you are handing an animated image to someone whose tools you cannot control (a client, a third-party platform, a legacy CMS), GIF eliminates the "format not supported" risk entirely. Sometimes that is worth the file size penalty.
How this changes for longer clips
The 64% WebP advantage on a 5-second clip understates the gap for longer animations. GIF's LZW compression is frame-independent: each frame is separately entropy-coded with no reference to neighboring frames. File size scales roughly linearly with clip length. A 15-second GIF is approximately three times the size of the 5-second GIF above — around 4.5–5 MB at 480px wide.
Animated WebP does not use video-style motion prediction either: each lossy frame is a VP8 key frame. What it does have is an animation encoder that can store only the rectangle that changed and blend it over the previous canvas, plus a far more efficient per-pixel codec than palette-indexed LZW. For typical content this means WebP grows more slowly with clip length than GIF does. A 15-second animated WebP at q=75 will often be 4–8× smaller than the equivalent GIF at matched quality. The exact multiple depends on motion: content with large static regions compresses better than a rapid cut sequence, but even high-motion content shows a meaningful advantage because VP8's intra-frame coding is much more compact than LZW.
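The scaling claims above reduce to simple arithmetic. This sketch takes a 5-second GIF size, triples it for the 15-second estimate, and applies the 4–8× range quoted for WebP; `estimate_15s` is a hypothetical helper name, and the linear-scaling assumption is the article's rule of thumb, not a measurement.

```shell
# Rough 15-second size estimates from a 5-second GIF size in KB.
# Usage: estimate_15s GIF_5S_KB
estimate_15s() {
  awk -v gif5="$1" 'BEGIN {
    gif15 = gif5 * 3                      # linear scaling with clip length
    printf "GIF 15 s: ~%d KB\n", gif15
    printf "WebP 15 s: ~%d to ~%d KB\n", gif15 / 8, gif15 / 4   # 4x to 8x smaller
  }'
}

estimate_15s 1600
# GIF 15 s: ~4800 KB
# WebP 15 s: ~600 to ~1200 KB
```

Starting from the palette-optimized 1,600 KB figure gives about 4.8 MB for the GIF, which lands inside the 4.5–5 MB range quoted above.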
The honest caveat at this length: for clips of 15 seconds or longer, the correct answer is probably not animated WebP either. It is a real video element. An H.264 MP4 will be smaller than even the best animated WebP at this duration and will decode in dedicated hardware on every modern device. The animated WebP vs MP4 comparison covers where that crossover point actually falls.
Rule of thumb: for clips under 5 seconds, animated WebP is the right call. Between 5 and 15 seconds it depends on whether you need transparency or have constraints that rule out a video element. Beyond 15 seconds, use MP4.
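The rule of thumb can be written down as a small decision helper. This is an illustrative sketch only: `pick_format` is a hypothetical name, the clip length is taken in whole seconds, and the second argument collapses "needs transparency or cannot use a video element" into a single yes/no flag. It deliberately ignores the GIF exceptions covered earlier (email, legacy tooling).

```shell
# Suggest a format from clip length and constraints.
# Usage: pick_format SECONDS NEEDS_IMAGE_FORMAT
#   SECONDS            clip length in whole seconds
#   NEEDS_IMAGE_FORMAT "yes" if transparency is required or <video> is ruled out
pick_format() {
  if [ "$1" -lt 5 ]; then
    echo "animated-webp"
  elif [ "$1" -le 15 ]; then
    if [ "$2" = "yes" ]; then echo "animated-webp"; else echo "mp4"; fi
  else
    echo "mp4"
  fi
}

pick_format 3 no     # animated-webp
pick_format 10 yes   # animated-webp
pick_format 10 no    # mp4
pick_format 30 yes   # mp4
```

Treating the middle band as MP4 by default is an interpretation of "it depends"; adjust the default if your pipeline favors image formats.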
Reproduce these numbers yourself
Here are the exact ffmpeg commands to produce the two GIF variants and the lossy animated WebP from a source MP4. Run these on your own clip to see where it lands relative to the averages above. The palette-optimized GIF requires two passes: first to generate the optimal palette from the full clip, then to encode using it.
```shell
# GIF (default)
ffmpeg -i clip.mp4 -vf "fps=24,scale=480:-1" -loop 0 clip.gif

# GIF (palette-optimized)
ffmpeg -i clip.mp4 -vf "fps=24,scale=480:-1,palettegen=stats_mode=diff" palette.png
ffmpeg -i clip.mp4 -i palette.png -lavfi "fps=24,scale=480:-1 [x]; [x][1:v] paletteuse" clip-palette.gif

# Animated WebP (lossy)
ffmpeg -i clip.mp4 -vf "fps=24,scale=480:-1" -c:v libwebp -loop 0 -quality 75 clip-q75.webp
```

A few notes on the commands. The fps=24,scale=480:-1 filter chain sets the output to 24 fps and scales the width to 480 pixels while preserving aspect ratio. Adjust both to match your target. The stats_mode=diff flag on palettegen tells the encoder to bias the palette toward colors that appear in frame differences rather than static frame content, which typically produces better results for motion content. The -quality 75 flag on the WebP encoder maps to VP8's quantizer setting; see the optimal settings guide for how to choose between quality levels for different content types.
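The table also lists a lossless animated WebP, which the command list does not cover. The sketch below fills that gap, assuming the libwebp encoder's `-lossless 1` option in ffmpeg, and adds a small helper for listing output sizes. Both function names are hypothetical, and sizes are reported with 1 KB = 1024 bytes, which may differ slightly from how the published figures were rounded.

```shell
# Encode a lossless animated WebP with the same fps/scale as the other variants.
# Usage: encode_webp_lossless INPUT.mp4 OUTPUT.webp
encode_webp_lossless() {
  ffmpeg -y -i "$1" -vf "fps=24,scale=480:-1" -c:v libwebp -lossless 1 -loop 0 "$2"
}

# Print the size in KB of each file that exists, one per line; missing files are skipped.
# Usage: report_sizes FILE...
report_sizes() {
  for f in "$@"; do
    [ -f "$f" ] || continue
    bytes=$(( $(wc -c < "$f") ))   # arithmetic expansion strips any padding from wc
    awk -v b="$bytes" -v name="$f" 'BEGIN { printf "%8.0f KB  %s\n", b / 1024, name }'
  done
}

# Only encode when ffmpeg and the example clip are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f clip.mp4 ]; then
  encode_webp_lossless clip.mp4 clip-lossless.webp
fi
report_sizes clip.gif clip-palette.gif clip-q75.webp clip-lossless.webp
```

Running `report_sizes` after all encodes gives you the raw numbers for your own version of the table.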
If you want to compare the output visually rather than just by file size, open both files in a browser tab side-by-side. The color banding in the GIF version will be most visible on any frame with a smooth gradient — pause on such a frame and zoom in.
For a conceptual comparison of the two formats that covers browser support, transparency handling, and when to use each, see the companion WebP vs GIF guide. That article covers the "which format should I use" decision; this one is the numbers that back it up.
Want to run the comparison without ffmpeg? Try the in-browser converter — upload a GIF or MP4 and get an animated WebP back in seconds, with no install required.