
When to Use Animated WebP vs an MP4 Video

The answer depends less on file size than on where the content will live — and whether a <video> tag is even an option.

The honest framing: same content, different containers

Animated WebP and MP4 can hold the same visual content — a looping product demo, a subtle UI animation, a short clip that would otherwise be a GIF. In most cases the pixels you see are functionally identical. The format choice does not change what the viewer watches; it changes how the browser fetches, decodes, and renders it, and that matters in ways that are easy to overlook.

There is one hard constraint worth stating upfront: animated WebP carries no audio. If the content has a soundtrack, a voice-over, or any sound that matters, MP4 is the only choice. That removes a slice of the decision space immediately.

For silent loops — which is the large majority of animated content on the web — both formats are viable. The question becomes context: where will this be embedded, who controls the rendering environment, and what are the constraints around autoplay and file size? Working through those questions in order produces a clear answer faster than comparing codec specs.

Where you can and cannot use a <video> tag

This is the most decisive question in the entire guide. If the embedding context does not support the <video> tag, you have no choice — you need an image format, and animated WebP is the best image format for this job. No trade-off analysis required.

Contexts where <video> does not work:

  • Markdown / GitHub README — GitHub and most Markdown renderers sanitize HTML and strip video tags. Images (via ![alt](url) syntax) are the only animated content that renders. Animated WebP works; MP4 does not.
  • Email — The majority of email clients either strip video tags outright or refuse to autoplay them. Gmail, Outlook, and most webmail clients do not render <video>; Apple Mail is one of the few that does. GIF has historically been the fallback here, and animated WebP now works in many modern clients, though not all of them. See the email guide for a full compatibility breakdown.
  • Image-only CMS fields — Many content management systems present a single “image” upload field that pipes the URL into an <img> tag. Notion embeds, Slack rich previews, and several headless CMS block types fall into this category. Whatever you put there will be treated as an image, so it needs to be one.
  • Social media previews (Open Graph) — The og:image meta tag takes a URL and is rendered as a static or animated image by the receiving platform. Twitter, LinkedIn, and Facebook do not pick up a video element from your page; they fetch the image URL you declared.

Contexts where <video> works fine:

  • Web pages and web applications under your control
  • MDX blog posts (if your pipeline allows HTML)
  • Native app webviews with a full browser engine
  • Electron and similar frameworks

In these contexts, both formats are in play and the decision moves to the next questions.

When you do control the HTML and want to combine both formats for maximum compatibility, a common pattern borrows the idea behind <picture>: a <video> element with an <img> nested inside it as fallback content:

<!-- Preferred when video tag is available -->
<video autoplay loop muted playsinline>
  <source src="clip.mp4" type="video/mp4" />
  <!-- Fallback for image-only contexts (email, Markdown, etc.) -->
  <img src="clip.webp" alt="Product demo loop" />
</video>

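A browser that understands <video> takes the video path and ignores the nested <img>. A renderer that strips or does not recognize the <video> element falls through to the <img>, provided its sanitizer keeps the child content. Note that the <img> is not a playback-failure fallback: a browser that supports <video> but cannot play the MP4 will not show it. This pattern is not always practical (the two files need to stay in sync), but it is a reasonable answer for hero sections and product pages where you want both contexts covered.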
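A build step or template helper can make the choice explicit instead of relying on fallthrough. The sketch below is one way to do that in Python; `loop_markup` and its `supports_video` flag are hypothetical names for illustration, not part of any library, and the markup it emits simply mirrors the pattern above.

```python
from html import escape


def loop_markup(mp4_url: str, webp_url: str, alt: str, supports_video: bool) -> str:
    """Return HTML for a silent loop.

    supports_video is an assumption supplied by the caller: True for pages
    you control, False for image-only targets such as Markdown or email.
    """
    img = f'<img src="{escape(webp_url)}" alt="{escape(alt)}" />'
    if not supports_video:
        # Image-only context: the animated WebP is the whole answer.
        return img
    # Video-capable context: MP4 first, animated WebP as nested fallback.
    return (
        "<video autoplay loop muted playsinline>\n"
        f'  <source src="{escape(mp4_url)}" type="video/mp4" />\n'
        f"  {img}\n"
        "</video>"
    )
```

Calling `loop_markup("clip.mp4", "clip.webp", "Product demo loop", supports_video=False)` yields only the `<img>` tag, which is what a README or email template would need.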

Autoplay restrictions across browsers

Modern browsers enforce a simple rule: a <video autoplay> element will not play unless muted is also present. Chrome has enforced this since version 66 (2018), Safari enforces it on iOS and progressively on macOS, and Firefox follows the same policy. The MDN Autoplay guide documents the full policy including the Media Engagement Index that Chrome uses to allow exceptions for trusted sites.

Animated WebP has no such restriction. It is an image. The browser animates it the same way it animates a GIF — unconditionally, as part of painting the page. There is no permission model, no muted requirement, no user gesture needed.

For decorative loops — a subtle background animation, a logo animation, a product turntable — this difference is real. A <video autoplay muted loop playsinline> element will work correctly in practice, but it requires getting four attributes right. Miss muted on iOS and the video renders as a frozen poster frame. Miss playsinline and iOS full-screens it when tapped.

For teams that ship a lot of short decorative animations — design agencies, marketing sites, documentation with motion examples — this operational simplicity is a real argument for animated WebP. One file, one <img> tag, no autoplay policy surface.
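If you do ship `<video>` loops, the four-attribute checklist is easy to automate. The following is a minimal sketch of a lint-style check using Python's standard `html.parser`; the function name `missing_autoplay_attrs` is made up for this example, and it only inspects the first `<video>` tag in the markup it is given.

```python
from __future__ import annotations

from html.parser import HTMLParser

# The four attributes the text above says a decorative loop needs.
REQUIRED_AUTOPLAY_ATTRS = ("autoplay", "muted", "loop", "playsinline")


class _FirstVideoTag(HTMLParser):
    """Records the attribute names of the first <video> start tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.attr_names: set[str] | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "video" and self.attr_names is None:
            self.attr_names = {name for name, _value in attrs}


def missing_autoplay_attrs(markup: str) -> list[str]:
    """Return the required attributes absent from the first <video> tag."""
    parser = _FirstVideoTag()
    parser.feed(markup)
    parser.close()
    if parser.attr_names is None:
        raise ValueError("no <video> element found")
    return [a for a in REQUIRED_AUTOPLAY_ATTRS if a not in parser.attr_names]
```

For example, `missing_autoplay_attrs("<video autoplay loop></video>")` reports `muted` and `playsinline` as missing, the two omissions the text above calls out for iOS.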

File-size economics: when MP4 pulls ahead, when WebP does

The file-size comparison is not one-sided, and the crossover point matters. Here are representative numbers for a 480-pixel-wide clip with moderate motion (a typical product UI recording), encoded with ffmpeg at the settings shown:

| Format | Settings | 5 s clip | 2 s clip |
| --- | --- | --- | --- |
| Animated WebP (lossy) | q=75, 24 fps | ~500 KB | ~195 KB |
| MP4 (H.264) | CRF 23, fast preset | ~250 KB | ~290 KB |
| MP4 (H.265 / HEVC) | CRF 28, fast preset | ~180 KB | ~240 KB |

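The pattern is clear: MP4 wins on raw file size for clips of 5 seconds or longer, because H.264 and H.265 are far more efficient at coding motion. They predict each frame from its neighbors using motion-compensated P-frames and B-frames. Animated WebP stores each frame as an independently coded image (VP8 for lossy frames) and can only save space by updating the rectangle of the frame that changed, so it falls further behind as sequences get longer.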

The reversal at short durations comes from costs that do not shrink with clip length: the moov atom, codec initialization headers, track metadata, and the full keyframe that opens every stream. The container metadata itself is usually only a few kilobytes, so the size of the gap in the table depends heavily on encoder settings and content; measure your own clips before relying on it. Animated WebP has minimal per-file overhead, so it scales down cleanly for short loops.
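To compare clips of different lengths, it can help to normalize the table to a per-second cost. The sketch below uses only the approximate figures from the table above; the dictionary layout and the function names are illustrative choices, not a standard tool.

```python
# Approximate sizes in KB, copied from the table above:
# {format: {clip duration in seconds: size in KB}}
SIZES_KB = {
    "Animated WebP (lossy)": {5: 500, 2: 195},
    "MP4 (H.264)": {5: 250, 2: 290},
    "MP4 (H.265 / HEVC)": {5: 180, 2: 240},
}


def kb_per_second(fmt: str, duration_s: int) -> float:
    """Average cost of one second of animation for a given format."""
    return SIZES_KB[fmt][duration_s] / duration_s


def smallest_format(duration_s: int) -> str:
    """Format with the smallest file at the given clip duration."""
    return min(SIZES_KB, key=lambda fmt: SIZES_KB[fmt][duration_s])


if __name__ == "__main__":
    for duration in (5, 2):
        for fmt in SIZES_KB:
            print(f"{duration} s  {fmt}: {kb_per_second(fmt, duration):.1f} KB/s")
        print(f"{duration} s  smallest: {smallest_format(duration)}")
```

With these figures, animated WebP costs roughly 100 KB per second at both lengths, while H.264 goes from 50 KB per second on the 5 s clip to 145 KB per second on the 2 s clip, which is the crossover the table shows.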

To produce these formats yourself, the ffmpeg commands are:

# Animated WebP (lossy, q=75, 24 fps, 480px wide)
ffmpeg -i clip.mp4 -vf "fps=24,scale=480:-1" \
  -c:v libwebp -loop 0 -quality 75 clip.webp

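# MP4 (H.264, CRF 23, web-optimized)
# scale=480:-2 keeps the height even, which yuv420p output requires
ffmpeg -i clip.mp4 -vf "fps=24,scale=480:-2" \
  -c:v libx264 -crf 23 -preset fast -pix_fmt yuv420p \
  -movflags +faststart -an clip-h264.mp4

# MP4 (H.265, CRF 28, smaller but less universal)
# -tag:v hvc1 marks the stream so Apple players recognize it as HEVC
ffmpeg -i clip.mp4 -vf "fps=24,scale=480:-2" \
  -c:v libx265 -crf 28 -preset fast -pix_fmt yuv420p -tag:v hvc1 \
  -movflags +faststart -an clip-h265.mp4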

The -movflags +faststart flag moves the moov atom to the front of the MP4 file, which lets browsers start playing before the full download completes. The -an flag strips audio — appropriate for silent decorative loops. H.265 is not universally supported (notably absent on older Android and Windows without codec packs), so for maximum compatibility H.264 remains the safer default.
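If you encode many clips, the three commands can be generated from one place. This is a small sketch, assuming ffmpeg is on the PATH; `ffmpeg_args` and the `CODEC_FLAGS` table are hypothetical helpers, and the flags are the ones discussed above. It uses `scale=…:-2` rather than `-1` so the output height is always even, which the H.264 and H.265 encoders need for 4:2:0 output.

```python
from __future__ import annotations

# Per-format encoder flags, taken from the commands in this section.
CODEC_FLAGS = {
    "webp": ["-c:v", "libwebp", "-loop", "0", "-quality", "75"],
    "h264": ["-c:v", "libx264", "-crf", "23", "-preset", "fast",
             "-movflags", "+faststart", "-an"],
    "h265": ["-c:v", "libx265", "-crf", "28", "-preset", "fast",
             "-movflags", "+faststart", "-an"],
}


def ffmpeg_args(src: str, dst: str, codec: str,
                fps: int = 24, width: int = 480) -> list[str]:
    """Build the ffmpeg argument list for one output format."""
    if codec not in CODEC_FLAGS:
        raise ValueError(f"unknown codec: {codec!r}")
    # -2 lets ffmpeg pick the height that preserves aspect ratio,
    # rounded to an even number.
    video_filter = f"fps={fps},scale={width}:-2"
    return ["ffmpeg", "-i", src, "-vf", video_filter,
            *CODEC_FLAGS[codec], dst]
```

The returned list can be passed straight to `subprocess.run(args, check=True)`, which avoids shell quoting issues with the filter string.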

The decoder and memory comparison

File size is not the only cost. Decoding has a runtime cost that shows up as CPU usage, battery drain, and frame-drop on constrained devices.

MP4 played through a <video> element uses the platform's hardware video decoder, dedicated silicon present on virtually every modern phone and laptop. On iOS this is the video decode block in Apple's chips, reached through VideoToolbox; on Android it is the device's codec hardware, reached through MediaCodec. Hardware decoding is very energy-efficient: a 480p H.264 loop running at 24 fps costs far less battery than decoding the same frames in software.

Animated WebP decodes on the CPU. The browser's image decoder handles it, frame by frame, in software. For a short clip or an infrequently-updated loop, this is not noticeable. For a long background animation that updates at 24 fps continuously — say, a 30-second ambient loop behind a landing page — the CPU cost is measurable. On mid-range Android devices from 2022 or earlier, sustained animated WebP at full resolution can push a CPU core into steady utilization.

Memory follows a similar pattern. A <video> element maintains a small decode buffer (typically two or three frames). An animated WebP decoded as an image may expand all frames into memory at once depending on the browser implementation. For anything over 5 seconds at high resolution, this can represent a meaningful memory allocation. If your users include low-end mobile, a short-duration WebP or a hardware-decoded MP4 is the safer choice.
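A back-of-the-envelope calculation shows why the memory difference can matter. The sketch below assumes decoded frames are held as RGBA at 4 bytes per pixel and that the worst case keeps every frame resident; both are assumptions for illustration, since actual browser behavior varies. The 480×270 frame size is also an assumed 16:9 example.

```python
BYTES_PER_PIXEL = 4  # assumption: RGBA, 8 bits per channel


def decoded_bytes(width: int, height: int, frames: int) -> int:
    """Memory needed to hold `frames` fully decoded frames."""
    return width * height * BYTES_PER_PIXEL * frames


def mib(n_bytes: int) -> float:
    """Convert bytes to mebibytes."""
    return n_bytes / (1024 * 1024)


if __name__ == "__main__":
    # 5 seconds at 24 fps, every frame kept in memory (worst case for WebP).
    all_frames = decoded_bytes(480, 270, frames=24 * 5)
    # A three-frame decode buffer, as described for <video> above.
    small_buffer = decoded_bytes(480, 270, frames=3)
    print(f"all frames:   {mib(all_frames):.1f} MiB")
    print(f"3-frame pool: {mib(small_buffer):.1f} MiB")
```

Under these assumptions the fully expanded 5-second animation needs about 59 MiB, against about 1.5 MiB for a three-frame buffer at the same pixel format. The gap grows linearly with duration and with pixel count.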

Decision tree

Work through these steps in order and stop when you have your answer:

  1. Does the content have audio that matters? If YES, MP4 is mandatory. Animated WebP has no audio track.
  2. Can the embedding context render a <video> tag? If NO (you are in Markdown, email, a Notion block, or an image-only CMS field), use animated WebP. Done.
  3. Is the clip under 3 seconds? If YES, animated WebP will usually produce a smaller file, because MP4's fixed per-file costs weigh more on short clips. Prefer WebP unless a downstream constraint overrides it.
  4. Is this a decorative autoplay loop with no user controls? If YES, lean toward animated WebP: it autoplays unconditionally without the muted requirement and simplifies your markup. If NO (the user might pause or seek, or the clip has intentional start/stop behavior), lean toward MP4.
  5. Is your target audience heavily low-end mobile? If YES, lean toward MP4, because hardware decode is significantly cheaper on battery and avoids the sustained CPU cost of animated WebP at high frame rates or resolutions.

If you reach step 5 without a definitive answer, you are in the “either works” zone. Pick MP4 if raw file size matters most (longer clips benefit from H.264 compression); pick animated WebP if operational simplicity matters most (one file, one image tag, no attribute checklist).
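The decision tree can be sketched as a small function. This is one possible reading, not a definitive encoding: the `Clip` fields and the `choose_format` name are invented for the example, audio is checked first because the guide treats it as a hard constraint, and the two "lean toward" steps are modeled as a simple tally that can end in a tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    has_audio: bool                # sound that matters to the viewer
    video_tag_available: bool      # can the target render <video>?
    duration_s: float
    decorative_autoplay: bool      # loop with no user controls
    low_end_mobile_audience: bool


def choose_format(clip: Clip) -> str:
    """Return "mp4", "animated-webp", or "either"."""
    # Hard constraints and the short-clip rule give definitive answers.
    if clip.has_audio:
        return "mp4"
    if not clip.video_tag_available:
        return "animated-webp"
    if clip.duration_s < 3:
        return "animated-webp"

    # The remaining steps only express a preference.
    # Positive tally leans WebP, negative leans MP4.
    lean = 1 if clip.decorative_autoplay else -1
    if clip.low_end_mobile_audience:
        lean -= 1

    if lean > 0:
        return "animated-webp"
    if lean < 0:
        return "mp4"
    return "either"
```

An 8-second decorative loop aimed at low-end phones lands in the "either" zone under this tally, which is where the file-size versus simplicity trade-off described above takes over.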

Practical recipes for the three most common contexts

These recommendations apply to the contexts most developers encounter. Adjust the quality target and file-size ceiling based on your actual audience and performance budget.

| Context | Format | FPS | Quality | Size ceiling | Fallback |
| --- | --- | --- | --- | --- | --- |
| GitHub README | Animated WebP | 15–24 | q=75–80 | 500 KB | Static PNG |
| Email blast | Animated WebP (or GIF for Outlook) | 10–15 | q=70 | 250 KB | Static JPG |
| Hero on a marketing site | MP4 (H.264) + WebP fallback | 24–30 | CRF 23 | 1 MB | Animated WebP |

For email, Outlook 2007–2019 on Windows does not render animated WebP and will show only the first frame — which is acceptable for most use cases. If the first-frame experience is meaningful, design it intentionally. See the email guide for a client-by-client breakdown and the GIF fallback strategy.
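The size ceilings in the recipe table lend themselves to an automated check in a build or CI step. A minimal sketch follows; the context keys, the `over_budget` and `check_file` names, and the choice of 1 KB = 1000 bytes are all assumptions made for this example.

```python
import os

KB = 1000  # assumption: decimal kilobytes

# Size ceilings from the recipe table, keyed by hypothetical context names.
SIZE_CEILING_BYTES = {
    "github-readme": 500 * KB,
    "email": 250 * KB,
    "marketing-hero": 1000 * KB,  # the table's 1 MB
}


def over_budget(context: str, size_bytes: int) -> bool:
    """True if a file of this size exceeds the ceiling for the context."""
    return size_bytes > SIZE_CEILING_BYTES[context]


def check_file(context: str, path: str) -> str:
    """Return a one-line report for a file on disk."""
    size = os.path.getsize(path)
    status = "OVER" if over_budget(context, size) else "ok"
    limit_kb = SIZE_CEILING_BYTES[context] // KB
    return f"{status}: {path} is {size / KB:.0f} KB (ceiling {limit_kb} KB)"
```

A 300 KB file passes the README ceiling but fails the email one, so the same asset may need a lower frame rate or quality setting before it goes into a mailing.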

The ten-second rule

After working through dozens of these decisions, a rough heuristic emerges: if the clip is under 10 seconds and decorative (no audio, no user controls, no interaction), animated WebP is usually the right choice. It can be smaller than MP4 at the shortest durations, it autoplays without attribute ceremony, it works everywhere an image works, and it requires no fallback strategy for image-only contexts.

Over 10 seconds, or anywhere audio or user control enters the picture, MP4 wins decisively. The compression efficiency of H.264 compounds over time, hardware decode becomes more valuable for sustained playback, and the <video> element's controls API is the right abstraction anyway.

Short, silent, decorative: animated WebP. Long, audio-bearing, interactive: MP4. Most content fits cleanly into one bucket or the other, and the edge cases are handled by the decision tree above.

If you have a video or GIF that you want to turn into an animated WebP, the 2WebP converter handles it in-browser with no install required. For format comparisons, the animated WebP vs GIF benchmark covers file-size numbers in more detail for that specific comparison.