Skip to content

Add APNG R/W Support via libpng#4944

Open
felixbuenemann wants to merge 2 commits into
libvips:masterfrom
felixbuenemann:apng-support
Open

Add APNG R/W Support via libpng#4944
felixbuenemann wants to merge 2 commits into
libvips:masterfrom
felixbuenemann:apng-support

Conversation

@felixbuenemann

@felixbuenemann felixbuenemann commented Mar 11, 2026

Copy link
Copy Markdown
Collaborator

Summary

Add APNG (animated PNG) load/save support via libpng.

Adds animated PNG to the existing libpng backend, gated behind PNG_APNG_SUPPORTED (libpng 1.6+APNG patch or libpng 1.8+). The loader supports page/n parameters matching the GIF/WebP loaders, full frame compositing (all dispose and blend ops, 8-bit and 16-bit), and sequential access. The saver writes APNG via sink_disc with palette mode support. Non-animated PNGs are unaffected — the only overhead is an acTL chunk check.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

Btw. I also opened a PR for spng (which libvips also supports) to add APNG R/W support at randy408/libspng#283, but it looks like the library is unmaintained since 2023.

@jcupitt

jcupitt commented Mar 11, 2026

Copy link
Copy Markdown
Member

This is great @felixbuenemann!

Yes, libspng seems to have gone, sadly, so libpng only is fine.

I'm trapped in a horrible deadline right now (I must submit a paper in a month presenting work I've still not finished), but after that, and a beer, I'll be back on libvips.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

@jcupitt That's fine. Do you prefer if I split the quantiser into a separate PR?

@jcupitt

jcupitt commented Mar 12, 2026

Copy link
Copy Markdown
Member

I'm ok with one PR, but maybe someone else will review. I'll defer to them.

@jcupitt

jcupitt commented Mar 12, 2026

Copy link
Copy Markdown
Member

Getting rid of a dependency sounds great, especially if we save time and memory.

There's an issue somewhere asking for a "find $N most significant colours" operation, which is difficult with libimagequant because of the API. If we had our own cluster finder we could (probably?) add stuff like this relatively easily.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

Getting rid of a dependency sounds great, especially if we save time and memory.

There's an issue somewhere asking for a "find $N most significant colours" operation, which is difficult with libimagequant because of the API. If we had our own cluster finder we could (probably?) add stuff like this relatively easily.

That shouldn't be too hard to do, but currently the built-in quantizer is only used as a fallback, when neither quantizr nor libimagequant are found.

I also have another optimization in the pipeline, that remaps pixels with a fully transparent alpha, but different rgb values to a single 0,0,0,0 alpha palette entry to ensure remaining palette entries are not wasted. I just haven;t come around to generating good test images to exercise the code.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

I pushed the optimisation that replaces all fully transparent pixels with a single {0,0,0,0} palette entry at position 0, which also improves PNG compression.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

I also discovered #4971 while working on this.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

I fixed the fuzzer errors by checking sub-frame dimensions and frame counts before allocating.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

@jcupitt you said:

There's an issue somewhere asking for a "find $N most significant colours" operation, which is difficult with libimagequant because of the API. If we had our own cluster finder we could (probably?) add stuff like this relatively easily.

We can get just the palette with all three quantizers (built-in, libimagequant, quantizr) and return it as RGBA strip or what was the intended API?
Or is "most significant colours" something else than the palette, when quantizing to $N colors?

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

I amended the quantizer commit, because cgif was disabled with the built-in quantizer and also added an error handling bugfix for the quantizr lib in a separate commit.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

@jcupitt I've split the quantizer to PR #4987 since both changes are independent of each other.

Add animated PNG (APNG) support to the libpng backend, gated behind
PNG_APNG_SUPPORTED (libpng 1.6+APNG patch or libpng 1.8+).

Load:
- Extend Read struct with APNG fields (page/n, canvas, dispose state)
- Add page/n parameters to pngload (matching GIF/WebP loaders)
- Detect acTL chunk and set animation metadata (n-pages, page-height,
  delay, loop)
- Scan raw PNG chunks via vips_source_map to extract fcTL delays at
  header time, matching how GIF/WebP loaders handle delay metadata
- Frame compositing with DISPOSE_OP_NONE/BACKGROUND/PREVIOUS and
  BLEND_OP_SOURCE/OVER for both 8-bit and 16-bit
- Sequential generate callback for frame-by-frame reading
- Compat defines for constant naming differences between libpng 1.6
  APNG patch (PNG_DISPOSE_OP_NONE) and 1.8+ (PNG_fcTL_DISPOSE_OP_NONE)
- Non-animated PNG fallthrough avoids duplicate png_read_update_info
  call (rejected by libpng 1.8)

Save:
- Detect animation via page-height metadata
- Write APNG with acTL/fcTL/fdAT chunks via sink_disc callback
- Disable interlace for animated output with warning
- Palette mode support via quantisation

Tests:
- Add cogs-apng.png test image (indexed APNG converted from cogs.gif)
- Add hand-crafted APNG test images for compositing: dispose-background,
  dispose-previous, and alpha blend-over
- Add test_apng_load: metadata, page handling, single/multi-frame
- Add test_apng_dispose_background: verify BACKGROUND dispose clears
  canvas to transparent
- Add test_apng_dispose_previous: verify PREVIOUS dispose restores
  canvas, including spec rule that PREVIOUS on frame 0 = BACKGROUND
- Add test_apng_blend_over: verify alpha-over compositing of
  semi-transparent frames
- Add test_apng_save: lossless round-trip, palette round-trip, GIF
  to APNG conversion
- Add have_apng() / skip_if_no_apng for builds without APNG support
@felixbuenemann felixbuenemann changed the title Add APNG R/W Support via libpng & Built-in Quantizer Add APNG R/W Support via libpng Mar 30, 2026
@jcupitt

jcupitt commented Jun 8, 2026

Copy link
Copy Markdown
Member

Now I look more carefully, this pr is HUGE, and very difficult to review properly.

I'll split out a bite-size chunk (just apng load?) and make a small PR for that.

@jcupitt jcupitt mentioned this pull request Jun 8, 2026
@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

Now I look more carefully, this pr is HUGE, and very difficult to review properly.

I'll split out a bite-size chunk (just apng load?) and make a small PR for that.

I already cut this PR down to just apng support with the quantizer in another PR. Or do you refer to the comment noise on this PR?

@jcupitt

jcupitt commented Jun 9, 2026

Copy link
Copy Markdown
Member

I already cut this PR down to just apng support with the quantizer in another PR.

Yes, it needs to be smaller still, ideally under 1,000 lines. It's a huge effort to review large PRs :(

I'm also concerned about code quality. The apng load PR I've started is based on your loader, but I'm being forced to rewrite almost the entire thing by hand.

The LLM generated code is not maintainable --- anyone trying to fix an issue in it will get a bad headache. I think merging code like that would seriously damage the long-term health of the project.

@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

If you have specific feedback, maybe the llm can fix most if it by itself, saving you some work.

@jcupitt

jcupitt commented Jun 10, 2026

Copy link
Copy Markdown
Member

For example, this was the first function I looked at:

https://github.com/felixbuenemann/libvips/blob/apng-support/libvips/foreign/vipspng.c#L618-L683

It scans the file looking for delay chunks to build the delay array.

It's hard to read and verify:

  • common code in then and else parts
  • things are incremented, and then "var - 1" appears later, creating
    complex dependencies between different parts of the function, and
    making maintenance much harder
  • conceptually this should be a for loop along the chunks, why is this a while?
  • (minor) libvips is 4-space tabs, so a lot of linebreaks can go
  • silently ignores malformed chunks -- it should probably be a warning and optionally throw an error
  • lots of unnecessary ternaries
  • unnecessary apng_delay_to_ms() function

The rest of the PR has similar problems. The code is tortuous, unclear, and would be difficult to maintain.

There are larger structural problems too. APNG load hasn't been built in layers, instead it's a general melange of mixed up concepts. Everything dissolves into a uniform fog of vague confusion, with no light of intelligence.

For example, there should be four functions for manipulating areas of pixels: clear a buffer, take a copy of a buffer, paste at position, and blend at position. They should be easily verifiable, including slightly tricky things like clipping and handling of out of range values. Later parts of apng load would use those four to do any pixel manipulation, and would not contain any direct memory writes.

Without strict layers, verifiability is difficult. Can you prove (informally anyway!) that apng load can never write out of bounds? It's much harder if writes happen everywhere rather than being isolated in single functions.

I think for non-library projects, code like this is probably fine. If there's a problem, you can just regenerate the whole thing. Maintainability doesn't really matter.

For long-lived libraries though, it's just not going to work. Maintainability over many years is almost the key quality. You have to be able to present a stable and predictable API to downstream users, and that means code has to evolve. You can't evolve code that no one can read.

I feel bad, I know you spent a lot of time on this PR and must be disappointed and annoyed, but now I've spent time reading the code I think it all has to be rewritten, and by a human. LLMs are great, but for projects like libvips, only as very junior assistants, not as (almost) final authors.

I'll write an AI-POLICY.md to try to stop this happening again.

Rework the APNG code in vipspng.c following review.

Load is now built in layers: all canvas pixel access goes through four
functions -- apng_canvas_clear/copy/paste/blend() -- which clip against
the canvas edges, so out-of-bounds reads and writes are impossible by
construction. Dispose, frame compositing and the generate callback are
expressed in terms of these and contain no direct memory writes.

Rewrite apng_scan_delays() as a for loop along the chunks, with a local
big-endian fetch helper instead of unaligned guint32 casts. Malformed
and truncated chunks now warn, and fail with fail_on >= error. Inline
the delay-to-ms conversion at its single use site.

Merge the ~250-line write_vips_apng() duplicate back into write_vips():
animation is now a few small conditionals (page-height IHDR, acTL,
delay fetch, sink callback choice). write_apng_block() streams rows and
emits fcTL chunks at frame boundaries, dropping the whole-frame
accumulation buffer.

Fix several bugs found while restructuring:

- hidden first frames failed to load: a hidden IDAT has no fcTL, but we
  called png_get_next_frame_fcTL() for it, and the delay scan skipped a
  nonexistent "hidden fcTL", shifting all delays by one
- opaque APNGs decoded to garbage: png_set_add_alpha() was called after
  png_read_update_info() (no effect), and the alpha-strip path wrote
  fewer bands than the header advertised; animated output now always
  has an alpha channel, like the GIF and WebP loaders (required anyway,
  since BACKGROUND dispose can create transparency from solid frames)
- interlaced PNG loads lost all metadata (ICC, EXIF, comments): restore
  the original png2vips_image() structure with an early-return animated
  branch, which also removes the double png_read_update_info() hack
- interlaced APNGs now fail cleanly at header time instead of decoding
  garbage (png_read_row() cannot handle interlace)
- bad page numbers on plain PNGs now error, like gifload
- a per-frame frame buffer that leaked on png_read_row() longjmp is
  replaced by one canvas-sized buffer for the lifetime of the read

Add hand-crafted test images and tests for the previously broken paths:
hidden first frame (paging and delays), fully opaque RGB APNG (forced
alpha with BACKGROUND dispose), and 16-bit BLEND_OP_OVER (verified
within 1 LSB of float-reference compositing).

Tested against both APNG-capable libpngs (1.6.55 + APNG patch, and
libpng 1.8 with native APNG support).
@felixbuenemann

Copy link
Copy Markdown
Collaborator Author

@jcupitt I have pushed a new commit that refactors the code according to your comments. Let me know, if it is an improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants