So basically, you're extracting images from an existing PDF, processing them, and then putting them into a new PDF? The thing is, adding Flate compression on top of DCT usually yields only marginal size improvements, so it sounds as if the optimizer might have been lossy.
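To illustrate why Flate on top of DCT rarely helps, here is a minimal sketch using only the standard library. It does not touch a real PDF or JPEG: the pseudo-random bytes are an assumed stand-in for an already entropy-coded DCT stream, the repeated pattern is an assumed stand-in for raw scanned pixels, and `flate_ratio` is a hypothetical helper name.

```python
import random
import zlib


def flate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


SIZE = 200_000

# Assumption: a DCT (JPEG) stream is already entropy-coded, so it looks
# close to random noise. This is a proxy, not an actual JPEG.
rng = random.Random(0)
dct_like = bytes(rng.getrandbits(8) for _ in range(SIZE))

# Assumption: raw scanned pixels are mostly white background with some
# dark marks, which is highly repetitive.
pattern = b"\xff" * 60 + b"\x20" * 4
raw_like = pattern * (SIZE // len(pattern))

print(f"Flate on DCT-like data: {flate_ratio(dct_like):.3f}")
print(f"Flate on raw-like data: {flate_ratio(raw_like):.3f}")
```

If a real file shrinks from 18 MB to 8 MB, a ratio like the first one cannot explain it, which is why re-encoding at lower quality or resolution is the more likely cause.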
Hello! I'm working on an application to clean up scanned PDF files. The idea is to remove shadows in the background and straighten the text so the PDFs can be fed into OCR software or printed. For testing, I loaded a PDF document and saved it as a new file. However, the saved files are much larger than the ones that were read (input ~5 MB, output ~18 MB). I uploaded the generated file to an online PDF optimizer and investigated the result. It is around 8 MB, so much better than my 18 MB output. The difference is that the images have `FlateDecode` in their filters.

I wonder if there is a way to force the use of FlateDecode for the PDF images in pypdfium?
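For comparing the filters used by the original, the 18 MB output, and the optimized file, a rough standard-library sketch like the following might help. It is not a PDF parser and does not use pypdfium: it only scans the raw bytes for `/Filter` entries, so it counts all streams rather than just images and misses dictionaries stored inside compressed object streams. `count_filters` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches "/Filter /Name" as well as "/Filter [/A /B]" in raw PDF bytes.
_FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
_NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def count_filters(pdf_bytes: bytes) -> Counter:
    """Count each filter chain (as a tuple of names) found in the raw bytes."""
    counts = Counter()
    for match in _FILTER_RE.finditer(pdf_bytes):
        names = _NAME_RE.findall(match.group(1))
        chain = tuple(name.decode("ascii") for name in names)
        counts[chain] += 1
    return counts
```

Calling it on each file's bytes, for example `count_filters(open("output.pdf", "rb").read())` with a placeholder file name, shows whether the images went from a plain `DCTDecode` filter to a `FlateDecode` plus `DCTDecode` chain or to something else entirely.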