Compressing PDFs

Dec 17, 2009 at 4:23 PM

Sometimes, when I use Deflate compression on a PDF file, the compressed size is larger than the original. For now, when I zip a file with ZipStorer, I just check the extension and only "store" the PDF instead of compressing it. The bloating does not affect all PDFs. Is there any way to know whether a PDF already applies internal compression, so that the Deflate algorithm would just add overhead?

Otherwise, great class! I really appreciate you opening it up to all of us.  Thanks!
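
On the question of detecting internal compression: one rough heuristic is to scan the raw PDF bytes for the stream filter names that the PDF specification defines for compressed data, such as /FlateDecode or /DCTDecode. The sketch below is only an illustration and is not part of ZipStorer. The function names are hypothetical, and it assumes the filter names appear literally in the stream dictionaries, so it can give false positives or miss unusual encodings.

```python
import re

# Stream filter names from the PDF specification that indicate
# the stream data is already compressed.
COMPRESSING_FILTERS = (
    b"FlateDecode", b"LZWDecode", b"DCTDecode",
    b"JPXDecode", b"JBIG2Decode", b"CCITTFaxDecode",
)

_FILTER_RE = re.compile(rb"/(" + b"|".join(COMPRESSING_FILTERS) + rb")\b")


def count_compressing_filters(pdf_bytes: bytes) -> int:
    """Count literal occurrences of compressing filter names in the raw bytes."""
    return len(_FILTER_RE.findall(pdf_bytes))


def looks_internally_compressed(pdf_bytes: bytes) -> bool:
    """Heuristic only: True if at least one compressing filter name is present."""
    return count_compressing_filters(pdf_bytes) > 0
```

A caller could read the file with open(path, "rb").read() and choose Store when this returns True. A PDF can mix compressed and uncompressed streams, so comparing the sizes after a trial compression remains the more reliable test.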

Dec 22, 2009 at 12:46 AM

Hello dirq,

I cannot do anything specific to the PDF format, but I have already considered the case where double compression inflates the package instead of deflating it. This is a very well-known behaviour of the Deflate algorithm.

This has been filed in the "wish list" (Issue Tracker section). I am planning a release before the end of the year.

Regards,

Jaime.

Dec 22, 2009 at 2:47 PM

Awesome.  Thanks again.

Feb 17, 2010 at 4:59 PM

Hi dirq,

I have released a new version of ZipStorer (2.30). I have added a new control that switches from the Deflate algorithm to Store in case the file is bloated during compression. Please have a look and tell me if this works for you.

Thanks and best regards,

Jaime.
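
The Deflate-to-Store fallback described above can be sketched as follows. This is a minimal illustration in Python using zlib, not the actual ZipStorer 2.30 code. The names pick_method and deflate_raw are hypothetical. The constants 0 and 8 are the ZIP method codes for Store and Deflate.

```python
import zlib
from typing import Tuple

STORE, DEFLATE = 0, 8  # ZIP compression method codes


def deflate_raw(data: bytes, level: int = 6) -> bytes:
    """Produce a raw Deflate stream (no zlib header), as ZIP entries expect."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def pick_method(data: bytes) -> Tuple[int, bytes]:
    """Return (method, payload), falling back to Store if Deflate does not shrink the data."""
    compressed = deflate_raw(data)
    if len(compressed) < len(data):
        return DEFLATE, compressed
    return STORE, data
```

The cost of this approach is that every file is compressed once before the decision is made, which is usually acceptable for small archives.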

Mar 27, 2010 at 7:41 PM

I can tell that ZipStorer's compression is very poor in all cases, not only for PDF files. PDFs just highlight the problem. I compared the results.

Here is the list of files to compress:

  5,171  fw4.pdf
  5,929  HelloWorld (protected).pdf
  2,407  HelloWorld.pdf
116,751  Portable Document Format.pdf
 81,603  SomeLayout.pdf
=======================================
281,861 bytes

Different compressors:

Demo.zip             235,738
SharpZipLib.zip      176,121
ic.zip               173,331
ShellCompressed.zip  173,975

Where:

Demo.zip - produced by the ZipStorer demo program

ZIP.zip - compressed by ZIP.exe (Cygwin port)

SharpZipLib.zip - compressed by SharpZipLib

ShellCompressed.zip - drag and drop in Explorer into a compressed folder
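
To make the comparison easier to read, the archive sizes can be expressed as a fraction of the input size. The sketch below takes the 281,861-byte total and the archive sizes exactly as stated in the post. The helper name ratios is hypothetical.

```python
ORIGINAL_TOTAL = 281_861  # total input size as stated in the post, in bytes

ARCHIVE_SIZES = {
    "Demo.zip": 235_738,
    "SharpZipLib.zip": 176_121,
    "ic.zip": 173_331,
    "ShellCompressed.zip": 173_975,
}


def ratios(total: int, sizes: dict) -> dict:
    """Map each archive name to archive_size / total (smaller is better)."""
    return {name: size / total for name, size in sizes.items()}


if __name__ == "__main__":
    ranked = sorted(ratios(ORIGINAL_TOTAL, ARCHIVE_SIZES).items(), key=lambda kv: kv[1])
    for name, ratio in ranked:
        print(f"{name:<22}{ratio:6.1%}")
```

With these figures, Demo.zip comes to about 84% of the stated total, against roughly 61 to 63% for the other three archives.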

BTW, CodePlex's Folderzip, which uses the same (even simpler) approach as ZipStorer, gives an even worse result.

Mar 27, 2010 at 8:00 PM

Hi Lev,

Thanks for this detailed analysis. I am aware that DeflateStream is not very efficient, but the aim of ZipStorer is to provide a small-footprint library for storing multiple files in a zip package.

Anyway, if you have any ideas on how to improve DeflateStream efficiency, please advise me.

Thanks again!

Jaime

Mar 28, 2010 at 3:26 AM
Hi Jaime,
ZipStorer is a rather complex class, but the behavior of ZipFolder is striking. The size of the file should be the same as when you drag and drop in Explorer. Anyway, next week I will try to talk to the Windows people who know how this is implemented inside Explorer.

Mar 28, 2010 at 4:36 AM
Jaime!
Forget about it :) Just read the archives. All third-party libs implement their own DeflateStream. The one from .NET is not good. I suggest you put this piece of info on the CodePlex page.
Lev.
PS. I will still try to file a bug against the .NET library.

Apr 1, 2010 at 12:18 AM
I used Reflector and found that DeflateStream uses a very weak algorithm. The right way is to use GZipStream, which also has some limitations. Unfortunately this would defeat the goal you have in mind: creating a "onefiler".
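
As background on why two Deflate implementations can produce different sizes: Deflate defines the output format, while each encoder decides how much effort to spend searching for matches. The sketch below uses Python's zlib compression levels purely to illustrate that effect. It says nothing about the internals of the .NET DeflateStream, and the function name deflate_sizes is hypothetical.

```python
import zlib


def deflate_sizes(data: bytes, levels=(0, 1, 6, 9)) -> dict:
    """Return {level: compressed size in bytes} for raw Deflate at each zlib level."""
    sizes = {}
    for level in levels:
        compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
        sizes[level] = len(compressor.compress(data) + compressor.flush())
    return sizes


if __name__ == "__main__":
    # Synthetic, text-like sample data; real files will behave differently.
    sample = b"".join(b"line %d: the quick brown fox\n" % i for i in range(2000))
    for level, size in deflate_sizes(sample).items():
        print(f"level {level}: {size} bytes (input {len(sample)} bytes)")
```

Level 0 emits stored blocks only, so its output is slightly larger than the input, which is the same bloating effect seen when Deflate is applied to data it cannot shrink.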
