Compatibility with OCF Container format

Dec 21, 2009 at 7:50 AM

I'm trying to get ZipStorer to write files that are compatible with the OCF Container format, but it is not producing valid files. Would you please consider changing ZipStorer so that it can write files compatible with this standard? I believe the major problem is with the zip header, which is specifically mentioned in the final paragraph of the section quoted below.

The OCF Container is just a zip file written in a specific manner. Here is the part of the spec that concerns the zip file:

4         ZIP Container

OCF’s ZIP Container supports the ZIP format as specified by the application note at http://www.pkware.com/business_and_developers/developer/appnote/, but with the following constraints and clarifications:

  • Conforming OCF ZIP Containers MUST NOT use the features in the ZIP application note that allow ZIP files to be split across multiple storage media. Conforming OCF Reading Systems MUST treat any OCF files that specify that the ZIP file is split across multiple storage media as being in error.
  • Conforming OCF ZIP Containers MUST only include uncompressed files or Flate-compressed files within the ZIP archive. Conforming OCF Reading Systems MUST treat any OCF Containers that use compression techniques other than Flate as being in error.
  • Conforming OCF ZIP Containers MAY use the ZIP64 extensions and SHOULD only use those extensions when the content requires them. Conforming OCF Reading Systems MUST support the ZIP64 extensions.
  • Conforming OCF ZIP Containers MUST NOT use the encryption features defined by the ZIP format; instead, encryption MUST be done using the features described in Section 3.5.5. Conforming OCF Reading Systems MUST treat any other OCF ZIP Containers that use ZIP encryption features as being in error.
  • It is not a requirement that Conforming OCF Reading Systems preserve information from an OCF ZIP Container through load and save operations that do not map to corresponding representation within the OCF Abstract Container; in particular, a Conforming OCF Reading System does not have to preserve CRC values, comment fields or fields that hold file system information corresponding to a particular operating system (e.g., “External file attributes” and “Extra field”)
  • Conforming OCF ZIP Containers MUST encode File System Names using UTF-8.

 

Here are some details about particular fields in the ZIP archive:

  • On the local file header table, Conforming OCF ZIP Containers MUST set the ‘version needed to extract’ fields to the values 10, 20 or 45 in order to match the maximum version level needed by the given file (e.g., 20 if Deflate is needed, 45 if ZIP64 is needed). Conforming OCF Reading Systems MUST treat any other values as being in error.
  • On the local file header table, Conforming OCF ZIP Containers MUST set the ‘compression’ method field to the values 0 or 8. Conforming OCF Reading Systems MUST treat any other values as being in error.
  • Conforming OCF Reading Systems MUST treat OCF ZIP Containers with an “Archive decryption header” or an “Archive extra data record” as being in error.

 

The first file in the ZIP Container MUST be a file by the ASCII name of ‘mimetype’ which holds the MIME type for the ZIP Container (i.e., “application/epub+zip” as an ASCII string; no padding, white-space or case change). The file MUST be neither compressed nor encrypted and there MUST NOT be an extra field in its ZIP header. If this is done, then the ZIP Container offers convenient “magic number” support as described in RFC 2048 and the following will hold true:

  • The bytes “PK” will be at the beginning of the file
  • The bytes “mimetype” will be at position 30
  • The actual MIME type (i.e., the ASCII string “application/epub+zip”) will begin at position 38
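
The three offsets follow from the ZIP layout: the fixed part of a local file header is 30 bytes, the 8-byte name "mimetype" comes right after it, and with no extra field the stored content starts at 30 + 8 = 38. A minimal sketch of a checker for this layout, using only the .NET base class library, could look like the following. The class and method names are made up for illustration and are not part of ZipStorer.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of ZipStorer.
static class OcfMagicCheck
{
    const string MimeType = "application/epub+zip";

    // True when the buffer has "PK" at 0, "mimetype" at 30
    // and the MIME type string at 38.
    public static bool HasOcfMagic(byte[] head)
    {
        return Matches(head, 0, "PK")
            && Matches(head, 30, "mimetype")
            && Matches(head, 38, MimeType);
    }

    // Compares the ASCII bytes of 'expected' with the buffer at 'offset'.
    static bool Matches(byte[] data, int offset, string expected)
    {
        byte[] want = Encoding.ASCII.GetBytes(expected);
        if (data.Length < offset + want.Length) return false;
        for (int i = 0; i < want.Length; i++)
        {
            if (data[offset + i] != want[i]) return false;
        }
        return true;
    }

    // Reads just enough bytes from the start of a file to run the check.
    public static bool FileHasOcfMagic(string path)
    {
        byte[] head = new byte[38 + MimeType.Length];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = 0;
            while (read < head.Length)
            {
                int n = fs.Read(head, read, head.Length - read);
                if (n == 0) break; // end of file reached early
                read += n;
            }
            if (read < head.Length) return false;
        }
        return HasOcfMagic(head);
    }
}
```

Calling OcfMagicCheck.FileHasOcfMagic on a generated file gives a quick sanity check of the first entry; it does not validate the rest of the container.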

Dec 22, 2009 at 12:49 AM

Hello awx,

I will add an entry for your request to the "wish list" (Issue Tracker section), but I cannot promise that it will be possible, at least not for the next release before the end of the year.

Regards,

Jaime.

Feb 9, 2010 at 7:53 AM
Edited Feb 9, 2010 at 7:57 AM

Hi awx, I have been studying your requirements for a while and have added my comments at the end of each item below:

OCF’s ZIP Container supports the ZIP format as specified by the application note at http://www.pkware.com/business_and_developers/developer/appnote/, but with the following constraints and clarifications:

  • Conforming OCF ZIP Containers MUST NOT use the features in the ZIP application note that allow ZIP files to be split across multiple storage media. Conforming OCF Reading Systems MUST treat any OCF files that specify that the ZIP file is split across multiple storage media as being in error. [Jaime: Not used.]
  • Conforming OCF ZIP Containers MUST only include uncompressed files or Flate-compressed files within the ZIP archive. Conforming OCF Reading Systems MUST treat any OCF Containers that use compression techniques other than Flate as being in error. [Jaime: Only the Deflate algorithm is used.]
  • Conforming OCF ZIP Containers MAY use the ZIP64 extensions and SHOULD only use those extensions when the content requires them. Conforming OCF Reading Systems MUST support the ZIP64 extensions. [Jaime: ZIP64 is currently not supported.]
  • Conforming OCF ZIP Containers MUST NOT use the encryption features defined by the ZIP format; instead, encryption MUST be done using the features described in Section 3.5.5. Conforming OCF Reading Systems MUST treat any other OCF ZIP Containers that use ZIP encryption features as being in error. [Jaime: Encryption is not supported.]
  • It is not a requirement that Conforming OCF Reading Systems preserve information from an OCF ZIP Container through load and save operations that do not map to corresponding representation within the OCF Abstract Container; in particular, a Conforming OCF Reading System does not have to preserve CRC values, comment fields or fields that hold file system information corresponding to a particular operating system (e.g., “External file attributes” and “Extra field”) [Jaime: Can be avoided.]
  • Conforming OCF ZIP Containers MUST encode File System Names using UTF-8. [Jaime: Currently using CP850 for wider compatibility, but this can easily be changed.]

Here are some details about particular fields in the ZIP archive:

  • On the local file header table, Conforming OCF ZIP Containers MUST set the ‘version needed to extract’ fields to the values 10, 20 or 45 in order to match the maximum version level needed by the given file (e.g., 20 if Deflate is needed, 45 if ZIP64 is needed). Conforming OCF Reading Systems MUST treat any other values as being in error. [Jaime: Currently using 20.]
  • On the local file header table, Conforming OCF ZIP Containers MUST set the ‘compression’ method field to the values 0 or 8. Conforming OCF Reading Systems MUST treat any other values as being in error. [Jaime: Currently using only 0 or 8, as required.]
  • Conforming OCF Reading Systems MUST treat OCF ZIP Containers with an “Archive decryption header” or an “Archive extra data record” as being in error. [Jaime: I am not sure what these are, but I do not think I am using them.]
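
For anyone who wants to verify these field values in a generated file, here is a rough sketch that inspects the first 30 bytes of a local file header. It assumes the standard ZIP layout (signature at offset 0, 'version needed to extract' at offset 4, general purpose flags at offset 6, compression method at offset 8, all little-endian). The class name is hypothetical and the code is independent of ZipStorer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ZipStorer.
static class OcfLocalHeaderCheck
{
    // Reads a 16-bit little-endian value, the byte order used by ZIP headers.
    static int ReadUInt16(byte[] data, int offset)
    {
        return data[offset] | (data[offset + 1] << 8);
    }

    // Takes the first 30 bytes of a local file header and returns a list of
    // problems; an empty list means none of these particular checks failed.
    public static List<string> Check(byte[] header)
    {
        List<string> problems = new List<string>();
        if (header.Length < 30)
        {
            problems.Add("header shorter than 30 bytes");
            return problems;
        }
        // Local file header signature: "PK" followed by 0x03 0x04.
        if (header[0] != 0x50 || header[1] != 0x4B || header[2] != 0x03 || header[3] != 0x04)
        {
            problems.Add("missing local file header signature");
            return problems;
        }

        int versionNeeded = ReadUInt16(header, 4);
        int flags = ReadUInt16(header, 6);
        int method = ReadUInt16(header, 8);

        if (versionNeeded != 10 && versionNeeded != 20 && versionNeeded != 45)
            problems.Add("version needed to extract is " + versionNeeded);
        if (method != 0 && method != 8)
            problems.Add("compression method is " + method);
        if ((flags & 0x0001) != 0) // bit 0 marks an encrypted entry
            problems.Add("encryption flag is set");
        return problems;
    }
}
```

This only covers the fields discussed in the list above; it does not look for an "Archive decryption header" or an "Archive extra data record".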

The first file in the ZIP Container MUST be a file by the ASCII name of ‘mimetype’ which holds the MIME type for the ZIP Container (i.e., “application/epub+zip” as an ASCII string; no padding, white-space or case change). The file MUST be neither compressed nor encrypted and there MUST NOT be an extra field in its ZIP header. [Jaime: (*) This can be done by adding the mentioned file right after creating the archive.]

If this is done, then the ZIP Container offers convenient “magic number” support as described in RFC 2048 and the following will hold true:

  • The bytes “PK” will be at the beginning of the file. [Jaime: This is OK.]
  • The bytes “mimetype” will be at position 30. [Jaime: Can be done as mentioned in (*).]
  • The actual MIME type (i.e., the ASCII string “application/epub+zip”) will begin at position 38. [Jaime: Can be done as mentioned in (*).]

So I think the only remaining constraint for OCF compatibility is the encoding, which can be changed at line 58 of ZipStorer.cs as follows:

private static Encoding FilenameEncoder = Encoding.UTF8;
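
To illustrate what this change means for the bytes written, here is a small sketch using only System.Text. With UTF-8, a non-ASCII character in an entry name takes more than one byte, so the byte count of a name can be larger than its character count. The helper class is hypothetical and only demonstrates the encoding; it does not show how ZipStorer writes its headers.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
static class FilenameBytes
{
    // Encodes an entry name the way Encoding.UTF8 would.
    public static byte[] EncodeName(string name)
    {
        return Encoding.UTF8.GetBytes(name);
    }

    // Hex dump of the encoded name, e.g. for comparing against a hex viewer.
    public static string Hex(string name)
    {
        return BitConverter.ToString(EncodeName(name));
    }
}
```

For example, FilenameBytes.Hex("caf\u00E9.xhtml") returns 63-61-66-C3-A9-2E-78-68-74-6D-6C: the name has 10 characters but encodes to 11 bytes, because the accented letter becomes the two bytes C3 A9. Pure ASCII names such as "mimetype" come out the same as before.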

Regarding the demo application, I would suggest the following replacement for TestForm.cs:

        private void ButtonProceed1_Click(object sender, EventArgs e)
        {
            // Preliminary checks
            if (this.listBox1.Items.Count <= 0)
            {
                MessageBox.Show("Source files not chosen.", "ZipStorer Demo", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
                return;
            }
            if (string.IsNullOrEmpty(TextStorage1.Text))
            {
                MessageBox.Show("Target filename not defined.", "ZipStorer Demo", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
                return;
            }
            try
            {
                ZipStorer zip;
                if (this.RadioCreate.Checked)
                    // Creates a new zip file
                    zip = ZipStorer.Create(TextStorage1.Text, "");
                else
                    // Opens existing zip file
                    zip = ZipStorer.Open(TextStorage1.Text, FileAccess.Write);
                // For new zip files only: add the required 'mimetype' entry first, stored uncompressed
                if (this.RadioCreate.Checked)
                {
                    // The using block disposes the memory stream even if AddStream throws
                    using (MemoryStream mimetype = new MemoryStream(System.Text.Encoding.UTF8.GetBytes("application/epub+zip")))
                    {
                        zip.AddStream(ZipStorer.Compression.Store, "mimetype", mimetype, DateTime.Now, "");
                    }
                }
                // Stores all the files into the zip file
                foreach (string path in listBox1.Items)
                {
                    zip.AddFile(this.checkCompress.Checked ? ZipStorer.Compression.Deflate : ZipStorer.Compression.Store, 
                        path, Path.GetFileName(path), "");
                }
                // Updates and closes the zip file
                zip.Close();
                MessageBox.Show("Target file processed with success.", "ZipStorer Demo", MessageBoxButtons.OK, MessageBoxIcon.Information);
                // Clear controls
                this.listBox1.Items.Clear();
                this.TextStorage1.Text = "";
            }
            catch (InvalidDataException)
            {
                MessageBox.Show("Error: Invalid or not supported Zip file.", "ZipStorer Demo", MessageBoxButtons.OK, MessageBoxIcon.Error);
            }
            catch
            {
                MessageBox.Show("Error while processing target file.", "ZipStorer Demo", MessageBoxButtons.OK, MessageBoxIcon.Error);
            }
        }

 

 

Feb 15, 2010 at 4:00 AM
Edited Feb 15, 2010 at 4:01 AM

Thank you for taking the time to thoroughly investigate this. I've made the few changes you mentioned and can now output valid files with my ePubHub app. Now it will be possible to make the first public release in a few weeks.

 

PS- I'm glad you didn't need to make any drastic changes to your code. I think you should be able to claim compatibility with OpenOffice.org docs now as well as ePub docs.

Feb 15, 2010 at 4:26 AM

Glad to help :)
Please let me know if any issues come up in further tests.
Regards,
Jaime.