Date: Sun, 14 Apr 91 12:45:25 EDT
From: Johnny Lee
Subject: Compactor file format...

A few people keep posting questions on this. By examining the files
Compact Pro produces I was able to discern the file format, which
corresponds very well to the file format which Compactor's author
sent out to developers.

I thought you might want the file containing the file format. It
contains the file format description as given by Compactor's author
and by me. It doesn't contain info on how to extract files from
Compact Pro archives.

If you decide not to, that's fine with me.

Sorry about the first description (from Compact Pro's author).

Johnny

=======================================================================
Date: 25-Jul-90 09:56 CDT
From: Bill Goodman [71101,204]
Subj: file format

Here is the information for accessing the directory information in
Compactor archives which I have distributed to other developers.

-----------------------------------------------------------------------------

One complication with Compactor archives is that they can be segmented
to span multiple disks. Each segment is a standard Macintosh file;
however, you may need to give them special treatment. I believe that
the only segment which will be of interest to you is the last one,
because it is the one that contains the directory for the archive.

Each segment begins with a segment header (DiskSegHeader). The
dirOffset field gives you the offset of the directory in the file. If
this is part of a multi-segment archive and this is not the last
segment, dirOffset will be 0. You should skip these files since they
do not contain any directory information.

Using dirOffset you can read the directory header (DiskDirHeader).
Note that the directory header is of variable length due to the
archive note. You must determine where the directory header ends
because the first file/folder record begins immediately after the
directory header.
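
The steps above can be sketched in C roughly as follows. This is only
an illustration, not Compactor's own code. It assumes the field widths
given in the structure definitions in this document (1 + 1 + 2 + 4
bytes for the segment header; a 4-byte CRC, a 2-byte recordCnt and a
Pascal-string note for the directory header) and assumes big-endian
byte order, as on a 68k Macintosh. The names locate_directory, DirInfo,
be16 and be32 are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read big-endian integers one byte at a time. Byte order is an
   assumption (68k Macintosh); reading bytewise also avoids any
   alignment requirement on the buffer. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical result type for this sketch. */
typedef struct {
    uint32_t dirOffset;    /* from the segment header            */
    uint16_t recordCnt;    /* from the directory header          */
    size_t   firstRecord;  /* offset of first file/folder record */
} DirInfo;

/* buf/len hold one whole segment file.
   Returns  0 on success,
            1 if this segment has no directory (dirOffset == 0),
           -1 if the buffer is too short for what it claims. */
static int locate_directory(const uint8_t *buf, size_t len, DirInfo *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 8)                        /* segment header is 8 bytes */
        return -1;
    out->dirOffset = be32(buf + 4);     /* skip format, segNum, arcID */
    if (out->dirOffset == 0)
        return 1;                       /* not the last segment */

    size_t off = out->dirOffset;
    if (off > len || len - off < 7)     /* CRC(4) + count(2) + note length(1) */
        return -1;
    out->recordCnt = be16(buf + off + 4);

    size_t noteLen = buf[off + 6];      /* Pascal string length byte */
    if (len - off - 7 < noteLen)
        return -1;
    out->firstRecord = off + 7 + noteLen;
    return 0;
}
```

A caller would read the whole segment into memory, call
locate_directory, and skip the file when the result is 1.
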
The number of file/folder records which follow is given by recordCnt.

File records (DiskFileRec) and folder records (DiskFolderRec) are also
variable length because the file/folder names are stored as the
minimum length Pascal string. Special encoding of the first character
of the file/folder name allows you to determine whether the following
record is a file record or a folder record. If the first byte (the
string length) is less than 128, the record is a file record. If the
first byte is 128 or greater (high bit set), the record is a folder
record. Note that the length byte for the folder name string must be
adjusted (by subtracting 128) to obtain the correct length of the
folder name.

I believe the fields in the file records should be self explanatory;
however, the recordCnt field in the folder record requires a little
more description. The recordCnt field indicates the total number of
following records which are included in this folder - this includes
both file and folder records. To illustrate, here is a small example
with the recordCnt numbers supplied in parentheses:

    Dog folder (10)
        File1
        File2
        Cat folder (2)
            File3
            File4
        Moose folder (3)
            File5
            File6
            File7
        File8
    File9
    Empty folder (0)
    File10

------------------

typedef struct
{
    unsigned char       fileFormat;     /* File format version code (1) */
    unsigned char       segNum;         /* Segment number */
    unsigned int        arcID;          /* Randomly generated archive ID number */
    unsigned long int   dirOffset;      /* Offset of directory in this segment */
} DiskSegHeader;

typedef struct
{
    unsigned char       filler[4];      /* CRC for directory */
    int                 recordCnt;      /* Number of file/folder records in dir */
    unsigned char       arcNote[n];     /* Archive note (Pascal string, n = 1 - 256) */
} DiskDirHeader;

typedef struct
{
    unsigned char       fName[n];       /* File name (Pascal string, n = 1 - 32) */
    unsigned char       filler[5];
    OSType              fileType;       /* File type */
    OSType              fileCreator;    /* File creator */
    unsigned long int   createDate;     /* Creation date */
    unsigned long int   lastModDate;    /* Last modification date */
    unsigned char       filler[8];
    unsigned long       rForkLen;       /* Length of resource fork */
    unsigned long       dForkLen;       /* Length of data fork */
    unsigned char       filler[8];
} DiskFileRec;

typedef struct
{
    unsigned char       fName[n];       /* Folder name (Pascal string, n = 1 - 32) */
    int                 recordCnt;      /* Number of records contained in this folder */
} DiskFolderRec;

==============================================================================

January 9, 1991

Compactor file format (as far as I've been able to discern)
===========================================================

Archive structure
-----------------

A Compactor archive contains an Archive Header (AH) followed by the
compressed data followed by the Table of Contents (TOC).

Archive Header (AH):
        1 byte    - always 1, probably version number
        1 byte    - number of segments in this archive, i.e. 0-255
        2 bytes   - ???
        4 bytes   - location of TOC; always at the end of an archive,
                    i.e. will be in the last segment of a multi-segment
                    archive, otherwise 0.

The TOC refers to information about the directories and files in the
archive.

Table of Contents (TOC):
        4 bytes   - CRC for directory
        2 bytes   - number of files and folders in the TOC
        1 byte    - length (LEN) of "Note" text
        LEN bytes - actual "Note" text, if LEN == 0, no bytes
        <...followed by directory and file structures...>

=========================>> NOTE 1 <<===============================
The following structures are never guaranteed to be aligned to word
or long word boundaries in the archive, so you'll have to read them
into an appropriate structure as opposed to using a pointer to the
data.
====================================================================

=========================>> NOTE 2 <<===============================
Remember that directory and file names will always be less than 128
characters (31 chars max, I believe).
====================================================================

If a directory exists in the archive, it will be of this form:

        1 byte    - length (LEN) of directory name (high bit will be set)
        LEN bytes - directory name text
        2 bytes   - number of files/folders (NFF) in this directory,
                    followed by NFF File and Directory structures.

If a file exists in the archive, it will be of this form:

        1 byte    - length (LEN) of file name (high bit will not be set)
        LEN bytes - file name text
        1 byte    - seems to always be 1, though I don't really know
                    why yet
        4 bytes   - offset from start of file for file's compressed data
        4 bytes   - Mac File Type
        4 bytes   - Mac File Creator
        4 bytes   - file creation time (I assume, in secs. since
                    Jan 1, 1904)
        4 bytes   - file modification time
        4 bytes   - checksum? (one for resource and one for data fork?)
        4 bytes   - checksum? (could try all the popular CRC algorithms
                    to see if anything matches)
        4 bytes   - original resource fork length
        4 bytes   - original data fork length
        4 bytes   - compressed resource fork length
        4 bytes   - compressed data fork length

=========================>> NOTE 3 <<===============================
My guess as to compression is that Compactor uses a variant of
Lempel-Ziv compression similar to that of Unix Compress & StuffIt
1.5.1, as opposed to the Lempel-Ziv variant used in LHarc. Anyone
willing to tell me what it is?
====================================================================

=========================>> NOTE 4 <<===============================
Encryption seems to add about eight bytes to the size of a file's
compressed data.
====================================================================
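
For anyone who wants to experiment, here is a rough C sketch of walking
the file and folder records described above. It is not taken from
Compactor; it is a guess that follows the two descriptions in this
file. It assumes that every file record carries 45 bytes after its
name (5 + 4 + 4 + 4 + 4 + 8 + 4 + 4 + 8), that integers are big-endian,
and that a folder's count covers all records nested beneath it, as in
the Dog folder example. The names walk_records, be16 and FILE_REC_FIXED
are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Big-endian 16-bit read (byte order is an assumption). Reading bytewise
   also sidesteps the alignment problem mentioned in NOTE 1. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Bytes that follow the name in a file record, per the layouts above. */
#define FILE_REC_FIXED 45

/* Walks `count` records starting at *pos and prints each name,
   indented four spaces per folder level. `count` is the total number
   of records to consume, nested ones included.
   Returns the number of file records seen, or -1 on bad data. */
static long walk_records(const uint8_t *buf, size_t len, size_t *pos,
                         unsigned count, int depth)
{
    long files = 0;
    assert(buf != NULL && pos != NULL);

    while (count > 0) {
        if (*pos >= len)
            return -1;
        uint8_t first    = buf[*pos];
        int     isFolder = (first & 0x80) != 0;   /* high bit marks a folder */
        size_t  nameLen  = first & 0x7F;          /* same as subtracting 128 */
        size_t  fixed    = isFolder ? 2 : FILE_REC_FIXED;

        if (len - *pos - 1 < nameLen + fixed)
            return -1;                            /* record is truncated */

        printf("%*s%.*s%s\n", depth * 4, "", (int)nameLen,
               (const char *)(buf + *pos + 1), isFolder ? " (folder)" : "");
        *pos += 1 + nameLen;
        count--;

        if (isFolder) {
            unsigned inside = be16(buf + *pos);   /* records inside this folder */
            *pos += 2;
            if (inside > count)
                return -1;                        /* claims more than remain */
            long n = walk_records(buf, len, pos, inside, depth + 1);
            if (n < 0)
                return -1;
            files += n;
            count -= inside;
        } else {
            *pos += FILE_REC_FIXED;               /* skip the fixed-size fields */
            files++;
        }
    }
    return files;
}
```

To use it, set pos to the offset of the first record (just past the
archive note) and pass the record count from the TOC. Whether that
top-level count is really the total number of records is something I
have assumed, not verified.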