[[[[[[[[[ ARCHIVE FORMATS AND DATA ]]]]]]]]]

(C) 1989 Raymond Clay

Permision to freely distribute and modify is granted so long as credit
is given and all of the text through the disclaimer is retained and
unchanged

October 31, 1989

If you want to submit corrections, changes or source code (in plain
ASCII), you can reach me via:

     CIS       : 74730,1344
     GEnie     : R.CLAY1
     AppleLink : Raymond6
     StarText  : 209287

511 bytes, < 128k DOS 3.3 256 $3 > 129k HFS 524 $5 extended file $D subdirectory

 20     NXCRTIM     HEX  000000      ;file creation time and date
 23     NXCRDAT     HEX  0000000000
 28     NXMDTIM     HEX  000000      ;time and date file last modified
 2B     NXMDDAT     HEX  0000000000
 30     NXARCTIM    HEX  000000      ;time and date file archived
 33     NXARCDAT    HEX  0000000000
  .
  .                                  Any other attributes are added here.
                                     NXATRBCNT points to the NXFNL, which
                                     is always the last attribute.
  .
  .
 NXATRBCNT-2 NXFNL  DW   0000        ;filename length

FILENAME SECTION
----------------

 NXFNL+2 NXFILE     DS   NXFNL       ;Filename, partial pathname or disk
                                     ;volume name. Names ported across
                                     ;systems may have illegal characters
                                     ;or characteristics.

THREAD SECTION
--------------

Thread records are 16-byte records which immediately follow the
filename and describe the types of data structures which are included
with a given record. The number of threads is in the attribute section
under NXNUMTHR. A thread record can be represented as follows:

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     THRCLASS    DW   0000        ;describes the class of the thread
                         0000        CLS_MSG
                         0001        CLS_CNTRL
                         0002        CLS_DATA
                         0003        CLS_SPRSE
 02     THRFRMT     DW   0000        ;format of the data within the thread
                         0000        ;Uncompressed
                         0001        ;SQueezed (SQ/USQ)
                         0002        ;Dynamic LZW [ShrinkIt]
                         0003
                           .
                                     ;RESERVED, contact the author
                         FFFF
 04     THRKIND     DW   0000        ;describes data in thread

        if THRCLASS =     and THRKIND =   THEN THE THREAD CONTAINS:
        ----------------  --------------  ---------------------------
        CLS_MSG           0000            ASCII text
                          all others      undefined
        CLS_CNTRL         0000            create directory
                          all others      undefined
        CLS_DATA          0000            data_fork of file
                          0001            disk image
                          0002            resource_fork of file
                          all others      undefined

 06                 DS   2
 08     THREOF      HEX  00000000    ;length of the uncompressed thread
 0C     THRCMPEOF   HEX  00000000    ;length of the compressed thread

POSITIONING IN FILE
-------------------

Start of the thread list   = (beginning of header) + NXATRBCNT + NXFNL

End of the thread list     = (beginning of header) + NXATRBCNT + NXFNL +
                             (16 * NXNUMTHR)

Start of a data_thread     = (beginning of header) + NXATRBCNT + NXFNL +
                             (16 * NXNUMTHR) + (THRCMPEOF of all threads
                             in the thread list which are not data prior
                             to finding a CLS_DATA = 0000)

Start of a resource_thread = (beginning of header) + NXATRBCNT + NXFNL +
                             (16 * NXNUMTHR) + (THRCMPEOF of all the
                             threads in the thread list which are not
                             resources prior to finding a CLS_DATA = 0002)

Next record                = (beginning of header) + NXATRBCNT + NXFNL +
                             (16 * NXNUMTHR) + (THRCMPEOF of each thread)

**************************************************

PACKIT
======

System of Origin : Macintosh

FILE HEADER
-----------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     PITFLN      HEX  00          ;filename length
 01     PITFNAM     DS   63          ;filename
 40     PITTYP      HEX  00000000    ;file type
 44     PITCRT      HEX  00000000    ;Creator
 48     PITFFLG     DW   0000        ;Finder flags
 4C     PITLOK      DW   0000        ;locked?
 4E     PITDSIZ     HEX  00000000    ;data fork uncompressed size
 52     PITRSIZ     HEX  00000000    ;resource fork uncompressed size
 56     PITCDSIZ    HEX  00000000    ;data fork compressed size
 5A     PITCRSIZ    HEX  00000000    ;resource fork compressed size
 5E     PITCRC      DW   0000        ;CRC

**************************************************

STUFFIT
=======

System of Origin : Macintosh
Original author  : Raymond Lau

FILE FORMAT
-----------

     Master Header
     file header 1
          file 1 resource fork
          file 1 data fork
     file header 2
          file 2 resource fork
          file 2 data fork
       .
       .
     file header n
          file n resource fork
          file n data fork
     EOF

MASTER HEADER
-------------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     SITHSIG     ASC  'SIT!'      ;STUFFIT archive signature
 04     SITHNUM     DW   0000        ;number of files in archive
 06     SITHLEN     HEX  00000000    ;length of entire archive incl hdr
 0A     SITHID2     ASC  'rLau'      ;author's name - R. Lau
 0E     SITHVER     DB   00          ;version number
 0F                 DS   7           ;reserved

FILE HEADER
-----------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     SITFRCMP    DB   00          ;rsrc fork compression method
 01     SITFDCMP    DB   00          ;data fork compression method
 02     SITFNL      DB   00          ;file name length
 03     SITFNAM     DS   $3F         ;filename
 42     SITFTYP     HEX  00000000    ;filetype
 46     SITFCR      HEX  00000000    ;file creator
 4A     SITFFFL     DW   0000        ;Finder flags
 4C     SITFCRD     HEX  00000000    ;creation date
 50     SITFMDD     HEX  00000000    ;modification date
 54     SITFRLN     HEX  00000000    ;uncompressed resource fork length
 58     SITFDLN     HEX  00000000    ;uncompressed data fork length
 5C     SITFCRLN    HEX  00000000    ;compressed resource fork length
 60     SITFCDLN    HEX  00000000    ;compressed data fork length
 64     SITFRCRC    DW   0000        ;resource fork CRC
 66     SITFDCRC    DW   0000        ;data fork CRC
 68     SITFRPAD    DB   00          ;pad bytes for encrypted files,
 69     SITFDPAD    DB   00          ;resource and data forks
 6A                 DS   4           ;reserved
 6E     SITFHCRC    DW   0000        ;CRC of file header

STUFFIT METHODS

NAME        METHOD DESCRIPTION
----------- ------ --------------------------------------------
noComp      0
                   uncompressed
rleComp     1      RLE compression
lzwComp     2      LZW compression, 18k buffer, 14 bit code size
hufComp     3      Huffman compression
encrypted   16     bit set if encrypted. ex: encrypted+lzwComp
startFolder 32     marks start of a new folder
endFolder   33     marks end of the last folder started

POSITIONING IN FILE
-------------------

First File Header          = SITHSIG + $16
Beginning of Resource Fork = SITFRCMP + $70
Beginning of Data Fork     = Beginning of Resource Fork + SITFCRLN
Next File Header           = Beginning of previous Data Fork + SITFCDLN
                        or = Previous File Header + $70 + SITFCRLN +
                             SITFCDLN

**************************************************

ZIP
===

System of Origin : IBM
Original author  : Phil Katz

FILE FORMAT
-----------

Files stored in arbitrary order. Large zipfiles can span multiple
diskette media.

     Local File Header 1
          file 1 extra field
          file 1 comment
          file data 1
     Local File Header 2
          file 2 extra field
          file 2 comment
          file data 2
       .
       .
       .
     Local File Header n
          file n extra field
          file n comment
          file data n
     Central Directory
          central extra field
          central comment
     End of Central Directory
          end comment
     EOF

LOCAL FILE HEADER
-----------------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     ZIPLOCSIG   HEX  04034B50    ;Local File Header Signature
 04     ZIPVER      DW   0000        ;Version needed to extract
 06     ZIPGENFLG   DW   0000        ;General purpose bit flag
 08     ZIPMTHD     DW   0000        ;Compression method
 0A     ZIPTIME     DW   0000        ;Last mod file time (MS-DOS)
 0C     ZIPDATE     DW   0000        ;Last mod file date (MS-DOS)
 0E     ZIPCRC      HEX  00000000    ;CRC-32
 12     ZIPSIZE     HEX  00000000    ;Compressed size
 16     ZIPUNCMP    HEX  00000000    ;Uncompressed size
 1A     ZIPFNLN     DW   0000        ;Filename length
 1C     ZIPXTRALN   DW   0000        ;Extra field length
 1E     ZIPNAME     DS   ZIPFNLN     ;filename
 --     ZIPXTRA     DS   ZIPXTRALN   ;extra field

CENTRAL DIRECTORY STRUCTURE
---------------------------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     ZIPCENSIG   HEX  02014B50    ;Central file header
                                     ;signature
 04     ZIPCVER     DB   00          ;Version made by
 05     ZIPCOS      DB   00          ;Host operating system
 06     ZIPCVXT     DB   00          ;Version needed to extract
 07     ZIPCEXOS    DB   00          ;O/S of version needed for extraction
 08     ZIPCFLG     DW   0000        ;General purpose bit flag
 0A     ZIPCMTHD    DW   0000        ;Compression method
 0C     ZIPCTIM     DW   0000        ;Last mod file time (MS-DOS)
 0E     ZIPCDAT     DW   0000        ;Last mod file date (MS-DOS)
 10     ZIPCCRC     HEX  00000000    ;CRC-32
 14     ZIPCSIZ     HEX  00000000    ;Compressed size
 18     ZIPCUNC     HEX  00000000    ;Uncompressed size
 1C     ZIPCFNL     DW   0000        ;Filename length
 1E     ZIPCXTL     DW   0000        ;Extra field length
 20     ZIPCCML     DW   0000        ;File comment length
 22     ZIPDSK      DW   0000        ;Disk number start
 24     ZIPINT      DW   0000        ;Internal file attributes

        LABEL       BIT       DESCRIPTION
        ----------- --------- -----------------------------------------
        ZIPINT      0         if = 1, file is apparently an ASCII or
                              text file
                    0         if = 0, file apparently contains binary
                              data
                    1-7       unused in version 1.0.

 26     ZIPEXT      HEX  00000000    ;External file attributes, host
                                     ;system dependent
 2A     ZIPOFST     HEX  00000000    ;Relative offset of local header
                                     ;from the start of the first disk
                                     ;on which this file appears
 2E     ZIPCFN      DS   ZIPCFNL     ;Filename or path - should not
                                     ;contain a drive or device letter,
                                     ;or a leading slash.
                                     ;All slashes should be forward
                                     ;slashes '/'
 --     ZIPCXTR     DS   ZIPCXTL     ;extra field
 --     ZIPCOM      DS   ZIPCCML     ;file comment

END OF CENTRAL DIR STRUCTURE
----------------------------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     ZIPESIG     HEX  06054B50    ;End of central dir signature
 04     ZIPEDSK     DW   0000        ;Number of this disk
 06     ZIPECEN     DW   0000        ;Number of disk with start central dir
 08     ZIPENUM     DW   0000        ;Total number of entries in central dir
                                     ;on this disk
 0A     ZIPECENN    DW   0000        ;total number entries in central dir
 0C     ZIPECSZ     HEX  00000000    ;Size of the central directory
 10     ZIPEOFST    HEX  00000000    ;Offset of start of central directory
                                     ;with respect to the starting disk
                                     ;number
 14     ZIPECOML    DW   0000        ;zipfile comment length
 16     ZIPECOM     DS   ZIPECOML    ;zipfile comment

ZIP VALUES LEGEND
-----------------

HOST O/S

VALUE DESCRIPTION                 VALUE DESCRIPTION
----- --------------------------  ----- ------------------------
  0   MS-DOS and OS/2 (FAT)         5   Atari ST
  1   Amiga                         6   OS/2 1.2 extended file sys
  2   VMS                           7   Macintosh
  3   *nix                          8 thru
  4   VM/CMS                      255   unused

GENERAL PURPOSE BIT FLAG

LABEL       BIT       DESCRIPTION
----------- --------- -----------------------------------------
ZIPGENFLG   0         If set, file is encrypted
   or       1         If file Imploded and this bit is set, 8K
ZIPCFLG               sliding dictionary was used. If clear, 4K
                      sliding dictionary was used.
            2         If file Imploded and this bit is set, 3
                      Shannon-Fano trees were used. If clear, 2
                      Shannon-Fano trees were used.
            3-4       unused
            5-7       used internally by ZIP

Note: Bits 1 and 2 are undefined if the compression method is
      other than type 6 (Imploding).
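
As a rough illustration of the END OF CENTRAL DIR STRUCTURE and the
GENERAL PURPOSE BIT FLAG legend, the sketch below unpacks the record
with Python's struct module and interprets flag bits 0-2. It is only a
sketch under stated assumptions: the function names and the returned
dictionary keys are made up for this example, all fields are taken to
be little-endian, and the signature constant 06054B50 is the value
PKZIP writes (the bytes 'P', 'K', 5, 6 on disk).

```python
import struct

# Signature as a little-endian 32-bit value; on disk the bytes are "PK\x05\x06".
EOCD_SIG = 0x06054B50
# ZIPESIG, ZIPEDSK, ZIPECEN, ZIPENUM, ZIPECENN, ZIPECSZ, ZIPEOFST, ZIPECOML
EOCD_FMT = "<IHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FMT)  # fixed part of the record, 22 bytes


def read_end_of_central_dir(data):
    """Find the last end-of-central-dir record in `data` and unpack it.

    Hypothetical helper: searching backwards for the signature is an
    assumption of this sketch, not something the format text prescribes.
    """
    pos = data.rfind(struct.pack("<I", EOCD_SIG))
    if pos < 0 or pos + EOCD_SIZE > len(data):
        raise ValueError("end of central dir record not found")
    (_sig, this_disk, cen_disk, entries_here, entries_total,
     cen_size, cen_offset, comment_len) = struct.unpack_from(EOCD_FMT, data, pos)
    start = pos + EOCD_SIZE
    return {
        "ZIPEDSK": this_disk,
        "ZIPECEN": cen_disk,
        "ZIPENUM": entries_here,
        "ZIPECENN": entries_total,
        "ZIPECSZ": cen_size,
        "ZIPEOFST": cen_offset,
        "ZIPECOM": data[start:start + comment_len],
    }


def decode_general_flag(flag, method):
    """Interpret bits 0-2 of ZIPGENFLG / ZIPCFLG as the legend describes."""
    info = {"encrypted": bool(flag & 0x0001)}
    if method == 6:  # bits 1 and 2 are only defined for Imploded files
        info["dictionary_k"] = 8 if flag & 0x0002 else 4
        info["trees"] = 3 if flag & 0x0004 else 2
    return info
```

A zipfile comment that happens to contain the four signature bytes
would mislead the backwards search, so a real reader would want to
cross-check ZIPECOML against the bytes remaining after the record.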
COMPRESSION METHOD

NAME        METHOD DESCRIPTION
----------- ------ --------------------------------------------
Stored      0      No compression used
Shrunk      1      LZW, 8K buffer, 9-13 bits with partial clearing
Reduced-1   2      Probabilistic compression, L(X) = lower 7 bits
Reduced-2   3      Probabilistic compression, L(X) = lower 6 bits
Reduced-3   4      Probabilistic compression, L(X) = lower 5 bits
Reduced-4   5      Probabilistic compression, L(X) = lower 4 bits
Imploded    6      2 Shannon-Fano trees, 4K sliding dictionary
Imploded    7      3 Shannon-Fano trees, 4K sliding dictionary
Imploded    8      2 Shannon-Fano trees, 8K sliding dictionary
Imploded    9      3 Shannon-Fano trees, 8K sliding dictionary

EXTRA FIELD

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------
 00     EX1ID       DW   0000        ;0-31 reserved by PKWARE
 02     EX1LN       DW   0000
 04     EX1DAT      DS   EX1LN       ;Specific data for individual
  .                                  ;files. Data field should begin
  .                                  ;with a s/w specific unique ID
 EX1LN+4 EXnID      DW   0000
        EXnLN       DW   0000
        EXnDAT      DS   EXnLN       ;entire header may not exceed 64k

**************************************************

ZOO
===

System of Origin : IBM
Original author  : Rahul Dhesi

FILE FORMAT
-----------

     Master Header
     file 1 header
          file 1
     file 2 header
          file 2
       .
       .
     file n header
          file n
     EOF

MASTER HEADER
-------------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00                 DS   20
 14     ZOOSIG      HEX  A7DCFDC4    ;File signature
 18     ZOO1PTR     HEX  00000000    ;pointer to 1st header
 1C     ZOO?        HEX  00000000    ;?
 20     ZOOMVER     DB   00          ;version making archive
 21     ZOOMIN      DB   00          ;minimum version needed to extract

FILE HEADER
-----------

OFFSET  LABEL       TYP  VALUE       DESCRIPTION
------  ----------- ---- ----------- ----------------------------------
 00     ZOOFSIG     HEX  A7DCFDC4    ;signature
 04     ZOOFTYP     DB   00          ;?
 05     ZOOFCMP     DB   00          ;Compression method
 06     ZOOFNXH     HEX  00000000    ;Nxt hdr ofst frm Start of ZOO file
 0A     ZOOFCUR     HEX  00000000    ;Offset of this hdr
 0E     ZOOFDAT     DW   0000        ;Last mod file date (MS-DOS)
 10     ZOOFTIM     DW   0000        ;Last mod file time (MS-DOS)
 12     ZOOFCRC     DW   0000        ;CRC-16
 14     ZOOFOSZ     HEX  00000000    ;Uncompressed size
 18     ZOOFNSZ     HEX  00000000    ;Compressed size
 1C     ZOOFMVER    DB   00          ;version that made this file
 1D     ZOOFMIN     DB   00          ;minimum version needed to extract
 1E     ZOOFDEL     DB   00          ;1 if file deleted from archive
 1F     ZOOFCMTP    HEX  00000000    ;pointer to comment, 0 if none
 23     ZOOFCMTL    DW   0000        ;length of comment
 25     ZOOFNAM     DS   13          ;filename

ZOO METHOD
----------

NAME        METHOD DESCRIPTION
----------- ------ --------------------------------------------
Stored             No compression used
Crunched           Packing, LZW, 4K buffer, var len (9-12 bits)

POSITIONING IN FILE
-------------------

Beginning of 1st File header = Beginning of File + ZOO1PTR
                          or = Beginning of File + $21
Beginning of File Data       = Beginning of File Header + $31
Beginning of Next File       = Beginning of File + ZOOFNXH
Beginning of File Comment    = Beginning of File Header + ZOOFCMTP

**************************************************

TIME VALUES
===========

MS-DOS TIME FORMAT
------------------

LABEL       BIT       DESCRIPTION
----------- --------- -----------------------------------------
DATE        15-9      Year (offset from 1980)
            8-5       Month
            4-0       Day  (all zeroes means no date)
TIME        15-11     Hours (military)
            10-5      Minutes
            4-0       Seconds divided by 2 (2-second resolution)

ProDOS/SOS TIME FORMAT (APPLE)
------------------------------

LABEL       BIT       DESCRIPTION
----------- --------- -----------------------------------------
DATE        15-9      Year (0-99)
            8-5       Month
            4-0       Day
TIME        15-8      Hour (military time)
            7-0       Minutes

**************************************************

EXTENDED FILES
--------------

Extended files are a storage format used by a variety of operating
systems. The filename information points to a file that points to 2
other files known as the DATA FORK and the RESOURCE FORK.
The resource fork contains information about, and/or for the use of,
the data fork. In porting amongst systems the resource fork is probably
of no use. The data fork contains the actual file.

**************************************************

FILENAMES
---------

File name lengths, legal characters and format vary amongst the various
operating systems. MS-DOS allows a wider variety of characters while
the Apple operating systems allow longer names with no set format (no
extensions). Any program must be ready to convert a filename into the
current operating system format as well as handle any paths (either by
creating or ignoring them).

A suggestion would be to use a more universal standard for the
filenames of files that are likely to be ported (e.g. Text, ASCII
Source, GIF, ASCII data files, etc.) while making no special effort
with executable code (including tokenized BASIC) filenames. Such a
standard might be:

     a filename of no fewer than 6 and no more than 13 characters
     legal characters A-Z (all uppercase), the '.' and 0-9
     a period to be used only once in a filename, to make an MS-DOS
          type extension (e.g. .TXT, .DOC, etc.)
     the filename MUST start with an alphabetic character (A-Z)

**********************************************************************

Information taken from files by Alex Bamdad, Rahul Dhesi, Jim Dorsey,
Don Elton, Colin James, Phil Katz, Raymond Lau, Gary Little, Andrew
Nicholas, Haruhiko Okumura, Martin Peckham, Mike Sax, Tim Swihart and
probably others.
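
The filename standard suggested under FILENAMES could be checked with
something like the Python sketch below. The function name is
hypothetical, and several readings are assumptions of this example
rather than anything the text states: the length limits are taken as
6 to 13 characters inclusive with the period counted, the period is
optional, and a name may not end in a bare period.

```python
import re

# One leading letter, then letters or digits, with at most one period
# that introduces a non-empty extension (assumed reading of the rules).
_PORTABLE_NAME = re.compile(r"[A-Z][A-Z0-9]*(?:\.[A-Z0-9]+)?")


def is_portable_name(name):
    """Return True if `name` fits the suggested cross-system standard."""
    if not 6 <= len(name) <= 13:
        return False
    return _PORTABLE_NAME.fullmatch(name) is not None
```

Under these assumptions README.TXT and NOTES1 pass, while readme.txt
(lowercase), A.B (too short) and ARCHIVE.TAR.Z (two periods) do not.
The length of the extension itself is not limited here, since the
suggestion does not give one.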