See also: [Amyc 1.21 - Now!] - Freeware Netscape Cache Program
I have been playing with Microsoft Windows internet browsers
for several years. I think I started with Netscape's Personal
Edition (version 1.1). They have improved since then, but
I still like getting at the cached files directly from time
to time. This document tells you what I know about doing this.
I have stopped actively working on this as IE finally gave me
enough access to the cache through the history command that
I didn't need to dig it out by myself.
I have a couple DOS based programs written in C that do some
of this so if anyone is interested get in touch per above,
or use snail mail to:
W. Kranz, P.O. Box 333, Bradford, NH 03221
I have no affiliation with Microsoft or Netscape. What is
presented here was learned by inspection. Use this information
at your own risk! I am not responsible for any damage to your
system or loss of data that may result.
My Netscape experience isn't as extensive as the Microsoft. I
tend to use Netscape on my Win 3.1 systems. However the
information at the end of this file on Netscape's file naming
conventions may be useful to all.
Everyone seems to change the structure of their index file with
each new release! The comments below are arranged by decreasing
version number. Note that I assume Win95 for the Win32 systems below,
which is what I was using. If you have NT installed you probably
need to look in \WinNT rather than \Windows. I assume a C
structure definition style makes sense to the reader.
The following versions are discussed:
Microsoft Internet Explorer 4.0 - Win32
Microsoft Internet Explorer 3.0 - Win32
Microsoft Internet Explorer 2.0 - Win32
Microsoft Internet Explorer 3.0A - Win 3.1 (16 bit)
Netscape Navigator 3.01 - Win 3.1 (16 bit)
Netscape Personal Edition 1.1 - Win 3.1 (16 bit)
----- Microsoft Internet Explorer 4.0 - Win32
As an initial disclaimer, I have no idea about the purpose of
approximately 30% of the information in the file. However I do
know how to step through it and find the file's URL, time stamp,
and location in cache. Under Windows 95 and 98 there is a hidden
subdirectory which contains the following file:
\WINDOWS\Temporary Internet Files\index.dat
The basic layout of this file is as follows:
#pragma pack(1) //probably don't need this, but...
struct ie40head {
char id[0x20];
unsigned long info[11];
};
struct dir_info {
unsigned long data;
char name[8];
}; // dirs[4]; this is what I have seen, may be variable
These are followed by a data region whose purpose is unknown,
typically extending to offset 0x4000 in the file. The remainder of
the file is broken up into data blocks of three types (that's all I've
seen anyway!). These will be described below, but let me back up
and give a little more information on the header block. Below
is a dump of one of my headers:
0000: 43 6C 69 65 6E 74 20 55 72 6C 43 61 63 68 65 20 |Client UrlCache
0010: 4D 4D 46 20 56 65 72 20 34 2E 37 00 00 00 0D 00 |MMF Ver 4.7.....
0020: 00 40 00 00 80 19 00 00 53 16 00 00 00 5B 72 A9 |.@......S....[r.
0030: 00 D8 FF 03 00 00 00 00 00 80 A6 03 00 00 00 00 |................
0040: 00 00 00 00 00 00 00 00 04 00 00 00 E4 01 00 00 |................
0050: 4F 50 42 55 38 44 47 53 E2 01 00 00 58 49 58 55 |OPBU8DGS....XIXU
0060: 4C 56 4F 45 E1 01 00 00 53 57 36 36 59 35 4C 44 |LVOE....SW66Y5LD
0070: E3 01 00 00 52 44 30 41 56 50 34 4C 00 00 00 00 |....RD0AVP4L....
In the files I have looked at info[0] (at offset 0x20) has always
been 0x4000, which I understand to be the offset to the beginning of
the data region. info[10] (at offset 0x48) has always
been 4, which is the number of struct dir_info, dirs[], which
follow the struct ie40head. Each dir_info appears to contain a
long and 8 bytes of character data which represent the name
of a hidden sub directory located below index.dat in the
directory tree. Strange to hide these, oh well. Note these
aren't strings in that they have no NUL terminator. To parse the
file you want to save these names in an array as they are
referenced by index in the URL blocks. The purpose of the long
associated with each cache directory name is also unknown, on my
system it always seems to be a little less than the number of
files currently in the associated cache directory! The state of
these directories at the time of the dump above is shown below:
Directory of C:\WINDOWS\Temporary Internet Files\OPBU8DGS
580 file(s) 2,811,216 bytes
Directory of C:\WINDOWS\Temporary Internet Files\XIXULVOE
582 file(s) 3,369,986 bytes
Directory of C:\WINDOWS\Temporary Internet Files\SW66Y5LD
590 file(s) 2,896,821 bytes
Directory of C:\WINDOWS\Temporary Internet Files\RD0AVP4L
576 file(s) 2,444,706 bytes
Note that these sub directory names are different on the couple
of installations I have examined.
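The header layout described above can be sketched in C as follows. This is a minimal sketch, not a definitive parser: it assumes the whole header has already been read into memory, that values are little-endian as in the dump, and that the sub directory count sits at offset 0x48 with the first dir_info at 0x4C. The names ie4_dirs, ie4_parse_header, rd32 and the cap IE4_MAX_DIRS are invented for this example.

```c
#include <stdint.h>
#include <string.h>

#define IE4_MAX_DIRS 16   /* arbitrary cap chosen for this sketch */

struct ie4_dirs {
    uint32_t data_start;              /* info[0], 0x4000 in the files examined */
    uint32_t count;                   /* long at offset 0x48 */
    uint32_t value[IE4_MAX_DIRS];     /* unknown long stored with each name */
    char     name[IE4_MAX_DIRS][9];   /* 8 chars plus a NUL added here */
};

/* Read a little-endian 32 bit value from a byte buffer. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer does not look like the header. */
static int ie4_parse_header(const unsigned char *buf, size_t len,
                            struct ie4_dirs *out)
{
    size_t i, pos = 0x4C;             /* first dir_info follows the 11 longs */

    if (len < 0x4C || memcmp(buf, "Client UrlCache", 15) != 0)
        return -1;
    out->data_start = rd32(buf + 0x20);
    out->count = rd32(buf + 0x48);
    if (out->count > IE4_MAX_DIRS || len < 0x4C + 12 * (size_t)out->count)
        return -1;
    for (i = 0; i < out->count; i++, pos += 12) {
        out->value[i] = rd32(buf + pos);
        memcpy(out->name[i], buf + pos + 4, 8);   /* no NUL in the file */
        out->name[i][8] = '\0';
    }
    return 0;
}
```

Reading the first 0x80 bytes of index.dat into a buffer and passing it to ie4_parse_header() should be enough for a four directory header like the one dumped above. The names are copied into NUL terminated strings so they can be indexed later from the URL records.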
The bulk of the file, everything after offset = info[0], can be
viewed as blocks of 0x80 bytes. Seek to this offset and read a
data block. Examine the first two longs in the data block. The
first is a block identifier with one of the following values:
// a long that has same pattern as four ascii chars
#define HASH_ID 0x48534148L // "HASH"
#define REDR_ID 0x52444552L // "REDR"
#define URL_ID 0x204C5255L // "URL "
The 2nd long in this block is the total number of blocks
associated with the identifier. Basically this is a variable
length record system (see the notes on Version 3.0 for a
comparison of what they did previously). My experience is:
HASH_ID records are 32 blocks long.
REDR_ID records are normally 1 and sometimes 2 blocks long.
URL_ID records are often 2 and sometimes 3 blocks long.
I have no knowledge of what HASH_ID records are used for. One
can assume it is how IExplorer does a fast search to determine if
a particular URL is already in the cache, but I just skip over
them.
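Stepping through the variable length records can be sketched as below. This is only an illustration under stated assumptions: the file contents from info[0] onward are in memory, values are little-endian, and any 0x80 byte block that does not start with one of the three known identifiers is simply skipped one block at a time (that skipping rule is a guess, not something observed in the file). The names ie4_walk, ie4_record_fn and rd32 are invented for this example.

```c
#include <stdint.h>
#include <stddef.h>

#define IE4_BLOCK 0x80u
#define HASH_ID 0x48534148UL  /* "HASH" */
#define REDR_ID 0x52444552UL  /* "REDR" */
#define URL_ID  0x204C5255UL  /* "URL " */

/* Read a little-endian 32 bit value from a byte buffer. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical callback: record id, start of record, length in bytes. */
typedef void (*ie4_record_fn)(uint32_t id, const unsigned char *rec,
                              size_t rec_len, void *ctx);

/* Walk records from offset start. fn may be NULL to just count.
   Returns the number of records recognized. */
static size_t ie4_walk(const unsigned char *buf, size_t len, size_t start,
                       ie4_record_fn fn, void *ctx)
{
    size_t pos = start, found = 0;

    while (pos + IE4_BLOCK <= len) {
        uint32_t id = rd32(buf + pos);          /* 1st long: identifier */
        uint32_t nblocks = rd32(buf + pos + 4); /* 2nd long: block count */
        size_t rec_len;

        if ((id != HASH_ID && id != REDR_ID && id != URL_ID) ||
            nblocks == 0 || nblocks > (len - pos) / IE4_BLOCK) {
            pos += IE4_BLOCK;   /* assumption: unknown block, try the next */
            continue;
        }
        rec_len = (size_t)nblocks * IE4_BLOCK;
        if (fn != NULL)
            fn(id, buf + pos, rec_len, ctx);
        found++;
        pos += rec_len;         /* jump over the whole record */
    }
    return found;
}
```

A callback would typically ignore HASH_ID records and look only at REDR_ID and URL_ID records.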
I'm not sure what a REDR_ID record is for, but observe they tend
to be things like search requests where the server generated the
data. If you use the IExplorer history and click on one of these
while off-line it will force you on-line to regenerate the data.
There seems to be no corresponding file in the cache sub-
directories. The full URL appears to be a NUL terminated string
which starts at offset 0x10 in the record. A dump of a sample
record is shown below:
7C80: 52 45 44 52 01 00 00 00 60 71 04 00 00 4B A0 37 |REDR....`q...K.7
7C90: 68 74 74 70 3A 2F 2F 69 6E 74 65 6C 2E 6E 67 61 |http://intel.nga
7CA0: 64 63 65 6E 74 65 72 2E 6E 65 74 2F 69 6D 61 67 |dcenter.net/imag
7CB0: 65 2E 6E 67 2F 73 70 61 63 65 64 65 73 63 3D 73 |e.ng/spacedesc=s
7CC0: 65 61 72 63 68 26 6B 65 79 77 6F 72 64 3D 50 6E |earch&keyword=Pn
7CD0: 50 26 74 72 61 6E 73 61 63 74 69 6F 6E 49 44 3D |P&transactionID=
7CE0: 39 31 35 34 36 37 30 30 35 35 36 30 00 F0 AD 0B |915467005560....
7CF0: 0D F0 AD 0B 0D F0 AD 0B 0D F0 AD 0B 0D F0 AD 0B |................
The URL_ID record is the one that interests me as these are the
ones that are stored in the cache sub directories. The URL
appears to be a NUL terminated string which starts at offset 0x68
in the record. I view the data ahead of this as an array of
longs, the table below indicates the purpose of those I think I have
identified:
Array Index Purpose
0 Block identifier
1 record length in 0x80 byte blocks
2 & 3 QUADWORD - probably time file was last modified
or 0 if not known, always less than 4 & 5
4 & 5 QUADWORD - Win32 time file was cached
13 offset to end of data in record
14 offset to start of URL string (always 0x68)
15 index of cache sub directory, dir[i]
EXCEPT if = 0xFF, see below
16 offset to file name string in cache dir[i]
18 offset to content data
A dump of a sample record is shown below:
BC00: 55 52 4C 20 03 00 00 00 00 00 00 00 00 00 00 00 |URL ............
BC10: 20 54 A0 CA 32 2E BE 01 00 00 00 00 00 00 00 00 | T..2...........
BC20: 86 78 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |.x..............
BC30: 00 00 00 00 0C 01 00 00 68 00 00 00 01 00 00 00 |........h.......
BC40: 8C 00 00 00 41 00 00 00 98 00 00 00 70 00 00 00 |....A.......p...
BC50: 00 00 00 00 97 25 84 29 01 00 00 00 00 00 00 00 |.....%.)........
BC60: 97 25 82 29 0D F0 AD 0B 68 74 74 70 3A 2F 2F 77 |.%.)....http://w
BC70: 77 77 2E 63 6F 6D 70 75 73 65 72 76 65 2E 63 6F |ww.compuserve.co
BC80: 6D 2F 67 61 74 65 77 61 79 2F 00 0B 67 61 74 65 |m/gateway/..gate
BC90: 77 61 79 2E 68 74 6D 00 48 54 54 50 2F 31 2E 31 |way.htm.HTTP/1.1
BCA0: 20 32 30 30 20 4F 4B 0D 0A 43 6F 6E 74 65 6E 74 | 200 OK..Content
BCB0: 2D 4C 65 6E 67 74 68 3A 20 33 30 38 35 34 0D 0A |-Length: 30854..
BCC0: 43 6F 6E 74 65 6E 74 2D 54 79 70 65 3A 20 74 65 |Content-Type: te
BCD0: 78 74 2F 68 74 6D 6C 0D 0A 43 61 63 68 65 2D 63 |xt/html..Cache-c
BCE0: 6F 6E 74 72 6F 6C 3A 20 70 72 69 76 61 74 65 0D |ontrol: private.
BCF0: 0A 0D 0A 7E 55 3A 76 61 6C 75 65 64 20 63 75 73 |...~U:valued cus
BD00: 74 6F 6D 65 72 0D 0A 00 0D F0 AD 0B 0D F0 AD 0B |tomer...........
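The QUADWORD times at indices 2 & 3 and 4 & 5 can be converted to seconds since 1970 with a few lines of C. This sketch assumes the QUADWORD is a standard Win32 FILETIME, that is a count of 100 nanosecond ticks since 1601-01-01, stored low long first. The function name quad_to_unix is invented for this example.

```c
#include <stdint.h>

/* Seconds between 1601-01-01 and 1970-01-01. */
#define FILETIME_EPOCH_DIFF 11644473600LL

/* lo and hi are the two longs in file order (low long first).
   Returns seconds since 1970, or -1 when the QUADWORD is zero. */
static int64_t quad_to_unix(uint32_t lo, uint32_t hi)
{
    uint64_t ticks = ((uint64_t)hi << 32) | lo;

    if (ticks == 0)
        return -1;                    /* "0 if not known" per the table */
    return (int64_t)(ticks / 10000000ULL) - FILETIME_EPOCH_DIFF;
}
```

For the sample record above, the longs at indices 4 and 5 are 0xCAA05420 and 0x01BE2E32, so the call would be quad_to_unix(0xCAA05420, 0x01BE2E32). The result can be handed to the usual C time functions after a cast to time_t.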
I have not spent a lot of time looking at the content data.
It's not clear it's NUL terminated, but the end of data offset at
index 13 makes a good terminator. It seems to be made up of
sub records terminated by 0xD,0xA. It always seems to include
"Content-Length" and "Content-Type", but also some pretty
interesting stuff as per "Cache-control" above. It may be
terminated by a double 0xD,0xA pair?
Given the values of the longs at index 15 & 16 in the block,
{0x1,0x8c}, one can generate the cache file used:
XIXULVOE\gateway.htm
I.e. it's a zero based index system where 1 represents the 2nd
directory after the file header block.
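Turning a URL record into a cache file path can be sketched as follows. The sketch assumes the record is in memory, values are little-endian, the sub directory names from the header were saved as NUL terminated strings, and the file name string is NUL terminated inside the record. The name ie4_cache_path and its return codes are invented for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read a little-endian 32 bit value from a byte buffer. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Build "DIR\file" for a URL record. dirs[] holds the sub directory
   names saved from the header. Returns 0 on success, 1 if the record
   has no cache file (index 0xFF), -1 on bad data. */
static int ie4_cache_path(const unsigned char *rec, size_t rec_len,
                          char dirs[][9], size_t ndirs,
                          char *out, size_t out_len)
{
    uint32_t dir_idx, name_off;
    int n;

    if (rec_len < 0x68 || memcmp(rec, "URL ", 4) != 0)
        return -1;
    dir_idx  = rd32(rec + 15 * 4);    /* long at index 15 */
    name_off = rd32(rec + 16 * 4);    /* long at index 16 */
    if (dir_idx == 0xFF)
        return 1;                     /* special flag, no file in cache */
    if (dir_idx >= ndirs || name_off >= rec_len ||
        memchr(rec + name_off, '\0', rec_len - name_off) == NULL)
        return -1;
    n = snprintf(out, out_len, "%s\\%s", dirs[dir_idx],
                 (const char *)(rec + name_off));
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With the sample record and the directory list from the header dump, this should produce XIXULVOE\gateway.htm. The URL itself can be read the same way from the offset stored at index 14.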
Except I find the initial URL entries on my system have a value
of 0xFF for the long at index 15. This seems to be a special flag.
There is no associated file in the cache and the URL has a
*.cdf extension. These look to be the Channel Definition Files,
but I haven't played with them....
If these records are sorted, I don't know the algorithm. They do
not seem to be ordered by access time!
You have to be careful about file sharing when accessing index.dat.
I did my test in a DOS window. One can copy the file to
something else and access it directly, or use sopen() for
shared access.
As a point of interest, there is also a hidden directory
\Windows\Favorites\
This contains *.url files where the file name is the name you add
to your favorites list and the URL is embedded in the file. It's
a text file, you can look at it with any editor.
----- Microsoft Internet Explorer 3.0 - Win32
You should find the following directory, I can't remember if
it is normally visible (maybe not).
\WINDOWS\Temporary Internet Files\
Beneath it are four hidden sub directories:
cache1, cache2, cache3, and cache4
In turn each of these contains two master directory lists in
addition to approximately 1/4 of the cached files. The master
list contains a header block followed by fixed length records.
list file name record length (bytes)
mm256.dat 0x100
mm2048.dat 0x800
I'm afraid I don't have IExplore 3.0 installed anymore, and
have lost my initial debug dumps. However the following
structure worked as the initial header. It contained a
descriptive string identifier for the file in the 1st 28 bytes, followed
by some longs. You really only need other[3] starting at offset
0x28 in the record. This is the number of valid entries in the
file. I suspect the system allocates larger blocks for io,
other[2] looks like the number of blocks available in the file >=
other[3].
struct cache_head {
char desc[28]; // this works but might be var length str?
unsigned long other[12];
};
The data records for both files start at offset 0x400 in file,
so after determining the total number of entries in the file,
tcnt=other[3], and the record size from the file name you can read
the fixed length records. It looks like the system generates data
records, and adds them to mm256.dat if their length is less than
0x100 bytes. Otherwise they go into mm2048.dat.
The first 0x68 bytes of each record can be treated as an array of
longs which describe the record as follows (I only indicate those
I think I know, index 0 is the 1st long in the record):
Array Index Purpose
0 length of record in bytes <= block size
5 offset to start of URL string
6 offset to file name string (in this sub directory)
10 & 11 QUADWORD time, probably last modified time
12 & 13 QUADWORD time, Win32 time file written to cache
14 & 15 maybe a QUADWORD time (but what??)
17 offset to start of content data
This is a series of descriptive strings each
terminated by cr/lf. Two cr/lf terminate region.
Almost always contains "Content-Length" and
"Content-Type" but do a case insensitive compare!
18 length of content data region above
19 if non-zero offset to extra data region
20 extra data region length (if it exists)
21 if non-zero offset to a type string?
sometimes its 1, others an offset to an
extension string (esp if was an ftp transfer)
I was able to generate a master list by reading all record
entries in all four cache sub directories into memory, sorting
the records on file time (to make it pretty), then outputting
the results.
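Reading the fixed length records from one master list can be sketched as below. Several things here are assumptions rather than facts from the files: that the whole file is in memory, that values are little-endian, that the string offsets are relative to the start of each record, and that the URL and file name strings are NUL terminated. The names ie3_entry, ie3_read_entries and rd32 are invented for this example.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define IE3_DATA_START 0x400u   /* records start here in both files */

/* Read a little-endian 32 bit value from a byte buffer. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct ie3_entry {
    const char *url;    /* points into the caller's buffer */
    const char *file;   /* file name in the same cache sub directory */
};

/* rec_size is 0x100 for mm256.dat and 0x800 for mm2048.dat.
   Returns the number of entries stored in out[], at most max. */
static size_t ie3_read_entries(const unsigned char *buf, size_t len,
                               size_t rec_size,
                               struct ie3_entry *out, size_t max)
{
    size_t i, n = 0, tcnt;

    if (len < IE3_DATA_START || rec_size < 0x68)
        return 0;
    tcnt = rd32(buf + 0x28);                 /* other[3], valid entries */
    for (i = 0; i < tcnt && n < max; i++) {
        size_t pos = IE3_DATA_START + i * rec_size;
        uint32_t rlen, url_off, file_off;

        if (pos + rec_size > len)
            break;                           /* file shorter than claimed */
        rlen     = rd32(buf + pos);          /* index 0: record length */
        url_off  = rd32(buf + pos + 5 * 4);  /* index 5 */
        file_off = rd32(buf + pos + 6 * 4);  /* index 6 */
        if (rlen > rec_size || url_off >= rlen || file_off >= rlen)
            continue;                        /* skip implausible records */
        if (!memchr(buf + pos + url_off, 0, rlen - url_off) ||
            !memchr(buf + pos + file_off, 0, rlen - file_off))
            continue;                        /* strings must end in NUL */
        out[n].url  = (const char *)(buf + pos + url_off);
        out[n].file = (const char *)(buf + pos + file_off);
        n++;
    }
    return n;
}
```

Calling this for mm256.dat and mm2048.dat in each of the four cache sub directories, then sorting the collected entries, would give a master list of the kind described above.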
The favorites listing is handled as in IExplorer 4.0 per above,
see the directory:
\Windows\Favorites\
----- Microsoft Internet Explorer 2.0 - Win32
By default this had a cache sub directory below where ever you
installed the software. You could change the location of the
cache directory through the options menu. There was a text file,
iexplore.cif, which was the master list in this directory along
with all the cached files. This format is similar to the early
Netscape cache file format. On my system the cache directory
was: \progra~1\plus!\micro~1\cache\
There was also a hidden text file:
\progra~1\plus!\micro~1\history\globhist.htm
It contained a list of files last visited by url shortcut. This
seems redundant, but must have had a purpose.
A sample dump of the first few lines in cache\iexplore.cif
follows (for ref see Will's iexcache.c):
V,Microsoft® Windows(TM) Internet Tools,71303476
F,http://www.home.msn.com/msn.htm,1123,2092492075,8,4294967295,0,2092552558,0,text/html,msn.htm,0
F,http://www.home.msn.com/images/msn-side.gif,290,2092483178,6,4294967295,0,2092552558,0,image/gif,msn-side.gif,0
... ended with following:
F,http://www.home.msn.com/images/bar.gif,6819,2092483178,2,4294967295,0,2092552585,0,image/gif,bar.gif,0
F,http://www.home.msn.com/maps/maps.htm,652,2092525763,32,4294967295,0,2092552585,0,text/html,maps.htm,0
The first line is an identifier. Each additional comma delimited line
describes a file in the cache directory.
Each line has the following entries:
Entry Purpose
1st flag either 'F' or 'V' purpose unknown
2nd URL for file
3rd - 9th treat as array of longs, nums[7]
3rd file length in bytes
4th time, probably last modified
8th time file written to disk
10th data content string ie file type
11th file name in cache directory
12th unknown, always 0?
Note I never understood the times above. I couldn't get them
to convert exactly to DOS 16 bit time. One assumes
it is some representation of a QUADWORD Win32 time, but I
didn't get it. The following works fairly well but I'm
missing something.
let d = dos time in seconds since 1970
w = unsigned long from 4th or 8th entry above
d = (w - 2063317606L)/ 0.036;
The entries are in order of the 8th entry, ie the file time.
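Splitting a line of iexplore.cif and applying the approximate time conversion can be sketched as below. The splitter assumes that no field, the URL included, contains a comma, which may not hold for every URL. The function names cif_split and cif_time_to_unix are invented for this example, and the conversion is only the rough formula given above, not an exact one.

```c
#include <string.h>

#define CIF_FIELDS 12

/* Split one line of iexplore.cif in place, replacing commas with NULs.
   Returns the number of fields found (at most CIF_FIELDS). */
static int cif_split(char *line, char *field[CIF_FIELDS])
{
    int n = 0;
    char *p = line;

    line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
    while (n < CIF_FIELDS) {
        field[n++] = p;
        p = strchr(p, ',');
        if (p == NULL)
            break;
        *p++ = '\0';
    }
    return n;
}

/* The approximate conversion from the text: w is the 4th or 8th entry,
   the result is an estimate of seconds since 1970. */
static double cif_time_to_unix(unsigned long w)
{
    return ((double)w - 2063317606.0) / 0.036;
}
```

After cif_split(), field[1] is the URL, field[2] the file length, field[3] and field[7] the two times, field[9] the content type and field[10] the file name in the cache directory. Lines starting with 'V' have fewer fields, so check the returned count before indexing.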
----- Microsoft Internet Explorer 3.0A - Win 3.1 (16 bit)
I believe this was the last version of IExplorer for Win31. It
was a port of the Win32 version 3.0 which allowed secure
transactions over the internet with Win31. It was similar to the
Win32 version, but the cache master list file was main.idx, and
did not have an initial descriptive identification line. This is
a copy of the first two lines (ref netcache.c):
F http://darien.and.newcanaan.com/ 1020 857525592 0 858482469 text/html F:\IEXPLORE\Cache\iea41.htm 2 0
F http://darien.and.newcanaan.com/frconten.htm 2525 856113190 0 858482493 text/html F:\IEXPLORE\Cache\iea26500.htm 2 0
Note the lines were tab delimited. The order of entries is
slightly different, and DOS 16 bit times are used (seconds since
1970).
Entry Purpose
1st flag either 'F' or 'V' purpose unknown
2nd URL for file
3rd file length in bytes
4th time, probably last modified
5th unknown
6th time file written to disk
7th data content string ie file type
8th full path of file in cache directory
9th unknown
10th unknown, always 0?
The entries are in order of the 6th entry, ie the file time.
All cached files were in this directory as in Win32, IExplorer 2.0.
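A line of main.idx can be picked apart with sscanf(), as in the sketch below. It assumes the fields really are separated by tab characters, that no field is empty, and that the URL, type and path fit the buffer sizes chosen here (512, 64 and 260 bytes are arbitrary). The names idx_entry and idx_parse are invented for this example, and the 9th and 10th entries are not read since their purpose is unknown.

```c
#include <stdio.h>

struct idx_entry {
    char flag;                  /* 1st entry, 'F' or 'V' */
    char url[512];              /* 2nd entry */
    unsigned long length;       /* 3rd entry, file length in bytes */
    unsigned long modified;     /* 4th entry, seconds since 1970 */
    unsigned long unknown5;     /* 5th entry, purpose unknown */
    unsigned long written;      /* 6th entry, seconds since 1970 */
    char type[64];              /* 7th entry */
    char path[260];             /* 8th entry, full path in cache */
};

/* Returns 0 if the first eight entries were read, -1 otherwise. */
static int idx_parse(const char *line, struct idx_entry *e)
{
    int n = sscanf(line,
                   "%c\t%511[^\t]\t%lu\t%lu\t%lu\t%lu\t%63[^\t]\t%259[^\t\r\n]",
                   &e->flag, e->url, &e->length, &e->modified,
                   &e->unknown5, &e->written, e->type, e->path);
    return n == 8 ? 0 : -1;
}
```

The times come out directly as seconds since 1970, so no conversion of the kind needed for iexplore.cif is required here.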
----- Netscape Navigator 3.01 - Win 3.1 (16 bit)
Win 3.1 Netscape 3.01 seems to have gone to a binary format
with the master list being FAT.DB. I haven't done as much
validation with this format and may be missing something.
The general outline below seems correct.
The file is broken up into blocks of 0x1000 bytes each.
I don't know what is in the 1st block, and ignore it!
One can read the remaining blocks in order and obtain
a list of the cached files with the associated URL.
Each block seems self contained.
One starts by treating the beginning of the block as an
array of two byte integers (shorts). The first integer is the
number of valid entries in the array, n. If one accesses the
rest of the entries in this array from entry[n] to entry[1]
you obtain the offsets to the file records in this block.
Each offset is with respect to the start of the block.
A sample dump of my first data block at 0x1000 follows:
1000: 0A 00 B9 0F 08 0F B6 0E FA 0D A7 0D EA 0C A2 0C |................
1010: F0 0B B0 0B 06 0B EC 0A 06 0B 9D 09 E0 08 96 08 |................
1020: D4 07 8D 07 CE 06 8E 06 E4 05 B6 05 E4 05 EE 04 |................
1030: 3C 04 F2 03 3E 03 EF 02 36 02 E2 01 24 01 02 08 |<...>...6...$...
1040: 00 00 DE 00 24 01 00 00 00 00 00 00 00 00 00 00 |....$...........
1050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................
There are 10 (0xA) offsets in the entry array from:
entry[0xA] = 0xB06 at file offset 0x1014 (FILE_INFO)
to
entry[1] = 0xFB9 at file offset 0x1002 (URL)
Per discussion below I think the number of entries will
always be even as the records occur in pairs. Note there
is also additional data in this region beyond entry[0xA]
which points to older unused/invalid records.
Stepping through the data in the block using the offsets
in entry[] in the order shown gives the cached files in
ascending order of the time they were written to disk.
Note these offsets are to portions of the record, typically
the information for a file is represented by a pair of
offsets. The first is to the FILE_INFO record, the second
is the URL record described below.
There are a couple tricks here, and this is the area where I
may be a little off! My file had one block that was almost
all 0xff bytes. The value of the 1st integer word in this
block is 0xfffb. Since I'm treating this as an integer, it's
negative. Even if it were unsigned it is much bigger than the
size of the block so clearly invalid. I ignore such a block.
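Reading the offset table at the start of a block can be sketched as below. The sketch assumes one 0x1000 byte block is in memory and that values are little-endian. It treats the count as a signed 16 bit value so a block like the 0xfffb one is rejected, walks from entry[n] down to entry[1] in pairs, and skips any pair containing a zero or out of range offset. The names ns_block_pairs and rd16 are invented for this example.

```c
#include <stdint.h>
#include <stddef.h>

#define NS_BLOCK 0x1000u

/* Read a little-endian 16 bit value from a byte buffer. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Collect (FILE_INFO, URL) offset pairs from one 0x1000 byte block.
   Offsets are relative to the start of the block. Returns the number
   of pairs stored (at most max), or 0 if the block looks invalid. */
static size_t ns_block_pairs(const unsigned char *blk,
                             uint16_t info_off[], uint16_t url_off[],
                             size_t max)
{
    int n = (int16_t)rd16(blk);       /* entry[0], treated as signed */
    size_t pairs = 0;
    int i;

    if (n <= 0 || (size_t)(n + 1) * 2 > NS_BLOCK)
        return 0;                     /* e.g. the 0xfffb block */
    for (i = n; i >= 2 && pairs < max; i -= 2) {
        uint16_t fi = rd16(blk + 2 * i);        /* entry[i]: FILE_INFO */
        uint16_t ur = rd16(blk + 2 * (i - 1));  /* entry[i-1]: URL */

        if (fi == 0 || ur == 0 || fi >= NS_BLOCK || ur >= NS_BLOCK)
            continue;                 /* skip clearly invalid pairs */
        info_off[pairs] = fi;
        url_off[pairs] = ur;
        pairs++;
    }
    return pairs;
}
```

For the sample block at 0x1000 this should yield five pairs, the first being FILE_INFO at 0xB06 with URL at 0xBB0 and the last FILE_INFO at 0xF08 with URL at 0xFB9. Add the file offset of the block to get absolute positions.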
There are also some oddball records which I dump below,
note these occur as pairs of records just as the FILE_INFO
and URL records:
2A96: 4E 65 74 73 63 61 70 65 20 49 6E 74 65 72 6E 61 |Netscape Interna
2AA6: 6C 20 44 69 73 6B 20 43 61 63 68 65 00 49 4E 54 |l Disk Cache.INT
2AB6: 5F 45 78 74 65 72 6E 61 6C 43 61 63 68 65 4E 61 |_ExternalCacheNa
2AC6: 6D 65 53 74 72 69 6E 67 |meString
apparently a string and an associated identifier
located by pair of entries in data region at 0x201A: {B3 0A 96 0A}
next record starts at 0x2ACE
46BC: 44 00 00 00 49 4E 54 5F 44 69 73 6B 43 61 63 68 |D...INT_DiskCach
46CC: 65 4E 75 6D 62 65 72 |eNumber
apparently a long and an associated identifier
located by pair of entries in data area at 0x402A: {C0 06 BC 06}
next record starts at 0x46D3
Note there are 68 files in this cache directory, ie 0x44
7762: 95 62 06 00 49 4E 54 5F 44 69 73 6B 43 61 63 68 |.b..INT_DiskCach
7772: 65 53 69 7A 65 |eSize
apparently a long and an associated identifier
located by pair of entries in data area at 0x7026: {66 07 62 07}
next record begins at 0x7777
note the file length of this fat.db is 0x9000
this cache directory uses 0x334ED bytes for files
Not clear what the CacheSize would be...
In my preliminary investigation I also found one case where
the first entry[i] in a pair of entries was zero. This is
clearly invalid, the second offset didn't point to anything
which appeared to be meaningful. Don't know how common
this is, but I'd skip cases where entry[] = 0.
Below is a typical record for a file. FAT.DB actually treats
this as two records which I call FILE_INFO and URL, but
the two are logically associated pairs with the URL record
following the FILE_INFO.
In the sample dump below, A FILE_INFO record starts at
0x1F08. The 1st long is flen = 0xB1.
This is the total record length, or the offset to the
URL record at 0x1FB9 with a URL string at 0x1FC1.
I have yet to see a case where the FILE_INFO and URL
were not contiguous.
The pair of offsets in the data region 1002: {B9 0F 08 0F}
point to the sample record dump below. They are:
entry[1] = 0xFB9 => URL at 0x1FB9
entry[2] = 0xF08 => FILE_INFO at 0x1F08
The record has a fixed structure at the beginning, can treat as series
of longs
FILE_INFO record
long flen, // record length, distance to URL record
unknw1, // always 3L, I use as test for FILE_INFO record
time1, // probably last modified on net
time2, // file closed after written to disk
unknw2, // always seems to be zero
length, // of file on disk
unknw3; // always 1L maybe # pad bytes?
/* gets squirrelly here, next five bytes are always same
maybe pad byte and file name string length 13 = 0xD.
String lengths seem to include terminating NUL
For Win31 the file name string always starts at offset
0x21 into the record
Additional longs start again at 0x2E in record:
*/
char pad1;
long nm_len;
char name[13];
long unknw4, // always 0L
unknw5; // always 1L, maybe # pad bytes
char pad2[25]; // this is just a guess. but seems to be zeros
// up to offset 0x4f
long type_len; // think this defines variable length type string
char type[]; // variable length string
/* tends to be zero's again for a long time, but file length
byte pattern gets repeated for some reason in here.
don't think I've ever had a FILE_INFO record that
wasn't followed by a URL record, ie always in pairs.
The URL string is always 8 bytes after the start of the
URL record, and NUL terminated.
In general the long, ulen, is offset to next record pair,
but not always. One must use the list of record pairs at
the start of each block to step through
*/
URL record // beginning of 2nd record associated with file
long ulen, // length of URL record
slen; // length of URL string with terminating NUL
char url[]; // variable length string
long term; // record always seems to end with 0L
/*
The two longs at the start of the URL record, are the overall
record length, ulen, and the length of the URL string, slen,
including its trailing NUL.
These two longs always seem to differ by 12 bytes, 8 for
the bytes they contain themselves, and 4 more for a long, which always seems
to follow the URL string's NUL and is zero. I test this difference to
verify a URL record.
*/
sample record dump:
1F08: B1 00 00 00 03 00 00 00 93 62 BE 36 D1 F5 CA 36 |.........b.6...6
1F18: 00 00 00 00 8C 23 00 00 01 00 00 00 00 0D 00 00 |.....#..........
1F28: 00 4D 30 52 43 4C 54 45 42 2E 47 49 46 00 00 00 |.M0RCLTEB.GIF...
1F38: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 |................
1F48: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0A |................
1F58: 00 00 00 69 6D 61 67 65 2F 67 69 66 00 00 00 00 |...image/gif....
1F68: 00 00 00 00 00 00 8C 23 00 00 00 00 00 00 00 00 |.......#........
1F78: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................
1F88: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................
1F98: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................
1FA8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................
1FB8: 00 47 00 00 00 3B 00 00 00 68 74 74 70 3A 2F 2F |.G...;...http://
1FC8: 69 6D 61 67 65 73 65 72 76 31 2E 69 6D 67 69 73 |imageserv1.imgis
1FD8: 2E 63 6F 6D 2F 69 6D 61 67 65 73 2F 41 64 34 33 |.com/images/Ad43
1FE8: 35 33 33 53 74 31 53 7A 31 53 71 34 49 64 34 2E |533St1Sz1Sq4Id4.
1FF8: 67 69 66 00 00 00 00 00
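Decoding one FILE_INFO and URL record pair can be sketched as below. The sketch assumes a whole block is in memory, values are little-endian, the URL record directly follows the FILE_INFO record (flen bytes later), and the Win 3.1 layout with the file name at offset 0x21. It applies the two checks mentioned above: the second long must be 3, and ulen and slen must differ by 12. The names ns_file, ns_parse_pair and rd32 are invented for this example.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a little-endian 32 bit value from a byte buffer. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct ns_file {
    uint32_t time1, time2, length;
    const char *name;   /* cache file name, points into the block */
    const char *url;    /* URL string, points into the block */
};

/* info_off is the offset of a FILE_INFO record within blk.
   Returns 0 on success, -1 if any of the checks fail. */
static int ns_parse_pair(const unsigned char *blk, size_t blk_len,
                         size_t info_off, struct ns_file *out)
{
    const unsigned char *fi, *ur;
    uint32_t flen, ulen, slen;
    size_t room;

    if (info_off >= blk_len || blk_len - info_off < 0x2E + 8)
        return -1;
    fi = blk + info_off;
    flen = rd32(fi);
    room = blk_len - info_off;
    if (rd32(fi + 4) != 3 || flen < 0x2E || flen > room - 8)
        return -1;                     /* not a FILE_INFO record */
    if (memchr(fi + 0x21, '\0', 13) == NULL)
        return -1;                     /* 8.3 name plus NUL expected */
    ur = fi + flen;                    /* URL record follows FILE_INFO */
    room -= flen;
    ulen = rd32(ur);
    slen = rd32(ur + 4);
    if (ulen - slen != 12 || slen == 0 || slen > room - 8 ||
        ur[8 + slen - 1] != '\0')
        return -1;                     /* not a URL record */
    out->time1  = rd32(fi + 0x08);
    out->time2  = rd32(fi + 0x0C);
    out->length = rd32(fi + 0x14);
    out->name   = (const char *)(fi + 0x21);
    out->url    = (const char *)(ur + 8);
    return 0;
}
```

For the sample dump, with the block loaded from file offset 0x1000 and info_off set to 0xF08, this should return the name M0RCLTEB.GIF, a length of 0x238C and the imageserv1.imgis.com URL.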
Note see the comments below for Netscape 1.1 file naming
convention. This is essentially the same in 3.01, still
a base 32 time stamp. However the extension now normally
reflects the extension used on the internet, ie *.gif, *.jpg,
*.js, and *.htm are common. Under Win 3.1 the use of a
time stamp eliminates the possibility of file name conflicts
allowing all files to be in a single directory.
----- Netscape Personal Edition 1.1 - Win 3.1 (16 bit)
My recollection is that version 2.0 used the same cache system,
but I can't swear to it. I confess to submitting to Microsoft's
apparent plan in that it became so easy to use IExplorer that I
stopped using Netscape. However I found it very annoying in the
early version of Netscape that one couldn't seem to view files
off line. Hopefully they fixed this, but it killed my interest...
If I get some spare time I plan to do some Linux work, that
will probably rekindle my interest because Netscape is supporting
Linux.
My experience is with Version 1.1. As with IExplorer 2.0 (which
presumably copied this format) there was a cache\ subdirectory
below the location the software was installed. It contained a
text file which was a master list of the cached files, FAT.
A copy of the first 4 lines of a FAT file follows:
MCOM-Cache-file-allocation-table-format-1
0 834117613 834475091 d:\netscape\netsv11\cache\M0ORQ4QD.MOZ http://www.borland.com/Connect/Connect.html#newsgroups text/html 17951
0 832045966 834474442 d:\netscape\netsv11\cache\M0ORQ4DU.MOZ http://www.borland.com/TechInfo/cpp/index.html text/html 5222
0 811557014 834474517 d:\netscape\netsv11\cache\M0ORQ4G1.MOZ http://loki.borland.com/cpp/all.htm#winprog text/html 31839
The first line is just a descriptive format line.
Each additional line describes a cached file which is located in
the same directory. The entries in each line are tab delimited
(ie white space) as follows:
Entry Purpose
1st unknown (in memory boolean??? if so only active at run time)
2nd DOS file time (seconds since 1970) last modified on inet
3rd DOS file time (seconds since 1970) last modified on disk
4th full path to file in cache
5th URL
6th data content, ie file type
7th file size in bytes
Entries in file are increasing order of the 3rd entry, ie the
file time. It is worth noting that this really seems to be the
time the operation completed. On large files it may be
significantly different than the time the operation was started.
The other point of interest is the file naming system. Rather
than using some permutation of the file's name, Netscape uses a
unique time stamp system. All disk file names start with the
letter 'M' and end with the extension ".MOZ". The remaining
7 chars in the file name are the DOS 16 bit time at which the
write operation started. One can estimate the time required for
the download by comparing the time based on the file name to the
last modification time obtained from DOS for the file or the
3rd entry in the master list above which is equivalent.
Just extend the Hex base 16 notation to base 32. The value
0-9 is represented by '0' - '9'
10-31 is represented by 'A' - 'V'
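The file name decoding can be sketched in a few lines of C. This assumes ASCII, the leading 'M' followed by exactly 7 base 32 characters, and the digit mapping '0'-'9' for 0-9 and 'A'-'V' for 10-31. The function name moz_name_to_time is invented for this example.

```c
#include <ctype.h>

/* Decode the 7 base 32 characters after the leading 'M'.
   Returns seconds since 1970, or -1 if the name is malformed. */
static long long moz_name_to_time(const char *name)
{
    long long t = 0;
    int i;

    if (name[0] != 'M' && name[0] != 'm')
        return -1;
    for (i = 1; i <= 7; i++) {
        int c = toupper((unsigned char)name[i]);
        int digit;

        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'A' && c <= 'V')
            digit = c - 'A' + 10;
        else
            return -1;          /* also catches a name that is too short */
        t = t * 32 + digit;
    }
    return t;
}
```

For the first FAT line above, M0ORQ4QD.MOZ decodes to 834474829, while the 3rd entry on that line is 834475091, which suggests the download took about 262 seconds. The same function works for the Netscape 3.01 names, since only the extension differs.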