18. archie -via- www

net connected or not connected?

18.1 Gopher - what is it?
18.2 Archie - what is it?
18.3 ArchiePlex
18.4 example archie session
18.5 Unexpected errors
18.6 Top of the list - outdated!
18.7 Looking with FTP to check the package versions
18.8 No fake dates, please!
18.9 Strange Netscape Dialog Box - press [Filter


Gopher - what is it?

You can install xgopher to have a look. More or less it is what they used before www came on-line. It is a university service menu front-end to browse lists of files, download them and view them.

You connect to a gopher server, and follow its menu commands. It may have interesting links to archie, or whatever the server administrator felt like adding. Since your initial gopher server probably links to others, you may find information that www/html doesn't have, or connect to search engines that are not easy to find. It is useful when your server is a local campus facility, otherwise use http.


Archie - what is it?

Archie is a system where all the universities cooperate to produce a huge index of the files on their anonymous ftp servers. If you know the name of a file (or part of the name!), you ask archie to find it for you.

Archie returns a list of matching files, and where they are. To save time, archie only returns the first find on each server, so a loose search might locate a README that hasn't changed, when the .tgz file has.

You can use a local archie client, or you can use somebody's WWW to Archie interface (eg ArchiePlex), a WWW/CGI form.

University ftp machines include mirrors of other ftp machines. src.doc.ic.ac has many files from many sources, and is useful for UK clients.



ArchiePlex

ArchiePlex is a WWW/CGI front-end form for archie (there must be others). You point your browser at http://www.lib.ox.ac.uk/internet/archieplex/, fill in the form and it does the interfacing.

http://archie.doc.ic.ac.uk//archieplexform.html for Imperial College.

Don't believe the results blindly. Archie uses the file's mtime, which could be close to accurate, but is easily wrong. E.g. when copying a CDROM onto disk, the system administrator must remember to tell cpio to preserve the mtime on files (-pvdm). The mtime of the original source file must also be correct!

Interestingly, gzip'ing a file to file.gz puts the file's mtime into the .gz header. Browsers (eg netscape) that fetch such a file and expand it, both for viewing AND for saving, are not doing you any favours when they save it uncompressed: the date stored in the header is lost.
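
A minimal sketch of how to read that stored timestamp without unpacking the file. It assumes a gzip that records the mtime (GNU gzip does so by default for named files) and the header layout of RFC 1952, where the timestamp is the four bytes at offset 4, least significant byte first. The function name gz_mtime and the scratch file names are made up for illustration.

```shell
# gz_mtime FILE.gz : print the mtime stored in the gzip header,
# as seconds since 1970-01-01 UTC (0 means no timestamp was stored).
gz_mtime() {
    # Bytes 4..7 of the header hold the mtime, least significant byte first.
    # od prints them as four decimal numbers, which become $1..$4.
    set -- $(od -A n -t u1 -j 4 -N 4 "$1")
    echo $(( $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 ))
}

# Demo in a scratch directory: give a file a 1994 mtime, compress it,
# then lose the date on the .gz itself, as a careless copy would.
demo=$(mktemp -d)
echo "sample text" > "$demo/HISTORY"
TZ=UTC touch -t 199406011938.25 "$demo/HISTORY"
gzip "$demo/HISTORY"
touch "$demo/HISTORY.gz"        # the outer mtime is now "today"

# The header still carries the 1994 date, printed as epoch seconds.
gz_mtime "$demo/HISTORY.gz"
```

So as long as the file stays compressed, the original date travels with it, whatever happens to the mtime of the .gz file.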

Archie stops searching a host at the first matching file it finds. This might be package-1.2.README, when package-2.3 is what you are looking for! There is also the risk that the file is unavailable when the index item says it should be (though I don't understand why that happens).


example archie session

By the time you read this, the bugs shown will have changed. But here's the story ...

	browser ...
	set: nicer # eg batch processes get fewer bigger chunks
	search for: xearth
	save as: ./archieplex.xearth
	set: Not so nice
	search again sorted by date
	save as file: 9602/sess1/archieplex.xearth.sorted
The file you get back is an HTML page with hotlinks to the servers and files. It makes it very easy to fetch the file you have just found. This looks great, but needs a bit of manual interpretation.


Unexpected errors

With the sorted results for the xearth search, follow the first link. (The link itself is a very long line, left out here because groff handles it badly.)


Note that the Oct_94 in that link tells you that something is wrong. When the SysAdmin copied the CDROM to disk, the mtime of each file was set to 'today' instead of keeping the original date.

To avoid this when you copy a tree of files, use the -m option to cpio. Do the copy as root, so that the existing file ownerships (see ls -l /cdrom) are preserved as well:

	cd /cdrom
	find . -print | cpio -pvdm /hdc2/cdrom_14
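
One way to check whether a copy kept the original dates is to compare the mtimes of the source and the copy. The sketch below uses cp and cp -p instead of cpio, only so that it runs on a machine without cpio installed; same_mtime is a made-up helper, the file names are invented, and the -nt test assumes bash or a compatible shell.

```shell
# same_mtime A B : succeed if neither file is newer than the other.
same_mtime() {
    [ ! "$1" -nt "$2" ] && [ ! "$2" -nt "$1" ]
}

work=$(mktemp -d)
echo "original" > "$work/README"
touch -t 199410010000 "$work/README"           # give it an old date

cp    "$work/README" "$work/copy_plain"        # mtime becomes "now"
cp -p "$work/README" "$work/copy_preserved"    # mtime is carried over

for f in copy_plain copy_preserved; do
    if same_mtime "$work/README" "$work/$f"; then
        echo "$f: mtime preserved"
    else
        echo "$f: mtime changed"
    fi
done
```

The same helper can be pointed at a file on the CDROM and its copy on disk after a cpio run.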


Top of the list - outdated!

The list shows wlv.ac.uk at the top of the list, followed by sites claiming to have 1.14, followed by sites with 0.92 (then 0.6 - bless them). Sorry wolves, nul points

	1	uk	scitsc.wlv.ac.uk 
	2	us	ftp.cs.umn.edu # Univ Minnesota (1.14 absent?)
	3	no	romeo-klive.nvg.unit.no
	4	se	ftp.luth.se (has both 1.14 and 0.92)
	5+	**	0.92 or 0.6 
Strangely enough, a later search for 'stretching' also produced a stray result, on the same disk! That was because the rec.sport.misc FAQ had been deleted from the infomagic disk, but its directory was thoughtfully left in place, so you can see that it really did exist. (Wolf saves pack).


Looking with FTP to check the package versions

This is an example of an FTP session, looking for a file that isn't there.

	gps@trix:/tmp/pkgs_cd/xearth-0.92$ stty erase ^H
	gps@trix:/tmp/pkgs_cd/xearth-0.92$ ftp -n ftp.cs.umn.edu
	Connected to ftp.cs.umn.edu.
	220 ftp FTP server (Version wu-2.4(14) Mon Dec 11 10:01:35 CST 1995) ready.
	Remote system type is UNIX.
	Using binary mode to transfer files.
	331 Guest login ok, send your complete e-mail address as password.

	230-Welcome to the FTP Site for the University of Minnesota, 
	230-Department of Computer Science.
	230-Commonly used packages and their paths on our site:
	230-Package             Mirror site             Local Directory
	230-Netscape    ftp.netscape.com        /packages/ftp.netscape.com
	230-Tcl-Tk              ftp.smli.com            /pub/misc/tcl
	230-GNU         prep.ai.mit.edu         /packages/gnu
	230-X11: All located in /packages/X11/
	230-Packages mirrored from ftp.x.org:
	230-  R5        R5-Contrib      R6      R6-Contrib
	230-Linux:      All located in /packages/linux/
	230-  Slackware ftp.cdrom.com           /slackware
	230-  Kernel    ftp.cs.helsinki.fi      /kernel 
	230-  Docs              sunsite.unc.edu         /docs
	230-NetBSD: All located in /packages/NetBSD/
	230-Packages mirrored from ftp.netbsd.org:
	230-  Current NetBSD-1.1
	230-NetBSD-Amiga is mirrored from ftp.uni-regensburg.de in /pub/NetBSD-Amiga
	230-If you have comments or questions please mail ftp-admin@cs.umn.edu
	230-Please read the file README
	230-  it was last modified on Thu Mar 16 15:32:28 1995 - 336 days ago
	230 Guest login ok, access restrictions apply.

	ftp> hash

	ftp> cd pub/NetBSD-Amiga/contrib/X11
	250 CWD command successful.

	ftp> ls xe* 
	...	ls xe* shows only 0.92 !! where has 1.14 gone?

	ftp> quit
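
The same check can be scripted. With ftp -n the automatic login is switched off, so the login has to be given explicitly with a user command before anything else. The sketch below only builds the list of commands. The function name ftp_ls_script and the e-mail address are placeholders, and the host and directory are the ones from the session above, which may no longer exist.

```shell
# ftp_ls_script DIR PATTERN [EMAIL] : print ftp commands that log in
# anonymously, list PATTERN in DIR, and quit.
ftp_ls_script() {
    dir=$1
    pattern=$2
    email=${3:-guest@example.org}     # placeholder address
    printf 'user anonymous %s\n' "$email"
    printf 'cd %s\n' "$dir"
    printf 'ls %s\n' "$pattern"
    printf 'quit\n'
}

# Show the commands; nothing is sent over the network here.
ftp_ls_script pub/NetBSD-Amiga/contrib/X11 'xe*'

# To run it for real, pipe the commands into the client:
#   ftp_ls_script pub/NetBSD-Amiga/contrib/X11 'xe*' | ftp -n ftp.cs.umn.edu
```

Keeping the command list separate from the ftp call makes it easy to read what will be sent before any connection is made.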


No fake dates, please!

The dates recorded inside the package give you the date of the package itself, rather than the date of the tar file.

Now locate the xearth files in the directory. Look at (and save to disk) the 0.91 to 0.92 diffs. The first lines give the times and dates of the HISTORY file, ie:

	diff -r -c xearth-0.91/HISTORY xearth-0.92/HISTORY
	*** xearth-0.91/HISTORY Wed May 25 02:05:17 1994
	--- xearth-0.92/HISTORY Wed Jun  1 19:38:25 1994
This shows that the HISTORY file is being used properly (probably), so the correct date is in line 1 of the HISTORY file, and for a patch file it's in the top 3 lines. Ie I don't know what is more recent than Jun 1994, but that's the date of this package (it agrees with all the files inside it).
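
Pulling those header dates out of a patch can be automated. This is a minimal sketch, assuming a context diff (diff -c) whose file header lines start with '*** ' and '--- ', and file names without spaces; diff_dates is a made-up name.

```shell
# diff_dates [PATCHFILE...] : print "file: date" for every file header
# line of a context diff, read from the named files or from stdin.
diff_dates() {
    awk '
        # Header lines start with "*** " or "--- ". Hunk range lines such
        # as "*** 1,4 ****" start the same way, so skip those by their tail.
        /^(\*\*\*|---) / && !/ (\*\*\*\*|----)$/ {
            file = $2
            # Drop the marker and the file name; what is left is the date.
            sub(/^[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")
            print file ": " $0
        }
    ' "$@"
}

# Try it on the header lines quoted above.
diff_dates <<'EOF'
diff -r -c xearth-0.91/HISTORY xearth-0.92/HISTORY
*** xearth-0.91/HISTORY Wed May 25 02:05:17 1994
--- xearth-0.92/HISTORY Wed Jun  1 19:38:25 1994
***************
*** 1,4 ****
EOF
```

The latest date printed is a fair guess at the date of the package, with the same caveat as before: it is only as good as the mtimes on the author's machine.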

xearth 0.92 is the version on the CDROM, and it works, so why worry?


Strange Netscape Dialog Box - press [Filter

The Netscape "Save As ..." dialog has a split mode. The top half is in one directory, the other half in another. To get them the same press FILTER every time you highlight a directory. If you lose the filename, carry on navigating, then cancel and repeat.