This is tarscan, a program that can read a tarfile and break it
up into smaller self-contained tarfiles that do not exceed a
specified size limit.  I originally wrote it many years ago to
scan old news archive tapes for Subject lines.  It was posted
to one of the old moderated Unix sources newsgroups, where Rich
$alz did some radical clean-up on my sloppy code.  I have added
the tarfile splitting logic since then.  It's been a loong time
since I've looked at this code.  Any ugly kluges you find should
properly be attributed to me, and not to Rich $alz.

The program is invoked like this:

tarscan [-v] [-f filename-prefix] [-s size]

The -v switch causes tarscan to output (on stdout) a verbose listing
of the members of the input tarfile and any Subject: line found in
the first block of the file.

The -f switch causes tarscan to write a new tarfile on the specified
filename with "000" appended to it.

The -s switch specifies a maximum size (in 512-byte blocks) for the
output tarfiles.  If more than one output file is created, the three-
digit file suffix is incremented for each file.


Typically, I use tarscan to package up a bunch of files into floppy-
sized hunks for transfer to my home system.  My usual usage is like
this:

tar cf - ./* |tarscan -v -f ../myfiles. -s 2400 >../myfiles.lis

This command will tar up everything in the current directory into
tarfiles that will all fit on 5.25" HD floppies.  The tarfiles and
the listing file will be created in the parent directory.  The
tarfiles will have names myfiles.000, myfiles.001, myfiles.002, ...
The listing file will contain a table of contents for each 
tarfile.  

Bugs:

Tarscan doesn't try to break up members of the input tarfile between
more than one output tarfile.  A member file that is larger than a
floppy will be written by itself in a single output tarfile, and it
still won't fit on a floppy.

A corrollary to this is that packaging up relatively large files,
such as a bunch of several-hundred-kilobyte compressed tar's,
can potentially result in considerable variation in the sizes of
the output tarfiles and consequent wasted space on the floppies.
Such is life, eh?  :-)

The -v listing logic doesn't handle directories correctly.  No data
is lost, but the listing is sometimes confusing.  I've known about
this for years, but have never had the energy to fix it.  :-)

Tarscan sometimes spits out a complaint about a file name too long
on the last member of the input tarfile.  The output tarfiles are
always correct and complete, so I've never bothered to squash this
bogus message.

It is probably possible to confuse tarscan by giving it a set of
arguments that don't make sense n combination with each other.  
The program tries to protect itself, but this code is not 
well-tested since I never give it arguments that don't make sense.  
:-)



Tarscan is probably of most use for those who have an easy route
to floppies from their Internet-connected machine.  A SPARCstation
with a 3.5" floppy is ideal, since one can simply dd the tarfiles
directly to /dev/fd0.  On a PC with an ethernet card, one can ftp
the tarfiles to DOS-formatted floppies.  (And read them under
linux with tar -i.) 

These sources are known to work on Suns running SunOS revs ranging
from about 3.2 up to 4.1.2, although they will probably compile fine
on other systems.  I haven't tried compiling tarscan under Linux,
but a variant works under Minix.


I hope this is useful for some of you.  If you fix any of the known
bugs, or find new ones, I would appreciate hearing about it.

Paul Allen
pallen@atc.boeing.com
4/14/92



