Make sure you have a tool to open ZIP files, such as WinZip, installed; most of the files on this page require this.

First of all, download the Perl interpreter for your system. Sitescooper is written in Perl, so you can't avoid this, and I'm afraid it's a hefty download at 5.4 MB. Get it from ActiveState here (via HTTP) or here (via FTP). (Those are direct links to the ActiveState download area. If those links don't work, the version number has probably changed, so get the "complete package" from this page instead.)

Install Perl. It should not matter what directory you install it into. Be sure to check both the "associate .pl extension with Perl" and the "add Perl to the command path" checkboxes.

You need to have the LWP Perl module installed, version 5.43 or later. If you don't have this, or you don't know what I'm talking about, then you should be using the version of sitescooper that comes in sitescooper-full.zip; it includes this stuff. If you downloaded the smaller sitescooper.zip, don't worry -- you can just extract the full ZIP file on top of it (i.e. into the same directory you extracted the smaller one into). Nothing bad will happen and you can carry on from here.
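If you are not sure which LWP you have, a short Perl script can tell you. The following is only a sketch and not part of sitescooper; the helper names lwp_version and new_enough are made up for this example.

```perl
use strict;
use warnings;

my $required = 5.43;    # minimum LWP version named above

# Returns the installed LWP version, or undef if LWP cannot be loaded.
sub lwp_version {
    return eval { require LWP; LWP->VERSION };
}

# True if $found is defined and numerically at least $required.
sub new_enough {
    my ($found, $required) = @_;
    return defined($found) && $found >= $required;
}

my $found = lwp_version();
if (!defined $found) {
    print "LWP is not installed; use sitescooper-full.zip\n";
} elsif (!new_enough($found, $required)) {
    print "LWP $found is older than $required; use sitescooper-full.zip\n";
} else {
    print "LWP $found is new enough\n";
}
```

Save it under any name you like (checklwp.pl, say) and run it with perl checklwp.pl from a command prompt.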

Next, if you will be reading scooped documents on a Palm handheld, you need the document converters.

OPTIONAL: If you plan to use Plucker format files, you will need to ensure you have the Plucker package installed.

OPTIONAL: If you're converting to iSilo format, you will need to download and install iSiloC32.zip (78K zip file). Double-click this zip file and extract the contents into a directory on your command path -- C:\Windows for example.

OPTIONAL: If you're converting to DOC format, you'll need MakeDocW, which can be found here or here (zip file). Again, double-click this zip file and extract the contents into a directory on your command path -- C:\Windows for example. Also read this note on using MakeDocW.

OPTIONAL: RichReader mode requires HTML2Doc.exe, which can be found at the RichReader site.

Unless you install those helper applications into your path, you will need to tell sitescooper where they live. To do this, run Notepad and open the sitescooper.cf file, which is in the directory where you unzipped the sitescooper.zip file. Find the line in the configuration section which reads

# MakeDoc: makedocw.exe        # CUSTOMISE

and change it to

MakeDoc: C:\Path\To\Your\MakeDocW.exe

(note that the initial # sign needs to be trimmed off). For iSilo, use iSilo: as the parameter instead, and for RichReader, use HTML2Doc: .
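To find out whether you need to edit sitescooper.cf at all, you can check which helper applications are already on your command path. This is a rough sketch, not something sitescooper ships with: find_in_dirs is a made-up helper, and iSiloC32.exe is only my guess at the name of the program inside iSiloC32.zip, so adjust it to match what you extracted.

```perl
use strict;
use warnings;
use File::Spec;

# Returns the full path of $exe if it exists in one of @dirs, else undef.
sub find_in_dirs {
    my ($exe, @dirs) = @_;
    for my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $exe);
        return $candidate if -f $candidate;
    }
    return;
}

# makedocw.exe and HTML2Doc.exe are named in the text above;
# iSiloC32.exe is an assumed file name.
for my $exe (qw(makedocw.exe iSiloC32.exe HTML2Doc.exe)) {
    my $where = find_in_dirs($exe, File::Spec->path);
    if (defined $where) {
        print "$exe: found at $where\n";
    } else {
        print "$exe: not on the path; give its location in sitescooper.cf\n";
    }
}
```

Any helper reported as not on the path needs its own line in sitescooper.cf, but only if you actually plan to use that format.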

OPTIONAL: Sitescooper supports diffed sites (sites where only the "newest" bits of unread news are scooped), using the Algorithm::Diff Perl module, which is built in. Algorithm::Diff is quite slow, though, so if you want to speed things up, you should download a good external 'diff' tool. Sadly, the only one I could find that behaved properly with respect to (a) long filenames, (b) running from Perl, and (c) having its output captured in a Perl script was the Cygnus one, which can be installed as part of the excellent CygWin toolkit. It is a bit of a long download, mind, so unless you already have it installed it may not be worth it!
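To give a feel for what "diffing" a site means, here is a small sketch that uses Algorithm::Diff to pick out the lines that appear in today's copy of a page but not in yesterday's. It is an illustration only, not sitescooper's actual code; new_lines and the sample stories are invented, and it assumes Algorithm::Diff is somewhere Perl can find it.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

# Returns the lines that diff() reports as added when turning @$old into @$new.
sub new_lines {
    my ($old, $new) = @_;
    my @added;
    for my $hunk (diff($old, $new)) {
        for my $change (@$hunk) {
            # Each change is [ '+' or '-', position, line text ].
            my ($sign, $pos, $text) = @$change;
            push @added, $text if $sign eq '+';
        }
    }
    return @added;
}

my @yesterday = ("Story A", "Story B");
my @today     = ("Story C", "Story A", "Story B");

# Print only the lines that are new in today's copy.
print "$_\n" for new_lines(\@yesterday, \@today);
```

An external 'diff' tool does the same job of comparing the old and new copies, just faster.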

Finally, set up a shortcut to run the following command:

perl c:\path\to\sitescooper.pl


Next: Running Sitescooper
