2005-11-10 02:28 akkana * site_samples/science/new_scientist_news.site: Use RSS instead of html feed, because of stories that weirdly don't show up in Plucker 2005-11-10 02:27 akkana * site_samples/: opinion/pulpit.site, opinion/slate.site, palmsized/the_register_rss.site, tech/slashdot_top.site: Updates for sites that have changed 2005-11-10 02:24 akkana * site_samples/: business/lazarus_at_large.site, opinion/alanmiller.site, tech/paulgraham.site: New sites from me 2005-11-10 02:24 akkana * site_samples/tech/pcmag_firstlooks.site: New site from Goh Boon Nam 2005-11-10 02:21 akkana * lib/Sitescooper/URLProcessor.pm: Add application/*xml to allowed types, for newer RSS sites 2005-08-08 07:43 barrygonzaga * site_samples/bsd/openbsd_journal.site: update url, remove postprocess magic, update email 2005-08-08 06:50 barrygonzaga * site_samples/regional_philippines/ctc-movies-metro.site: Add clickthecity.com Metro Manila Movie Guide; note: huge site 2005-08-08 06:33 barrygonzaga * site_samples/palmsized/inq7-mobile.site: 3 level inq7.net site 2005-08-08 06:31 barrygonzaga * site_samples/regional_philippines/: inq7.site, pdi.site: replace pdi.site with inq7.site 2005-08-08 06:26 barrygonzaga * site_samples/linux/gwn.site: add logo imageurl; update author email 2005-08-08 06:21 barrygonzaga * site_samples/business/businessweek.site: Reflect web site title, update author email 2005-08-08 06:15 barrygonzaga * site_samples/palmsized/: ny_times.site, salon.site: remove nonworking site 2005-07-06 06:19 akkana * lib/Sitescooper/Main.pm: Add << ^^ >> links at end of story as well as beginning 2005-07-06 01:43 akkana * site_samples/: lib/layouts.site, humor/jon_carroll.site, news/wired_news/wired_news_politics.site, opinion/salon.site, science/new_scientist_news.site, tech/newsforge.site: Some updates for sites that have changed. 2005-07-06 01:42 akkana * site_samples/regional_boston/bostonglobe.site: New site: Boston Globe City & Region sections. 
From Bruce Zohn 2005-07-06 01:40 akkana * site_samples/: news/USNews.site, news/newsweek_intl.site, tech/pcmag_images.site: Updates from BoonNam Goh 2005-01-26 04:12 akkana * site_samples/science/new_scientist_news.site: Changes to track the recent site changes 2005-01-17 19:13 akkana * site_samples/regional_israel/: haaretz.site, jpost-columns.site, jpost-international.site, jpost-israel.site, jpost-me.site, jpost-opinion.site: David Resnick : Jerusalem Post and Haaretz site files 2005-01-17 19:02 akkana * site_samples/regional_uk/bbc_news_sci_tech.site: Add ContentsDiff 2005-01-17 18:58 akkana * site_samples/sport/GSR/: GSR_Appearance_Mods.site, GSR_Bike.site, GSR_General_Disc.site, GSR_Owners.site, GSR_Performance_Mods.site, GSR_Stories.site, GSR_Technical.site, GSR_Tips-n-Tricks.site: Delmer Wells : GSR motorcycle information sites 2005-01-05 21:54 akkana * site_samples/opinion/slate.site: Anthony Foglia : New site, Slate 2005-01-05 21:42 akkana * site_samples/linux/slashdot.site: B. M. 
Sleight : minor changes to pick up ask.slashdot.org it.slashdot.org 2005-01-05 21:18 akkana * site_samples/weblog/kevin_sites.site: New site from Delmer Wells : Kevin's War Blog 2005-01-05 21:13 akkana * site_samples/tech/pcmag_images.site: Goh Boon Nam: Update to track site changes and grab images better 2005-01-05 21:12 akkana * site_samples/business/the_economist.site: Goh Boon Nam: Remove Subscription-only pages which cause problem to Plucker 2005-01-05 21:07 akkana * site_samples/: humor/dave_barry.site, linux/debian_weekly_news.site, news/wired_news/wired_news_tech.site, tech/newsforge.site, tech/the_register.site, weblog/riverbend.site: Updates to track changes in the web sites 2004-06-22 20:21 akkana * site_samples/weblog/riverbend.site: Fixed StoryStart 2004-06-03 01:22 akkana * site_samples/linux/: kc_debian_hurd.site, kc_gimp.site: Remove no longer extant debian, hurd and gimp kernel cousins 2004-05-20 23:30 akkana * site_samples/regional_australia/yourmovies_canberra.site: Your Movies, Canberra: from Ken Russell 2004-05-14 04:11 akkana * site_samples/news/USNews.site: Update from Goh Boon Nam 2004-05-14 04:09 akkana * site_samples/: science/archaeology_org.site, science/grahamhancock.site, tech/slyck.site: New sites from Ken Russell 2004-05-14 04:02 akkana * site_samples/palmsized/the_register_rss.site: New palmsized register from Ken Russell 2004-05-14 03:58 akkana * site_samples/palmsized/: the_register.site, the_register_rss.site: Rename palmsized The Register to The Register RSS, so as not to conflict with the non-palmsized Register 2004-05-14 03:45 akkana * site_samples/: news/atlantic.site, tech/slashdot_top.site: New sites 2004-05-14 03:42 akkana * site_samples/opinion/salon.site: Comment out StoryToPrintableSub -- it was causing errors 2004-04-27 23:05 akkana * site_samples/: linux/desktoplinux.site, science/smithsonian.site, tech/joelonsoftware.site, tech/newsforge.site, weblog/riverbend.site, weblog/where_is_raed.site: New sites, from me 
2004-04-27 23:01 akkana * site_samples/lib/layouts.site: Fix BBC news information 2004-04-27 22:54 akkana * site_samples/: linux/kernel_traffic.site, opinion/i_cringely.site, tech/the_register.site: Update URL, content start, and other minor fixes 2004-04-26 04:46 akkana * site_samples/news/yahoo/: yahoo_business.site, yahoo_entertainment.site, yahoo_politics.site, yahoo_tech.site, yahoo_top_stories.site: Re-adding yahoo sites, fixed thanks to Jonathan Becker 2004-04-26 04:45 akkana * site_samples/comics/: boondocks.site, doonesbury.site, tedrall.site: New comics from Ignatz Sol 2004-04-25 07:38 akkana * site_samples/: news/newsweek_intl.site, tech/pcmag_images.site: Updates from Goh Boon Nam 2004-04-25 07:30 akkana * site_samples/humor/dave_barry.site: Update from Alan Hoyle : fix story start, end, headline 2004-04-23 11:27 cwerner * site_samples/opinion/pulpit.site: New site for Bob Cringely's weekly column: The Pulpit. This is the same site scooped by i_cringely.site, except that the old i_cringely site did a 2 level scoop that attempted to get a set of columns, whereas the new one gets a single column and only on Fridays. The old one can probably be removed, but I didn't want to mess with it in case someone is relying on it. 2004-03-22 16:16 cwerner * default_isilox.ixl, sitescooper.cf, doc/site_params.html, lib/Sitescooper/Main.pm, lib/Sitescooper/SCF.pm: Improved support for isiloXC: 1. Added a new param to sitescooper.cf "ISiloDefaultIxlFile" that points to an .ixl file in the file system. This means that users can change the iSiloX options by using the iSiloX GUI tool to create a new document, change all the options, then save as a .ixl file. The and tags of the document are stripped and replaced by sitescooper but the rest is used for generating the isilox pdb. More details are given in the comments in sitescooper.cf. 
The most common likely use for this is to allow the users of -isilox to specify global settings for things like image depth, color, inclusion, dithering etc, and perhaps for category too. 2. Added a new site param called "ExtraISiloIxlTags", to allow ixl settings specific to a site. Updated doc/site_params.html, so see this for more details. This is a little different in that the user has to specify a set of top-level tags for the .ixl file. These get appended to the generated file, thus overriding the defaults (or overriding the global options if the new config param is used). This takes advantage of the fact that isilox tolerates the tags appearing more than once by simply taking the last tag and ignoring earlier copies (or at least its xml parser does). So you can set general options in your .ixl file and override specific options in the .site files. The fact that you have to override the whole tag such as means that you can't override, say, bitdepth separately from dithering, but it's still pretty powerful. And simpler and more durable (i.e. resistant to changes in isilox) than adding a bunch of new site params. 
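The last-tag-wins behaviour described above can be sketched as follows. This is only an illustration in Python, not sitescooper's Perl code or iSiloX's actual parser; the tag names (ImageOptions, BitDepth, Dither, TextOptions, Charset) and the effective_tags helper are hypothetical and chosen purely for the example.

```python
import re

# Matches one top-level element such as <ImageOptions>...</ImageOptions>.
# Good enough for a sketch; this is not a general XML parser.
TOP_LEVEL_TAG = re.compile(r"<(\w+)\b[^>]*>.*?</\1>", re.DOTALL)


def effective_tags(*fragments):
    """Return {tag name: element text}, letting later fragments win.

    Fragments are scanned in order, so a tag appended later replaces
    any earlier copy of the same top-level tag as a whole.
    """
    merged = {}
    for fragment in fragments:
        for match in TOP_LEVEL_TAG.finditer(fragment):
            merged[match.group(1)] = match.group(0)
    return merged


# Hypothetical global defaults, as they might come from a default .ixl file.
defaults = (
    "<ImageOptions><BitDepth>4</BitDepth><Dither>yes</Dither></ImageOptions>"
    "<TextOptions><Charset>latin1</Charset></TextOptions>"
)

# Hypothetical per-site tags, as they might be given in a .site file.
site_extra = "<ImageOptions><BitDepth>8</BitDepth></ImageOptions>"

result = effective_tags(defaults, site_extra)
```

Because the whole top-level tag is replaced, the per-site ImageOptions in this sketch loses the Dither setting from the defaults, which mirrors the limitation noted in the entry: bit depth cannot be overridden separately from dithering.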
: Modified Files: : sitescooper/sitescooper.cf sitescooper/doc/site_params.html : sitescooper/lib/Sitescooper/Main.pm : sitescooper/lib/Sitescooper/SCF.pm : Added Files: : sitescooper/default_isilox.ixl 2004-02-19 20:14 jmason * lib/Sitescooper/: Robot.pm, StoryURLProcessor.pm: some glitches in RSS output fixed; now does not search for sub-stories after html_to_text conversion 2004-02-18 00:30 jmason * site_samples/science/new_scientist_news.site: New Scientist News site updated 2004-02-16 04:42 akkana * site_samples/: cinema/ebert_1min.site, cinema/roger_ebert.site, humor/dave_barry.site: Contributions from Alan Hoyle, alanh at email.unc.edu 2004-02-13 03:27 jmason * lib/Sitescooper/: Main.pm, SCF.pm: added patch from Robert Fuhge, robert.fuhge.at.epost.de, assign categories to Plucker documents using the Category: line in the site file 2004-02-13 01:10 jmason * site_samples/tech/risks.site: updated risks.site to use new 'mobile device' rendering 2004-02-11 20:35 akkana * site_samples/business/the_economist.site: The Economist, from BoonNam Goh 2004-02-11 20:31 akkana * site_samples/news/: newsweek.site, newsweek_intl.site: Newsweek updates from BoonNam Goh 2004-02-07 03:44 jmason * site_samples/security/: crypto_gram.site, crypto_gram.site: cryptogram site fixed 2004-01-31 01:27 jmason * lib/Sitescooper/Robot.pm: handle undef headlines 2004-01-31 01:25 jmason * lib/Sitescooper/Robot.pm: oops; RSS output headline was not being HTML-encoded correctly 2003-11-15 04:29 akkana * site_samples/: tech/computer_world.site, news/newsweek_intl.site: Contributions from BoonNam Goh 2003-11-04 00:29 barrygonzaga * site_samples/linux/gwn.site: add Gentoo Weekly News 2003-10-31 18:27 akkana * site_samples/: news/Newsweek.site, news/NewsweekIntl.site, regional_israel/jpost.site: Remove inconsistently named files 2003-10-31 18:26 akkana * site_samples/news/: newsweek.site, newsweek_intl.site: Newsweek, from Goh Boon Nam 2003-10-31 18:25 akkana * 
site_samples/regional_israel/jerusalem_post.site: Jerusalem Post, from David Resnick 2003-10-31 17:33 akkana * site_samples/tech/wiredmag.site: Previous commit only got one specific date. So I've substituted my own Wired site file, which doesn't get entire stories yet, but it does get Wired every day. 2003-10-31 06:06 akkana * site_samples/tech/wiredmag.site: One issue of Wired Magazine, from richard_html2pdb at yahoo dot com 2003-10-31 06:03 akkana * site_samples/tech/pcmag_images.site: Update from Goh Boon Nam: Get full-sized images 2003-10-31 05:57 akkana * site_samples/news/: Newsweek.site, NewsweekIntl.site: Newsweek updates (US and Intl) from BoonNam Goh 2003-10-31 05:54 akkana * site_samples/regional_israel/jpost.site: Jerusalem Post, from David Resnick 2003-10-29 07:03 akkana * site_samples/news/: Newsweek.site, USNews.site: New sites contributed by BoonNam Goh 2003-09-17 08:59 hubidubi * site_samples/regional_hungary/linuxonline_hu.site: new site file for linuxonline.hu 2003-07-08 20:58 jmason * lib/Sitescooper/StoryURLProcessor.pm: fixed some weirdness in error messages 2003-06-25 21:20 jmason * site_samples/science/new_scientist.site: fixed NS site 2003-06-25 21:10 jmason * lib/Sitescooper/Main.pm: bug fixed 2003-06-25 19:48 jmason * sitescooper.pl, lib/Sitescooper/Main.pm, site_samples/science/new_scientist.site: bug on win32, noted by Robert P. 
Nix 2003-06-11 21:29 jmason * site_samples/culture/world_new_york.site: fixed and re-added World New York site 2003-06-11 20:37 jmason * site_samples/science/new_scientist.site: added headline support for New Scientist 2003-06-11 20:20 jmason * site_samples/: business/economist.site, business/stocksmart.site, business/wsj.site, cinema/coaxialnews.site, cinema/coolnews.site, cinema/forcenet.site, comics/girls_and_sports.site, comics/horrorscope.site, comics/i_need_help.site, comics/new_breed.site, comics/pops_place.site, comics/wildwood.site, culture/plastic.site, games/bluesnews.site, humor/alexei_sayle.site, humor/dave_barry.site, humor/ditherati.site, linux/linuxtoday.site, linux/linuxworld.site, linux/mandrakeforum.site, linux/mysql_newsletter.site, linux/weekly_news.site, news/gallup_poll.site, news/world_new_york.site, news/wired_news/wired_news_top_stories.site, news/yahoo/yahoo_business.site, news/yahoo/yahoo_entertainment.site, news/yahoo/yahoo_health.site, news/yahoo/yahoo_oddly_enough.site, news/yahoo/yahoo_politics.site, news/yahoo/yahoo_public_opinion.site, news/yahoo/yahoo_science.site, news/yahoo/yahoo_sports.site, news/yahoo/yahoo_technology.site, news/yahoo/yahoo_top_stories.site, news/yahoo/yahoo_world.site, odd/morbid_fact_du_jour.site, odd/snopes.site, opinion/salon_archives.site, opinion/tbtf.site, opinion/tbtf_log.site, opinion/unblinking.site, palm/memoware.site, palm/palmguru.site, palm/palminfocenter.site, palm/pdalive.site, palm/pencomputing.site, palmsized/beyond2000-pda.site, regional_australia/abc_news_online.site, regional_australia/fairfax_it.site, regional_california/mercury_center.site, regional_california/la_times/latimes_local.site, regional_california/la_times/latimes_nat.site, regional_california/la_times/latimes_oc.site, regional_california/la_times/latimes_science.site, regional_california/la_times/latimes_tech.site, regional_california/la_times/latimes_world.site, regional_croatia/KSET_monthly.site, 
regional_francais/libe_portrait_du_jour.site, regional_francais/libe_rebonds.site, regional_francais/sia_fr.site, regional_germany/bundesregierung.site, regional_germany/de_excite.site, regional_germany/de_heute.site, regional_germany/de_zdnet.site, regional_germany/de_zeit/de_zeit_media.site, regional_hungary/hirek.site, regional_israel/jerusalem_post.site, regional_north_carolina/cats_cradle.site, regional_north_carolina/charlotte_observer.site, regional_north_carolina/news_observer.site, regional_philadelphia/phillynews.site, regional_seattle/seattletimes.site, regional_spain/es_zdnet.site, regional_spain/marca_soccer.site, regional_spain/marca_sports.site, regional_uk/digiguide_tv_listings.site, regional_uk/times_britain.site, regional_uk/times_world.site, science/cosmiverse.site, science/nasa2go.site, science/sciam.site, security/securityportal.site, sport/fox_sports.site, tech/beyond2000.site, tech/mit_tech_review.site, weblog/joel_on_software.site, weblog/tsluts.site: removed all sites that now give HTTP errors when used 2003-06-11 18:46 jmason * sitescooper.pl, lib/Sitescooper/LWPHTTPClient.pm, lib/Sitescooper/Main.pm, site_samples/web/alistapart.site, site_samples/web/asktog.site, site_samples/web/webmonkey.site: added -timeout parameter 2003-06-11 07:01 jmason * rss-to-site.pl: another patch from Adrian Colley 2003-06-11 02:48 jmason * site_samples/: cinema/ebert_answer_man.site, cinema/ebert_features.site, cinema/ebert_great_movies.site, cinema/roger_ebert.site, opinion/nro.site: updated sites from John Straw 2003-06-10 06:51 jmason * site_samples/regional_germany/: de_cert.site, de_cyberkino.site, de_gazette.site, de_heise_mobil.site, de_heise_tp.site, de_heute.site, de_pdassi_news.site, de_pdassi_software.site, de_spiegel.site, de_stern.site, de_tagesschau.site, de_teltarif.site, de_tvspielfilm.site, mobile2day.site, palmfaq_de.site, pda_debitel_net.site, windows2000faq.site, zdnet_news.site, bundesregierung.site: a whole lot of new regional_germany 
sites from Stefan Schwingeler 2003-06-10 00:30 jmason * lib/Sitescooper/Main.pm, site_samples/comics/thismodernworld.site, site_samples/security/crypto_gram.site: patch for Plucker; now able to handle big images. also added thismodernworld site. patch from Adrian Colley 2003-06-09 07:49 jmason * lib/Sitescooper/: Main.pm, StoryURLProcessor.pm: remove non-required hashing 2003-06-09 07:23 jmason * sitescooper.pl, lib/Sitescooper/Main.pm, lib/Sitescooper/Robot.pm: description now encoded; RSS 1.0 the default 2003-06-06 19:18 jmason * lib/Sitescooper/StoryURLProcessor.pm: added SubStoryPermalink conf setting so that permalinks are picked up 2003-06-06 19:17 jmason * lib/Sitescooper/: Robot.pm, SCF.pm: added SubStoryId conf setting so that permalinks are picked up 2003-06-05 18:49 jmason * lib/Sitescooper/URLProcessor.pm: relative links became relative to sitescooper.org; fixed 2003-06-05 03:31 jmason * site_samples/tech/pcmag_images.site: updated PC Magazine site from Goh Boon Nam 2003-06-05 03:29 jmason * lib/Sitescooper/Robot.pm: oops, forgot escaping in description tags 2003-06-04 19:40 jmason * lib/Sitescooper/Main.pm: guid fix; use the real URL as much as poss 2003-06-04 19:38 jmason * lib/Sitescooper/LinksURLProcessor.pm: remove HTML comments before looking for links 2003-06-04 06:36 jmason * lib/Sitescooper/StoryURLProcessor.pm: added -maxstories support for substory mode 2003-06-04 06:33 jmason * lib/Sitescooper/StoryURLProcessor.pm: lib/Sitescooper/ 2003-06-03 06:13 jmason * lib/Sitescooper/Main.pm: fixed invalid RSS 2003-06-03 06:07 jmason * lib/Sitescooper/Robot.pm: rss with -dump works 2003-06-03 05:58 jmason * lib/Sitescooper/: Main.pm, Robot.pm, SCF.pm, StoryURLProcessor.pm, URLProcessor.pm: can now extract 'sub-stories' from within a story page 2003-05-31 04:18 jmason * lib/Sitescooper/: Main.pm, Robot.pm: added -rss switch for RSS output 2003-05-28 09:47 hubidubi * site_samples/regional_hungary/hirek.site: Site URL update 2003-04-29 02:23 jmason * 
lib/Sitescooper/Main.pm: fix for plucker from Rik Wehbring 2003-03-31 09:40 barrygonzaga * site_samples/regional_philippines/pdi.site: update and clean 2003-03-03 14:06 jmason * site_samples/regional_australia/abc_news_online.site: added ABC News Online site from Wayne Osborn 2003-02-26 14:55 hubidubi * site_samples/linux/mysql_newsletter.site: Site file for MySQL monthly newsletter 2003-02-06 07:56 hubidubi * site_samples/: regional_hungary/freebsd_hu.site, regional_hungary/hup_hu.site, regional_hungary/linuxforum_hu.site, linux/footnotes.site: some site logo improvements 2003-01-22 14:11 jmason * site_samples/tech/pcmag_images.site: added pcmag_images.site from Goh Boon Nam 2003-01-15 15:25 jmason * site_samples/weblog/eckes.site: added eckes.site 2003-01-15 11:39 jmason * lib/Sitescooper/CacheSingleton.pm, lib/Sitescooper/DirCacheFactory.pm, lib/Sitescooper/PerSiteDirCache.pm, site_samples/regional_hungary/freebsd_hu.site: freebsd_hu.site from Hubidubi 2002-11-15 18:36 jmason * lib/Sitescooper/Main.pm, lib/Sitescooper/SCF.pm, site_samples/languages/php_net.site, site_samples/linux/debian_weekly_news.site, site_samples/linux/footnotes.site, site_samples/regional_hungary/hirek.site, site_samples/regional_hungary/hup_hu.site, site_samples/regional_hungary/linux_hu.site, site_samples/regional_hungary/linuxforum_hu.site, site_samples/regional_hungary/metro_hu.site, site_samples/regional_hungary/pdamania_hu.site: many site updates from Hubidubi 2002-11-03 07:48 barrygonzaga * site_samples/regional_philippines/pdi.site: -fix "letters" story url -fix "business" story url -fix "business" stories 2002-10-29 02:07 barrygonzaga * site_samples/humor/: bofh-2k+1.site, bofh-2k.site: add description, clean up bad bold/italic markups, replaced
with ..

2002-10-28 09:34 barrygonzaga * site_samples/humor/: bofh-2k+1.site, bofh-2k.site: add bofh 2k and 2k+1 2002-09-03 17:06 jmason * site_samples/sport/cnn_sports.site: added cnn_sports site 2002-09-03 17:03 jmason * site_samples/linux/weekly_news.site: updated weekly_news.site 2002-07-15 15:21 jmason * lib/Sitescooper/: Main.pm, URLProcessor.pm: applied bugfix from Bernd Rellermeyer 2002-05-06 09:42 barrygonzaga * site_samples/sport/mobilebikes.site: cycling newsletter 2002-05-06 09:40 barrygonzaga * site_samples/palmsized/the_register.site: cleanup 2002-05-06 09:30 barrygonzaga * site_samples/: bsd/openbsd_journal.site, news/gallup_poll.site, palm/palminfocenter.site, palm/pdalive.site, palmsized/salon.site, business/businessweek.site: obscured email address 2002-05-06 09:26 barrygonzaga * site_samples/palmsized/ny_times_handheld.site: site restricted 2002-05-06 09:25 barrygonzaga * site_samples/regional_philippines/pdi.site: - obscured email address - cleanups 2002-05-06 09:24 barrygonzaga * site_samples/palmsized/ny_times_handheld.site: obscured email address 2002-01-25 06:00 jmason * site_samples/lib/layouts.site: updated BBC layout from Akkana's site 2002-01-22 09:20 jmason * site_samples/regional_uk/digiguide_tv_listings.site: Digiguide site re-submitted from Andy Carlson 2002-01-22 03:50 jmason * site_samples/linux/linuxtoday.site: updated linuxtoday 2002-01-22 03:49 jmason * site_samples/: science/new_scientist_news.site, security/hacker_news_network.site: hackernews gone 2002-01-21 03:25 jmason * site_samples/comics/: better_half.site, between_friends.site, crock.site, curtis.site, dinette_set.site, edge_city.site, girls_and_sports.site, grin_and_bear_it.site, horrorscope.site, i_need_help.site, katzenjammer_kids.site, lockhorns.site, mallard_fillmore.site, moose_and_molly.site, new_breed.site, piranha_club.site, pops_place.site, redeye.site, rhymes_with_orange.site, safe_havens.site, sam_and_silo.site, six_chix.site, theyll_do_it_every_time.site, 
trudy.site, tumbleweeds.site, zippy_the_pinhead.site: re-added fixed comics from Yoon Fui Thean 2002-01-19 04:04 jmason * site_samples/: admin/sitescooper_archive.site, bsd/oreillynet_bsd.site, business/cnn_financial.site, business/cnnfn.site, cinema/filmink-online.site, palmsized/cnn.site, regional_seattle/seattle_p_i.site, weblog/tim_oreilly.site: fixed some redirected links; removing duplicate CNN sites 2002-01-19 03:56 jmason * site_samples/: business/hottips.site, linux/linuxplaza.site, opinion/feed.site, regional_germany/de_spiegel.site, regional_north_carolina/weather24_raleigh.site: more dead sites pruned 2002-01-18 04:44 jmason * site_samples/: languages/aspwire.site, languages/news_perl_org.site, languages/perlmonth.site, languages/sqlwire.site, languages/vbwire.site, opinion/simson_garfinkel.site, tech/sendmail_net.site: removed lots of dead sites 2002-01-18 04:39 jmason * site_samples/: business/financial_times.site, business/fox_market_wire.site, business/the_standard.site, business/the_street.site, cinema/cinescape.site, comics/better_half.site, comics/between_friends.site, comics/crock.site, comics/curtis.site, comics/dinette_set.site, comics/girls_and_sports.site, comics/grin_and_bear_it.site, comics/horrorscope.site, comics/i_need_help.site, comics/katzenjammer_kids.site, comics/lockhorns.site, comics/mallard_fillmore.site, comics/moose_and_molly.site, comics/new_breed.site, comics/piranha_club.site, comics/pops_place.site, comics/redeye.site, comics/rhymes_with_orange.site, comics/safe_havens.site, comics/sam_and_silo.site, comics/six_chix.site, comics/theyll_do_it_every_time.site, comics/trudy.site, comics/tumbleweeds.site, comics/zippy_the_pinhead.site, games/oswalds_6th_floor.site, humor/modern_humorist.site, languages/perlnews.site, linux/mandrake_pda.site, news/csmonitor.site, news/my_excite.site, palm/palmgear.site, palmsized/mercury_center_mobile.site, palmsized/the_standard.site, 
regional_chicago/chicago_tribune_arts_and_entertainment.site, regional_chicago/chicago_tribune_books.site, regional_chicago/chicago_tribune_cars.site, regional_chicago/chicago_tribune_commentary.site, regional_chicago/chicago_tribune_editorials.site, regional_chicago/chicago_tribune_friday.site, regional_chicago/chicago_tribune_good_eating.site, regional_chicago/chicago_tribune_health_and_family.site, regional_chicago/chicago_tribune_home_and_garden.site, regional_chicago/chicago_tribune_jobs.site, regional_chicago/chicago_tribune_kidnews.site, regional_chicago/chicago_tribune_magazine.site, regional_chicago/chicago_tribune_metro_chicago.site, regional_chicago/chicago_tribune_metro_dupage.site, regional_chicago/chicago_tribune_metro_lake.site, regional_chicago/chicago_tribune_metro_mchenry.site, regional_chicago/chicago_tribune_metro_northwest.site, regional_chicago/chicago_tribune_metro_southwest.site, regional_chicago/chicago_tribune_new_homes.site, regional_chicago/chicago_tribune_real_estate.site, regional_chicago/chicago_tribune_tempo.site, regional_chicago/chicago_tribune_transportation.site, regional_chicago/chicago_tribune_travel.site, regional_chicago/chicago_tribune_tv_week.site, regional_chicago/chicago_tribune_woman_news.site, regional_chicago/chicago_tribune_your_money.site, regional_chicago/chicago_tribune_your_place.site, regional_croatia/DHMZ_Hrvatska_danas.site, regional_croatia/DHMZ_Hrvatska_sutra.site, regional_croatia/DHMZ_Jadran.site, regional_croatia/DHMZ_Zagreb_danas.site, regional_croatia/DHMZ_Zagreb_sutra.site, regional_denmark/politiken.site, regional_denmark/valutakurser.site, regional_francais/01_informatique.site, regional_francais/afp.site, regional_francais/cinenouba.site, regional_germany/de_br_news.site, regional_germany/de_dwelle.site, regional_germany/de_kalenderblatt.site, regional_ireland/irish_times.site, regional_north_carolina/wral-tv.site, regional_philippines/manila_bulletin.site, regional_spain/telebasket_nba_spanish.site, 
regional_spain/telebasket_spain.site, regional_toronto/globe_and_mail_business.site, regional_uk/digiguide_tv_listings.site, security/securityfocus.site, sport/EurosportTV.site, sport/cnn_sports.site, sport/telebasket_nba.site, sport/thatsracin.site, sport/uk_sports_com.site, tech/cnet.site, tech/geeknews.site, tech/techweb.site: removed sites which now give HTTP 404s 2002-01-18 04:36 jmason * site_samples/: regional_california/ocregister.site, science/hotair_features.site, sport/sportingnews.site: moved broken sites to 'broken' dir 2002-01-18 04:35 jmason * site_samples/: linux/kde-dev-news.site, linux/rhad_rumor_mill.site, news/gallup_poll.site, opinion/idler.site, opinion/jaundiced_eye.site, opinion/slate_todays_papers.site, regional_denmark/sslug-nyheder.site, regional_francais/libe_q.site, science/spaceref.site, tech/rcfoc.site, web/webreference_experts.site: thoroughly outdated dead sites removed 2002-01-18 01:38 jmason * site_samples/bsd/daemonnews.site: removed broken site 2002-01-18 01:38 jmason * site_samples/humor/jon_carroll.site: added jon_carroll.site from Jan Lund Thomsen 2002-01-14 02:16 jmason * site_samples/regional_germany/de_tecchannel.site: added de_tecchannel.site from Michael Schubart 2002-01-14 02:10 jmason * site_samples/: cinema/imdb_studio_briefing.site, weblog/jason_pettus.site: added sites from Jan Lund Thomsen 2002-01-07 04:04 jmason * site_samples/regional_denmark/geekculture.site: added sites from Jan Lund Thomsen 2002-01-07 03:56 jmason * lib/Sitescooper/Main.pm: committed patch from Akkana to silence Plucker warning 2002-01-04 09:37 jmason * site_samples/regional_japan/jp_daily_yomiuri_english.site: added jp_daily_yomiuri_english.site from Michael Schubart 2002-01-02 04:40 jmason * site_samples/regional_japan/jp_japan_times/: jp_japan_times_business.site, jp_japan_times_news.site: added jp_japan_times sites from Michael Schubart 2001-12-30 01:43 jmason * lib/Sitescooper/Main.pm: added fix for iSilo on win2k 2001-12-16 23:34 jmason 
* site_samples/comics/calvin_and_hobbes.site, t/html/newstories/index.html, t/html/newstories/1/page1_1.html, t/html/newstories/2/page2.html: updated calvin and hobbes site from Gary Paulson 2001-12-04 05:40 jmason * site_samples/regional_denmark/politiken_daily_summary.site, t/html/scdiff2.html: added politiken_daily_summary.site from Jan Lund Thomsen 2001-12-04 05:39 jmason * site_samples/humor/alexei_sayle.site, t/html/scdiff2.html: added Alexei Sayle site from Jan Lund Thomsen 2001-12-04 05:36 jmason * lib/Sitescooper/CacheSingleton.pm, lib/Sitescooper/PerSiteDirCache.pm, t/html/http_redirect/front/currentdate/index.html, t/html/newstories/index.html, t/html/newstories/1/page1_1.html, t/html/newstories/2/page2.html: backed out prev change; already fixed in CVS 2001-12-04 05:27 jmason * lib/PDA/PilotInstall.pm, lib/Sitescooper/CacheSingleton.pm, t/html/http_redirect/front/currentdate/index.html: added fixes for problems reported by Andy Carlson 2001-12-03 17:52 alastair * site_samples/regional_australia/fairfax_it.site: Fixed to work with the latest Fairfax site changes. 2001-12-03 16:07 alastair * site_samples/tech/zzz.site: Updated site to include ContentsDiff (d'oh!) 2001-12-02 07:16 jmason * t/html/newstoriesdiff/index.html: .. 
2001-12-02 07:11 jmason * sitescooper.pl, lib/Sitescooper/Main.pm, t/html/newstoriesdiff/index.html: added Torsten Uhlmann's isilo-X support patch 2001-11-25 23:31 jmason * lib/Sitescooper/Robot.pm, site_samples/regional_denmark/politiken.site: updated politiken, from Claus Hindsgaul 2001-11-12 04:01 jmason * site_samples/linux/kc_kde.site: updated kc_kde from Torsten Uhlmann 2001-10-31 00:05 jmason * site_samples/comics/family_circus.site: family_circus.site from Thean Yoon Fui 2001-10-26 04:27 barrygonzaga * site_samples/palm/palminfocenter.site: removed advertisement from contents fixed (P|p)olls.asp link in contents 2001-10-06 05:17 jmason * default_templates.html, lib/Sitescooper/Main.pm, lib/Sitescooper/PerSiteDirCache.pm, lib/Sitescooper/URLProcessor.pm, site_samples/palmsized/the_guardian_palmsized.site: fixed bug using -fromcache with shared cache 2001-10-02 09:06 jmason * site_samples/: regional_uk/the_guardian.site, science/new_scientist.site: a few site updates 2001-10-02 08:54 jmason * site_samples/science/new_scientist.site: updated newscientist site 2001-10-02 07:57 jmason * sitescooper.cf, lib/Sitescooper/DirCacheFactory.pm, lib/Sitescooper/Main.pm: added __OUTPUTFORMAT__ support 2001-10-02 07:35 jmason * site_samples/science/sciam.site: updated sciam site to honor caching 2001-09-27 05:59 jmason * lib/PDA/PilotInstall.pm: fixed PDA::PilotInstall to work with later palm desktops and activeperls 2001-09-25 06:53 jmason * lib/PDA/PilotInstall.pm: fixes from Tim Steele 2001-09-21 11:26 barrygonzaga * site_samples/palmsized/the_register.site: used rss file, palm-friendly site is/was not updated regularly. 2001-09-21 11:18 barrygonzaga * site_samples/palmsized/beyond2000-pda.site: used by2k's palm edition. 2001-09-20 10:02 barrygonzaga * site_samples/regional_philippines/pdi.site: fixed sites erroneous links 2001-09-19 10:27 barrygonzaga * site_samples/palmsized/cnn.site: added contents logo, removed duplicate
's on stories. 2001-09-19 10:25 barrygonzaga * site_samples/regional_philippines/: manila_bulletin.site, pdi.site: renamed/moved category, regional_philippines *not* regional_phillipines. 2001-09-17 04:20 jmason * site_samples/: bsd/openbsd_journal.site, palm/palminfocenter.site, palmsized/cnn.site, palmsized/ny_times_handheld.site, palmsized/the_register.site: site files from Barry Dexter A. Gonzaga 2001-09-14 05:49 jmason * site_samples/palmsized/the_guardian_palmsized.site: Guardian site updated by Stewart C. Russell (stewart /at/ ref.collins.co.uk) 2001-09-06 05:51 jmason * site_samples/business/businessweek.site: oops, forgot busweek 2001-09-05 05:45 jmason * site_samples/: palm/pdalive.site, palmsized/ny_times.site, palmsized/salon.site, news/gallup_poll.site, palm/palminfocenter.site: added sites from Barry Dexter A. Gonzaga 2001-08-27 06:13 jmason * site_samples/regional_denmark/politiken.site: added Politiken site from Claus Hindsgaul 2001-08-20 11:27 jmason * lib/Sitescooper/UserAgent.pm: fixed http auth support 2001-08-18 12:43 jmason * site_samples/regional_toronto/: globe_and_mail_columnists.site, globe_and_mail_national.site, globe_and_mail_thearts.site, globe_and_mail_toronto.site: globe+mail sites updated by Michael Graham (magog@the-wire.com) 2001-08-17 12:03 jmason * site_samples/regional_california/: la_times.site, latimes_nat.site, latimes_oc.site, la_times/la_times_frontpage.site, la_times/latimes_local.site, la_times/latimes_nat.site, la_times/latimes_oc.site, la_times/latimes_science.site, la_times/latimes_tech.site, la_times/latimes_world.site: added new LA Times sites from Mark Beckman (mbeckman at jps.net), and reorged them into a directory 2001-08-16 07:20 jmason * site_samples/comics/: flash_gordon.site, prince_valiant.site: Yoon Fui Thean: comics update 2001-06-28 14:47 jmason * site_samples/: business/cnn_financial.site, news/cnn_mobile.site, science/sciam.site, sport/cnn_sports.site: added SciAm site from Marko, and some CNN sites 
from David's PODS system translated by Marko 2001-06-28 14:45 jmason * lib/Sitescooper/Main.pm: added support for escaped-hashes in site files from Jeff Hecker 2001-06-21 23:12 jmason * site_samples/opinion/unblinking.site: fixed typo 2001-06-20 18:40 jmason * TODO: added Manila Bulletin site from Eric Pareja 2001-06-19 13:14 jmason * sitescooper.cf, doc/running.html: fixed doco a little 2001-06-16 23:00 jmason * lib/Sitescooper/LWPHTTPClient.pm, lib/Sitescooper/Main.pm, lib/Sitescooper/SCF.pm, lib/Sitescooper/URLProcessor.pm, lib/Sitescooper/Util.pm, site_samples/tech/firstmonday.site: added First Monday site, and worked around webserver bug 2001-06-11 16:49 jmason * Makefile: fixed MANDIR in sitescooper make install 2001-06-08 12:57 jmason * site_samples/tech/the_register.site: added sites 2001-06-08 12:55 jmason * site_samples/regional_germany/: de_heise.site, de_sueddeutsche.site, de_sz/de_sz.site, de_sz/de_sz_drei.site, de_sz/de_sz_politik.site, de_sz/de_sz_sport.site, de_sz/de_sz_wissen.site, de_zeit/de_zeit.site, de_zeit/de_zeit_alternate.site, de_zeit/de_zeit_kultur.site, de_zeit/de_zeit_leben.site, de_zeit/de_zeit_media.site, de_zeit/de_zeit_politik.site, de_zeit/de_zeit_reisen.site, de_zeit/de_zeit_wirtschaft.site, de_zeit/de_zeit_wissen.site: new de_sz, de_zeit and de_heise sites from Peter Marschall 2001-06-08 12:51 jmason * site_samples/regional_germany/de_sz/: de_sz.site, de_sz.site-halbwegs-ok, de_sz_bay.site, de_sz_bayern.site, de_sz_berlin.site, de_sz_beruf.site, de_sz_drei.site, de_sz_feuill.site, de_sz_feuilleton.site, de_sz_hochschule.site, de_sz_immobilien.site, de_sz_kultur.site, de_sz_literatur.site, de_sz_medien.site, de_sz_meinung.site, de_sz_muenchen.site, de_sz_nche.site, de_sz_pano.site, de_sz_panorama.site, de_sz_politik.site, de_sz_reise.site, de_sz_sonder.site, de_sz_sonderbeilage.site, de_sz_sport.site, de_sz_streifl.site, de_sz_streiflicht.site, de_sz_verkehr.site, de_sz_verm.site, de_sz_vier.site, de_sz_wirt.site, 
de_sz_wirtschaft.site, de_sz_wissen.site, de_sz_wochenende.site: new de_sz and de_zeit sites from Peter Marschall 2001-06-05 14:39 jmason * sitescooper.pl, lib/Sitescooper/Main.pm, lib/Sitescooper/PerSiteDirCache.pm, site_samples/languages/use_perl.site: added mod to not copy up .cvsignore