iVia Notes on Expert Guided Crawlers

I.) Expert Guided Crawlers

  1. Expert Guided Crawler
  2. Expert Guided Crawler + Drill Down/Out
    These are now the official names.

II.) Modes That We May Want to Design In:

Expert Guided Crawling has many applications for the possible modes listed below.

The crawlers can be made interactive.

They could also be combined with rich text, aboutness seeking.

Determinant Mode:
e.g., hierarchical vl (virtual library) (drill down all levels and go out 1 link)
e.g., preprint collections (drill down 1 level and go out 0 levels); process the whole paper (no aboutness)
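
The drill-down and go-out limits in the examples above can be read as two separate link budgets. The sketch below is one possible reading, not the iVia implementation. It assumes that "drill down" means following links that stay on the seed's host and that "go out" means following links to any other host. It walks a hypothetical in-memory link map instead of fetching pages. The function name `crawl`, its parameters, and the `.example` URLs are all illustrative.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(seed, links, max_down, max_out):
    """Breadth-first walk with separate drill-down and go-out budgets.

    seed     -- starting URL
    links    -- dict mapping a URL to the URLs it links to (hypothetical
                stand-in for fetching and parsing real pages)
    max_down -- maximum same-host levels below the seed; None means all levels
    max_out  -- maximum number of off-host hops (0 means never leave the host)
    Returns the visited URLs in visiting order.
    """
    seed_host = urlparse(seed).netloc
    seen = {seed}
    queue = deque([(seed, 0, 0)])  # (url, levels drilled down, hops out)
    visited = []
    while queue:
        url, down, out = queue.popleft()
        visited.append(url)
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            on_host = urlparse(nxt).netloc == seed_host
            if on_host and out == 0:
                # Assumption: a same-host link from an on-host page is a drill-down step.
                if max_down is not None and down + 1 > max_down:
                    continue
                state = (nxt, down + 1, out)
            else:
                # Assumption: every other link spends one unit of the go-out budget.
                if out + 1 > max_out:
                    continue
                state = (nxt, down, out + 1)
            seen.add(nxt)
            queue.append(state)
    return visited


# Hypothetical link map used only for illustration.
SAMPLE_LINKS = {
    "http://vl.example/": ["http://vl.example/chem/", "http://other.example/a"],
    "http://vl.example/chem/": ["http://vl.example/chem/organic/", "http://site.example/x"],
    "http://other.example/a": ["http://other.example/b"],
}

hierarchical = crawl("http://vl.example/", SAMPLE_LINKS, max_down=None, max_out=1)
preprints = crawl("http://vl.example/", SAMPLE_LINKS, max_down=1, max_out=0)
```

In this sample, the hierarchical setting (all levels down, 1 link out) reaches every on-host page plus the directly linked off-host pages, but not `http://other.example/b`, which is two hops out. The preprint setting (1 level down, 0 out) stops at the seed and its on-host children.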

In Process Interactive Determinant Mode:
e.g., hierarchical vl (drill down all levels and go out 1 link), then return a title/URL results list for manual review, with NO options to eliminate bogus sites before record building

Explore Mode:
e.g., hierarchical vl (drill down all levels and go out 3 links): explore mode (rich text/aboutness mode ON)
e.g., preprint collections (drill down 1 level and go out 0 levels), but use explore mode (rich text/aboutness mode ON) to identify and work with "abstracts" and "conclusions" only
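
Restricting work to "abstracts" and "conclusions" could be sketched as below. This is a simple heading-matching stand-in, not the rich text/aboutness method itself. It assumes the paper is already plain text, that each section starts with a heading on its own line, and that the headings come from a small known set. `KNOWN_HEADINGS`, `WANTED`, `normalize`, and `select_sections` are hypothetical names.

```python
import re

# Assumption: section headings are drawn from this small, illustrative set.
KNOWN_HEADINGS = {
    "abstract", "introduction", "methods", "results",
    "discussion", "conclusion", "conclusions", "references",
}
WANTED = {"abstract", "conclusion", "conclusions"}


def normalize(line):
    """Strip leading numbering ("5." or "IV)") and a trailing colon, then lowercase."""
    stripped = re.sub(r"^[\divxIVX]+[.)]\s*", "", line.strip())
    return stripped.rstrip(":").strip().lower()


def select_sections(text, wanted=WANTED):
    """Return {heading: body text} for the wanted sections only."""
    sections = {}
    current = None  # name of the wanted section being collected, if any
    for line in text.splitlines():
        name = normalize(line)
        if name in KNOWN_HEADINGS:
            # A new heading ends the previous section; keep it only if wanted.
            current = name if name in wanted else None
            if current is not None:
                sections.setdefault(current, [])
            continue
        if current is not None and line.strip():
            sections[current].append(line.strip())
    return {name: " ".join(body) for name, body in sections.items()}


# Hypothetical paper text used only for illustration.
SAMPLE_PAPER = """Title of Paper
Abstract
We study X.
1. Introduction
Background.
5. Conclusions:
X works.
References
[1] Foo
"""

picked = select_sections(SAMPLE_PAPER)
```

Here `picked` holds only the abstract and conclusions text. The introduction and the reference list are skipped.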

In Process Interactive Explore Mode:
e.g., hierarchical vl (drill down all levels and go out 3 links): explore mode (rich text/aboutness mode ON), then return a title/URL results list for manual review, with NO options to eliminate bogus sites before record building
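
The four modes above differ in only a few settings, so they could be captured as presets. The sketch below uses the hierarchical vl examples for the values. It assumes aboutness is off in both Determinant modes, which the notes imply but state only for the preprint case. `CrawlMode`, `MODES`, and `next_step` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrawlMode:
    name: str
    drill_down: Optional[int]  # levels to drill down; None means all levels
    go_out: int                # links to follow outward
    aboutness: bool            # rich text/aboutness mode ON or OFF
    interactive: bool          # return a title/URL list for manual review first


# Presets taken from the hierarchical vl examples in the notes.
MODES = {
    "determinant": CrawlMode("Determinant", None, 1, False, False),
    "interactive_determinant": CrawlMode(
        "In Process Interactive Determinant", None, 1, False, True),
    "explore": CrawlMode("Explore", None, 3, True, False),
    "interactive_explore": CrawlMode(
        "In Process Interactive Explore", None, 3, True, True),
}


def next_step(mode):
    """Say what follows the crawl under the given mode."""
    if mode.interactive:
        return "return title/URL results list for manual review"
    return "proceed to record building"


for key, mode in MODES.items():
    print(f"{mode.name}: out={mode.go_out}, aboutness={mode.aboutness}, "
          f"then {next_step(mode)}")
```

Keeping the presets as frozen dataclass instances makes the differences between modes explicit and prevents a running crawl from altering a preset by accident.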

III.) Input/Output:

IV.) Targets:

First Target:
http://www.indiana.edu/~cheminfo/

Collections as Lists:
http://www.newscientist.com/weblinks/
http://www.nap.edu/csarc.html
http://pests.ifas.ufl.edu/bestbugs/
http://www.100topsciencesites.com/
http://www.education-world.com/science/
http://www.peebles.scoca-k12.org/links/science.htm
http://scorescience.humboldt.k12.ca.us/fast/kids.htm
http://faculty.washington.edu/chudler/neurok.html

Collections Offering Some Kind of Search Engine:
http://www.sciencemag.org/netwatch/
http://www.monarchwatch.org/
http://pages.britishlibrary.net/charles.darwin/
http://www.bbc.co.uk/webguide/science/index.shtml
http://agrifor.ac.uk/