Current R&D Projects
INFOMINE's research and development concentrates on two main areas: Web Crawling and Metadata Assignment. Unless otherwise noted, the software is part of iVia and freely available for download.
Research Project Overview
- Publications
- Web Crawling
- Metadata Assignment
- Other Metadata Tools
- Project Planning and Exploration
- Archived Projects
Web Crawling
INFOMINE uses a range of Web crawlers to discover new Internet resources:
- The Nalanda iVia Focused Crawler is a focused Web crawler based on Dr. Soumen Chakrabarti's pioneering work in this field. This software is available as a separate project on the download page.
- The INFOMINE Virtual Library Crawler is a "Web robot" that uses the academic virtual libraries cataloged in INFOMINE as starting points to discover new resources.
- The INFOMINE Automatic Focused Crawler extracts a set of topics from INFOMINE and uses the Nalanda iVia Focused Crawler to look for new resources similar to each topic for inclusion in INFOMINE. This software is part of the iVia package, but requires the Nalanda iVia Focused Crawler.
- The Expert-Guided Crawler with Drill-down is a tool that enables indexers to crawl a Web site (or a list of Web sites) and discover new resources. This software is part of the iVia package.
- The Creme De La Crawlers feature automatically rates robot-created resources by their likely value to the collection. The most highly rated records are flagged for indexers, saving them time in discovering significant new resources. Records from both the Virtual Library Crawler and the Automatic Focused Crawler may be suggested.
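The general idea behind the focused crawlers listed above can be illustrated with a minimal best-first crawl sketch. This is not the Nalanda iVia implementation. The in-memory `PAGES` graph, the keyword-overlap `relevance` score, and the `focused_crawl` function are hypothetical stand-ins chosen for illustration; a real focused crawler would fetch pages over HTTP and would typically score them with a trained classifier rather than simple term overlap.

```python
import heapq

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
PAGES = {
    "a": ("library catalog of botany resources", ["b", "c"]),
    "b": ("botany journals and plant science databases", ["d"]),
    "c": ("celebrity gossip and sports scores", ["e"]),
    "d": ("plant taxonomy database for botany research", []),
    "e": ("more sports scores", []),
}


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text (toy scorer)."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(seeds, topic_terms, threshold=0.3, max_pages=10):
    """Best-first crawl: visit the most promising URLs first and only
    follow links found on pages that score at or above the threshold."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    accepted = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        visited += 1
        text, links = PAGES[url]
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: do not expand its links
        accepted.append((url, score))
        for link in links:
            if link not in seen and link in PAGES:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return accepted


if __name__ == "__main__":
    topic = {"botany", "plant", "database"}
    for url, score in focused_crawl(["a"], topic):
        print(url, round(score, 2))
```

In this toy graph the off-topic page `c` is visited but never expanded, so `e` is never reached. That pruning is the point of focusing a crawl on a topic.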
Metadata Assignment
iVia currently includes modules for assigning a range of metadata fields, including Title, Creator, Contributor, Publisher, Key phrases, Library of Congress Subject Headings, Library of Congress Classification, Description, Language, Format (i.e. Media Type), and INFOMINE Categories.
- The iVia Automatic Metadata Assignment tools are used to assign metadata values to Internet resources.
- The iVia Automatic Metadata Evaluation tool is used to measure the quality of automatically assigned metadata.
- A variety of Fielded Metadata Assignment Tools are employed to assign metadata to specific fields.
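One common way to measure the quality of automatically assigned metadata is to compare it against values assigned by an expert for the same record. The sketch below does this for key phrases using precision, recall, and F1. It is an assumption about how such an evaluation might be done, not a description of the iVia Automatic Metadata Evaluation tool, and the `evaluate_phrases` function and sample data are hypothetical.

```python
def evaluate_phrases(assigned, expert):
    """Return (precision, recall, F1) for one record's key phrases.

    Phrases are compared case-insensitively after trimming whitespace.
    """
    assigned_set = {p.strip().lower() for p in assigned}
    expert_set = {p.strip().lower() for p in expert}
    matches = len(assigned_set & expert_set)

    # Precision: share of assigned phrases that the expert also chose.
    precision = matches / len(assigned_set) if assigned_set else 0.0
    # Recall: share of expert phrases that were recovered automatically.
    recall = matches / len(expert_set) if expert_set else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    auto = ["Plant taxonomy", "botany", "sports"]
    gold = ["botany", "plant taxonomy", "herbaria", "flora"]
    p, r, f = evaluate_phrases(auto, gold)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact matching is deliberately strict. A fuller evaluation might also credit near matches, such as stemmed or partially overlapping phrases.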
Other Metadata Tools
The iVia Project has developed some other metadata tools that stand apart from our main metadata assignment library.
- The LCSH to LCC assignment module automatically assigns documents a classification from the LCC Outline. The project is led by Dr. Eibe Frank (Department of Computer Science, The University of Waikato) and Dr. Gordon W. Paynter at INFOMINE. This software is written in Java and is available as a separate project on the download page.
- PhraseRate is a tool developed by Keith Humphreys for extracting a set of meaningful, attractive key phrases from a Web page.
Project Planning and Exploration
Project Planning and Exploration notes.