Our research and development, funded by an IMLS National Leadership Grant (U.S. Institute of Museum and Library Services), the National Science Digital Library (National Science Foundation) and the Fund for the Improvement of Post-Secondary Education (U.S. Department of Education), addresses one of the biggest challenges facing virtual libraries: multi-subject, expert-built, high-quality virtual library portal collections cannot scale to the huge size and rapid growth of the Internet, or to the concomitant expectations of users for greater content and granularity in searching. Our work will help expert-built, multi-subject portals scale well.
We believe that much of the solution can be found in encouraging cooperation and reducing redundant effort, as well as in developing technologies that make multi-institutional work on the same project efficient and complementary. The remainder of the solution can be found in using machine learning techniques to automate or semi-automate the more laborious tasks in resource identification and metadata generation. Machine assistance can be used effectively to amplify the efforts of subject experts in collection building.
We have created and continue to develop iVia, a system for building hybrid portals that include both expert-created and automatically created records. iVia has the advantages of both major types of Internet finding tool: the quality of portal records (composed of expert-selected and expert-described resources) together with some of the reach, or quantity of records, found in search engines (automatically identified and described resources). In our newest project, we are working with the National Science Digital Library to improve iVia and bring it into their core system. The iVia software, upon which INFOMINE runs, is free and open source, and can be downloaded at http://ivia.ucr.edu/.
Data Fountains is a new project that we have been working on with IMLS support. It consists of an array of improved iVia systems, one for each cooperating virtual library collection, that will discover new resources on the Internet (collection development) and generate metadata for them (indexing). A major emphasis in Data Fountains is on exploring where interventions by subject experts can make machine processes more accurate and, conversely, where machine assistance can successfully amplify expert effort. Exploring the bounds and types of semi-automated collection building is a major goal of this project. Data Fountains now has a stable release.
Our views on automation and collection building are found in S. Mitchell et al., 1/03, iVia: Open Source Virtual Library Software, D-Lib Magazine, and in S. Mitchell et al., 9/04 (in press), Enabling Technologies and Service Designs for Collaborative Internet Collection Building, Library Hi Tech, v. 22 (3).