Chapter 9: Remote Services

iVia is a large software project with many generally useful capabilities, but it is so complex that it cannot always be integrated with legacy systems. The Remote iVia Service Interface (RiSI) is a Web Service interface that allows external programs (and people) to use the capabilities of an iVia installation without extensive system integration.

This chapter provides a summary of RiSI. A full description is given in the companion manual, the Remote iVia Service Interface Guide (or "RiSI Guide").

Overview

The Remote iVia Service Interface provides a set of services that external people or programs might want to request from iVia.

Three major remote services are currently offered: the Assign Metadata Service is used to automatically assign metadata to specified URLs; the Expert-Guided Crawler Service is used for crawling Web sites to discover new resources; and the Augment NSDL Collection Service is used to import collections for the National Science Digital Library and add them (with augmentations) to iVia.

Interaction

The Remote Services share a common interaction style. Services are requested with an HTTP request over the Web. The iVia Adders Web site also provides simple forms that human users can use to request services.

Result sets are also returned over the Web, either directly to your Web browser, or indirectly (at a later time) using the iVia Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) server.

Tasks

Each service request is referred to as a task. Each task is assigned a task identifier, which uniquely identifies the task and can be used to monitor it and check its status.

Log files

All Remote Service requests are logged. The main log file can be viewed from the Adders' homepage using the iVia Log File Viewer (choose the Remote Services project, then the remote_services.log file).

Background and Foreground Tasks

Tasks can be either foreground or background operations. The results of a foreground task are returned immediately in your Web browser. This mechanism is simple, but it is not suitable for long-running tasks because the HTTP connection between the Web server and your browser may time out.

As the name implies, background tasks do not provide immediate feedback; instead, they are run as a background process. Each background task must provide a harvest tag, which is used to retrieve the results with OAI-PMH. An email completion notification can be requested to let you know when a task is complete. Some long-running services only work in background mode.

The Remote Service Status Script

The remote_services_status CGI script is used to get the current status of a task. It is passed a task identifier, and returns a simple descriptive value.
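
The following Python sketch illustrates one way a client might poll the status script until a task finishes. It is an illustration only: the script URL, the task_id parameter name, and the status values are assumptions made for this example, not documented parts of RiSI. Check the RiSI Guide for the actual request format and return values.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values: the real script location and parameter name depend
# on your iVia installation and are documented in the RiSI Guide.
STATUS_URL = "http://ivia.example.org/cgi-bin/remote_services_status"
TASK_ID_PARAM = "task_id"


def status_url(task_id, base_url=STATUS_URL):
    """Build the URL that asks the status script about one task."""
    return base_url + "?" + urlencode({TASK_ID_PARAM: task_id})


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def wait_for_task(task_id, is_finished, fetch=http_get,
                  interval=30.0, max_polls=120):
    """Poll the status script until is_finished(status) returns True.

    is_finished is supplied by the caller because the set of status
    values is not assumed here.
    """
    url = status_url(task_id)
    for _ in range(max_polls):
        status = fetch(url).strip()
        if is_finished(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

A caller would pass the task identifier returned when the task was requested, together with a function that recognizes whichever status values the script reports for a finished task.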

OAI-PMH and Harvest Sets

The results of background tasks are stored as records in the iVia database. The request must specify a harvest_tag parameter, which is used to define an OAI-PMH set, and can be used to harvest the results from the iVia OAI-PMH server. The full set of results is not available until the task is finished, but OAI-PMH can be used to download the available records at any time (i.e. records can be harvested incrementally during the task).
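
As a rough illustration of incremental harvesting, the Python sketch below builds a standard OAI-PMH ListRecords request for one set and extracts the record identifiers and resumption token from a response. The verb, metadataPrefix, set, and resumptionToken arguments come from the OAI-PMH 2.0 protocol. The base URL is a placeholder, and the sketch assumes that the harvest tag is used directly as the set name, which may not match how your iVia server names its dynamic subsets.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace used by all OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, harvest_tag=None, resumption_token=None,
                     metadata_prefix="oai_dc"):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix and set must be omitted.
    """
    if resumption_token:
        query = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        # Assumption: the harvest tag is the OAI-PMH set name.
        query = {"verb": "ListRecords",
                 "metadataPrefix": metadata_prefix,
                 "set": harvest_tag}
    return base_url + "?" + urlencode(query)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(
            ".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    token = None
    if token_element is not None and token_element.text:
        token = token_element.text.strip()
    return identifiers, token
```

A harvester would request the first URL, parse the response, and repeat the request with the returned resumption token until no token is returned. Running the same harvest again later picks up records added since the task began.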

Installing RiSI

RiSI is installed with iVia, but must be enabled before it can be used. The service as a whole is enabled in the Remote Services section of the iVia.conf file by setting enabled to "true". Each individual service must then be enabled separately. For example, in the following extract, the Assign Metadata service is enabled, but the other two are not.

 [Remote Services]
 enabled  = "true"
 risi_assign_metadata    = "true"
 risi_expert_guided_crawl= "false"
 risi_augment_nsdl_collection = "false"
 email_sender_address    = "joe.user@example.org"
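
When checking a configuration, it can help to list which services an extract like the one above would switch on. The Python sketch below does this for the three keys shown. It assumes the section can be read as an INI file by Python's configparser module, which is not how iVia itself reads iVia.conf, and it treats the master enabled flag as overriding the per-service flags.

```python
import configparser

# The extract from this chapter, without the leading indentation.
SAMPLE_CONF = '''\
[Remote Services]
enabled  = "true"
risi_assign_metadata    = "true"
risi_expert_guided_crawl= "false"
risi_augment_nsdl_collection = "false"
email_sender_address    = "joe.user@example.org"
'''

SERVICE_KEYS = (
    "risi_assign_metadata",
    "risi_expert_guided_crawl",
    "risi_augment_nsdl_collection",
)


def enabled_services(conf_text):
    """Return the service keys that are switched on in conf_text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = parser["Remote Services"]

    def flag(key):
        # Values are quoted in the file, so strip the quotes before comparing.
        value = section.get(key, fallback="false")
        return value.strip().strip('"').lower() == "true"

    # Assumption: nothing is available unless RiSI itself is enabled.
    if not flag("enabled"):
        return []
    return [key for key in SERVICE_KEYS if flag(key)]


print(enabled_services(SAMPLE_CONF))
```

For the extract above, the only key reported is risi_assign_metadata.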

Chapter 8 explains how to configure the OAI-PMH server for harvesting result sets. You must enable iVia's Dynamic subsets feature.

The iVia Log File Viewer must also be configured to allow the logs to be inspected. Update the view_log.conf file by adding the value remote_services to the projects variable in the Projects section, and adding the following line to the file:

 include view_log.remote_services.conf