Report prepared for the Experts Meeting Towards the Implementation of a Global Invasive Species
Information Network (GISIN), 6-8 April, 2004. Baltimore, Maryland, USA.
8/30/2004
application with a user interface (perhaps on
a browser or plug-in) is trying to solve a
scenario like the Karnataka question. In
Web Services, databases register
themselves in the Universal Description,
Discovery, and Integration (UDDI) registry
(arrows between IS1, IS2, and DG in the
Web Services Architecture diagram below).
The registry, a database of databases, is
populated with enough information to permit
applications to discover what kinds of
information are contained within each of the
databases cataloged by the registry.
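The registration idea can be sketched in a few lines of Python. This is
only an illustration of a "database of databases", not the UDDI
interface itself: the class names, the topic keywords, and the
example.org endpoints are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    name: str       # label of the registering database, e.g. "DG"
    endpoint: str   # where to send queries (hypothetical URL)
    topics: set     # keywords describing what the database holds


@dataclass
class Registry:
    """A 'database of databases': it stores descriptions, not the data."""
    entries: list = field(default_factory=list)

    def register(self, entry):
        self.entries.append(entry)

    def discover(self, topic):
        """Return every registered database that advertises the topic."""
        return [e for e in self.entries if topic in e.topics]


# Each database registers a description of itself (all values invented).
registry = Registry()
registry.register(RegistryEntry("DG", "http://dg.example.org/query",
                                {"geography", "Karnataka"}))
registry.register(RegistryEntry("IS1", "http://is1.example.org/query",
                                {"species", "India"}))
registry.register(RegistryEntry("IS2", "http://is2.example.org/query",
                                {"species", "India"}))

print([e.name for e in registry.discover("species")])
```

Note that the registry holds only enough information to point an
application at the right databases; the species records themselves stay
with IS1 and IS2.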
The DG would, for example, register that it
is a database containing the geographic
footprint of the State of Karnataka. IS1 and
IS2 may register that they contain basic
metadata about species in India. These
registry entries may not describe what the
metadata is, only where to find it. At this
point the Application has to perform what is
referred to as discovery. It has to ask the
UDDI registry: Where are the things that
solve my needs? This is a two-step
process. The first step is the discovery of
the DG data, which the Application queries
for metadata. The second step is discovery
of the actual invasive species data, which
the Application also has to query. Once
these two steps have been completed, the
information has to be combined because, in
this case, it came from two different
sources. That step is called integration, also
referred to as federation of databases.
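The discover, query, and integrate sequence can be sketched as follows.
This is a minimal in-memory illustration, assuming the Karnataka
question asks which species records fall inside the state's geographic
footprint. The databases are plain Python objects rather than remote
services, and every coordinate and species name is an invented
placeholder, not real data.

```python
# Toy stand-ins for the remote databases (all records are invented).
DG = {"Karnataka": {"lat": (0.0, 10.0), "lon": (0.0, 10.0)}}
IS1 = [
    {"species": "species A", "lat": 2.0, "lon": 3.0},
    {"species": "species B", "lat": 20.0, "lon": 3.0},
]
IS2 = [{"species": "species C", "lat": 9.5, "lon": 9.5}]

# The registry maps a topic to the databases that advertise it.
REGISTRY = {"geography": [DG], "species": [IS1, IS2]}


def discover(topic):
    """Ask the registry where data on a topic can be found."""
    return REGISTRY.get(topic, [])


def query_footprint(region):
    """Step 1: discover the geographic source and query it."""
    for source in discover("geography"):
        if region in source:
            return source[region]
    raise LookupError(f"no footprint registered for {region}")


def query_occurrences():
    """Step 2: discover the species sources and query each one."""
    records = []
    for source in discover("species"):
        records.extend(source)
    return records


def integrate(footprint, records):
    """Combine both results: keep species recorded inside the footprint."""
    lat_min, lat_max = footprint["lat"]
    lon_min, lon_max = footprint["lon"]
    return sorted({
        r["species"] for r in records
        if lat_min <= r["lat"] <= lat_max and lon_min <= r["lon"] <= lon_max
    })


result = integrate(query_footprint("Karnataka"), query_occurrences())
print(result)
```

The integrate function is ordinary application code. Nothing in the
registry or the query steps performs the combination.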
It is very important to understand that
integration or federation is not really part of
Web Services. The process succeeds or
fails on the quality of the information that is
contained in the envelope, and Web
Services have nothing to do with the quality
of the results. Web Services are as good at
transporting garbage as they are at
transporting good quality information.
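A small sketch makes the point about the envelope concrete. It wraps a
text payload in a SOAP-style envelope and unwraps it again using
Python's standard xml.etree.ElementTree module. The "record" element
and both payload strings are assumptions made for illustration, and
real Web Services messages carry far more structure than this.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap(payload_text):
    """Place a payload inside an envelope without inspecting it."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    record = ET.SubElement(body, "record")  # hypothetical payload element
    record.text = payload_text
    return ET.tostring(envelope, encoding="unicode")


def unwrap(message):
    """Recover the payload exactly as it was sent."""
    envelope = ET.fromstring(message)
    return envelope.find(f"{{{SOAP_NS}}}Body/record").text


good = "species A recorded in Karnataka"   # placeholder for a valid record
garbage = "sp?? A rec0rded in ????"        # placeholder for a corrupt one

# Both payloads make the round trip equally well.
print(unwrap(wrap(good)) == good)
print(unwrap(wrap(garbage)) == garbage)
```

Neither function checks whether the payload is meaningful, so any
quality control has to happen in the databases or in the application.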
The last step is to present the results. This
may involve delivery to someone viewing a
Web browser, or to a software application
that is going to perform further operations
with the resulting data.
This scenario represents the type of
distributed model that Web Services are
designed for. Web Services are simply
protocols for those parts of the information
retrieval process that are represented by the
arrows (i.e., registration, discovery, and
query). There are other examples of
software systems that address the same
functions, such as the Distributed Generic
Information Retrieval (DiGIR) protocols that
have been adopted by GBIF. These
systems were stabilized sooner than Web
Services, so in some cases digital architects
chose to implement the simpler DiGIR
system over the more powerful Web
Services, whose infrastructure was
incomplete. GBIF has, however, committed
to and begun developing Web Services
interfaces.
Biologists should not be deterred by the
seemingly abstract or technical nature of
Web Services. Web Services are meant to
be interoperable in that they should not
depend on the hardware or Web server
software in the chain of communications.
The Apache Software Foundation develops
the most widely used Web server software
on the Web. Microsoft .NET has a very
similar story in that its Web Services are
functionally the same as, and usually
interoperable with, the Java-based Web
Services that Apache has been
standardizing.
[Figure: Web Services Architecture diagram. Components: IS1, IS2, DG,
UDDI Registry, Application. Arrows: Register (R), Discover, Query,
Integrate (I), Present.]