For permission to copy, contact [email protected]
2005 Geological Society of America 61
Geosphere; October 2005; v. 1; no. 2; p. 61–77; doi: 10.1130/GES00011.1; 7 figures.
A collaborative system for sharing paleontology collections data
Kenneth G. Johnson1
Harry F. Filkorn
Mary Stecheson
Department of Invertebrate Paleontology, Natural History Museum of Los Angeles County, 900 Exposition Boulevard, Los Angeles, California 90007, USA
ABSTRACT
Museum collections provide primary
data for paleontologists, and recent advanc-
es in information technology have revolu-
tionized how museums collect and share
this information. However, many natural
history museums have huge collections and
small budgets, so museum scientists are
challenged to keep these critical data current
and available to the public. We suggest
that establishing an open collaboration
through the Internet is one possible solution
to this challenge. To achieve this solution,
we have implemented a Web-based collec-
tions catalog to encourage collaborative
maintenance of collections data as a shared
resource. Anyone can search the catalog via
a simple interface designed for any stan-
dard Web browser, and Web users can also
be authorized to add information or update
records as stratigraphic and taxonomic
concepts change. The goal is to establish
two-way communication between our catalog
and the scientific community wherein
the museum shares its collections and re-
lated data, and in return the community
contributes new data acquired through use
of the collections. The catalog also provides
a basic function for building links with on-
line publications and other data sources. As
data exchange standards become accepted,
these links can be used to create metada-
tabases that could lead to global networks
of collections, taxonomic, stratigraphic, and
bibliographic information. By providing an
efficient mechanism to locate and synthesize
large volumes of disparate information,
such loosely integrated systems have result-
ed in rapid progress in disciplines of the life
1Current Address: Department of Palaeontology, Natural History Museum, Cromwell Road, London SW7 5BD, UK.
and physical sciences, and they represent
one way forward into a data-rich future for
paleontology.
Keywords: geoinformatics, paleontology,
collections.
INTRODUCTION
Fossil specimens are the best record of the
occurrence of a particular organism at a spe-
cific time and place (Allmon and Poulton,
2000), so collections are the raw data of pa-
leontology. Collections are required for sub-
sequent researchers to check and reinterpret
previous work, and they are an important
source of new information that can be released
by the arrival of new technologies and new
research questions. For example, collections
have been used in studies based on morpho-
metric analysis, molecular methods including
DNA sequencing, and various geochemical
techniques (Suarez and Tsutsui, 2004; Allmon, 2005).
Collections held by museums become
especially important in cases where
original exposures are no longer available for
collecting, as is commonly the case for man-
made exposures produced during road build-
ing, quarrying, or construction. However, col-
lections of fossils are only useful if they are
accessible to potential users. Traditional use of
paleontology collections required researchers
to visit museums and work with material on-
site or resort to secondary sources in the pub-
lished literature. In reality, much of the infor-
mation about the contents of paleontology
collections is passed along by word of mouth,
as a kind of folklore: for example, Heinz Lowenstam
was a professor at the California Institute
of Technology, so his collections might
be held by an institution in Southern Califor-
nia. Obviously, this is not the most efficient
method to advertise the availability of impor-
tant research collections. For at least a decade,
it has been clear that the World Wide Web is
an ideal forum to publish collections catalogs.
Besides widespread availability and ease of
access, the Internet offers the additional ben-
efit of allowing databases to be integrated into
new networks of bioinformatics and geoinfor-
matics (Graham et al., 2004). Such networks
enable researchers to address questions regarding
the large-scale history of regional or
global diversity in response to global environmental
change (e.g., Jackson and Johnson,
2000; Alroy et al., 2001), and are an inevitable
part of the future of paleontology.
Most natural history collections belong to
public or nonprofit institutions that hold their
collections in the public trust (American As-
sociation of Museums, 2005). However, many
of these institutions have recently been subject
to budget shortfalls (Dalton, 2003; Suarez and
Tsutsui, 2004) that have reduced support for
collections. At the same time, changing or-
ganizational priorities has resulted in the
transfer of collections to a smaller number of
institutions (Gropp, 2003). For example, the
Department of Invertebrate Paleontology at
the Natural History Museum of Los Angeles
County (LACMIP) currently contains collec-
tions that formerly belonged to the University
of Southern California, the University of Cal-
ifornia at Los Angeles, the California Institute
of Technology, and California State Univer-
sity, Northridge. The consequence of these
transfers is that relatively small staffs are car-
ing for many large and important collections
that are critical to the future of paleontology.
Besides limitations in manpower, there is an
increasing shortage of expertise. With reduced
staff, most institutions do not have in-house
experts that can serve as taxonomic authorities
in the entire spectrum of fossil groups repre-
sented in their enormous combined collec-
tions. Without this expert knowledge in-house,
it is difficult to adequately maintain and im-
Figure 1. An example of a specimen lot from the Department of Invertebrate Paleontology at the Natural History Museum of Los
Angeles County (LACMIP) collections, including paper labels containing potentially useful information that should be incorporated into
the LACMIP specimen catalog.
prove collections without enlisting the support
of experts in the broader paleontological com-
munity. This outside assistance must come
from the researchers using museum collec-
tions to address questions in their own spe-
cialized fields, whether Cambrian trilobites of
the Great Basin or Pleistocene mollusks of
western North America. Collections managers
provide free access to specimens and data, but
sharing must become a two-way street. The
research community using these resources
must contribute its expertise to ensure contin-
ued access to high-quality information. Other
fields of research within bioinformatics are
reaching the same conclusion (Eiden, 2004;
Wilson, 2005). To help achieve this, we have
developed a Web-based collections catalog
that can be jointly managed by the museum
Figure 2. A schematic model illustrating the architecture underlying the Department of Invertebrate Paleontology at the Natural History
Museum of Los Angeles County (LACMIP) catalog. All information is stored in a relational database and is accessible through four
user interfaces. Web forms and REST-style Web services can be used to search, browse, and add information into the system. Software
underlying each component is indicated in parentheses, including Apache Web server, PostgreSQL database management system, and
SQL and PHP programming languages. Connections between Web server and clients on the World Wide Web can be encrypted using
the mod_ssl module available with the Apache Web server software.
collections staff and research community as a
shared resource.
The LACMIP holds more than five million
specimens, primarily from the western United
States, including the world’s largest collec-
tions of Cretaceous and Neogene mollusks
from western North America. Our collections
have been built over the past 90 yr and include
the important university collections mentioned
above that were transferred to the museum as
local universities decided to eliminate their re-
search collections. The department is currently
housed in an off-site facility about a half mile
from the main museum. This site contains col-
lections storage as well as laboratories and
staff offices. Within the collections space, the
fossils are stored in 674 steel cabinets. Spec-
imens collected from the same locality are
stored together, and the entire main collections
are arranged first according to geologic age
(Cambrian to Quaternary) and then by geo-
graphic place (country, state, county) within
each age. Each steel cabinet has a set of draw-
ers containing specimen lots. These are groups
of specimens from a single collecting locality
that have been identified as belonging to the
same taxon. Over the years, each lot may have
accumulated a group of paper labels that contains
information regarding the fossil collecting
locality and sometimes multiple taxonomic
determinations made by different
researchers who have studied the material. For
example, the gastropod illustrated in Figure 1
has four different hand-written and typed la-
bels that contain such data. One of our chal-
lenges is to capture these data and make them
available to the public.
Cataloging of the collection was started in
the 1960s with the development of a card-
based locality register. In this system, each lo-
cality was given a unique number and a card
with essential geographic and stratigraphic in-
formation. These numbers were attached to
specimens and became the primary identifi-
cation of specimen lots in the collection. A
similar card file system was developed for
type and figured specimens. Each type speci-
men was associated with a unique number and
was cross-referenced with specimen identifi-
cation and bibliographic information. During
the late 1980s these data were entered by hand
into a custom collections management system
developed in Borland Paradox. Nontype specimens
that had never been figured in publications
were not cataloged. However, the card
system continued to be maintained in parallel
with the computer database and was consid-
ered the standard. In 2002 we extracted the
data from the legacy database and reformed it
into a new system.
THE LACMIP COLLECTIONS
CATALOG
Our goal was to build an electronic catalog
that could meet the following objectives: (1)
The catalog must allow the rapid acquisition
of basic taxonomic, stratigraphic, geographic,
and bibliographic information. The majority
of these data need to be entered manually by
part-time staff with little training, mainly vol-
unteers and work-study students, so we have
developed entry forms with pick lists to min-
imize typing of long and unfamiliar scientific
names; (2) The catalog must be accessible
Figure 3. Forms for searching for collecting localities in the Department of Invertebrate Paleontology at the Natural History Museum
of Los Angeles County (LACMIP) catalog. A: Search form allows users to specify values for various fields. Continued on next page.
from any computer connected to the Internet.
To achieve this, we decided to take advantage
of a Web architecture approach and the
existing mature technology developed for e-commerce
sites on the World Wide Web. This
decision was made both to streamline the de-
velopment process and to allow access for the
broad community of research scientists con-
tributing to the site as well as museum staff
working in other locations; (3) The system
must be able to share information with outside
data networks in geoinformatics and other sys-
tems in our own institution. Therefore, we
used a multitiered application architecture to
facilitate this sharing; (4) The system must al-
low links to be made directly from online tax-
onomic publications to the type and figured
specimens in our collections. These are among
the most important materials in our collec-
tions, and we strive to maximize their expo-
sure for convenient use by the research com-
munity; and (5) Images of specimens,
collecting localities, and digital copies of field
notes, maps, and other resources must also be
available for remote use.
With these objectives in mind, we have de-
veloped a flexible, modular system that can be
adapted to changing technology because in-
dividual components can be added, modified,
or removed as necessary. This will allow the
system to be improved incrementally as new
technologies become available. For example,
the current system does not include collections
management functions so it cannot be used to
track loans, insurance values, or the physical
location of specimen lots. Our institutional
Office of the Registrar performs many of these
tasks, and we are building automated links
from their registration system to our collection
catalog. Our system also does not include a
sophisticated geographic information system
to allow mapping or geospatial analysis, nor
have we attempted to track complex synonymies
and changes in taxonomic practice. Instead
we plan to take advantage of other tools
developed especially for these purposes. For
example, we would likely cede responsibility
for maintaining taxonomic information to oth-
er systems when distributed taxonomic dictio-
naries become available for fossils. The
LACMIP electronic catalog has been designed
to publish information regarding our collec-
tions only.
The new LACMIP system has been developed
as a Web-based, client-server database
with multiple interfaces (Fig. 2). The data are
stored in a relational database as a backend,
using the PostgreSQL database system
(PostgreSQL Global Development Group,
2005). Some of the basic business logic is im-
plemented on this server including checks for
referential integrity and triggers that enforce
data updates. At the moment there are four
interfaces to the data. The most simple is an
interface that communicates via the SQL da-
tabase programming language (Wikipedia,
2005) used for administration and mainte-
nance. Three interfaces written in the PHP
scripting language (PHP Group, 2005) run via
an Apache Web server (Apache Software
Foundation, 2005). Two of these interfaces are
Web forms that allow input, searching, and
browsing of the data using standard Web
browsers on any machine connected to the In-
ternet. One is composed of simple read-only
forms accessible to the public, and the second
interface includes data input forms and ac-
Figure 3. (Continued.) B: Results for a search for localities from Redding Formation in Shasta County include 88 localities. Note that
there can be multiple entries for each field, for example, the age of locality LACMIP 10726 has been refined from Cretaceous to Turonian
by Harry Filkorn in August 2004. Public view cannot be modified. Continued on next page.
cepts user authentication using secure proto-
cols. The third interface is a set of basic Web
services built under a Web architecture (Ja-
cobs, 2004) or ‘‘REST-like’’ philosophy
(Fielding, 2000) that allows integration with
other systems.
Data models for systematics collections
have been described in detail elsewhere (As-
sociation of Systematics Collections Commit-
tee on Computerization and Networking,
1992; Morris, 2000; Pullan et al., 2000; Ra-
guenaud et al., 2002), and further analysis is
not warranted here. Our underlying database
structure is loosely based on these other mod-
els. The goal was to keep the schema rela-
tively simple but to capture as much useful
information as possible. The subject areas in-
clude localities, taxonomy, lots, people, im-
ages, and a bibliography. One critical differ-
ence between our model and many other
systems is that we track multiple interpreta-
tions for most data fields. That is, data are
never deleted as new information is added.
This is in keeping with the fundamental par-
adigm of collections data as tools for online
collaboration. All additions are time stamped
and marked with the name of the person that
made the contribution. This allows researchers
to know who added the information and when
it was added. Therefore, anyone who is inter-
ested can track changes in the system.
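The practice of never deleting earlier interpretations can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the PostgreSQL backend, and the table name, column names, and the first contributor and both exact dates are hypothetical placeholders; only the refinement of locality 10726 from Cretaceous to Turonian by Harry Filkorn in August 2004 comes from the catalog example shown in Figure 3B.

```python
import sqlite3

# Hypothetical, simplified table: one row per interpretation of a field.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE locality_age (
        locality_id INTEGER NOT NULL,
        age         TEXT    NOT NULL,
        entered_by  TEXT    NOT NULL,
        entered_on  TEXT    NOT NULL   -- ISO 8601 date string
    )
    """
)


def add_age(locality_id, age, entered_by, entered_on):
    """Record a new interpretation; earlier rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO locality_age VALUES (?, ?, ?, ?)",
        (locality_id, age, entered_by, entered_on),
    )


def age_history(locality_id):
    """Return every interpretation for a locality, oldest first."""
    rows = conn.execute(
        "SELECT age, entered_by, entered_on FROM locality_age "
        "WHERE locality_id = ? ORDER BY entered_on, rowid",
        (locality_id,),
    )
    return rows.fetchall()


# Placeholder contributor and date for the original entry.
add_age(10726, "Cretaceous", "catalog staff", "2002-01-01")
# Month and year follow Figure 3B; the day of the month is a placeholder.
add_age(10726, "Turonian", "Harry Filkorn", "2004-08-01")
history = age_history(10726)
```

Because rows are only ever inserted, the most recent interpretation is simply the last entry in the history, while the full list remains available to anyone who wants to see how a record has changed.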
Locality associated information includes
geographic, stratigraphic, and collection data.
Our use of locality is similar to the concept
of collecting event used in the ASC model
(Association of Systematics Collections Com-
Figure 3. (Continued.) C: In contrast, authorized users may add additional information using controls along left margin of form. Continued below.
Figure 3. (Continued.) D: Clicking the control for Unit results in a new form that can be used to add additional information regarding
stratigraphic units. This simple mechanism allows researchers to update the catalog as they use it from any computer connected to the
Internet.
Figure 4. A new collecting locality can be added using this Web form.
mittee on Computerization and Networking,
1992). In theory it would be possible to make
multiple collections from the same geographic
and stratigraphic context, but in practice many
repeated collections are not from precisely the
same context. Therefore, we consider each
new collection as a new locality in our system.
The collector, field number, and date of col-
lection are associated with the collecting lo-
cality in the LACMIP data model. Geographic
data are categorized as political place names
(city, county, state or province, country) and
supplemented by detailed written descriptions
provided by collectors. Geospatial data are in-
cluded where available and provided by the
collector (usually in the form of United States
township/range system or latitude/longitude),
but standardized georeferencing remains to be
completed. Stratigraphic information is limit-
ed to stratigraphic units (member, formation,
group) and associated age range. The chron-
ostratigraphic units used in the system are the
internationally accepted standard stage names
(Geological Society of America, 1999). Ad-
ditional information on stratigraphy and age
can be included in the text description for each
locality.
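As a rough illustration of the locality information listed above, the fields could be gathered into a single record type. This is only a sketch: the class name, attribute names, and the placeholder collector, field number, and date are assumptions, and the flat structure ignores the multiple interpretations that the actual catalog tracks for most fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Locality:
    """Hypothetical flat record for one collecting locality."""
    locality_id: int
    collector: str
    field_number: str
    collection_date: str
    country: str
    state_or_province: str
    county: str
    city: Optional[str] = None
    description: str = ""                  # collector's written description
    latitude: Optional[float] = None       # geospatial data, where available
    longitude: Optional[float] = None
    township_range: Optional[str] = None
    member: Optional[str] = None           # stratigraphic units
    formation: Optional[str] = None
    group: Optional[str] = None
    age_oldest: Optional[str] = None       # age range as standard stage names
    age_youngest: Optional[str] = None

    def has_coordinates(self) -> bool:
        """True when both latitude and longitude were provided."""
        return self.latitude is not None and self.longitude is not None


# Collector, field number, and date are placeholders, not catalog data.
example = Locality(
    locality_id=10726,
    collector="(placeholder)",
    field_number="(placeholder)",
    collection_date="(placeholder)",
    country="USA",
    state_or_province="California",
    county="Shasta",
    formation="Redding Formation",
    age_oldest="Turonian",
    age_youngest="Turonian",
)
```

Optional fields default to empty values because geospatial data and finer stratigraphic units are recorded only where the collector supplied them.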
A specimen lot is a group of specimens
from a collecting locality that has been sorted
out and identified as belonging to a particular
species or higher taxon. In theory all speci-
mens identified as the same species from a
single locality would be contained in one lot,
but in practice there might be more than one
lot of this species because of specimen abun-
dance, limitations in container size, or special
use of individual specimens from a lot (illus-
tration, geochemical analysis, etc.). Informa-
tion associated with specimen lots includes
taxonomic determinations, the number of
specimens in the lot, and whether the speci-
men has been cited in a published work. Dig-
Figure 5. A simple pick list mechanism can be used to select taxonomic names. A: For example, when adding a new lot, determination
is selected using a pick list. In this form, user is searching for the genus Chione. Continued on next page.
ital images of specimens are provided for
some specimen lots.
Managing taxonomic data is a complex
problem, and data models have been devel-
oped to track synonymies, changes in rank,
splitting, and the detailed consequences of
changing taxonomic concepts and practice
(Taxonomic Databases Working Group, 2004;
Shattuck, 2005). The LACMIP catalog records
updates to determinations of specimen lots
and allows users to search for lots using su-
praspecific classification. We use a combina-
tion of our legacy database and data from the
United States Department of Agriculture In-
tegrated Taxonomic Information System
(ITIS) (ITIS, 2005) as the starting point for
mollusks and corals, and we could easily in-
tegrate other taxonomic dictionaries as they
become available for fossil groups. Multiple
determinations can be included for each spec-
imen lot, and we have implemented a basic
system for tracking synonyms to aid in the
consistent application of taxon names.
Although collecting localities and specimen
lots are the basic units of information in our
catalog, we also maintain information regard-
ing associated personnel, digital images, and
a bibliography relevant to the LACMIP col-
lections. These supplementary modules have
been kept simple. People associated with the
collections include collectors, collections
maintenance staff, authorized users of the cat-
alog, and specialists who have contributed
data to the system. A basic bibliographic table
that allows publications to be associated with
localities and specimen lots is also main-
tained. Most collection localities are associ-
ated with maps, and these are referenced as
publications in the bibliography. In our current
catalog, images are maintained as digital files
on a fileserver at two resolutions. Thumbnails
are small compressed files with widths of 150
pixels for photographs and 300 pixels for field
maps or other scanned images. High-resolution
images are also available to the public at
widths of 450 pixels for specimens and 800
pixels for scanned materials. Image file data
are maintained in a basic image database as-
sociated with our catalog so that they can be
published over the World Wide Web.
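The four image widths given above can be expressed as a small lookup, sketched below. The function and key names are hypothetical, and the assumption that height is scaled in proportion to width (preserving the aspect ratio) is ours; the text specifies only the target widths.

```python
# Target widths in pixels, as stated in the text.
TARGET_WIDTHS = {
    ("thumbnail", "photograph"): 150,
    ("thumbnail", "scan"): 300,
    ("high", "photograph"): 450,
    ("high", "scan"): 800,
}


def scaled_size(width, height, resolution, kind):
    """Return (new_width, new_height), assuming the aspect ratio is preserved."""
    target = TARGET_WIDTHS[(resolution, kind)]
    return target, round(height * target / width)


thumb = scaled_size(1800, 1200, "thumbnail", "photograph")
large_map = scaled_size(1600, 1200, "high", "scan")
```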
A WEB INTERFACE TO THE
COLLECTIONS CATALOG
There are both public and restricted Web
interfaces to the LACMIP collections catalog
(Fig. 3A–D). The public interfaces allow re-
searchers to browse the catalog over the World
Wide Web (Johnson et al., 2005a; Fig. 3B).
Note that we track multiple interpretations for
Figure 5. (Continued.) B: One genus is found by that search and can be selected by choosing the Yes control. Continued on next page.
most data fields. The name of the person who
made each entry and the date of entry are in-
dicated in parentheses. Researchers can
browse through a set of localities or specimen
lots or can download the information for local
use. Data can be downloaded as delimited text
files that include only the most up-to-date in-
formation, because the full information asso-
ciated with any particular locality cannot be
represented in a simple two-dimensional table
if multiple interpretations are present for any
piece of information. For printing specimen
lot labels or hard copies of locality informa-
tion, portable document format (PDF) files
can be downloaded. A thumbnail is shown if
images are available, and higher-resolution
images can be viewed by clicking on the
thumbnail. The restricted Web forms can be
accessed using our secure Web server. Muse-
um collections staff and researchers interested
in contributing to the system are assigned user
names and passwords that are required to ac-
cess this part of the site. Restricted forms for
searching and browsing the catalog are similar
to the public pages except they allow input of
additional data.
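The reduction of multiply interpreted records to a two-dimensional table for download can be sketched as follows. The in-memory structure, function names, and the placeholder contributor and dates are assumptions made for illustration; the sketch simply keeps the most recently dated value for each field and writes a tab-delimited file.

```python
import csv
import io

# Hypothetical records: each field holds a list of
# (value, entered_by, entered_on) interpretations.
localities = {
    10726: {
        "age": [
            ("Cretaceous", "catalog staff", "2002-01-01"),   # placeholder entry
            ("Turonian", "Harry Filkorn", "2004-08-01"),     # day is a placeholder
        ],
        "unit": [("Redding Formation", "catalog staff", "2002-01-01")],
    },
}


def latest(interpretations):
    """Pick the value with the most recent ISO 8601 entry date."""
    return max(interpretations, key=lambda item: item[2])[0]


def export_tsv(records, fields):
    """Write one row per locality, keeping only the newest value per field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["locality"] + list(fields))
    for locality_id in sorted(records):
        row = [locality_id] + [latest(records[locality_id][f]) for f in fields]
        writer.writerow(row)
    return buffer.getvalue()


tsv_text = export_tsv(localities, ["age", "unit"])
```

The earlier interpretation (Cretaceous) is absent from the exported row, which is the trade-off described above: a flat table is convenient for local use but cannot carry the full history.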
The initial entry of locality and lot records
into the catalog can only be performed by mu-
seum collections staff. There are data entry
forms for each of the main subject areas (Fig.
4), written as standard hypertext markup lan-
guage (HTML) Web forms. An online data en-
try guide is provided to ensure consistent data
input, and pick lists have been implemented
where possible to minimize typographical er-
rors. For example, when a determination is
made, there are several steps to selecting a
taxon name (Fig. 5A–D). Also, modern Web
browsers have autocomplete functions that
may reduce typographic errors. There is a sim-
ple mechanism to increase the consistent use
of taxonomic names via tracking synonyms.
Junior synonyms can be associated with senior
synonyms so that when a junior synonym is
requested as a taxonomic determination, both
that name and the senior synonym are re-
turned as determinations. In general, this
interface has been designed to minimize
potential data-entry errors because much in-
formation is hand keyed into the catalog by
assistants who may have limited geological or
taxonomic expertise. However, information is
not proofed and all data entered into the sys-
tem are immediately available to the research
community.
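The synonym mechanism described above amounts to a simple lookup from junior to senior names. The sketch below shows one way it might behave; the mapping, the function name, and the taxon names are invented placeholders rather than entries from the catalog.

```python
# Hypothetical mapping from junior synonym to senior synonym.
# Both names are invented placeholders, not real taxa.
SENIOR_SYNONYM = {
    "Examplea junior": "Examplea senior",
}


def resolve_determination(name):
    """Return the requested name, plus its senior synonym if one is recorded."""
    senior = SENIOR_SYNONYM.get(name)
    if senior is None:
        return [name]
    return [name, senior]


both = resolve_determination("Examplea junior")
single = resolve_determination("Examplea senior")
```

Returning both names keeps the determination exactly as it was made while still steering later searches toward the name in current use.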
Both the public and authorized researchers
can search and browse the data using the pro-
vided set of Web forms. For example, to find
all localities in Shasta County from the Redd-
ing Formation (Fig. 3A), a researcher needs to
fill in the appropriate fields on the locality
search form. In this case, a total of 88 local-
ities is returned, and users may browse
through them one by one (Fig. 3B). Alter-
nately, a researcher could return to the search
form and limit or refine the search (using the
Figure 5. (Continued.) C: The pick list can then be used to select the appropriate subgenus and species within the genus. If a particular
taxon is not found in the system, it can be added by selecting entry for New in pick list. Continued on next page.
Modify Search control), or the researcher
could download the entire data set either as a
text file or a PDF-formatted file that is ready
to be printed. Authorized researchers see a
slightly different view (Fig. 3C), because they
are able to add information. The labels asso-
ciated with each line of data are now controls
that may be used to access additional forms
for data entry. For example, to add new in-
formation regarding the stratigraphic unit of a
locality, a contributor would click on the but-
ton marked Unit to use the appropriate form
(Fig. 3D).
Searching and browsing for specimen lots
is similar to working with locality data and
can be performed using a similar set of Web
forms (Fig. 6A–C). A search can be per-
formed for both lot information and the lo-
cality from which the lots were collected. For
example, a search for the gastropod genus
Paosia from the Redding Formation (Fig. 6A)
returns eight lots from a selection of localities
(Fig. 6B). Data for this list of lots can then be
downloaded as a text file by selecting the
Download Lot List, or labels for specimen
trays can be produced by selecting Create
Labels. Information for one of the lots (lot
LACMIP 10726-2) is shown in Figure 6C.
However, the downloaded data will not in-
clude all of the information associated with
this lot because this information cannot be or-
ganized into a simple two-dimensional table.
This lot has been identified several times, first
as Oonia? californica (Gabb, 1864), later as
Paosia colusaensis (Anderson, 1958), and
most recently as Paosia californica (Gabb,
1864). In addition, the specimen lot has been
cited in two publications (Jones et al., 1978;
Squires and Saul, 2004) as type specimen
LACMIP 10810. Several images are also
available that can be downloaded in high res-
olution. Information about locality LACMIP
10726 is at the bottom of the lot page, includ-
ing a map. This series of Web forms provides
the primary interface for the catalog. Similar
forms exist to search, browse, and add biblio-
graphic and biographic information. A com-
prehensive user guide that will assist research-
ers with use of the system, including standards
for data entry, is available through a link on
all of the forms.
As of May 2005, our entire locality register
of 27,970 collections has been included in the
catalog. To date 28,197 specimen lots have
been cataloged comprising 601,409 individual
specimens. We estimate that this includes
20% of our complete collection, but we do
not have precise estimates for the total size of
the collection. In fact, during the cataloging
process we are finding that the previous at-
tempts to estimate collection size probably are
25%–30% lower than the true figure. A sim-
ilar result may be obtained during cataloging
of other large paleontological collections. The
majority of these records is derived from our
extensive holdings of Neogene Mollusca from
Figure 5. (Continued.) D: In this case Chione (Chionista) fluctifraga is selected as the determination for the new lot.
Southern California. Cataloging of this mate-
rial was determined to be a priority due to the
potential use for studies of the impact of re-
gional environmental change on shallow ma-
rine communities. In addition, our complete
set of type and figured specimen lots has been
incorporated, including 10,429 specimens.
These are the most important components of
the collection, so they were a priority for
cataloging.
WEB SERVICES
There are several problems with the type of
Web forms interface outlined above. Most se-
rious is the requirement for human intervention
to locate a particular piece of information re-
garding a particular locality or specimen lot.
This means that it is difficult to generate direct
links to information, for example to link from
another Web site to one particular locality. Sec-
ondly, the ‘‘Web spider’’ programs used by
standard Web search engines to index Web pag-
es cannot access Web forms easily. To over-
come these limitations, we have designed a
simple Web interface to the LACMIP catalog
that allows direct linking to individual locality,
specimen lot, type specimen, and digital image
records. We have followed a REST-like archi-
tecture (Fielding, 2000) that takes advantage of
existing Web protocols to allow access to our
data from outside systems. Each data resource
is represented by a Web address or uniform resource
locator (URL). These addresses are static
and easy to construct if the user knows what
he or she is looking for. For example, the URL
http://ip.nhm.org/ipdatabase/locality/17575 will
return whatever information we have regarding
locality 17575, and the URL
http://ip.nhm.org/ipdatabase/lot/10762-2 will link directly to
information about specimen lot 10762-2. Similar
links exist for type specimens and images
of specimen lots. For example, the URL
http://ip.nhm.org/ipdatabase/type/9786 links directly
to type specimen LACMIP 9786. The returned
pages are not static Web pages but are gener-
ated by the Web server at each request so they
are always up to date. As standard schemas for
the publication of paleontological specimen
data become available, we will be able to pub-
lish extensible markup language (XML)–
formatted information using this mechanism.
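The addressing scheme described above can be illustrated with a minimal Python sketch. The base address and the three path names (locality, lot, type) are taken from the example URLs in the text; the helper name resource_url is hypothetical, and the path used for image records is not given in the text, so it is left out.

```python
from urllib.parse import quote

# Base address quoted in the example URLs in the text.
BASE_URL = "http://ip.nhm.org/ipdatabase"

# Resource kinds whose URL paths appear in the text.
RESOURCE_KINDS = ("locality", "lot", "type")


def resource_url(kind, identifier):
    """Return the address of one catalog record (hypothetical helper)."""
    if kind not in RESOURCE_KINDS:
        raise ValueError(f"unknown resource kind: {kind!r}")
    # Percent-encode the identifier so it is safe as a single path segment.
    return f"{BASE_URL}/{kind}/{quote(str(identifier), safe='')}"


print(resource_url("locality", 17575))
print(resource_url("lot", "10762-2"))
print(resource_url("type", 9786))
```

Because each address is built only from a resource kind and an identifier, an outside system can construct a link without first querying a Web form.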
As a test of this Web services interface, we
developed a system that allows joint queries
Figure 6. Specimen lot data can be browsed and new information can be added to the Department of Invertebrate Paleontology at the
Natural History Museum of Los Angeles County (LACMIP) collections catalog using Web forms. A: A search for the gastropod genus
Paosia from the Redding Formation is performed by entering Paosia and Redding in the appropriate fields. Continued on next page.
across both the Holocene and fossil mollusk
collections at the Natural History Museum of
Los Angeles County (LACM; Johnson et al.,
2005b). In our institution, most departments
use different systems that are appropriate for
the needs of each department. For example,
the LACM Holocene malacology database re-
quires no treatment of stratigraphy, and the in-
vertebrate paleontology database has no way
to track water depth. In the joint search tool,
searches are performed on a subset of fields
from the LACM malacology and LACMIP da-
tabases, and the results include links back into
the original databases so users can access
more complete information that might not be
contained in both systems (Fig. 7A–C).
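The idea behind the joint search can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the actual tool: the records, the field names, and the malacology link address are all made up for the example. Only the LACMIP lot address pattern comes from the text.

```python
# Toy records; identifiers, field names, and values are made up for illustration.
FOSSIL_LOTS = [
    {"lot": "0000-1", "genus": "Terebralia", "species": "sp.", "formation": "Example Formation"},
    {"lot": "0000-2", "genus": "Paosia", "species": "sp.", "formation": "Example Formation"},
]
HOLOCENE_LOTS = [
    {"catalog_no": "H-0001", "genus": "Terebralia", "species": "sp.", "depth_m": 2.0},
]

# Only fields present in both databases can be searched jointly.
SHARED_FIELDS = ("genus", "species")


def fossil_link(record):
    # Address pattern for LACMIP specimen lots, as given in the text.
    return f"http://ip.nhm.org/ipdatabase/lot/{record['lot']}"


def holocene_link(record):
    # Hypothetical address; the text does not give the malacology URL scheme.
    return f"http://malacology.example.org/lot/{record['catalog_no']}"


SOURCES = {
    "LACMIP": (FOSSIL_LOTS, fossil_link),
    "LACM": (HOLOCENE_LOTS, holocene_link),
}


def joint_search(**criteria):
    """Match records in both collections on shared fields only."""
    unknown = set(criteria) - set(SHARED_FIELDS)
    if unknown:
        raise ValueError(f"not a shared field: {sorted(unknown)}")
    results = []
    for source, (records, make_link) in SOURCES.items():
        for record in records:
            if all(str(record.get(f, "")).lower() == str(v).lower()
                   for f, v in criteria.items()):
                row = {f: record.get(f) for f in SHARED_FIELDS}
                row["source"] = source
                # Link back to the full record in the original database.
                row["link"] = make_link(record)
                results.append(row)
    return results


for row in joint_search(genus="Terebralia"):
    print(row["source"], row["link"])
```

Each result carries only the shared fields plus a link, so details that exist in one system alone (stratigraphy for fossils, water depth for Holocene lots) stay in the original database and are reached through the link.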
DISCUSSION
Developing any information system re-
quires compromise. Our priority has been to
publish as much collections-related informa-
tion as possible with limited resources. The
quality of these data varies, but even imperfect
data can be useful (Lieberman and Kaesler,
2000). Furthermore, we acknowledge that as
museum curators we will never have the re-
sources to fully verify the immense volume of
information held in our catalog. Instead, the
paleontological community must help with
this never-ending task. The LACMIP catalog
is a living document that is constantly being
improved by museum staff and database users,
and the information published in it should not
be used uncritically in large compilations. Al-
though effort is made to publish only accurate
information, there is large variation in the
quality of the information included in the cat-
alog. Stratigraphic and taxonomic concepts
change with time, and these updates are not
always included in the system. Indeed, we
Figure 6. (Continued.) B: This search returned eight lots from various collecting localities; selecting one of the buttons on the left of
the page will return more information for a particular lot. Continued on next page.
hope that users will help improve the data, and
we ask that the authors publishing studies
based on information in the LACMIP catalog help update the catalog with new information
and interpretations resulting from their research.
We also expect authors to include citations
to the LACMIP catalog if they have
used it as a data source.
Besides sharing information with our com-
munity of researchers, we encourage links to
the LACMIP catalog from online versions of
publications that make use of our collections.
Such links should enhance greatly the utility
of research collections catalogs (National Re-
search Council, 2002). For example, papers
published in the online version of the Journal
of Paleontology or Geosphere could contain
direct links from specimen or locality citations
to the LACMIP catalog. Such links allow
readers rapid access to the most up-to-date in-
formation available. Changes in the interpre-
tation of stratigraphy, environment, or taxo-
nomic classification cannot be tracked in a
static document, but the static document can
provide links back to systems that can be
changed. Similar links could be incorporated
into compilations of paleontological occurrences
based on published records, thus allowing
users direct access to the underlying data and allowing database administrators to
automatically track revisions in data associated
with museum collections. In the current im-
plementation we have adopted a Web archi-
tecture approach rather than a more complex
Web services approach. The benefit of this
type of interface is that it can be implemented
right now—the protocols exist, and they are
simple to use. The only software required to
view the catalog is a standard Web browser.
Furthermore, as new data standards and mes-
saging protocols develop we will be able to
accommodate them into new versions of the
LACMIP collections catalog.
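As an illustration of how an online publication might add such links, the following Python sketch rewrites catalog citations in running text as hyperlinks. The citation formats it recognizes ("LACMIP loc. 17575", "LACMIP lot 10762-2", "LACMIP type 9786") are assumptions made for this example, as is the function name link_citations; only the target address patterns come from the catalog URLs described earlier.

```python
import re

BASE_URL = "http://ip.nhm.org/ipdatabase"

# Assumed citation keywords mapped to the URL path of each resource kind.
KIND_PATHS = {"loc.": "locality", "lot": "lot", "type": "type"}

# Assumed citation pattern: "LACMIP", a keyword, then a number such as 17575 or 10762-2.
CITATION = re.compile(r"\bLACMIP (loc\.|lot|type) (\d+(?:-\d+)?)")


def link_citations(text):
    """Wrap each recognized catalog citation in an HTML link (hypothetical helper)."""
    def to_anchor(match):
        kind, number = match.group(1), match.group(2)
        url = f"{BASE_URL}/{KIND_PATHS[kind]}/{number}"
        return f'<a href="{url}">{match.group(0)}</a>'
    return CITATION.sub(to_anchor, text)


print(link_citations("Specimens from LACMIP loc. 17575 include LACMIP lot 10762-2."))
```

The published text itself stays fixed, while the linked catalog record can continue to change as identifications and stratigraphic interpretations are revised.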
Probably the main obstacle to the wide-
spread adoption of community-based catalogs
is encouraging qualified researchers to con-
tribute hard-earned data to a collaborative sys-
tem. There are several potential models to rec-
tify this problem, some of which offer a
“carrot,” and others that threaten a “stick.”
For example, we could require some level of
contribution as a condition for providing loans
of specimens or access to the collections catalog,
but so far we reject this approach because
it might result in reduced collections use. An alternative is to provide a mechanism
by which contributors could receive
professional credit, in the form of measures
that could be added to curricula vitae or
management reports used in professional performance
reviews. To achieve this, we plan to
implement an electronic recorder or score-
board that lists the number and type of data
contributed by each member of the commu-
nity using the catalog. As links develop into
the catalog from online journals or other pub-
lications, this scoreboard could be used to
track usage of particular types of information,
and the resulting track record could be used
to weight contributions from individual re-
searchers in the same way that publications
are weighted based on the number of times
that they are cited in works of other authors.
However, in the end, researchers and other us-
ers of information in the LACMIP catalog
must take on part of the responsibility for
maintaining this shared resource. As a com-
munity, we all require high-quality informa-
Figure 6. (Continued.) C: For example, the complete record for lot 10726-2 includes taxonomic determination, type status, citation
information, and collecting locality details. In this case images of the specimen and a map of the collecting locality are available.
Figure 7. The Web services interface to the Department of Invertebrate Paleontology at the Natural History Museum of Los Angeles
County (LACMIP) catalog has been used to construct a joint search tool for the catalogs of the Department of Invertebrate Paleontology
and the Malacology Section (LACM) of the Natural History Museum of Los Angeles County. A: This search form allows researchers
to locate specimens from two different data sets. Continued on next page.
tion to place fossils in the proper taxonomic,
stratigraphic, and geologic context because the
scientific value of paleontological collections
lies as much in this context as in the fossils
themselves. Unfortunately, with the funding
levels currently available for the support of
collections, museum staff will never be able
to maintain and update all of the information
for researchers and other users of the catalog.
The resulting bottleneck will impede progress
by limiting the availability of up-to-date in-
formation. To avoid this, the community of
paleontologists must perform as much of the
required maintenance and updating as their
collections use dictates. These data must be-
come a shared resource maintained by all as
we move together into a data-rich future for
paleontology.
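The contribution scoreboard proposed above has not been specified in detail. One possible shape for it, offered only as a sketch, is a tally of contributions per person and per type of data. The class name, method names, and contribution types below are hypothetical.

```python
from collections import Counter, defaultdict


class Scoreboard:
    """Tally contributions by contributor and by type of data (hypothetical design)."""

    def __init__(self):
        # Maps contributor name -> Counter of contribution types.
        self._counts = defaultdict(Counter)

    def record(self, contributor, kind):
        """Register one contribution of the given type."""
        self._counts[contributor][kind] += 1

    def summary(self, contributor):
        """Return the number of contributions of each type for one person."""
        return dict(self._counts.get(contributor, {}))

    def ranking(self):
        """Return (contributor, total) pairs, highest total first, ties by name."""
        totals = {name: sum(c.values()) for name, c in self._counts.items()}
        return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


board = Scoreboard()
board.record("Contributor A", "taxonomic determination")
board.record("Contributor A", "stratigraphic update")
board.record("Contributor B", "taxonomic determination")
print(board.summary("Contributor A"))
print(board.ranking())
```

Weighting contributions by how often the corresponding records are linked from publications, as suggested in the text, would require usage data that this sketch does not model.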
ACKNOWLEDGMENTS
Much of the data in the LACMIP collections catalog was entered by our predecessors including J.M. Alderson, L.T. Groves, G. Kennedy, P.G. Owen, and E.C. Wilson. Our team of work study students, research associates, and volunteers includes M. Alonso, J. Cline, S. Cowles, A. Fu, B. Gillies, L. Moore, H. Murdock, L.R. Saul, J. Severe, R.J. Stanton Jr., and J. Wiggins. We thank C.M. Kelly for producing many of the photographs. W. Allmon, W. Kiessling, D. Pentcheff, A. Valdes, and R. Wetzer provided useful suggestions for improving this contribution. We gratefully acknowledge the support of the United States National Science Foundation (grant DBI-0237337).
REFERENCES CITED
Allmon, W.D., 2005, The importance of museum collec-
tions in paleontology: Paleobiology, v. 31, p. 1–5.
Allmon, W.D., and Poulton, T.P., 2000, The value of fossil
collections, in White, R.D. and Allmon, W.D., eds.,
Guidelines for the management and curation of invertebrate
fossil collections: Paleontological Society Special Publication 10, p. 5–24.
Alroy, J., and 24 others, 2001, Effects of sampling standardization
on estimates of Phanerozoic marine diversification:
Proceedings of the National Academy of
Sciences, v. 98, p. 6261–6266.
American Association of Museums, 2005, Code of ethics for
museums: http://www.aam-us.org/museumresources/
ethics/coe.cfm (May 2005).
Anderson, F.M., 1958, Upper Cretaceous of the Pacific coast:
Geological Society of America Memoir 71, 378 p.
Apache Software Foundation, 2005, Apache http server,
v. 1.3.33: http://www.apache.org (February 2005).
Association of Systematics Collections Committee on Computerization and Networking, 1992, An information model for biological collections: http://www.nscalliance.org/bioinformatics/asc%20model/Ascmodrpt.pdf (December 2004).
Figure 7. (Continued.) B: A search for genus Terebralia in both LACM and LACMIP collections results in a total of six lots, including
four from fossil localities and two sites from the Holocene. Continued on next page.
Dalton, R., 2003, Natural history collections in crisis as
funding is slashed: Nature, v. 423, p. 575.
Eiden, L.E., 2004, A two-way bioinformatic street: Science,
v. 306, p. 1437, doi: 10.1126/science.1107196.
Fielding, R.T., 2000, Architectural styles and the design of
network-based software architectures [Ph.D. thesis]: Ir-
vine, University of California, http://www1.ics.uci.edu/
%7Efielding/pubs/dissertation/top.htm (October 2004).
Gabb, W.M., 1864, Description of the Cretaceous fossils:
Palaeontology, v. 1, p. 55–236.
Geological Society of America, 1999, 1999 geologic time
scale: http://www.geosociety.org/science/timescale/
timescl.htm (January 2005).
Graham, C.H., Ferrier, S., Huettman, F., Moritz, C., and
Peterson, A.T., 2004, New developments in museum-
based informatics and applications in biodiversity
analysis: Trends in Ecology and Evolution, v. 19,
p. 497–503, doi: 10.1016/j.tree.2004.07.006.
Gropp, R.E., 2003, Are university natural science collections going extinct?: Bioscience, v. 53, p. 550.
Integrated Taxonomic Information System, 2005, Integrated Taxonomic Information System: http://www.itis.usda.gov (February 2005).
Jackson, J.B.C., and Johnson, K.G., 2000, Life in the last
few million years, in Erwin, D.H., and Wing, S.L.,
eds., Deep time: Paleobiology’s perspective: Paleobi-
ology, supplement to v. 26, p. 221–235.
Jacobs, I., ed., 2004, Architecture of the World Wide Web,
volume one: http://www.w3.org/TR/2004/REC-
Webarch-20041215 (December 2004).
Johnson, K.G., Filkorn, H.F., and Stecheson, M., 2005a,
Collections catalog of the Department of Invertebrate
Paleontology, Natural History Museum of Los Angeles County: http://ip.nhm.org (February 2005).
Johnson, K.G., Valdes, A., and Groves, L.T., 2005b, Extinct
and extant molluscs in the collections of the Natural
History Museum of Los Angeles County: http://
ip.nhm.org/nhmsearch/findlots.php (May 2005).
Jones, D.L., Sliter, W.V., and Popenoe, W.P., 1978, Mid-
Cretaceous (Albian to Turonian) biostratigraphy of
northern California: Annales de Museum l’Histoire
Naturelle de Nice, v. 4, p. xxii.1–xxii.13.
Lieberman, B.S., and Kaesler, R.L., 2000, The scientific
value of natural history museum collections, in White,
R.D. and Allmon, W.D., eds., Guidelines for the man-
agement and curation of invertebrate fossil collections:
Paleontological Society Special Publication 10,
p. 109–117.
Morris, P.J., 2000, A model for invertebrate paleontology collections information, in White, R.D. and Allmon,
W.D., eds., Guidelines for the management and cura-
tion of invertebrate fossil collections: Paleontological
Society Special Publication 10, p. 155–260.
National Research Council, 2002, Geoscience data and col-
lections: National resources in peril: National Acade-
my of Sciences, 128 p.
PHP Group, 2005, PHP version 4.3.10: http://www.php.net
(February 2005).
PostgreSQL Global Development Group, 2005, PostgreSQL, version 8.0: http://www.postgresql.org (February 2005).
Pullan, M.R., Watson, M.F., Kennedy, J.B., Raguenaud, C.,
and Hyam, R., 2000, The Prometheus Taxonomic
Model: A practical approach to representing multiple
classifications: Taxon, v. 49, p. 55–75.
Raguenaud, C., Pullan, M.R., Watson, M.F., Kennedy, J.B.,
Newman, M.F., and Barclay, P.J., 2002, Implementa-
tion of the Prometheus Taxonomic Model: A compar-
ison of database models and query languages and an
introduction to the Prometheus Object-Oriented Mod-
el: Taxon, v. 51, p. 131–142.
Figure 7. (Continued.) C: Links are available on the page along with results that allow researchers convenient access to the additional
information in the LACMIP catalog. For example, the full information for specimen lot 26814-1 is available from http://ip.nhm.org/
ipdatabase/lot/26814-1.
Shattuck, S., 2005, Biolink, version 2.0: http://www.ento.
csiro.au/biolink/index.html (January 2005).
Squires, R.L., and Saul, L.R., 2004, The pseudomelaniid
gastropod Paosia from the marine Cretaceous of the
Pacific slope of North America and a review of the
age and paleobiogeography of the genus: Journal of
Paleontology, v. 78, p. 484–500.
Suarez, A.V., and Tsutsui, N.D., 2004, The value of mu-
seum collections for research and society: Bioscience,
v. 54, p. 66–74.
Taxonomic Databases Working Group, 2004, International
Working Group on Taxonomic Databases: http://
www.tdwg.org (November 2004).
Wikipedia, 2005, SQL, in Wikipedia, the free encyclopedia:
http://en.wikipedia.org/wiki/SQL (April 2005).
Wilson, E.O., 2005, Systematics and the future of biology:
Proceedings of the National Academy of Sciences of the United States of America, v. 102, p. 6520–6521, doi: 10.1073/pnas.0501936102.
MANUSCRIPT RECEIVED BY THE SOCIETY 18 FEBRUARY 2005
REVISED MANUSCRIPT RECEIVED 10 MAY 2005
MANUSCRIPT ACCEPTED 17 MAY 2005
Printed in the USA