Ray Denenberg, Ralph LeVan: Interoperability Standards & Searching Multiple Repositories. Workshop 20. March 25, 2006; Washington

Page 1:

Ray Denenberg

Ralph LeVan

Interoperability Standards & Searching Multiple Repositories

Workshop 20

March 25, 2006; Washington

Page 2:

SRW (Brief) History

Page 3:

Late 90s … initiatives to make Z39.50 …

Simpler

More comprehensible

More easily implemented

Web compatible

… while retaining the rich semantics developed over many years

Page 4:

Fast Forward ......

Page 5:

SRW/U

SRW: Search and Retrieve Web Service

SRU: Search and Retrieve via URL

SRW/U: Search and Retrieve for the Web

Page 6:

Classic Z39.50: Z39.50 over TCP

Page 7:

SRW: over SOAP/HTTP

Classic Z39.50: Z39.50 over TCP

SRW (“Search and Retrieve Web Service”): SRW over SOAP over HTTP over TCP

Page 8:

SRU: over HTTP

Classic Z39.50: Z39.50 over TCP

SRU (“Search and Retrieve via URL”): SRU over HTTP over TCP

Page 9:

SRW in a SOAP Envelope

<SOAP:Envelope>
  <SOAP:Body>
    <SRW:searchRetrieveRequest xmlns:SRW="http://www.loc.gov/zing/srw/v1.0/">
      <SRW:query>dinosaurs</SRW:query>
      <SRW:startRecord>1</SRW:startRecord>
      <SRW:maximumRecords>10</SRW:maximumRecords>
      <SRW:recordSchema>mods</SRW:recordSchema>
    </SRW:searchRetrieveRequest>
  </SOAP:Body>
</SOAP:Envelope>

Page 10:

… Same request via SRU:

http://acme.com/sru?query=dinosaurs&maximumRecords=10&startRecord=1&recordSchema=mods
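The correspondence between the SOAP request and the SRU URL can be sketched in Python: each child element of searchRetrieveRequest becomes a URL parameter of the same name. This is a minimal illustration and not part of the original slides. The function name soap_to_sru_url is hypothetical, the SOAP envelope namespace is an assumption (the slide does not declare it), and acme.com is the slides' placeholder host.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# The SRW namespace is copied from the slide. The SOAP 1.1 envelope
# namespace is an assumption: the slide omits that declaration.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SRW_NS = "http://www.loc.gov/zing/srw/v1.0/"

SOAP_REQUEST = f"""
<SOAP:Envelope xmlns:SOAP="{SOAP_NS}">
  <SOAP:Body>
    <SRW:searchRetrieveRequest xmlns:SRW="{SRW_NS}">
      <SRW:query>dinosaurs</SRW:query>
      <SRW:startRecord>1</SRW:startRecord>
      <SRW:maximumRecords>10</SRW:maximumRecords>
      <SRW:recordSchema>mods</SRW:recordSchema>
    </SRW:searchRetrieveRequest>
  </SOAP:Body>
</SOAP:Envelope>
"""


def soap_to_sru_url(soap_xml: str, base_url: str) -> str:
    """Turn the child elements of an SRW request into SRU URL parameters."""
    root = ET.fromstring(soap_xml)
    request = root.find(f"{{{SOAP_NS}}}Body/{{{SRW_NS}}}searchRetrieveRequest")
    # Strip the "{namespace}" prefix from each tag to get the parameter name.
    params = {child.tag.split("}", 1)[1]: (child.text or "").strip()
              for child in request}
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(soap_to_sru_url(SOAP_REQUEST, "http://acme.com/sru"))
```

The parameters come out in element order, so their order differs from the slide's URL; the order of URL parameters does not change their meaning.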

Page 11:

ZING

Page 12:

ZING

“Z39.50 (international) Next Generation”

Page 13:

ZING

Page 14:

ZING

srw sru

Page 15:

ZING

srw sru

cql

Page 16:

ZING

srw sru

cql

“Common Query Language”

Page 17:

ZING

srw sru

cql

“Common Query Language” --> “Contextual Query Language”

Page 18:

ZING

srw sru

cql

ZOOM

ez3950

zeeRex

Record Update

Page 19:

ZING

srw sru zeeRex

Z39.50 Explain: explained and re-engineered in XML

Page 20:

ZING

srw sru zeeRex

Z39.50

Page 21:

ZING

srw sru zeeRex

Page 22:

ZING

srw sru ZOOM

Z39.50 Object Oriented Model

Page 23:

ZING

srw sru ZOOM

Z39.50

Page 24:

ZING

srw sru ZOOM

Page 25:

ZING

srw sru

Page 26:

SRW/U retains these Z39.50 concepts …

result sets

abstract access points

abstract record schemas

application level diagnostics

“Explain”

Page 27:

… But differs from Z39.50 in these respects:

Web-based

Connectionless

XML

CQL

Page 28:

… But differs from Z39.50 in these respects:

Web-based

Connectionless

XML: protocol (no ASN.1) + records (no “record syntax”)

CQL: user-friendly query

Page 29:

Z39.50                                                   SRW/U

Connections/Sessions/State                               Connectionless, stateless

Multiple services bound together in a single protocol    Different Z39.50 services are different web services

Distinct Search and Present services                     Search/Present bound in a single web service

Databases                                                Servers

Record Syntaxes                                          Just one: XML

RPN                                                      String query language

ASN.1/BER                                                XML

Page 30:

request via SRU:

http://acme.com/sru?query=dinosaurs&maximumRecords=10&startRecord=1&recordSchema=mods

Page 31:

Record Schemas

dc

mods

onix

marcxml

ead

www.loc.gov/sru/record-schemas.html

Page 32:

request via SRU:

http://acme.com/sru?query=dinosaurs&maximumRecords=10&startRecord=1&recordSchema=mods

Page 33:

request via SRU:

http://acme.com/sru?version=1.1&operation=searchRetrieve&query=dinosaurs&maximumRecords=10&startRecord=1&recordSchema=mods

Page 34:

http://z3950.loc.gov:7090/voyager

explain:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=explain

Search for “dinosaur”:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dinosaur
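Both URLs follow the same pattern: a base URL, then version and operation, then any operation-specific parameters. A minimal Python sketch of that pattern follows; the helper name sru_url is hypothetical and the code only builds strings, it does not contact the server.

```python
from urllib.parse import urlencode

BASE_URL = "http://z3950.loc.gov:7090/voyager"  # server shown on the slide


def sru_url(base_url, operation, version="1.1", **params):
    """Build an SRU request URL: version and operation first, then extras."""
    query = {"version": version, "operation": operation}
    query.update(params)
    return base_url + "?" + urlencode(query)


explain_url = sru_url(BASE_URL, "explain")
search_url = sru_url(BASE_URL, "searchRetrieve", query="dinosaur")

if __name__ == "__main__":
    print(explain_url)
    print(search_url)
```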

Page 35:

Search for “dinosaur”, return 1 record, marcxml:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dinosaur&maximumRecords=1

Search for “dinosaur”, return 1 record, dc:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dinosaur&maximumRecords=1&recordSchema=dc

Page 36:

return second record:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dinosaur&startRecord=2&maximumRecords=1&recordSchema=dc

Records three and four:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dinosaur&startRecord=3&maximumRecords=2&recordSchema=dc
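The startRecord and maximumRecords parameters together select a window of the result set, with positions counted from 1. A small Python sketch of paging through a result set with such windows is given here; the function name page_windows and the example totals are hypothetical.

```python
def page_windows(total_records, page_size):
    """Yield (startRecord, maximumRecords) pairs that cover total_records."""
    start = 1  # record positions are 1-based, as in the slide examples
    while start <= total_records:
        # The last window may be shorter than page_size.
        yield start, min(page_size, total_records - start + 1)
        start += page_size


if __name__ == "__main__":
    for start, count in page_windows(5, 2):
        print(f"startRecord={start}&maximumRecords={count}")
```

With five records and a page size of two, the second window is startRecord=3 and maximumRecords=2, which is the "records three and four" request.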

Page 37:

<zs:records>
  <zs:record>
    <zs:recordSchema>info:srw/schema/1/dc-v1.1</zs:recordSchema>
    <zs:recordPacking>xml</zs:recordPacking>
    <zs:recordData>
      <srw_dc:dc xsi:schemaLocation="...
        <title>Abbott & Costello cartoons.</title>
        <creator>Copyright Collection (Library of Congress) DLC</creator>
        (etc.) ..........
      </srw_dc:dc>
    </zs:recordData>
    <zs:recordPosition>3</zs:recordPosition>
  </zs:record>
  <zs:record>
    <zs:recordSchema>info:srw/schema/1/dc-v1.1</zs:recordSchema>
    <zs:recordData>
      ............
    </zs:recordData>
    <zs:recordPosition>4</zs:recordPosition>
  </zs:record>
</zs:records>
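A client would typically walk the zs:record elements and pull out recordPosition and the fields it needs from recordData. The Python sketch below does this on a small hand-written sample rather than on real server output. The namespace URIs and the dc: prefix on title are assumptions, since the slide abbreviates the declarations, and the name record_titles is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slide does not show the declarations.
NS = {
    "zs": "http://www.loc.gov/zing/srw/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample modelled on the slide, not actual server output.
SAMPLE = """
<zs:records xmlns:zs="http://www.loc.gov/zing/srw/"
            xmlns:srw_dc="info:srw/schema/1/dc-schema"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
  <zs:record>
    <zs:recordSchema>info:srw/schema/1/dc-v1.1</zs:recordSchema>
    <zs:recordPacking>xml</zs:recordPacking>
    <zs:recordData>
      <srw_dc:dc>
        <dc:title>Abbott &amp; Costello cartoons.</dc:title>
      </srw_dc:dc>
    </zs:recordData>
    <zs:recordPosition>3</zs:recordPosition>
  </zs:record>
</zs:records>
"""


def record_titles(xml_text):
    """Return a list of (recordPosition, title) pairs from a records element."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter(f"{{{NS['zs']}}}record"):
        position = int(record.findtext("zs:recordPosition", namespaces=NS))
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((position, title))
    return results


if __name__ == "__main__":
    print(record_titles(SAMPLE))
```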

Page 38:

MODS record:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dinosaur&maximumRecords=1&recordSchema=mods

Page 39:

Fielded Query: ‘title=dinosaur’

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=title=dinosaur&maximumRecords=10&recordSchema=mods

Page 40:

CQL

Page 41:

Sample CQL Queries

cat

cat and dog

title = cat

dc.title = cat

Page 42:

Sample CQL Queries

cat (simple)

cat and dog

title = cat

dc.title = cat

Page 43:

Sample CQL Queries

cat (simple)

cat and dog (boolean)

title = cat

dc.title = cat

Page 44:

Sample CQL Queries

cat (simple)

cat and dog (boolean)

title = cat (index)

dc.title = cat

Page 45:

Sample CQL Queries

cat (simple)

cat and dog (boolean)

title = cat (index)

dc.title = cat (index qualified)

Page 46:

Boolean

cat and dog

cat or dog

cat not dog

cat not dog and fish or frog

Page 47:

cat not dog and fish or frog

Page 48:

cat not dog and fish or frog

((cat not dog) and fish) or frog

(cat not dog) and (fish or frog)
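Of the two groupings, the first is the one an unparenthesized query gets: CQL gives its boolean operators equal precedence and evaluates them left to right. The Python sketch below makes that grouping explicit for a flat query. It is a toy that assumes single-word terms and only the operators and, or, not; the function name group_left_to_right is hypothetical.

```python
BOOLEANS = {"and", "or", "not"}


def group_left_to_right(query):
    """Add explicit parentheses to a flat query of single-word terms."""
    tokens = query.split()
    grouped = tokens[0]
    # Tokens alternate: term, operator, term, operator, term, ...
    for operator, term in zip(tokens[1::2], tokens[2::2]):
        if operator.lower() not in BOOLEANS:
            raise ValueError(f"not a boolean operator: {operator}")
        grouped = f"({grouped} {operator} {term})"
    # Drop the outermost pair of parentheses, which adds nothing.
    return grouped[1:-1] if len(tokens) > 1 else grouped


if __name__ == "__main__":
    print(group_left_to_right("cat not dog and fish or frog"))
```

To get the second grouping, (cat not dog) and (fish or frog), the parentheses have to be written in the query.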

Page 49:

Fielded/index Search

title = cat

Page 50:

Fielded/index Search

title = cat

dc.title = cat

bib.title = cat

bath.keyTitle

Page 51:

Search Clause

title = cat

Page 52:

Search Clause

<index> <relation> <search term>

Page 53:

title = cat

index: title

relation: =

search term: cat

Page 54:

title = cat

and

subject = dog

search clauses linked by a boolean
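The search clause structure (index, relation, search term) and the linking of clauses by a boolean can be sketched as two small Python helpers. The names search_clause and link are hypothetical, and the quoting rule is a simplification: terms containing whitespace are wrapped in double quotes, and terms that themselves contain quotes are not handled.

```python
def search_clause(index, relation, term):
    """Assemble '<index> <relation> <search term>' as a CQL string."""
    if any(ch.isspace() for ch in term):
        term = f'"{term}"'  # multi-word terms are quoted
    return f"{index} {relation} {term}"


def link(boolean, *clauses):
    """Join search clauses with a single boolean operator."""
    return f" {boolean} ".join(clauses)


if __name__ == "__main__":
    query = link("and",
                 search_clause("title", "=", "cat"),
                 search_clause("subject", "=", "dog"))
    print(query)
```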

Page 55:

Relations

title = "the complete dinosaur"

title all "complete dinosaur"

title any "dinosaur bird reptile"

title exact "the complete dinosaur"

Page 56:

Relations

title = "the complete dinosaur"

title all "complete dinosaur"

title any "dinosaur bird reptile"

title exact "the complete dinosaur"

Page 57:

=

title = "the complete dinosaur"

Page 58:

=

title = "the complete dinosaur"

matches “a day in the life of the complete dinosaur”

and “the complete dinosaur goes to Paris”

Page 59:

=

title = "the complete dinosaur"

matches “a day in the life of the complete dinosaur”

and “the complete dinosaur goes to Paris”

but not “the complete and unabridged dinosaur”

Page 60:

All

title all "complete dinosaur"

matches “the complete and unabridged dinosaur”

Page 61:

title all "dinosaur bird reptile"

does not match “the complete dinosaur”

Page 62:

Any

title any "dinosaur bird reptile"

does match “the complete dinosaur”

Page 63:

Exact

title exact "the complete dinosaur"

matches “the complete dinosaur”

Page 64:

Exact

title exact "the complete dinosaur"

matches “the complete dinosaur”

(but does not match: “a day in the life of the complete dinosaur”)
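The four relations can be modelled in a few lines of Python that reproduce the slide examples. This is only a toy model on lowercased, whitespace-separated words: the function name matches is hypothetical, and how a real server tokenizes and compares terms is up to the server.

```python
def matches(title, relation, term):
    """Toy model of the CQL relations =, all, any, exact on word lists."""
    title_words = title.lower().split()
    term_words = term.lower().split()
    if relation == "=":
        # The term words must appear next to each other, in order.
        n = len(term_words)
        return any(title_words[i:i + n] == term_words
                   for i in range(len(title_words) - n + 1))
    if relation == "all":
        return all(word in title_words for word in term_words)
    if relation == "any":
        return any(word in title_words for word in term_words)
    if relation == "exact":
        return title_words == term_words
    raise ValueError(f"unknown relation: {relation}")


if __name__ == "__main__":
    print(matches("the complete and unabridged dinosaur", "=", "the complete dinosaur"))
    print(matches("the complete and unabridged dinosaur", "all", "complete dinosaur"))
```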

Page 65:

The anchor character ^

Page 66:

Recall …

title = "the complete dinosaur"

matches “a day in the life of the complete dinosaur”

Page 67:

Anchoring

title = "^the complete dinosaur"

would not match “a day in the life of the complete dinosaur”

would match “the complete dinosaur goes to Paris”

Page 68:

Right Anchoring

title = "the complete dinosaur^"

would not match “the complete dinosaur goes to Paris”
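The effect of the anchor character can be modelled in Python in the same word-list style: a leading ^ ties the term to the start of the field and a trailing ^ ties it to the end. This is a toy sketch of the slide examples, not a description of any server's matching; the function name matches_anchored is hypothetical.

```python
def matches_anchored(title, term):
    """Toy model of the '=' relation with optional ^ anchors in the term."""
    title_words = title.lower().split()
    left = term.startswith("^")
    right = term.endswith("^")
    term_words = term.strip("^").lower().split()
    n = len(term_words)
    if left and right:
        return title_words == term_words
    if left:
        return title_words[:n] == term_words
    if right:
        return n <= len(title_words) and title_words[len(title_words) - n:] == term_words
    # No anchor: the phrase may appear anywhere in the title.
    return any(title_words[i:i + n] == term_words
               for i in range(len(title_words) - n + 1))


if __name__ == "__main__":
    print(matches_anchored("the complete dinosaur goes to Paris", "^the complete dinosaur"))
    print(matches_anchored("the complete dinosaur goes to Paris", "the complete dinosaur^"))
```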

Page 69:

Recall ......

Fielded Query (title)

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=title=dinosaur&maximumRecords=10&recordSchema=mods

Page 70:

Same as:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dc.title=dinosaur&maximumRecords=10&recordSchema=mods

http://z3950.loc.gov:7090/voyager?version=1.1&operation=explain

Page 71:

Qualified Index: bath.name=dinosaur

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=bath.name=dinosaur&maximumRecords=10&recordSchema=mods

Page 72:

bath.name all "dinosaur barney"

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=bath.name all "dinosaur%20barney"&maximumRecords=10&recordSchema=mods

Page 73:

bath.name exact dinosaur

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=bath.name exact dinosaur&maximumRecords=10&recordSchema=mods
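The URLs on these slides show the CQL query with literal spaces and quotes for readability. When a client builds the URL itself, the query value should be percent-encoded. A minimal Python sketch follows; the function name sru_search_url and its default arguments are hypothetical, and the code only builds the string.

```python
from urllib.parse import quote

BASE_URL = "http://z3950.loc.gov:7090/voyager"  # server shown on the slides


def sru_search_url(base_url, cql, max_records=10, record_schema="mods"):
    """Build a searchRetrieve URL with a percent-encoded CQL query."""
    # safe="" also encodes "=", "/" and quotes inside the query value.
    return (f"{base_url}?version=1.1&operation=searchRetrieve"
            f"&query={quote(cql, safe='')}"
            f"&maximumRecords={max_records}&recordSchema={record_schema}")


if __name__ == "__main__":
    print(sru_search_url(BASE_URL, 'bath.name all "dinosaur barney"'))
```

Spaces become %20, double quotes become %22, and an equals sign inside the query becomes %3D, so the query cannot be confused with the URL's own parameter syntax.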

Page 74:

www.loc.gov/z3950/agency/zing/

www.loc.gov/z3950/agency/zing/srw/

www.loc.gov/standards/sru/

Page 75:

SRU

CQL

Explain Operation

Scan Operation

SRW

Page 76:

“SRW/U” --> “SRU”

SRW --> “SRU over SOAP”

and in addition “SRU via POST”