iahx elluminate antwerp abcd 20120524
-
IAHx: Search Interface using Apache Solr/Lucene
ABCD & CDS/ISIS Workshop, Elluminate Session
24 May 2012
Vinicius de Andrade, Systems Development, KMC/BIREME
BIREME / PAHO / WHO
-
Layers
Data Level
Index Level: ISIS, Lucene
Interface Level: Services, Interfaces
-
Data Level
Metadata
Conversion of information sources to a common set of metadata (single schema)
Identification of elements for organization into "clusters"
-
Index Level
Indexes
Boolean query
Boolean query, ranking and clusters
-
Interface Level
Multiple interfaces for presenting results
-
What is Lucene
High performance, scalable, full-text search library
Focus: Indexing + Searching Documents
Document is just a list of name+value pairs
No crawlers or document parsing
Flexible Text Analysis (tokenizers + token filters)
100% Java, no dependencies, no config files
-
What is Solr
A full text search server based on Lucene
XML/HTTP, JSON Interfaces
Faceted Search (category counting)
Flexible data schema to define types and fields
Index Replication
Extensible Open Architecture, Plugins
Web Administration Interface
Written in Java, deployable as a WAR
-
Lucene Architecture
-
Solr runs as a webapp in a servlet container, on top of Lucene
Entry points: admin, update, select
Request handlers: standard, custom
Response writers: XML, JSON
Update handlers: XML, CSV
Basic App: Indexer and Webapp (HTML)
Indexer to http://solr/update: Document (title: Genome, author: Matt Ridley, type: book, ...)
Webapp to http://solr/select: Query (title:genome), Query Response (matching docs)
-
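Search(Query, Filter[], Sort, offset, n) returns a DocList
Query: genome; sort: year asc; filters: language:en, year:2008
The DocSet of matching documents is intersected with the DocSet of each cluster value; Size() of each intersection is the count shown for that value
Cluster values: subject:chromosomes, subject:diseases, type:article, type:book, journal:Rev. A, journal:Rev. B, journal:Rev. C
Sizes shown: 594, 382, 247, 689, 104, 92, 75
Query Response: Clusters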
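The cluster counts on this slide come from set intersections. The following is a minimal Python sketch of that idea, not Solr's actual implementation: the four-document toy index and the helper names `doc_set` and `facet_counts` are assumptions made for illustration.

```python
# Hypothetical toy index: document id -> field values (not taken from the slides).
docs = {
    1: {"type": "book", "subject": ["chromosomes", "diseases"], "language": "en"},
    2: {"type": "article", "subject": ["diseases"], "language": "en"},
    3: {"type": "article", "subject": ["chromosomes"], "language": "pt"},
    4: {"type": "book", "subject": ["diseases"], "language": "en"},
}


def doc_set(field, value):
    """Return the set of document ids whose field contains the value."""
    matches = set()
    for doc_id, fields in docs.items():
        stored = fields.get(field, [])
        values = stored if isinstance(stored, list) else [stored]
        if value in values:
            matches.add(doc_id)
    return matches


def facet_counts(base, candidates):
    """For each (field, value) pair, count its intersection with the base set."""
    return {
        f"{field}:{value}": len(base & doc_set(field, value))
        for field, value in candidates
    }


# The base set plays the role of the query result (here: a language filter).
base = doc_set("language", "en")
counts = facet_counts(
    base,
    [("type", "book"), ("type", "article"),
     ("subject", "diseases"), ("subject", "chromosomes")],
)
print(counts)
```

Each count is the size of one intersection, which is why adding a filter to the query changes every cluster count at once.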
-
Indexing Data
HTTP POST to http://localhost:8080/solr/update
05991
Genome
Matt Ridley
genome
disease
chromosomes
en
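The XML markup of the posted document did not survive on this slide. As a hedged sketch, the Python below builds a Solr `<add>` message from the values shown. The field names (`id`, `title`, `author`, `subject`, `language`) are assumptions inferred from other slides, and `post_update` is a hypothetical helper that is defined but not called here.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Build a Solr <add><doc>...</doc></add> message from a dict of fields."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # A list value becomes one <field> element per entry (multi-valued field).
        values = value if isinstance(value, list) else [value]
        for item in values:
            ET.SubElement(doc_el, "field", {"name": name}).text = item
    return ET.tostring(add, encoding="unicode")


def post_update(xml_text, url="http://localhost:8080/solr/update"):
    """POST an update message to Solr (requires a running server)."""
    request = urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Field names are assumed; the values are the ones shown on the slide.
add_xml = build_add_xml({
    "id": "05991",
    "title": "Genome",
    "author": "Matt Ridley",
    "subject": ["genome", "disease", "chromosomes"],
    "language": "en",
})
print(add_xml)
```

The document only becomes searchable after a commit is sent to the same update URL.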
-
Index Key Generation
-
Deleting Documents
Delete by Id, most efficient
05591
32552
Delete by Query
subject:disease
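Both delete forms are small XML messages posted to the update URL. The sketch below builds them with Python's standard library. The function names are illustrative, and the split of the id string into 05591 and 32552 is an assumption about the original slide.

```python
import xml.etree.ElementTree as ET


def delete_by_ids(ids):
    """Build a <delete> message with one <id> element per document id."""
    delete = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(delete, "id").text = doc_id
    return ET.tostring(delete, encoding="unicode")


def delete_by_query(query):
    """Build a <delete> message that removes every document matching a query."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


by_id = delete_by_ids(["05591", "32552"])
by_query = delete_by_query("subject:disease")
print(by_id)
print(by_query)
```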
-
Commit: makes changes visible
Optimize: same as commit, and also merges all index segments for faster searching
Lucene Index Segments
_0.fnm _0.fdt _0.fdx _0.frq _0.tis _0.tii _0.prx _0.nrm
_0_1.del
_1.fnm _1.fdt _1.fdx [...]
-
Searching
http://localhost:8080/solr/select?q=genome&start=0&rows=2&fl=title,author
Genome
Matt Ridley
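A select request is an ordinary URL with query parameters, so it can be assembled programmatically. This is a small sketch using Python's standard library. `select_url` is a hypothetical helper, and the characters kept unescaped (`,` and `:`) are a choice made here so the output matches the URLs in this deck.

```python
from urllib.parse import urlencode


def select_url(base, **params):
    """Build a Solr select URL; ',' and ':' are left unescaped for readability."""
    return f"{base}/select?{urlencode(params, safe=',:')}"


# q: query, start/rows: paging, fl: fields to return.
search = select_url("http://localhost:8080/solr",
                    q="genome", start=0, rows=2, fl="title,author")

# fq: filter query, wt: response writer (JSON instead of the default XML).
filtered = select_url("http://localhost:8080/index",
                      q="saude", fq="type:article", wt="json")
print(search)
print(filtered)
```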
-
Update and Query Index
XML: :8080/index/update
QUERY: /index/select
http://localhost:8080/index/select?q=saude&fq=type:article&wt=json
-
IAHx - Architecture
Client, Interface, Controller, Index
-
Update scripts
index.sh: index.sh [xml file] [index]
commit.sh: commit.sh [index]
optimize.sh: optimize.sh [index]
deletedocs.sh: deletedocs.sh [index] [query]
-
Solr Update XML format
Callouts: relevancy, cluster, order
lil-7320
LILACS
BR1.1
regional
article
Ribeiro, M. V
Gallina, R. A
Sato, T
Hidranencefalia: estudo clinicopatologico de 6 casos.
Hydranencephaly: clinicopathological study of 6 cases
184-92
Arq Neuropsiquiatr;40(2)1982.
Arq Neuropsiquiatr
0004-282X
402pt1982BR1982000000.0671982
Foram estudados 6 casos de hidranencefalia do ponto de vista de sua semiologia clinica, de seus exames complementares e das verificacoes anatomopatologicas. Os autores concluem que a transiluminacao e de grande utilidade no diagnostico precoce destes casos. O seguimento dos pacientes e as verificacoes anatomopatologicas demonstram que a hidranencefalia teve como origem lesoes encefaloclasticas (inflamatorias, mecanicas e vasculares) que levaram, antes ou apos o nascimento, a destruicao total do cerebro com preservacao das estruturas sub-tentoriais
^d6984SCAD
-
Solr XML Config schema.xml (1/2)
.....
-
Solr XML Config schema.xml (2/2)
.....
....
-
Solr XML Config solrconfig.xml
.....
true
type
type_of_study
mh_cluster
ta_cluster
year_cluster
201
-
Solr XML result
010
on
iahx
BVS-3700
iAHx integrated search
presentation
-
Solr JSON result
{"responseHeader":{"status":0,"QTime":1,"params":{
"wt":"json","rows":["1","1"],
"start":"0","indent":"on","q":"iahx","version":"2.2"}},"response":{"numFound":2,"start":0,"docs":[
{"id":"BVS-3700","au":"Antonio, Vinicius de Andrade","ti":" iAHx integrated search ","type":"presentation"}]}}
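A client can read this response with any JSON parser. The Python sketch below parses the result shown on the slide, with the missing quotation marks restored so that it is valid JSON. The variable names are illustrative only.

```python
import json

# The JSON result from the slide, with the missing quotation marks restored.
raw = """
{"responseHeader":{"status":0,"QTime":1,"params":{
 "wt":"json","rows":["1","1"],
 "start":"0","indent":"on","q":"iahx","version":"2.2"}},
 "response":{"numFound":2,"start":0,"docs":[
 {"id":"BVS-3700","au":"Antonio, Vinicius de Andrade",
  "ti":" iAHx integrated search ","type":"presentation"}]}}
"""

result = json.loads(raw)
response = result["response"]

# numFound is the total number of hits; docs holds only the requested page.
total_hits = response["numFound"]
titles = [doc["ti"].strip() for doc in response["docs"]]
query = result["responseHeader"]["params"]["q"]

print(f"{total_hits} hits for {query!r}; this page: {titles}")
```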
-
IAHx - Search Interface
-
Project Source Code
RedDes (tickets, documentation)
http://reddes.bvsalud.org/projects/iahx/
GitHub (source code)
http://github.com/bireme/iahx-opac/
http://github.com/bireme/iahx-server/
http://github.com/bireme/iahx-controller/
-
IAHx - Installation
http://reddes.bvsalud.org/projects/iahx/wiki/Install
* Available only in Portuguese at this time
-
Running iAHx
Solr server installed and running: iahx-server is a custom installation of Tomcat 6 with Solr deployed and shell scripts for executing basic Solr REST commands
Tomcat 6: iahx-controller is a WAR module used to dispatch and receive Solr requests
Web server + PHP: iahx-opac is the interface that converts the Solr JSON result using Smarty templates
-
Prepare data for Solr
ISIS → Solr: conversion via PFT
OAI-PMH XML → Solr: conversion via XSL
-
ISIS → Solr
if p(v2) then,
' ',| |v2||/,(| |v4||/),(| |v16^*||/),(| |v18^*||/),
' '/,fi,
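The PFT above emits one Solr document per ISIS record that has field 2, with one field per occurrence of tags 4, 16 and 18. The XML tags it writes were lost from the slide. The Python below is a rough equivalent, offered only as a sketch: the mapping of tags to the Solr field names `id`, `db`, `au` and `ti`, the record layout (tag to list of occurrences) and the sample subfield data are all assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed mapping: (ISIS tag, Solr field name, keep only text before first '^').
# The real field names are not visible on the slide.
FIELDS = [("2", "id", False), ("4", "db", False),
          ("16", "au", True), ("18", "ti", True)]


def record_to_solr_doc(record):
    """Convert one ISIS-like record (tag -> occurrences) to a Solr <doc>."""
    if not record.get("2"):  # mirrors "if p(v2)": skip records without field 2
        return None
    doc = ET.Element("doc")
    for tag, name, strip_subfields in FIELDS:
        for occurrence in record.get(tag, []):
            # v16^* and v18^* take the text before the first subfield delimiter.
            text = occurrence.split("^", 1)[0] if strip_subfields else occurrence
            ET.SubElement(doc, "field", {"name": name}).text = text
    return ET.tostring(doc, encoding="unicode")


# Hypothetical record; the subfields after '^' are invented for illustration.
record = {
    "2": ["lil-7320"],
    "4": ["LILACS"],
    "16": ["Ribeiro, M. V^1example affiliation", "Sato, T"],
    "18": ["Hidranencefalia^ipt"],
}
solr_doc = record_to_solr_doc(record)
skipped = record_to_solr_doc({"4": ["LILACS"]})
print(solr_doc)
```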
-
OAI-PMH XML → Solr
-
Questions
-
Vinicius de Andrade, BIREME/PAHO/WHO
Thank you