+ The Open Access Publisher

Upload: dennis-norton

Post on 18-Dec-2015

Page 1: + The Open Access Publisher Agenda  Oracle interMedia Overview  Open Access for the Life Science Community  BioMed Central Business Model  Oracle

The Open Access Publisher

Page 2:

Agenda

– Oracle interMedia Overview
– Open Access for the Life Science Community
– BioMed Central Business Model
– Oracle Technologies used by BioMed Central

Page 3:

Oracle interMedia

Multimedia Databases

Multi-Terabyte Performance

Page 4:

Agenda

– The Media-enabled Oracle Platform
– Benefits
– Customer Experience
– Oracle Database 10g New Features
– Proposed Enhancements

Page 5:

The Media-enabled Oracle Platform

Oracle Database 10g
– Storage, management, and retrieval of image, audio, and video data
– Native format understanding, metadata extraction, methods for image processing
– Support for leading streaming media servers

Oracle Application Server 10g
– JSP, Servlet, and PL/SQL application development support
– Media Adaptation Services for Wireless
– JDeveloper (BC4J) and Portal integration

Oracle Collaboration Suite
– Metadata extraction for OCS Files

Page 6:

Benefits: Save labor, time and money

BioMed Central:
• Automated media processing, serving, and integration

New Mexico Department of Transportation:
• A single DBA designed, created, deployed, and maintains a 5 TB image management system

Palazzo Braschi Museum - Rome:
• Reduced the time to bulk load and process images by 90% compared with client-side tools

A US Central Bank:
• Online processing and rapid resolution of 26,000 bad checks per day reduces handling and float costs

Page 7:

Fast & Scalable

1 TB image repository renders images in a Web browser in less than 0.4 second

Loads at device speeds

Multi-terabyte multimedia databases
– 5 TB database
– 140 million images

Scalable bulk load and process
– Parallel processes load 300,000 images/hour
– Bulk process: TIFF to GIF conversion, scale to thumbnail

* UBS PaineWebber, Caixa Economica Federal, NM DOT

Page 8:

Secure and Manageable

Use all Oracle Database security features
– authentication, auditing, encryption, access control, etc.

Banks and commercial Web sites use it

One management environment for all data
– Single DBA for 5 TB database
– 3 TB financial database

* A US Central Bank, BioMed Central, Cre8tiv - UK, Spa Microsystems – UK, NM DOT, Caixa Economica Federal

Page 9:

Oracle Simplifies Code

With JSP Tag Library: (14 point font)

<ord:embedImage connCache = <%
    java.util.Vector otherValuesVector = new java.util.Vector();
    otherValuesVector.add(fd.getParameter("desc"));
    otherValuesVector.add(fd.getParameter("loc"));
%> "
    mediaParameters = "photo"
    otherColumns = "description, location"
    otherValues = "<%=otherValuesVector%>" />

Image Insert using Multimedia JSP Tag Library – An Example

Page 10:

Without: (in 10-point font)

<FORM ACTION="PhotoAlbumInsert.jsp" METHOD="POST" ENCTYPE="MULTIPART/FORM-DATA">
  Description: <INPUT TYPE="text" NAME="desc"><BR>
  Location: <INPUT TYPE="text" NAME="loc"><BR>
  Photo: <INPUT TYPE="file" NAME="photo"><BR>
  <INPUT TYPE="submit" VALUE="submit">
</FORM>

<%
// formData, album, conn, stmt and rset are assumed to be declared earlier
// in the page; their declarations are not shown on the slide.
try {
    // Parse multipart/form-data
    formData.setServletRequest( request );
    formData.parseFormData();

    // Insert new row into database.
    // Parameter names match the form fields above ("desc", "loc").
    stmt = (OraclePreparedStatement)conn.prepareStatement(
        "insert into spec_photos ( description, location, photo ) " +
        " values ( ?, ?, ORDSYS.ORDImage.init() )" );
    stmt.setString( 1, formData.getParameter( "desc" ) );
    stmt.setString( 2, formData.getParameter( "loc" ) );
    stmt.executeUpdate();
    stmt.close();

    // Fetch OrdImage object from database, locking the row for update
    stmt = (OraclePreparedStatement)conn.prepareStatement(
        "select photo from spec_photos where description = ? for update" );
    stmt.setString( 1, formData.getParameter( "desc" ) );
    rset = (OracleResultSet)stmt.executeQuery();
    rset.next();
    OrdImage photo = (OrdImage)rset.getCustomDatum( 1, OrdImage.getFactory() );
    rset.close();
    stmt.close();

    // Load the photo into the database and set the properties.
    formData.getFileParameter( "photo" ).loadImage( photo );

    // Update object in database
    stmt = (OraclePreparedStatement)conn.prepareStatement(
        "update spec_photos set photo = ? where description = ?" );
    stmt.setCustomDatum( 1, photo );
    stmt.setString( 2, formData.getParameter( "desc" ) );
    stmt.execute();
    stmt.close();

    // Commit changes
    conn.commit();
} finally {
    // Ensure JDBC connection is released and any temp files are deleted.
    album.release();
    formData.release();
}
%>

Page 11:

New Oracle10g Multimedia Features

• Standards support: SQL/MM Still Image
• New version of Java Advanced Imaging and additional image processing operators
• Support for additional media formats
– Microsoft ASF, MPEG2 & MPEG4
• Microsoft Windows Media Server Plugin
• Real Server Plugin for Helix Server
• XML DB integration

Page 12:

Proposed Enhancements

Parse TIFF headers for user-specified attributes

Metadata management, e.g. microarrays, gels, mass spec.

Characterize a region of interest for an image

Plug in 3rd-party algorithms & utilities

Manage media metadata in XML DB

Describe user-defined file formats

Keep a history of changes to images

Handle 3-D images (time/volume)

DICOM Support

Page 13:

Multimedia Database Improves the Bottom Line

Matthew Cockerill

Technical Director

BioMed Central

Session id: 40363

Page 14:

BioMed Central and Oracle

BioMed Central is an Open Access publisher of biomedical research

Oracle database technology used to deliver a cost-effective online publishing solution

Goals
– Make the publishing process more efficient through online tools and automation
– Increase accessibility of research by removing subscription barriers

Page 15:

Oracle technology used by BioMed Central

BioMed Central’s database
– 70 gigabytes of data (and growing rapidly)
– Lots of traditional relational data (e.g. 250,000 registered users)
– Also serves as a repository for images, movies, PDFs and other rich media

Key technologies used
– XML DB
– Oracle interMedia
– Real Application Clusters
– Data Guard
– Oracle Text

Page 17:

What is wrong with traditional science publishing?

Subscription-only access to scientific research is a legacy of the economics of print

Scientists do all the hard work
– performing the research
– writing up the article
– acting as peer reviewers
– acting as journal editors

Traditional publishers take ownership of the copyright and sell limited access back to the scientific community

In the age of the web, that makes no sense for science

Open Access publishers make research freely accessible and redistributable by scientists

Page 18:

Benefits of Open Access

Research instantly accessible to the entire scientific community

Digital permanence (many copies)

A route off the subscriptions treadmill
– Subscriptions to traditional journals have increased at 10-15% per annum

Data mining

Grid computing

Page 19:

Tony Blair

“[The] national e-science grid … intends to make access to computing power, scientific data repositories and experimental facilities as easy as the web makes access to information.”
- Tony Blair, May 2002

Page 20:

The Open Access movement

Public Library of Science
– New not-for-profit publisher formed by a group of scientists
– Has received $9m from the Gordon and Betty Moore Foundation to start new Open Access journals

Soros Foundation
– Has provided $3m to support Open Access publishing in developing and transitional countries

Sabo bill
– Congressman Martin Sabo recently introduced the Public Access to Science Act in Congress
– If passed, it would ensure that all US federally funded research would be published with Open Access

Page 21:

BioMed Central architecture

Oracle9i Database
– Stores relational data (e.g. user registration info)
– Also acts as repository for files associated with submitted manuscripts and published articles

Web server farm
– Runs many different journal websites, all driven by the same Oracle database
– Extensive use of Java and XSLT
– Media content streamed from the database using servlets

Page 22:

Key Oracle Technologies used by BioMed Central

– Real Application Clusters
– Data Guard
– Oracle Text
– XML DB
– Oracle interMedia

Page 24:

Importance of high availability

Science is a global enterprise, so BioMed Central’s websites are busy 24 hours a day

Scientists entrust their research and reputation to us - they must have confidence that their research will be available

Major institutional customers demand high reliability

BioMed Central delivers high availability using a combination of RAC and Data Guard

Page 25:

Real Application Clusters

BioMed Central was one of the first organizations in the UK to deploy 9i RAC

Main database runs on a pair of dual CPU Sun Fire V480 servers

Delivers high availability in the event of single node failure

Oracle upgrades and patches still require downtime, however (for now!)

Page 26:

Data Guard

BioMed Central uses Data Guard to maintain a standby database

Standby database kept up to date by automated application of log files

Standby database can be used for reporting (in read-only mode)

If a prolonged outage of live db occurs (planned or unplanned), standby database can be activated

Data Guard makes it easy to roll back to the live configuration after planned outages

Page 27:

RAC/Data Guard configuration

[Diagram. Main hosting location: web server farm, RAC cluster. Standby location: standby DB (Data Guard), reporting. Link between the two databases labeled "logfiles".]

Page 30:

Use of Oracle Text

High performance full text article search

Key benefits
– Ease of maintenance (incremental online indexing)
– Structured searching of XML
– XPath support
– Unicode aware (smart base-character indexing)
– Filter procedures can be used to transform XML to be indexed

Page 31:

Structured search

Page 32:

XPath search

Prior to Oracle9i Database Release 2, only relatively basic field restrictions based on XML tags were possible

Complex nesting of tags or specific attribute values were difficult or impossible to search for

Oracle9i Database Release 2 support for XPath field restrictions takes XML searching to another level

Now possible to search for all XML articles that contain a certain path (HASPATH), or that match a certain text expression at that path (INPATH)

Page 33:

XPath example

Article metadata identifying a series of related articles

<meta>
  <classifications>
    <classification type="BMC" subtype="review_series_title" id="ar-cell-cell">Cell-cell interactions in synovitis</classification>
  </classifications>
</meta>

SQL syntax to retrieve all articles in that review series

SELECT ARX_ID
FROM ARX
WHERE CONTAINS (ARX_FULL,
  'HASPATH (//classification[@type="BMC" AND @subtype="review_series_title" AND @id="ar-cell-cell"])') > 0;
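The path test in the HASPATH query can be tried outside the database. The following is a minimal sketch, assuming only the standard Java XPath API; it is not how Oracle Text evaluates the query, and the class and method names (`HasPathSketch`, `hasPath`) are hypothetical. Note that standard XPath 1.0 spells the operator `and` in lowercase, whereas the slide's Oracle Text query writes `AND`.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: checks whether an XML document contains a given path,
// mirroring the idea of HASPATH on the client side.
class HasPathSketch {
    // The path from the slide, rewritten with XPath 1.0's lowercase "and".
    static final String REVIEW_SERIES_PATH =
        "//classification[@type='BMC' and @subtype='review_series_title'"
        + " and @id='ar-cell-cell']";

    /** Returns true if at least one node in the document matches the path. */
    static boolean hasPath(String xml, String path) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // Requesting BOOLEAN converts the node set to true when it is non-empty.
            return (Boolean) xpath.evaluate(
                path, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML input", e);
        }
    }

    public static void main(String[] args) {
        String meta =
            "<meta><classifications>"
            + "<classification type=\"BMC\" subtype=\"review_series_title\""
            + " id=\"ar-cell-cell\">Cell-cell interactions in synovitis</classification>"
            + "</classifications></meta>";
        System.out.println(hasPath(meta, REVIEW_SERIES_PATH));
    }
}
```

In the database, the same test runs against the text index rather than parsing each article, which is what makes it usable as a search predicate.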

Page 34:

Smart handling of Unicode
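The idea behind base-character indexing can be illustrated in a few lines: an accented letter is reduced to its base letter so that a search for "Melanie" also finds the accented spelling. This is a sketch of the general technique using the standard `java.text.Normalizer`, not Oracle Text's implementation; the class name `BaseLetterSketch` is hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper illustrating base-letter folding for search.
class BaseLetterSketch {
    /** Decomposes accented characters (NFD) and drops the combining marks. */
    static String toBaseLetters(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches Unicode combining marks such as accents and umlauts.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // \u00e9 is e with acute accent, \u00f6 is o with diaeresis.
        System.out.println(toBaseLetters("M\u00e9lanie"));
        System.out.println(toBaseLetters("Sj\u00f6gren"));
    }
}
```

Applying the same folding to both the indexed text and the query terms lets accented and unaccented spellings match each other.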

Page 36:

XML DB

Oracle support for XML standards in the database allows BioMed Central to manage article XML data within the database

Examples of use
– Re-validate article XML against DTD after any update
– Application of XSLT transformations within database (e.g. as a pre-indexing filter)

Page 37:

Article XML (pre-transform)

<bibl>
  <title>Genetic variability in MCF-7 sublines</title>
  <aug>
    <au id="A1">
      <snm>Nugoli</snm>
      <fnm>Melanie</fnm>
      <mi>JK</mi>
      <email>[email protected]</email>
    </au>
    <au id="A2">
      <snm>Chuchana</snm>
      <fnm>Paul</fnm>
      <email>[email protected]</email>
    </au>
  </aug>
  <source>BMC Medical Research Methodology</source>
  …
</bibl>

Page 38:

Article XML (post-transform)

<bibl>
  <title>Genetic variability in MCF-7 sublines</title>
  <aug>
    <au id="A1">
      <snm>Nugoli</snm>
      <fnm>Melanie</fnm>
      <mi>JK</mi>
      <bnm>Nugoli_MJK</bnm>
      <email>[email protected]</email>
    </au>
    <au id="A2">
      <snm>Chuchana</snm>
      <fnm>Paul</fnm>
      <bnm>Chuchana_P</bnm>
      <email>[email protected]</email>
    </au>
  </aug>
  <source>
    <sourcefull>BMC Medical Research Methodology</sourcefull>
    <sourceabbr>BMC Med Res Methodol</sourceabbr>
  </source>
  …
</bibl>
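The `<bnm>` part of this pre-indexing transform can be approximated with a short XSLT stylesheet. The sketch below runs it with the standard Java `javax.xml.transform` API rather than inside the database. The rule it applies (surname, underscore, first letter of the first name, then the middle initials) is inferred from the two authors shown above and is an assumption, as are the class name `BnmTransformSketch` and the placeholder email address. The `<sourceabbr>` expansion is not covered because it would need a journal lookup table the slides do not show.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: adds a <bnm> element to each <au>, copying everything else.
class BnmTransformSketch {
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        // Identity template: copy all nodes and attributes unchanged.
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        // Author template: emit name parts, then <bnm>, then the remaining children.
        + "<xsl:template match='au'>"
        + "<xsl:copy>"
        + "<xsl:apply-templates select='@*'/>"
        + "<xsl:apply-templates select='snm|fnm|mi'/>"
        + "<bnm><xsl:value-of select=\"concat(snm,'_',substring(fnm,1,1),mi)\"/></bnm>"
        + "<xsl:apply-templates select='node()[not(self::snm or self::fnm or self::mi)]'/>"
        + "</xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given article XML and returns the result. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // author@example.org is a placeholder, not an address from the slides.
        String article =
            "<bibl><aug>"
            + "<au id=\"A1\"><snm>Nugoli</snm><fnm>Melanie</fnm><mi>JK</mi>"
            + "<email>author@example.org</email></au>"
            + "<au id=\"A2\"><snm>Chuchana</snm><fnm>Paul</fnm></au>"
            + "</aug></bibl>";
        System.out.println(transform(article));
    }
}
```

Precomputing a normalized name element like this gives the text index a single, predictable field to search on for author queries.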

Page 40:

interMedia: Oracle as a media repository

Manuscript submission and workflow involves a complex interplay of files and metadata

Storing files directly in the database as BLOBs makes their management and manipulation much simpler

interMedia provides a powerful set of tools to work with images in the database
– Extracting image metadata
– Scaling/cropping/format conversion
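For readers unfamiliar with the operations listed above, here is a client-side sketch of scaling an image to a thumbnail and converting its format, using only the standard Java 2D and ImageIO classes. It illustrates the kind of work that interMedia performs inside the database; it does not use the interMedia API, and the names `ThumbnailSketch`, `scaleToFit`, and `encode` are hypothetical.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical client-side analogue of scale-to-thumbnail and format conversion.
class ThumbnailSketch {
    /** Scales the image down so its longer side is at most maxSide pixels. */
    static BufferedImage scaleToFit(BufferedImage src, int maxSide) {
        int longer = Math.max(src.getWidth(), src.getHeight());
        double factor = Math.min(1.0, (double) maxSide / longer); // never upscale
        int w = Math.max(1, (int) Math.round(src.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(src.getHeight() * factor));
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    /** Encodes the image in the named format (for example "png" or "gif"). */
    static byte[] encode(BufferedImage img, String format) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(img, format, out)) {
                throw new IllegalArgumentException("No writer for format: " + format);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        BufferedImage figure = new BufferedImage(800, 600, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumb = scaleToFit(figure, 100);
        byte[] png = encode(thumb, "png");
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()
            + ", " + png.length + " bytes");
    }
}
```

Doing the equivalent work in the database avoids moving each file out to a client and back, which is the benefit the earlier customer slides attribute to server-side processing.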

Page 41:

Full text article

Page 42:

Figure streamed from db

Page 43:

PDF streamed from database

Page 44:

Processing submitted files

Page 45:

Using interMedia to manipulate images

Page 46:

Q&A

Questions and Answers