338 seminar4 keithlawrenz
Los Angeles | London | New Delhi | Singapore | Washington DC
One approach to an online digital content repository
Automating Workflow from Acceptance to Publication
Automating Workflows from Acceptance to Publication, SSP, 2010 Los Angeles | London | New Delhi | Singapore | Washington DC
SAGE Publications
● Books, journals and reference publishing programs for higher education
  • 560+ journals
● Publishing offices in
  • Los Angeles, CA
  • London, England
  • New Delhi, India
  • Washington DC
My background
● Keith Lawrenz
  • Senior Business Analyst, Publishing Technologies
  • 4 years with SAGE
● Specialties
  • Business process engineering
  • Content modeling
  • XML, XQuery, XSLT, XProc, RELAX NG schema…
Overview
● Why SAGE invested in a repository
● The SAGE repository workflow and how we implemented the phase one repository with RSuite
● Results and lessons learned
The business opportunity
● Secure SAGE online digital assets
● Reliably deliver online content
● Provide a platform for content analytics
● Enable online product development
The business landscape
● HighWire upgrade to H2O
● Migrating from proprietary DTD to NLM schema
● XML-first workflow from back-end XML conversion
● Online reference and book products that require archive support
SAGE Online Content Repository
Attributes in a repository solution
● Flexibility
  • Analyst-implemented business rules
● Scale
  • To support SAGE journal content
  • To support all SAGE content ongoing
● Access
  • Deployed worldwide
Implementation phases
1. Current journal content
2. Journal backfile
3. Encyclopedias and handbooks
4. Electronic books
SAGE Journals Workflow
● Manuscript Management
● Production Management
● Online Content Repository
The scale
● 560+ journals
● 770,000 articles since 1894
● 40,000 new articles / year
  • 80% PDF with header and references XML
  • 20% Full text
● ~70,000 unique issue deliveries / year to 50+ FTP targets
● 2 full-time headcount, U.S. & UK offices
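As a rough illustration of what these figures imply, the short sketch below derives the yearly article split and an average daily delivery load. The input numbers are the ones quoted on the slide; the even spread of deliveries across 365 days is an assumption made only for this example, and the helper name `split_by_percent` is hypothetical.

```python
# Figures quoted on the slide; everything derived below is plain arithmetic.
NEW_ARTICLES_PER_YEAR = 40_000
PDF_WITH_XML_HEADER_PCT = 80        # PDF with header and references XML
FULL_TEXT_PCT = 20                  # full-text XML
ISSUE_DELIVERIES_PER_YEAR = 70_000  # approximate, per the slide


def split_by_percent(total: int, percent: int) -> int:
    """Return the integer share of `total` represented by `percent`."""
    return total * percent // 100


pdf_articles = split_by_percent(NEW_ARTICLES_PER_YEAR, PDF_WITH_XML_HEADER_PCT)
full_text_articles = split_by_percent(NEW_ARTICLES_PER_YEAR, FULL_TEXT_PCT)

# Assumption (not from the slide): deliveries are spread evenly over 365 days.
deliveries_per_day = ISSUE_DELIVERIES_PER_YEAR / 365

print(pdf_articles)               # 32000
print(full_text_articles)         # 8000
print(round(deliveries_per_day))  # 192
```

On these assumptions, roughly 190 issue deliveries leave the repository on an average day, which gives a sense of why delivery was automated rather than handled manually.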
Overview of a journal issue
[Figure: swimlane flowchart. Unit of content is a journal issue.]
Swimlanes: XML Conversion Vendor (Jouve); Online Content Editor; Content Recipients; SOCR; Journal Production
Steps shown: Start; FTP (UK) or NFS (US); Zip Final Print-Ready PDFs; Ingest Unencoded Issue; Store in Repository; Deliver Unencoded Issue; FTP; Create SAGEMeta XML; Normalize PDF Files; Zip Issue; FTP; Store in Repository; Deliver HW Issue; Ingest Encoded Issue; FTP – HighWire Express; Process Issue for Hosting; Quality Check Issue; Edit Articles; Approve Issue; Deliver Full Issue; Deliver PubMed Abstract XML; FTP and/or NFS sites; Online Preview; Issue Online; Deliver XML Issue; End
Decision points: Changes?; OK to Host?; Issue Online?
Ingest print-ready article PDF
Deliver to encoding vendor
Ingest XML-encoded issue
Deliver to hosting platform
Support editorial approval process
Track go-live on hosting platform
Deliver to additional targets
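The stages highlighted on the workflow slides can be sketched as an ordered pipeline. This is a minimal illustration only, not the RSuite implementation: the stage names are taken from the slide captions, while the strictly linear ordering and the helper names `next_stage` and `remaining_stages` are assumptions made for the example (the flowchart itself includes decision points and loops that are not modeled here).

```python
from typing import List, Optional

# Stage names copied from the slide captions.
# Assumption: they run strictly in this order, with no branching.
STAGES: List[str] = [
    "Ingest print-ready article PDF",
    "Deliver to encoding vendor",
    "Ingest XML-encoded issue",
    "Deliver to hosting platform",
    "Support editorial approval process",
    "Track go-live on hosting platform",
    "Deliver to additional targets",
]


def next_stage(current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None after the last one."""
    index = STAGES.index(current)  # raises ValueError for an unknown stage
    if index + 1 < len(STAGES):
        return STAGES[index + 1]
    return None


def remaining_stages(current: str) -> List[str]:
    """Return every stage still ahead of an issue sitting at `current`."""
    return STAGES[STAGES.index(current) + 1:]


print(next_stage("Deliver to encoding vendor"))
print(len(remaining_stages("Deliver to hosting platform")))
```

Keeping the stages in a single ordered list makes it easy to report where a given issue sits and what is still outstanding, which is the kind of tracking the slides describe.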
The results
● 100,000+ issue deliveries
● 99.5%+ uptime
● Aggressive expansion plans
  • NLM XML-first workflows
  • Ingesting back content: ~770,000 journal articles
  • ~200 encyclopedias and handbooks
  • SAGE Research Methods Online: ~600 books
Important features
● Rich & accessible workflow
  • Human-readable error messaging
  • SAGE configurable
● Supports key XML technologies
● CMS functions
● Web-delivered application
● MarkLogic inside
● Active Directory integration
Lessons learned
● Understand and document your workflows
● Use proven software
● Start small
● Iterate
● Technologists must partner with users
Keith Lawrenz
Senior Business Analyst, Publishing Technologies
RSuite – Booth #218