metadata: standards basics for the independent publishing community, with graham bell, executive...
Post on 13-Sep-2014
DESCRIPTION
The better your metadata, the better your sales: that's the simple truth. Books with complete metadata sell almost three times better than books with incomplete metadata, so there's a very good reason to learn how to format and transmit this information to your industry partners. But where to begin? In this session, Graham Bell, Chief Data Architect at EDItEUR, will offer practical guidance on writing, formatting, and transmitting metadata in accordance with industry standards and best practices, and help to make your metadata work for you. This is the third in a three-part series, co-produced by IBPA and hosted by BISG, aimed at demystifying several of the core book industry standards through "101"-style sessions presented by experts in the field.
TRANSCRIPT
Metadata standards basics for independent publishers
Graham Bell
EDItEUR
IBPA / BISG webinar series
30th April 2014
About me
• 20 years' experience at the point where book publishing and technology meet
• formerly senior manager in IT department for HarperCollins UK
• led development of bibliographic, editorial and digital asset management systems
• involved in e-book, e-audio, print-on-demand and online projects
• joined EDItEUR in mid-2010, primarily responsible for ONIX development
About EDItEUR
• not-for-profit membership organisation
• develops, supports and promotes metadata and identification standards for the book, e-book and serials supply chains
• based in London, but a global membership of publishers, distributors, wholesalers, subscription agents, retailers, libraries, system vendors, rights and trade associations
• acknowledged centre of expertise on standards and metadata for the industry
About EDItEUR
• also provides management services to LCC, International ISBN, ISTC, ISNI Agencies
• EDItEUR has three full-time staff, one FTE part-time staff, plus access to consultants from both the book and serials sectors
• works closely with other standards bodies, to ensure EDItEUR standards meet the needs of their stakeholders too
• EDItEUR member participation is vital, to ensure standards keep pace with evolving business requirements
What is metadata?
• often defined as ‘data about data’, but this is inadequate in the publishing context
• ‘all the product management information that’s needed to make your business work’
• to describe the stuff you publish
• to support commercial processes
• both internal and external requirements
• for the full product lifecycle
• data modelling and communication aspects
And what are identifiers?
Identifiers
• unique labels for ‘things’ (in IT systems)
• to address
• to disambiguate
• to collocate
• ideally, persistent, meaningless, managed
• to allow unambiguous communication
• identification is contextual
• identifiers have specific functions or roles
• and a scope (the types of thing that can be identified)
Metadata and identifier standards – critical enablers for your business
Words in search of a definition
• tagging
• edition
• publication date
• series, collection, imprint, backlist, out of print, reissue, trade paperback…
• metadata requires care for semantics
An SEO expert walks into a bar, pub, tavern, public house, Irish pub, drinks, beer, alcohol…
Third edition
Paperback edition
First edition
The date on which a retail consumer may purchase and take possession of a physical product, or the date on which a retail consumer may access and use a digital product
The nominal or approximate date on which the product is made available in the market, used largely for planning and business process purposes. Actual availability to the retailer may be no more than a handful of days prior to (or after) this date and – in the absence of a sales embargo – retail fulfillment to consumers may begin immediately stock is available. For titles where a sales embargo is in place, stock must be sequestered by the retailer until the embargo expires (or one day prior, for mail order fulfillment)
identifiers
• ISBN (products)
• ISTC (works)
• ISNI (people)
• SAN (places)
metadata
• MARC (libraries)
• ONIX (book trade)
• BISAC (subject)
• Thema (subject)
“I am Spartacus!”
ISNI public identity identifier
ISNI
• International Standard Name Identifier
• ISO Standard 27729 for identification of public identities of parties in creative industries
• parties may be people, organisations or even fictional people (for pseudonyms, characters)
• typical use case – identify an author, establish difference from another author of the same name, or establish same persona as musician or actor (possibly of different ‘name’)
• a cross-domain ‘bridge identifier’, linking data across multiple sources
Richard Holmes
Richard Holmes
Richard Holmes
0000 0001 1768 5542
0000 0001 2147 5396
0000 0000 7725 4712
What does it look like?
ISNI 0000 0000 7725 4712
• for display purposes only
• identifies sector of initial registration
• check digit may be X
• 15 decimal (base 10) digits for persona
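The structure above (15 decimal digits plus a check character that may be X) can be checked in code. The sketch below assumes the check character follows ISO 7064 MOD 11-2, which the slide does not state; the function names are illustrative only. The sample values are the ISNIs shown in this deck.

```python
def isni_check_character(first_15_digits: str) -> str:
    """Compute a MOD 11-2 check character for the 15 leading digits.

    Assumption: ISNI uses ISO 7064 MOD 11-2 (not stated on the slide).
    """
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(isni: str) -> bool:
    """Validate an ISNI written with or without the display prefix and spaces."""
    compact = isni.upper().replace("ISNI", "").replace(" ", "")
    if len(compact) != 16:
        return False
    body, check = compact[:15], compact[15]
    if any(ch not in "0123456789" for ch in body):
        return False
    return isni_check_character(body) == check


# The display form and the compact form validate identically.
display_ok = is_valid_isni("ISNI 0000 0000 7725 4712")
compact_ok = is_valid_isni("0000000077254712")
```

Stripping the prefix and spaces before validating reflects the point that they exist for display only; a system would normally store the 16-character compact form.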
Implementing ISNI
• strictly, does not identify a person or a name
• person (or party)
• persona
• personality (or presentation)
• ISNIs identify personas, or public identities
• Фёдор Достоевский = Fyodor Dostoyevski
• Richard Bachman ≠ Stephen King (but well-known pseudonyms can be linked)
• Sue Welfare ≠ ***** *** (allows for anonymity)
Current status
• 2010 – International ISNI Agency http://www.isni.org
• 2012 – standard published by ISO
• central registration system developed and operated by OCLC in the Netherlands
• first eight million ISNIs already assigned, based on pre-existing data in VIAF and IPDA
Benefits of ISNI
• proprietary identifiers are not cross-publisher, and do not provide certainty over the entity (is it a person or a persona?)
• ISNI provides cross-publisher, cross-sector public identity
• developing links to ORCID, ResearcherID, PlusID, Ringgold
• in future, will likely also identify names of organisations
A contributor identifier is a tool to associate your product with others from the same creator
– authoritatively
– cross-media
http://www.isni.org
Identifiers and metadata
• inextricably linked. Each type of identifier has a minimum set of metadata attributes, whereby any change in the metadata implies a change in the identifier
• attributes define what is unique about the identified thing
• an identifier can be thought of as ‘shorthand’ for one particular set of attribute values
• metadata registries (eg Bowker Books in Print) hold both the identifiers and the metadata
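The idea that an identifier is shorthand for one set of attribute values can be sketched as a toy registry. Everything here is hypothetical: the attribute set (title, product form, edition), the `Registry` class and the `ID-0001` style labels are invented for illustration and are not how ISBN or any real registry assigns identifiers.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class DefiningAttributes:
    """Hypothetical minimum attribute set that defines one identified thing."""
    title: str
    product_form: str
    edition: str


class Registry:
    """Toy registry holding both the identifiers and the metadata."""

    def __init__(self) -> None:
        self._by_attributes: dict[DefiningAttributes, str] = {}
        self._sequence = count(1)

    def identifier_for(self, attrs: DefiningAttributes) -> str:
        # Same attribute values: same identifier. Any change: a new one.
        if attrs not in self._by_attributes:
            self._by_attributes[attrs] = f"ID-{next(self._sequence):04d}"
        return self._by_attributes[attrs]


registry = Registry()
hardback = DefiningAttributes("Example Title", "hardback", "first")
paperback = DefiningAttributes("Example Title", "paperback", "first")

id_hardback = registry.identifier_for(hardback)
id_paperback = registry.identifier_for(paperback)
id_hardback_again = registry.identifier_for(
    DefiningAttributes("Example Title", "hardback", "first")
)
```

Changing only the product form yields a second identifier, while repeating the same attribute values returns the first one again.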
ONIX for Books
What is ONIX?
• ONIX for Books is a standard data format based on XML, used to convey a rich range of information about book and book-related products between computer systems in the book and e-book supply chain
• publisher to retailer
• direct, or via intermediaries such as distributors, data collation services, data registries, wholesalers
Typical use cases
• a publisher needs to provide information about its catalogue of products to a distributor, wholesaler, retailer or other supply chain partner
• includes both current and forthcoming products
• may cover basic product information and a wide range of collateral material
• scope extends over the full lifecycle for book, e-book and other products – ie includes post-publication updates to price and availability
• ONIX for Books is a standardised message specification, not a database
• but what you can deliver in your ONIX is dependent on the design of your in-house database
• ONIX data model often used to guide design of internal applications
Roots of ONIX
• 1997 EPICS and BIC Basic
• 1998 <indecs> project
• 1998 W3C XML specification
• 1999 ‘Online Information Exchange’ initiative from AAP Digital Issues working party
• ONIX developed by EDItEUR, originally in collaboration with BISG (USA) and BIC (UK)
• March 2000 – ONIX International v1.0
Roots of ONIX
• current status
• managed by EDItEUR and international steering committee
• June 2003 ONIX v2.1 – most widely deployed
• April 2009 ONIX v3.0 – growing in importance
• widely used in North America, Western Europe, Japan, Russia, parts of Eastern Europe, Korea, growing in China
• used by small and large organisations alike
support for 2.1 will be reduced at end of 2014
ONIX business benefits
• standard is free of charge to use
• for publishers – enables supply of rich metadata in a single, standard format, for all downstream needs
• for distributors, retailers – efficient, timely delivery and aggregation of data from multiple publishers
• a shared ‘language’ enables unambiguous electronic communication
<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <NameIdentifier>
    <NameIDType>16</NameIDType>
    <IDValue>0000000121479135</IDValue>
  </NameIdentifier>
  <PersonNameInverted>Sjöwall, Maj</PersonNameInverted>
  <BiographicalNote textformat="05"><p>Maj Sjöwall is a poet. She lives in Sweden.</p></BiographicalNote>
</Contributor>
3.0
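A receiving system might read the contributor block above along these lines. This is a minimal sketch using Python's standard library, not a full ONIX parser: it assumes the fragment carries no XML namespace (a complete ONIX 3.0 message normally declares one), and the two lookup tables are a tiny assumed subset of the ONIX code lists (A01 as "By (author)", name ID type 16 as ISNI).

```python
import xml.etree.ElementTree as ET

ONIX_CONTRIBUTOR = """
<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <NameIdentifier>
    <NameIDType>16</NameIDType>
    <IDValue>0000000121479135</IDValue>
  </NameIdentifier>
  <PersonNameInverted>Sjöwall, Maj</PersonNameInverted>
  <BiographicalNote textformat="05"><p>Maj Sjöwall is a poet. She lives in Sweden.</p></BiographicalNote>
</Contributor>
"""

# Assumed subset of the ONIX code lists, only for this sketch.
CONTRIBUTOR_ROLES = {"A01": "By (author)"}
NAME_ID_TYPES = {"16": "ISNI"}


def parse_contributor(xml_text: str) -> dict:
    """Extract role, name and name identifiers from one <Contributor> block."""
    node = ET.fromstring(xml_text.strip())
    identifiers = {}
    for name_id in node.findall("NameIdentifier"):
        id_type = name_id.findtext("NameIDType")
        # Fall back to the raw code when it is not in the lookup table.
        identifiers[NAME_ID_TYPES.get(id_type, id_type)] = name_id.findtext("IDValue")
    role_code = node.findtext("ContributorRole")
    return {
        "sequence": int(node.findtext("SequenceNumber")),
        "role": CONTRIBUTOR_ROLES.get(role_code, role_code),
        "name_inverted": node.findtext("PersonNameInverted"),
        "identifiers": identifiers,
    }


contributor = parse_contributor(ONIX_CONTRIBUTOR)
```

Because the role and identifier type travel as codes rather than free text, the receiver can interpret them without guessing at wording.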
ONIX 3.0 data elements
• message details
• identity and authority
• record details
• product identifiers
• 1. descriptive details
• product form
• special features
• packaging
• physical size
• DRM, usage constraints
• trade classification
• product parts
• collection titles
• titles
• contributors
• conference
• edition
• language
• extent
• subject
• audience
ONIX 3.0 data elements
• 2. collateral details
• supporting text
• cited material
• supporting resources
• prizes
• 3. content detail
• 4. publishing details
• imprint and publisher
• contact details
• lifecycle dates and status
• copyright details
• territorial sales rights
• 5. related material
• related works
• related products
• 6. supply details
• market-specific details
• suppliers
• discounts
• prices and tax
Implementing ONIX
• apparently complex, but modular & consistent
• not too large for a single developer with simple software such as Filemaker
• implemented in many off-the-shelf solutions
• what should you be looking for? http://www.ipg.uk.com/?id=4815
• BISG best practices https://www.bisg.org/product-metadata-best-practices
Thema – the subject category scheme for a global book trade
Thema
• launched at Frankfurt Book Fair last year, and version 1.0 released November 2013
• already gaining real traction
• endorsed by International Publishers Association, BISG, BookNet Canada and organisations in a dozen other countries
• works alongside BISAC (but likely to replace some schemes in other countries)
• interactive http://editeur.dyndns.org/thema
<Subject>
  <MainSubject/>
  <SubjectSchemeIdentifier>93</SubjectSchemeIdentifier>
  <SubjectSchemeVersion>1.0</SubjectSchemeVersion>
  <SubjectCode>FFP</SubjectCode>
</Subject>
3.0
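Going the other way, a publisher's system could generate the subject block above from a stored Thema code. The sketch below is one possible approach with Python's standard library; the helper name `thema_subject` and its parameters are hypothetical, and the scheme identifier 93 and version 1.0 are simply taken from the example on the slide.

```python
import xml.etree.ElementTree as ET

# Scheme identifier as used in the slide's example.
THEMA_SUBJECT_SCHEME = "93"


def thema_subject(code: str, version: str = "1.0", main: bool = False) -> ET.Element:
    """Build a <Subject> element for a Thema code (hypothetical helper)."""
    subject = ET.Element("Subject")
    if main:
        # Empty flag element marking this as the main subject.
        ET.SubElement(subject, "MainSubject")
    ET.SubElement(subject, "SubjectSchemeIdentifier").text = THEMA_SUBJECT_SCHEME
    ET.SubElement(subject, "SubjectSchemeVersion").text = version
    ET.SubElement(subject, "SubjectCode").text = code
    return subject


subject_element = thema_subject("FFP", main=True)
subject_xml = ET.tostring(subject_element, encoding="unicode")
```

Keeping the scheme identifier and version next to the code lets a recipient tell a Thema code apart from a BISAC code carried in another `<Subject>` block of the same record.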
A global scheme
• managed by EDItEUR with international steering committee – same model as ONIX
• designed to reduce national biases
• multi-lingual
• national extensions for extra detail
• mapping from BISAC to Thema for backlist
• including large-scale auto-mapping
http://www.booknetcanada.ca/blog/2014/4/24/introducing-bncs-bisac-to-thema-translator.html
Increased revenues
Showing 13–24 of 841,539 results
‘Titles that meet the BIC Basic standard see average sales 98% higher than those that don’t meet the standard.’
• based on a study of the top-selling 100,000 ISBNs in the UK in 2011
‘[For online sales, products] with progressively increasing amounts of enhanced metadata see progressively increasing average sales.’
http://www.nielsenbookdata.co.uk/controller.php?page=1129
‘Research has proven that the more information customers have about a book, the more likely they are to buy it.’
‘ONIX provides a way to transmit this information in a clean and seamless way across multiple trading partner relationships.’
https://www.bisg.org/onix-books
‘Didn’t we used to have a sales team.’
Metadata is all you have
…at least in the growing part of the market