The Web Comes Alive with Data!
Schema.org and Structured Data on the Web: Past, Present, Potential
@jaymyers
Google DevFest Twin Cities
February 8, 2014
• Early adopter
• Semantic Web, Linked & Open data enthusiast
• Speaker
Web of Today
• 25 million web sites
• Trillions of web pages
• 5 billion web pages change every day
• 1000x more web pages on the “deep web”
Structured data
Transform
Users
Machines
Kim Kardashian
jobTitle: “Actress”*
jobTitle: “Director”*
birthDate: “1980-10-21”
provider: “HollywoodLife”
provider: “TooFab”
* questionable, but we’ll go with it
Goals
• Create a web for both humans and machines
• Entice webmasters to make metadata available through structured HTML
• Gain access to the meaning of web sites
Early Attempts
• Meta Content Framework
• RDF
• OWL
Semantic Web
“A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities” - TBL
Microformats (‘03)
Addresses, geo, blog posts, media (images/video), news, products, recipes, reviews and more!
Microformats example
<div class="item hproduct">
  <ol>
    <li class="lister vcard"><a class="url fn" href="http://storename.com">Magers and Quinn</a></li>
    <li class="category"><a href="http://storename.com/categories/books">Books</a></li>
  </ol>
  <img src="http://images.storename.com/products/ramsay-fast-food.jpg" class="photo" alt="gordon ramsay fast food book" />
  <p><span class="condition">New:</span> <span class="price">$27.99</span></p>
  <p>Pub price: $35.00</p>
  <p>Hardcover</p>
  <p class="availability">Out of stock</p>
  <h1 class="fn">Gordon Ramsay's Fast Food Recipes from the F Word</h1>
  <p>By Ramsay, Gordon</p>
  <dl class="identifier">
    <dt>ISBN:</dt>
    <dd>1554700647</dd>
  </dl>
  <dl class="identifier">
    <dt>Publisher:</dt>
    <dd>Amer Youth Hostels</dd>
  </dl>
  <h4>Publishers Comments</h4>
  <p class="description">A celebrity host of Hell's Kitchen features more than one hundred accessible recipes that are organized in accordance with everyday needs and special occasions, in a volume that places an emphasis on fast preparation and features complementary tips on stocking a pantry.</p>
</div>
Ontology Models: FOAF
Jay Myers a foaf:Person
  foaf:mbox <mailto:[email protected]>
  foaf:homepage <http://jaymmyers.tumblr.com>
  foaf:nick “Jaydog”
  foaf:knows Lloyd Cledwyn
Lloyd Cledwyn a foaf:Person
  foaf:mbox <mailto:[email protected]>
  foaf:nick “ProfessorLloyd”
  foaf:homepage <http://stthomas.edu>
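The FOAF graph above boils down to a set of subject, predicate, object statements. A minimal sketch of that idea in plain Python follows; it is not an RDF library, the identifiers `ex:jay` and `ex:lloyd` are hypothetical (the slide gives no URIs for the two people), and the helper names are made up for illustration.

```python
# Each statement is a (subject, predicate, object) triple.
# "ex:jay" and "ex:lloyd" are hypothetical identifiers, not from the slide.
TRIPLES = [
    ("ex:jay", "rdf:type", "foaf:Person"),
    ("ex:jay", "foaf:homepage", "http://jaymmyers.tumblr.com"),
    ("ex:jay", "foaf:nick", "Jaydog"),
    ("ex:jay", "foaf:knows", "ex:lloyd"),
    ("ex:lloyd", "rdf:type", "foaf:Person"),
    ("ex:lloyd", "foaf:nick", "ProfessorLloyd"),
    ("ex:lloyd", "foaf:homepage", "http://stthomas.edu"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def nicknames_of_friends(triples, person):
    """Follow foaf:knows one hop, then read each friend's foaf:nick."""
    nicks = []
    for friend in objects(triples, person, "foaf:knows"):
        nicks.extend(objects(triples, friend, "foaf:nick"))
    return nicks


if __name__ == "__main__":
    print(nicknames_of_friends(TRIPLES, "ex:jay"))
```

Because every fact has the same three-part shape, a machine can follow links such as `foaf:knows` without knowing anything about the page layout the data came from.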
Ontology Models: SKOS
National Basketball Association teams a skos:Concept
  skos:prefLabel “National Basketball Association teams”
Concepts linked to it by skos:broader:
  category:Defunct_National_Basketball_Association_teams
  category:Atlanta_Hawks
  category:Minnesota_Timberwolves
  category:National_Basketball_Association_franchise_relocations
  category:Boston_Celtics
Ontology Models: GoodRelations
a gr:Offering
  gr:includesObject Euro Cuisine – 8" Heart-Shape Waffle Maker
Euro Cuisine – 8" Heart-Shape Waffle Maker a gr:ProductOrService
  gr:description “Make a delicious breakfast treat…”
  gr:hasManufacturer “Euro Cuisine”
  gr:hasMPN “WM520”
  gr:category “Waffle_Makers”
RDFa
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:gr="http://purl.org/goodrelations/v1#"
      xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
      xmlns:v="http://www.w3.org/2006/vcard/ns#"
      xmlns:r="http://rdf.data-vocabulary.org/#">
<div class="vcard" typeof="gr:LocationOfSalesOrServiceProvisioning" about="#store_201">
  <h1 id="site_title" property="geo:lat_long" content="29.521643, -98.493599">
    <a href="http://stores.bestbuy.com/201">Best Buy - San Antonio</a>
  </h1>
</div>
<div rel="v:adr">
  <p class="geo" typeof="v:Address v:Work">
    <strong><span property="v:street-address">125 Nw Loop 410 Ste 201</span></strong><br />
    <strong>
      <span property="v:locality">San Antonio, </span>
      <span property="v:region">TX</span>
      <span property="v:postal-code">78216</span>
    </strong>
    <br />
    Phone: <span property="v:tel"><span typeof="v:Tel v:Home">888-229-3770</span></span><br />
    Email: <a href="mailto:[email protected]" rel="v:email">[email protected]</a>
  </p>
  <span rel="v:geo">GEO: <span property="v:latitude">29.521643</span>, <span property="v:longitude">-98.493599</span></span>
</div>
Why?
Circa 2007
Additional content on SERPs
Data automagically extracted from HTML
Value prop: “Give us your data in a machine-readable format and we’ll make your stuff more attractive in search results”
Results
• 1000x increase in structured markup
• Increases in user engagement (click-throughs) for SERP objects created from structured markup
• Small number of interesting applications built on top of structured data
But…
• Too many choices (syntax, ontology, etc.), fragmented
• A lot of bad markup – up to 40%
• Not easy enough for your average “Joe Webmaster”
2010
schema.org
• Common vocabularies that search engines can understand
• Lower the bar for webmasters to publish data on the web
• Improve user experience through data
Introducing: Microdata
<div id="pagecontent" itemscope itemtype="http://schema.org/Person">
  <a href="/media/rm974696448/nm2578007?ref_=nm_ov_ph">
    <img id="name-poster" alt="Kim Kardashian Picture" title="Kim Kardashian Picture" src="http://ia.media-imdb.com/images/M/MV5BMTc0MjkzOTAxNV5BMl5BanBnXkFtZTcwNTk1NjcyNw@@._V1_SX214_CR0,0,214,317_.jpg" itemprop="image"/>
  </a>
  <h1 class="header"> <span class="itemprop" itemprop="name">Kim Kardashian</span></h1>
  <div class="infobar" id="name-job-categories">
    <span class="itemprop" itemprop="jobTitle">Actress</span>
    <span class="itemprop" itemprop="jobTitle">Producer</span>
  </div>
  <div class="inline" itemprop="description">
    TV star, entrepreneur, fashion designer, and author (New York Times best-seller - "Kardashian Konfidential"), Kim Kardashian first burst onto the scene in 2007, after the premiere of her hit E! Entertainment reality series ...
  </div>
  <time datetime="1980-10-21" itemprop="birthDate">
    <a href="/search/name?birth_monthday=10-21&refine=birth_monthday&ref_=nm_ov_bth_monthday" >October 21</a>,
    <a href="/search/name?birth_year=1980&ref_=nm_ov_bth_year" >1980</a>
  </time>
</div>
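To show how a machine might read markup like the Microdata example, here is a rough sketch using only Python's standard `html.parser`. It is not a conforming Microdata parser: it assumes well-formed HTML and a single flat item, and it only special-cases `img` (uses `src`) and `time` (uses `datetime`). The class name, the function name, and the shortened sample markup are all illustrative assumptions.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they can never wrap text content.
VOID_TAGS = {"img", "meta", "link", "br", "hr", "input"}


class ItempropCollector(HTMLParser):
    """Collect (itemprop, value) pairs from one flat microdata item."""

    def __init__(self):
        super().__init__()
        self.pairs = []    # finished (property, value) pairs
        self._stack = []   # one entry per open element: None or [prop, text_parts]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        entry = None
        if prop:
            if tag == "img":
                self.pairs.append((prop, attrs.get("src", "")))
            elif tag == "time" and "datetime" in attrs:
                self.pairs.append((prop, attrs["datetime"]))
            else:
                entry = [prop, []]  # value will be the element's text
        if tag not in VOID_TAGS:
            self._stack.append(entry)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        entry = self._stack.pop()
        if entry:
            prop, parts = entry
            # Collapse runs of whitespace, including line breaks.
            self.pairs.append((prop, " ".join("".join(parts).split())))

    def handle_data(self, data):
        for entry in self._stack:
            if entry:
                entry[1].append(data)


def extract_itemprops(html):
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Shortened, hypothetical version of the slide's markup.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <img src="http://example.com/kim.jpg" itemprop="image"/>
  <h1><span itemprop="name">Kim
  Kardashian</span></h1>
  <span itemprop="jobTitle">Actress</span>
  <span itemprop="jobTitle">Producer</span>
  <time datetime="1980-10-21" itemprop="birthDate">October 21, 1980</time>
</div>
"""

if __name__ == "__main__":
    for prop, value in extract_itemprops(SAMPLE):
        print(prop, "=", value)
```

Note that `birthDate` comes from the machine-readable `datetime` attribute rather than the human-readable text, which is the point of marking up the same page for both users and machines.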
Looks Like We’ve Got Something Here!
• 15% of all sites contain schema.org markup
• Many major sites
• Adoption by content systems like Drupal and WordPress
• Around 1200 object types and growing
• Significant reduction in error rates
Practical Applications in Search: Yahoo! Related Entities
Practical Applications in Search: Yandex Islands
Practical Applications in Search: Google Knowledge Graph
Additional content driven by schema.org derived data
Other Applications: Pinterest Rich Pins
Other Applications: Gmail “Actions in the Inbox”
• Actions – rent a movie, buy something
• Orders – post transaction order confirmation, shipping status
• Reservations – restaurant, travel, tickets
Other Applications: JSON-LD
{
  "@context": "http://schema.org",
  "@type": "Person",
  "name": "John Doe",
  "jobTitle": "Technologist",
  "affiliation": "Big Boxen 'R' Us",
  "additionalName": "Johnny",
  "url": "http://www.example.com",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1234 Freeze Drive",
    "addressLocality": "Icebox",
    "addressRegion": "Minnesota"
  }
}
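Since JSON-LD is ordinary JSON, a site could plausibly generate it from existing data with nothing more than a JSON serializer. The sketch below uses Python's standard `json` module; the function names are hypothetical, only a subset of the slide's properties is included, and embedding the result in a `script` element of type `application/ld+json` is one common publishing approach rather than something the slides prescribe.

```python
import json


def person_jsonld(name, job_title, locality, region):
    """Build a schema.org Person description as a plain dictionary."""
    return {
        "@context": "http://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressRegion": region,
        },
    }


def as_script_tag(data):
    """Serialize the dictionary and wrap it for embedding in an HTML page."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    doc = person_jsonld("John Doe", "Technologist", "Icebox", "Minnesota")
    print(as_script_tag(doc))
```

Serializing through `json.dumps` also guarantees straight ASCII quotes, which matters because curly quotes pasted from a word processor make the JSON invalid.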
Thank you @jaymyers
Credits
Guha, Ramanathan V. “Light at the End of the Tunnel.” 12th International Semantic Web Conference (ISWC), Sydney, NSW, Australia. 23 October 2013. Keynote Address.