web design for literary theorists iii: machines read, too (just not well) (v 1.0)

Post on 05-Dec-2014

DESCRIPTION

Third (and last) in a series of workshops for graduate students in the Department of English at UC Santa Barbara. More information: http://patrickbrianmooney.nfshost.com/~patrick/ta/lead-ta/web-design/2013-2014/ YouTube screencast with audio: http://youtu.be/IwuS0K21ZoU

TRANSCRIPT

Introduction to Web Design For Literary Theorists

Third Workshop: Machines Read, Too (just not well)

30 May 2014

Patrick Mooney

Co-Lead TA, 2013-2015

Department of English

UC Santa Barbara

Objectives for this workshop series

● To learn the basic skills involved in building a small website for a course or section.

● To actually build such a web site, and to do a good job of it.

● To engage in practices that minimize the labor required to do so.

● To make your teaching practices more visible on the web.

● To be able to read various versions of HTML and CSS in other places on the web.

Objectives for today’s workshop

● In the last two workshops, I’ve argued that your markup for your site should be semantic (indicating the structure of your document) rather than presentational (because HTML is not a word-processing application).

● In the last workshop, we talked about how to control the presentation of your documents with CSS.

● Today will present the other half of that argument: semantic markup makes your pages intelligible to machines as well as humans.

More specifically …

● Today we will be talking about:

– Additional HTML tags (especially in the <head>)

– Microformats (http://microformats.org)

– Sitemaps (http://sitemaps.org)

– OpenGraph (http://ogp.me)

– Some things Google likes

Details, details ...

● I’m going to be moving over a lot of details rather quickly today.

● You don’t need to memorize them all.

– There are great references on the web, of course.

– This presentation will be online in a few days.

– What’s important is that you pick up major concepts and work along with them.

– Come talk to me in my Lead TA office hours if you have questions!

● A collection of useful links is online at http://is.gd/todoho.

Reminders from previous workshops

● HTML is the standard language for displaying content on web sites.

● An HTML document (“web page”) is a plain text file with markup (“tags”) that indicate the structure of the document to machines that render or otherwise interpret it.

● Your HTML should focus on describing the document’s structure, rather than its appearance.

– To put it another way, you should separate content from information about its presentation.

– Describing the appearance of well-structured content is the function of CSS.

A minimally acceptable XHTML document

● The <!DOCTYPE> declaration is (to you) a string of gibberish whose purpose is to tell the browser what flavor of HTML you’re using.

● The xmlns= attribute on the <html> tag tells XML parsers how to parse the HTML.

● You can just look up these values, or (even better) use existing documents as templates.
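
The listing from the original slide is not preserved in this transcript. As a rough sketch, assuming the XHTML 1.0 Strict flavor (the transcript does not say which flavor the slide used), a minimally acceptable document could look something like this; the title and paragraph text are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <!-- Placeholder title: every document needs one. -->
  <title>A minimal page</title>
</head>
<body>
  <!-- Placeholder content. -->
  <p>Hello, world.</p>
</body>
</html>
```

Everything above the <title> is the part you can copy from a template; the rest is your own content.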

Caddy came to the door and stood there, looking at Father and Mother. Her eyes flew at me, and away. I began to cry. It went loud and I got up. Caddy came in and stood with her back to the wall, looking at me. I went toward her, crying, and she shrank against the wall and I saw her but I pulled at her dress. Her eyes ran.

Versh said, Your name Benjamin now. You know how come your name Benjamin now. They making a bluegum out of you.

— William Faulkner, The Sound and the Fury (page 44 in the Norton Critical Edition)

So how do machines see text?

Why be nice to machines?

1. It’s helpful to your users in various ways, especially the tech-savvy ones.

2. It can enhance the visual appeal of your information when it’s been processed by machines:

– Search engine results

– Shares on Facebook, Google+, etc.

3. Like any constituency, machines appreciate it when you pander to their preferences.

Some additional <head> contents

● <link> – indicates that the HTML file depends on (in some sense) another file to be properly rendered or otherwise processed.

● An example you saw in workshop two:

<head>
  <link rel="stylesheet" type="text/css" href="styles.css">
  <title>Some Books I've Read</title>
</head>
<body>
[...]

● But there are other ways for HTML documents to depend on other documents. We’ll talk about some of these later today.

The <meta> tag

● Always goes in the <head> section of the document.

● As you might expect, encodes meta-information about the document.

● This information is not directly visible to the viewer, but is meaningful to various types of automatic processing.

● There is no authoritative and definitive vocabulary for meta-information, but there are very common vocabularies.

Some examples

<meta name="generator" content="Bluefish 2.2.3" />

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />

<meta name="author" content="Patrick Mooney" />

<meta name="copyright" content="Copyright © 2014 Patrick Mooney" />

<meta name="keywords" content="Southern literature, UCSB, spring 2014, Faulkner, Eudora Welty" />

<meta name="description" content="Notes for my discussion section in English 133SO, Spring 2014, at UC Santa Barbara." />

<meta name="date" content="2014-05-29T03:47:45-0700" />

Indicating the (natural) language of your text

● You can add the lang= attribute to any HTML tag:

<body lang="en">
  <p lang="en-US">In J.M. Synge’s <cite>The Playboy of the Western World</cite>, Christy Mahon refers to a shovel as a <q lang="en-IE">loy</q>.</p>
</body>

● You should generally be only as specific as you need to be.

– ja means “Japanese”; ja-JP means “Japanese as spoken in Japan.”

Microformats

● Are a way of easily indicating certain types of information to automatic processors while remaining invisible to users in browsers:

– Identity information (name, address, phone number, website, etc.)

– Calendar information (event times, locations, etc.)

– Relationships

– Recipes

– Etc.

● Think of them as a way of pointing and yelling “Here it is!” at parsing software.

Attributes for any tag

● <tag id="something">

– Attaches a unique ID to an individual tag for some purpose of your own.

● <tag class="something something_else">

– Indicates that the tag belongs to one or more groups that you yourself designate for some purpose of your own.

● One of these “purposes of your own” involves marking content for styling.

● There are other purposes …
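
To make the two attributes concrete, here is a small hypothetical fragment. The id and class names are invented for illustration and have no special meaning to browsers.

```html
<!-- Hypothetical fragment: "office-hours", "announcement", and "important"
     are names made up for this example. -->
<div id="office-hours" class="announcement important">
  <p>Office hours move to Thursday this week.</p>
</div>

<!-- A second element can share the classes, but not the id. -->
<div class="announcement">
  <p>Section meets in the usual room.</p>
</div>
```

A stylesheet could pick out the first box alone by its id, or both boxes together by the shared announcement class. Microformats reuse the class attribute in the same way, but with class names drawn from an agreed-upon vocabulary.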

Marking up personal information with hCard

● Find the HTML tag that encloses all of the relevant information or, if there isn’t one, surround the information with <span> ... </span> or <div> ... </div>. Give this element the class vcard.

● Mark up whatever relevant information is there with class names from the hCard vocabulary.

– The only required piece of information is fn (“formatted name”), but you can provide a lot of other information if you’d like: email, telephone, web page, address, birthday, photo, etc.

Example: An “about me” web page on your site.

Also: Why bother?
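
The live example from the workshop is not part of this transcript. As a stand-in, here is a minimal sketch of what an “about me” fragment marked up with hCard might look like. The person, email address, phone number, and URLs are all placeholders; the class names (vcard, fn, org, email, tel, url, photo) come from the hCard vocabulary.

```html
<!-- Hypothetical "about me" fragment; every name, number, and URL is a placeholder. -->
<div class="vcard">
  <img class="photo" src="me.jpg" alt="Portrait of Jane Example" />
  <p><span class="fn">Jane Example</span> is a graduate student in the
     <span class="org">Department of English</span>.</p>
  <p>Email: <a class="email" href="mailto:jane@example.edu">jane@example.edu</a></p>
  <p>Phone: <span class="tel">+1-805-555-0100</span></p>
  <p>Web: <a class="url" href="http://example.edu/~jane/">http://example.edu/~jane/</a></p>
</div>
```

In a browser this looks like any other paragraph of contact details. The payoff is for software that understands hCard, which can pick the name, email, and phone number out of the page without guessing which string is which.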

Some more notes

● If you have full control over your page’s code, you should add the hCard profile to your document’s <head>:

<link rel="profile" href="http://microformats.org/profile/hcard" />

● Remember that all you’re really doing is pointing out to non-browser parsers where a certain type of information is.

● You can wind up with a lot of extra <div>s and <span>s. Remember that HTML rendering collapses whitespace, and you can take advantage of this.

Another microformat: hCalendar

● Used for describing events in a machine-readable way.

● If you can, add the profile to your document’s <head>:

<link rel="profile" href="http://microformats.org/profile/hcalendar">

● Mark up the element containing all of the information as class="vevent".

● Required information: when the event starts (dtstart) and its description (summary).

Other considerations

● What if the human-readable information is not machine-friendly?

– Use the <abbr> (abbreviation) tag to encode a machine-friendly version in the tag’s title attribute:

<p class="vevent"><span class="summary">First paper due</span> at <abbr class="dtstart" title="2014-05-19T12:00">noon on May 19</abbr>.</p>

● If you have multiple events on a web page, you can be polite to the parser by giving an element that contains all of them (perhaps <div> or <body>) the class vcalendar.
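
Putting those two points together, a page listing several events might be marked up roughly as follows. This is a sketch only: the second event, its date, and its location are invented placeholders, and location is an optional hCalendar property.

```html
<!-- Hypothetical schedule; the second event and its details are placeholders. -->
<div class="vcalendar">
  <p class="vevent"><span class="summary">First paper due</span> at
     <abbr class="dtstart" title="2014-05-19T12:00">noon on May 19</abbr>.</p>
  <p class="vevent"><span class="summary">Final exam</span> at
     <abbr class="dtstart" title="2014-06-10T08:00">8 a.m. on June 10</abbr> in
     <span class="location">the usual lecture room</span>.</p>
</div>
```

Each vevent carries its own dtstart and summary, and the enclosing vcalendar tells the parser that the events belong together.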

Examples

Marking up events on a section guidelines handout.

Validating your semantic markup.

A final microformat: XFN

● Used for indicating relationships between people.

● If you can, add the profile to your document’s <head>:

<link rel="profile" href="http://gmpg.org/xfn/11">

● Really simple: just add rel="[something meaningful]" to your <a href> links.

● Theoretically, you should use this to talk about your relationship to a real human when you link to a web page representing them: a blog, a Facebook or LinkedIn profile.

What you can put in rel= values

relationship category         XFN values

friendship (at most one)      friend, acquaintance, contact

physical                      met

professional                  co-worker, colleague

geographical (at most one)    co-resident, neighbor

family (at most one)          child, parent, sibling, spouse, kin

romantic                      muse, crush, date, sweetheart

identity                      me

● The only one I myself use with any regularity is rel="me", which means “The page I’m pointing to also represents me.”

● There are plenty of other things you can do with rel= values, though we’re not discussing them.
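
As an illustration, here is a minimal sketch of XFN in ordinary links. The URLs and the office mate are placeholders; the rel= values come from the table above, and several values can be combined by separating them with spaces.

```html
<!-- Hypothetical links; the URLs and the people are placeholders. -->
<p>I also keep <a href="http://example.com/blog/" rel="me">a blog</a>.</p>

<p>My <a href="http://example.edu/~sam/" rel="colleague met">office mate Sam</a>
   also posts section notes online.</p>
```

The first link says that the blog represents the same person as the current page. The second says that the linked page represents a colleague whom the author has met in person.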

Be nice to Google

● One of the reasons for using microformats: Google leverages them for deciding how to display search results.

– If you know them, you can also use RDF or microdata.

● Indicate to Google that your content belongs to you with rel="author" markup:

– <a href="[your Google+ profile URL]?rel=author">, or

– <link rel="author" href="[your Google+ profile URL]"> in your document’s <head>.

– Also link back to your website from your G+ profile.

Sitemaps

● A way of providing hints to search engines:

– Where are files they might not otherwise find?

– How should various documents on your own website be ranked relative to each other?

– How often should search engines check for document updates?

● An XML file, called sitemap.xml, at the root of the site’s directory.

– XML is just another markup language, conceptually similar to HTML, without a completely fixed vocabulary. XML vocabularies depend on context.

A sample

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://patrickbrianmooney.nfshost.com/~patrick/</loc>
    <changefreq>weekly</changefreq>
    <priority>0.9</priority>
  </url>
  <url>
    <loc>http://patrickbrianmooney.nfshost.com/~patrick/credits.html</loc>
    <changefreq>weekly</changefreq>
    <priority>0.1</priority>
  </url>
</urlset>

● xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" just declares the XML namespace: an identifier that tells parsers which vocabulary the tags in this file belong to.

Some considerations

● Sitemaps are understood by Google, Bing, Yahoo!, and other search engines.

● Information in your sitemap that points to a domain name other than the one on which your sitemap is hosted will be ignored.

● As in so many other ways, if a search engine can tell that you’re trying to deceive it somehow, this actually hurts your search engine ranking.

● You can sign up for Google Webmaster Tools (there’s one for Bing, too) to see stats about your website and make sure that your sitemap is understood.

Open Graph

● A way of providing basic information that controls how social networks display your web pages.

● Facebook-originated, but also understood by LinkedIn and Google+ (and others).

● Consists of a set of tags you put in your document’s <head>.

● If you omit it, social networks will try to guess based on other page information.

● Twitter and Pinterest understand it but prefer that you use their own metadata vocabularies.

● Add a bit of verbiage to your root <html> element:

<html prefix="og: http://ogp.me/ns#">

● Alas, this is technically invalid in any version of HTML other than HTML5.

● However, even if it’s technically invalid, it still works fine.

● Four characteristics are mandatory to be minimally compliant:

<meta property="og:title" content="Discussion Notes for George Eliot's Middlemarch" />

<meta property="og:type" content="website" />

<meta property="og:url" content="[the document’s actual URL]" />

<meta property="og:image" content="[an image URL]" />

● There are other properties you can specify for more control.
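
Assembled in one place, the pieces above might look like the following sketch. The URLs and the description text are placeholders; og:description, og:site_name, and og:locale are three of the optional properties defined in the Open Graph vocabulary.

```html
<html prefix="og: http://ogp.me/ns#">
<head>
  <title>Discussion Notes for George Eliot's Middlemarch</title>

  <!-- The four required properties; the URLs are placeholders. -->
  <meta property="og:title" content="Discussion Notes for George Eliot's Middlemarch" />
  <meta property="og:type" content="website" />
  <meta property="og:url" content="http://example.edu/~jane/middlemarch.html" />
  <meta property="og:image" content="http://example.edu/~jane/images/middlemarch.jpg" />

  <!-- Optional properties for more control; the values are placeholders. -->
  <meta property="og:description" content="Notes and questions for this week's discussion section." />
  <meta property="og:site_name" content="Jane Example's Teaching Pages" />
  <meta property="og:locale" content="en_US" />
</head>
<body>
  <p>[...]</p>
</body>
</html>
```

When a page like this is shared, a social network that understands Open Graph can use the title, image, and description you supplied instead of guessing.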

Increasing your search engine ranking

● Write valid HTML.

● Engage in good semantic markup practices so that search engines understand the structure of your document.

– Put meaningful values in <h1>, <h2>, etc.

– Use microformats (or microdata or RDF).

– Check the validity of your documents using validators. There are good ones available free on the Internet.

● Never ever try to deceive search engines.

● Treat your own home page as a central hub for your online presence.

– Link to it from your other online presences.

– Link back to your other online presences from it.

– This process of reciprocal linking helps search engines to determine that your identity hub is, in fact, your identity hub.

● Hope for links to your content from other places.

● Use good metadata to describe your site.
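
A few of these suggestions can be seen together in one short, hypothetical page. The course, headings, and URLs below are placeholders; the point is only that the headings describe the document’s structure, the description summarizes it, and the hub page links out to another online presence with rel="me".

```html
<!-- Hypothetical page; the course, headings, and URLs are placeholders. -->
<html>
<head>
  <title>Section Notes: Introduction to Literary Study</title>
  <meta name="description" content="Weekly notes for my discussion section." />
</head>
<body>
  <h1>Section Notes: Introduction to Literary Study</h1>

  <h2>Week 1: Close Reading</h2>
  <p>[...]</p>

  <h2>Week 2: Narrative Voice</h2>
  <p>[...]</p>

  <!-- Reciprocal linking: the profile should link back to this page as well. -->
  <p>Elsewhere: <a href="http://example.com/profiles/jane" rel="me">my professional profile</a></p>
</body>
</html>
```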

How much of this is worthwhile?

● Admittedly, doing all of this for every HTML document that you write would be an awful lot of work, especially at first.

● The short answer: you should do whatever you feel comfortable doing and have time to do. Each of the techniques we’ve talked about today will benefit you in some ways.

● As with any skill, it gets easier as you do it more often.
