Web Design Workshop

DIG 4104c

Spring 2014

Dr. J. Michael Moshell

University of Central Florida

Lecture 9:

MoranVision: A Case Study

www.redbugbblle.com

-2 -

About David Moran

* Grad student in Digital Media MFA Program

* Gay activist

* Non-driver --- fascinated with urban landscape

What are the special problems of pedestrians in Orlando?

* MFA Project: Interactive Photo Exhibition

"Dead Quare Walking"

* Concept: Halloween at Parliament House

Then walk 15 miles to UCF, all night

200+ photos, along the way

-3 -

David is a photographer

* Needed a software delivery environment

* I like to hack software, so

-- I am building the environment for him.

CONCEPT: Viewer is shown a picture, asked to hashtag it.

The hashtag is used (in some way) to select the next picture.

The study: what patterns emerge in the hashtags

when people react to David's photography?

-4 -

Design of Basic Architecture

* Front end: HTML5 & Javascript

* Back End: PHP

* Why? It's what I know, and

we want this site to be on the web.

* Front end: Arrange the pictures and user interaction.

Take in a hashtag, pass it to the back end.

* Back end: perform cosine distance measure to

find best image, inform front end of choice.

advanceblueprintservice.com

-5 -

The problems with hashtags

* No easy way to separate out the words

e.g. #sunnyplaceforshadypeople

* How do you find the next picture?

So, we decided to require CamelCase from user inputs.

SunnyPlaceForShadyPeople

(The demo is on a computer, not a mobile device)

-6 -

Organizing the Exhibition

Assume you've gathered a word-cloud (list of words) for each of 200 pictures.

How do you organize the pictures? How do you

search for the "next picture?"

David's decision: nine "pages" of 4 pictures each

Image groups of about 20 photos per page.

-7 -

Word Clouds and Distance Metrics

Each picture should have a GROWING WORD-CLOUD as users wander through the exhibition.

Metaphor: how paths are (should be) designed on campuses.

1) Just plant grass and watch where

people walk

2) Then put the concrete there.

leveragepoint.typepad.com

-8 -

Word Clouds and Distance Metrics

So, we need two things:

1) A database structure to associate a large and

growing number of words with each picture

2) An algorithm to measure the "distance" between a given

hashtag (small wordcloud) and each picture's wordcloud

Then we will (a) add the hashtag to the CURRENT pic's cloud,

and (b) pick the NEXT picture whose cloud is most similar to the

hashtag.

-9 -

Word Clouds and Distance Metrics

Literature research led to a popular distance metric:

The Cosine Measure.

1) take each document and produce a frequency histogram
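The histogram step can be sketched in a few lines. This is an illustration in JavaScript, not the project's PHP back-end code; the function name `histogram` and the tokenizing rule (lowercase, then runs of letters, digits and apostrophes) are assumptions.

```javascript
// Build a frequency histogram: word -> number of occurrences.
// Assumption: words are lowercased runs of letters, digits and apostrophes.
function histogram(text) {
  const counts = new Map();
  const words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

const histo = histogram('A distance and a distance and more');
console.log(histo.get('a'), histo.get('distance'), histo.get('and')); // 2 2 2
```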

-11 -



The Cosine Measure.



To measure the distance between two

documents, line up their histograms

and multiply the matching terms.

-13 -



The Cosine Measure.






Some are "noise words": to, the, a, etc.
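One simple way to deal with noise words is to drop them before counting. The slide does not say how the project handles them, so this is only a sketch; the list `NOISE_WORDS` and the helper `dropNoiseWords` are hypothetical.

```javascript
// Illustrative noise-word list; a real one would be longer.
const NOISE_WORDS = new Set(['to', 'the', 'a', 'an', 'and', 'of']);

// Return only the words that are not in the noise list (case-insensitive).
function dropNoiseWords(words) {
  return words.filter(word => !NOISE_WORDS.has(word.toLowerCase()));
}

console.log(dropNoiseWords(['Walk', 'to', 'the', 'Park'])); // [ 'Walk', 'Park' ]
```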

-14 -

So, Job 1: Build cosine metric

// Frequency histogram: word => number of occurrences

$histo['a'] = 2;

$histo['distance'] = 2;

$histo['and'] = 2; // etc.

Idea: use the SHORTER list to search the LONGER one.

If it's not in the shorter list, the product = 0 anyhow.
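The shorter-list idea can be sketched as a sum of products that only walks the smaller histogram. This is JavaScript rather than the PHP of the slide, and the name `dot` and the sample data are assumptions.

```javascript
// Sum of products of matching word counts.
// Only the shorter histogram is walked: a word missing from it contributes 0.
function dot(a, b) {
  const [shorter, longer] = a.size <= b.size ? [a, b] : [b, a];
  let sum = 0;
  for (const [word, count] of shorter) {
    sum += count * (longer.get(word) || 0);
  }
  return sum;
}

const tag = new Map([['sunny', 1], ['place', 1]]);
const cloud = new Map([['sunny', 3], ['street', 2], ['place', 1], ['night', 4]]);
console.log(dot(tag, cloud)); // 1*3 + 1*1 = 4
```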

-15 -

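Problem: Long vs. Short

A raw sum of products grows with the length of the lists, so a long word-cloud would outscore a short one.

The "cosine" must always be between 0 and 1.

Analogy: the "angle" between two vectors.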

-17 -

Solution: Normalization

The "cosine" must always be between 0 and 1.

Analogy: the "angle" between two vectors.

$$\cos(a, b) = \frac{\mathrm{dot}(a, b)}{\sqrt{\mathrm{dot}(a, a) \cdot \mathrm{dot}(b, b)}}$$

And now, if cos(a,b) = 0.0, they have NO words in common.

If cos(a,b) = 1.0, the words match and their frequencies are in the same proportions.
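A minimal sketch of the normalized measure, written in JavaScript for illustration (the project's back end is PHP). The function names `dot` and `cosine` are assumptions, and returning 0 for an empty histogram is a choice made here, not something the slides specify.

```javascript
// Sum of products of matching word counts.
function dot(a, b) {
  let sum = 0;
  for (const [word, count] of a) {
    sum += count * (b.get(word) || 0);
  }
  return sum;
}

// Cosine measure: dot product divided by the lengths of both histograms.
function cosine(a, b) {
  const lengths = Math.sqrt(dot(a, a) * dot(b, b));
  return lengths === 0 ? 0 : dot(a, b) / lengths; // empty histogram: treat as no match
}

const short = new Map([['sunny', 1], ['place', 2]]);
const long = new Map([['sunny', 2], ['place', 4]]);   // same proportions, doubled
const other = new Map([['night', 3]]);                // no shared words
console.log(cosine(short, long), cosine(short, other)); // 1 0
```

Because word counts are never negative, the result stays between 0 and 1.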

-18 -

Next problem: CamelCase HashTags

How to break up such a creature into words?

Some quick research led to a regular-expression tool.

This will take CamelCaseHashTags and produce an array like:

('C', 'amel', 'C', 'ase', 'H', 'ash', 'T', 'ags');
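The same kind of split can be reproduced in JavaScript, where `String.prototype.split` keeps the captured capital letters in the result. The pieces then have to be glued back into words; that step is not shown on the slides, so `joinPieces` below is an assumed way of doing it, and both function names are hypothetical.

```javascript
// Split on capital letters, keeping each capital as its own piece.
function splitPieces(tag) {
  return tag.replace(/^#/, '').split(/([A-Z])/).filter(piece => piece !== '');
}

// Glue each capital letter back onto the lowercase run that follows it.
function joinPieces(pieces) {
  const words = [];
  for (const piece of pieces) {
    if (/^[A-Z]$/.test(piece) || words.length === 0) {
      words.push(piece);                 // a capital letter starts a new word
    } else {
      words[words.length - 1] += piece;  // the remainder joins the current word
    }
  }
  return words;
}

const pieces = splitPieces('CamelCaseHashTags');
console.log(pieces);             // [ 'C', 'amel', 'C', 'ase', 'H', 'ash', 'T', 'ags' ]
console.log(joinPieces(pieces)); // [ 'Camel', 'Case', 'Hash', 'Tags' ]
```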

-20 -

Designing the Pictionary

We want a structure that allows us to add words without limit, to each picture.

Here's what we came up with.

INPUT: Excel spreadsheet: (provides "starter kit" of tags)

-21 -

Pictionary: Output

An "array of arrays": each item in the wordlist has fields to store WHO added it and what PAGE they were on at the time.

(Note the 'whole-tag' version here.)
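The structure described above might look roughly like this. It is a JavaScript sketch of the idea, not the project's PHP array; the field names `word`, `who` and `page`, the helper `addTag`, and the sample values are all hypothetical.

```javascript
// Hypothetical shape: picture number -> list of tag entries.
const pictionary = {
  17: [
    { word: 'sunny', who: 'starter-kit', page: 0 },
    { word: 'street', who: 'starter-kit', page: 0 },
  ],
};

// Append a word to a picture's list; the list can grow without limit.
function addTag(pictionary, pictureId, word, who, page) {
  if (!pictionary[pictureId]) {
    pictionary[pictureId] = [];
  }
  pictionary[pictureId].push({ word, who, page });
}

addTag(pictionary, 17, 'night', 'visitor-42', 3);
console.log(pictionary[17].length); // 3
```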

-22 -

Skipping much detail, a late-stage issue:

How do we discover .jpg vs .mov, and display each one appropriately?

First Attempt: let back-end do it.

Wasted some hours. Decided to let back-end NOT KNOW

about filetype.

Why? Simplicity. Just find the best-match number

and pass it forward.

Second Attempt: Javascript must check the file extension,

and act appropriately.

-23 -

BUT: Javascript cannot read local files!

Part of the Security Model

** Javascript can read URLs – it's web-oriented

** We are running in MAMP, so "files are also URLs".

** We track down a function

-24 -

We wrap it in our own 'file_exists' function:

And ... we build it into a picture-getter, to decide

if a particular file is .mov or .jpg
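The function the slides refer to is not reproduced here, so the following is only a sketch of how such a check could work. It assumes a synchronous `XMLHttpRequest` HEAD request in the browser; the names `urlExists` and `pickMedia` and the `pictures/7` path are hypothetical.

```javascript
// Browser-only probe: ask the server for the headers of a URL.
// Assumption: a 2xx status means the file is there.
function urlExists(url) {
  try {
    const request = new XMLHttpRequest();
    request.open('HEAD', url, false); // false = synchronous
    request.send();
    return request.status >= 200 && request.status < 300;
  } catch (error) {
    return false; // network failure: treat as missing
  }
}

// Decide whether a picture number refers to a movie or a still image.
// The existence check is passed in, so the logic can be tried without a server.
function pickMedia(baseUrl, exists) {
  if (exists(baseUrl + '.mov')) return { url: baseUrl + '.mov', kind: 'video' };
  if (exists(baseUrl + '.jpg')) return { url: baseUrl + '.jpg', kind: 'image' };
  return null;
}

// In the browser: pickMedia('pictures/7', urlExists)
const onServer = new Set(['pictures/7.jpg']);
console.log(pickMedia('pictures/7', url => onServer.has(url)));
```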

-25 -

Our showpicture function:

Has a bizarre feature ... the <video> tag MUST NOT CONTAIN

a newline – or we get the dreaded Mystery Syntax Error

-26 -

To show the video, I just ...

jammed the HTML for a video tag directly into the node's

innerHTML. To my amazement, it worked.
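A sketch of the innerHTML approach follows. One likely cause of the syntax error mentioned earlier is that a quoted JavaScript string literal cannot contain a raw newline, which is why the markup is kept on one line here; that diagnosis is an assumption, as are the names `videoMarkup` and `showVideo` and the chosen attributes.

```javascript
// Build the <video> markup as a single-line string.
// Assumption: the URL is one of our own file names and needs no escaping.
function videoMarkup(url) {
  return '<video src="' + url + '" controls autoplay></video>';
}

// Replace whatever the node currently shows with the video.
function showVideo(node, url) {
  node.innerHTML = videoMarkup(url);
}

console.log(videoMarkup('pictures/7.mov'));
```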

-27 -

To show the picture (.jpg), more conventionally:

I didn't figure out how to add a <video> node as a child.

That would have been more legitimate, methinks.
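For comparison, adding the video as a child node could be sketched as below, using the standard `createElement` and `appendChild` DOM calls. The function name `appendVideo` and the `stage` element id are hypothetical; the document object is passed in so the function does not depend on a global.

```javascript
// Create a <video> element and attach it to the container as a child node.
function appendVideo(container, url, doc) {
  const video = doc.createElement('video');
  video.src = url;
  video.controls = true;
  container.appendChild(video);
  return video;
}

// In the browser:
// appendVideo(document.getElementById('stage'), 'pictures/7.mov', document);
```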

-28 -

Demo the work-in-progress

Note: for debugging purposes

if we get no tag-match, I

show a nice kitty.

In "real" project, we'll show

a random image if no match.
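The fallback for the "real" project could be sketched like this: take the best cosine score, and pick at random when nothing matches. The function name `pickNext` and the injectable random source are assumptions made for the example.

```javascript
// Given one similarity score per picture, return the index of the next picture.
// If every score is 0 (no tag match), fall back to a random index.
function pickNext(scores, random = Math.random) {
  if (scores.length === 0) return -1; // nothing to choose from
  let best = -1;
  let bestScore = 0;
  scores.forEach((score, index) => {
    if (score > bestScore) {
      best = index;
      bestScore = score;
    }
  });
  if (best === -1) {
    best = Math.floor(random() * scores.length);
  }
  return best;
}

console.log(pickNext([0, 0.2, 0.7, 0.1])); // 2
```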

-29 -

For analysis, I show the correlations and word-clouds below (via the return-message)

-30 -

Status:

* Presenting at Information Fluency Conference next week. Defending thesis project, late March.

* Still gotta get the page-2, etc. working

* Currently using $_SESSION to accumulate tags; but must put into a Database

* Source on course website if you want to look at it, borrow ideas, etc.
