Hybrid Approaches to Taxonomy & Folksonomy

Post on 11-May-2015


Hybrid Approaches to Taxonomy & Folksonomy

Semantic Technology, 2009

Stephanie Lemieux
Earley & Associates

stephanie@earley.com
www.earley.com


Agenda

• The taxonomy/folksonomy debate
• Tagging pitfalls
• Social tagging & the enterprise
• Hybrid approaches to taxonomy/folksonomy
• Corporate tagging tools


About me

• Stephanie Lemieux
– Senior Consultant at Earley & Associates, Inc.
– Master's in Library and Information Studies (MLIS), specializing in taxonomy development, content management, search, IA
– Developed enterprise taxonomies and helped a variety of clients through CMS deployments
– Projects include: Motorola, Ford Foundation, Best Buy, American Greetings, Urban Land Institute
– Blog: http://sethearley.wordpress.com/

The tired debate

Taxonomy | Folksonomy
Control | Democracy
Top-down | Bottom-up
Arduous process | Just do it
Accurate | Good enough
Restrictive | Flexible
Static | Evolving
Expensive to maintain | Low cost – “crowdsourced”

The relevance problem

• Search results should be relevant to what a searcher wants, but technology can only determine if it is relevant to a search term*

• Taxonomies and folksonomies = 2 approaches to the problem of relevance with common goal of describing content, each with particular gaps


*Billy Cripe: Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management. http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf

Taxonomy

• Added by a small number of individuals: authors/originators or “authorized” persons (e.g. librarian)

• Describes meaning or purpose of content based on a set view point for a specific audience using a controlled vocabulary

• Relationships between terms defined
– Hierarchical (e.g. Computer hardware > Keyboard)
– Associative (e.g. Computer hardware – Software)
– Equivalent (e.g. Laptop = Notebook Computer)
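The three relationship types can be pictured as a small data structure. This is a minimal Python sketch, not any particular taxonomy tool: the class and method names are hypothetical, the terms are the slide's own examples, and treating "Laptop" as the preferred term of the equivalent pair is an assumption.

```python
from collections import defaultdict


class Taxonomy:
    """Toy controlled vocabulary holding the three relationship types."""

    def __init__(self):
        self.broader = {}                # narrower term -> broader term
        self.related = defaultdict(set)  # associative links, stored both ways
        self.preferred = {}              # variant term -> preferred term

    def add_hierarchical(self, broader, narrower):
        self.broader[narrower] = broader

    def add_associative(self, a, b):
        self.related[a].add(b)
        self.related[b].add(a)

    def add_equivalent(self, preferred, variant):
        self.preferred[variant] = preferred

    def resolve(self, term):
        """Map a variant to its preferred term, or return the term unchanged."""
        return self.preferred.get(term, term)

    def ancestors(self, term):
        """Walk up the hierarchy (assumes the hierarchy has no cycles)."""
        path = []
        term = self.resolve(term)
        while term in self.broader:
            term = self.broader[term]
            path.append(term)
        return path


tax = Taxonomy()
tax.add_hierarchical("Computer hardware", "Keyboard")
tax.add_associative("Computer hardware", "Software")
tax.add_equivalent("Laptop", "Notebook Computer")  # preferred term is assumed
```

Free tags carry none of these links, which is the gap the hybrid approaches later in the deck try to close.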


Tags

• Added by authors and consumers (individual motivation)

• Can connote any type of meaning or purpose

• No compression around a single viewpoint, no control of vocabulary

• Self-correcting through volume


Why tagging is so interesting…

• Adding individual value to the act of classification – user control over findability

• Reducing the cognitive burden (i.e. it’s easy)

• Reduced technological investment (i.e. it’s cheap)

• Can leverage emergent structure (folksonomy)



The downside…

Neither tags nor taggers are perfect…
• No language control

Flaw | Examples
Misspellings | Library vs. libary, Plam pilot
Compound words | TimBernersLee
Case & number | Folksonomy, Folksonomies
Personal tags | To read, My dog, @work
Single-use tags | Billybobsdog

Study: 40% of flickr tags and 28% of del.icio.us tags were flawed in these ways (Guy & Tonkin, 2006. http://www.dlib.org/dlib/january06/guy/01guy.html)

The downside…

• Varying levels of granularity
• Same tag, different meanings
• Lack of relationships between tags – which is broader? Narrower?
• Lack of consistency/approach to change – even a single user can change language and hamper their own personal retrieval


Example tags at different levels of granularity: Robin, Bird, Turdus migratorius

…Known as “tag noise”


The downside…

• Most tag search does not account for stemming, plurals, etc.

E.g.

Search on Delicious:

Folksonomy: 16049

Folksonomies: 4404

Both: 2642
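If "Both" counts bookmarks carrying both tags, a search that treated the two forms as one would reach 16049 + 4404 − 2642 = 17811 bookmarks instead of either subset. A minimal Python sketch of that idea follows, assuming a deliberately crude plural-folding rule (`fold` is a hypothetical helper, and the bookmarks are made-up data); a real system would more likely use a proper stemmer.

```python
from collections import defaultdict


def fold(tag):
    """Crude normalization: lowercase and strip common English plural endings."""
    t = tag.strip().lower()
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        return t[:-1]
    return t


def build_index(bookmarks):
    """bookmarks: dict of url -> list of raw tags. Returns folded tag -> set of urls."""
    index = defaultdict(set)
    for url, tags in bookmarks.items():
        for tag in tags:
            index[fold(tag)].add(url)
    return index


def search(index, query):
    """Fold the query the same way as the indexed tags before looking it up."""
    return index.get(fold(query), set())


# Made-up bookmarks for illustration only.
bookmarks = {
    "http://example.org/a": ["Folksonomy"],
    "http://example.org/b": ["folksonomies"],
    "http://example.org/c": ["folksonomy", "Folksonomies"],
    "http://example.org/d": ["taxonomy"],
}
index = build_index(bookmarks)
```

With folding applied at both index and query time, `search(index, "Folksonomies")` and `search(index, "folksonomy")` return the same three bookmarks.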


The tagging hype cycle

http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html


The web vs. the enterprise

• Shirky: “there is no shelf”
– Traditional organization schemes are built to deal with physical collections and constraints.
– They don’t work well on the web
  • large corpus
  • no clear edges
  • no formal categories
  • no authority
– E.g. Flickr, Delicious: social tagging works well in this context

• The enterprise is much more defined
  • smaller corpuses
  • formal entities
  • coordinated users, clear tasks
  • need for reliable retrieval
– Social tagging is more of a challenge, needs clear arena


Role of folksonomy in the enterprise?

• Tagging external links
– Seeing what colleagues are interested in
– Sharing links with a specific team
– Subscribing to link feeds
– Monitoring news/blog coverage of the company
– Consumer/competitor research
– Tracking industry trends

• Tagging internal links
– Finding/facilitating access to most popular pages on the intranet
– Seeing what intranet pages mean to staff


Role of folksonomy in the enterprise?

• Social aspects
– Identifying subject matter experts
– Connecting people who share interests
– Encouraging collaboration & resource sharing

• Improve your taxonomy, information retrieval
– User tagging to refine the corporate taxonomy
  • New concepts
  • New terminology
– Seeing what employees find interesting
– Distributing tagging tasks


The downside…

• Potential issues of security, inappropriateness
– Can implement some level of vetting

• Privacy concerns
– Can be anonymous tagging, although this removes some social value
– Can create role or team-based collections

• Need higher ratio of active participants due to population size


The content continuum

Key concept: Not all content is created equally

[Figure: content types placed along a continuum of tagging/organizing processes, running from lower cost, unfiltered, lower value to higher cost, reviewed/vetted/approved, higher value. Content types shown: message text, external news reports, discussion postings, links, engineering document repositories, success stories, policies, approved methods, best practices.]

What if we blended the two?

[Figure: a folksonomy/taxonomy blend combining low cost, findability, flexibility, structured relationships, user terminology, oversight, social sharing, and consistency.]

Hybrid approaches

• Co-existence
• Tag-influenced taxonomy
• Taxonomy-influenced tagging
• Tag hierarchies/ontologies

Co-existence

• Taxonomy and folksonomy are used side by side

• Strengths of each approach preserved, philosophy of each kept “pure”


Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/

Co-existence – public library


Raytheon – corporate example

• Used in Raytheon employee portal - website lists (“Suggested sites” feature box)

• How does it work:
– inserted “Suggested Sites” in a "feature" box to the right of the regularly ranked results
– website suggestions (URLs) submitted along with recommended tags/keywords, which are subsequently verified and approved by librarians

http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation


Variation: Tag mediation

• Vetting & editing tags
• Pros:
– Weeds out potentially inappropriate tags
– Eliminates misspellings, plural issues, etc.
– Some can be done automatically (e.g. spell-checker)
– Enhances findability
• Cons:
– Higher effort/cost
– Perceived lack of trust
– Who knows better?
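The automatic part of mediation might look something like the following. This is a rough Python sketch under stated assumptions: `mediate`, the status labels, the blocklist, and the 0.8 similarity cutoff are all hypothetical choices, and `difflib` from the standard library stands in for a real spell-checker.

```python
import difflib


def mediate(raw_tag, vocabulary, blocklist):
    """Return (tag, status); status is 'accepted', 'corrected', 'rejected' or 'review'."""
    tag = raw_tag.strip().lower()
    if not tag or tag in blocklist:
        return None, "rejected"          # weed out inappropriate or empty tags
    if tag in vocabulary:
        return tag, "accepted"
    # Suggest the closest known tag; the 0.8 cutoff is an arbitrary assumption.
    close = difflib.get_close_matches(tag, vocabulary, n=1, cutoff=0.8)
    if close:
        return close[0], "corrected"     # likely misspelling of a known tag
    return tag, "review"                 # unknown tag: leave it for a human editor


vocabulary = ["folksonomy", "library", "taxonomy"]
blocklist = {"spam"}
```

For example, `mediate("libary", vocabulary, blocklist)` returns `("library", "corrected")`, while an unfamiliar tag such as "billybobsdog" is passed through for manual review, which is where the "who knows better?" question still has to be answered by a person.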

Tag-influenced taxonomy

• Taxonomy & tagging co-exist, tags serve as pool of candidate terms to enrich taxonomy, keep it current
– Find new terminology (synonyms, popular language)
– Find new concepts

• Performed as separate processes (taxonomy tagging = formal, tagging = informal) or combined in single interface


Tag-influenced taxonomy

• Requires formal vetting process
• Can be supported by automation (e.g. candidate tags pulled & filtered with script to remove taxonomy terms, stop words)
• Evaluate candidates based on
– Frequency (“literary warrant”)
– Salience within context
• Look at tags used in conjunction with taxonomy
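The filtering script mentioned above could be as simple as the following Python sketch. The function name, the stop-word list, and the `min_count` threshold are illustrative assumptions; frequency is the only evaluation criterion implemented, and salience would still need human judgment.

```python
from collections import Counter

# Illustrative stop words only; a real list would be tuned to the collection.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "toread"}


def candidate_terms(tag_events, taxonomy_terms, min_count=2):
    """tag_events: iterable of raw tags as applied by users.

    Returns (tag, count) pairs that are not already taxonomy terms or stop
    words, most frequent first.
    """
    known = {t.lower() for t in taxonomy_terms}
    counts = Counter(t.strip().lower() for t in tag_events)
    return [
        (tag, n)
        for tag, n in counts.most_common()
        if n >= min_count and tag not in known and tag not in STOP_WORDS
    ]


# Made-up tag log for illustration.
events = ["netbook", "Netbook", "laptop", "netbook", "the", "the",
          "ultrabook", "smartphone", "smartphone"]
candidates = candidate_terms(events, ["Laptop", "Keyboard"])
```

Here `candidates` holds "netbook" (3 uses) and "smartphone" (2 uses); "laptop" is dropped because it is already in the taxonomy, and "ultrabook" falls below the frequency threshold.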


Taxonomy-influenced tagging

• Presenting choices/suggestions to user from controlled set of terms/tags
– Sometimes users prefer easy choice
  • Drop-down menus
  • Check boxes
  • Type ahead
  • Tree view
– “influenced” – option to enter own tag? Good source of new terms
– Enforces consistency
– Offers structure
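A type-ahead that suggests controlled terms but still lets users enter their own tag can be sketched in a few lines of Python. Everything here is hypothetical (function names, the "controlled"/"new" labels, the sample terms); it only illustrates the "influenced, not forced" behaviour.

```python
def suggest(prefix, controlled_terms, synonyms, limit=5):
    """Return preferred terms whose label or synonym starts with the prefix."""
    p = prefix.strip().lower()
    if not p:
        return []
    hits = set()
    for term in controlled_terms:
        if term.lower().startswith(p):
            hits.add(term)
    for variant, preferred in synonyms.items():
        if variant.lower().startswith(p):
            hits.add(preferred)  # a synonym match suggests the preferred term
    return sorted(hits)[:limit]


def choose_tag(user_input, controlled_terms, synonyms):
    """Resolve input to a controlled term if possible, else keep it as a new tag."""
    text = user_input.strip()
    by_lower = {t.lower(): t for t in controlled_terms}
    by_lower.update({v.lower(): p for v, p in synonyms.items()})
    if text.lower() in by_lower:
        return by_lower[text.lower()], "controlled"
    return text, "new"  # free tag: a candidate term for taxonomy review


terms = ["Keyboard", "Laptop", "Software"]
synonyms = {"Notebook Computer": "Laptop"}
```

Typing "no" suggests "Laptop" through its synonym, and an unknown entry such as "netbook" is accepted but flagged as new, so it can feed the candidate-term process.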


WWW example: ZigTag (2008)

Definitions from Wikipedia & Wordnet

Tagging with type-ahead against database of 3M unique concepts & 8M synonyms


Zigtag

• Type ahead & synonyms encourage consistency
• Users can enter new tags
• Synonyms based on Wikipedia, so can be “dirty data”
• No hierarchy, only equivalent relationships so far


Zigtag search

Still get problems with uncontrolled tags & recall

Interesting relationships from Wikipedia

Browsable tag cloud

Example: myedna (Education.au)

http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt

Fully taxonomy-directed tagging


Buzzillions.com

• Review site: tags are “controlled” not against a taxonomy, but against other tags – reduces redundancy

• Only popular tags exposed as faceted navigation

SharePoint?

• Plugins make taxonomy easy, present like tags

E.g. KWizCom: plugin manages taxonomy and tags in easy interface… can opt-out of letting users create own tags


Taxonomy-directed tagging

• Pros:
– More consistency
– Better support for findability
– Relationships, definitions leveraged – adding meaning to the tags
– Realistic for the enterprise

• Cons:
– Not really folksonomy anymore…
– Can be forcing terminology on user
– Need to develop reference list of concepts – manually through taxonomy or need large corpus to derive automatically

Tag hierarchies

• 2 flavors: user-powered, automatic derivation

• User-powered
– Social approach
– Bogus hierarchies possible
– Small population will contribute

• RawSugar tried it (no longer around): taggers could specify hierarchy in own account, tags clustered based on common groups


Raw Sugar example


More user-powered tag relationships

• E.g. LibraryThing

LibraryThing allows any user to combine (or uncombine) 2 tags that are semantically equivalent.

www.librarything.com
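Combining and uncombining equivalent tags amounts to maintaining groups of tags that are treated as one. The Python sketch below shows one simple way to model that; it is an illustration only, not LibraryThing's implementation, and the class name and sample tags are hypothetical.

```python
class TagCombiner:
    """Keeps groups of tags that users have declared equivalent."""

    def __init__(self):
        self.group_of = {}  # tag -> the shared set of tags it is combined with

    def _group(self, tag):
        return self.group_of.setdefault(tag, {tag})

    def combine(self, a, b):
        """Merge the groups of two tags into one shared group."""
        merged = self._group(a) | self._group(b)
        for tag in merged:
            self.group_of[tag] = merged

    def uncombine(self, tag):
        """Pull a single tag back out of its group."""
        self._group(tag).discard(tag)  # the other members keep the shared set
        self.group_of[tag] = {tag}

    def equivalents(self, tag):
        return set(self._group(tag))


combiner = TagCombiner()
combiner.combine("sf", "science fiction")
combiner.combine("sci-fi", "sf")
```

Because any user can make either move, a bad merge is cheap to undo: `combiner.uncombine("sci-fi")` restores that tag to a group of its own without touching the rest.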

Automatic derivation

• Tag hierarchies, facets, ontologies, or “folksontology”

• Done through statistical/clustering algorithms
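As a rough illustration of the statistical approach, the Python sketch below derives broader/narrower pairs from tag co-occurrence using a simple subsumption rule: A is taken as broader than B when most resources tagged B are also tagged A, but not the other way round. This heuristic is a stand-in chosen for brevity, not the clustering methods used in the work cited on these slides, and the 0.8 threshold and sample data are assumptions.

```python
from collections import Counter
from itertools import combinations


def derive_hierarchy(tagged_items, threshold=0.8):
    """tagged_items: list of tag collections, one per resource.

    Returns sorted (broader, narrower) pairs found by the subsumption rule.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_items:
        tags = set(tags)
        tag_count.update(tags)
        pair_count.update(frozenset(p) for p in combinations(sorted(tags), 2))

    edges = []
    for pair, both in pair_count.items():
        a, b = sorted(pair)
        p_a_given_b = both / tag_count[b]  # share of b-tagged items also tagged a
        p_b_given_a = both / tag_count[a]  # share of a-tagged items also tagged b
        if p_a_given_b >= threshold and p_b_given_a < threshold:
            edges.append((a, b))
        elif p_b_given_a >= threshold and p_a_given_b < threshold:
            edges.append((b, a))
    return sorted(edges)


# Made-up tagging data for illustration.
items = [{"bird", "robin"}, {"bird", "robin"}, {"bird", "eagle"},
         {"bird"}, {"bird", "eagle"}]
hierarchy = derive_hierarchy(items)
```

On this toy data "bird" comes out as broader than both "robin" and "eagle". With sparse or ambiguous tags the same rule produces noisy edges, which is one reason such derivations tend to improve with volume.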


http://www.pui.ch/phred/automated_tag_clustering/

Delicious & CiteULike hierarchy


http://heymann.stanford.edu/taghierarchy.html

Clustering at Flickr


Auto clustering/facets

• Still not very mature
• Time-sensitive
• Community-sensitive
• Ambiguous tags
• Improve with volume (self-correcting)

http://www.pui.ch/phred/automated_tag_clustering/

Intelligent tags

• Moving toward more semantic tagging with machine-readable tags
– Flickr: can tag images with machine tags of the form namespace:predicate=value
  e.g. geo:quartier="SoHo"
  e.g. lastfm:event=34640 – makes your photo appear on a lastfm event page
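Splitting a machine tag into its three parts is straightforward. The Python sketch below assumes a simplified syntax (alphanumeric namespace and predicate, anything after the equals sign as the value); the regular expression is an illustrative assumption, not Flickr's official grammar.

```python
import re

# Assumed pattern for namespace:predicate=value.
MACHINE_TAG = re.compile(
    r'^(?P<namespace>[A-Za-z][A-Za-z0-9_]*):'
    r'(?P<predicate>[A-Za-z][A-Za-z0-9_]*)='
    r'(?P<value>.+)$'
)


def parse_machine_tag(tag):
    """Return (namespace, predicate, value), or None for an ordinary tag."""
    match = MACHINE_TAG.match(tag.strip())
    if not match:
        return None
    value = match.group("value").strip('"')  # drop optional surrounding quotes
    return match.group("namespace"), match.group("predicate"), value
```

Given the slide's examples, `parse_machine_tag('geo:quartier="SoHo"')` yields `("geo", "quartier", "SoHo")`, and a plain tag such as "robin" yields `None`, so ordinary and machine tags can sit side by side in the same tag field.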


Intelligent tags

• MOAT: Meaning of a tag – part of linked data movement, mapping tags to semantic web
– http://moat-project.org/

• Adding to the triplet
– User – resource – tag – meaning
– Meaning = URI to a resource containing meaning (e.g. DBPedia)


<tag:RestrictedTagging>
  <tag:taggedResource rdf:resource="http://example.org/post/1"/>
  <foaf:maker rdf:resource="http://apassant.net/alex"/>
  <tag:associatedTag rdf:resource="http://tags.moat-project.org/tag/apple"/>
  <moat:tagMeaning rdf:resource="http://dbpedia.org/resource/Apple_Records"/>
</tag:RestrictedTagging>
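The practical effect of the fourth element is that the same tag string can be told apart by meaning. The Python sketch below models the user, resource, tag, meaning quadruple as a plain record rather than RDF; the `Tagging` class and `resources_about` helper are hypothetical, the first record mirrors the example above, and the second record (user, post, and meaning) is made up for contrast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    user: str
    resource: str
    tag: str
    meaning: str  # URI identifying the intended sense of the tag


taggings = [
    # Mirrors the RDF example above.
    Tagging("http://apassant.net/alex", "http://example.org/post/1", "apple",
            "http://dbpedia.org/resource/Apple_Records"),
    # Hypothetical second tagging: same tag, different meaning.
    Tagging("http://example.org/user/2", "http://example.org/post/2", "apple",
            "http://dbpedia.org/resource/Apple_Inc."),
]


def resources_about(meaning_uri, taggings):
    """Find resources by the meaning of their tags rather than the tag string."""
    return [t.resource for t in taggings if t.meaning == meaning_uri]
```

Both records carry the tag "apple", yet a lookup by the Apple Records URI returns only the first post, which is the disambiguation that plain tags cannot offer.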

Conclusion

• Not all content is created equal – tags and taxonomies have their sweet spots

• Hybrid approaches are emerging
– taxonomy-influenced tagging leading the pack in popularity on the web
– co-existence in the enterprise

• Look for more developments on the semantic web/linked data front for making tags more intelligent


Corporate social tagging tools


Corporate social tagging software

• http://www.connectbeam.com/
• http://www.cogenz.com/
• http://www-306.ibm.com/software/lotus/products/connections/dogear.html
• BEA AquaLogic Pathways: http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/
• http://www.newsgator.com/business/socialsites/default.aspx

Stephanie Lemieux
stephanie@earley.com
www.earley.com
781-444-0287

Blog: sethearley.wordpress.com
Twitter: stephlemieux

Send an email to stephanie@earley.com for a free pass to one of our conference calls.

Questions?
