final report - social web - w3c

8/8/2019 Final Report - Social Web - W3C

FinalReport

From Social Web XG Wiki

Editors

Harry Halpin, W3C and University of Edinburgh

Authors

Daniel Appelquist, Vodafone
Dan Brickley, Vrije University Amsterdam
Melvin Carvahlo
Renato Iannella, Semantic Identity
Alexandre Passant (http://apassant.net), DERI, NUI Galway (http://deri.ie)
Christine Perey, PEREY Research & Consulting
Henry Story (http://bblfish.net/), Sun Microsystems
Mischa Tuffield (http://mmt.me.uk/), Garlik

Abstract

This document is the final report of the W3C Social Web Incubator Group. This report presents systems and technologies that are working towards enabling a Social Web, and is followed by a strategy for standardizing this work in order to ensure the Social Web is open, decentralized, and royalty-free. This report focuses on work that permits the description and identification of people, groups, organizations, as well as user-generated content in extensible and privacy-respecting ways. This report describes a common framework for the concepts behind the Social Web and the state of the art in 2010, including current technologies and standards. We conclude with an analysis of where future research and standardization will benefit users and the entire Social Web ecosystem's growth. We also suggest a strategy for the role of the W3C in the Social Web.

Status of this Document

Copyright (http://www.w3.org/Consortium/Legal/ipr-notice#Copyright) 2010 W3C (http://www.w3.org/) (MIT (http://www.csail.mit.edu/), ERCIM (http://www.ercim.org/), Keio (http://www.keio.ac.jp/)), All Rights Reserved. W3C liability (http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer), trademark (http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks) and document use (http://www.w3.org/Consortium/Legal/copyright-documents) rules apply.

Contents

1 Overview
2 State of the Social Web in 2010
3 Social Web Frameworks
3.1 The Problem of Walled Gardens
3.2 The Social Web Vision
3.3 The Terminology
3.3.1 Social Web User and Profiles
3.3.2 Single Distributed Social Graph
3.3.3 Multiple Distributed Social Graphs
3.4 The Case for Open Social Web Standards
4 Identity
4.1 Problem: Usernames and Passwords are Insecure
4.2 Use-case: No more passwords (or only one)
4.3 Identity Standards
4.3.1 Browser-based Password Management
4.3.2 OAuth
4.3.3 OpenID
4.3.4 WebID
4.3.5 InfoCard
4.3.6 XAuth
5 Profiles
5.1 Problem: Can not Describe Yourself
5.2 Use-case: Keep Your Profile and Friends Across Networks
5.3 Profile Standards
5.3.1 XRD
5.3.2 VCard
5.3.3 FOAF
5.3.4 PortableContacts
5.3.5 OpenSocial
6 Social Media
6.1 Problem: Fined for Consuming Social Media
6.2 Use-case: Safely Drag-and-Drop Social Media Across Multiple Platforms
6.3 Social Media Standards
6.3.1 Content
6.3.2 Tagging
6.3.3 Microformats
6.3.4 Open Graph Protocol
6.3.5 Payswarm
6.3.6 OExchange
6.3.7 The Semantic Web
7 Privacy
7.1 Problem: Violation of Privacy
7.2 Use-case: Your Own Terms of Service
7.3 Privacy and Privacy Standards
7.3.1 P3P

alReport - Social Web XG Wiki http://www.w3.org/2005/Incubator/socialweb/wiki/index.php?title=Fi...

22 25/10/2010 20:00



7.3.2 POWDER
7.3.3 AIR
7.3.4 XACML
7.3.5 Rule Interchange Format
7.3.6 Device APIs and Policy Working Group
7.3.7 Mozilla Privacy Icons
7.3.8 ODRL
8 Activity
8.1 Problem: Can not Integrate Conversations
8.2 Use-Case: Real-time Collaboration
8.3 Activity Standards
8.3.1 XMPP
8.3.2 Atom and Pubsubhub
8.3.3 ActivityStreams
8.3.4 Salmon Protocol
8.3.5 OStatus
8.4 Emerging Frameworks
9 Accessibility Concerns
10 Decentralised Social Networking Projects
10.1 Status.net
10.2 GNU Social
10.3 Google Buzz
10.4 OneSocialWeb
10.5 Higgins Project
10.6 Diaspora
10.7 Diso Project
10.8 SMOB
10.9 Appleseed
10.10 OpenLink Data Spaces
10.11 Personal Data Stores
11 Business Considerations
11.1 Size of Social Networking in 2010
11.2 Current business models for social networks
11.3 New business models
12 Conclusions: A Strategy for the Social Web
12.1 Investigate Identity in the Browser
12.2 Co-ordinate the Core of Profile Data
12.3 Combining Social Media and the Semantic Web
12.4 Re-engage Privacy Activity Focusing on the Social Web
12.5 Support the Federated Social Web
12.6 Open the World Wide Web Consortium
13 Acknowledgements
14 References

Overview

The Social Web is a set of relationships that link together people over the Web. While the best known current social networking sites on the Web limit themselves to relationships between people with accounts on a single site, the Social Web should extend across the entire Web. Just as people can call each other no matter which telephone provider they belong to, just as email allows people to send messages to each other irrespective of their e-mail provider, and just as the Web allows links to any website, so the Social Web should allow people to create networks of relationships across the entire Web, while giving people the ability to control their own privacy and data.

The Social Web is not just about relationships, but about the applications and innovations that can be built on top of these relationships. Social-networking sites and other user-generated-content services on the Web have a potential to be enablers of innovation, but cannot achieve this potential without open and royalty-free standards for data portability, identity, and application development.

The Social Web Incubator Group (SWXG) was founded as an outcome of the W3C Workshop on the Future of Social Networking (http://www.w3.org/2008/09/msnws/) to uncover and document existing technologies, software, and standards (both proposed and adopted) needed to enable a universal and distributed Social Web. The group also sought to identify gaps, conflicts, and other areas for future standardization and research to increase adoption of the Social Web.

Over the course of the SWXG's activity, the approximately thirty participants on the conference calls discussed a wide variety of topics and heard from over thirty invited guests from within and outside the W3C. We conclude that while the Social Web is a space of innovation, it is still not a "first-class" citizen of the Web: social applications have largely evolved as silos, and thus implementations and integration are inconsistent, with few guarantees of privacy or enforcement of terms-of-service.

Further, the members of the XG conclude:

1. The Social Web does not suffer from a lack of potential standards. A large number of diverse groups have evolved data models, communication protocols, and data formats at tangents to one another, addressing a large number of communities, each of which has its own terminology and viewpoint.

2. While a large amount of work has been done in this area, in terms of both current potential and standards, this work tends to address basic issues around identity and portability, but not more complex and vital issues such as privacy, policy enforcement, and provenance. All of these issues offer scope for further research and the development of future standards.

3. The creation of a decentralized and federated Social Web, as part of Web architecture, is a revolutionary opportunity to provide both increased social data portability and enhanced end-user privacy.

4. One key to enabling ordinary users to take advantage of a decentralized Social Web is to build identity and portability into the browser and other devices.

We respectfully recommend to the W3C areas of future work in which the W3C should play a pivotal role:

1. Investigating the benefits of existing identity solutions for the Web that would allow for a high level of security and multiple identities, and that are decentralized in nature. This work should be coordinated with existing identity work.

2. Defining mappings between existing data formats for social profiles on a semantic level, making sure that a common core is available in a consistent manner across various syntactic serializations (such as RDFa, JSON, and XML).

3. Making sure that future work on the Semantic Web can help standardize methods of tracking provenance, as well as defining best practices for finding suitable vocabularies needed to power the Social Web.

4. Beginning an activity investigating distributed privacy/policy languages that are capable of phrasing common "terms of service" rules and licensing information for the Social Web.

5. Creating a more "light-weight" and open process so that groups working on the Social Web feel welcome and are able to work with the W3C. This will allow the W3C to liaise closely with groups and other standards bodies working in the area of the Social Web.

This work could form the basis of new Working Groups, improved liaising with non-W3C efforts and standardization bodies, and increased co-ordination and focus on the Social Web among existing W3C working groups.

State of the Social Web in 2010

2010 has been a tumultuous year for the Social Web. However, the Social Web is not a new phenomenon without precedent, but the result of a popularization of existing technologies. Many social features were available over the Internet before the Web, ranging from the blog-like features of Engelbart's "Journal" system in NLS (oN-Line System, the second node of the Internet), to messaging via e-mail and IRC, the Well (1984), and the "Member Profiles" of AOL. The "list of friends" that is ubiquitous on the Social Web existed in the hand-authored links on the earliest webpages. The Web has always been social. As shown by the diagram below, from Berners-Lee's original 1989 proposal to create the World Wide Web, the Web was meant from its inception to include not only connections between hypertext documents, but also the relationships between people.

What was missing was an easy-to-use interface for finding people you know and sharing data with them. A number of websites, ranging from Classmates.com (1995) to SixDegrees (1997), pioneered these features for ordinary users of the Web. Since the early days of the Web, people who maintained their own homepages have been posting activity updates to their sites, and this was pushed into the mainstream with the development of user-friendly blogging software (from "web logs") such as LiveJournal and Blogger in 1999. Innovations in this space allowed the general public to become more and more adept at blogging, and independent news sites such as Indymedia (1999) pioneered the notion of user-generated content management. However, these services remained fairly experimental until after the collapse of the initial "dot-com" bubble. After this, a rash of social networking sites like Friendster (2002), LinkedIn (2003), MySpace (2003), Orkut (2004), and Facebook (2004) took off, and eventually became the most popular sites on the Web. Starting with Flickr (2004) and YouTube (2005), user-generated content took over this newly re-invigorated Social Web. The launch of Twitter (2007), a micro-blogging site that propagated updates to users' social networks via desktop and mobile devices, showed another dominant trend in the Social Web. It was around this time that the concept of the Social Web became associated both with the aforementioned companies and with the wider "Web 2.0" paradigm. Today, the Social Web is becoming part of corporate communication portfolios, and Web 2.0 companies are starting to commercialize data from and about their users.

While social networking remained geographically fragmented across these sites, as illustrated by this map (http://www.vincos.it/wp-content/uploads/2009/06/wmsn-06-09.png), with many countries developing their own most popular social networking sites such as Hi5 in Japan and QQ in China, there has been an overall tendency towards users moving their profiles between services, for example from Friendster to MySpace. This, in turn, led some to the dismissive view that the most "popular" social networking sites would simply turn over every year or two. In a similar manner to how competition amongst search engines eventually led to the dominance of Google, Facebook rapidly rose to become a global leader in social networking. A number of major vendors began purchasing social networking sites (such as the purchase of Blogger (2003) and Orkut (2007) by Google), while other companies like Yahoo! tried to roll their own social networking sites, such as Yahoo! 360 (2005). Social Web features, such as comments and user-generated content, became intertwined with such phenomena as Flickr for sharing photos and YouTube for sharing video. Today, it is a de facto requirement for websites to have social features and for individuals and organizations to have a presence on popular Social Web platforms. Yet the ways for websites to do so are currently fractured and have yet to be standardized.

Although end-users have been empowered by the compelling user experience of these social networking sites, they have also been the real victims of these data silos. Social networking sites encourage users to put their data into the given proprietary platform, and have tended to make the portability of the user's own data to another site, or even to their home computer, difficult if not impossible. Architects of new Social Web services and user-advocacy groups began to ask for the ability of users to move their data from platform to platform. The first technology created specifically for a portable social graph was the Friend-of-a-Friend vocabulary for the Semantic Web (FOAF) in 2001, and in 2005 developers started the Internet Identity Workshop, a biannual gathering from which standards like OpenID emerged. Momentum took off in 2007 after the post "Thoughts on the Social Graph" (http://bradfitz.com/social-graph-problem/) by Brad Fitzpatrick (formerly of LiveJournal), together with David Recordon. A number of initiatives quickly followed, such as the DataPortability (http://www.dataportability.org/) initiative, the Data Liberation Front (http://www.dataliberation.org/) at Google, and, lately, the Federated Social Web initiative. This momentum continues to attract interest; at the same time, however, an open and decentralized Social Web still seems distant, and few users have actually left these data silos. (@@QUESTION: Did OpenID really come out of the IIW?)

Many social networking sites considered privacy and portability to be contradictory. Facebook, for instance, used to deny users the ability to make their data portable outside its system due to concerns over user privacy; its terms of service in 2006 stated that "We understand you may not want everyone in the world to have the information you share on Facebook; that is why we give you control of your information." In one particularly infamous incident in 2008, blogger Robert Scoble tried to make his information portable by copying his contacts from Facebook, but had his account disabled by Facebook (http://scobleizer.com/2008/01/03/ive-been-kicked-off-of-facebook/). However, in 2009 there seemed to be little concern about issues of privacy and portability except amongst those deeply immersed in designing social networking platforms, with only 20 percent of users listing privacy as a primary concern motivating their choice of social networking site. Whilst the market for online social networking remains competitive, privacy has yet to emerge as a competitive advantage (http://preibusch.de/publ/privacy_jungle/). Today, privacy is a secondary argument to stimulate new sign-ups. Widespread usability problems prevent users from exercising effective control over their personal information on social networking sites, where permissive defaults are another threat to privacy. While Scott McNealy of Sun infamously remarked that "You have zero privacy anyway," recent studies show that youth have "an aspiration for increased privacy" and are as concerned about privacy as adults [PEW REPORT!!].

The mobile Social Web must not be ignored: more people are adopting Web-enabled smartphones, mobile users spend more minutes per day on social networking sites than the average PC user, and in 2010 30% of smartphone users accessed social networks via mobile browsers. Users seem attracted to mobile device access because they can consult with friends and quickly make decisions while remaining mobile, using applications in context, such as the live-tracking of buses. Many popular social networks at the time of writing offer both a Web-based version and a dedicated application that can be downloaded for the given smartphone platform. These dedicated applications tend to be able to make much greater use of the built-in sensors and applications found on these smartphones. As several mobile social networking sites allow users both to upload their location and to see the location of their friends, a number of small groups have joined together to form the OSLO alliance (http://groups.google.com/group/locallies) (Open Sharing of Location-based Objects). OSLO includes many players in mobile social networking and location-based social software, which have signed an agreement to enable their approximately 30 million users to share location information between mobile social networks, in essence supporting the portability of location information between services. However, this activity seems to have stalled, and the W3C Device API WG is quickly filling the gap by standardizing a set of APIs, to be implemented by mobile browsers, that give a Web application running inside a standard mobile browser access to device functionality such as a user's address book, calendar, and location. As more and more Web usage goes mobile and data access speeds increase, one can expect the difference in capabilities between the Web and the mobile Web to diminish.

2010 was the year in which the issue of privacy on the Social Web grew beyond a niche concern and entered the popular consciousness. In December 2009, Facebook changed its default privacy settings, which in turn made part of a user's profile information public. Users were encouraged to use "privacy controls" to provide access control to their data, but many users found these controls confusing, and the default settings led to lists of friends being revealed. This sparked widespread outrage, even among governments. The development of Facebook Connect and other more distributed services led Facebook's Terms of Service to become even more open with users' data, such as "When you connect with an application or website it will have access to General Information [which includes] your and your friends names, profile pictures, gender, user IDs, connections, and any content ... the default privacy setting for certain types of information you post on Facebook is set to everyone." Google has long supported the general notion of portability, with its OpenSocial API (and Open Social Alliance) and Google FriendConnect starting to lay the groundwork for a distributed, portable social platform. However, Google's attempt to transform its popular Gmail into a social networking platform via Google Buzz in 2010 also led to massive privacy concerns amongst users: Buzz users saw their most frequent communication partners exposed publicly and needed to opt out to have them concealed. Overall, at this moment in 2010, privacy is returning as a major concern. Furthermore, none of the concerns about the portability of social data have been addressed in a manner that is widely implemented across Social Web platforms, leading to a fragmentation of identity and a generalized lack of portability and privacy on the Social Web. (@@QUESTION: was it really Facebook Connect or was it the OpenGraphProtocol which made this change to the T&C's?)

In 2009 the World Wide Web Consortium held a workshop on the "Future of Social Networking" in Barcelona and, shortly thereafter, launched the Social Web Incubator Group to investigate future work in the area of the Social Web. Tim Berners-Lee proposed Socially Aware Cloud Computing (http://www.w3.org/DesignIssues/CloudStorage.html), where he illustrated that the technologies required for a decentralized, socially aware Web were available and that realizing it is but a matter of engineering. Overall interest still remains high, as witnessed by the launch in 2010 of products like Vodafone's OneSocialWeb and the open-source Diaspora Project, and by the first attempt at developing a common test-suite across differing standards-based social networking sites at the Federated Social Web Summit. At this point in history, the Social Web has become the dominant platform for communication, and is rapidly beginning to eclipse even the use of e-mail among youth. The next steps the companies and communities around the Social Web take will have real consequences for the future of the Web and communication itself.

Social Web Frameworks

The Problem of Walled Gardens

The importance of the Web has always been its open and distributed nature as a universal space of information. Until recently this space of information has been limited to hypertext web-pages, without attention being paid to social interactions and relationships. This was not a particular fault of the Web, but rather a result of the early Web's focus on documents. However, social interactions are currently restricted to particular social networking sites, where the identity of a user and their data can easily be entered, but only accessed and manipulated via proprietary interfaces, thus creating a "wall" around connections and personal data, as illustrated in the picture below. This dismal situation is analogous to the early days of hypertext before the World Wide Web, when various systems stored hypertext in proprietary and incompatible formats without the ability to use, globally link, and access hypertext data across systems, a situation solved by the creation of URIs and HTML. A truly universal, open, and distributed Social Web architecture is needed.

The lack of such an architecture deeply impacts the everyday experience of the Web of many users. There are four major problems experienced by the end user:

1. Portability: An ordinary user cannot download their own data and share it how they like. Information stored on social networks could be useful for any number of applications, but the lack of portability of tediously entered social networking information causes users to continually re-enter and update their personal information, wasting their time.

2. Identity: Not having an easy way to manage digital identity across digital networks leads to unsafe re-use of passwords. Every time a user goes to a new site, they must not only create a new username and password, but also re-find their friends and entice friends to move sites with them. Porting personal data from one network to another does not solve the problem of losing one's friends if one moves.

3. Linkability: Users have no way of being notified if they are mentioned on a social networking site of which they are not a member. For example, if someone takes a photo of some friends at a party and wishes to publish it on the Web to share with those friends, but does not wish to make it publicly available, they must find a social network where each one of them is already a member, or simply not tell people that the photo has been uploaded.

4. Privacy: A user cannot control how their information is viewed by others in different contexts by different social applications, even on the same social networking site, which raises privacy concerns. Privacy means giving people control over their data, empowering people so they can communicate the way they want. This control is lacking if configuring data sharing is effectively impossible or if data disclosure by others about oneself cannot be prevented or undone.

Participation is the lifeblood of social networks. If no one (or too few people) participates, a social networking application dies. If social applications are to thrive and provide engaging and valuable services to users, they must be easy to use, and must support ways for people to connect with and manage their social interactions and connections across multiple sites. While we take a "user-centric" approach in this report, having a common set of Social Web standards is a "win-win" proposition for both industry and users: portability issues prevent companies from accessing user data held on third-party sites in order to build innovations, and large social networking platforms themselves lack ways to easily share their data and, in turn, monetize their assets.

The Social Web Vision

People express different aspects of themselves depending on context, thus giving themselves multiple profiles that enable them to maintain various relationships within and across different contexts: the family, the sporting team, the business environment, and so on. Equally, in every context there is usually certain information that people wish to keep private. In the 'pre-Web world' people can usually sustain this multiplicity of profiles, as they are physically constrained to a relatively small set of social contexts and interaction opportunities. In some ways, social dynamics on the Web resemble those outside the Web, but social interactions on the Web differ in a number of important ways:

the kinds of profile exhibited by a single person are not controlled by the same constraints and so are less limited in scope, and may include profiles for fictional personae.

the set (number) of people with whom interactions are possible is not limited by distance or time. The Web allows users to connect with a vast number of people, which was inconceivable only a few years ago.

a person can explicitly "manage" the relationships and access to information they wish to have with others, and the increasing convergence of the Web and the world outside the Web is also leading to increasing concerns about privacy as these worlds collide.

Anyone should be able to create and organize one or more different profiles using a trusted social networking site of their choice, including hosting their own site that they themselves run either on a server or locally in their browser. For example, a user might want to manage their personal information such as home address, telephone number, and best friends on their own personal "node" in a federated social network, while their work-related information such as office address, office telephone number, and work colleagues is kept on a social network run by their employer. Today's aggregator-based approaches, exemplified by FriendFeed, are but a short-term solution akin to "screen scraping": they work over a limited number of social networking sites, are fragile to changes in the sites' HTML, and are legally dubious.

The approach we endorse allows users to own their own data and associate specific parts of their personal data directly with different social networking sites, as well as to link to data and friends across different sites. For example, your Friends Profile can be exposed to MySpace and Twitter, whereas your Work Profile can be exposed to Plaxo and LinkedIn, and links between data and friends should be possible across all these sites. Traditional services can utilize these features, so that your "health" profile can be exposed to health care providers and your "citizen" profile exposed to online government sites and services. In this world of portable social data, both large and small new players can then also interface to profiles and offer seamless personalized social applications.

Privacy is a complex topic, and we understand privacy as control over accessibility of social information in general, including security as an enabler (the authentication of digital identity and ownership of data). Privacy controls are often not well understood by users, and they do not stop data "leaking" from the social networking site itself, which may give user data to other companies or even governments for some kind of gain without alerting the user. Privacy should be controlled by the users themselves in an explicit contract with social networking sites and applications that makes privacy controls easy to use and understandable. As custodians of their own profiles, users can then decide which social applications can access which profile details by explicitly exposing personal data to that application provider, and retracting it as well, at an appropriate level of granularity (http://www.w3.org/2010/api-privacy-ws/papers/privacy-ws-1.pdf). This in itself is one of the biggest challenges for the entire Web community, not just social networks, and needs a new "policy-oriented Web" architecture to support trust and privacy on the Web in the longer term. Whilst technical security is a mandatory enabler, users' effective ability to control the processing of their data is largely influenced by accessible controls, helpful user interface design with strong visual metaphors, and privacy-enhancing default settings regarding data sharing.
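
The idea of a user explicitly exposing profile details to an application, and later retracting them, can be sketched in a few lines of Python. This is only an illustrative sketch and not part of any standard discussed in this report: the `Profile` class, its `grant`, `retract`, and `view_for` methods, and the application and attribute names are all hypothetical. It assumes that nothing is exposed unless the user has granted it, in the spirit of privacy-enhancing defaults.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """One profile of a user (hypothetical model, for illustration only)."""
    label: str
    attributes: dict[str, str]
    # Maps an application name to the set of attribute keys it may read.
    grants: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, app: str, *keys: str) -> None:
        """Explicitly expose the named attributes to one application."""
        unknown = set(keys) - set(self.attributes)
        if unknown:
            raise KeyError(f"unknown attributes: {sorted(unknown)}")
        self.grants.setdefault(app, set()).update(keys)

    def retract(self, app: str, *keys: str) -> None:
        """Withdraw access to the named attributes, or to all if none are named."""
        granted = self.grants.get(app)
        if granted is None:
            return
        if keys:
            granted.difference_update(keys)
        else:
            granted.clear()

    def view_for(self, app: str) -> dict[str, str]:
        """Return only the attributes granted to this application (default: none)."""
        allowed = self.grants.get(app, set())
        return {k: v for k, v in self.attributes.items() if k in allowed}


# Hypothetical usage: a work profile exposed attribute by attribute.
work = Profile("work", {"office_phone": "555-0100", "office_address": "1 Example Street"})
work.grant("calendar-app", "office_phone")
print(work.view_for("calendar-app"))  # {'office_phone': '555-0100'}
work.retract("calendar-app")
print(work.view_for("calendar-app"))  # {}
```

Granting at the level of individual attributes is one possible reading of "an appropriate level of granularity"; a real policy language would also need to express purposes, durations, and onward sharing, none of which this sketch attempts.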

The Social Web architecture articulated here is not the invention of the Social Web Incubator Group, but of a long-standing community-based effort that has been running for multiple years, and of which only a small fraction of participants have been explicitly interviewed and acknowledged by the Social Web Incubator Group. This report is dedicated to all the developers out there working to make this vision a reality.

The Terminology

As the Social Web is a large and innovative space, the creation of new terms cannot be avoided, but being too loose with terminology may cause confusion rather than build consensus. Building on existing work like the lexicon of Identity Commons [IDLEXICON!!], we propose definitions for the following concepts in order to clarify our presentation:

| Concept | Definition |
|---|---|
| User | The user is a person, organization, or other agent that participates in online social interactions on the Web. |
| Profile | A single digital representation of a user. These are potentially unlimited and may coincide with different personae of the user, such as a personal profile and a work profile. This is a "persona" in the Identity Commons Lexicon. |
| Profile Attribute | Information about a user that is a component of the profile, such as name, e-mail, status, photo, work phone, home phone, blog address, etc. This is an "identity attribute" in the Identity Commons Lexicon. |
| Social Connection | Social connections are associations between a profile and a resource (or group of resources). They may include the type of the relationship (e.g. friend, colleague, spouse, likes) and may be either reciprocal ('friend') or uni-directional ('following'). The connection may be between different users or between a user and some social media (a video or an item the user likes). The collection of all connections of a profile is called the Social Graph of that profile. |
| Social Group | Social Groups are explicit named sets of social connections between resources. For example, My Football Team, Wine Club, My Favorite Movies, etc. |
| Social Platforms | Social platforms refer to a collection of features in which the user can interact with their social connections and social media, publish social media, and use social applications. The social platform is often centered on a single website (a 'social networking site' analogous to Facebook or LinkedIn), but may also be owned and controlled by the user. |
| Distributed Social Graph | A set of profiles and social connections between agents which may be hosted across different social platforms. |
| Social Applications | Social Applications are functions of a social platform, such as real-time messaging and social games. Social applications may be bound to a particular social platform (Facebook and FBML, Twitter and Twitter OAuth) or capable of running across multiple social platforms (OpenSocial). Note that the difference between a social platform and a social application is often fuzzy, as some platforms do not allow third-party applications, and some platforms are indistinguishable from their applications. |
| Profile Association | A kind of social connection. Profile associations are used to indicate the link between a specific profile and a social platform. The social platform can then provide profile attributes for use by social applications. |
| Social Interaction | A social interaction links a Social Web user and a social platform by providing all the necessary applications and profile information. |

(@@QUESTION: Shouldn't "Profile" in the table be "Identity"?)

Social Web User and Profiles

Figure 1 below shows how a single user (one person) can have multiple profiles that share common attributes. A user can then associate his/her profile at the profile level with particular social applications, controlling them in some sort of aggregated view that the user may have either in a desktop application or via an aggregator. The profiles are exposed to and/or synchronized with different social platforms. In some cases, the social platform will update a profile property and this modified property will be reflected across all profile instances. The attributes included in a profile will depend greatly on the needs and desires of the user and the context of each social application, including dynamic attributes that capture the evolving changes of a person's context, such as geolocation attributes. In Figure 1, one profile is associated with the "light blue" and "red" social applications, one profile with the "grey" social application, and one profile with the "blue", "green", and "orange" social applications.

(@@QUESTION: What does this above section give us?)

Single Distributed Social Graph

Attributes within a profile, including information about social connections, may be distributed. This means that the relevant attributes and social connections could be stored with a social application for use in the context of that application. For example, a work phone attribute is stored by my current employer's social platform, but another social platform (e.g., LinkedIn) may store my previous employer's information. Together, these two (distributed) attributes can be considered a distributed single "work" profile whose information I may want to combine in the context of a social application (such as a job-hunting social application). Figure 2 below shows a profile that has two sets of two attributes at distributed sites, each with two local attributes. The user is interacting with the profile through the "blue" social platform, which could be a node in a decentralized Social Web platform. For example, a profile management service that could be run in the browser or via a third-party web-site would keep track of the distributed attributes and multiple profiles and allow the user to edit the attributes across multiple platforms.
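As a rough illustration of the profile management service described above, the following sketch combines attributes held by two platforms into one "work" profile while remembering where each attribute is stored. The class, function, platform names, and attribute values are all hypothetical and are not taken from any standard.

```python
from dataclasses import dataclass


@dataclass
class AttributeStore:
    """Attributes that one social platform holds for a profile (hypothetical model)."""
    platform: str
    attributes: dict


def combine_profile(stores):
    """Merge distributed attributes into one view, recording where each one lives."""
    combined = {}
    for store in stores:
        for name, value in store.attributes.items():
            combined[name] = {"value": value, "stored_at": store.platform}
    return combined


# Two sets of two attributes at distributed sites, as in Figure 2 (values invented).
work_profile = combine_profile([
    AttributeStore("current-employer.example",
                   {"work_phone": "+1-555-0100", "job_title": "Engineer"}),
    AttributeStore("previous-employer.example",
                   {"previous_employer": "Acme", "previous_title": "Intern"}),
])
```

Keeping the "stored_at" field is what would let such a service write an edited attribute back to the platform that hosts it.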



Multiple Distributed Social Graphs

A profile is associated with one or more social platforms in which the user's social graph is formed and nurtured. The social platform is the context for how a user is connected to the profiles of others and will support the specific connection types (e.g. friend, colleague, likes, etc.) that will typically serve the purpose of some social application. A core feature or service of a social application is to make, maintain, and expand these connections.

A user's connections in a particular social platform should be portable. The user should be able to take them to another social platform, so it is not necessary to re-establish all connections again in another (new) social application. Note that Amy (Profile 1) in the "blue" social platform is connected twice to Bob via his Profile 1 and 2. This demonstrates that the same user can connect via different social platforms. The connections do not necessarily have to be controlled by any social platform, but could be links through the open Web. The lines between profiles are either uni-directional (such as Twitter) or bi-directional (such as Facebook) to capture whether the connection is one-way (following) or mutual (friendship). Two dots mean that the connection is bi-directional. One dot means that the connection or association is not reciprocal.

Figure 3A shows an example of multiple distributed social graphs with a number of different users, profiles, and social platforms. For example:

The blue social platform connects Amy (Profile#1) to Bob (Profile#2), Col (Profile#3), Dan (Profile#2), Bob (Profile#1).
The green social platform connects Amy (Profile#1) to Fran (Profile#3), Gary (Profile#1), Ed (Profile#2).
The orange platform connects Bob (Profile#1) to Amy (Profile#2), Fran (Profile#1), Ed (Profile#7).
The red social platform connects Ed (Profile#7) to Dan (Profile#2).
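The connections listed for Figure 3A can be written down as a small edge list, which makes the idea of a portable, multi-platform social graph concrete. This is only a sketch: the tuple layout and function name are assumptions, and each connection is treated as pointing from the first-named profile to the others because the text does not state the direction of each line.

```python
from collections import defaultdict

# (platform, (user, profile number), (user, profile number)), as listed for Figure 3A.
EDGES = [
    ("blue", ("Amy", 1), ("Bob", 2)),
    ("blue", ("Amy", 1), ("Col", 3)),
    ("blue", ("Amy", 1), ("Dan", 2)),
    ("blue", ("Amy", 1), ("Bob", 1)),
    ("green", ("Amy", 1), ("Fran", 3)),
    ("green", ("Amy", 1), ("Gary", 1)),
    ("green", ("Amy", 1), ("Ed", 2)),
    ("orange", ("Bob", 1), ("Amy", 2)),
    ("orange", ("Bob", 1), ("Fran", 1)),
    ("orange", ("Bob", 1), ("Ed", 7)),
    ("red", ("Ed", 7), ("Dan", 2)),
]


def connections_of(user):
    """Collect the connections of every profile of `user`, grouped by platform."""
    found = defaultdict(set)
    for platform, source, target in EDGES:
        if source[0] == user:
            found[platform].add(target)
    return dict(found)


amy = connections_of("Amy")  # what Amy would export when moving to a new platform
```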

Figure 3B shows an example of explicit groups. In this example, Amy has designated a number of her connections into two groups. These named groups then enable Amy

to refer to the collection of connections in a single instance. For example, "allow my book reviews to be read by my Book Club members only", and with global digital

identities and profile information, these groups could encompass users across many social platforms.
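A named group can then act as the unit of access control. The sketch below shows one way the "Book Club members only" rule might be checked; the group memberships are invented for illustration (the text does not list the members shown in Figure 3B), and the policy format is an assumption.

```python
# Hypothetical group membership: group name -> set of (user, profile number).
GROUPS = {
    "Book Club": {("Bob", 1), ("Fran", 3)},
    "Wine Club": {("Gary", 1), ("Ed", 2)},
}

# Hypothetical policy: resource name -> groups allowed to read it.
POLICY = {"book reviews": {"Book Club"}}


def can_read(resource, requester):
    """Return True if `requester` belongs to any group allowed to read `resource`."""
    allowed_groups = POLICY.get(resource, set())
    return any(requester in GROUPS.get(group, set()) for group in allowed_groups)
```

Because members are identified by profile rather than by platform account, the same group could span profiles hosted on different social platforms.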

So far, an emphasis has been placed on the creation and management of profiles with their associated interwoven multiple social graphs. To be successful, the Social

Web must include far more than distributed profile and social graph management. We propose an open conceptual system in which there are multiple interoperable

frameworks (see Figure 4) covering different levels of complexity and use-cases.



In effect we depict a "meta-framework" within which there currently appear:

Identity Framework,
Profile Framework,
Social Media Framework,
Privacy Framework,
Activity Framework, and
Emerging Frameworks (Decentralized Social Networking)

At this point, we will assume the frameworks will be able to work together seamlessly via a combination and harmonization of standards in order to enable a wide variety

of innovation across social platforms and applications. An evolving combination of interoperable frameworks will move the Social Web towards this overall objective

without constraining developers to a single monolithic architecture.

The Case for Open Social Web Standards

However, a critical problem in realizing this vision of the Social Web is the fact that any "distributed" social networking platform will become yet another walled garden unless it is based on open and royalty-free standards. Via open standards, multiple social networking platforms ranging from large vendors to simple personal websites should be able to interoperate. However, these standards are currently scattered across various communities and are at times even incompatible, so that producing a single overview of the relevant technologies and standards is a difficult if not impossible task, as is guaranteeing that implementers can implement them without being hit by a patent lawsuit.

(@@QUESTION: what is this last sentence saying?)

Identity

Identity is the connection between a profile, a set of attributes, and a user. Some credentials or "proof" of identity may be required from the user to access or create a profile, which is the step of authentication. In particular, these credentials may take many forms, such as a password, a signed digital certificate, or some other log-in credentials. Identity providers make claims (at least one) by providing attributes and may or may not authenticate the identity of a user. One of the most important parts of any profile claim is the identifier (a URI, including an e-mail address) for a user, although making a claim does not always reveal an identifier. An identity may be de-coupled from all but the most minimal of profiles (a simple identifier) and make claims without revealing any identifier, and may be anonymized so as not to include a user's true identity (i.e. legal name or other identifying characteristics). A user should be able to have multiple identities as well as multiple profiles. A user should be able to revoke an identity if it becomes compromised or for any other reason.

Problem: Usernames and Passwords are Insecure

Username and password combinations are currently the most prevalent identification technology on the Web. They are easy to understand, but suffer from a number of technical and economic drawbacks (http://preibusch.de/publ/password_market/), including phishing threats. Web users are asked to create password-backed accounts across so many Web sites that they reuse passwords, which makes each account less secure. Passwords that are manually generated are often insecure, and automatically generated ones are difficult to remember. Widespread technical negligence in implementing password systems securely further undermines the security of password systems on the Web, and can be partially attributed to a lack of practical advice or standards on how to implement good password schemes (http://www.lightbluetouchpaper.org/2010/07/29/web-password-standards-2/). Approaches like Facebook Connect and Google FriendConnect at this point rely on username and password-based authentication for sharing personal social data.
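To make the point about implementation negligence concrete, the sketch below shows one commonly recommended way for a site to store passwords: a per-user random salt and a slow key-derivation function, with a constant-time comparison on verification. It is an illustration only and not a scheme prescribed by this report; the iteration count and function names are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation from this report


def hash_password(password, salt=None):
    """Derive a storable digest from a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Even a careful implementation of this kind does nothing about password reuse or phishing, which is why the remainder of this section looks at alternatives to passwords.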

Use-case: No more passwords (or only one)

Social Web user Alice wants to access her Social Web platforms Twitbook for her friends and BizLink for job contacts. She wants to keep the two identities separate, and access these platforms from multiple devices. Unfortunately, Alice uses so many social platforms and other web-sites that she currently just repeats the same password and username combination over and over again, which is insecure and may lead to identity theft. Luckily, using a distributed and secure identity framework, she can verify her identity by associating herself to a profile using some proof like self-signed certificates on her favorite devices like her laptop and mobile phone. Furthermore, as she may sometimes want to access her social platforms from an Internet cafe while traveling, she finds a trusted third-party passphrase-based identity provider called SocialAggregator. As both Twitbook and BizLink support her standardized identity authentication mechanism, whether it is used via her browser on her mobile phone and laptop or via a third-party identity provider, Alice no longer has to remember passwords when she uses any of her social platforms on her trusted everyday devices, and has to use a passphrase only when not using a trusted device.

(@@QUESTION: The social aggregator sounds like a service, which would know too much information. I suggest the language here should be changed so that it seems like a collection of open standards, which would mean that people could host their own SocialAggregator)

Identity Standards

This section lists a number of online identity technologies that are currently deployed or in development on the Web. We include both identity standards and the authentication and discovery standards that rely on a notion of digital identity.

Browser-based Password Management

Browsers now make it easier for users to create different passwords for each website by remembering them for the user, as is currently implemented by Mozilla [@@Other browsers?]. The Weave project (http://mozillalabs.com/sync/how-do-i-get-started-using-weave/) of Mozilla aims to make password-based authentication more integrated in the browser by allowing the browser to create and update passwords automatically across the Web. Instead of trapping the user within the browser, Mozilla's Sync (http://mozillalabs.com/sync/) plugin could allow the user to copy passwords, browser preferences and bookmarks from one browser and device to another in a secure manner by storing these preferences encrypted on a server. The end user then only needs to remember the URL of the server and the one password for its contents to be able to retrieve them on any other device that knows how to decrypt and read the content. While browser-based approaches currently do not track social connections, this could be addressed in future work. However, even then it would not address the ability to make and use connections across different social networking sites.

OAuth

(@@QUESTION: OAuth 1.0 is not an IETF standard, is it?)

(@@QUESTION: OAuth is not really anything to do with Identity, it is about data authorization and API access to data... I am not sure where this is best placed, but I

find it odd to be in the "Identity" section)

OAuth (http://oauth.net/core/1.0a) (Open Authorization) [OAUTH!!] is a protocol, published by the IETF as the informational RFC 5849, that lets users share their private resources on a resource-hosting site with another third-party site without having to give the third party their credentials for the site and thereby access to all their personal data on the social site. This standard essentially defeats the dangerous practice of many early social networking sites of asking for the username and password of an e-mail account in order to populate a list of friends. Instead, OAuth allows an authorized handshake to happen between a resource-hosting site and a third party, which then lets the third party redirect the user to authorize the transaction explicitly on the original site. If the transaction is explicitly authorized, then OAuth generates a duration-limited token for the third party that grants access to the resource-hosting site for specific resources. OAuth's tokens establish a unique ID and shared secret for the client making the request, the authorization request, and the access grant. To its huge advantage, this approach works securely over ordinary HTTP requests, as the client generates a signature on every API call by signing the request data with the token secret, and the token secrets never leave the sites. However, a session-fixation (http://oauth.net/advisories/2009-1/) attack was discovered in the original specification that allowed a malicious party to save the authorization request and then convince a victim to authorize it, giving the malicious party access to the victim's resources. This was fixed in an update of OAuth (http://oauth.net/core/1.0a), in which the third party registers its callback with the resource-hosting site when requesting the token and must present a verification code issued after the user's authorization. Recently there has also been a timing attack (https://www.blackhat.com/html/bh-us-10/bh-us-10-briefings.html#Lawson) (using the difference in time between "bad" and correct digital signature verification to figure out tokens), but this has been addressed by having digital signature verification run in constant time.
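The per-call signature and the constant-time check mentioned above can be sketched as follows. This is a simplified illustration of the HMAC-SHA1 signature method of OAuth 1.0, not a complete client: it assumes the caller passes an already normalized URL without a query string, and the function names and the example parameter values are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode a value, leaving only unreserved characters literal."""
    return quote(str(value), safe="")


def sign_request(method, url, params, consumer_secret, token_secret=""):
    """Return the base64-encoded HMAC-SHA1 signature for one API call."""
    # Encode every name and value, then sort by encoded name and value.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join([method.upper(), pct(url), pct(normalized)])
    # The key joins the two secrets; neither is ever sent over the wire.
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode("ascii")
    digest = hmac.new(key, base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_signature(expected, received):
    """Compare signatures in constant time to avoid leaking timing information."""
    return hmac.compare_digest(expected, received)


signature = sign_request(
    "GET",
    "http://photos.example.net/photos",
    {"file": "vacation.jpg", "oauth_nonce": "abc123", "oauth_timestamp": "1288000000"},
    consumer_secret="consumer-secret",
    token_secret="token-secret",
)
```

The resource-hosting site recomputes the same signature from the request it received and its own copy of the secrets, then compares the two with `verify_signature`.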

While OAuth 1.0 is highly successful, the process of generating and managing the various tokens was considered difficult by many developers, so the IETF draft standard OAuth 2.0 (http://tools.ietf.org/html/draft-ietf-oauth-v2-10) simplifies the process by relying on TLS. OAuth 2.0 requires that the resource-hosting site use HTTPS rather than HTTP (and is therefore backwards incompatible), but because TLS is required for generating the token, signatures are no longer required for either token generation or API calls. Decreasing complexity further, OAuth 2.0 has just a single security token. This has led to wider adoption across social networking sites like Twitter.

OpenID

OpenID (http://openid.net/developers/specs/) centralises the authentication step at an identity provider, so that a user can identify themselves with one site (an OpenID identity provider) and share their profile data with another site, the relying party. A user need only remember one globally unique identity, which in OpenID 1.0 was a URI. In the initial OpenID 1.0 specification [OPENID-1!!], the identity provider was discovered by following links in an HTML page accessed via the OpenID 1.0 URI, and OpenID 2.0 also allowed the use of the XRD format. One of the primary findings of the OpenID effort was that users were unable to use URIs to identify themselves, and so approaches like Webfinger (http://code.google.com/p/webfinger/wiki/WebFingerProtocol) [WEBFINGER!!] were developed to allow e-mail addresses to be used as identifiers, which has had more success. (@@QUESTION: I think that Webfinger should probably get its own section? And the final comment "which has had more success" should either be backed up with examples or deleted)
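The HTML-based discovery step of OpenID 1.x can be illustrated with a short sketch: the relying party fetches the page at the user's URI and looks for a link element whose rel value is "openid.server". The code below only parses a page that has already been fetched; the class and function names and the example page are assumptions made for illustration.

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Collect <link rel="openid.*" href="..."> elements from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = (attributes.get("rel") or "").lower()
        for token in rel.split():
            if token.startswith("openid.") and href:
                self.links[token] = href


def discover_provider(html):
    """Return the identity provider endpoint advertised in the page, if any."""
    parser = OpenIDLinkParser()
    parser.feed(html)
    return parser.links.get("openid.server")


# Hypothetical page served at the user's OpenID URI.
PAGE = """<html><head>
<title>Alice</title>
<link rel="openid.server" href="https://provider.example/auth">
</head><body>Hello</body></html>"""

provider = discover_provider(PAGE)
```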

Once the OpenID provider is discovered, a shared secret is established between the provider and the relying party, allowing them to share data. This is primarily done via an attribute exchange protocol [OPENID-ATTR!!] that allows the user to specify what personal data should be sent to the relying party. Note that this attribute exchange protocol is constrained by the information that can be placed as attribute-value pairs inside a URI, which is practically limited to a maximum of 1024 characters [@@Check?]. OpenID is currently deployed by AOL, Facebook, the Livejournal codebase, Microsoft, Myspace, Google (including Blogger), WordPress and Yahoo! (including Flickr). However, many larger sites expose themselves as OpenID identity providers but do not function as OpenID relying parties, i.e. do not allow users to log in to their site using user credentials from another site.

As a server-side solution, OpenID and successor technologies have the advantage of relying only on server-side HTTP redirects, and so in general work independently of browsers. More seriously, OpenID 2.0 Authentication does not require relying parties to validate, and so has been described as phishing heaven (http://www.links.org/?p=187), since it allows any OpenID-enabled site to redirect a user to a fake OpenID provider that then steals the user's credentials. While OpenID does not specify the credentials needed by the authentication mechanism, very few OpenID providers provide authentication based on certificates or other kinds of credentials; most rely primarily on username-password authentication. Also, the protocol seems complex to developers, requiring 7 HTTPS requests in general. Its creators feel that users are not yet using the functionality of OpenID on the scale they would like. Given the similarities between the workflow of OpenID and that of OAuth, and the success of OAuth with developers, it now appears that the next version of OpenID, the work known as OpenID Connect (http://openidconnect.com/) [OPENID-CONNECT!!] starting at the OpenID Foundation in 2010, will be built entirely on top of OAuth.

WebID

WebID (http://webid.info/spec/), originally foaf+ssl (http://esw.w3.org/foaf+ssl), uses TLS and client-side certificates for identification and authentication. To authenticate a user requesting an access-controlled resource over HTTPS, the "verifying agent" controlling the resource needs to request an X.509 certificate from the client. Inside this certificate, in addition to the public key, there is a "Subject Alternative Name" field which contains a URI identifying the user (the "WebID"). Using standard TLS mutual authentication, the user agent proves that it knows the private key matching the public key in the certificate. A single cacheable HTTPS lookup on the WebID should retrieve a profile. If the semantics of the profile specify that the user named by that URI is whoever knows the private key of the public key sent in the X.509 certificate, this confirms that the user is indeed named by the WebID, allowing the verifying agent to make an access control decision based on the position of the WebID in a web of trust.
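The final check performed by the verifying agent can be sketched as a simple comparison between the key presented in the certificate and the keys published in the profile. The sketch assumes the TLS handshake has already proven possession of the private key, and it replaces the HTTPS lookup and profile parsing with an in-memory table; all names and key values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RSAPublicKey:
    """Minimal stand-in for the public key carried in the X.509 certificate."""
    modulus: int
    exponent: int


# Stand-in for the Web: WebID URI -> public keys published in that profile.
PROFILES = {
    "https://alice.example/profile#me": {
        RSAPublicKey(modulus=0xC0FFEE, exponent=65537),  # laptop browser
        RSAPublicKey(modulus=0xDECADE, exponent=65537),  # mobile browser
    },
}


def fetch_profile_keys(webid):
    """Hypothetical lookup; a real agent would dereference the WebID over HTTPS."""
    return PROFILES.get(webid, set())


def verify_webid(claimed_webid, certificate_key, lookup=fetch_profile_keys):
    """Accept the claim only if the profile at the WebID lists the certificate's key."""
    return certificate_key in lookup(claimed_webid)
```

Removing a key from the profile makes every certificate carrying that key fail this check.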

The user does not need to remember any identifier or even a password. The protocol uses exactly the same TLS stack as is used for global commercial transactions and is not vulnerable to phishing. As it is widely known that certificate authorities (http://www.win.tue.nl/hashclash/rogue-ca/) can be impersonated (although with a lot of work), instead of relying on widely known Certification Authorities, the client-side certificates may be self-signed. Such certificates can be generated in the browser in a one-click operation. Disabling a certificate is as simple as removing its public key from the personal profile.

However, there are a number of problems with this approach. First, certificate management and selection still has a lot of room for improvement in desktop browsers, and is a lot less widely supported on mobile devices, although there exist WebID implementations that are written in Javascript so as to be completely de-coupled from the browser. Furthermore, it is often thought that by tying identity to a certificate in a browser, users are tied to the device on which their certificate was created. In fact a user profile can publish a number of keys, one for each browser, and certificates are cheap to create. Some people believe that this can be enhanced by the use of protocols such as Nigori (http://www.links.org/files/nigori-protocol.html) that require only a single password to access "secrets" like certificates on a server, and so WebID could be integrated into a Firefox Sync-style identity management system.

InfoCard

Infocard (http://informationcard.net/.) is a user-centered identity technology based on three interrelated concepts: the card metaphor, active client software, and the OASIS IMI protocol for identity authentication [INFOCARD!!]. As such, it is a multi-layered integrated approach and infrastructure in and of itself. Active client software integrated with the local browser, sometimes called a selector, acts as a local digital wallet for the user. Each card in this wallet supports a set of profile attributes called claims. Personal cards can be created directly by the user and hold self-asserted claims and values. Managed cards, on the other hand, are issued by identity provider websites that act as the authority for the claims supported by that card. The interactions between the active client and external services are defined by the OASIS IMI standard [IMI!!]. Under IMI, an Infocard-compatible relying party website passively expresses its policy, usually via HTML extensions: the set of claim URIs that it requires, the card issuer it trusts, etc. When the user clicks on an HTML button, extensions within the browser trigger the invocation of the active client, which displays a set of cards that support the claims required. If a managed card is selected by the user, the user authenticates and the client fetches a security token from the card issuer site using IMI protocols, and POSTs it to the relying website where it can be validated and the claim values extracted. The Infocard architecture provides phishing resistance, eliminates the need for per-site passwords, provides a familiar card/wallet metaphor, and provides on-the-fly privacy enhancements (e.g. minimal attribute disclosure and generation of pseudonyms). Microsoft's Cardspace (http://www.microsoft.com/windows/products/winfamily/cardspace/default.mspx) is built into Vista and Windows 7. Open source projects including Novell's Digital Me (http://code.bandit-project.org), OpenInfocard (http://code.google.com/p/openinfocard/), and Eclipse Higgins (http://higgins-project.org) provide clients for MacOS, Linux, Windows, and iPhone as well as support for popular browsers. Commercial and open source card issuing services and relying party enabling technology are also available from a number of providers.
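The card selection step described above, in which the active client shows only the cards that satisfy the relying party's policy, can be sketched as a filter over the user's wallet. The data model, the claim URIs, and the issuer names below are invented for illustration and are not taken from the IMI specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Hypothetical model of a card in the selector's wallet."""
    name: str
    issuer: str        # "self" marks a personal (self-asserted) card
    claims: frozenset  # claim URIs the card can supply


def matching_cards(wallet, required_claims, trusted_issuers=None):
    """Return the cards that supply every required claim from an acceptable issuer."""
    required = set(required_claims)
    return [
        card for card in wallet
        if required <= card.claims
        and (trusted_issuers is None or card.issuer in trusted_issuers)
    ]


EMAIL = "http://claims.example/emailaddress"  # hypothetical claim URIs
AGE = "http://claims.example/over18"

WALLET = [
    Card("Personal", "self", frozenset({EMAIL})),
    Card("Bank ID", "https://bank.example", frozenset({EMAIL, AGE})),
]

# Relying party policy: needs both claims and trusts only the bank as issuer.
offered = matching_cards(WALLET, {EMAIL, AGE}, trusted_issuers={"https://bank.example"})
```

Only the claims named in the policy would then be requested from the issuer of the selected card, which is where the minimal-disclosure property comes from.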

While much has been achieved, Infocard remains a work in progress. Its main disadvantage is the perceived complexity of the interlocking standards and technology needed to support the architecture, so current work is on driving adoption via a focus on applications in the government sector. Infocard's relatively secure architecture and privacy-respecting characteristics, when compared with most browser-redirect-based identity technologies, are compelling in this marketplace. On the technology side, work is underway (e.g. within [1] (http://higgins-project.org) ) on active clients that move a considerable distance beyond the first generation clients that came to market in 2007-8. These new clients, while implementing the IMI protocol, will also add support for other protocols to make them interoperable. These Infocard-aware clients incorporate Web services to, at the least, provide "card roaming" across browsers and devices, and can provide a "Personal Data Store." New kinds of relationship cards that create continuous data feeds, as opposed to one-shot attribute conveyance, are under development. The work is now expected to move into "identity in the browser."

(@@QUESTION : what is meant by "relatively secure" ?!?! can we change this to "secure" or "not secure")

XAuth

XAuth (http://xauth.org/spec/) allows multiple identity providers to update an XAuth provider (currently only xauth.org (http://xauth.org/) ) so that third parties can authenticate a given user's identity [XAUTH!!]. When a user signs on to an account on an XAuth-enabled identity provider, the identity provider notifies xauth.org. When a site is encountered that needs authentication, the site can use some simple embedded Javascript to ask xauth.org which identity providers the user is logged in on, and then uses the cookies stored locally on the browser to help the user authenticate with the third-party site. This approach easily allows logging out (as XAuth-enabled identity providers can tell xauth.org that the user's session has ended) and lets users enable or block identity providers. However, this approach has been heavily criticized. First, xauth.org is controlled by a single entity (currently Meebo), so XAuth is heavily centralized @@ (http://hueniverse.com/2010/06/xauth-a-terrible-horrible-no-good-very-bad-idea/) . Although this could be fixed (i.e. letting xauth.org redirect to a local host, as suggested (http://www.abstractioneer.org/2010/06/xauth-is-lot-like-democracy.html) ), it still reveals to third parties the identity providers a user employs without their authentication, which can be enough information to identify them for malicious purposes (http://www.links.org/?p=938) . Google and Meebo deploy XAuth.

Profiles

The Profile framework contains those applications which can be used to access profile attributes, including distributed access to such information. Users at this stage should also be able to find, discover, add and delete connections in order to update their profile. Using an identity selector, a user may want to select amongst multiple profiles (each of which could be a persona) and their attendant sets of attributes. Each of these sets of claims could be hosted by different providers. It should be possible for a user to control multiple profiles across multiple social networking sites, and synchronize the updates to their identity providers. In this manner, social applications should be able to share profile information, but on an as-needed basis ("capability-based"), so that only the information needed in a particular context is revealed. Users should then be able to import their connections to new social applications and platforms so they do not have to find and confirm all contacts "from scratch" over and over again. Furthermore, a user should be able to export all their profile information and delete all profile information from an identity provider.
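The "as-needed" sharing described above amounts to releasing only the intersection of what an application asks for and what the user has agreed to reveal in that context. A minimal sketch, with hypothetical attribute names and a hypothetical function name:

```python
def disclose(profile, requested, granted):
    """Return only the attributes that were both requested and granted."""
    allowed = set(requested) & set(granted)
    return {name: value for name, value in profile.items() if name in allowed}


personal_profile = {
    "nickname": "alice",
    "email": "alice@example.org",
    "geolocation": "55.95,-3.19",
    "work_phone": "+1-555-0100",
}

# The application asks for three attributes; the user grants only two of them.
shared = disclose(
    personal_profile,
    requested={"nickname", "geolocation", "work_phone"},
    granted={"nickname", "geolocation"},
)
```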

Problem: Can not Describe Yourself

Today, when users create profiles they are often constrained in how they describe themselves and have to manually re-find their friends. For example, using this

real-world example given in the figure below, it is claimed that the user's real name has "too many characters", and thus forces the user to use a name that just happens to

fit the engineer's schema, even if it's not their real name. Given the importance of names in identifying people and the complexities of many international names

(@@QUESTION this sentence has no main verb). Worse, some social networking sites constrain preferences, such as gender and religion preferences, that can be very

sensitive. Also, many users may wish to have different names and profiles on different kinds of sites, and on some sites anonymity is a must. Furthermore, a near-fatal problem for the uptake of new social networking sites and applications is that not only do users have to re-enter all their information to conform to what the new site wants, but they then have to re-locate all their friends on the new site or re-invite them.

(@@QUESTION: Which Figure is being referred to? Either re-add figure or delete accordingly)

Use-case: Keep Your Profile and Friends Across Networks

Alice has gotten bored of her social platforms, and wants to move to the new and increasingly popular augmented social reality gaming platform Fazer. However, she does not want to re-enter her old information and find her friends again. She authenticates herself using her browser-based ID and then accesses Fazer, and selects her "personal" identity so as not to let her work colleagues know about her game-playing identity. Since Fazer is a "real-world" augmented reality social game, she does not create a completely fictional profile (although she could) but instead opts to use an existing profile. In the account creation process, she is not required to complete all the profile attributes, but has them auto-completed, and she even creates a few new (custom) fields in a profile, and this new updated personal profile information is automatically synchronized between Twitbook and Fazer. She also explicitly agrees to share her geolocation with Fazer, which she has never done with Twitbook. Her various settings, such as avatars, presence, mood indicators, time-of-day and geolocation context are also automatically synchronized. Then using her set of social connections, her existing friends are automatically discovered on Fazer and she is given the option to add each of them or invite them if they are not on Fazer. A few months later she gets tired of Fazer after having made some new friends in the process of playing various augmented reality games, and she decides to completely remove her profile from Fazer. However, as Fazer supports portability, Alice is able to download her own data to her profile manager at SocialAggregator and not lose touch with her friends, including downloading their numbers automatically to her mobile phone and backing her valuable data up locally.

Profile Standards

A number of standards exist for profile and relationship information on the Web. One distinction among them is what data format (plaintext, XML, RDFa) the profile is in and whether or not it is easily extensible. Even more importantly, there are differences in how, given a digital identity, any particular application can then try to discover and access the profile data and other capabilities that the digital identity may implement. While some profile formats mention these discovery and use techniques explicitly and others do not, the common or standardized discovery techniques will be mentioned in context with each profile data format.

XRD

XRD (http://docs.oasis-open.org/xri/xrd/v1.0/xrd-1.0.html) (Extensible Resource Description), formerly YADIS and XRD-Simple (XRD-S), is a XML file format for

discovering what capabilities a particular profile provider may have [XRD!!]. For example, is it also an OpenID identity provider or does it provide PortableContacts

information? The XRD format provides this for arbitrary resources via the use of types and typed links describing URIs (URI templates) given in the XML format that

can then be queried by a user-agent. The work around XRD has led to a number of innovations for locating XRD besides the W3C-style use of content negotiation,

including the use of the IETF draft standards .host-meta and the more generic .well-known (http://tools.ietf.org/html/draft-nottingham-site-meta-05) subdirectories from

any URI [HOST-META!!] [WELL-KNOWN!!]. Furthermore, the XRD file (or other metadata format) can be discovered via a combination of markup

directly in the document (such as a Link element in HTML), HTTP Link headers in responses, and then generic directories like .host-meta. The priority can be

determined by the IETF LRDD (http://tools.ietf.org/html/draft-hammer-discovery-06#section-5) (Link-based Resource Descriptor Discovery) informational document

[LRDD!!], which has now been subsumed by the host-meta draft specification (http://tools.ietf.org/html/draft-hammer-hostmeta-13#appendix-B) . The IETF Web Linking

(http://tools.ietf.org/html/draft-nottingham-http-link-header-10) specification defines an IETF standard for Link Registries [WEB-LINKING!!].

Overall, XRD seems useful and offers only modifications to XRDS (XRD Simple), and is similar to earlier W3C-inspired efforts using HTML and XLink like RDDL

(http://www.rddl.org/) . Although XRD was originally developed in 2004 by the OASIS XRI (Extensible Resource Identifier) Technical Committee as the

resolution format for XRIs, it is seemingly rid of the use of XRIs, which are custom URI-like identifiers for people and organizations. Due to technical concerns

(http://lists.w3.org/Archives/Public/www-tag/2008May/0078.html) and the use of at least previously patented technology (http://danbri.org/words/2008/01/29/266) for

XRIs, this is a step forward. There is also movement in the specification, as it seems developers want a JSON specification of XRD, tentatively called JRD

(http://hueniverse.com/2010/05/jrd-the-other-resource-descriptor/) (although there is no RDF serialization of XRD). The general discovery management also needs to be

integrated with content negotiation, but Web Linking and related specifications provide a much needed clarification of how to retrieve metadata about resources on the

Web.
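
As a rough illustration of the typed links described above, the following Python sketch parses a small XRD document and looks up links by their rel value. The sample descriptor, the example.org URLs, the rel values chosen, and the helper name find_links are assumptions made for illustration; only the XRD 1.0 namespace and the rel, href, and template attributes of the Link element are taken from the XRD specification.

```python
import xml.etree.ElementTree as ET

XRD_NS = "http://docs.oasis-open.org/ns/xri/xrd-1.0"

# Hypothetical descriptor for an illustrative profile provider.
SAMPLE_XRD = """<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>http://example.org/alice</Subject>
  <Link rel="http://specs.openid.net/auth/2.0/provider"
        href="http://example.org/openid"/>
  <Link rel="http://portablecontacts.net/spec/1.0"
        href="http://example.org/poco"/>
  <Link rel="describedby"
        template="http://example.org/describe?uri={uri}"/>
</XRD>
"""


def find_links(xrd_text, rel):
    """Return (href, template) pairs for every Link whose rel matches."""
    root = ET.fromstring(xrd_text)
    links = []
    for link in root.findall(f"{{{XRD_NS}}}Link"):
        if link.get("rel") == rel:
            links.append((link.get("href"), link.get("template")))
    return links


# Does this provider advertise a PortableContacts endpoint?
poco_links = find_links(SAMPLE_XRD, "http://portablecontacts.net/spec/1.0")
```

A user-agent could run the same lookup for each capability it cares about; a link that carries a template rather than an href still has to be expanded with the URI being described before it can be fetched.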

VCard

The IETF standard vCard [VCARD!!] is the oldest and most widespread format for personal addressbook data, the kind of information typically found on a business

card, such as name and address. Therefore, this format serves in general as the common core of most data formats, except for FOAF (leading to the definition of vCard 3.0

in RDF [VCARD-RDF!!]). However, vCard 3.0 in general lacked the ability to describe social relationships and was serialized in an ASCII text format, so the vCard 4.0

activity at IETF [VCARD4] has provided improved semantics for properties about people and organisations (such as the ability to express groups of users, e.g. "Wine

Club members") and direct relationships between users ("friendship") and mechanisms to extend these terms. Syntactically, vCard can be expressed in its native format

similar to VCard 3.0 and in a new XML format [VCARD4-XML] similar to the PortableContacts XML format. VCard import and export is supported by most mail

programs like Thunderbird, Microsoft Exchange, and Apple Mail.
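
To make the native text serialization concrete, here is a minimal Python sketch that splits vCard 3.0 content lines into a property name, a list of parameters, and a value. The sample card and the function name parse_vcard are invented for illustration, and the sketch deliberately ignores line folding, value escaping, and property groups, all of which a real parser would need to handle.

```python
# Hypothetical vCard 3.0 card used only as sample input.
SAMPLE_VCARD = """BEGIN:VCARD
VERSION:3.0
FN:Alice Example
N:Example;Alice;;;
EMAIL;TYPE=INTERNET:alice@example.org
TEL;TYPE=CELL:+1-555-0100
END:VCARD
"""


def parse_vcard(text):
    """Parse unfolded vCard content lines into (name, params, value) tuples."""
    properties = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Everything before the first colon is the property name plus parameters.
        head, _, value = line.partition(":")
        name, *params = head.split(";")
        properties.append((name.upper(), params, value))
    return properties


card = parse_vcard(SAMPLE_VCARD)
```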

Based on vCard 3.0, profiles can also be embedded in HTML pages using the hCard microformat (http://microformats.org/wiki/hcard) specification from the

microformats process. One extension of hCard used by the microformat community is XFN (XHTML Friends Network (http://gmpg.org/xfn/) ), which embeds its

own idiosyncratic social contact relationships directly into HTML links using the rel attribute, and provides a finite set of values (http://gmpg.org/xfn/11) to define

which kind of relationships exist between individuals (friend, co-worker, met). This kind of contact information based on hCard is currently deployed by sites such as

Slideshare, dopplr, and Twitter to express social networks and can be converted to formats like RDF via GRDDL [GRDDL!!]. Overall, despite debates on alignment,

vCard 4.0 promises to be a stable core set of terms for the Social Web.
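
The XFN approach of placing relationship values in the rel attribute of ordinary links can be sketched in Python as follows. The sample markup, the class name XFNLinkParser, and the decision to recognize only a handful of XFN 1.1 values are assumptions made to keep the example short.

```python
from html.parser import HTMLParser

# A subset of the XFN 1.1 relationship values, for illustration only.
XFN_VALUES = {"friend", "acquaintance", "contact", "met", "co-worker", "colleague"}


class XFNLinkParser(HTMLParser):
    """Collect (href, [xfn values]) for every link that carries XFN values."""

    def __init__(self):
        super().__init__()
        self.relationships = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").split()
        xfn = [token for token in rel_tokens if token in XFN_VALUES]
        if xfn and attributes.get("href"):
            self.relationships.append((attributes["href"], xfn))


# Hypothetical blogroll fragment.
SAMPLE_HTML = (
    '<p><a href="http://example.org/bob" rel="friend met">Bob</a> and '
    '<a href="http://example.org/help">help</a></p>'
)

xfn_parser = XFNLinkParser()
xfn_parser.feed(SAMPLE_HTML)
```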

FOAF

The first project that used standards to describe distributed, de-centralised social networks was the FOAF project (http://foaf-project.org) (Friend-of-a-Friend) [FOAF].

FOAF however only attempts to address descriptive challenges, rather than the entire problem space. FOAF provides an extensible and open-ended approach to

modelling information about people, groups, organizations and associated entities, and is designed to be used alongside other descriptive vocabularies. The FOAF project

also established the practice of linking together RDF documents, prompting the Linked Data (http://www.w3.org/DesignIssues/LinkedData.html) design note from Tim

Berners-Lee that kick-started the Linked Open Data movement. Despite these innovations, FOAF does not itself provide "social networking" functionality. It

assumes other tools and techniques will be used alongside it, and does not itself specify authentication, syndication or update mechanisms. Today the vast majority of

data expressed in FOAF is exported from large "social network" sites. However when FOAF began, most of these sites (except LiveJournal (http://livejournal.com) ) did

not exist, and the conceptual model for FOAF was the personal homepage.

FOAF profiles can be used to describe both the attributes of a user and that user's social network (via the foaf:knows (http://xmlns.com/foaf/spec/#term_knows) property).

The discovery of FOAF information currently supports that information being simply accessed via RDFa or Linked Data over HTTP, and for private profile data

authenticated using an identity provider before access. Current applications natively export FOAF profiles of their users, including hi5 (http://hi5.com/) , status.net

(http://status.net) , Drupal7 (http://drupal.org) , and SMOB (http://smob.me) . Various exporters have been created by the community to enable FOAF export of major

Websites (Twitter (http://semantictweet.com) , Flickr (http://apassant.net/blog/2007/12/18/rdf-export-flickr-profiles-foaf-and-sioc/) , Facebook

(http://www.facebook.com/apps/application.php?id=2626876931) , and last.fm (http://dbtune.org/last-fm/) ).

FOAF is well-suited to enable a decentralized Social Web due to its use of URIs and Web-scale linking. For instance, a URI like http://example.org/alice#me can be

used as Alice's identifier and Bob can state he knows Alice on his website by re-using this URI in either HTML or an RDF file. Like other RDF vocabularies, FOAF can

be easily extended in a decentralized manner, as done by the SIOC (http://sioc-project.org) vocabulary as regards user profiles and user-generated content, the

Online Presence Ontology (http://online-presence.net) and the relationship vocabulary (http://vocab.org/relationship) . However, while FOAF was created to demonstrate

the decentralized nature of distributed vocabularies, its historic divergence from vCard and PortableContacts makes it difficult to use with current Social Web

applications, along with the general perceived complexity of RDF and lack of adequate RDF tooling. The FOAF project does not propose FOAF as the format that

should be adopted for decentralized social networking; rather it is offered as a representational model that can find middle-ground between the semantics from diverse

initiatives ranging from digital libraries and cultural heritage to those used in the Social Web. Recent changes to the FOAF specification have brought parts of it into

closer alignment with the Portable Contacts work, and further such convergence is needed if FOAF is to be seen as a modern component of the technology landscape.
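
The URI re-use described above, where Bob states that he knows Alice by pointing at her identifier, can be sketched without any RDF library by emitting a small Turtle document. The function name foaf_knows_turtle and Bob's URI are hypothetical, and the sketch assumes the name contains no characters that need escaping in Turtle.

```python
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def foaf_knows_turtle(person_uri, name, known_uris):
    """Build a Turtle profile for one person and the people they know."""
    if not known_uris:
        raise ValueError("at least one known URI is required")
    lines = [f"@prefix foaf: <{FOAF_NS}> .", ""]
    lines.append(f"<{person_uri}> a foaf:Person ;")
    # Assumption: name is plain text with no quotes or backslashes.
    lines.append(f'    foaf:name "{name}" ;')
    objects = " ,\n        ".join(f"<{uri}>" for uri in known_uris)
    lines.append(f"    foaf:knows {objects} .")
    return "\n".join(lines)


# Bob publishes this on his own site; Alice's URI lives on a different host.
bob_profile = foaf_knows_turtle(
    "http://example.com/bob#me", "Bob", ["http://example.org/alice#me"]
)
```

Because the statement only references Alice's URI, Bob does not need an account on, or permission from, whichever site hosts Alice's profile.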

PortableContacts

An increasingly popular profile standard is PortableContacts (http://portablecontacts.net/draft-spec.html) , which is derived from vCard, and is serialized as XML or,


more commonly, JSON. It contains a large number of profile attributes, such as the "relationshipStatus" property, that map easily to common profiles on the Web like the

Facebook Profile [PORTABLE-CONTACTS!!]. More than a profile standard, the PortableContacts profile scheme is designed to give users a secure way to permit

applications to access their contacts, depending on XRDS for discovery of PortableContact end-points and OAuth for delegated authorization. It provides a common

access pattern and contact scheme as well as authentication and authorization requirements for access to private contact information. It has support from Google, Hi5,

Plaxo and others, and is a subset of the contact schema used by OpenSocial, so every valid OpenSocial provider is also a PortableContacts profile provider.

Because vCard 3.0 did not originally have an XML format, PortableContacts was the first realistic contact schema with an XML format. It is also a proper super-set of vCard

3.0 [VCARD3!!] and is very close to mapping on to vCard 4.0, as co-ordination work in the DAP group shows (http://www.w3.org/2009/dap/wiki/ContactFormatsComparison) .

Ideally, PortableContacts and vCard 4.0 could converge or gain an easy-to-understand super-set or subset relationship with each

other, so as to reduce the friction between various profile data formats.
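
For a sense of what the more common JSON serialization looks like to a consuming application, the following Python sketch reads a PortableContacts-style response and extracts a short summary per contact. The sample response and the helper name contact_summaries are invented; the field names follow the PortableContacts draft as understood here and should be checked against the specification before being relied upon.

```python
import json

# Hypothetical response body from an illustrative PortableContacts endpoint.
SAMPLE_RESPONSE = """
{
  "startIndex": 0,
  "itemsPerPage": 1,
  "totalResults": 1,
  "entry": [
    {
      "id": "703887",
      "displayName": "Alice Example",
      "name": {"givenName": "Alice", "familyName": "Example"},
      "emails": [{"value": "alice@example.org", "type": "home"}],
      "relationshipStatus": "single"
    }
  ]
}
"""


def contact_summaries(response_text):
    """Return id, display name, and e-mail addresses for each contact entry."""
    data = json.loads(response_text)
    summaries = []
    for entry in data.get("entry", []):
        emails = [e["value"] for e in entry.get("emails", []) if "value" in e]
        summaries.append(
            {
                "id": entry.get("id"),
                "displayName": entry.get("displayName"),
                "emails": emails,
            }
        )
    return summaries


contacts = contact_summaries(SAMPLE_RESPONSE)
```

In a real deployment the response would only be returned after the OAuth authorization step mentioned above; that step is omitted here.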

OpenSocial

(@@QUESTION: Am not that clear on how OpenSocial is a "profile" related thing, but I am not sure where it should go. My understanding is that it is an API for data

access, whereas most of the other things mentioned in this section are profile data-format standards)

OpenSocial is a collection of Javascript APIs, controlled by the OpenSocial Foundation, that allow Google Gadgets (a proprietary portable Javascript application) to

access profile data, as well as other necessary tasks such as persistence and data exchange [OPENSOCIAL]. It allows developers to easily embed social data into their

Gadgets. The profile data it uses is a superset of the PortableContacts and vCard 3.0 data formats. It does not require access to Google servers to run, but instead can

run off the open-source Shindig (http://shindig.apache.org/) implementation and so positions itself as an "open" alternative to the Facebook Platform, and has been

supported by a number of vendors, such as Google, MySpace, Yahoo!, IBM, and Ning.

There is a rather unfortunate mismatch between OpenSocial Gadgets and W3C Widgets from the Web App WG (http://www.w3.org/2008/webapps/) , given that both

are primarily based on top of HTML and Javascript. Currently there is work being undertaken by the Apache Wookie project (http://incubator.apache.org/wookie/) to

provide interoperability between Widgets and Gadgets, although ideally in the future the W3C Widgets would either be adopted or work more closely with major

vendors in the iteration. Also, W3C Working Groups like WebApps (http://www.w3.org/2008/webapps/) and DAP (http://www.w3.org/2009/dap/) (Device APIs and

Policy) are producing APIs that involve contact information and so should ideally maintain some baseline compatibility with OpenSocial.

Social Media

The Social Web is not only the connections between people, but the connections between people and arbitrary resources, including messages like blog posts, audio,

photos, videos, and other resources. So social media is any resource that is used in a social relationship with a user. A user should also be capable of having connections

to "non-Web" resources like locations and items. For example, a user may "like" a particular musical style or "review" a particular album. The Social Web should offer a

way to avoid having identical user content stored in different social platforms. Users should be able to create, link to, and annotate social media with multiple social

applications to aggregate their social media together in designated social platforms, as well as being given the option to save the data to local storage (e.g. in their

browser). This is an extension of what is called by Berners-Lee "Linked Data" where links (connections) should be possible between arbitrary resources (anything

identified with a URI), not just hypertext web-pages. One of the most important features that will support the generation of media on the Social Web is provenance.

Provenance information should support the tracking of social media, identifying when and how it came to be posted on a given social platform and/or application on the

Web. Any such provenance information should be capable of answering questions such as "When was it originally posted?", "Where does it originate from?", and "Who

posted it?".

Problem: Fined for Consuming Social Media

Increasingly users and social platforms find themselves consuming social media, but not knowing if it's trustworthy or whether or not they can consume such social media

without a monetary fine, i.e. whether their usage breaks the content's copyright! Not knowing this information can lead to disaster. For example, re-tweets that

"pretended" to be from people in Iran were sent out during the political upheaval there in 2009 (@@QUESTION: HOW IS THE IRAN THING RELATED?!?! Copyright

VS impersonation). On a more personal note, people who download and re-use social media can now be fined huge amounts of money, but many of

them are unaware that the data was under copyright in the first place. So many users would like to have mechanisms to automatically determine whether a Web

document or resource can be used, based on the original source of the content, the licensing information associated with the resource, and any usage restrictions on that

content. Without any provenance (the information about who created the data and how it has changed over time), users cannot trust social media. This applies to social

applications themselves, whose reputation can be dependent on verifying sources, such as verifying the person or organization who created a news story in order to credit

the original source in its site, which most real-world social applications would like to do automatically for thousands of sites a day. With the increase in fines related to

social media consumption, users will want to be exceptionally well-informed about the social media they consume.

Use-case: Safely Drag-and-Drop Social Media Across Multiple Platforms

Alice enjoys taking photographs of penguins and would like to share them as widely as possible with her friends. Using an image processor on her laptop, she

fine-tunes her photos and publishes these to her personal blog using a graphical drag-and-drop interface that lets her just drop the photo into her blog and automatically

update social networking sites she uses. Since she controls not only her profile but her social media, she can easily attach a Creative Commons

Attribution-NonCommercial license and ask for a small fee of 10 cents for commercial use. As Alice explores social media, she even finds herself paying for some social media

she finds useful, and she uses a simple micropayment policy that allows her to consume up to five dollars in social media a week without having to worry about fines. She

finds herself automatically paying tiny amounts of money for some social media to help support her friends and creators she likes and she finds herself collecting

micropayments for her penguin photos, allowing her to turn her hobbies into a way to help sustain herself. Also, not only can she drag-and-drop social media safely, she

can remove social media. When she discovers she has accidentally sent a message on Twitbook that spread a false rumor about an oil spill threatening penguins, she

retracts it immediately so she does not cause a panic. Not only is the message removed from Twitbook, but it's removed from other sites that aggregated it as well!

(@@QUESTION. I don't think this section adds ANYTHING in the form of content, I vote to have it taken out)

Social Media Standards

Content

SIOC - Semantically-Interlinked Online Communities (http://sioc-project.org) - aims at developing a standard vocabulary for representing user-generated content on the

Web, using Semantic Web technologies. The SIOC ontology (a W3C member submission, but still evolving) consists of a core vocabulary (with classes such as

sioc:UserAccount and sioc:Item) and several modules, in particular a Types module, providing classes for finer-grained content description (sioct:Wiki, sioct:WikiArticle,

etc.). SIOC has strong ties with FOAF, so that it can be used to represent user-generated content of a person defined by the FOAF data format, and so that the content

can be distributed over the Web, following the decentralized Social Web vision. In addition to the vocabulary, various tools have been designed, ranging from APIs to

produce SIOC data, to systems identifying, crawling or consuming it. Also, SIOC is supported by Yahoo! SearchMonkey, and is used in Drupal7 as one of the core

vocabularies used to expose machine-readable data about a given website to the Open Web.

(@@QUESTION: I thought Yahoo! pulled SearchMonkey?)


Tagging

Tagging is a powerful and massively deployed means of categorizing content on the Web, as used for bookmarks (del.icio.us (http://del.icio.us) ), photos (Flickr

(http://flickr.com) ), videos (YouTube (http://youtube.com) ), and blog posts. Unlike more complex categorization methods, the simplicity and ease of entering natural

language keywords appeals to users. However, there are problems with interoperability. The two general approaches have been towards a common API for tagging via

the currently inactive TagCommons (http://tagcommons.org/) effort and an approach using some sort of common data-model based on RDF. There is also the rel-tag

microformat (http://microformats.org/wiki/rel-tag) , which is used to link an item to its tag(s).

Most of the data-models use a tripartite model of tagging as the relationship between a User, a Resource and a Tag.
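
The tripartite model can be written down directly as a data structure. The Python sketch below is one possible rendering, with invented URIs and hypothetical helper names, showing how the same set of tagging acts can be indexed either by tag or by user.

```python
from collections import defaultdict
from typing import NamedTuple


class Tagging(NamedTuple):
    """One tagging act: a user attaches a tag to a resource."""
    user: str
    resource: str
    tag: str


# Illustrative tagging acts; all URIs are made up.
taggings = [
    Tagging("http://example.org/alice#me", "http://example.org/photos/1", "penguin"),
    Tagging("http://example.org/alice#me", "http://example.org/photos/2", "penguin"),
    Tagging("http://example.com/bob#me", "http://example.org/photos/1", "antarctica"),
]


def resources_by_tag(items):
    """Index resources by the tag applied to them, regardless of who tagged."""
    index = defaultdict(set)
    for item in items:
        index[item.tag].add(item.resource)
    return dict(index)


def tags_by_user(items, user):
    """Return the sorted set of tags a given user has applied."""
    return sorted({item.tag for item in items if item.user == user})


tag_index = resources_by_tag(taggings)
```

Keeping the user in each record is what distinguishes this model from a plain keyword list: the same tag on the same resource by two different people remains two separate tagging acts.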

There has been a long history of tagging vocabularies (TagOntology (http://www.holygoat.co.uk/projects/tags/) , SCOT (http://scot-project.org) , MOAT (http://moat-project.org) ).

The most recent effort in the area is the CommonTag vocabulary (http://commontag.org/ns) that solves ambiguity ('apple') and heterogeneity ('socialweb',

'social_web', 'socweb') by means of an additional link to a resource that represents the tag's meaning, such as a URI from the Linking Open Data (http://linkeddata.org/) project.

NiceTag (http://ns.inria.fr/nicetag/2010/09/09/voc#) explicitly puts each tagging act inside a named graph that

receives its own URI to make it easier to add context such as where it was performed and license information [NICETAG!!]. All of these vocabularies are easily

extensible, and CommonTag is supported by various players in the Social Web area, including Yahoo! [COMMONTAG!!].

Microformats

Microformats (http://microformats.org/) are a simple way to embed semantics in ordinary HTML by re-using established HTML attributes such as 'rel', 'class', and 'rev'

with a set of string values given definition by a number of vocabularies [MICROFORMATS!!]. These vocabularies are meant to standardize common information (like

contact information in hCard) on the Social Web. For example, social sites often allow users to rate online content using some simple integer (like "1-5 stars"). The hReview

(http://microformats.org/wiki/hreview) microformat allows these ratings to be represented in a structured way. Overall, the approach to using microformats has been massively

successful in deployment, with over two billion web-pages marked up in microformats, about 5 percent of web-sites (http://www.readwriteweb.com/archives/google_semantic_web_push_rich_snippets_usage_grow.php) .
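
As an illustration of how a rating marked up with hReview might be read back out of a page, the Python sketch below looks for elements whose class attribute contains the value rating. The sample markup and the class name RatingParser are invented, and the parser assumes the rating element contains only a plain number with no nested elements; non-numeric text would raise a ValueError.

```python
from html.parser import HTMLParser


class RatingParser(HTMLParser):
    """Collect numeric text found inside elements with class 'rating'."""

    def __init__(self):
        super().__init__()
        self.ratings = []
        self._in_rating = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "rating" in classes:
            self._in_rating = True

    def handle_data(self, data):
        if self._in_rating and data.strip():
            self.ratings.append(float(data.strip()))

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the rating element.
        self._in_rating = False


# Hypothetical review fragment using hReview class names.
SAMPLE_REVIEW = (
    '<div class="hreview">'
    '<span class="item"><span class="fn">Penguin Album</span></span> '
    '<span class="rating">4</span> out of 5'
    "</div>"
)

rating_parser = RatingParser()
rating_parser.feed(SAMPLE_REVIEW)
```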

While easy to use, microformats specialize in a finite number of vocabularies, which are defined via a centralized and informal process based around

Microformats.org (http://microformats.org/) . Alternative decentralized approaches like RDFa aim at the "long-tail" of vocabularies, allowing arbitrary RDF data to

be put inside HTML. The microdata (http://dev.w3.org/html5/md/) proposal of HTML5 also lets arbitrary attribute-value pairs be put inside HTML. However,

alternative approaches to microformats have not reached widescale deployment, although RDFa is now used in Drupal and all three kinds of semantic markup are

consumed by Google Rich Snippets (http://www.google.com/webmasters/tools/richsnippets) . 94 percent (http://www.readwriteweb.com/archives

/google_semantic_web_push_rich_snippets_usag
