AI and the Future of Free Journalism

Upload: kenneth-lipp

Post on 04-Jun-2018


  • 8/14/2019 AI andThe Future of Free Journalism


    The Future of Journalism:

    Artificial Intelligence

    And Digital Identities

    Noam Lemelshtrich Latar

    Sammy Ofer School of Communications, IDC Herzliya

    Israel

    David Nordfors

    Stanford Center for Innovation and Communication

    Stanford University

    Feb 2011


    Table of Contents

    1 INTRODUCTION
    2 DEFINING JOURNALISM IN THE DIGITAL AGE
    3 ESTABLISHING THE DNA OF JOURNALISTIC CONTENT
      3.1 CONTENT BASED IMAGE RETRIEVAL (CBIR)
      3.2 VIDEO INFORMATION RETRIEVAL
      3.3 HUMAN CENTERED CONTENT ANALYSIS
      3.4 THE DNA OF LITERATURE
    4 JOURNALISM CONTENT AND CONSUMER ENGAGEMENT
      4.1 THE CONCEPT OF MEDIA ENGAGEMENT
      4.2 BEHAVIORAL TARGETING AND JOURNALISTIC CONTENT
      4.3 BEHAVIORAL TARGETING IN SOCIAL NETWORKS
      4.4 PROJECT SMART PUSH
    5 AI: DIGITAL IDENTITIES AND BEHAVIORAL TARGETING ENGINE
      5.1 MANAGING DIGITAL IDENTITIES: DEVELOPING A UNIVERSAL STANDARD
      5.2 DIGITAL IDENTITIES AND SOCIAL NETWORKS
      5.3 SOCIO-GENETICS AND DIGITAL IDENTITY
      5.4 BEHAVIORAL TARGETING AI ENGINE BASED ON JOURNALISTIC CONTENT AND CONSUMER DIGITAL IDENTITY
    6 DIGITAL IDENTITIES AND WEBLINING
    7 DIGITAL IDENTITIES AND THE PRACTICE OF JOURNALISM
    8 PRINCIPLES OF JOURNALISM AND DIGITAL IDENTITIES
      8.1 PRINCIPLES FOR USING DIGITAL IDENTITIES FOR JOURNALISM
      8.2 NEED FOR FURTHER DISCUSSION BETWEEN STAKEHOLDERS IN SOCIETY
    REFERENCES
    ABOUT THE AUTHORS
    END NOTES


    The Future of Journalism:
    Artificial Intelligence and
    Digital Identities

    Interaction between journalism, the Internet and social communities is familiar and
    intensely discussed, helping us understand how journalism can raise our collective
    intelligence. We discuss how artificial intelligence (AI) will add to that picture and
    thus influence the future of journalism. We describe 'Digital Identities' and their
    future interaction with journalism. We summarize state-of-the-art AI methods usable to
    establish the 'DNA' of journalistic content, and how matching that content with digital
    identities enables behavioral targeting for consumer engagement. We review the driving
    forces such procedures may introduce to journalism and show an example of a journalistic
    behavioral-targeting engine. We highlight some concerns and discuss how using digital
    identities and AI can be difficult to reconcile with current journalistic principles. We
    stress the need for ethical principles in using digital identities in journalism, and
    suggest examples of such principles. We issue a call for stakeholders to jointly explore
    the potential effects of AI algorithms on the journalism profession and journalism's role
    in a democratic society, and suggest questions to be explored.

    1. Introduction

    Computer-assisted intelligence is part of life: augmented intelligence[i] of
    individuals using personal computers and collective intelligence of groups when
    networking. Finally, there is Artificial Intelligence (AI), when computers act
    intelligently without human interaction, mimicking human intelligence (Turing, 1950).

    These intelligences are blending and converging. Augmented individual

    intelligence, Collective intelligence and AI are co-evolving. The Internet is

    becoming part of our minds and our minds are becoming part of the Internet.

    Journalism is part of IT-assisted intelligence. Personal computing entered

    journalism in the 80s, the Internet in the 90s, and we are now seeing the explosion

    of social interaction enter journalism, ranging from reader comments to crowdsourcing.


    The interaction between journalism, the Internet and social interaction is familiar

    and intensely discussed, helping us understand how journalism can help increase

    our collective intelligence. Here we study how AI may contribute through

    algorithms being developed for rating news, based on mixing systems for

    aggregating crowd opinions (collective intelligence) and smart algorithms for

    contextual analysis (AI).

    Ratings help control societal systems. Any recognized rating method influences
    societal development; people will try to improve their ratings. A rating that
    changes people's lives represents a complex issue. Even if everyone finds a rating
    annoying and counterproductive, it will still influence the system, as long as people
    think others recognize the rating. Indeed, journalism must scrutinize and challenge
    rating systems and explore alternatives. Intelligent algorithms that rate journalism,
    such as TechMeme[ii], strive to share in the public perception of which tech journalism
    matters more than others. This may give journalists an incentive to optimize stories to
    rank high. This is only one example of how AI is co-evolving with journalism.

    Journalism's role is to focus attention on stories that interest the public. For
    journalism to remain meaningful, it should also empower the audience. So how
    does it interact with individuals' augmented intelligence, society's collective
    intelligence and machine AI? Ideally, journalism raises intelligence, empowering
    the audience, as it uses the higher intelligence around it, i.e. the audience and the
    machines.

    AI algorithms are changing professional journalism and related academic research.

    AI is penetrating journalism's traditional pillars: journalistic content (via automatic

    content analysis in all media formats and delivery systems) and advertising (by

    measuring consumer attention and targeting ads per user digital identity or

    personality, measured by behavior). Both content and advertising are changing

    dramatically.

    The new media and AI technology based on computing's growing power are the
    change agents. Interactive new media is permitting, for the first time, accurate
    measurement of the attention each user gives to journalistic content. Advertisers
    will demand full validation of consumer ratings. Existing measuring methods will


    vanish. Fierce competition will arise in selling consumer attention to advertisers,

    whose ROI (Return On Investment) will determine the fate of channels for

    advertising, including journalism paid by ads, across all media formats.

    Journalistic content is undergoing major changes via interactive platforms that

    make media content available continually, everywhere. Until recently, the mass

    media for distributing content were controlled by the same companies that

    produced content. The traditional business model for news and entertainment

    included controlling and bundling both medium and content. But with the Internet,

    a new generation of media incumbents is arising. Companies such as Twitter,

    Facebook or Google consciously avoid producing content. They do not do

    journalism; they only provide access to journalism.

    Journalism is separating from the media (Nordfors, 2008)[iii]. The latest generation of
    producers of journalism is no longer involved in the processes or infrastructures of mass
    communication. They focus on producing content and publishing on-line,
    delivering it via the infrastructures of the new content-neutral media entities. The
    Huffington Post and TechCrunch, which started as blogs, are now large and important
    publications that do not control the infrastructure for spreading content.

    Traditional media spend hugely to measure readerships, estimating their sizes and

    attention probabilities and creating statistics and probabilities for advertisers. On

    the Internet, the new media offer content producers and advertisers not

    probabilities but hard data: which user looked at what, where, when and for how

    long. Advertisers know if a reader clicked their ad. Traditional media ads are

    indiscriminate, broadcast to all consumers and costing the same, regardless of how

    many people pay attention or act. The Internet enables contextual advertising,

    where advertisements shown to each user are selected and served by automated

    systems based on content displayed to the user.

    Monitoring users and adapting content and ads to individuals is revolutionizing

    content, media and marketing. In a digital-interactive world, marketing must

    account for media spending. The ROI in advertising and targeting content is

    becoming a science, driving development of advertising, media and content. In

    2007, global ad spending was estimated at $385B[iv], equal to the 2008 GDP of the
    world's 26th largest economy[v].


    Targeting content per consumer digital identity will require AI engines to analyze
    multi-dimensional content against attributes of the engaging experience and a
    consumer's total being, relating human DNA, content DNA and context DNA
    (attempts to identify successful music and literature DNA already exist). Research
    in biology, genetics and psychology that explores and identifies links between
    individuals' genetic codes, cognitive attributes and pro-/anti-social behavior is
    merging with data mining of Web 2.0 social-network activities aimed at
    consumer profiling. Digital Identities will integrate a person's genetic code with
    data derived from web clicks. People will pay with privacy for social networks'
    benefits.

    New AI algorithms analyze content (text, video, audio and still images) to
    annotate (tag) it automatically. Global efforts are under way to create unified
    digital-identity standards for individuals and to use AI engines to target, code and
    annotate content automatically against digital identity. This will affect journalistic
    content significantly and may revolutionize journalism and its academic research.
    Journalism must adapt and investigate new business models (Lemelshtrich Latar & Nordfors,
    2009)[vi].

    In this article, we describe digital identities and new global standards for digital

    identities, the use of social networks, genetics and virtual worlds for creating

    digital identities and the new AI research being used for adapting content to digital

    identities. Scientists are converting journalistic content to math formulations

    (signatures) to understand content and context. We probe the popular concept of
    media engagement and its derivatives, behavioral targeting and contextual targeting,
    and how AI is used in social networks to target content and ads. We also describe an AI
    engine that can filter and target journalistic content based on the consumer's digital
    identity, to maximize the ROI of every dollar spent on advertising.

    We highlight some concerns and discuss how using digital identities and AI can be

    complex versus current journalistic principles. We stress the need for ethical

    principles in using digital identities in journalism, and suggest examples of such

    principles. We issue a call for stakeholders to jointly explore the potential effects
    of AI algorithms on the journalism profession and journalism's role in a democratic


    hold on to dated one-to-many media technologies. Their business models, based on

    controlling the medium and the content, have been difficult to move to the Internet.

    In cases where new business models for ads have succeeded, such as Google, eBay or

    Craigslist, the brokering of ads is not integrated with the practice of journalism.

    Journalism's essence is described in principles of journalism, as suggested by the
    Pew Research Center's Project for Excellence in Journalism (PEJ) and the
    Committee of Concerned Journalists[viii]:

    1. Journalism's first obligation is to the truth;
    2. Its first loyalty is to the citizens;
    3. Its essence is a discipline of verification;
    4. Its practitioners must maintain independence from those they cover;
    5. It must serve as an independent monitor of power;
    6. It must provide a forum for public criticism and compromise;
    7. It must strive to make the significant interesting and relevant;
    8. It must keep the news comprehensive and proportional;
    9. Its practitioners must be allowed to practice their personal conscience.

    These principles remain, even when we no longer know what the media are.

    Consider a new, short definition of journalism, separating it from the media,

    connecting journalistic principles based on the relation between journalism and its

    audience, rather than on its relation to the communications medium it uses (which

    is what is causing the confusion today). Take, for example, the following

    suggestion (Nordfors, 2009):

    Journalism is the production of news and feature stories, bringing public

    attention to issues that interest the public. Journalism gets its mandate from

    the audience.

    Journalism must act on behalf of its audience, not its sources or advertisers.

    Journalism often has business models based on attention work (Nordfors, 2006): generating
    and brokering attention, such as selling ads. Therefore, much journalism is attention work
    performed with a mandate from the audience. When attention work is done with a mandate
    from the sources, it is public relations and publicity, not


    content automatically searches for consumers based on their digital identities. But

    first we describe current research on automatic knowledge retrieval from

    journalism multimedia content. Most research tools used by the different
    communities aim at dividing content into small digital content units, analyzing
    them, tagging these sub-content units and then carrying out an integrative analysis to
    conceptualize the entire content meaningfully for the consumers. Some researchers
    convert the visual content into mathematical formulations that can then be
    subjected to analysis employing AI algorithms (Jeon, Lavrenko, and Manmatha, 2007).

    3.1 Content Based Image Retrieval (CBIR)

    The primary method used for image retrieval, or for automatically
    conceptualizing visual content, is dividing the visual frames into smaller
    sections/regions termed 'blobs'. This is achieved by using statistical tools such as
    clustering. Each blob is annotated with text. The visual image is described by
    employing categories such as color, texture, shapes, and structures. Statistical
    theories are used to associate words with image regions that are then compared
    with human manual annotations of similar images (Smeulders et al., 2000). Attempts are
    made to describe images using a vocabulary of blobs, as proposed by Duygulu et al. (2002).
    Jeon et al. (2007) proposed using a training set of annotated images to build a
    cross-media relevance model for images.

    CBIR researchers are developing mathematical descriptions of images defined as signatures.

    The signatures describe an image in mathematical formulations to let researchers measure

    content similarities between image frames. Statistical methods such as clustering and

    classification form image signatures that will allow automatic similarity measurements by

    machines. Images are segmented by features such as color, texture differences, shapes and

    other salient points.
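    The idea of an image signature can be illustrated with a deliberately simple
    sketch. It assumes the signature is a coarse color histogram and that similarity
    is the cosine between two histograms; real CBIR systems combine texture, shape
    and region features, and the cited authors' methods are not reproduced here. The
    function names and the toy pixel data are hypothetical.

```python
import math

def color_signature(pixels, bins=4):
    """Normalized color histogram over (r, g, b) pixels: one simple image 'signature'."""
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each pixel to one of bins**3 coarse color cells.
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]

def cosine_similarity(sig_a, sig_b):
    """Similarity in [0, 1] between two signatures of equal length."""
    dot = sum(x * y for x, y in zip(sig_a, sig_b))
    norm_a = math.sqrt(sum(x * x for x in sig_a))
    norm_b = math.sqrt(sum(y * y for y in sig_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy "images" given as flat pixel lists (hypothetical data, not real frames).
sky = [(30, 60, 220)] * 90 + [(250, 250, 250)] * 10
sea = [(20, 50, 200)] * 80 + [(240, 240, 240)] * 20
forest = [(20, 160, 40)] * 95 + [(90, 60, 30)] * 5

sky_sig, sea_sig, forest_sig = (color_signature(img) for img in (sky, sea, forest))
print("sky vs sea:   ", cosine_similarity(sky_sig, sea_sig))     # close to 1
print("sky vs forest:", cosine_similarity(sky_sig, forest_sig))  # no shared color cells
```

    Because each image is reduced to a fixed-length vector, a machine can rank any
    number of frames by similarity without human annotation, which is the property
    the signature approach relies on.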


    3.4 The DNA of Literature

    Before the invention of computers (but after Boolean logic and Bayes' theorem
    laid the mathematical foundations of modern computers and algorithms), in the
    late 19th century, the French writer Georges Polti (b. 1868) analyzed the elements of
    successful literature, its DNA. Polti listed thirty-six dramatic situations in good
    drama, including prayer to the supernatural, crime pursued by vengeance, loss or
    recovery of a lost one, disaster, remorse, revolt against a tyrant, enigma and others.
    Polti's list remains popular and writers often use it in developing stories. Terry
    Rossio, the Shrek scriptwriter, said he referred to Polti's list to resolve a situation
    in the film's plot. To create his list, Polti analyzed classical Greek texts and French
    literature. His analysis of the DNA of drama was followed by other writers.

    These attempts to discover good story elements and persuasive drama were not

    written from an information-retrieval perspective but provide the literary blobs

    that will let researchers dissect, in our case, a journalism story along its content

    elements.

    These content elements, found in texts, should receive mathematical formulations

    that will allow computer-based analysis. One should be able to retrieve content

    based, for example, on Polti's thirty-six situations or preferably other sets of

    situations, which may be determined by modern data analysis of journalism stories,

    for comparative analysis or for marketing news stories to consumers based on their

    digital identities.
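    One way to read a 'mathematical formulation' of content elements is as a vector
    of per-situation scores. The sketch below is an illustration only: the keyword
    lists, the function names and the sample headlines are hypothetical, and plain
    keyword counting stands in for whatever modern data analysis would actually be
    used to detect a dramatic situation in a story.

```python
import re

# Hypothetical keyword lists for three of Polti's situations (illustrative only).
SITUATIONS = {
    "crime pursued by vengeance": {"revenge", "avenge", "retaliation"},
    "disaster": {"earthquake", "flood", "collapse"},
    "revolt against a tyrant": {"uprising", "protest", "dictator"},
}

def situation_vector(text):
    """Score a story against each situation by counting matching keywords."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {name: len(words & keywords) for name, keywords in SITUATIONS.items()}

def retrieve_by_situation(stories, situation):
    """Return the stories that match a situation, strongest match first."""
    scored = [(situation_vector(story)[situation], story) for story in stories]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [story for score, story in ranked if score > 0]

stories = [
    "Flood waters cause bridge collapse in the valley",
    "Protest grows as crowds demand the dictator resign",
    "City council approves new budget",
]

print(situation_vector(stories[0]))
print(retrieve_by_situation(stories, "disaster"))
```

    The same vectors could in principle be compared with a reader's profile, which
    is the step from content analysis to targeting that the text describes.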

    4. Journalism Content and Consumer Engagement

    4.1 The Concept of Media Engagement

    The economic engine driving journalism in the non-public service media has until

    now been advertising-based. Journalism companies, regardless of media platform

    (paper, video, audio) have sold consumer attention to advertisers. Though

    inaccurate, rating was the key measuring tool until the Internet. No pre-Internet

    rating technique can measure real attention by individual consumers to specific
    content. New media platforms, with the advance of interactive new media, make


    the competition for consumer attention fierce and complex. The journalism

    industry now needs to develop new ways to measure consumer attention in

    multiple parameters, including the consumer cognitive and behavioral profiles and

    context parameters. The interactive nature of the new media platforms begins to

    allow for scientific measurement of consumer attention along personal dimensions.

    In this new battle for consumer attention the concept of 'engagement', a relatively
    new term, is being used to describe the new relations between consumers and
    journalistic content. The Advertising Research Council (ARC) has devised the
    following definition of media engagement: "Engagement is turning on a prospect
    to a brand idea enhanced by the surrounding context ... the working definition
    proposed by ARC encapsulates the ultimate objective of linking positive effects
    towards a brand with brand advertising within the environment of the program
    content" (Kilger and Romer, 2007).

    Context within which content is delivered is becoming of prime importance. Kilger
    and Romer identified three mechanisms that enhance consumer engagement in
    journalistic content:

    - Cognitive (relevance of the program and advertisement to the consumer)
    - Emotional (the extent to which one likes the content and advertising)
    - Behavioral (paying attention to the program and advertising content) (ibid).

    The main hypothesis is that the more engaged consumers are the more they will

    spend on the advertised product. This recognition by the advertising world, that

    engagement in journalistic content involves consumer cognition, emotional profile

    and behavior, provides relevance to computer-based information retrieval as

    applied to content analysis.

    Research by Kilger and Romer (ibid) about the relationship between media engagement and
    product-purchase likelihood reveals that as engagement measures increased so did the mean
    likelihood of products advertised in the media being purchased. Three media platforms were
    studied: television, Internet and printed magazines. All three exhibited similar findings.
    "Internet and magazines exhibited very close response curves, while TV followed a similar
    path but slightly lower mean of purchase likelihood" (ibid).


    The personal parameters Kilger and Romer examined were those traditionally used in
    social-science research: gender, age, education, income, race and marital status.
    Age, income and race mattered. For TV and the Internet, people with lower
    education expressed higher levels of trust in the media and older people reported
    lower engagement. The finding that personal attributes affect media engagement is,
    as can be expected, of great relevance regarding digital identities. Digital identities
    are valuable to advertisers, who will not hesitate to take advantage of them once
    they are available on a large scale and accessible automatically. From there, the road
    to influencing journalistic content in the direction of higher consumer engagement is
    short. Kilger and Romer considered a limited number of personal parameters, as a larger
    number was "too many to fill within the space constraints of this article" (ibid).

    The Internet offers ways to measure and broker not only consumer attention and

    engagement but also consumer interaction. A pay-per-click model does this, as

    advertisers will pay not for being visible but for consumer clicks, an action. This

    can be taken further. For example, a click on an ad will usually lead to a sales site

    and may result in further interaction between the consumer and the vendor,

    including a purchase. So ads, and thus journalistic content, could in principle be paid
    for through finder's fees. This could, however, introduce business incentives for journalists

    that might jeopardize journalistic principles.
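    The difference between paying for visibility, for clicks and for completed sales
    can be made concrete with a small worked calculation. All rates and prices below
    are invented for illustration, and the function and parameter names are
    hypothetical; none of the figures comes from the sources cited in this article.

```python
def campaign_revenue(impressions, click_rate, purchase_rate, order_value,
                     cpm, cpc, finder_fee_share):
    """Publisher revenue for one campaign under three pricing models.

    cpm: price per 1,000 impressions; cpc: price per click;
    finder_fee_share: fraction of each completed sale paid to the publisher.
    """
    clicks = impressions * click_rate
    purchases = clicks * purchase_rate
    return {
        "per_impression": impressions / 1000 * cpm,
        "per_click": clicks * cpc,
        "finders_fee": purchases * order_value * finder_fee_share,
    }

# Invented example figures.
base = campaign_revenue(impressions=100_000, click_rate=0.01, purchase_rate=0.05,
                        order_value=40.0, cpm=2.0, cpc=0.25, finder_fee_share=0.10)

# Same audience, but the surrounding story nudges twice as many readers to buy.
nudged = campaign_revenue(impressions=100_000, click_rate=0.01, purchase_rate=0.10,
                          order_value=40.0, cpm=2.0, cpc=0.25, finder_fee_share=0.10)

for model in base:
    print(model, round(base[model], 2), "->", round(nudged[model], 2))
```

    Only the finder's-fee revenue changes when the purchase rate changes, which is
    exactly why that model ties a publication's income to what readers do after
    reading, and why it could put pressure on journalistic principles.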

    To convert the content-engagement/product-purchase relations into a science
    requires the analysis of many variables, including contextual ones, and requires
    automation and the introduction of artificial intelligence: "Excelling during an era
    of frugality in high expectations requires digital marketers to be accountable for
    every dollar ... The ROI focus will force agencies to improve effectiveness and
    we see increased dependence on automation ... recent shifts [in the liberal
    direction] in user privacy perceptions have created a window for marketers to use
    AI to run efficient campaigns" (ibid).

    The ultimate goal of engagement as perceived by the advertising industry is to
    target advertisements to consumers based on contextual and personal parameters as listed by
    the Kilger group: cognition, emotions and behavior. Today this is being
    done and researched in the new media channels, and is termed 'Behavioral Targeting' by
    academic researchers, journalists and the advertising industry.


    4.2 Behavioral Targeting and Journalistic Content

    In the late 90s a new marketing field gained academic and industry attention:

    Behavioral Targeting. Recent advances in Internet and Web 2.0 interactivity,

    characterized by consumers becoming content creators and providers, have opened

    new frontiers for targeting ads to consumers based on interactive behavior.

    Behavioral Targeting is the ability to deliver ads to consumers based on their

    behavior while viewing web pages, shopping on-line for products and services,

    typing keywords into search engines or combinations of all three (Aho Williamson, 2005).

    Many Internet companies are involved in behavioral targeting, including Google,
    Microsoft and Yahoo. M. Kassner (2009) surveyed Google's extensive use of behavioral
    targeting. Google confirms this on its official website. Google uses two separate systems,
    AdWords and AdSense. AdWords targets ads based on the search subject matter by identifying
    search keywords. AdSense targets ads based on website content the consumer views: "for
    example if you visit a gardening site, ads on that site may be related to gardening" (ibid).
    AdSense was extended to searching annotated images and videos in YouTube. According to
    Kassner, Google is also trying to present relevant advertisements in the Gmail
    application "by scanning every Gmail message for spam and sending ads based on the
    keywords ... the whole process is automated and involves no human matching ads to the
    Gmail content" (ibid).
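    Text-based contextual targeting of the kind described above can be sketched as
    keyword overlap between a page and each ad. This is a toy illustration and not
    how AdWords or AdSense are implemented; the stop-word list, the ad inventory and
    the function names are all hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "and", "to", "of", "in", "for", "your", "with", "on", "is", "how"}

def keywords(text, top_n=10):
    """Most frequent non-stop-words in a text, as a set."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return {word for word, _ in Counter(words).most_common(top_n)}

def pick_ad(page_text, ads):
    """Return the name of the ad sharing the most keywords with the page, or None."""
    page_keywords = keywords(page_text)
    best = max(ads, key=lambda ad: len(page_keywords & ad["keywords"]))
    return best["name"] if page_keywords & best["keywords"] else None

ads = [
    {"name": "garden tools", "keywords": {"garden", "soil", "seeds"}},
    {"name": "car insurance", "keywords": {"car", "insurance", "driver"}},
]

page = "Pruning roses in spring: how to prepare your garden soil and plant seeds"
print(pick_ad(page, ads))
print(pick_ad("Parliament debates the new budget", ads))
```

    Nothing in this matching step knows who the reader is; it looks only at the
    page. Behavioral targeting adds the reader's own history to the same kind of
    match.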

    Google's rationale is that by making ads more relevant to customers it brings them
    more value. So far, AdSense and AdWords, in all their applications, are still based

    on text analysis. Once image and video content are analyzed and annotated

    automatically, behavioral targeting will likely be applied to all journalistic content.

    4.3 Behavioral Targeting in Social Networks

    Social networks characterized by voluntary profiling by members uploading

    personal data in texts, pictures and videos are ripe for behavioral targeting. Social

    network members' profiles include lists of friends, hobbies, demographics and
    other interests. Behavioral targeting is growing rapidly in social networks. Startups
    are devising behavioral-targeting technologies for social networks.
    Stefanie Olsen (2008) describes one example: 33Across.com. The New York-based
    company's algorithms can follow consumer behavior patterns in social networks,


    identify sociograms among members and identify for advertisers the more
    influential members and the viral propagators by studying message dynamics.
    Universal Pictures is using 33Across to study how people share studio trailers or
    content with their friends (ibid). Other companies that have started to use behavioral
    targeting on social networks for marketing advertisements include Revenue
    Science and Tacoda Systems (bought by AOL, now a full subsidiary). Yahoo
    launched SmartAds, to combine behavioral information with demographic data for
    targeting ads. Behavioral targeting ad spending is projected at $1B in 2010,
    growing to $3.8B by 2011 (Mills, 2007).
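    Identifying influential members from message dynamics can be illustrated with a
    very small sociogram. The source does not describe 33Across's algorithms, so the
    sketch below assumes the simplest possible proxy: a member's influence is the
    number of distinct people they shared content with. The member names and the
    share log are hypothetical.

```python
from collections import defaultdict

def influence_scores(shares):
    """Count distinct recipients per sender in a list of (sender, receiver) shares."""
    reach = defaultdict(set)
    for sender, receiver in shares:
        reach[sender].add(receiver)
    return {member: len(receivers) for member, receivers in reach.items()}

def top_influencers(shares, n=1):
    """Members with the widest reach, ties broken alphabetically."""
    scores = influence_scores(shares)
    return sorted(scores, key=lambda member: (-scores[member], member))[:n]

# Hypothetical log of who forwarded a trailer to whom.
shares = [
    ("ana", "ben"), ("ana", "cy"), ("ana", "dee"),
    ("ben", "cy"), ("cy", "ana"), ("ana", "ben"),  # repeat shares count once
]

print(influence_scores(shares))
print(top_influencers(shares))
```

    A production system would presumably also weigh how far a message travels after
    it is forwarded; this sketch stops at direct reach.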

    Behavioral targeting raises serious privacy issues discussed extensively in
    academic literature and political circles. The issue of privacy vis-a-vis consumer
    profiling is beyond the scope of this paper. Tim Berners-Lee, credited with
    inventing the World Wide Web, spoke before the U.K. parliament on privacy and
    the Internet. He said that he came to raise awareness of the technical, legal and
    ethical implications of the interception and profiling by ISPs in collaboration with
    behavioral targeting companies (Watson, 2009). He continued: "It is very important that
    when you click, you click without a thought that a third party knows what we are
    clicking on ... I have come here to defend the Internet as a medium" (ibid).
    But surveys by TRUSTe (a privacy company) show that the public is
    willing to submit to monitoring and enhanced content delivery (Olsen, 2008). This is
    a remarkable finding that should be followed up.

    4.4 Project Smart Push

    Davitz of SRI applies machine-learning techniques to study communications in
    social networks as part of a multimillion-dollar project funded by the Defense
    Advanced Research Projects Agency (DARPA) of the U.S. Department of Defense.
    Davitz's objective was to automatically monitor people's interest and influence in
    military communities "to identify the influencers then to ensure that they see
    relevant information in news feed to that topic" (Olsen, 2008). Davitz calls this
    targeting of news according to members' interest profiles 'Smart Push'. According to
    Olsen, SRI is looking at commercial applications for it "not related to advertising ...
    you can already learn more about people from MySpace and Facebook" (ibid).


    When a powerful research institute like SRI promotes concepts like Smart Push,

    news media, when rating is king, will adjust journalistic content to fit consumers'
    digital profiles. This may be done by using an AI engine to filter or 'webline'

    services based on digital identities.

    5. AI: Digital Identities and Behavioral Targeting Engine

    5.1 Managing Digital Identities: Developing a Universal Standard

    The consumer's digital identity is a vital component in this process and will

    directly affect the type of services and information he or she will receive.

    Today, the global knowledge industry invests great resources in developing and
    improving management techniques for digital identities. Digital-identity
    management is developing rapidly and is called 'federated identity management'.
    The term 'federated identity' refers to various components of users' profiles
    gathered while they surf on different sites and consolidated into uniform profiles
    according to a global standard. The term is also used for adoption of standards for
    the consumer-identification process on the various platforms. Currently, the most
    acclaimed standard for constructing digital identity is called SAML 2.0, Security
    Assertion Markup Language 2.0[ix]. It enables consolidation of the digital identities of
    surfers on various platforms and management of those identities, and it allows
    mobilizing various parts of the surfer's identity definition, defined on different
    social networks, and merging them into one virtual profile. The standard has been
    successfully adopted by financial organizations, academic institutions, the
    American electronic government and more.
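    The consolidation step described above, in which profile fragments from
    different sites are merged into one virtual profile, can be sketched as follows.
    This is not the SAML 2.0 message format or protocol; it only assumes that each
    site reports attributes under a shared identifier. The field names
    (federated_id, site, attributes) and the sample data are hypothetical.

```python
def merge_profiles(fragments):
    """Merge per-site profile fragments that share a federated identifier."""
    merged = {}
    for fragment in fragments:
        profile = merged.setdefault(
            fragment["federated_id"], {"sites": [], "attributes": {}}
        )
        profile["sites"].append(fragment["site"])
        for key, values in fragment["attributes"].items():
            # Keep every distinct value reported for an attribute across sites.
            profile["attributes"].setdefault(key, set()).update(values)
    return merged

# Hypothetical fragments gathered on three different sites.
fragments = [
    {"federated_id": "user-42", "site": "news.example",
     "attributes": {"interests": ["politics"]}},
    {"federated_id": "user-42", "site": "social.example",
     "attributes": {"interests": ["music"], "age_band": ["25-34"]}},
    {"federated_id": "user-7", "site": "shop.example",
     "attributes": {"interests": ["gardening"]}},
]

virtual = merge_profiles(fragments)
print(virtual["user-42"])
```

    Everything hinges on the shared identifier: once sites agree on one, fragments
    that were harmless in isolation combine into a much fuller picture of a person.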

    Adoption of international standards for defining digital identities is significant. It

    will enable researchers to follow surfers in any site in cyberspace and carry out

    widespread studies on the connection between the users' digital identities and their
    personalities, fields of interest and cognitive abilities. Every surfer has a uniquely
    dynamic way of surfing, derived from the person's ability to make decisions,


    memory and additional cognitive factors, which can be rendered to automatic cognitive
    diagnosis through AI algorithms.

    Soon AI algorithms will be able to construct a personal digital identity for every

    person performing actions on the Internet. Data-mining robots will be able to

    analyze texts, video and audio contents and transform them into sociological DNA

    (SDNA) that will describe the individual personality (Lemelshtrich Latar, 2004).

    Constructing the digital identity is a dynamic process updated as long as the person is active

    on the Web.

    5.2 Digital Identities and Social Networks

    One of the main Internet uses is activity in social networks. Today, millions of

    people belong to social networks that answer many needs: social, economic and

    political. A social network is a group that maintains connection to exchange

    information in text, video, photos or voice or for social purposes. Every network

    member must give personal details about themselves, and these are exposed to the

    other network members or some of them, according to the user's choice (Boyd and Ellison,

    2007). Some major networks, originally constructed as reservoirs for content to serve the

    surfers, see their purpose today in providing services, information and products adapted to

    members' digital identities. In September 2007 the network MySpace informed its

    shareholders that it intended to undertake data mining, using the profiles and blogs of

    approximately one hundred million of its members, to direct advertisements and services to

    them. Thus, this is the start of a screening system that will provide services and information to

    members according to their digital identity (Abramovitch, 2007). The declared objective is to

    improve the membership experience on the network, to add value to the user experience

    (almost a paraphrase of Aldous Huxley in Brave New World).

    Social networks create a substantial and dangerous expansion of the digital-identity

    notion to include complete mapping of surfers' social and professional

    connections. This mapping will accompany the surfers in all human activities and

    may become a powerful filter that will limit the information and possibilities

    presented to them, without them being aware of it.

    5.3 Socio-Genetics and Digital Identity

    The mind and the body are interconnected, and science is constantly improving our

    knowledge of that connection. We know today that social behavior is linked to genetics.

    Understanding these connections, and how they work in a social context, is

    powerful for constructing digital identities and can be valuable for analyzing the

    body, mind and ecosystem surrounding them: society. So information about

    people's genetic codes may be as rewarding for constructing digital identities as

    the information from social networks.

    Research and instrumentation for mapping man's genetic code, gene sequencing,

    are developing rapidly at leading research institutes and large commercial

    companies worldwide. Their main objective is to identify genes associated with

    hereditary diseases and to develop medication based on genetic treatment. Since

    the completion of the Human Genome Project in 2003, commercial competition

    has arisen between companies for producing machines that map the genetic code of

    man. The main research project in this field is the Personal Genome Project [x].

    The connection between genes and human traits, and the entry of information-age

    giants such as Google and leading research centers such as Harvard and Cornell

    into the field of genetic research, should close the knowledge gaps much

    faster. The large volume of participants in these studies, the vast databases holding

    participants' digital identities and data mining of people's social behavior on the

    Internet, together with the use of smart algorithms, are helping science to begin to

    predict social behavior, both pro-social and anti-social, according to the genetic

    mapping of humanity.

    The model is a dynamic learning model, constantly updated as it learns the

    consumer profile and content preferences. Unknown factors are expressed by

    probabilities constantly updated in the learning process. Journalistic content will

    be monitored constantly as consumers interact and make choices. The AI engine

    will also monitor context parameters and consumers' emotional state during

    interaction by analyzing verbal or other reactions. A brief description of the

    information flow:

    Step One: All journalistic content is analyzed by AI smart algorithms and receives

    automatic annotations (tags);

    Step Two: Consumers' digital identities and annotated content are fed to the

    Assessment Rule Engine for initial content determination; proper ads are sent to

    consumers based on their profiles;

    Step Three: Consumers interact with the content and advertisements; this

    interactivity is monitored constantly and consumer attention measured;

    Step Four: The Learning Engine analyzes consumer feedback and automatically

    adjusts the probabilities to better describe consumer behavior; new content is sent

    to consumers;

    Step Five: The Learning Engine transmits updated information to a Personal

    Memory database where a consumer media profile is created and constantly

    updated;

    Steps four and five continue indefinitely to allow the AI engine to accurately

    predict consumer content and product interests/choices in varying contexts (the

    Learning Process section).
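
The five steps can be sketched as a small learning loop. This is only an illustration under stated assumptions, not the engine the authors describe: the class `MediaProfile`, the functions `annotate`, `score` and `choose`, the keyword-based tagging and the simple count-based probability update are all hypothetical stand-ins for the AI annotation, the Assessment Rule Engine, the Learning Engine and the Personal Memory database.

```python
from dataclasses import dataclass, field


@dataclass
class MediaProfile:
    """Hypothetical 'Personal Memory' record: engagement counts per content tag."""
    # tag -> (times engaged, times ignored); unseen tags are treated as (1, 1).
    counts: dict = field(default_factory=dict)

    def probability(self, tag):
        """Estimated probability that the consumer engages with content carrying `tag`."""
        engaged, ignored = self.counts.get(tag, (1, 1))
        return engaged / (engaged + ignored)

    def update(self, tag, engaged):
        """Steps Four and Five: adjust the estimate for `tag` after one observed reaction."""
        hits, misses = self.counts.get(tag, (1, 1))
        self.counts[tag] = (hits + 1, misses) if engaged else (hits, misses + 1)


def annotate(text, vocabulary):
    """Step One (toy version): tag an item with each vocabulary word it contains."""
    return sorted(set(text.lower().split()) & set(vocabulary))


def score(profile, tags):
    """Step Two: mean engagement probability over an item's tags (0.0 if untagged)."""
    if not tags:
        return 0.0
    return sum(profile.probability(tag) for tag in tags) / len(tags)


def choose(profile, tagged_items):
    """Pick the (text, tags) item that the profile currently rates highest."""
    return max(tagged_items, key=lambda item: score(profile, item[1]))


vocabulary = {"sports", "politics"}
items = [(text, annotate(text, vocabulary)) for text in (
    "Election results reshape local politics",
    "Sports final ends in a draw",
)]
profile = MediaProfile()
# Step Three (simulated): the consumer reads the sports story and skips the other one.
for text, tags in items:
    for tag in tags:
        profile.update(tag, engaged=("sports" in tags))
next_item = choose(profile, items)
```

After the simulated interaction the profile rates sports content above politics content, so the sports story is selected for the next round. Repeating the observe-and-update cycle is what the text calls the learning process.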

    6. Digital Identities and Weblining

    Filtering journalistic content against consumer profiles could lead to serious social

    inequality. Marcia Stepanek coined the term "weblining" to describe this phenomenon:

    "Call it weblining, an information-age version of that nasty practice of red lining,

    where lenders and other businesses mark neighborhoods off limits. Cyber space

    doesn't have geography but that's no impediment to weblining [...] weblining may

    permanently close doors to you or your business." (Stepanek, 2000).

    New York University sociologist Marshall Blonsky adds to the meaning of

    weblining: "If I am weblined and judged to be of minimum value, I will never

    have the product and services channeled to me or the economic opportunities that

    flow to others over the net." (ibid).

    Digital identity is at the core of weblining. Though the emphasis of Stepanek and

    Blonsky is on economic aspects of commercial organizations, the described

    phenomenon also applies to the distribution of journalistic content based on profiling. The

    economic forces (advertisers and the journalism organizations) cannot be

    expected to show altruism and create mechanisms to protect our right to equal

    accessibility to content. More seriously, no one can protect us from the effects

    that targeting content to consumer profiles will have on the quality of journalistic

    content.

    7. Digital Identities and the Practice of Journalism

    From the point of view of journalism practice, the emergence of digital identities

    suggests that publishers and journalists will be able to simulate and measure what

    their news stories will do for audiences and the other stakeholders in their

    storytelling, while they are developing the story. They would be able to test run

    stories before publication, much as advertisers now do with new product tests. This

    will introduce interesting opportunities and challenges for journalism.

    Simple contextual advertising need not threaten journalistic principles of separation

    between content production and selling audience attention to advertisers (who may

    have stakes in the stories). But as contextual advertising starts to understand

    content, context and audience better, ads will be placed precisely. Present pay-per-click

    business models will create an incentive for publishers to focus on stories that match ads. If

    so, it will threaten journalistic freedom. The classic separation of

    Church and State, the metaphor used by publishers to distinguish between news

    stories and paid advertising, will blur.

    For example, consider a situation where readers use their digital identities,

    combined with a series of filters, to select news stories they want to be brought to

    their attention. Let's say the quality of filters and digital identities is good enough

    to estimate both the chance that a story will catch the reader's attention and the

    chance it will lead to action by the reader. Now consider a set of contextual

    advertisers (these can also be digital identities) that will pay for attention and

    interaction with readers. Consider a journalist with access to these digital identities

    and filters, as well as access to the contextual advertisers, when writing a story.

    The journalist can test the story on digital identities representing both audience and

    advertisers as the story is written. The journalist can adjust the writing to receive

    the best results, a combination of what the journalist, the audience and the

    advertisers want.

    Consider, finally, that the journalist's own digital identity will be included in the

    interaction. The journalist's digital identity is combined with a set of filters for

    selecting themes that the journalist wishes to cover, connected to readers' and

    advertisers' digital identities, and exposed to a news-ticker-type flow of events,

    e.g. all the Twitter feeds, the blogosphere and all the other news feeds on the

    Internet. It can also be data flow from stock markets, sensors measuring weather or

    earthquakes, etc. The journalist can then be tipped off about events that will

    produce suitable matching between his/her own interests, and the interests of the

    audience and advertisers.

    Thus producing a successful story is equivalent to solving a dynamic equation involving

    the journalist, the audience and the business model, e.g. the advertiser. Producing a

    journalistic story while guided by the interaction between the digital identities and

    the filters can be seen as an iterative, heuristic solution of the equation, identifying

    overlapping interests and optimizing the combined actions into a result maximizing

    value for each party. In each interaction, real-life users behind the digital identities

    give feedback, reinforcing or modifying the actions of digital identities and filters, to

    improve the outcome in the next round.
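
One way to read this "dynamic equation" is as a small optimization problem, sketched below. Every modeling choice here is an assumption made for illustration and is not taken from the paper: each party's digital identity is reduced to interest weights over themes, a story is a set of emphasis weights that sum to one, a party's value is the weighted sum of the two, and "maximizing value for each party" is interpreted as raising the value of the worst-off party. Simple hill climbing stands in for the iterative heuristic, and all names (`party_value`, `combined_value`, `improve`) are hypothetical.

```python
def party_value(story, interests):
    """Value of a story for one party: emphasis on each theme times the party's interest."""
    return sum(weight * interests.get(theme, 0.0) for theme, weight in story.items())


def combined_value(story, parties):
    """Assumed objective: the value received by the worst-off party."""
    return min(party_value(story, interests) for interests in parties.values())


def improve(story, parties, step=0.05, rounds=200):
    """Hill climbing: move `step` of emphasis between two themes while the objective rises."""
    story = dict(story)
    for _ in range(rounds):
        best, best_value = None, combined_value(story, parties)
        for source in story:
            for target in story:
                if source == target or story[source] < step:
                    continue  # never push a theme's emphasis below zero
                trial = dict(story)
                trial[source] -= step
                trial[target] += step
                value = combined_value(trial, parties)
                if value > best_value + 1e-12:
                    best, best_value = trial, value
        if best is None:
            break  # no single shift helps any more: a local optimum
        story = best
    return story


# Invented interest weights standing in for the three digital identities.
parties = {
    "journalist": {"science": 1.0, "local": 0.2},
    "audience": {"local": 1.0, "science": 0.3},
    "advertiser": {"gadgets": 1.0, "science": 0.4},
}
draft = {"science": 1 / 3, "local": 1 / 3, "gadgets": 1 / 3}
revised = improve(draft, parties)
```

The feedback described in the text would enter between runs: reactions from the real people behind the identities would change the interest weights in `parties`, and `improve` would then be run again on the current draft.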

    8. Principles of Journalism and Digital

    Identities

    The interaction between digital identities, as discussed above, may improve the

    outcome for all parties involved. But it is a hazardous scenario. It needs to be

    discussed among the actors who care about journalism and its role in society.

    Looking at existing journalistic principles, at least the following can be strongly

    affected by the above scenario:

    Journalism's first loyalty is to the citizens: Journalists can be pressured to

    show loyalty to citizens' digital identities rather than to the citizens themselves.

    If each story is coupled directly to the business model, and if the business

    model builds on selling audience attention/interaction to advertisers, this can

    be a problem. It will be difficult to maintain a loyalty to the audience of

    citizens if the journalist will earn more money by adapting to the [digital

    identities of the] advertisers.

    Its practitioners must maintain independence from those they cover: It

    may be possible to involve behavioral models of those covered in the stories in

    the equation. This will improve the journalist's chances to plan a series of

    stories, knowing how the outcome of one story opens the way for the next. It will give

    journalists a tool for projecting the effects the story will have on stakeholders.

    Those covered in the story may also be advertisers or have strong, shared

    interests with advertisers. This makes the web of co-dependencies more visible

    to the journalist. In some cases this can help a journalist to be independent but

    in many other cases it will make it difficult to maintain independence.

    Its practitioners must be allowed to practice their personal conscience: If

    the business model and the system of digital identities and filters permit

    projecting how much profit a story can produce as it is written, or if they

    offer predictions of how the story will influence stakeholders in the journalism

    organization, the probability increases that journalists' personal consciences

    may conflict with business or other stakeholders' interests. In short: "If I

    write the story the way I want, my publisher will know that I chose to earn less

    money." Or: "If I write the story the way I want, my publisher will know that I

    chose to increase the risk of us getting in conflict with the advertisers."

    These are only quick, simple examples of types of issues that need to be considered

    while developing systems of digital identities and filters for journalism.

    8.1 Principles for Using Digital Identities for

    Journalism

    We suggest the need for principles for using digital identities in journalism. Some

    such principles may be:

    1. People's needs are more important than the needs of digital identities.

    Digital identities can never be identical to a person's whole being. Some

    measure of error should always be considered. People are more important

    than digital identities. Digital identities should adapt to people, not vice versa;

    2. Using digital identities in journalism should not compromise journalism's

    loyalty to the audience or its independence from sources;

    3. Using digital identities in journalism should not compromise the journalist's

    freedom to practice his/her personal conscience.

    8.2 Need for Further Discussion Between

    Stakeholders in Society

    A group of computer scientists, AI researchers and roboticists met at the Asilomar

    Conference Grounds on Monterey Bay in California to debate whether there

    should be limits on research that might lead to the loss of human control over

    computer-based systems that carry a growing share of society's workload ... their

    concern is that further advances could create profound social disruptions and even

    have dangerous consequences ... and force humans to learn to live with machines

    that increasingly copy human behaviors" (Markoff, 2009).

    The scientists were concerned about job loss or criminals accessing these tools. No

    reference was made to the possible devastating effects that using AI tools may have

    on journalistic content. The conference was organized by the Association for the

    Advancement of Artificial Intelligence (AAAI). Dr Horvitz of Microsoft, who

    organized the meeting, said he believed computer scientists must respond to the

    notions of superintelligent machines and artificial intelligence run amok ... the

    panel was seeking ways to guide research so that technology improved society

    rather than move it toward technological catastrophe" (ibid).

    It is time to organize a similar conference with computer scientists, AI experts,

    academic researchers in the area of multimedia information retrieval, journalism

    professionals and experts, social communication experts and economists who

    specialize in media business models, to explore the potential effects of AI

    algorithms on the journalism profession and its role in a democratic society. Some

    of the questions to be explored:

    1. Will people control or be controlled by their digital identities?

    2. How will the definition of journalism be influenced by digital identities?

    3. With the Internet, journalism is no longer only broadcasting but also

    interacting with readerships and facilitating public discussions. What is the

    role of journalism in society?

    4. How will journalistic principles be affected by interaction between digital

    identities?

    5. Which business models are enabled by digital identities? To what extent

    will journalists be attention workers, paid by brokering the readership's

    attention to advertisers; to what extent will they be knowledge workers,

    paid by brokering knowledge?

    6. What are suitable principles for journalism, in a situation where interaction

    with and between digital identities guides the production of journalism, the

    ways it generates value for people, and the ways it creates profits for the

    journalism industry?

    7. What is the match between journalism and journalistic business models?

    8. How will journalistic principles and matching business models be updated?

    9. How will journalistic principles, and the process for updating them, be

    implemented in an environment of digital identities?

    References

    Abramovitch, G. (2007). Myspace has data mining plans, www.dmnews.com, Sept. 24.

    Aho Williamson, D. (2005). White Paper on Behavioral Targeting, Wall Street Journal and

    eMarketer, May 11

    Barnard, K., Duygulu, P., Forsyth, D., et al. (2003). Matching Words and Pictures. Journal

    of Machine Learning Research, 3, pp. 1107-1135.

    Boyd, D. M. and Ellison, N. B. (2007). Social Network Sites: Definition, History and

    Scholarship. School of Information, UC Berkeley, and Dept. of Telecommunications

    and Information Studies, Michigan State University. Journal of Computer-Mediated

    Communication, 13(1), article 11.

    Duygulu, P., Barnard, K., de Freitas, N., and Forsyth, D. (2002). Object Recognition as

    Machine Translation: Learning a Lexicon for Fixed Image Vocabulary, Seventh

    European Conference on Computer Vision, pages 97-112.

    Flickner, M., Sawhney, H., Niblack, W., et al. (1995). Query by Image and Video Content:

    The QBIC System, Computer, 28(9), pp. 23-32.

    Jeon, J., Lavrenko, V. and Manmatha, R. (2007). Automatic Image Annotation and

    Retrieval Using Cross Media Relevance Models. Proceedings of the 26th annual

    international ACM SIGIR conference on Research and development in information

    retrieval.

    Kassner, M. (2009). Google Quietly Starts Behavioral Targeting, ZDNet Asia, April 21.

    Kilger, M. and Romer, E. (2007). Do Measures of Media Engagement Correlate with

    Product Purchase Likelihood?, Journal of Advertising Research, 47(3), pp. 313-325.

    Lemelshtrich Latar, N. (2004). Personal web social DNA and cybernetic decision making,

    Hubert Burda Center for Innovative Communications, BGU, Feb 2004, ICA conference,

    New Orleans, 2004.

    Lemelshtrich Latar, N. & Nordfors, D. (2009). "Digital Identities and Journalism Content",

    Innovation Journalism, 6(7), Nov. 11, Stanford.

    Markoff, J. (2009). Scientists Worry Machines May Outsmart Man, NYT.com, July 26.

    McCombs, M.E., and Shaw, D.L. (1972). The Agenda-Setting Function of Mass Media.

    Public Opinion Quarterly, 36, pp. 176-187.

    Mills, E. (2007). AOL buys ads from Tocada, ZDNet Asia, July 25.

    Nordfors, D. (2006). PR and the Innovation Communication System, Innovation

    Journalism, 3(5). http://www.innovationjournalism.org/archive/INJO-3-5.pdf, also

    published by Strategic Innovators (July - Sept 2007, Volume I | Issue 3).

    Nordfors, D. (2009). Innovation Journalism, Attention Work and the Innovation Economy.

    A Review of the Innovation Journalism Initiative 2003-2009, Innovation Journalism,

    6(1), http://www.innovationjournalism.org/archive/injo-6-1.pdf, Retrieved Sep 9 2009.

    Olsen, S. (2008). 33Across: The Next Generation of Behavioral ad Targeting,

    news.cnet.com, June 23.

    Salway, A. and Graham, M. (2003). Extracting Information about Emotions in Films,

    Proceedings of the Eleventh ACM International Conference on Multimedia, November 2-8,

    pp. 299-302.

    Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A., and Jain, R. (2000). Content Based

    Image Retrieval at the End of the Early Years, IEEE Transactions on Pattern Analysis

    and Machine Intelligence, 22(12), pp. 1349-1380.

    Stepanek, M. (2000). Weblining, BusinessWeek Online, April 3.

    Turing, A. (1950). Computing Machinery and Intelligence, Mind, LIX(236), pp. 433-460.

    Watson, F. (2009). Behavioral Targeting: Profiling or Projecting User Experience, Search

    Engine Watch, Mar 13.

    About the authors

    Noam Lemelshtrich Latar is the Founding Dean of the Sammy Ofer School of

    Communications at IDC Herzliya (the first private academic institution in Israel),

    and has served since 2009 as the Chairperson of the Israel Communications

    Association, which groups all media researchers in the Israeli Universities and

    Colleges. Lemelshtrich Latar received a Ph.D. in communications from MIT in

    1974 and MSc. in engineering systems at Stanford in 1971. He was among the

    founders of the Community Dialog Project at MIT, experimenting with interactive

    TV programs involving communities through electronic means. From 1975 to 2005

    Lemelshtrich Latar pioneered the teaching and research of new media at the

    Hebrew and Tel Aviv Universities. From 1999 to 2005 he was involved in the

    Israeli high-tech industry as a venture-capital chairman, helping to establish several

    communications start-ups in cognitive enhancement, data mining of consumer

    choices and home networking. In 2005 he joined IDC Herzliya Israel as founding

    Dean of a new school of communications, emphasizing new media. His current

    research interest is in digital identities and the effect of AI on journalism.

    David Nordfors is co-founding Executive Director of the Center for Innovation and

    Communication at Stanford University. He coined the terms

    "Innovation Journalism" and "Attention Work" and started the first innovation

    journalism initiatives, in Sweden and at Stanford. He is a member of the World

    Economic Forum Global Agenda Council on the Future of Journalism. Nordfors is

    adjunct professor at IDC Herzliya and visiting professor at the Monterrey Institute

    of Technology and Higher Education (Tech Monterrey). Dr. Nordfors has a Ph.D.

    in molecular quantum physics from Uppsala University, and did his postdoctoral

    research in theoretical chemistry at the University of Heidelberg. He was

    the initial Director of Research Funding of the Knowledge Foundation in Sweden

    (KK-stiftelsen). He was the first Science Editor of Datateknik, a Swedish IT

    magazine, from where he initiated and headed the first hearing about the Internet to

    be held by the Swedish Parliament.

    [i] The expression "augmented intelligence" is attributed to Engelbart, D.C. (Oct 1962).

    "Augmenting Human Intellect: A Conceptual Framework", Summary Report AFOSR-3233,

    Stanford Research Institute, Menlo Park, CA. Related concepts: 1) "IA" or "Intelligence

    Amplification" by Ashby, W.R. (1956), An Introduction to Cybernetics, Chapman and Hall,

    London, UK. Reprinted, Methuen and Company, London, UK, 1964. 2) "Man-Computer

    Symbiosis" by Licklider, J.C.R. (1960). "Man-Computer Symbiosis", IRE Transactions on

    Human Factors in Electronics, vol. HFE-1, 4-11.

    [ii] Techmeme, http://techmeme.com, arranges tech journalism story links into a single

    page. Techmeme works by scraping news websites and blogs, and then compiles a list of

    links to the most popular technology-related news of the day, which is continuously

    updated. The stories selected are all chosen by an automated process.

    http://en.wikipedia.org/wiki/Techmeme (11 Jan 2010)

    [iii] Nordfors, D. (2008). Separating Journalism and the Media, EJC Magazine, 4 Dec 2008,

    European Journalism Centre

    http://www.ejc.net/magazine/article/separating_journalism_and_the_media/

    [iv] Wikipedia. (Aug 29 2009). http://en.wikipedia.org/wiki/Advertising#cite_note-2

    "Global Entertainment and Media Outlook: 2006-2010, a report issued by global accounting

    firm PricewaterhouseCoopers". Pwc.com. Retrieved 2009-04-20.

    [v] The World Bank: World Development Indicators database, 1 July 2009. Gross domestic

    product (2008) http://siteresources.worldbank.org/DATASTATISTICS/Resources/GDP.pdf

    [vi] Lemelshtrich Latar, N. & Nordfors, D. (2009). "Digital Identities and Journalism

    Content", The Innovation Journalism Publication Series, 6(7), Nov. 11. VINNOVA-Stanford

    Research Center of Innovation Journalism, Wallenberg Hall, Stanford University.

    [vii] Compact Oxford English Dictionary, published on-line by AskOxford.com. Retrieved

    Sep 6 2009.

    [viii] PEJ and CCJ Principles of Journalism, published 1997. Available at

    http://www.journalism.org/resources/principles Retrieved 6 Sep 2009.

    [ix] Madsen, Paul, SAML2: The building blocks of federated identity, Jan 2005, xml.com.

    [x] www.personalgenomes.org.