overview of research at hp labs india

© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Overview of research at HP Labs India

Upload: turner

Post on 16-Jan-2016


Page 1: Overview of research at HP Labs India

© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

Overview of research at HP Labs India

Page 2: Overview of research at HP Labs India

HP Labs around the world

• Haifa
• Bangalore
• Beijing
• St. Petersburg
• Bristol
• Palo Alto
• Tokyo

7 locations; 600 researchers in 23 labs

20-30 large projects in 8 high-impact areas

Page 3: Overview of research at HP Labs India

High-Impact Research Areas: The next technology challenges and opportunities

• Cloud
• Information Management
• Digital Commercial Print
• Sustainability
• Immersive Interaction
• Analytics
• Intelligent Infrastructure
• Content Transformation

Page 4: Overview of research at HP Labs India

Digital Commercial Print

End State: Flexible, customized, on-demand printing that replaces the traditional distribution of mass-produced materials

HP Labs’ research contribution: Breakthrough technology to accelerate the transformation to digital commercial printing

Printing Process: Commercial-grade throughput, cost and quality

Color: Self-calibration, intuitive rendering

Data Path: Efficient processing of massive data streams

Job Creation: Automated content generation

Page 5: Overview of research at HP Labs India

Content Transformation

End State: Complete convergence of physical and digital information

HP Labs’ research contribution: Technologies to transfer content seamlessly from paper to digital and access digital content wherever paper is used today

Content Management: Intuitive, personalized organization; Intelligent content extraction; Live, interactive documents

Displays/Materials: Unbreakable, conformable, ultra-thin and lightweight; Digital with the look and feel of paper

Page 6: Overview of research at HP Labs India

Immersive Interaction

End state: Intuitive human interaction through and with technology

HP Labs’ research contribution: Radically simplify the user experience to make technology more useful, intuitive and pervasive

Intuitive Interfaces: Natural, multi-modal, computer-human interactions

Seamless Collaboration: Immersive multimedia communication – anytime, anywhere – with no physical barriers

Contextual Services: Delivering “the right thing at the right time”; Personal paradigms to simplify Web interaction

Page 7: Overview of research at HP Labs India

Information Management

HP Labs’ research contribution: Redefine the twin tasks of taming and exploiting information to revolutionize enterprise decision making

Management: Superior analysis, extraction and delivery of massive enterprise content

Intelligence: Capabilities to transform massive-scale, real-time data into transactional, operational business intelligence

End State: The vast universe of enterprise information transformed into immediate, business-relevant insight

Page 8: Overview of research at HP Labs India

Analytics

End state: Mathematical and scientific methodologies applied to create better-run businesses

HP Labs’ research contribution: Drive secure, informed, highly effective decision making

Software: Enhance automation and business processes

Services: Analytics that address and transform operational efficiency and security

Solutions: Predictive customer behavior; Individual profile learning

Page 9: Overview of research at HP Labs India

Cloud

End state: Everything-as-a-Service: Billions of users, millions of services, thousands of service providers, millions of servers, exabytes of data, terabytes of traffic

HP Labs’ research contribution: Develop an integrated cloud stack, from infrastructure to services

Infrastructure: Enterprise-grade security, capacity and management

Services: Disrupt traditional industries and offer rich, dynamic experiences

Page 10: Overview of research at HP Labs India

Intelligent Infrastructure

End state: Capture more value via dramatic computing performance and cost improvements

HP Labs’ research contribution: Radical, new approaches for collecting, storing and transmitting data to feed the exascale data center

Data Center: Cost and power efficient; Manageable, reliable; Easily programmable

Intelligent Storage: Cloud-scale, dynamic, enterprise-grade

Networks: Programmable, scalable, energy-efficient

Nanotechnology: Memristors, Sensors, Photonic Interconnect

Page 11: Overview of research at HP Labs India

Sustainability

End state: An IT industry with a light carbon footprint that drives the reduction of carbon emissions throughout the global economy

HP Labs’ research contribution: Displace conventional supply chains with sustainable IT ecosystems

Data Centers: Integrated, end-to-end management of compute, power & cooling resources from cradle to cradle

Tools & Methodologies: Reengineer existing value chains using IT to lower environmental footprint

Page 12: Overview of research at HP Labs India


2008 HP Labs Innovation Research Awards: 41 awards, 34 universities, 14 countries

Americas

• Stanford University
• University of California, Berkeley
• University of California, Davis
• University of California, San Diego
• University of California, Santa Barbara
• University of Southern California
• University of Toronto
• Carnegie Mellon University
• Massachusetts Institute of Technology
• State University of New York at Buffalo
• Rochester Institute of Technology
• University of Illinois at Urbana-Champaign
• University of Michigan
• University of Wisconsin-Madison
• Purdue University
• Georgia Institute of Technology

EMEA (Europe, Middle East & Africa)

• University of Edinburgh, Scotland
• University of Bath, England
• University of Leeds, England
• University of Bristol, England
• Konstanz University, Germany
• Technische Universitaet Muenchen, Germany
• Vrije Universiteit Amsterdam, Netherlands
• Universidade do Minho, Portugal
• Russian Academy of Sciences, Russia
• University of Saint-Petersburg, Russia
• Bilkent University, Turkey
• Technion, Israel Institute of Technology, Israel

APJ (Asia-Pacific & Japan)

• Indian Institute of Technology, Madras, India
• Indian Institute of Technology, Bombay, India
• National Institute of Informatics, Japan
• Peking University, China
• Tsinghua University, China
• Nanyang Technological University, Singapore

Page 13: Overview of research at HP Labs India

Open cloud computing research test bed

• A loose federation of “Centers of Excellence” around the globe
  − UIUC, Singapore IDA, KIT: 3 initial CoE
  − HP, Intel, Yahoo: 3 initial sponsors with CoE
• Research objectives
  − Multi-datacenter, multi-geography, multi-tenancy, secure, massive scale, open test bed
• Each center: 1000-4000 cores and up to PB storage
  − Base service: PRS (physical resource set)
  − Required services: Open EC2-like, S3, and Hadoop-on-demand
  − Plus additional local extensions/variants/service types

Page 14: Overview of research at HP Labs India


HP Labs India

Page 15: Overview of research at HP Labs India

Gesture-based keyboard (GKB)

Page 16: Overview of research at HP Labs India

PrintCast (block diagram)

Uplink side: Data from PC; AV Signal; Inserter; Encoder; Modulator; Up converter; Solid State Power Amplifier; Uplink Dish

Downlink side: Receiver Dish & LNBC; Set Top Box; Television; PrintCast Decoder; Printer

Page 17: Overview of research at HP Labs India

Paper & IT convergence

Secure AiO

Page 18: Overview of research at HP Labs India

HP Labs India

• Three ongoing projects
  − Simplifying web consumption for the next billion (SWAN) – Remainder of this talk
  − Intuitive multimodal and gestural interaction (IMAGIN)
  − Paper in the digital enterprise (PRIDE)

Page 19: Overview of research at HP Labs India

SWAN project - Motivation

Simplifying web consumption for all

Web is useful but complex to use for non-tech-savvy people

Web has to be useful in the mobile context as well

Page 20: Overview of research at HP Labs India

Why is web consumption complex?

• Each web site forces its own cognitive model on the user
  − Website decides the interaction model, user has to learn it & remember it
  − Different websites of the same genre each impose their own model
• Web requires very “low” level instructions
  − Information access is through a query and manual filtering approach
  − Content adaptation, e.g. translation, requires a lot of technical skills
• Mobile web consumption is challenging
  − User’s frame of mind is different (limited attention span, distracted)
  − Devices are resource challenged
• Broken web experience across different access methods
  − No experience continuity across broadband, mobile & disconnected connectivity

Page 21: Overview of research at HP Labs India

State of the art

Figure labels: Web Widgets; Alerts; Personalized web pages; Mobile environments; Passive consumption; Web Simplification; Personalized Web Content; Browser Scripting; Mashups; chumby; Pipes

The Gap: Need to simplify personal web interactions, especially for mobile environments

Page 22: Overview of research at HP Labs India

Technical Goals

• Users to set their own preferred interaction pattern
  − Enabling users to easily express their own web interaction patterns
  − Providing a familiar interface to all personal actions on the web
• Higher level intent while interacting with services
  − Implicit web content consumption based on higher user intent expression, user feedback and user profile
  − Understanding and translating user intent to web actions
• Always responsive interactions
  − Providing continual interaction across multiple devices & connectivity situations
  − Providing ‘Responsive-Behavior’ despite disconnections

Page 23: Overview of research at HP Labs India

Approach

Create simple interactions for long-term and exploratory information needs

End user value: Simplify the “Intent -> Query -> Goal” cycle

Diagram labels: Intent, Query, Goal; User Profiles; Query expansion; Aggregation, ranking; Summarization; Google, Youtube, Digg/Delicious

Page 24: Overview of research at HP Labs India

Using user profiles to personalize services

Diagram labels: User; Data Collection; Explicit and Implicit info; Profile Constructor; User Profile; Application; Personalized services (Search, news, video, shopping)

Page 25: Overview of research at HP Labs India

Aren’t online portals already doing this?

• Online portals and search engines build user profiles using cookies and other stored data (search keywords, web pages accessed)
  − However, they don’t see all the user data
  − No way for users to aggregate and reuse the profiles different websites (Google, Yahoo, ..) build using their data
  − Privacy is a big problem

Page 26: Overview of research at HP Labs India

Implicit profile construction - Prior approaches and their limitations

• Word based approach
  − Use words in user documents to represent user interests
  − Problems
    • Words appear independent of page content (“Home”, “page”)
    • Polysemy and synonymy
    • Large profile sizes
• DMOZ approach
  − Use existing ontology maintained for free
  − Problems
    • Too large (about 6 lakh, i.e. 600,000, DMOZ nodes); the ontology has to be drastically pruned for use
    • Need to build classifiers for each DMOZ node

Page 27: Overview of research at HP Labs India

Our approach

• Use Wikipedia as the language of profile representation, map user documents to Wikipedia concepts
  − Has bias lower than DMOZ and variance lower than words
• Build a hierarchical profile based on Wikipedia
• Tag the profile concepts as transactional or recreational
• Compute recency of user interests in a particular topic

Page 28: Overview of research at HP Labs India

Mapping documents (web pages) to Wikipedia concepts

Item: “Sony to slash PlayStation3 price”
Term vector representation: <sony:1>, <slash:1>, <playstation3:1>, <price:1>

Item: “Jittery Sony Knocks $100 Off PS3 Price Tag”
Term vector representation: <jittery:1>, <sony:1>, <knocks:1>, <ps3:1>, <price:1>, <tag:1>

Query against the index of a Wikipedia dump: Sony to slash PlayStation3 price

Additional features: titles of the retrieved articles

1. PlayStation Network Platform
2. PlayStation 2
3. Ducks demo
4. PlayStation 3
5. PlayStation
6. Ken Kutaragi
7. PlayStation Portable
8. Console manufacturer
9. Sony Group
10. Crystal Dynamics
11. PlayStation 3 accessories
12. …
13. …
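The mapping step on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `TOY_WIKIPEDIA` is a hypothetical three-article stand-in for the index of a Wikipedia dump, and TF-IDF cosine similarity is an assumed retrieval model, since the slides do not say which one was used.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for an index of a Wikipedia dump: title -> article text.
TOY_WIKIPEDIA = {
    "PlayStation 3": "playstation3 ps3 sony console price games",
    "Sony Group": "sony electronics company playstation3 console maker",
    "Yoga": "yoga pranayama meditation practice india",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_to_concepts(item, articles, top_k=2):
    """Return the titles of the top_k articles most similar to the item."""
    docs = {title: Counter(tokenize(body)) for title, body in articles.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for tf in docs.values() for term in tf)
    # Smoothed inverse document frequency (an assumed weighting).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    def weigh(tf):
        # Terms that never occur in the index carry no weight.
        return {t: c * idf[t] for t, c in tf.items() if t in idf}

    query = weigh(Counter(tokenize(item)))
    scored = [(cosine(query, weigh(tf)), title) for title, tf in docs.items()]
    scored = [(s, t) for s, t in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


print(map_to_concepts("Sony to slash PlayStation3 price", TOY_WIKIPEDIA))
print(map_to_concepts("Jittery Sony Knocks $100 Off PS3 Price Tag", TOY_WIKIPEDIA))
```

Although the two news items share few terms, both land on the same article title in this toy index, which is the point of using concepts rather than words as profile features.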

Page 29: Overview of research at HP Labs India

Term Vector vs Wikipedia profiles

Words in TF * IDF based user profile: Search, Home, Help, News, Privacy, Google, Terms, New, Page, Use, Web, View, Results, Information, Account

Concepts in Wikipedia based user profile: Text Retrieval Conference, HTML element, Bank of America, Google search, ICICI Bank, IDBI Bank, Bank fraud, Artificial neural network, Web crawler, Web design, Debit card, Extensible Markup Language, Hewlett-Packard, Microsoft, XHTML, Demand account

Page 30: Overview of research at HP Labs India

Constructing the hierarchical profile

Algorithm of Xu et al. [WWW 2007]

Photography (15)

Nature photography (10)

Wild life photography (5)

The number in parentheses is the support (# pages mapped to this concept)

Page 31: Overview of research at HP Labs India

Tagging concepts in user profiles

• Two types of tags
  − Whether the concept is of commercial or recreational interest
  − Recency of interest
• Tagging commercial interest
  − Crawl shopping site pages, map pages to concepts and label these concepts as commercial interests
• Tagging recreational interest
  − Use topics in Wikipedia recreational/hobby categories
• Recency of interest = Σ 1/e^(today − time the page supporting the topic was last accessed), summed over the pages supporting the topic
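As a worked example of the recency formula, here is a small sketch. The slide does not state the time unit, so measuring the age in whole days is an assumption, and the function name `recency_of_interest` is hypothetical.

```python
import math
from datetime import date


def recency_of_interest(last_access_dates, today):
    """Sum of 1 / e^(age) over the pages supporting a topic.

    Age is taken in whole days since the page was last accessed;
    the unit is an assumption, the slide does not specify it.
    """
    return sum(math.exp(-(today - accessed).days) for accessed in last_access_dates)


today = date(2010, 1, 10)
# One page accessed today and one accessed yesterday: 1 + 1/e.
score = recency_of_interest([date(2010, 1, 10), date(2010, 1, 9)], today)
print(round(score, 4))
```

Under this reading a page read today contributes 1, and each additional day of age shrinks its contribution by a factor of e, so the score is dominated by recently visited pages.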

Page 32: Overview of research at HP Labs India

Wikipedia based profile

Page 33: Overview of research at HP Labs India

Evaluation results

Figure 1: Stability (Stability_alpha, Stability_date) against the number of web pages in cache

Figure 2: Precision for profile elements with Support > 5, 3 < Support < 5 and Support < 3

Figure 3: Percentage in profile and precision at hierarchy Levels 1 to 6

• Profiles are stable (fig 1)

• Profile elements with high support have high precision (fig 2)

• Profile elements at all levels of the hierarchy have similar precision (fig 3)

• Bookmarks are not a good data source for profiles

Page 34: Overview of research at HP Labs India

Query expansion – Personalized video

• Approach
  − Create three additional queries (based on terms with high TF in title, tags and description)
  − Evaluate which expansion is better
• Example: Query on Youtube for “trains”
• Expansion using
  − Title: train+osbourne+midnight+bullet+rollin+mystery+maglev
  − Description: train+runaway+record+version+video+http+track
  − Tags: train+railroad+guitar+osbourne+railway+bullet
• Cross-lingual expansions
  − Baba Ramdev
  − Baba+ramdev+yoga+swami+pranayam+liye+ram+disease+dev+india+dhyan
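The expansion idea on this slide, appending the highest-TF terms of one result field to the original query, might look like the sketch below. The stopword list, the sample titles and the function name `expand_query` are all hypothetical; no stemming is applied, so the seed query here is "train" rather than "trains".

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for the illustration.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "on", "for"}


def expand_query(query, field_texts, n_terms=3):
    """Append the n_terms most frequent field terms to the query, joined by '+'."""
    seed = re.findall(r"[a-z0-9]+", query.lower())
    counts = Counter()
    for text in field_texts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    # The seed terms are already in the query, so drop them from the candidates.
    for term in seed:
        counts.pop(term, None)
    # Highest term frequency first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "+".join(seed + [term for term, _ in ranked[:n_terms]])


# Hypothetical titles of videos returned for the seed query.
titles = [
    "Crazy Train Ozzy Osbourne",
    "Bullet train in Japan",
    "Osbourne crazy train live",
    "Maglev bullet train",
]
print(expand_query("train", titles))  # train+bullet+crazy+osbourne
```

Running the same function over the tags and the descriptions of the results would give the other two candidate expansions, which can then be compared against each other.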

Page 35: Overview of research at HP Labs India

Query expansion - “Find similar”

Problem – Can we construct queries to make getting “similar content” easier?

Approach - Identify key phrases for a text document, query a standard search engine, rank results

• Retrieving the original document: the query capture restart+capture random+random walk+page rank+capture random walk+restart retrieves Hopcroft’s talk at rank 1 in Google

Query - Ed Lazowska’s talk

Result – Hopcroft’s talk

Page 36: Overview of research at HP Labs India

Query expansion – “Find similar”

economic growth

global development

economic history

economic governance

adam smith

good governance

economic growth process

modern technology

economic+growth+global+development+history+governance+adam+smith+process+rich+good+new+knowledge+cgd+brief+world+property+rights+productivity+labor+human+capital+getting+use+modern+technology+trade+barriers+public+goods+poor+countries+machine+natural+resources+research+intellectua

Query

Page 37: Overview of research at HP Labs India

Aggregating search results

• Current search interfaces are geared to immediate gratification, with no way to trade off search latency for more relevant results
• Different search engines have different coverage, with no way to benefit from this
• Navigation of results requires clicking back and forth on search results
  − Search result snippets are often misleading

Page 38: Overview of research at HP Labs India

Our solution

• To create an aggregated and personalized Information Retrieval (IR) system that
  − compiles and consolidates the most relevant information on particular topic(s) from the web
  − automatically creates a PDF document on the topic

Page 39: Overview of research at HP Labs India

Ranking results

• Content Based Ranking (CBR), based on TF, IDF, Document Boost, Field Boost
• Delicious Vector Cosine Similarity (DVCS)

Rank(URL) = d * CBR + (1 - d) * DVCS
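The ranking formula above is a linear blend of two scores. A minimal sketch follows, assuming that DVCS is the cosine similarity between two tag-weight vectors (for example a user's tags and a URL's Delicious tags) and that d lies in [0, 1]; neither detail is spelled out on the slide, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse tag-weight vectors (dicts)."""
    dot = sum(w * b.get(tag, 0.0) for tag, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_score(cbr, dvcs, d=0.5):
    """Rank(URL) = d * CBR + (1 - d) * DVCS, with d assumed to lie in [0, 1]."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must be between 0 and 1")
    return d * cbr + (1.0 - d) * dvcs


# Hypothetical tag vectors: the user's interests and the tags on one URL.
user_tags = {"photography": 2.0, "travel": 1.0}
url_tags = {"photography": 1.0, "camera": 1.0}
dvcs = cosine_similarity(user_tags, url_tags)
print(rank_score(cbr=0.8, dvcs=dvcs, d=0.25))
```

With d close to 1 the ordering follows the content-based score; with d close to 0 it follows the tag similarity.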

Page 40: Overview of research at HP Labs India

User Interface User study results

Page 41: Overview of research at HP Labs India

Document summarization using Wikipedia

Query against the index of Wikipedia content: Sony to slash PlayStation3 price

Additional features: titles of the retrieved articles

1. PlayStation Network Platform
2. PlayStation 2
3. Ducks demo
4. PlayStation 3
5. PlayStation
6. Ken Kutaragi
7. PlayStation Portable
8. Console manufacturer
9. Sony Group
10. Crystal Dynamics
11. PlayStation 3 accessories
12. …
13. …

Sentence-concept mapping (rows: sentences S1 to S3; columns: concepts C1 to C4)

     C1  C2  C3  C4
S1   1   0   1   0
S2   0   1   1   0
S3   0   0   0   1

In degree = 2 (concept C3, reached from S1 and S2)

Algorithm 1

Document sentences mapped to Wikipedia concepts

Uses in degree of concept-sentence bipartite graph for sentence selection

Tested on DUC 2002 data from NIST

Would have come in 3rd in the NIST challenge

Limitations

- Controlling size of the summary

- General concepts (e.g. Sports) may win over specific concepts (e.g. Soccer)
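The in-degree idea behind Algorithm 1 can be illustrated on the 3 x 4 sentence-concept matrix from this slide. The slide does not say how in-degrees turn into a sentence score, so the rule used here (score a sentence by the summed in-degree of its concepts, break ties by document order) is an assumption, and the function names are hypothetical.

```python
def concept_in_degrees(matrix):
    """In-degree of each concept: the number of sentences mapped to it."""
    return [sum(column) for column in zip(*matrix)]


def select_sentences(matrix, k=1):
    """Pick k sentences by the summed in-degree of their concepts (assumed rule)."""
    in_degree = concept_in_degrees(matrix)
    scores = [
        sum(deg for hit, deg in zip(row, in_degree) if hit)
        for row in matrix
    ]
    # Highest score first; ties go to the earlier sentence.
    order = sorted(range(len(matrix)), key=lambda i: (-scores[i], i))
    # Return the chosen sentence indices in document order.
    return sorted(order[:k])


# Rows are sentences S1..S3, columns are concepts C1..C4.
mapping = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
print(concept_in_degrees(mapping))     # [1, 1, 2, 1]
print(select_sentences(mapping, k=2))  # [0, 1]
```

The sketch also shows the second limitation listed on the slide: a broad concept that many sentences touch gets a high in-degree and pulls those sentences into the summary ahead of ones about a narrower concept.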

Page 42: Overview of research at HP Labs India

Document summarization - Algorithm 2

Intuition : Important sentences in the document map to important concepts and vice versa

Propagate sentence importance to concepts and concept importance to sentences over multiple iterations

Future work – Size of summary, multi-document summaries, Indian language summaries

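Update rule: $x_n^{t+1} = f(x_n^t, G)$

Accumulate step: $y_m^t = \sum_{n=1}^{N} w_{mn} \, x_n^t$

Broadcast step: $x_n^t = \sum_{m=1}^{M} w_{nm} \, y_m^t$

(The edge-weight symbol is not legible in the transcript; $w$ is assumed here.)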
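The mutual reinforcement described for Algorithm 2 resembles a HITS-style iteration over the sentence-concept bipartite graph. The sketch below is written under that assumption: unit edge weights, a fixed number of iterations and L2 normalisation after each round are choices made for the illustration, not details taken from the slides, and `propagate` is a hypothetical name.

```python
import math


def propagate(matrix, iterations=20):
    """Alternate importance between sentences and concepts.

    matrix[s][c] is 1 when sentence s maps to concept c.
    Returns (sentence_scores, concept_scores).
    """
    n_sentences = len(matrix)
    n_concepts = len(matrix[0])
    sentence = [1.0] * n_sentences
    concept = [0.0] * n_concepts
    for _ in range(iterations):
        # Concepts accumulate the importance of the sentences that mention them.
        concept = [
            sum(matrix[s][c] * sentence[s] for s in range(n_sentences))
            for c in range(n_concepts)
        ]
        # Sentences receive the importance of the concepts they mention.
        sentence = [
            sum(matrix[s][c] * concept[c] for c in range(n_concepts))
            for s in range(n_sentences)
        ]
        # Normalise so the scores stay bounded (an assumed detail).
        norm = math.sqrt(sum(v * v for v in sentence)) or 1.0
        sentence = [v / norm for v in sentence]
    return sentence, concept


mapping = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
sentence_scores, concept_scores = propagate(mapping)
print([round(v, 3) for v in sentence_scores])
```

On this small matrix the two sentences that share a concept reinforce each other and end with equal, high scores, while the isolated third sentence decays towards zero.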

Page 43: Overview of research at HP Labs India

Challenge 1

• Better intent expression
• Multi-lingual query reformulation
  − Baba Ramdev
  − Baba+ramdev+yoga+swami+pranayam+liye+ram+disease+dev+india+dhyan
• Interfaces to simplify feedback for query reformulation

Page 44: Overview of research at HP Labs India

Challenge 2

• Long standing queries
• Queries spread over time
  − Learning photography
  − Information delivery needs to be incremental and non-repetitive
  − Video retrieval
    • Channels
    • Create initial stickiness
    • Ensure ongoing interest
  − Caching – Utility models
• What are good evaluation measures for such systems?

Page 45: Overview of research at HP Labs India

Challenge 3

• Document summarization
  − Extracting leads
  − Compression versus missed information
  − Cross lingual summarization

Page 46: Overview of research at HP Labs India