Introduction to IR Research

Page 1: Introduction to IR Research

2008 © ChengXiang Zhai

Introduction to IR Research

ChengXiang Zhai
Department of Computer Science
Graduate School of Library & Information Science
Institute for Genomic Biology, Statistics
University of Illinois, Urbana-Champaign

http://www.cs.uiuc.edu/homes/czhai, [email protected]

Page 2: Introduction to IR Research

Outline

1. What is research?

2. How to prepare yourself for IR research?

3. How to identify and define a good IR research problem?

4. How to formulate and test IR research hypotheses?

5. How to write and publish an IR paper?

Page 3: Introduction to IR Research

Part 1. What is research?

Page 4: Introduction to IR Research

What is Research?

• Research
  – Discover new knowledge
  – Seek answers to questions
• Basic research
  – Goal: Expand man’s knowledge (e.g., which genes control the social behavior of honey bees?)
  – Often driven by curiosity (but not always)
  – High-impact examples: relativity theory, DNA, …
• Applied research
  – Goal: Improve the human condition (i.e., improve the world) (e.g., how to cure cancers?)
  – Driven by practical needs
  – High-impact examples: computers, transistors, vaccinations, …
• The boundary is vague; the distinction isn’t important

Page 5: Introduction to IR Research

Why Research?

[Figure: diagram relating Curiosity, Basic Research, Applied Research, and Application Development to Amount of knowledge, Advancement of Technology, Utility of Applications, and Quality of Life]

Page 6: Introduction to IR Research

Where’s IR Research?

[Figure: diagram relating Basic Research, Applied Research, and Application Development to Amount of knowledge, Advancement of Technology, Utility of Applications, and Quality of Life, with Information Science and Computer Science marked on it]

Page 7: Introduction to IR Research

Where’s Your Position?

[Figure: diagram relating Basic Research, Applied Research, and Application Development to Amount of knowledge, Advancement of Technology, Utility of Applications, and Quality of Life]

Different positions benefit from different collaborators

Page 8: Introduction to IR Research

Research Process

• Identification of the topic (e.g., Web search)

• Hypothesis formulation (e.g., algorithm X is better than Y=state-of-the-art)

• Experiment design (measures, data, etc) (e.g., retrieval accuracy on a sample of web data)

• Test hypothesis (e.g., compare X and Y on the data)

• Draw conclusions and repeat the cycle of hypothesis formulation and testing if necessary (e.g., Y is better only for some queries, now what?)

Page 9: Introduction to IR Research

Typical IR Research Process

• Look for a high-impact topic (basic or applied)
• New problem: define/frame the problem
• Identify weaknesses of existing solutions, if any
• Propose new methods
• Choose data sets (often a main challenge)
• Design evaluation measures (can be very difficult)
• Run many experiments (need to have clear research hypotheses)
• Analyze results and repeat the steps above if necessary
• Publish research results

Page 10: Introduction to IR Research

Research Methods

• Exploratory research: identify and frame a new problem (e.g., “a survey/outlook of personalized search”)
• Constructive research: construct a (new) solution to a problem (e.g., “a new method for expert finding”)
• Empirical research: evaluate and compare existing solutions (e.g., “a comparative evaluation of link analysis methods for web search”)
• The “E-C-E cycle”: exploratory → constructive → empirical → exploratory → …

Page 11: Introduction to IR Research

Types of Research Questions and Results

• Exploratory (Framework): What’s out there?
• Descriptive (Principles): What does it look like? How does it work?
• Evaluative (Empirical results): How well does a method solve a problem?
• Explanatory (Causes): Why does something happen the way it happens?
• Predictive (Models): What would happen if xxx?

Page 12: Introduction to IR Research

Solid and High Impact Research

• Solid work:
  – A clear hypothesis (research question) with a conclusive result (either positive or negative)
  – Clearly adds to our knowledge base (what can we learn from this work?)
  – Implications: a solid, focused contribution is often better than a non-conclusive broad exploration
• High impact = high-importance-of-problem * high-quality-of-solution
  – high impact = open up an important problem
  – high impact = close a problem with the best solution
  – high impact = major milestones in between
  – Implications: question the importance of the problem, and don’t just be satisfied with a good solution; make it the best

Page 13: Introduction to IR Research

Part 2. How to prepare yourself for IR research?

Page 14: Introduction to IR Research

What It Takes to Do Research

• Curiosity: allows you to ask questions
• Critical thinking: allows you to challenge assumptions
• Learning: takes you to the frontier of knowledge
• Persistence: so that you don’t give up
• Respect for data and truth: ensures your research is solid
• Communication: allows you to publish your work
• …

Page 15: Introduction to IR Research

Learning about IR

• Start with an IR textbook (e.g., Manning et al., Grossman & Frieder, a forthcoming book from UMass, …)
• Then read “Readings in IR” by Karen Sparck Jones and Peter Willett
• And read the papers recommended in the following article: http://www.sigir.org/forum/2005D/2005d_sigirforum_moffat.pdf
• Read other papers published in recent IR/IR-related conferences

Page 16: Introduction to IR Research

Learning about IR (cont.)

• Getting more focused
  – Choose your favorite sub-area (e.g., retrieval models)
  – Extend your knowledge about related topics (e.g., machine learning, statistical modeling, optimization)
• Stay at the frontier:
  – Keep monitoring the literature in both IR and related areas
• Broaden your view: keep an eye on
  – Industry activities
    • Read about industry trends
    • Try out novel prototype systems
  – Funding trends
    • Read requests for proposals

Page 17: Introduction to IR Research

Critical Thinking

• Develop a habit of asking questions, especially “why” questions
• Always try to make sense of what you have read/heard; don’t let any question pass by
• Get used to challenging everything
• Practical advice
  – Question every claim made in a paper or a talk (can you argue the other way?)
  – Try to write two opposite reviews of a paper (one mainly to argue for accepting the paper and the other for rejecting it)
  – Force yourself to challenge one point in every talk that you attend and raise a question

Page 18: Introduction to IR Research

Respect Data and Truth

• Be honest with the experiment results
  – Don’t throw away negative results!
  – Try to learn from negative results

• Don’t twist data to fit your hypothesis; instead, let the hypothesis choose data

• Be objective in data analysis and interpretation; don’t mislead readers

• Aim at understanding/explanation instead of just good results

• Be careful not to over-generalize (for both good and bad results); you may be far from the truth

Page 19: Introduction to IR Research

Communications

• General communication skills:
  – Oral and written
  – Formal and informal
  – Talk to people with different levels of background

• Be clear, concise, accurate, and adaptive (elaborate with examples, summarize by abstraction)

• English proficiency

• Get used to talking to people from different fields

Page 20: Introduction to IR Research

Persistence

• Work only on topics that you are passionate about
• Work only on hypotheses that you believe in
• Don’t draw negative conclusions prematurely and give up easily
  – Positive results may be hidden in negative results
  – In many cases, negative results don’t completely reject a hypothesis
• Be comfortable with criticism of your work (learn from negative reviews of a rejected paper)
• Think of possibilities for repositioning a work

Page 21: Introduction to IR Research

Optimize Your Training

• Know your strengths and weaknesses
  – Strong in math vs. strong in system development
  – Creative vs. thorough
  – …

• Train yourself to fix weaknesses

• Find strategic partners

• Position yourself to take advantage of your strengths

Page 22: Introduction to IR Research

Part 3. How to identify and define a good IR research problem?

Page 23: Introduction to IR Research

What is a Good Research Problem?

• Well-defined: Would we be able to tell whether we’ve solved the problem?
• Highly important: Who would care about the solution to the problem? What would happen if we don’t solve the problem?
• Solvable: Is there any clue about how to solve it? Do you have a baseline approach? Do you have the needed resources?
• Matching your strength: Are you in a good position to solve the problem?

Page 24: Introduction to IR Research

Challenge-Impact Analysis

[Figure: chart with axes “Level of Challenges” and “Impact/Usefulness”, with “Known” and “Unknown” marked. Labels in the chart:]
• Good applications: not interesting for research
• High impact, low risk (easy): good short-term research problems
• High impact, high risk (hard): good long-term research problems
• Difficult basic research problems, but questionable impact
• Low impact, low risk: bad research problems (may not be publishable)
• “Entry point” problems

Page 25: Introduction to IR Research

Optimizing “Research Return”: Pick a Problem Best for You

[Figure: diagram labeled “Your Passion”, “High (Potential) Impact”, “Your Strength”, and “Best problems for you”]

Find your passion: If you don’t have to work/study for money, what would you do?

Test of impact: If you are given $1M to fund a research project, what would you fund?

Find your strength: If you don’t know your strength, at least avoid your weakness; acquire strength through training

Page 26: Introduction to IR Research

How to Find a Problem?

• Application-driven (find a nail, then make a hammer)
  – Identify a need of people/users that cannot be satisfied well currently (“complaints” about current data/information management systems?)
  – How difficult is it to solve the problem?
    • No big technical challenges: do a startup
    • Lots of big challenges: write a research proposal
  – Identify one technical challenge as your topic
  – Formulate/frame the problem appropriately so that you can solve it
• Aim at a completely new application/function (find a high-stakes nail)

Page 27: Introduction to IR Research

How to Find a Problem? (cont.)

• Tool-driven (hold a hammer, and look for a nail)
  – Choose your favorite state-of-the-art tools
    • Ideally, you have a “secret weapon”
    • Otherwise, bring tools from area X to area Y
  – Look around for possible applications
  – Find a novel application that seems to match your tools
  – How difficult is it to use your tools to solve the problem?
    • No big technical challenges: do a startup
    • Lots of big challenges: write a research proposal
  – Identify one technical challenge as your topic
  – Formulate/frame the problem appropriately so that you can solve it
• Aim at an important extension of the tool (find an unexpected application and use the best hammer)

Page 28: Introduction to IR Research

How to Find a Problem? (cont.)

• In practice, you do both in various ways

– You talk to people in application domains and identify new “nails”

– You take courses and read books to acquire new “hammers”

– You check out related areas for both new “nails” and new “hammers”

– You read visionary papers and the “future work” sections of research papers, and then take a problem from there

– …

Page 29: Introduction to IR Research

Three Basic Questions to Ask about an IR Problem

• Who are the users?
  – Everyone vs. a small group of people
• What data do we have?
  – Web (whole web vs. sub-web)
  – Email (public email vs. personal email)
  – Literature (general vs. special discipline)
  – Blog, forum, …
• What functions do we want to support?
  – Information access vs. knowledge acquisition
  – Decision and task support

Everyone (who has an Internet connection)

The whole web (indexed by Google)

Search (by keywords)

Page 30: Introduction to IR Research

The Data-User-Service (DUS) Triangle

[Figure: triangle with corners labeled Users, Data, and Services]

Page 31: Introduction to IR Research

Many Ways to Connect DUS Triangle! (Map of IR Applications)

[Figure: map connecting data, users, and services into applications. Labels in the figure:]
• Data: Web pages, Literature, Organization docs, Blog articles, Product reviews, Customer emails, …
• Users: Everyone, Scientists, UIUC Employees, Online Shoppers, Customer Service People
• Services: Search, Browsing, Alert, Mining, Task/Decision support
• Applications: Web Search, Enterprise Search, Literature Assistant, Opinion Advisor, Customer Rel. Man.

Page 32: Introduction to IR Research

Today’s Search Engine

[Figure: diagram with the labels User, Data/Text, Services, Bag of words, Search, and Keyword Queries]

Page 33: Introduction to IR Research

Where Do We Want to Be?

[Figure: diagram extending the current search engine. Labels in the figure:]
• Current Search Engine: Bag of words, Search, Keyword Queries
• Access, Mining, Task Support (Full-Fledged Text Info. Management)
• Entities-Relations, Knowledge Representation (Large-Scale Semantic Analysis)
• Search History, Complete User Model (Personalization (User Modeling))

Page 34: Introduction to IR Research

High-Level Challenges in IR

• How to make use of imperfect IR techniques to do something useful?
  – Save human labor (e.g., partially automate a task)
  – Create “add on” value (e.g., literature alert)
  – A lot of HCI issues (e.g., allowing users to control)
• How to develop robust, effective, and efficient methods for a particular application?
  – Methods need to “work all the time” without failure
  – Methods need to be accurate enough to be useful
  – Methods need to be efficient enough to be useful

Page 35: Introduction to IR Research

Challenge 1: From Search to Information Access

• Search is only one way to access information

• Browsing and recommendation are two other ways

• How can we effectively combine these three ways to provide integrated information access?

• E.g., artificially linking search results with additional hyperlinks, “literature pop-ups”…

Page 36: Introduction to IR Research

Challenge 2: From Information Access to Task Support

• The purpose of accessing information is often to perform some tasks

• How can we go beyond information access to support a user at the task level?

• E.g., automatic/semi-automatic email reply for customer service, literature information service for paper writing (suggest relevant citations, term definitions, etc), comparing prices for shoppers

Page 37: Introduction to IR Research

Challenge 3: Support Whole Life Cycle of Information

• A life cycle of information consists of “creation”, “storage”, “transformation”, “consumption”, “recycling”, etc

• Most existing applications support one stage (e.g., search supports “consumption”)

• How can we support the whole life cycle in an integrated way?

• E.g., Community publication/subscription service (no need for crawling, user profiling)

Page 38: Introduction to IR Research

Challenge 4: Collaborative Information Management

• Users (especially similar users) often have similar information needs

• Users who have explored the information space can share their experiences with other users

• How to exploit the collective expertise of users and allow users to help each other?

• E.g., allowing “information annotation” on the Web (“footprints”), collaborative filtering/retrieval, …

Page 39: Introduction to IR Research

Look for New IR Research Questions

• Driven by new data: X is a new type of data emerging (e.g., X = blogs vs. news)
  – How is X different from existing types of data?
  – What new issues/problems are raised by X?
  – Are existing methods sufficient for solving old problems on X? If not, what are the new challenges?
  – What new methods are needed?
  – Are old evaluation measures adequate?
• Driven by new users: Y is a set of new users (e.g., ordinary people vs. librarians)
  – How are the new users different from old ones? What new needs do they have?
  – Can existing methods work well to satisfy their needs? If not, what are the new challenges?
  – What new functions are appropriate for Y?
• Driven by new tasks (not necessarily new users or new data): Z is a new task (e.g., social networking, online shopping)
  – What information management functions are needed to better support Z?
  – Can these new functions be reduced to old ones? If not, what are the new challenges?

Page 40: Introduction to IR Research

General Steps to Define a Research Problem

• Generate and Test
• Raise a question
• Novelty test: Figure out to what extent we know how to answer the question
  – There’s already an answer to it: Is the answer good enough?
    • Yes: not interesting, but can you make the question more challenging?
    • No: your research problem is how to get a better answer to the raised question
  – No obvious answer: you’ve got an interesting problem to work on
• Tractability test: Figure out whether the raised question can be answered
  – I can see a way to answer it or potentially answer it: you’ve got a solvable problem
  – I can’t easily see a way to answer it: Is it because the question is too hard or you’ve not worked hard enough? Try to reframe the problem to make it easier
• Evaluation test: Can you obtain a data set and define measures to test solutions/answers?
  – Yes: you’ve got a clearly defined problem to work on
  – No: can you think of any way to indirectly test the solutions/answers? Can you reframe the problem to fit the data?
• Every time you reframe a problem, try to do all three tests again.

Page 41: Introduction to IR Research

Rigorously Define Your Research Problem

• Exploratory: what is the scope of exploration? What is the goal of exploration? Can you rigorously answer these questions?

• Descriptive: what does it look like? How does it work? Can you formally define a principle?

• Evaluative: can you clearly state the assumptions about data collection? Can you rigorously define measures?

• Explanatory: how can you rigorously verify a cause?

• Predictive: can you rigorously define what prediction is to be made?

Page 42: Introduction to IR Research

Frame a New Computation Task

• Define basic concepts

• Specify the input

• Specify the output

• Specify any preferences or constraints
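One way to make the four steps above concrete is to write the task down as a typed function signature. The sketch below is only an illustration, using expert finding as the example task. The names Document, ExpertFindingInput, and find_experts, and the naive term-count baseline, are assumptions made here and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Document:
    """Basic concept: a text unit written by one author (hypothetical schema)."""
    doc_id: str
    author: str
    text: str


@dataclass(frozen=True)
class ExpertFindingInput:
    """Input of the task, plus one explicit constraint."""
    topic: str                 # input: a topic described by keywords
    documents: List[Document]  # input: the collection to search
    k: int                     # constraint: return at most k candidates


def find_experts(task: ExpertFindingInput) -> List[str]:
    """Output: author names ranked by a naive term-count score.

    This is a placeholder baseline, not a proposed method.
    """
    terms = task.topic.lower().split()
    scores: Dict[str, int] = {}
    for doc in task.documents:
        words = doc.text.lower().split()
        hits = sum(words.count(term) for term in terms)
        scores[doc.author] = scores.get(doc.author, 0) + hits
    # Preference: higher score first; ties broken alphabetically.
    ranked = sorted(scores, key=lambda author: (-scores[author], author))
    return [author for author in ranked if scores[author] > 0][: task.k]
```

Writing the input and output types before any method makes it easier to check whether a proposed solution actually addresses the stated task.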

Page 43: Introduction to IR Research

From a new application to a clearly defined research problem

• Try to picture a new system, and thus clarify what new functionality is to be provided and what benefit you’ll bring to a user
• Among all the system modules, which are easy to build and which are challenging?
• Pick a challenge and try to formalize the challenge
  – What exactly would be the input?
  – What exactly would be the output?
• Is this challenge really a new challenge (not immediately clear how to solve it)?
  – Yes: your research problem is how to solve this new problem
  – No, it can be reduced to some known challenge: are existing methods sufficient?
    • Yes: not a good problem to work on
    • No: your research problem is how to extend/adapt existing methods to solve your new challenge
• Tuning the problem

Page 44: Introduction to IR Research

Tuning the Problem

[Figure: chart with axes “Level of Challenges” and “Impact/Usefulness”, with “Known” and “Unknown” marked. Labels in the chart:]
• Make a hard problem easier
• Make an easy problem harder
• Increase impact (more general)

Page 45: Introduction to IR Research

“Short-Cut” for starting IR research

• Scan the most recently published papers to find papers that you like or can understand
• Read such papers in detail
• Track down background papers to increase your understanding
• Brainstorm ideas for extending the work
  – Start with ideas mentioned in the future work part
  – Systematically question the solidness of the paper (have the authors answered all the questions? Can you think of questions that aren’t answered?)
  – Is there a better formulation of the problem?
  – Is there a better method for solving the problem?
  – Is the evaluation solid?
• Pick one new idea and work on it

Page 46: Introduction to IR Research

Part 4. How to formulate and test IR research hypotheses?

Page 47: Introduction to IR Research

Formulate Research Hypotheses

• Typical hypotheses in IR:
  – Hypothesis about user characteristics (tested with user studies or user-log analysis, e.g., clickthrough bias)
  – Hypothesis about data characteristics (tested by fitting actual data, e.g., Zipf’s law)
  – Hypothesis about methods (tested with experiments):
    • Method A works (or doesn’t work) for task B under condition C by measure D (feasibility)
    • Method A performs better than method A’ for task B under condition C by measure D (comparative)
• Introducing baselines naturally leads to hypotheses
• Carefully study the existing literature to figure out where exactly you can make a new contribution (what do you want others to cite your work as?)
• The more specialized a hypothesis is, the more likely it’s new, but a narrow hypothesis has lower impact than a general one, so try to generalize as much as you can to increase impact
• But avoid over-generalizing (claims must be supported by your experiments)
• Tuning hypotheses
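As an illustration of testing a hypothesis about data characteristics by fitting actual data, the sketch below estimates the slope of log frequency against log rank for a list of tokens. Under Zipf’s law that slope should be close to -1. The function name zipf_slope and the plain least-squares fit are assumptions made for illustration, not a procedure prescribed by the slides.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Assumes the token list contains at least two distinct terms.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Usage: tokens would normally come from a real collection; this toy list
# is only there to make the call concrete.
tokens = "the cat and the dog and the bird saw the cat".split()
print(zipf_slope(tokens))
```

A slope far from -1 on a large, real collection would count as evidence against the hypothesis for that data; a toy list like the one above is too small to say anything.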

Page 48: Introduction to IR Research

Procedure of Hypothesis Testing

• Clearly define the hypothesis to be tested (include any necessary conditions)
• Design the right experiments to test it (experiments must match the hypothesis in all aspects)
• Carefully analyze results (seek understanding and explanation rather than just description)
• Unless you’ve got a complete understanding of everything, always attempt to formulate a further hypothesis to achieve better understanding

Page 49: Introduction to IR Research

Clearly Define a Hypothesis

• A clearly defined hypothesis helps you choose the right data and right measures

• Make sure to include any necessary conditions so that you don’t overclaim

• Be clear about any justification for your hypothesis (testing a random hypothesis requires more data than testing a well-justified hypothesis)

Page 50: Introduction to IR Research

Design the Right Experiments

• Flawed experiment design is a common cause of rejection of an IR paper (e.g., a poorly chosen baseline)
• The data should match the hypothesis
  – A general claim like “method A is better than B” would need a variety of representative data sets to prove
• The measure should match the hypothesis
  – Multiple measures are often needed (e.g., both precision and recall)
• The experiment procedure shouldn’t be biased
  – Comparing A with B requires using an identical procedure for both
  – Common mistake: baseline method not tuned or not tuned seriously
• Test multiple hypotheses simultaneously if possible (for the sake of efficiency)
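The points about multiple measures and an identical procedure can be sketched in a few lines. The example below scores two hypothetical runs with the same evaluation function and reports both precision and recall at a cutoff k. The function names, the dictionary layout of the runs and judgments, and the toy data are assumptions for illustration only.

```python
def precision_recall_at_k(ranking, relevant, k):
    """Precision and recall of the top-k documents in one ranked list."""
    top = ranking[:k]
    hits = sum(1 for doc in top if doc in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(run, qrels, k):
    """Mean precision@k and recall@k over every query that has judgments.

    A query missing from the run is scored as an empty ranking, so both
    methods are always averaged over the same set of queries.
    """
    precisions, recalls = [], []
    for query, relevant in qrels.items():
        p, r = precision_recall_at_k(run.get(query, []), relevant, k)
        precisions.append(p)
        recalls.append(r)
    n = len(qrels)
    return sum(precisions) / n, sum(recalls) / n


# Toy judgments and two hypothetical runs, scored with the same procedure.
qrels = {"q1": {"d1", "d2"}, "q2": {"d3"}}
run_a = {"q1": ["d1", "d4", "d2"], "q2": ["d5", "d3", "d6"]}
run_b = {"q1": ["d4", "d5", "d1"], "q2": ["d3", "d5", "d6"]}
for name, run in (("A", run_a), ("B", run_b)):
    print(name, evaluate(run, qrels, k=2))
```

Routing both methods through one evaluate function is a simple guard against accidentally giving one of them a different cutoff, query set, or judgment file.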

Page 51: Introduction to IR Research

Carefully Analyze the Results

• Do the significance test if possible/meaningful
• Go beyond just getting a yes/no answer
  – If positive: seek evidence to support your original justification of the hypothesis
  – If negative: look into reasons to understand how your hypothesis should be modified
  – In general, seek explanations of everything!
• Get as much as possible out of the results of one experiment before jumping to run another
  – Don’t throw away negative data
  – Try to think of alternative ways of looking at data
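As one possible instance of the significance test mentioned above, the sketch below runs a two-sided exact sign test on paired per-query scores of two methods. The slides do not say which test to use; the sign test is chosen here only because it is short and makes few assumptions, and the function name and toy scores are hypothetical.

```python
from math import comb


def sign_test_p_value(scores_a, scores_b):
    """Two-sided exact sign test on paired per-query scores.

    Queries where both methods tie are dropped, as is usual for this test.
    """
    wins = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    losses = sum(1 for a, b in zip(scores_a, scores_b) if a < b)
    n = wins + losses
    if n == 0:
        return 1.0
    tail = min(wins, losses)
    # Probability of a split at least this uneven under a fair coin.
    p_one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * p_one_sided)


# Usage with made-up per-query average precision values.
ap_new = [0.50, 0.60, 0.70, 0.80, 0.90, 0.40, 0.30, 0.65]
ap_base = [0.40, 0.50, 0.60, 0.70, 0.80, 0.30, 0.20, 0.55]
print(sign_test_p_value(ap_new, ap_base))
```

A small p-value only says the wins and losses are unlikely to be a coin flip; it does not explain why the method helps, which is the part the slide asks you to dig into.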

Page 52: Introduction to IR Research

Modify a Hypothesis

• Don’t stop at the current hypothesis; try to generate a modified hypothesis to further discover new knowledge

• If your hypothesis is supported, think about the possibility of further generalizing the hypothesis and test the new hypothesis

• If your hypothesis isn’t supported, think about how to narrow it down to some special cases to see if it can be supported in a weaker form

Page 53: Introduction to IR Research

Derive New Hypotheses

• After you finish testing some hypotheses and reaching conclusions, try to see if you can derive interesting new hypotheses
  – Your data may suggest an additional (sometimes unrelated) hypothesis; you get a by-product
  – A new hypothesis can also logically follow a current hypothesis or help further support a current hypothesis
• New hypotheses may help find causes:
  – If the cause is X, then H1 must be true, so we test H1

Page 54: Introduction to IR Research

Part 5. How to write and publish an IR paper?

Page 55: Introduction to IR Research

When to Write a Paper?

• Survey/Review paper:
  – An emerging field or topic has appeared (i.e., a hot topic) but no survey is available, or sufficient new development has occurred such that existing surveys are out of date
  – You’ve read and digested enough papers about the topic
• Original research paper: when you have sufficient results to draw an interesting conclusion or answer an interesting research question, i.e., you’ve got a basic story to tell, e.g.,
  – A new problem, a solution, and results showing how good the solution is
  – An old problem, a new solution, and results showing advantage(s) of the new solution over the old ones
  – An old problem, many old solutions, and results showing an understanding of their relative performance
  – In general, a research question and an answer

Page 56: Introduction to IR Research

Before you write any paper, be clear about the targeted readers

Page 57: Introduction to IR Research

Typical Structure of a Survey Paper

• Introduction:
  – Motivation for the survey
    • An emerging field/topic, but no survey available
    • Surveys exist, but they are out of date (e.g., due to new development in a field/topic)
  – Scope of the survey
• Background (if necessary)
• Conceptual framework (based on synthesis of the literature)
  – Define basic concepts, terminology, etc.
  – Give a big picture of the topic so that your survey is coherent

Page 58: Introduction to IR Research

Typical Structure of a Survey Paper (cont.)

• Systematic review of existing work
  – It’s very important that you have some clear structure for this part
    • The structure is usually your conceptual framework, or
    • other meaningful structures (e.g., by time or some way to classify all the work)
  – Be critical! Add your opinions about the work surveyed
  – Don’t treat every work equally; elaborate on some representative work and simply give pointers to other work
• Summary
  – Summarize the progress and the state of the art
  – Give recommendations if any (e.g., for practitioners)
  – Outlook (remaining challenges, future directions)

• References

Page 59: Introduction to IR Research

Typical Structure of a Research Paper

• 1. Introduction
  – Background discussion to motivate your problem
  – Define your problem
  – Argue why it’s important to solve the problem
  – Identify the knowledge gap in existing work or point out deficiencies of existing answers/solutions
  – Summarize your contributions
  – Briefly mention potential impact
• Tips:
  – Start with sentences understandable to almost everyone
  – Tell the story at a high level so that the entire introduction is understandable to people with no/little technical background in the topic
  – Use examples if possible

Page 60: Introduction to IR Research

Typical Structure of a Research Paper (cont.)

• 2. Previous/Related work
  – Sometimes this part is included in the introduction or appears later
  – Previous work = work that you extend (readers must be familiar with it to understand your contribution)
  – Related work = work related to your work (readers can wait until later in the paper to know about it)
• Tips:
  – Make sure not to miss important related work
  – Always safer to include more related work
  – Discuss the existing work and its connection to your work
    • Your work extends …
    • Your work is similar to … but differs in that …
    • Your work represents an alternative way of …
  – Whenever possible, explicitly discuss your contribution in the context of existing work

Page 61: Introduction to IR Research

Typical Structure of a Research Paper (cont.)

• 3. Problem definition/formulation
  – Clearly define your problem
    • If it’s a new problem, discuss its relation to existing related problems
    • If it’s an old problem, cite the previous work
  – Justify why you define the problem in this way
  – Discuss challenges in solving the problem
• Tips:
  – Give both an informal description and a formal description if possible
  – Make sure that you mention any assumption you make when defining the problem (e.g., your focus may be on studying the problem in certain conditions)

Page 62: Introduction to IR Research

Typical Structure of a Research Paper (cont.)

• 4. Overview of the solution(s) (can be merged with the next part)
  – Give a high-level informal description of the proposed solutions or solutions you study
  – Use examples if possible
• 5. Specific components of your solution(s)
  – Be precise (formal description helps)
  – Use intuitive descriptions to help people understand it
• Tips:
  – Make sure that you organize this part so that it’s understandable to people with various backgrounds
  – Don’t just throw in formulas; include high-level intuitive descriptions whenever possible

Page 63: Introduction to IR Research

Typical Structure of a Research Paper (cont.)

• 6. Experiment design: make sure you justify it
  – Data set
  – Measures
  – Experiment procedure
• Tips:
  – Give enough details so that people can reproduce your experiments
  – Discuss limitations/biases if any, and discuss their potential influence on your study

Page 64: Introduction to IR Research

Typical Structure of a Research Paper (cont.)

• 7. Result analysis:
  – Organize it based on research questions to be answered or hypotheses tested
  – Be comprehensive, but focus on the major conclusions
  – Include “standard” components
    • Baseline comparison
    • Individual component analysis
    • Parameter sensitivity analysis
    • Individual query analysis
    • Significance test
  – Discuss the influence of any bias or limitation
• Tips
  – Don’t leave any question unanswered (try to provide an explanation for all the observed results)
  – Discuss your findings in the context of existing work if possible
    • Similar observations have also been made in …
    • This is in contrast to … observed in … One explanation is ….
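The “individual query analysis” component listed above can start from something as small as the sketch below, which sorts queries by how much a new method changes the score relative to a baseline. The function name, the dictionary layout, and the toy scores are assumptions for illustration; the scores could be any per-query measure.

```python
def per_query_deltas(scores_new, scores_base):
    """Return (query, delta) pairs sorted from most hurt to most helped.

    Only queries scored by both methods are compared.
    """
    shared = scores_new.keys() & scores_base.keys()
    deltas = [(q, scores_new[q] - scores_base[q]) for q in shared]
    return sorted(deltas, key=lambda item: (item[1], item[0]))


# Usage with made-up per-query scores.
new = {"q1": 0.75, "q2": 0.25, "q3": 0.5}
base = {"q1": 0.5, "q2": 0.5, "q3": 0.5}
for query, delta in per_query_deltas(new, base):
    print(query, delta)
```

Reading the queries at both ends of this list is often a quick way to find an explanation for an average that looks flat.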

Page 65: Introduction to IR Research

Typical Structure of a Research Paper (cont.)

• 8. Conclusions and future work
  – Summarize your contributions
  – Discuss their potential impact
  – Discuss their limitations and point out directions for future work

• 9. References

Page 66: Introduction to IR Research

Tips on Polishing your Paper

• Start with the core messages you want to convey in the paper and expand your paper by following the core story
• Try to convey the core messages at different levels so that people with different knowledge backgrounds can all get them
• Try to write a review of your paper yourself, commenting on its originality, technical soundness, significance, evaluation, etc., and then revise the paper if needed
• Check out reviewers’ instructions, e.g., the following: http://nips07.stanford.edu/nips07reviewers.html (not necessarily matching your conference, but should share a lot of common requirements)

• Try to polish English as much as you can

Page 67: Introduction to IR Research

What an IR reviewer often looks for

• Most important factors:
  – Realistic setup of a retrieval problem
    • What kind of users would benefit from your research?
  – Solid evaluation of methods
    • Truly state-of-the-art baseline
    • Careful selection of data sets
      – Use as many representative data sets as possible
      – Always use a standard data set (e.g., TREC) if possible
    • Careful definition of measures
    • Unbiased experiment procedure
• General factors:
  – Quality of argument, novelty, writing, …
  – Avoid all kinds of careless mistakes! (If you aren’t careful about writing, it’s possible you aren’t careful about your experiments either.)

Page 68: Introduction to IR Research

Where to Publish IR Papers

• Core IR conferences:
  – ACM SIGIR, ACM CIKM
  – ECIR, AIRS
• Core IR journals
  – ACM TOIS, IRJ
  – IPM, JASIS
• Web Applications
  – WWW, WSDM
• Other related conferences
  – Natural Language Processing: HLT, ACL, NAACL, COLING, EMNLP
  – Machine Learning: ICML, NIPS
  – Data Mining: KDD, ICDM
  – Databases: SIGMOD, VLDB, ICDE

• …

Page 69: Introduction to IR Research

After You Get Reviews Back

• Carefully classify comments into:
  – Unreasonable comments (e.g., misunderstanding):
    • Try to improve the clarity of your writing
  – Reasonable comments
    • Constructive: easy to implement
    • Non-constructive: think about it, and either argue the other way or mention the weakness of your work in the paper
• If the paper is accepted
  – Take the last chance to polish the paper as much as you can
  – You’ll regret it if you later discover an inaccurate statement or a typo in your published paper
• If the paper is rejected
  – Digest comments and try to improve the research work and the paper
  – Run more experiments if necessary
  – Don’t try to please reviewers (the next reviewer might say something opposite); instead use your own judgment and use their comments to help improve your judgment
  – Reposition the paper if necessary (again, don’t reposition it just because a reviewer rejected your original positioning)

Page 70: Introduction to IR Research

Summary

• Research is about discovering and increasing our knowledge (innovation & understanding)
• Intellectual curiosity and critical thinking are extremely important
• Work on important problems that you are passionate about
• Aim at becoming a top expert on one topic area
  – Obtain complete knowledge about the literature on the topic (read all the important papers and monitor the progress)
  – Write a survey if appropriate
  – Publish one or more high-quality papers on the topic

• Don’t give up!

Page 71: Introduction to IR Research

Good Luck!
