detecting the missing information in misinformation

Detecting the Missing Information in Misinformation Emre Kıcıman [email protected] @emrek Microsoft Research


Page 1: Detecting the Missing Information in Misinformation

Detecting the Missing Information in Misinformation

Emre Kıcıman

[email protected] @emrek

Microsoft Research

Page 2: Detecting the Missing Information in Misinformation

Collaborators

Rohail Syed, U. Michigan

Michael Golebiewski, Microsoft

Sudha Rao, Microsoft

Bruno Abrahao, NYU

Bhaskar Mitra, Microsoft

Page 3: Detecting the Missing Information in Misinformation

Misinformation, disinformation, and fake news

• Misinformation: information that is false or misleading

• Disinformation: misinformation created with intent to harm

• Related: fake news, satire, mal-information, propaganda, clickbait, …, credibility, reliability

Page 4: Detecting the Missing Information in Misinformation

Effects of misinformation

• Individual decisions

• Polarization

• No trust

Page 5: Detecting the Missing Information in Misinformation

Misinformation: What’s being done?

• Fact checking, automated detection
  • Human, diffusion, network based, reputation, content based

• Mitigating creation and spread
  • Legal and platform efforts

• Education
  • General population, media org.

• Research into effects

Page 6: Detecting the Missing Information in Misinformation

Web Search Perspective

• Search engines: Trusted representation of the web

• Conflicting principles?
  • Enable information access
  • Don’t mislead people

• Key difference with other platforms: People have a query

• If a person knows something exists but the search engine doesn’t show it, this can cause distrust and feed a conspiratorial mindset

• One approach:
  • Show misinformation if someone searches for it directly
  • Perhaps label it clearly, and/or interleave it with responses and checks
  • Avoid showing it otherwise (i.e., don’t aid discovery and spread)

Page 7: Detecting the Missing Information in Misinformation

Missing information

Page 8: Detecting the Missing Information in Misinformation

Missing information in misinformation

• Fake news is often missing details to substantiate the story
  • Often low quality and/or short

• Focus on emotion and reaction

• A similar phenomenon is found in fake reviews, which also lack detail [Ott, Choi, Cardie, Hancock, 2011]

Page 9: Detecting the Missing Information in Misinformation

Fake news example

Judge Mahal al Alallaha-Smith of the 22nd District Federal Court of Appeals ruled this morning that two “critical issues for Muslims” in Sharia Law had to be abided by in the United States court system because of the systematic infusion clause and because the 14th Amendment guarantees them the rights guaranteed by other states.

“[…] With that as precedent, understanding that a higher court may reverse it, my decision is that items one and two on the docket are allowable between family members as prescribed by Sharia Law.”

https://www.snopes.com/fact-check/muslim-federal-judge-sharia/

Page 10: Detecting the Missing Information in Misinformation

Fake news example

Where is the “22nd District Federal Court of Appeals”?

Page 11: Detecting the Missing Information in Misinformation

Fake news example

What is “the systematic infusion clause”?

Page 12: Detecting the Missing Information in Misinformation

Fake news example

What are the “other states”?

Page 13: Detecting the Missing Information in Misinformation

Fake news example

What is the “higher court”?

Page 14: Detecting the Missing Information in Misinformation

Fake news example

What is the docket number? Who is suing whom?

Page 15: Detecting the Missing Information in Misinformation

How can we detect missing information?

Adapt models built for Question Answering tasks

E.g. SQuAD models https://rajpurkar.github.io/SQuAD-explorer/

Given a question and a passage of text, these models either find the answer in the text or report that it is missing.
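A minimal sketch of how an extractive QA model can "say it is missing", assuming the common SQuAD 2.0 convention in which position 0 of the token sequence stands for "no answer" and the best span score is compared against that null score. The slides do not say how the model used here makes this decision; the function name, the `threshold` and `max_answer_len` parameters, and the plain-list logits are illustrative assumptions.

```python
from typing import List, Optional, Tuple


def best_span_or_null(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
    threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best answer span,
    or None when the 'no answer' option scores at least as well.

    Assumption: index 0 is the special token whose start+end score
    represents the null (unanswerable) prediction.
    """
    null_score = start_logits[0] + end_logits[0]

    best_score = float("-inf")
    best_span: Optional[Tuple[int, int]] = None
    # Score every candidate span that starts after the null position.
    for start in range(1, len(start_logits)):
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score, best_span = score, (start, end)

    # Report "missing" unless the best span beats the null score by the margin.
    if best_span is None or best_score - null_score <= threshold:
        return None
    return best_span
```

Raising `threshold` makes the model more willing to report that the information is missing, which is the behavior this talk relies on.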

Page 16: Detecting the Missing Information in Misinformation

Approach

Usually-Answerable Questions (UAQ): For a class of articles, a set of template questions that reliable news articles usually answer

Article Evaluation: How many UAQs are answerable by the article?

Ground-truth human evaluation

Scalable QA model evaluation

Qualitative Outcome: Explain score by showing UAQs themselves
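The scoring step above can be sketched as follows, assuming the article score is simply the fraction of UAQs that are answerable (the slides ask "how many" without fixing a formula). The `is_answerable` callback is a hypothetical stand-in for either a human judgment or a QA model; the unanswered questions are returned so the score can be explained by example.

```python
from typing import Callable, List, Tuple


def uaq_score(
    article: str,
    questions: List[str],
    is_answerable: Callable[[str, str], bool],
) -> Tuple[float, List[str]]:
    """Return (fraction of questions answerable, list of unanswered questions).

    `is_answerable(question, article)` is an assumed interface; plug in a
    crowd-worker label lookup or a QA model's answer/no-answer decision.
    """
    unanswered: List[str] = []
    answered_count = 0
    for question in questions:
        if is_answerable(question, article):
            answered_count += 1
        else:
            unanswered.append(question)

    # An article with no applicable questions gets a score of 0.0 by convention here.
    score = answered_count / len(questions) if questions else 0.0
    return score, unanswered


if __name__ == "__main__":
    # Hypothetical judgments for illustration only.
    known_answerable = {"Who is suing whom?"}
    qs = ["What is the docket number?", "Who is suing whom?"]
    score, missing = uaq_score("article text", qs, lambda q, a: q in known_answerable)
    print(f"{score:.0%} of UAQs answerable; unanswered: {missing}")
```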

Page 17: Detecting the Missing Information in Misinformation

Preliminary Experiments and Results

Simple NER-driven template questions

• X:person → Where does <X> work?

• Y:profession → What is the <Y>’s name?

• …
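A small sketch of template instantiation, assuming a named-entity recognizer has already produced `(text, label)` pairs. Only the two templates shown on the slide are included; the `TEMPLATES` table, the label strings, and the function name are illustrative, and the NER step itself is out of scope here.

```python
from typing import Dict, List, Tuple

# Templates keyed by entity type, taken from the two examples on the slide.
TEMPLATES: Dict[str, List[str]] = {
    "person": ["Where does {entity} work?"],
    "profession": ["What is the {entity}'s name?"],
}


def generate_questions(entities: List[Tuple[str, str]]) -> List[str]:
    """Fill every template that matches each entity's type.

    `entities` is assumed to be NER output as (surface text, type label).
    Entity types without a template are skipped.
    """
    questions: List[str] = []
    for text, label in entities:
        for template in TEMPLATES.get(label, []):
            questions.append(template.format(entity=text))
    return questions
```

For the example article, an entity list such as `[("Mahal al Alallaha-Smith", "person"), ("judge", "profession")]` would yield one question per entity.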

Gather 5000+ fake and real news articles, randomly subsample

Ground-truth human evaluation of question-answering

Scalable QA model evaluation

Page 18: Detecting the Missing Information in Misinformation

Preliminary Evaluation: Crowd-workers

Is the Q answerable by article?

Crowdsourced eval

• 3 judgments

Result: Real news answers more UAQs than fake.

Varies based on Q

Question               Avg. Fake   Avg. Real   p-val
Overall                24%         39%         3E-7
Where is the X?        12%         52%         0.001
Where is X?            12%         42%         0.008
What happened in X?    29%         63%         0.009
When did X happen?     12%         40%         0.011
Who was in X?          28%         57%         0.024
Where was X?           11%         33%         0.027
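One way the per-class averages in a table like this could be computed from three crowd judgments per question is sketched below. The slides do not state how the three judgments were combined or which statistical test produced the p-values, so the majority vote here is an assumption and no significance test is shown.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def majority(judgments: List[bool]) -> bool:
    """True when more than half of the judgments say 'answerable'."""
    return sum(judgments) > len(judgments) / 2


def answerable_rate_by_label(
    records: List[Tuple[str, List[bool]]],
) -> Dict[str, float]:
    """Average answerability per article label (e.g. 'fake' or 'real').

    Each record is (article label, crowd judgments for one question-article pair).
    """
    by_label: Dict[str, List[float]] = defaultdict(list)
    for label, judgments in records:
        by_label[label].append(1.0 if majority(judgments) else 0.0)
    return {label: mean(values) for label, values in by_label.items()}
```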

Page 19: Detecting the Missing Information in Misinformation

Preliminary Evaluation: BERT QA Model

QA Model eval

• Basic BERT model

Result: Unreliable news less answerable than reliable news

Page 20: Detecting the Missing Information in Misinformation

Summary of Preliminary Experiments

• There is missing information in fake news

• Some questions go unanswered more often than others

• We can automate the evaluation with QA models

Page 21: Detecting the Missing Information in Misinformation

Open/on-going work

1. Are we choosing the right Usually-Answerable-Questions?

2. Why are questions unanswerable in an article?
  • Common knowledge question

  • Nonsensical question

  • Missing information

3. Improving the QA model

4. UAQ as input to learned fake news classification
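For item 4, one possible feature encoding is sketched here: for each question template, the fraction of its instantiations that the article answers, plus how often the template was asked. This encoding is an assumption made for illustration; the slides list the idea only as ongoing work, and the classifier that would consume the vector is not shown.

```python
from typing import Dict, List


def uaq_features(
    answerability: Dict[str, List[bool]],
    template_order: List[str],
) -> List[float]:
    """Build a fixed-length feature vector from per-template QA outcomes.

    `answerability` maps a template (e.g. "Where is X?") to one boolean per
    instantiated question. `template_order` fixes the feature positions so
    vectors from different articles line up.
    """
    features: List[float] = []
    for template in template_order:
        outcomes = answerability.get(template, [])
        # Fraction answered; 0.0 when the template never applied to this article.
        features.append(sum(outcomes) / len(outcomes) if outcomes else 0.0)
        # How many times the template was instantiated.
        features.append(float(len(outcomes)))
    return features
```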

Page 22: Detecting the Missing Information in Misinformation

What if this works?

Page 23: Detecting the Missing Information in Misinformation

Bigger implications: Fake news as arms-race

• If missing information is an identifiable sign of misinformation, then authors will add fake details

• Hypothesis: The more facts are included, the easier fact-checking is

Missing information detection

Fact checking

Page 24: Detecting the Missing Information in Misinformation

Bigger implications: Information literacy

• Method not only generates a quantitative score, but also can explain its reasoning by example

• Will this promote more critical thinking and reading by teaching people what to look for? 😀

• Or will people use it as a crutch to avoid thinking themselves? 😟

Page 25: Detecting the Missing Information in Misinformation

Broader challenges

Page 26: Detecting the Missing Information in Misinformation

Broader challenges: Fact-checking synthetic media/deepfakes

• Misinformation, missing information, and fact-checking are not limited to text

• How to fact check the rhetoric of a video? When is manipulation ok, when is it a lie?

• Fact checking mixed media (e.g., misleading captions)

• How to identify deepfakes?

Page 27: Detecting the Missing Information in Misinformation

Broader challenge: Beyond “broadcast” misinformation

• What happens when misinformation is targeted?
  • E.g., in a phishing attack

• With the growth of AI, targeted attacks will become more scalable and automatable
  • Phone calls, emails, text messages

• Today’s “broadcast” fakes can be caught as they get wide distribution.

• How to deploy and scale detection and checking for individualized attacks?

Page 28: Detecting the Missing Information in Misinformation

Helping individuals

Noelle Martin

• Victim of deepfake attack

• Now, a law reform activist

• Combating image-based abuse

Page 29: Detecting the Missing Information in Misinformation

Summary

Missing information in misinformation

• Can we identify details that are missing from articles?

• Will this force authors to add more details?

• Will it help readers be more critical thinkers?

Fact-checking more broadly:

• Rapidly changing landscape, many threat models and many points of leverage

→Many research and impact opportunities

Page 30: Detecting the Missing Information in Misinformation

Thanks! Questions?

Emre Kıcıman

[email protected]

@emrek