exploring the political agenda of the greek parliament ... · 1.063.546 unique speeches output •...

20
www.athenarc.gr Exploring the Political Agenda of the Greek Parliament Plenary Sessions Dimitris Gkoumas, Maria Pontiki, Konstantina Papanikolaou, and Haris Papageorgiou ATHENA Research & Innovation Centre/Institute for Language & Speech Processing (ATHENA RIC/ILSP), Greece. ParlaCLARIN@LREC2018, 7 May 2018, Miyazaki, Japan “APOLLONIS, Greek Infrastructure for Digital Arts and Humanities, Language Research and Innovation” (MIS 5002738) “Computational Science and Technologies: Data, Content and Interaction” (MIS 5002437)

Upload: others

Post on 09-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Exploring the Political Agenda of the Greek

Parliament Plenary SessionsDimitris Gkoumas, Maria Pontiki, Konstantina Papanikolaou, and Haris Papageorgiou

ATHENA Research & Innovation Centre/Institute for Language & Speech Processing(ATHENA RIC/ILSP), Greece.

ParlaCLARIN@LREC2018, 7 May 2018, Miyazaki, Japan

“APOLLONIS, Greek Infrastructure for Digital Arts and Humanities, Language Research and Innovation” (MIS 5002738)

“Computational Science and Technologies: Data, Content and Interaction” (MIS 5002437)

Page 2: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

•Capturing the way political and policyagendas are shaped and evolve overtime.

•Reflecting the stance and the politicalbehavior under which individualParliament Members and politicalgroups react and act over time.

Parliament plenary sessions: Valuable Data Sources

2/20

Page 3: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Insights about:•the way in which the European Parliament reacts to exogenous events like the Eurocrisis (Greene and Gross,2017)

•the political attention of individual U.S. Senators (Quinn et al., 2010)

Parliament plenary sessions & Topic Modelling

3/20

Image source: https://theintelligenceofinformation.wordpress.com/2016/12/06/topic-modeling-latent-dirichlet-allocation-vs-correlation-explanation-alternative/

Page 4: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Exploring the Political Agenda of the Greek Parliament Plenary Sessions

Methodology: Overview

Data Collection Data Preparation Preprocessing Organization Topic Modelling Visualization

4/20

Page 5: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Data Collection

Data Collection

Data Preparation Preprocessing Organization Topic

Modelling Visualization

1989 to 2015 proceedingsà4356 plenary sessions

à1.063.546 unique speeches

5/20

Page 6: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Data Preparation

Different formats (txt/doc/docx/pdf/jpeg)

à structured and machine-readable xmlversion

Data Collection

Data Preparation

Preprocessing Organization Topic Modelling Visualization

6/20

Page 7: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Preprocessing

Data Collection

Data Preparation

Preprocessing

Organization Topic Modelling Visualization

ü Segment annotation: the start/end point of each speech

ü Temporal annotation: the date of each speech

ü Speaker annotation: the name of each speaker and his/her politicalaffiliation

ü POS tagging7/20

Page 8: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Data Organization

ü time windows of fixed durationà 1 month

ü 283 windows for the time period from 1989 to 2015

àdiscover topics within the monthly proceedings (e.g. segments)& monitor their dynamics over time

Data Collection

Data Preparation Preprocessing

Organization

Topic Modelling Visualization

8/20

Page 9: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Topic Modelling

Data Collection

Data Preparation Preprocessing Organization

Topic Modelling

Visualization

q Dynamic Non-negative Matrix Factorization (NMF) approach (Greene and Cross, 2017).

q After identifying a certain number of topics in each 1-month-window, the ones which are semantically similar are groupedtogether as “dynamic topics”, rather than identifying topics from the beginning.

9/20

Page 10: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Dynamic NMF

Input • Processed and organized data• 283 monthly windows,

1.063.546 unique speeches

Output • 283 window-based topics

• 90 dynamic topics

• Document term-matrix TF-IDF normalized• -stop words, +content words• -rare terms & units with length less than three

lines, +nouns & adjectives

• NMF model generates topics for each windowPhase 1

• NMF algorithm is applied again, with the number of k topics decided after applying the Topic Coherence via Word2Vec (TC-W2V)

• GrParl speeches used as a training corpus for the extraction of the word embeddings

Phase 2

10/20

Page 11: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Visualization

Data Collection Data Preparation Preprocessing Organization Topic Modelling

Visualization

ü By topic

ü By time

ü Search function

11/20

Page 12: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Greek Parliament Plenary Sessions presented by topic

12/20

Page 13: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

20 most prominent words for the dynamic topic D79

13/20

Page 14: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Involved Parliament Members

14/20

Page 15: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Contribution of each political party to a dynamic topic

15/20

Page 16: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Contribution of each region to a dynamic topic

16/20

Page 17: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Most important window-topics related to a dynamic topic

17/20

Page 18: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Greek Parliament Plenary Sessions presented by time

18/20

Page 19: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

DiscussionC

ontri

butio

nTransformation of the Greek Parliament plenary sessions minutes into a structured and machine readable corpus

√ For the first time available forNatural Language Processing andText Mining approaches

Insightful navigation across the Greek Parliament corpus

√ Topic based presentation√ Time-window based presentation √ Search function

Futu

re W

ork Qualitative analysis of the extracted dynamic topics

& assignment of descriptive labels

Exploration of the evolution of selected topics (e.g. refugee crisis, Eurocrisis) & comparison with the European Parliament and/or other European countries

Enrichment with more insights e.g. Quotations, Sentiment

19/20

Page 20: Exploring the Political Agenda of the Greek Parliament ... · 1.063.546 unique speeches Output • 283 window-based topics • 90 dynamic topics •Document term-matrix TF-IDF normalized

www.athenarc.gr

Dimitris Gkoumas, Maria Pontiki, Konstantina Papanikolaou, and Haris PapageorgiouATHENA-Research & Innovation Centre in Information, Communication and Knowledge Technologies (ATHENA RIC), Greece.

Exploring the Political Agenda of the Greek Parliament Plenary Sessions

ParlaCLARIN@LREC2018, 7 May 2018, Miyazaki, Japan

“Computational Science and Technologies: Data, Content and Interaction” (MIS 5002437)“APOLLONIS, Greek Infrastructure for Digital Arts and Humanities, Language Research

and Innovation” (MIS 5002738)