seminar feb 2016

Mathematical Modelling and Analysis of Legislation Networks

Neda Sakhaee

University of Auckland

February 25, 2016

Neda Sakhaee (UOA) Legislation Network February 25, 2016 1 / 27

Overview

1 Introduction

2 Research Scope
    Questions
    Objectives and Methodologies

3 Progress To-Date and Initial Results
    Building the Network
    Network General Measures
    Centrality, Important Nodes
    Initial Community Detection
    Modelling Example
    Academic outcomes

4 Future Studies
    Overview
    Generative Models, Pattern Prediction
    Community Detection Method

5 Requirements and Limitations
    Data
    Software

6 Time-table for my PhD project

Introduction

This research looks at Legislation Networks within the wider class of citation networks.

The main case study is the New Zealand Legislation Network, but comparison studies with other countries are included.

The Legislation Network is a multi-layer graph and has some novel features which make it an excellent test case for new network science tools.

Legislation Networks involve legal documents, but differ substantially from citation networks involving case law, Supreme Court opinions, etc.

Legislation Network

The concept of a Legislation Network was introduced in 2015 as a novel approach to show the interdependence of European Union laws.

In a Legislation Network:

Nodes: Laws (Acts, Regulations, etc.)

Edges: Any time one law references another law (definitions, amendments, etc.). We classify these as amendment edges or citation edges.
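The node and edge definitions above can be sketched as a small data structure. This is only an illustration: the tuple layout and the function name `build_network` are assumptions, and the two records are the Marriage Act examples that appear in this talk's dataset-building figure.

```python
from collections import defaultdict

# Assumed record layout for this sketch: (source Act, target Act, edge type).
# An edge points from the referencing law to the referenced law.
references = [
    ("Marriage Act 1955", "Child Welfare Amendment Act 1948", "citation"),
    ("Marriage (Definition of Marriage) Amendment Act 2013", "Marriage Act 1955", "amendment"),
]

def build_network(records):
    """Return (nodes, edges); edges maps (source, target, kind) to a reference count."""
    nodes = set()
    edges = defaultdict(int)
    for source, target, kind in records:
        nodes.update((source, target))
        edges[(source, target, kind)] += 1
    return nodes, dict(edges)

nodes, edges = build_network(references)
```

Keeping the edge type in the key lets the same records feed either a citation-only or an amendment-only view of the network.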

Research Scope: Questions

How can a Legislation Network be built? Is the Legislation Network a meaningful network, or just a set of random relationships?

What are the differences and similarities between the Legislation Network and other citation networks?

Can we measure the importance of legal documents using network science tools? How does this relate to human expert opinion?

Is there any meaningful relationship between the Legislation Network measures and political or social processes?

Research Scope: Questions (continued)

Do the legal documents tend to cluster?

Can we find a good generative model for the Legislation Network?

How can we study the time evolution of the Legislation Network?

Is it possible to predict the attributes of the legal documents in the Legislation Network?

Is it possible to predict the missing historical edges in the Legislation Network?

Research Scope: Objectives and Methodologies

Stage I: Investigate how the Legislation Network can be built and compared to other networks.

Stage II: Implement appropriate centrality measures to determine and compare the importance of legal documents.

Stage III: Develop a generative model of the Legislation Network.

Research Scope: Objectives and Methodologies (continued)

Stage IV: Contribute to community detection algorithms for directed networks.

Stage V: Investigate relationships between the network science properties and political or social processes.

Stage VI: Propose link prediction and attribute prediction models for Legislation Networks.

Progress To-Date and Initial Results: Building the Network

We downloaded 8900 XML files of laws, dated from 1267 to October 2015, from Legislation.govt.nz. Substantial manual cross-checking and data cleaning is required.

All types of references are extracted from the XML files by a C# program. We have six different Acts networks, based on the node and edge type:

Binary Whole Network (BWN)
Weighted Whole Network (WWN)
Binary Citation Network (BCN)
Weighted Citation Network (WCN)
Binary Amendment Network (BAN)
Weighted Amendment Network (WAN)
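As a rough sketch of how the six networks might be derived from one list of extracted references: the record layout, the function name `derive_networks`, and the choice of edge weight (the number of references between an ordered pair of Acts) are all assumptions made for illustration, since the talk does not define the weights.

```python
from collections import Counter

def derive_networks(records):
    """records: iterable of (source, target, kind), kind in {"citation", "amendment"}.

    Assumption: an edge weight is the number of references from source to target.
    """
    records = list(records)
    weighted = {
        "WWN": Counter((s, t) for s, t, _ in records),
        "WCN": Counter((s, t) for s, t, k in records if k == "citation"),
        "WAN": Counter((s, t) for s, t, k in records if k == "amendment"),
    }
    networks = dict(weighted)
    # Binary versions keep each distinct edge once and drop the counts.
    for name, counts in weighted.items():
        networks["B" + name[1:]] = set(counts)
    return networks

# Hypothetical toy records, not taken from the real dataset.
toy = [("A", "B", "citation"), ("A", "B", "citation"), ("C", "B", "amendment")]
nets = derive_networks(toy)
```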

Progress To-Date and Initial Results: Building the Network (continued)

Figure: Dataset Building Process. Each Act record goes through the data extraction process to give a link.

a) Citation Link

Act Name: Marriage Act 1955
Type: Public
Date: 27/11/1955
Terminated: 0
Year: 1955
Reprint: 1
Date reprinted: 19/08/2013
Cites: Child Welfare Amendment Act 1948

b) Amendment Link

Act Name: Marriage (Definition of Marriage) Amendment Act 2013
Type: Public
Date: 19/04/2013
Terminated: 0
Year: 2013
Reprint: 0
Date reprinted: -
Amends: Marriage Act 1955

Progress To-Date and Initial Results: Data sets and Visualizations

The data set and visualisations are available at: https://dataverse.harvard.edu/dataverse/LN

Progress To-Date and Initial Results: Network General Measures

Each RN column is the random-network baseline for the two networks to its right.

Measure                 RN***   BWN     WWN     RN***   BCN     WCN     RN***   BAN     WAN
Nodes                   3856    3856    3856    2142    2142    2142    3856    3856    3856
Edges                   33884   33884   33884   20124   20124   20124   9030    9030    9030
Average Degree          9.712   8.878   13.233  10.112  9.395   17.257  3.207   2.342   3.648
Diameter                15      15      15      15      15      15      15      15      15
CCcyc                   0.003   0.223   0.223   0.004   0.492   0.492   0.001   0.031   0.031
CCmid                   0.003   0.305   0.305   0.004   0.655   0.655   0.001   0.066   0.066
CCin                    0.003   0.528   0.528   0.004   0.414   0.414   0.001   0.03    0.030
CCout                   0.003   0.506   0.506   0.004   0.374   0.374   0.001   0.033   0.033
Average CC*             0.003   0.446   0.446   0.004   0.484   0.484   0.001   0.004   0.004
Average Path length**   6.124   3.569   3.569   7.254   3.346   3.346   1.817   4.43    4.43
Small world             No      Yes     Yes     No      Yes     Yes     No      No      No

*Based on the directed clustering coefficient as proposed by B. M. Tabak in 2014.

**Based on the average path length proposed by S. H. Strogatz in 1995.

***RN (Random Network) is a graph with a specified number of vertices n and connection probability p. The indices are calculated from a sample of 100 graphs.
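A minimal sketch of how such a random-network baseline could be produced is given below. It assumes a directed G(n, p) graph without self-loops and a connection probability chosen to match the edge count of the real network; the talk does not state either choice, and the function names are hypothetical.

```python
import random

def random_directed_graph(n, p, rng):
    """Sample a directed G(n, p): each ordered pair (i, j), i != j, is an edge with probability p."""
    return [(i, j) for i in range(n) for j in range(n) if i != j and rng.random() < p]

def matched_probability(n, m):
    """Connection probability that gives m expected edges on n nodes (directed, no self-loops)."""
    return m / (n * (n - 1))

def average_over_samples(n, p, statistic, samples=100, seed=0):
    """Average a graph statistic over independently sampled random graphs."""
    rng = random.Random(seed)
    total = sum(statistic(n, random_directed_graph(n, p, rng)) for _ in range(samples))
    return total / samples

# Small illustrative run (50 nodes), using edges per node as the statistic.
edges_per_node = average_over_samples(50, 0.1, lambda n, edges: len(edges) / n)
```

Any other index from the table (clustering coefficients, path length) could be passed in as `statistic` in the same way.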

Progress To-Date and Initial Results: In-degree Out-degree correlation

There is no meaningful correlation between the out-degree and in-degree of regular citation networks, but in the Legislation Network the two are strongly correlated:

R result for Pearson's product-moment correlation

X = In-Degree, Y = Out-Degree

t = 57.332, df = 2140, p-value < 2.2e-16

Alternative hypothesis: true correlation is not equal to 0

95 percent confidence interval: 0.761 0.794

Sample estimate of correlation: 0.778

Figure: Scatter plot of out-degree (Y) against in-degree (X), both axes ranging from 0 to 300.
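For readers who want to see how the t statistic and degrees of freedom in the R output relate to the correlation coefficient, here is a short sketch. It uses the standard relation t = r * sqrt(df / (1 - r^2)) with df = n - 2; the function names are illustrative and the real degree sequences are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic and degrees of freedom for testing whether the correlation is zero."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df

# Rounded estimate from the slide (r = 0.778) with n = 2142 nodes, so df = 2140.
t_stat, df = correlation_t(0.778, 2142)
```

With the rounded r the statistic comes out at roughly 57.3, close to the reported 57.332; the small gap is what one would expect from rounding r to three decimals.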

Progress To-Date and Initial Results: Centrality, Important Nodes

As presented intuitively by Borgatti in 2005, centrality measures describe the importance of nodes. Each measure depends on an implicit model of how traffic flows in the network.

We chose a standard measure (similar to PageRank), Eigenvector Centrality, as our key measure; it was initially proposed by Bonacich in 1972.

Later, in the community detection part, we use Betweenness centrality, as discussed by Borgatti in 2005, to label the communities.
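The idea behind eigenvector centrality can be sketched with a plain power iteration. This is a simplified illustration and not the software used for the results in this talk. It assumes that importance flows from a referencing law to the law it references, it adds each node's previous score at every step (an identity shift that avoids oscillation), and it rescales so that the top score is 1.

```python
def eigenvector_centrality(nodes, edges, iterations=200, tol=1e-10):
    """edges: list of (source, target); a node scores highly when referenced by high-scoring nodes."""
    score = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Start from the old scores (the identity shift), then add incoming contributions.
        new = dict(score)
        for source, target in edges:
            new[target] += score[source]
        top = max(new.values())
        new = {v: s / top for v, s in new.items()}  # normalise so the maximum is 1
        if max(abs(new[v] - score[v]) for v in nodes) < tol:
            return new
        score = new
    return score

# Hypothetical toy graph: a and b both reference c, and c references a.
toy_scores = eigenvector_centrality(["a", "b", "c"], [("a", "c"), ("b", "c"), ("c", "a")])
```

In the toy graph nothing references b, so its score decays towards zero while a and c end up sharing the top score.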

Progress To-Date and Initial Results: Eigenvector Centrality Result

Table: Top ten Acts

Rank  BWN & WWN                            EC    BCN & WCN                     EC    BAN & WAN                      EC
1     Criminal Procedure Act 2011          1     Public Finance Act 1989       1     Public Finance Act 1989        1
2     Public Finance Act 1989              0.86  Criminal Procedure Act 2011   0.94  State Sector Act 1988          0.95
3     Summary Proceedings Act 1957         0.84  Summary Proceedings Act 1957  0.93  Companies Act 1993             0.87
4     State Sector Act 1988                0.77  State Sector Act 1988         0.85  Summary Proceedings Act 1957   0.83
5     Companies Act 1993                   0.67  District Courts Act 1947      0.82  Local Government Act 2002      0.83
6     Local Government Act 2002            0.62  Judicature Act 1908           0.74  Criminal Procedure Act 2011    0.79
7     Privacy Act 1993                     0.62  Crimes Act 1961               0.72  District Courts Act 1947       0.69
8     Crimes Act 1961                      0.61  Privacy Act 1993              0.69  Official Information Act 1982  0.68
9     Regulations (Disallowance) Act 1989  0.55  Companies Act 1993            0.67  Education Act 1989             0.63
10    Official Information Act 1982        0.54  Local Government Act 1974     0.65  Land Transfer Act 1952         0.6
      Average EC                           0.05  Average EC                    0.05  Average EC                     0.05

Eigenvector centrality (EC) values are normalised.

Progress To-Date and Initial Results: Initial Community Detection

Community detection algorithms provide a clearer picture of the network.

Network clustering is a specific type of data clustering problem which includes network measures as variables in the objective functions. The best-known network clustering models are spectral clustering and modularity clustering.

We use the modularity method, with the Louvain algorithm to solve it.
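To make the modularity objective concrete, the sketch below evaluates Newman's modularity Q for a given partition. It is a simplification that treats the graph as undirected and unweighted, which is an assumption for illustration only (legislation networks are directed), and it only scores a partition; the Louvain algorithm is the heuristic that searches for a partition with high Q.

```python
from collections import defaultdict

def modularity(edges, community):
    """edges: undirected (u, v) pairs without duplicates; community: node -> label.

    Q = sum over communities of (internal edges / m) - (total degree / 2m)^2.
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # summed degree of the community's nodes
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy graph: two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
q_split = modularity(toy_edges, split)
q_single = modularity(toy_edges, {v: "all" for v in range(6)})
```

Splitting the toy graph into its two triangles gives a positive Q, while putting every node in one community gives Q = 0, which is why the split is preferred.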

Progress To-Date and Initial Results: BCN clusters based on the modularity method and Louvain algorithm

Figure: BCN clusters. Cluster labels:

Criminal Procedure Act 2011
Resource Management Act 1991
Public Finance Act 1989
Local Government Act 1974
Companies Act 1993
Judicature Act 1908
Decimal Currency Act 1964
Patents Act 1953
Remainder

Progress To-Date and Initial Results: WAN clusters based on the modularity method and Louvain algorithm

Figure: WAN clusters. Cluster labels:

Summary Proceedings Act 1957
Decimal Currency Act 1964
Public Finance Act 1989
Reserve Bank of New Zealand Act 1989
Income Tax Act 2004
Official Information Act 1982
Local Government Act 1974
Criminal Procedure Act 2011
Employment Relations Act 2000
Public Works Act 1981
Government Superannuation Fund Act 1956
Remainder

Progress To-Date and Initial Results: Modelling Example

Hypothesis: The major party in the government impacts the creation of important laws.

\[
\begin{cases}
H_0: C_{\text{National}} = C_{\text{Labour}} \\
H_1: C_{\text{National}} \neq C_{\text{Labour}}
\end{cases}
\]

t-test result from R:

Parameter                  Value
t statistic                −3.0250
df                         885
P-value                    0.0026
95% confidence interval    [−0.0308, −0.0007]
$\mu_{C_{\text{National}}}$                0.0430
$\mu_{C_{\text{Labour}}}$                  0.0680

Result: H0 is rejected. On average, Labour governments produce more important legislation.
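The kind of two-sample comparison reported above can be sketched as follows. The talk does not say which variant of the t-test was run, so this sketch assumes the pooled-variance (equal variance) form, whose degrees of freedom are n1 + n2 - 2; R's `t.test` uses the Welch variant unless `var.equal = TRUE` is set. The sample values are made up for illustration and are not the centrality data.

```python
import math

def pooled_t_test(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    df = na + nb - 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df

# Hypothetical toy samples standing in for the two groups of centrality scores.
t_value, t_df = pooled_t_test([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A negative t, as in the slide, means the first group's mean is below the second's.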

Progress To-Date and Initial Results: Academic outcomes

Presentation at INFORMS 2015, Philadelphia: Modelling of New Zealand Acts Network.
Poster presentation at INFORMS 2015, Philadelphia: Legislation Network.
Presentation at CMSS 2016, Auckland: Mathematical analysis of New Zealand Legislation Network.
Presentation at Pitch on the Plains 2015, Christchurch: Law Sense.
Journal paper under preparation for Cambridge Network Science, 85 percent completed.

Future Studies: Overview

Generative model, degree distribution, pattern studies.
Contribute to community detection.
Relation of network properties with political and legal processes.
Improving historical data and studying time evolution.
Comparative studies (Canada, Australia, etc.).
Studying death of nodes (repeals, replacements).
Link and attribute prediction.
Legislative drafting tool, identifying dependency between Acts.

Future Studies: Generative Models, Pattern Prediction

According to Clauset (2014), network generative models:

Distinguish the network from noise and random graphs.
Help to describe the network succinctly and capture its most relevant patterns.
Help us to generalise from one part of the network to another, from one network to others of the same type, from small scale to large scale, or from past to future.

Consider a graph G; a generative model can then be proposed as a probability distribution P(G | θ) with parameter θ.

The aim of the research is to find this probability distribution for the Legislation Network using Bayesian inference.
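As a worked illustration of what P(G | θ) and Bayesian inference mean here, consider the simplest possible generative model, a directed random graph in which each ordered pair of nodes is connected independently with probability p. This is not the model proposed for the Legislation Network; it is only an assumed toy case, with n nodes, m observed edges, and a Beta(a, b) prior on p chosen for convenience.

```latex
% Likelihood of a directed graph G with n nodes and m edges, where N = n(n-1)
% is the number of ordered node pairs and \theta = p:
P(G \mid p) = p^{m} (1 - p)^{N - m}

% Maximum-likelihood estimate:
\hat{p} = \frac{m}{N}

% With a Beta(a, b) prior on p, Bayes' rule gives a Beta posterior:
P(p \mid G) \propto P(G \mid p)\, P(p)
  \quad\Longrightarrow\quad
  p \mid G \sim \mathrm{Beta}(a + m,\; b + N - m)
```

A richer model would replace the single parameter p with parameters that can capture patterns such as the in-degree and out-degree correlation, but the inference step has the same shape.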

Future Studies: Community Detection Method

Community detection is an optimisation problem:

An objective function captures the notion of community structure as groups of nodes with better internal connectivity than external connectivity. A quality measure needs to be defined.
An algorithmic technique assigns the nodes of the network to specific communities, optimising the objective function.
It is a computationally difficult problem and needs heuristic methods.
It is well studied for undirected graphs, but a gap exists in the literature for directed networks. This research aims to contribute to the latest methods and algorithms for the directed graph clustering problem. The results can then be examined using the context of the documents.
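One example of a quality measure for directed networks is a commonly used directed generalisation of modularity, shown below. It is given only to illustrate what such an objective function looks like and is not claimed to be the measure this research will adopt. The symbols are assumptions of this sketch: A_{ij} = 1 if there is an edge from node i to node j, m is the number of edges, k_i^out and k_j^in are out-degree and in-degree, and c_i is the community of node i.

```latex
Q_{\mathrm{dir}} = \frac{1}{m} \sum_{i,j}
  \left[ A_{ij} - \frac{k_i^{\mathrm{out}}\, k_j^{\mathrm{in}}}{m} \right]
  \delta(c_i, c_j)
```

The term subtracted from A_{ij} is the expected number of edges from i to j in a random graph with the same in-degrees and out-degrees, so Q rewards communities with more internal edges than chance would give.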

Requirements and Limitations

Data
In the dataset there are some historical documents which are cited by other documents, but which do not themselves exist in the set of XML files. This missing historical data is required to complete the time evolution and generative model studies.

Software
The software required for this research is:

R
MATLAB
Gephi
LaTeX
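The missing-data issue described under Data can be checked mechanically once the references are extracted. The sketch below is a hypothetical helper, not part of the project's C# extraction program: it lists every referenced title that has no file of its own, and the Act names in the example are invented.

```python
def missing_documents(available, references):
    """Return referenced titles that have no document of their own, sorted by name.

    available: titles for which an XML file exists.
    references: (source, target) pairs extracted from those files.
    """
    cited = {target for _, target in references}
    return sorted(cited - set(available))

# Hypothetical example: "Old Act 1850" is cited but has no file in the dataset.
available = ["Act A 1990", "Act B 2001"]
references = [("Act B 2001", "Act A 1990"), ("Act A 1990", "Old Act 1850")]
missing = missing_documents(available, references)
```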

The time-table of PhD project completion

ID      Start Date  End Date    Percent Complete  Name/Title
1       01/02/15    31/01/18    34.7              Mathematical Modelling and Analysis of Legislation Network
1.1     01/02/15    25/02/16    100               Provisional Year
1.1.1   01/02/15    14/05/15    100               Data Preparation
1.1.2   14/05/15    15/06/15    100               Build the Network
1.1.3   5/29/15     5/29/15     100               Network
1.1.4   15/07/15    15/11/15    100               Literature of general measures
1.1.5   16/07/15    31/07/15    100               Initial results
1.1.6   01/08/15    04/11/15    100               First talk (INFORMS 2015)
1.1.7   04/11/15    04/11/15    100               INFORMS
1.1.8   01/09/15    10/02/16    100               Thesis Proposal
1.1.9   01/01/16    20/02/16    100               CMSS Workshop preparation
1.1.10  01/01/16    25/02/16    100               Departmental Seminar
1.2     01/08/15    15/03/16    85                First Paper: network structure, general measures, initial modelings
1.2.1   01/09/15    30/09/15    100               Outline
1.2.2   01/08/15    31/08/15    100               Network Building
1.2.3   01/08/15    30/09/15    100               General Measures
1.2.4   01/08/15    29/02/16    80                Centrality
1.2.5   30/9/15     29/02/16    80                Community Detection
1.2.6   15/10/15    29/02/16    100               Initial Modelling
1.2.7   22/11/15    15/03/16    65                Writing
1.2.8   15/03/16    15/03/16    0                 Submit the paper
1.3     15/03/16    20/10/16    11                Second Paper: develop community detection method for directed networks
1.3.1   15/03/16    31/03/16    0                 Outline
1.3.2   15/03/16    15/08/16    25                Analysis and results
1.3.3   20/04/16    20/10/16    0                 Writing
1.3.4   20/10/16    20/10/16    0                 Submit the paper to the Cambridge Network Science Journal
1.4     01/08/16    01/03/17    5                 Third Paper: generative models and prediction studies
1.4.1   01/08/16    31/08/16    0                 Outline
1.4.2   01/09/16    28/02/17    10                Analysis and Result
1.4.3   15/09/16    01/03/17    0                 Writing
1.4.4   01/03/17    01/03/17    0                 Submit the paper to the Cambridge Network Science Journal
1.5     01/01/17    30/06/17    7                 Fourth Paper: modelling and comparison studies
1.5.1   01/03/17    31/03/17    0                 Outline
1.5.2   01/01/17    01/06/17    15                Analysis and Results
1.5.3   01/02/17    30/06/17    0                 Writing
1.5.4   30/06/17    30/06/17    0                 Submit the paper to the World Politics Journal
1.6     01/02/17    31/01/18    5                 Thesis
1.6.1   01/02/17    31/01/18    5                 Write the thesis

Conference talks: SUNBELT, JURIX, etc.

References

[1] A. Barabási. Emergence of Scaling in Random Networks. Science 286.5439 (Oct. 15, 1999), 509-512.
[2] Michael J. Bommarito, Daniel Katz, and Jon Zelner. Law as a seamless web: comparison of various network representations of the United States Supreme Court corpus (1791-2005). In: ACM Press, 2009, p. 234.
[3] Stephen P. Borgatti. Centrality and network flow. Social Networks 27.1 (2005), 55-71.
[4] U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoefer, Z. Nikoloski, and D. Wagner. On Modularity Clustering. IEEE Transactions on Knowledge and Data Engineering 20.2 (Feb. 2008), 172-188.
[5] Aaron Clauset, Cristopher Moore, and Mark EJ Newman. Hierarchical structure and the prediction of missing links in networks. Nature 453.7191 (2008), 98-101.
[6] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks (1998).
[7] P. Erdős and A. Rényi. On the strength of connectedness of a random graph. Acta Mathematica Academiae Scientiarum Hungarica 12.1 (Sept. 29, 2013), 261-267.
[8] Giorgio Fagiolo. Clustering in complex directed networks. Physical Review E 76.2 (2007), 026107.
[9] James H. Fowler, Timothy R. Johnson, James F. Spriggs, Sangick Jeon, and Paul J. Wahlbeck. Network Analysis and the Law: Measuring the Legal Importance of Precedents at the U.S. Supreme Court. Political Analysis (2007).
[10] Michel Grabisch. Social networks: Prestige, centrality, and influence (Invited paper). 2011.
[11] Marios Koniaris, Ioannis Anagnostopoulos, and Yannis Vassiliou. Network Analysis in the Legal Domain: A complex model for European Union legal sources. (2015).
[12] Elizabeth A. Leicht, Gavin Clarkson, Kerby Shedden, and Mark EJ Newman. Large-scale structure of time evolving citation networks. The European Physical Journal B 59.1 (2007), 75-83.
[13] Linyuan Lü and Tao Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390.6 (2011), 1150-1170.
[14] Pierre Mazzega, Danièle Bourcier, and Romain Boulet. The network of French legal codes. In: Proceedings of the 12th International Conference on Artificial Intelligence and Law. ACM, 2009, pp. 236-237.
[15] Mark EJ Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences 103.23 (2006), 8577-8582.
[16] Mark EJ Newman. The structure and function of complex networks. SIAM Review 45.2 (2003), 167-256.
[17] Benjamin M. Tabak, Marcelo Takami, Jadson M. C. Rocha, Daniel O. Cajueiro, and Sergio R. S. Souza. Directed clustering coefficient as a measure of systemic risk in complex banking networks. Physica A: Statistical Mechanics and its Applications 394 (Jan. 15, 2014), 211-216.
[18] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks. Nature 393.6684 (June 4, 1998), 440.
[19] Paul Zhang and Lavanya Koppaka. Semantics-based Legal Citation Network. In: Proceedings of the 11th International Conference on Artificial Intelligence and Law.

Questions?