UNIVERSIDADE DE LISBOA
INSTITUTO SUPERIOR TÉCNICO
Improving Information Security in Ranking, Recommender and Control Systems
Guilherme Henrique Caçador Ramos
Supervisor: Doctor Carlos Manuel Costa Lourenço Caleiro
Thesis specifically prepared to obtain the PhD Degree in
Information Security
Draft
September 13, 2018
To my family and friends.
“We can only see a short distance ahead, but we can see plenty there that needs to be done.”
Alan Turing
Summary
In this thesis, we propose several developments in information security in the areas of
e-commerce, ranking systems, recommender systems, and control systems, using ideas from
information theory.
First, we propose a ranking system that groups users by their preferences, using similarity
measures. The system presents possibly distinct rankings of the same product to different
groups of users. Besides offering users a more personalized ranking, the system is more
resistant to attacks and noise than the state of the art, as evaluated on real data. Next, we
explore the effect of bribing in reputation-based ranking systems, both in the scenario with
one ranking per product and in the scenario we propose. This idea comes from the area of
control, since the goal is to control users so as to drive the ranking of products to desired
values. We find the optimal bribing strategies and evaluate our methodology on real data,
verifying that the proposed system is more robust. In recommender systems, we propose
using the introduced similarity measures in two contexts. We introduce an efficient
recommender system whose results compete with, and are sometimes better than, the state of
the art on real and synthetic data. We also propose a group recommender system using one of
the introduced measures, which has better computational complexity than the standard
measure and saves hours of computation on real data.
In control systems, we present methods to find the placement of the minimum number of
actuators in LTI systems and in switched LTI systems in the scenario where a set of
controllers fails, for example due to a cyberattack. We show that the first problem is
NP-complete and design algorithms to solve it explicitly and to approximate the solution in
polynomial time. We also create algorithms to solve and to approximate the solution of two
scenarios in switched LTI systems. Finally, exploring a graph decomposition from the area of
control, we present a new bound for the index of convergence of Boolean matrices with
suitable properties.
Keywords: Ranking Systems, Recommender Systems, Control Systems, Information Security,
Boolean Matrices.
Abstract
In this thesis, we advance information security in the areas of e-commerce, ranking systems,
recommender systems and control systems, using ideas from information theory.
First, we propose a ranking system that groups users by their preferences, using similarity
measures that we introduce. The system presents possibly distinct rankings of the same
product to different user groups. Besides presenting more personalized rankings to users, the
system is more resistant to attacks and noise than the state of the art, as evaluated on real
data. We then explore the effect of bribing in reputation-based ranking systems, in the usual
scenario (one ranking for each product) and in the scenario we propose. This idea is inspired
by the area of control, because the goal is to control users in order to drive the ranking of
products to desired values. We find the optimal bribing strategies and evaluate our
methodology on real data, where the proposed ranking system proves more robust to bribery.
In recommender systems, we use the introduced similarity measures in two contexts. In the
first, we introduce an efficient recommender system whose results compete with, and are
sometimes better than, the state of the art on both real and synthetic data. In the second,
we propose a group recommender system that uses one of the introduced measures. It achieves
better computational complexity than the standard measures, saving hours of computation on
real data.
In control systems, we present methods to find the placement of the minimum number of inputs
in linear time-invariant (LTI) systems and switched LTI systems in the scenario where a set
of controllers may fail, e.g., due to a cyberattack. In the first case, we prove that the
problem is NP-complete, and we design an algorithm to solve it explicitly and another to
approximate the solution in polynomial time. In the second case, we design algorithms to
solve and to approximate the solution for two scenarios in switched LTI systems.
Finally, exploring a decomposition of digraphs from the area of control, we present a more
general bound for the index of convergence of Boolean matrices.
Keywords: Ranking Systems, Recommender Systems, Control Systems, Information Security,
Boolean Matrices.
Acknowledgments
The work presented in this thesis would not have been possible without the support, influence
and encouragement of several people, namely, professors, colleagues, friends and family.
First and foremost, I would like to express my sincere gratitude for the support of the
DP-PMI and Fundação para a Ciência e a Tecnologia (Portugal), through scholarship
SFRH/BD/52242/2013. Second, special thanks go to Instituto de Telecomunicações, Lisboa, for
its support through Research Grant - BIM/No154 - 16/11/2017 - UID/EEA/50008/2017.
Further, I acknowledge that this work was developed under the scope of R&D Unit 50008,
financed by the applicable financial framework (FCT/MEC through national funds and, when
applicable, co-funded by FEDER - PT2020 partnership agreement).
Next, I would like to express my deepest gratitude to my supervisor, Professor Carlos
Caleiro, for his constant and generous support, opportunities, encouragement and guidance
along the inevitably bumpy road that is part of any PhD endeavor. His fruitful discussions,
comments and criticisms have truly changed my way of thinking, which is reflected in the
exposition of this thesis. For all of this, I give him a big thank you.
Moreover, I would like to express my gratitude to my colleagues and co-authors João Saúde,
Sérgio Pequito, Ludovico Boratto, Jaime Ramos and Soummya Kar for the many discussions and
the long hours of hard work spent together.
Finally, I would like to thank my PhD colleagues, and also friends, Andreia Mordido, Iolanda
Velho and Filipe Casal, and my family: Henrique, Isabel, Joana, Ricardo, Clara, Laura,
Susana, Josué and especially Tiago, for their great support.
Thank you,
Obrigado!
Contents
1 Introduction
1.1 Thesis Statement
1.2 Outline
1.3 Main Contributions of the Dissertation
I Ranking and Recommender Systems
2 Preliminaries and Notation
3 A Robust Reputation- and Cluster-based Ranking System
3.1 Introduction
3.2 Reputation-based ranking algorithms
3.2.1 Bipartite graph algorithms
3.2.2 Similarity Measures
3.2.3 Multipartite graph algorithms
3.3 Implementation
3.3.1 Convergence
3.3.2 Computational complexity analysis
3.4 Experimental setup
3.4.1 Evaluation metrics
3.4.2 Spamming and Attacks
3.5 Experimental results
3.5.1 Robustness against random spamming (noise)
3.5.2 Robustness against attacks
3.5.3 Sensitivity to parameters
3.6 Concluding Remarks
4 Reputation-based Ranking Systems and their Resistance to Bribery
4.1 Introduction
4.2 Definitions
4.3 Bribing in ranking systems
4.3.1 Properties of strategies and its profit in the bipartite ranking systems
4.3.2 Optimal Strategies in the Bipartite Ranking Systems
4.3.3 Properties of strategies and its profit in Multipartite Ranking Systems
4.3.4 Optimal Strategies in Multipartite Ranking Systems
4.3.5 Bipartite vs. Multipartite Ranking Systems
4.4 Simulations
4.5 Concluding Remarks
5 Recommendation via Matrix Completion Using Kolmogorov Complexity
5.1 Introduction
5.2 Related Work
5.3 Setup
5.3.1 Setup specification
5.3.2 Complexity analysis
5.4 Experimental setup
5.4.1 Datasets
5.4.2 Evaluation metric
5.4.3 Experimental results
5.5 Concluding Remarks
6 A Novel Similarity Measure for Group Recommender Systems with Optimal Time Complexity
6.1 Introduction
6.2 Background and Related Work
6.3 The Kolmogorov-based similarity
6.4 Complexity Analysis
6.5 The group recommender system
6.6 Experimental Setup
6.6.1 Evaluation metric
6.6.2 Experimental Results
6.7 Concluding Remarks
II Control of dynamical systems
7 Preliminaries and Notation
8 The Robust Minimal Controllability Problem
8.1 Introduction
8.2 Problems Statement
8.3 Preliminaries and Terminology
8.4 Robust Minimum Controllability Problem
8.4.1 Numerical and Computational Remarks
8.5 Illustrative Examples
8.6 Concluding Remarks
9 The robust minimal controllability problem for switched linear continuous-time systems
9.1 Introduction
9.2 Problems Statement
9.3 The Robust Minimal Controllability Problem
9.4 Illustrative Examples
9.4.1 Example I
9.4.2 Example II
9.5 Concluding Remarks
10 On the index of convergence of Boolean matrices with commutative SD-decomposition
10.1 Introduction
10.2 Preliminaries & Terminology
10.2.1 Boolean Matrices
10.2.2 Digraphs
10.2.3 Known Results
10.3 Index of Boolean matrices with commutative SD-decomposition
10.4 Illustrative Examples
10.5 Concluding Remarks
11 Conclusions and Future Work
Bibliography
List of Algorithms
1 Clustering reputation-based ranking algorithm.
2 Matrix completion algorithm: KolMaC
3 Group Recommender System.
4 Polynomial reduction of the structural optimization problem (8.3) to a set-covering problem
5 Approximate Solution to the rMCP
6 Polynomial reduction of the structural optimization problem (9.5) to a set-covering problem
7 Find a minimal set of state variables of a problem (9.4) that need to be actuated
8 Merging procedure
List of Figures
1.1 Organization of the dissertation.
2.1 Bipartite graph representing n users, m items, and the ratings given by user u to item i,
R_ui.
2.2 Multipartite graph representing n users, m items, and the ratings given by user u to item
i, R_ui. The lines represent the connection, through ratings, from the users to the items.
The dashed lines represent links between users, through their similarities, s_uv with
u, v ∈ U.
3.1 Evolution of τ for random spamming with the proportion of spammers.
3.2 Evolution of τ for the love/hate attack with the proportion of spammers.
3.3 Evolution of the ranking of the targeted item, r_target, for the love/hate attack with
the proportion of spammers.
3.4 Evolution of the ranking of the targeted item, r_target, for the reputation attack with
the proportion of spammers.
3.5 Evolution of τ for the reputation attack with the proportion of spammers.
3.6 Evolution of r with the proportion of attackers, for the reputation attack, in the
largest cluster.
3.7 Variation of r_target with the affinity parameter, α, for different proportions of
attackers.
3.8 Variation of τ with the affinity parameter, α, for different proportions of attackers.
4.1 Bipartite/multipartite graph representation of users and items with edges interconnecting
them weighted by the users’ ratings for items, not considering/considering the dashed links.
4.2 Profit of bribing strategies of the most rated item’s sellers in (a) bipartite ranking
system (σ1 – σ4), and (b) multipartite ranking system (σ1 and σ2).
4.3 Profit of bribing strategy σ2 in the bipartite ranking system, fixed users’ reputations
versus reputations recomputed after each user being bribed.
5.1 Graph representing n users, u_1, ..., u_n, and m items, i_1, ..., i_m. The filled edges
between users and items represent the products each user rated, weighted by the rating. The
top dashed edges (between users) represent the weights computed in the matrix S_U. The
bottom dashed edges (between items) represent the weights computed in the matrix S_I.
6.1 RMSE evolution with the number of users’ groups of a 5-fold cross-validation method,
using Algorithm 3 with SVD for its step 2, with Pearson similarity (blue points) and KS
(yellow points) for the ML-100K.
6.2 RMSE evolution with the number of users’ groups of a 5-fold cross-validation method,
using Algorithm 3 with SVD for its step 2, with Pearson similarity (blue points) and KS
(yellow points) for the ML-1M.
6.3 RMSE evolution with the number of users’ groups of a 5-fold cross-validation method,
using Algorithm 3 with KNN for its step 2, with Pearson similarity (blue points) and KS
(yellow points) for the ML-100K.
6.4 RMSE evolution with the number of users’ groups of a 5-fold cross-validation method,
using Algorithm 3 with KNN for its step 2, with Pearson similarity (blue points) and KS
(yellow points) for the ML-1M.
10.1 Two digraphs and their respective SD-decompositions.
10.2 A family of digraphs {G(A_1^i)}_{i=1}^∞ in (a); digraph G(A_2) in (b); and digraph
G(A_3) in (c). The SCCs are represented by the red edges in (a), where each cycle is an SCC,
and by different (non-black) colors, one per SCC, in (b) and (c). The DAG of each digraph is
represented by the black edges.
10.3 Graph representation of a 5-bus power system.
11.1 Future work directions.
List of Tables
3.1 Details of the datasets A and B.
5.1 Details of datasets MovieLens 100k and 1M.
5.2 RMSE of a 5-fold cross-validation in four synthetic random and full rank 20×30 matrices.
5.3 RMSE for the datasets ML–100k and ML–1M.
6.1 Time and space complexity of the similarities.
6.2 Details of datasets MovieLens 100k and 1M.
6.3 Average and standard deviation of the computation time of the similarities between every
pair of users in a 5-fold cross-validation.
10.1 Number of size n CSDD Boolean matrices vs. size n Boolean matrices.
Chapter 1
Introduction
Information Security is a ubiquitous concern of this century. It finds applications in a wide
range of areas, such as the military, political, social and economic domains. In fact,
Information Security has long been a primary concern for diplomats and military commanders.
They soon understood not only the need to protect the confidentiality of exchanged messages,
but also that access to critical information should be private (restricted to authorized
people). Another important concern of diplomats and military commanders was the need to
detect when someone had tampered with the content of their messages, to preserve the
messages’ integrity, that is, their consistency, accuracy, and trustworthiness. For similar
purposes, around 50 B.C., Julius Caesar used what is now known as the Caesar cipher, of which
he is considered the inventor, to prevent his secret messages from being read if they fell
into the wrong hands, see [Singh, 2000].
Two other relevant aspects of Information Security are availability, meaning that the
information must be available when needed, and non-repudiation, meaning that the information
must be “signed” so that its author’s ownership cannot be denied.
In the First World War, both cryptography, the science of designing codes to cipher and
decipher messages, and cryptanalysis, the art of cracking codes without holding the secret
key, were very important. However, the cryptograms designed for the Great War were easily
broken by the cryptanalysis of that time. It was in the Second World War that cryptography
and cryptanalysis, more than important, became essential. The use of machines to cipher
messages was the ultimate weapon. In fact, the concern with keeping information confidential
materialized in the Enigma machine, employed by the Germans to encrypt warfare data. The need
to preserve the integrity of information led to the creation of levels of access to
information and the design of complex and safe storage to hold it. Alan Turing was an English
computer scientist, mathematician, logician, cryptanalyst, philosopher, and theoretical
biologist, who is considered the father of theoretical computer science and artificial
intelligence [Hodges, 2012]. He led the team that successfully broke the Enigma cipher.
The end of the last century and the beginning of this one saw a peak of advances in the
field of telecommunications. The development of small and affordable devices with
computational power brought data processing from the military and powerful businesses to
individual home users. This computational power leveraged the development of the Internet,
which, in turn, enlarged the power of these small devices.
In 2017, the estimated world population was 7,519,028,970 and the estimated number of
Internet users was 3,885,567,619 (51.7% of the population); the latter value represents a
growth of 976.4% from 2000 to 2017 (see footnote 1). Anyone with a computer and a network
connection can reach, through worldwide groups of connected networks, any other point on the
Internet, without border or time zone restrictions. Consequently, we face a continually
growing interest in electronic communications and commerce. However, all the convenience
that the World Wide Web (WWW) gives us to get information, services, and goods also comes
with its own risks. This convenient means of communicating and buying services and goods can
also be exploited by malicious users to steal or tamper with valuable information. Hence, it
is of utmost importance to keep valuable information on the Internet confidential (known
only to those for whom it is intended), intact (not corrupted), and available (reachable in
normal and abnormal situations).
With the fast growth in Internet users and electronic devices, the need to extract and
explore meaning from the data deluge they produce plays a paramount role in electronic
commerce (e-commerce), the military, advertising and marketing strategies, and politics, to
name a few. In tandem, there is growing interest from malicious entities in obtaining and
exploiting information from this amount of data. These entities may tamper with systems and
companies, spy on or sabotage users and companies, or even, in an extreme scenario, trigger
a cyberwar.
In e-commerce, a traditional way of collecting information from users is to allow them to
rate and comment on products/services. In a subsequent step, the sellers/service providers
process this information to generate rankings for the products/services or to recommend to
users further products/services that may be of interest to them. These simple methods of
collecting and processing users’ information help us improve our experience on the web
[Forman et al., 2008, Sparks and Browning, 2011], and sellers rely on this information to
evaluate the viability of products, to predict sales and to target advertising campaigns
[Chevalier and Mayzlin, 2006, Dellarocas et al., 2007, De Maeyer, 2012]. However, they are
susceptible to attacks from malicious users or even sellers/service providers [Hu et al.,
2012]. A set of users may target a set of products/services and control their rankings,
pushing or nuking them, or tamper with the recommendation list produced for particular
users. Further, a seller may bribe a set of users to control their ratings and increase the
ranking of his/her own products or degrade the ranking of competitors’ products. Hence, it
is desirable that ranking systems and recommender systems, as essential tools, be robust to
such behaviors, keeping information as secure as possible [Cialdini and Garde, 1987, Li et
al., 2012, Apt and Markakis, 2014, Simon and Apt, 2015, Grandi and Turrini, 2016].

Footnote 1: http://www.internetworldstats.com/stats.htm
In the extreme scenario of cyberwar, the federal government of the United States of America
admitted, in [Shiels, 2009], that the electrical power grid is a cyberwarfare target [Singer
and Friedman, 2014]. In 2010, security experts from Kaspersky Lab found a malicious software
program named Stuxnet, in which the controllers’ input response to a tampered measured
output drives the system away from its usual operating conditions, see [Langner, 2011]. The
New York Times called it “the first attack on critical industrial infrastructure that sits
at the foundation of modern economies”. This software not only infiltrated factory computers
but also spread to electrical power grids around the world. Attacking the electrical power
grid of a target country is considered to be the first tactical move in a technological war,
and such an attack was carried out successfully, as described in [Case, 2016]. Hence,
ensuring the security of the electrical power grid is one of the main tasks of governments,
see [Flick and Morehouse, 2010], for instance. Also, the United States Department of
Homeland Security is working together with industries to identify vulnerabilities and help
them enhance the security of control system networks. Moreover, several applications, not
only power systems but also control processes, multi-agent networks, control of large
flexible structures and systems biology [Egerstedt, 2011, Siljak, 2007, Skogestad, 2004],
rely on the concept of controllability to ensure their proper functioning.
The research of this thesis aims to contribute to information security in the two scenarios
described above. In the ranking and recommender systems scenario, the main goals are to
devise systems that are more robust to attacks, to provide a more personalized experience,
and to design systems with good computational complexity. In the cyberwar scenario, we focus
on how control system networks cope with attacks (robustness to attacks). We design systems
that remain controllable under up to a specified number of controller failures (possibly due
to attacks), while also paying attention to the computational complexity of the design
process.
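To make the notion of remaining controllable after controller failures concrete, the following is a minimal sketch, not the method developed in this thesis. It assumes an LTI system with state matrix A and input matrix B given as lists of rows, uses the standard Kalman rank test, and checks by brute force that every choice of s failed inputs leaves the system controllable. The function names are hypothetical, and the enumeration is exponential in s, so it only illustrates the property on toy examples.

```python
from fractions import Fraction
from itertools import combinations


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(m):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def is_controllable(a, b):
    """Kalman rank test: rank [B, AB, ..., A^(n-1)B] == n."""
    n = len(a)
    if not b or not b[0]:
        return False  # no inputs left
    blocks, current = [], b
    for _ in range(n):
        blocks.append(current)
        current = matmul(a, current)
    # Concatenate the blocks side by side, row by row.
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb) == n


def survives_failures(a, b, s):
    """True if (A, B) stays controllable after losing any s input columns."""
    m = len(b[0])
    for failed in combinations(range(m), s):
        kept = [j for j in range(m) if j not in failed]
        b_kept = [[row[j] for j in kept] for row in b]
        if not is_controllable(a, b_kept):
            return False
    return True


# Toy double integrator (illustrative numbers only).
A = [[0, 1], [0, 0]]
B_redundant = [[0, 0], [1, 1]]  # two inputs, both acting on the second state
B_identity = [[1, 0], [0, 1]]   # one input per state
```

In this toy case, `survives_failures(A, B_redundant, 1)` holds, because either remaining input still reaches both states, whereas `survives_failures(A, B_identity, 1)` does not, since losing the input on the second state leaves it unreachable.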
1.1 Thesis Statement
The research in this dissertation aims to study information security aspects of ranking,
recommender, and control systems, and to develop systems in these classes that are more
robust to attacks. Our claim is that information theory and control systems theory provide
valuable tools for approaching the fields of ranking and recommender systems: to improve the
state of the art in these two fields in terms of information security, to design systems
that are more robust to attacks and noise, and to design systems that are more efficient in
terms of computational complexity. Hence, we seek to take ideas from control systems theory
and apply them in the context of ranking and recommender systems. In particular, we plan to
study optimal strategies to attack ranking systems and evaluate their effectiveness, that
is, the cost of an attack versus its gain.
1.2 Outline
The main topics explored in the dissertation can be presented as a tree, with the
connections between them represented as edges. In Figure 1.1, we map the main chapters of
this thesis onto this schema. We omit Chapter 10, which is nonetheless related to the topic
of control systems.
[Figure 1.1: Organization of the dissertation. Tree diagram. Node labels: Information Theory
and Security; E-commerce; Control Systems; Recommender Systems; Ranking Systems; LTI
systems; Switched LTI systems. Chapter labels: Ch. 5, 6; Ch. 3, 4; Ch. 4; Ch. 8; Ch. 9.
Part labels: Part I; Part II.]
More specifically, the dissertation is organized in two parts and 11 chapters, as follows:
Introduction: The introductory chapter provides the context and motivation of the work
presented in this dissertation. It summarizes the main contributions of the manuscript and
presents the structure of the document.
Part I – Ranking and Recommender Systems: In Part I, we use information theory, in
particular ideas from Kolmogorov complexity theory, together with clustering, graph theory,
and collaborative filtering, to create a reputation-based ranking system for multipartite
networks that is more robust to noise and attacks than state-of-the-art ranking systems.
This system is also more tailored to each user, because it presents (possibly) different
rankings of the same item to different groups of users, see Chapter 3. In Chapter 4, we
compare the robustness to bribery of bipartite and multipartite ranking systems, showing
that the latter are more robust, and we explore the results with the ranking system proposed
in Chapter 3. We then use the similarity measures based on Kolmogorov complexity, introduced
in Chapter 3, to develop a recommender system, in Chapter 5, and a group recommender system,
in Chapter 6.
Preliminaries and Notation: We introduce the preliminary results and notation used in
Part I of the manuscript in Chapter 2.
A Robust Reputation- and Cluster-based Ranking System: In Chapter 3, we propose a new
reputation-based ranking system, built on multipartite rating subnetworks, which clusters
users by their similarities using three similarity measures, two of them based on Kolmogorov
complexity. Our system is novel in that it reflects the diversity of preferences by
(possibly) assigning distinct rankings to the same item for different groups of users. We
prove the convergence and efficiency of the system. Our tests show that it copes better with
spamming/spurious users, being more robust to attacks than state-of-the-art approaches. This
work was developed with João Saúde, Ludovico Boratto, Carlos Caleiro and Soummya Kar and was
submitted for publication, see [Saude et al., 2018].
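Kolmogorov complexity is not computable, so similarity measures built on it are normally approximated with a real compressor. The summary above does not spell out the measures used in Chapter 3, so the sketch below uses the normalized compression distance with zlib as a stand-in. The function names, the one-byte-per-rating encoding, and the choice of zlib are assumptions made for illustration, not the definitions of the thesis.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable proxy for K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance; values near 0 mean 'very similar'."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def encode_ratings(ratings: list[int]) -> bytes:
    """Hypothetical encoding: one byte per rating, 0 meaning 'not rated'."""
    return bytes(ratings)


def similarity(u: list[int], v: list[int]) -> float:
    """Turn the distance into a similarity clipped to [0, 1]."""
    distance = ncd(encode_ratings(u), encode_ratings(v))
    return max(0.0, min(1.0, 1.0 - distance))
```

Two users with the same long rating vector compress well together and obtain a similarity close to 1, while users with unrelated rating patterns obtain a similarity close to 0. Short vectors are dominated by compressor overhead, which is one reason a practical measure may need a different encoding or compressor.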
Reputation-based Ranking Systems and their Resistance to Bribery: In Chapter 4,
we study bribery resistance properties in the two classes of reputation-based ranking systems
presented in Chapter 3, where the rankings are computed by weighting the rates given by
users with their reputations. In the first class, the rankings are the result of the aggregation
of all the ratings, and we provide all users with the same ranking for each item. In the second
class, there is a first step that clusters users by their rating pattern similarities, and then
the rankings are computed cluster-wise. We study the setting where the seller of each item
can bribe users to rate the item, if they did not rate it before, or to increase their previous
rating on the item. We model bribing strategies under these ranking scenarios and explore
under which conditions it is profitable to bribe a user, presenting, in several cases, the optimal
bribing strategies. By computing dedicated rankings to each cluster, we show that bribing,
in general, is not as profitable as in the simpler scenario, without clustering. Finally, we
illustrate our results with experiments using real data. This work was developed with Joao
Saude, Carlos Caleiro and Soummya Kar and published in [Saude et al., 2017].
Recommendation via Matrix Completion Using Kolmogorov Complexity: In Chap-
ter 5, we explore the Kolmogorov-based similarity measures, introduced in Chapter 3, in the
context of recommender systems. Usually, recommender systems perform a completion of the
rating matrix via collaborative filtering algorithms. However, the choice of the algorithm to
employ is not trivial and usually depends on a set of factors and parameters. Indeed, when
choosing a neighborhood-based algorithm, we need to make assumptions on the ratio between
the number of users and items in the system, in order to choose between a user-based and
an item-based algorithm. Moreover, the number of neighbors strongly impacts the effectiveness of
this class of algorithms. Model-based algorithms, instead, assume that the matrix is either
low rank or that there are a small number of latent variables that encode the full problem.
However, it is hard to make such strong assumptions, especially for a recommender system
that will grow over time. To overcome these issues, in this chapter we propose a novel matrix
completion algorithm, without assumptions on the matrix rank. Also, it is model-free, i.e.,
the entries are not assumed to be a function of some latent variables. Instead, we use a
technique akin to information theory. Our method performs hybrid neighborhood-based col-
laborative filtering, using Kolmogorov complexity. Each component of the hybrid approach
is also parameter-free. It decouples the matrix completion into a vector completion problem
for each user. The recommendation for one user is, thus, independent of the recommenda-
tion for other users. For this reason, the algorithm is scalable, because the computations
are highly parallelizable. A large evaluation of our approach on 9 datasets shows that our
results outperform or compete with the state-of-the-art approaches. This work was developed
with Ludovico Boratto, Joao Saude and Carlos Caleiro and was submitted for publication,
see [Ramos et al., 2018c].
A Novel Similarity Measure for Group Recommender Systems with Optimal Time
Complexity: In Chapter 6, we again explore the novel Kolmogorov-based similarity mea-
sures introduced in Chapter 3 in the context of group recommender systems. Once we sub-
scribe to an e-commerce portal, or to a social media website, we interact with multiple brands
and with content from numerous providers. However, a single user profile is created, containing
all our preferences. Suppose that a company wants to understand who its customers are.
It wants to treat customers as a target, and understand what campaigns the company should
run on them. On the one hand, an approach that clusters the users and performs group rec-
ommendations would be useful, while on the other hand, a generic user profile would not be
helpful, since the preferences in it are not specific for a brand. Hence, we have to determine
multiple user clusterings (one for each brand). This task makes the problem of producing
group recommendations challenging, since little and very sparse information about the users
is available, and for each pair of users we have to detect as many similarities as there are brands
in the system. To tackle this problem, in this chapter we introduce
a novel and optimal measure to compute the similarity between users, based on Kolmogorov
complexity. Further, we test it in the group recommendation scenario. The results show that
our similarity measure can provide similar accuracy when compared to classical measures, but
with significant performance gains, having a strictly lower time complexity than the state-
of-the-art similarity measure. This work was developed with Ludovico Boratto and Carlos
Caleiro and was submitted for publication, see [Ramos et al., 2018a].
Part II – Control of dynamical systems: In Part II, we solve two controllability problems,
which consist in selecting the minimal set of state variables that need to be actuated
such that the underlying system is controllable under the scenario where a set of actuators
may fail over time (i.e., robust to attacks/actuator failures). This problem is solved for
linear time-invariant systems, in Chapter 8, and for switched linear continuous-time systems,
in Chapter 9. In Chapter 10, we make use of a digraph decomposition commonly utilized in
the areas of control systems (in particular, in structural control systems) to present a new
bound on the index of convergence of Boolean matrices.
Preliminaries and Notation: In Chapter 7, we introduce the preliminary results and
notation needed for the chapters of Part II of the manuscript.
The Robust Minimal Controllability Problem: In Chapter 8, we address the robust
minimal controllability problem, where the goal is, given a linear time-invariant system, to
determine a minimal subset of state variables to be actuated to ensure controllability under
additional constraints. We study the problem of characterizing the sparsest input matrices
that assure controllability, even when a specified number of inputs fail, when the autonomous
dynamics' matrix is simple. We show that this problem is NP-hard, and under the assumption that
the dynamics’ matrix is simple, we show that it is possible to reduce the problem to a set
multi-covering problem. Additionally, under this assumption, we prove that this problem is
NP-complete, and polynomial algorithms to approximate the solutions of a set multi-covering
problem can be leveraged to obtain close-to-optimal solutions. This work was developed with
Sergio Pequito, Soummya Kar, Antonio Pedro Aguiar and Jaime Ramos and is published
in [Pequito et al., 2016b].
The robust minimal controllability problem for switched linear continuous-time
systems: In Chapter 9, we address the robust minimal controllability problem for switched
linear continuous-time systems, an extension of Chapter 8. The problem is to determine the
minimal subset of state variables to actuate such that the switched linear system is controllable,
under the scenario where a set of actuators may fail over time. Two variations
of this problem are considered, depending on whether we want to design an input matrix
for each mode (i.e., a different set of actuators that may fail in each mode), or if we want
to design an input matrix common across all the modes, when a set of actuators may fail.
In both cases, we want to ensure that, given an initial condition, we can drive the system
towards any desired state. For both problems, we characterize the sparsest input matrices
which ensure that the system is controllable, whenever the autonomous dynamics’ matrix of
each mode is simple, and for which a left-eigenbasis is available. We reduce these problems to
set multi-covering problems, showing that using a sufficient condition for controllability, the
first problem, i.e., to design an input matrix for each mode, is NP-complete. These reductions allow us
to deploy known, close-to-optimal algorithms (polynomial for the first case) approximating
the solutions of the problems we study. This work was developed with Sergio Pequito and
Carlos Caleiro and is published in [Ramos et al., 2018b].
On the index of convergence of Boolean matrices with commutative SD-decompo-
sition: In Chapter 10, we present a new bound for the index of convergence of Boolean matri-
ces that correspond to digraphs with commutative SD-decomposition, and we revisit the previ-
ously known bounds for the index of convergence of Boolean matrices, by Wielandt, Dulmage-
Mendelsohn, Schwarz, Kim and Gregory-Kirkland-Pullman. Boolean matrices emerge in con-
trol systems theory, more specifically in structural control systems theory and they are used
to ensure structural controllability of a system, a necessary condition to have a controllable
system. We use ideas from structural control to achieve this new bound, and we illustrate it
with examples. This work was developed with Carlos Caleiro and is submitted for publication,
see [Ramos and Caleiro, 2018].
1.3 Main Contributions of the Dissertation
Here, we summarize the main contributions of this thesis:
• we developed a multipartite reputation-based ranking system more robust to noise and
to attacks than the state-of-the-art ranking systems, that, by clustering users with
similarity measures based on their preferences, presents to users a personalized ranking
for each item, and we proved its convergence and efficiency;
• we studied the effect of bribing in ranking systems, we computed the optimal bribing
strategies, and we compared the resistance to bribery of bipartite ranking systems with
multipartite ranking systems, the former being less resistant than the latter;
• we explored the similarity measures proposed to cluster users (from the first bullet point)
to design a recommender system that, when evaluated with real-world data, produces
results similar to the benchmark results;
• we explored the use of one of the proposed similarity measures in the context of group
recommender systems, where we use it to detect groups of users, and we obtained results
similar to those of the benchmark similarity measure, but with a significant gain in terms of time
complexity and also in space complexity;
• we characterized the exact solutions to the robust minimal controllability problem for
continuous-time linear time-invariant systems, we showed that it is an NP-complete problem,
and we provided efficient approximate solutions to the problem;
• we extended the previous bullet point to characterize the solutions of two versions of the
robust minimal controllability problem for switched linear continuous-time
systems, also providing approximation algorithms, which may have polynomial time
complexity in only one of the versions;
• we used ideas from control and structural control theory to present a new bound for the
index of convergence (transient) of a Boolean matrix.
Part I
Ranking and Recommender
Systems
Chapter 2
Preliminaries and Notation
In this part of the dissertation, we resort to information theory to contribute to the areas of
ranking systems and recommender systems. In particular, we use concepts from Kolmogorov
complexity theory, see [Ming and Vitanyi, 1990], from graph theory, see [Cormen et al.,
2001, Bollobas, 2013], from data mining (clustering), see [Jain and Dubes, 1988], and from
collaborative filtering1 (CF), see [Masthoff, 2015]. The outline of this part is as follows:
In Chapter 3, we present a reputation-based ranking system for multipartite networks. In
Chapter 4, we study the bribery effect in reputation-based bipartite and multipartite ranking
systems, including the one introduced in Chapter 3. In Chapter 5, we explore the similarity
measures that we introduce in Chapter 3 to design a recommender system. In the last chapter
of this part, Chapter 6, we use one of these similarity measures to design a group recommender
system.
We now introduce some notation and definitions that we use in the subsequent chapters.
Let $U$ be a set of users, $I$ a set of items, and $R_\bot, R_\top \in \mathbb{Z}^+$ the minimum and maximum ratings,
respectively, where $\mathbb{Z}^+$ is the set of strictly positive integers. We denote by $\mathcal{R} = [R_\bot, R_\top] \cap \mathbb{Z}^+$
the set of allowed ratings, by $\Delta_R = R_\top - R_\bot$ the rating range, and by $R \subseteq U \times I \times \mathcal{R}$ the set of ratings given by users to items.
Note that we can consider continuous ratings of the form $[R_\bot, R_\top] \cap \mathbb{R}$, with $R_\bot, R_\top \in \mathbb{R}$,
and the results also apply (we just need, in the case of the ranking systems, to normalize
the ratings to be in $]0, 1]$). For instance, if user $u$ rates item $i$ with rating $R_{ui}$, then we
write it as $(u, i, R_{ui}) \in U \times I \times \mathcal{R}$, or simply $R_{ui} \in R$. We denote the set of items rated by
user $u$ as $I_u = \{ i \mid \exists R_{ui} \in \mathcal{R} \text{ s.t. } (u, i, R_{ui}) \in R \}$, and the set of users who rated item $i$ as
$U_i = \{ u \mid \exists R_{ui} \in \mathcal{R} \text{ s.t. } (u, i, R_{ui}) \in R \}$.
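To make the notation concrete, the following minimal Python sketch builds the sets of items rated by each user and of users who rated each item from a set of (user, item, rating) triples. The function name `index_ratings` and the toy data are assumptions made here for illustration only; they are not part of the thesis.

```python
from collections import defaultdict


def index_ratings(ratings):
    """Build I_u (items rated by each user) and U_i (users who rated each item)
    from an iterable of (user, item, rating) triples."""
    items_of = defaultdict(set)  # plays the role of I_u
    users_of = defaultdict(set)  # plays the role of U_i
    for user, item, _rating in ratings:
        items_of[user].add(item)
        users_of[item].add(user)
    return dict(items_of), dict(users_of)


# Hypothetical toy data: integer ratings on a 1..5 scale.
ratings = {("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4)}
items_of, users_of = index_ratings(ratings)
```

Here `items_of["u1"]` contains both items rated by the first user, and `users_of["i1"]` contains both users.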
A reputation-based ranking system assigns a reputation, cu ∈ R+, to each user, u ∈ U ,
and then utilizes it to weigh their ratings on products to compute the products’ rankings.
1Collaborative filtering is a technique to make automatic predictions (filtering) about the interests of a user,
by gathering preferences information from many users (collaborating).
Figure 2.1: Bipartite graph representing n users, m items, and the ratings given by user u to
item i, Rui.
From now on, for ranking systems, we consider normalized ratings (dividing by $R_\top$) and
reputations, $R_{ui}, c_u \in \, ]0, 1]$ (i.e., the ratings and reputations take values in $]0, 1]$).
To model a ranking system, we first consider a weighted bipartite graph, B = (U, I,R),
like the one presented in Figure 2.1, with two sets of vertices, U and I, representing the
users and the items. If a user rated an item, then there is an edge with the weight of the
rating connecting the two in B. From B, we build a graph G = (U,E), with the set of users
as vertices and where two users are connected if they are similar. Merging B and G gives
rise to a multipartite graph, M (Figure 2.2). A multipartite graph is a graph such that two
vertices that are connected by an edge have different colors [Bollobas, 2013]. Here, we need
one color for the items and at least two more whenever there is a cluster with more than one
user. This multipartite graph can model either a network of users generated by the users themselves,
for instance, a social network, see [Symeonidis et al., 2011], or, as we propose, a network automatically
generated by the ranking system, based on the ratings of items given by users. Note that a
partition into subnetworks may also be done on the items' side, although we do not explore
this possibility in the context of ranking systems.
If |U | = n and |I| = m, n,m ∈ Z+, then we denote the n ×m matrix of ratings also by
R, because the matrix R is isomorphic to the set of ratings R. Again, Rui denotes the rating
that user u gave to item i. The entries take values on the allowed ratings together with a
special number denoting the absence of rating, ⊥ (in the following chapters, ⊥ is either 0
or 99, depending on the dataset used). Therefore, the set of given ratings corresponds to
the entries of the matrix R different from the special number. We denote by |R| the same
amount in both the set and matrix versions of the ratings, which is the number of ratings.
In other words, |R| is the number of elements of the set R or number of elements different
from ⊥ in the matrix R. We use the set version of R in the context of ranking systems and
the matrix version in the context of recommender systems. Note that, in this case, we have
Figure 2.2: Multipartite graph representing n users, m items, and the ratings given by user
u to item i, Rui. The lines represent the connection, through ratings, from the users to the
items. The dashed lines represent links between users, through their similarities, suv with
u, v ∈ U .
that the set $U_i = \{ u \in U : R_{ui} \neq \bot \}$. Similarly, the set $I_u$ can now be written as
$I_u = \{ i \in I : R_{ui} \neq \bot \}$. We adopt standard notation to denote matrices and vectors.
For a matrix $R$, we denote the $u$th row of $R$ by $R_u$, the $i$th column of $R$ by $R_i^{\mathsf{T}}$, and the
$i$th column of the $u$th row by $R_{ui}$. Further, we denote by $\bar{R}_u$ the average of the ratings that
user $u$ gave to items, i.e., $\bar{R}_u = \sum_{i \in I_u} R_{ui} / |I_u|$. Given a set of objects $\mathcal{X}$, a similarity is a
function $s : \mathcal{X} \times \mathcal{X} \to [0, 1]$ such that whenever $x \in \mathcal{X}$, $s(x, x) = 1$. For a square matrix
representing similarities, we use the letter $S$ indexed by $U$ or $I$, if the similarity matrix
represents similarities between users or items, respectively. Further, given two vectors with
dimension $n$, $x$ and $y$, we denote by $x \circ y$ the vector whose entries are the product of the
entries of $x$ and $y$, i.e., $x \circ y = (x_1 y_1, \ldots, x_n y_n)$. We use the semi-norm $\| \cdot \|_0$. Given a vector
$x$, $\|x\|_0$ is the number of nonzero entries of $x$.
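Given a vector $X \in \mathbb{R}^n$, we denote by $\bar{X}$ the average of the vector, i.e.,
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}. \tag{2.1}$$
Further, we denote by $\sigma_X$ the standard deviation of the vector $X$, i.e.,
$$\sigma_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}. \tag{2.2}$$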
Let $\Sigma = \{w_1, \ldots, w_k\}$ be a finite alphabet (a set of characters). A word is an element in
$\Sigma^*$, i.e., a sequence of characters, such that the empty word $\varepsilon \in \Sigma^*$, and if $x, y \in \Sigma^*$ are two
words, then their concatenation $xy$ is also a word, i.e., $xy \in \Sigma^*$. The length of a word $x \in \Sigma^*$
is inductively defined as $|x| = 0$ if $x = \varepsilon$ and $|x| = 1 + |z|$ if $x = wz$, with $w \in \Sigma$ and $z \in \Sigma^*$.
Chapter 3
A Robust Reputation- and
Cluster-based Ranking System
In this chapter, we propose a reputation-based ranking system for multipartite networks.
Our system clusters users by their similarities in terms of tastes. Inspired by information
theory, we propose three similarity measures to this end, two of them based on Kolmogorov
complexity. Not only is the proposed system more personalized (we may present different
rankings for the same item to distinct clusters of users), but it is also more robust to attacks
and spam/noise than the state-of-the-art ranking systems. Therefore, we take a step forward
in the information security of ranking systems. This work was submitted for publication,
see [Saude et al., 2018].
3.1 Introduction
In our daily life, electronic commerce, streaming media, and collaborative economy are ubiq-
uitous. Moreover, people’s opinions can be as effective as an advertisement. These facts
promoted the development of crowdsourced ratings/reviews. Consumers started to use, and
rely on, this information to decide whether or not to buy a product/service, have a meal
at a restaurant, or attend an event [Sparks and Browning, 2011, Forman et al., 2008]. The
sellers, aware of how the ratings of products/services impact sales [Chevalier and Mayzlin,
2006], started to rely on the ratings and reviews of their products to assess their commercial
viability as well as to predict sales [Dellarocas et al., 2007]. Further, they began to use this
information to improve their products and to target advertisement campaigns.
One domain in which these ratings and reviews can be employed effectively is that of systems
that rank items for users (e.g., Netflix and IMDB provide logged-in users with a
ranking of the available items). Given the relevance that ratings and reviews have for both
the users and the companies, it is of primary importance to detect and automatically correct
rating manipulations through fake users’ ratings.
Previous work. A simple way to collect and process ratings is to compute their arithmetic
average (AA). The main drawback of the AA is the indistinguishability of users, as it treats,
in the same way, the most relevant raters and spam. Therefore AA is prone to manipulation
of ratings through malicious attacks or spamming. Further, AA might be misleading, because
it does not capture the possible multimodal behavior of ratings [Hu et al., 2006]. For instance,
in a bimodal ratings’ distribution on the opposite extremes, the average is in the middle where
the density of votes is low.
Using weighted average algorithms allows the attribution of different importance to users.
This was explored in previous works [Yu et al., 2006a, De Kerchove and Van Dooren, 2010].
The authors in [Li et al., 2012] used a modification of the weighted average. In [Mizzaro,
2003], the author used an additional time-dependent quantity to weigh the ratings of users.
These methods are more robust to spamming and attacks than the AA.
The mentioned methods have a bipartite graph structure because there are two types
of nodes, users and items, with weighted edges (ratings) linking the two; see Figure 2.1 in
Chapter 2.
In [Symeonidis et al., 2011], the authors extended the bipartite graph approaches.
They considered implicit social networks, known as online Social Rating
Networks (SRN) (which emerge from different users commenting similarly on a given set of
products), and explicit social networks (built by users, through friendship or working
relationships). They used the SRN to recommend products by a weighted combination of users'
similarities, although without clustering users.
Open issues. By using AA or weighted average to rank the items, we are not taking into
account possible relations between users or users’ preferences. Furthermore, these approaches
do not incorporate the (possible) multimodal behavior of ratings of items, subjugating all
users to the average, hindering the rise of a multitude of preferences. There are two natural
negative consequences to this. On the one hand, a ranking system does not make the most
of the efforts made by the users to rate the items. Indeed, the current solutions do not offer
any form of personalization to the users, while it would be desirable that, if they belonged
to a segment with specific preferences, these should be reflected in the ranking (i.e., the
users should be presented first with items they might be interested in). On the other hand,
spurious users or spamming/malicious attacks might significantly affect the quality of the
ranking. Consequently, the ranking system would not reflect users’ preferences. Hence, there
would be negative consequences for the platform, regarding the trust of its users.
Our contributions. In this chapter, we propose a generic class of iterative reputation-
based ranking systems on multipartite graphs. A reputation-based ranking system is a ranking
system in which each user has a reputation, and the ranking is the weighted average of users
ratings by their reputation. We prove, for the first time in the literature, that the algorithms
in that class converge and are efficient, extending the results in previous work. To design the
system, we use similarities between users. These similarities allow us to cluster users based
solely on their ratings (see Figure 2.2 of Chapter 2, where two subnetworks of users are
depicted in dashed lines, i.e., $\{u_1, u_2, u_3\}$ and $\{u_{N-1}, u_N\}$). To cluster users, we propose two
novel similarity measures, the linear similarity (LS) and the Kolmogorov similarity (KS), and
we test them against the normalized compression similarity (CS), derived from the distance
measure proposed in [Li et al., 2004]. For an extensive introduction to Kolmogorov complexity
and its applications we refer the reader to [Ming and Vitanyi, 1990].
Afterwards, we compute (possibly) different rankings for the same item, on different subnetworks/clusters.
Therefore, our method enables us to present custom-built item rankings
to each cluster.
Our approach adapts better to the preferences of similar users and also improves robust-
ness against both spurious users and spamming/malicious attackers. Further, it embeds the
multimodal behavior of ratings’ distribution. These aspects contrast with the algorithms that
we overviewed since they neglect the smaller subgroups that do not identify with the major-
ity because they are averaged out. Our proposal overcomes this issue, since we perform the
ranking on the basis of a user clustering, while the other approaches consider the whole set of
users.
Our proposed similarities not only perform better but also have smaller computational
complexity than using CS. When comparing LS with KS, the former responds better to noisy
spam, and the latter is more robust to targeted attacks to a set of items. Both carry the same
order of computational complexity, although in our implementation KS is slightly faster than
LS. Finally, by using LS, we obtain better robustness results than state-of-the-art approaches.
Chapter structure. In Section 3.2, we introduce a generic class of iterative reputation-based
ranking algorithms and prove their convergence and efficiency. In Section 3.3, we design
a new reputation-based ranking system, we prove its convergence and explain its implemen-
tation. The experimental setup is described in Section 3.4 and we discuss our results in
Section 3.5. In Section 3.6, we conclude the chapter.
3.2 Reputation-based ranking algorithms
Next, we introduce the two classes of reputation-based ranking systems that we will study in
this chapter, and that will also be the subject of study of Chapter 4, but in the context of
bribery.
3.2.1 Bipartite graph algorithms
Here, we generalize the iterative reputation-based ranking methods, discussed above, as:
$$\begin{cases} r^{k+1} = g_R(c^k) \\ c^{k+1} = h_R(r^{k+1}), \end{cases} \tag{3.1}$$
where $k$ denotes the iteration index and $c^0$ the vector of initial reputations of users, with
$c^0_u \in \, ]0, 1]$. Here, $r = (r_1, \ldots, r_{|I|})$, where $r_i$ denotes the ranking of item $i$, computed by
$g_R : [0, 1]^{|U|} \to [0, 1]^{|I|}$, with the set of ratings, $R$, as a parameter. The users' reputations,
$c = (c_1, \ldots, c_{|U|})$, where $c_u$ denotes the reputation of user $u$, are determined by $h_R : [0, 1]^{|I|} \to [0, 1]^{|U|}$.
In [Li et al., 2012], the authors prove that their reputation-based ranking system converges
with a certain convergence rate. In this section, we prove that a class of reputation-based
ranking systems (more abstract) converges and with what convergence rate. Hence, we present
more general results that subsume all the proofs of convergence and efficiency in [Li et al.,
2012]. This allows for designing a more extensive range of convergent and efficient reputation-
based ranking systems.
Consider a Banach space, $\mathcal{X}$, with an induced distance $d : \mathcal{X}^2 \to [0, 1]$. Using the Lipschitz
condition [Kreyszig, 1989], we prove the following results.
Lemma 3.2.1. Consider the iterative scheme (3.1). Let $g_R$ and $h_R$ be $\eta_g$- and $\eta_h$-Lipschitz
maps, respectively. Then $g_R \circ h_R$ is an $\eta$-Lipschitz map, with $\eta = \eta_g \eta_h$. If $\eta < 1$, then (3.1)
is a contraction.
Proof. Since the domain of $g_R$ contains the codomain of $h_R$ and both are Lipschitz, the composition,
$g_R \circ h_R$, is also Lipschitz. Let $d$ be a distance. We first prove the basis of the induction:
$$d(r^2, r^1) = d\big(g_R(c^1), g_R(c^0)\big) = d\big((g_R \circ h_R)(r^1), (g_R \circ h_R)(r^0)\big) \le \eta\, d(r^1, r^0),$$
where $\eta \in [0, 1[$ is the Lipschitz constant for $g_R \circ h_R$. The induction step then reads
$$d(r^n, r^{n-1}) = d\big(g_R(c^{n-1}), g_R(c^{n-2})\big) = d\big((g_R \circ h_R)(r^{n-1}), (g_R \circ h_R)(r^{n-2})\big) \le \eta\, d(r^{n-1}, r^{n-2}) = \eta\, d\big(g_R(c^{n-2}), g_R(c^{n-3})\big) \le \eta^{n-1} d(r^1, r^0),$$
and the last inequality holds by the induction hypothesis.
Because we are working in a Banach space, if the algorithm (3.1) converges, then it
converges to a unique value. Using the previous lemma, we prove the following result:
Theorem 3.2.2. The class of iterative reputation-based ranking algorithms (3.1) converges.
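Proof. Let $m, n \in \mathbb{N}$. For any $\varepsilon > 0$, there exists an order, $N$, from which $\eta^N < (1 - \eta)\varepsilon / d(r^1, r^0)$. Using the triangle inequality, for $n > m \ge N$ we have
$$d(r^n, r^m) \le \sum_{k=m+1}^{n} d(r^k, r^{k-1}) \le \sum_{k=m+1}^{n} \eta^{k-1} d(r^1, r^0) \le \eta^m d(r^1, r^0) \sum_{k=0}^{+\infty} \eta^k \le \frac{\eta^N d(r^1, r^0)}{1 - \eta} < \varepsilon,$$
since $0 < \eta < 1$. Therefore, the sequence of rankings is a Cauchy sequence and the algorithm (3.1) converges.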
Theorem 3.2.3. Let $d : \mathcal{X}^2 \to [0, 1]$ be a normalized distance. Then the algorithm (3.1) has
exponential rate of convergence.
Proof. The basis of the induction reads:
$$d(r^*, r^1) = d\big(g_R(c^*), g_R(c^0)\big) = d\big((g_R \circ h_R)(r^*), (g_R \circ h_R)(r^0)\big) \le \eta\, d(r^*, r^0) \le \eta.$$
Assume that the induction hypothesis holds for $k = n$; then it follows that
$$d(r^*, r^{n+1}) = d\big((g_R \circ h_R)(r^*), (g_R \circ h_R)(r^n)\big) \le \eta\, d(r^*, r^n) \le \eta^{n+1} d(r^*, r^0) \le \eta^{n+1}.$$
To attain, at most, an error of $\varepsilon > 0$, we need $\kappa = \log_\eta \varepsilon$ iterations, with $\eta$ the Lipschitz
constant of $g_R \circ h_R$.
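As a rough illustration of this iteration count, the following Python sketch rounds $\log_\eta \varepsilon$ up to an integer and checks it on a toy contraction. The helper names and the scalar map `step` are assumptions chosen here for illustration; they stand in for $g_R \circ h_R$ and are not the maps used in the thesis.

```python
import math


def iterations_needed(eta, eps):
    """Number of iterations suggested by kappa = log_eta(eps), rounded up
    (up to floating-point rounding), for 0 < eta < 1 and 0 < eps < 1."""
    if not (0.0 < eta < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("expected 0 < eta < 1 and 0 < eps < 1")
    return math.ceil(math.log(eps) / math.log(eta))


def iterate(step, x0, n):
    """Apply the map `step` n times, starting from x0."""
    x = x0
    for _ in range(n):
        x = step(x)
    return x


def step(x):
    """Hypothetical contraction on [0, 1]: Lipschitz constant 0.5, fixed point 0.5."""
    return 0.5 * x + 0.25


k = iterations_needed(0.5, 1e-3)  # eta = 0.5, target error 1e-3
x_k = iterate(step, 1.0, k)       # start from the point farthest from 0.5
```

With these toy values, ten iterations suffice to bring the iterate within the target error of the fixed point.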
3.2.2 Similarity Measures
To group users according to their preferences, we need to quantify how similar they are. For
each pair of users that have, at least, one rated item in common, we compute a similarity,
based on the item-rating information. We specify three similarities: one linear and two non-
linear. In the following, let Iu,v = Iu ∩ Iv denote the set of items that both users u and v
rated. Further, for each user, u, we denote by u the string composed by the concatenation of
the pairs (item, rating) of his/her rated items.
Linear similarity. We define the linear similarity as: $LS(u, v) = 0$ if $I_{u,v} = \emptyset$, and
otherwise
$$LS(u, v) = \ell(|I_{u,v}|) \left( 1 - \frac{1}{|I_{u,v}|} \sum_{i \in I_{u,v}} \frac{|R_{ui} - R_{vi}|}{\Delta_R} \right),$$
where the function $\ell : \mathbb{Z}^+ \to [0, 1]$ penalizes the similarity according to how confident we are in it.
LS is a linear function of the absolute value of the rating difference, linearly encoding the
similarity between users, based on the ratings of commonly rated items. If two users rated an item
with the same rating, the rating difference is zero, hence the similarity is one; on the other
hand, if the absolute rating difference is $\Delta_R$, then the similarity is zero.
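The linear similarity can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each user is represented as a dictionary from items to ratings, and the confidence function $\ell$, which is not specified at this point of the text, defaults to the constant 1 (a hypothetical choice made only so that the example runs).

```python
def linear_similarity(ratings_u, ratings_v, delta_r, confidence=lambda n: 1.0):
    """Linear similarity between two users given as {item: rating} dicts.

    delta_r is the rating range (maximum minus minimum allowed rating).
    confidence plays the role of the function l; the constant default is an
    assumption for illustration, not the thesis' choice.
    """
    common = ratings_u.keys() & ratings_v.keys()  # items rated by both users
    if not common:
        return 0.0
    mean_gap = sum(abs(ratings_u[i] - ratings_v[i]) for i in common) / len(common)
    return confidence(len(common)) * (1.0 - mean_gap / delta_r)


# Hypothetical users on a 1..5 scale, so delta_r = 4.
u = {"a": 5, "b": 1}
v = {"a": 5, "b": 5, "c": 2}
ls_uv = linear_similarity(u, v, delta_r=4)
```

In this toy case the two users agree on one common item and disagree maximally on the other, so the similarity sits halfway between the extremes.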
Next, we propose two compression-similarities based on Kolmogorov complexity [Cover
and Thomas, 2012]. Given the description of a string, x, its Kolmogorov complexity, K(x),
is the length of the smallest computer program that outputs x. In other words, K(x) is
the length of the smallest compressor for x. Although the Kolmogorov complexity is non-
computable, there are efficient and computable approximations by compressors. Let C be a
compressor; we denote by C(x) the length of the output string resulting from the compression
of x using C.
Compression similarity. Based on the normalized compression distance [Li et al.,
2004], we define the compression similarity as 1 minus the distance, i.e., $CS(u, v) = 0$ if
$I_{u,v} = \emptyset$, and otherwise
$$CS(u, v) = 1 - \frac{C(uv) - \min\{C(u), C(v)\}}{\max\{C(u), C(v)\}},$$
for the string $uv$, the concatenation of $u$ and $v$. Intuitively, we are measuring the information
(rating pattern) that users $u$ and $v$ have in common and normalizing it, by subtracting the
minimum and dividing by the maximum. If $u$ and $v$ have the same rating pattern, then
$C(uv) \approx C(u) = C(v)$ and $CS(u, v) \approx 1$, while if the rating patterns are completely different
(they have nothing in common) we have that $C(uv) \approx C(u) + C(v)$ and $CS(u, v) \approx 0$. Trivially,
when the distance is maximum, 1, the similarity is minimum, 0, and vice versa.
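A minimal Python sketch of the compression similarity follows. It assumes `zlib` as the compressor $C$ and a simple "item:rating" text encoding of each user; both are choices made here for illustration and are not taken from the thesis. With a real compressor the value is only approximately in $[0, 1]$.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """C(x): length of the zlib-compressed representation (assumed compressor)."""
    return len(zlib.compress(data, 9))


def encode_user(ratings: dict) -> bytes:
    """Concatenate the (item, rating) pairs of a user into one string (assumed encoding)."""
    return ";".join(f"{item}:{rating}" for item, rating in sorted(ratings.items())).encode("utf-8")


def compression_similarity(ratings_u: dict, ratings_v: dict) -> float:
    """CS(u, v) = 1 - (C(uv) - min(C(u), C(v))) / max(C(u), C(v))."""
    if not (ratings_u.keys() & ratings_v.keys()):
        return 0.0  # no commonly rated items
    u, v = encode_user(ratings_u), encode_user(ratings_v)
    cu, cv, cuv = compressed_len(u), compressed_len(v), compressed_len(u + v)
    return 1.0 - (cuv - min(cu, cv)) / max(cu, cv)


# Hypothetical data: two users rating the same 300 items at random.
rng = random.Random(0)
items = [f"i{k}" for k in range(300)]
u = {i: rng.randint(1, 5) for i in items}
w = {i: rng.randint(1, 5) for i in items}
cs_same = compression_similarity(u, dict(u))  # identical rating patterns
cs_diff = compression_similarity(u, w)        # unrelated rating patterns
```

Note that every pair of users requires its own compression of the concatenated string, which is the cost that motivates the cheaper measure discussed next.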
The main drawback of CS is that we need to compute the compression of each possible pair
of users with common rated items. To overcome this, we propose to use a nonlinear function
of the absolute disparity of users descriptions’ compressions, with lower time complexity.
Kolmogorov similarity. We define the Kolmogorov similarity as: $KS(u, v) = 0$ if
$I_{u,v} = \emptyset$, and otherwise
$$KS(u, v) = \frac{1}{1 + |C(u) - C(v)|}.$$
When the sizes of the compressions of the rating patterns of users $u$ and $v$ are the same, $KS(u, v) = 1$,
and $KS(u, v)$ goes to 0 when the absolute value of the difference of the compression sizes goes to
infinity.
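The Kolmogorov similarity can be sketched in the same spirit. Again, `zlib` and the "item:rating" encoding are assumptions made for illustration. The optional `size_u` and `size_v` arguments are a hypothetical convenience showing that each user's compressed size can be computed once and reused for every pair.

```python
import zlib


def description_size(ratings: dict) -> int:
    """C(u): compressed size of a user's concatenated (item, rating) pairs,
    using zlib and a simple text encoding (both assumed for illustration)."""
    text = ";".join(f"{item}:{rating}" for item, rating in sorted(ratings.items()))
    return len(zlib.compress(text.encode("utf-8"), 9))


def kolmogorov_similarity(ratings_u, ratings_v, size_u=None, size_v=None):
    """KS(u, v) = 1 / (1 + |C(u) - C(v)|), or 0 without commonly rated items.

    size_u and size_v allow passing precomputed compressed sizes, so that a
    pairwise comparison needs no further compression.
    """
    if not (ratings_u.keys() & ratings_v.keys()):
        return 0.0
    cu = description_size(ratings_u) if size_u is None else size_u
    cv = description_size(ratings_v) if size_v is None else size_v
    return 1.0 / (1.0 + abs(cu - cv))


ks_equal = kolmogorov_similarity({"a": 1}, {"a": 1})
ks_cached = kolmogorov_similarity({"a": 1}, {"a": 2}, size_u=10, size_v=13)
```

In contrast with the compression similarity, only one compression per user is needed here, rather than one per pair of users.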
3.2.3 Multipartite graph algorithms
We group users in subnetworks using a similarity measure $SM$. For a specified affinity level
threshold, $\alpha$, we set $S_{u,v} = 1$ if $SM(u, v) > \alpha$ and $0$ otherwise, where $S$ is the (possibly
sparse) adjacency matrix that characterizes the undirected graph $\mathcal{M} \equiv \mathcal{M}(S)$. A large $\alpha$
means that users need to be more strongly related in order to be connected, translating to a
larger number of clusters (automatically computed). We compute the subnetworks of $\mathcal{M}$, $\mathcal{M}_j$
for $j \in J$, which are the connected components of $\mathcal{M}$. Subsequently, we apply a reputation-based
ranking algorithm to each subnetwork $\mathcal{M}_j$, to compute the reputation of users and
the ranking of items. Let rep_rank denote a reputation-based ranking algorithm (3.1). We
summarize the clustering reputation-based ranking algorithm in Algorithm 1. Each time a
new user enters the system or an existing user changes his/her rating behavior, everything has to
be recomputed, as in most of the approaches in the literature. However, the update may be
performed only from time to time, not assigning new users to clusters until either some period
of time has passed or there is a sizable amount of new information.
Algorithm 1 Clustering reputation-based ranking algorithm.
1: input: α, dataset
2: build S from dataset and apply threshold α
3: build G, computing its adjacency matrix M ≡ M(S)
4: find the connected components of M, $\{M_j\}_{j=1}^{m}$
5: output: weighted average of $\{\mathrm{rep\_rank}(M_j)\}_{j=1}^{m}$
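As an illustration, Algorithm 1 can be sketched in Python as follows. The names `connected_components` and `clustered_ranking` are hypothetical, `similarity` and `rep_rank` are supplied by the caller, and the sketch compares all pairs of users for simplicity (an assumption that costs quadratic time, unlike the sparse construction described in Section 3.3.2). It returns the per-cluster rankings and leaves the final weighted average of step 5 to the caller.

```python
from collections import defaultdict


def connected_components(users, similarity, alpha):
    """Steps 2-4: threshold the similarity at alpha and return the clusters."""
    users = list(users)
    neighbours = defaultdict(set)
    for idx, u in enumerate(users):
        for v in users[idx + 1:]:
            if similarity(u, v) > alpha:  # S[u][v] = 1
                neighbours[u].add(v)
                neighbours[v].add(u)
    seen, clusters = set(), []
    for start in users:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # iterative depth-first search
            u = stack.pop()
            component.add(u)
            for v in neighbours[u] - seen:
                seen.add(v)
                stack.append(v)
        clusters.append(component)
    return clusters


def clustered_ranking(ratings, similarity, alpha, rep_rank):
    """Return the clusters and the ranking computed by rep_rank inside each one."""
    clusters = connected_components(ratings.keys(), similarity, alpha)
    per_cluster = [rep_rank({u: ratings[u] for u in c}) for c in clusters]
    return clusters, per_cluster
```

Here `ratings` maps each user to a dictionary of item ratings, and `rep_rank` receives the ratings restricted to one cluster.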
3.3 Implementation
Here, we show how to compute the ranking of an item and the reputation of a user, by
considering the algorithms defined in equations (3.2) and (3.3) below. We compute the ranking
of the item, ri, as a weighted average. That is, the rating of user u to item i is weighted by
the user reputation, cu, and therefore gR in (3.1) becomes:
$$r_i^{k+1} = \frac{\sum_{u \in U_i} R_{ui}\, c_u^{k+1}}{\sum_{u \in U_i} c_u^{k+1}}. \qquad (3.2)$$
It is worth highlighting that our formulation of ranking differs from the one presented in [Li
et al., 2012], which, instead of normalizing by the sum of the users' reputations, divides by the
number of users that rated the item i, |U_i|. Our definition allows us to have a ranking that
is based on the reputation of the users (thus more robust), but it makes it more challenging
to prove the convergence of the method, which we present in Section 3.3.1 (indeed, having a
sum, instead of a constant value, means that we cannot simply take the absolute value of the
constant |U_i| out of the norm).
For hR in (3.1), we tested three different functions, parametrized by fλ,s:
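$$c_u^{k+1} = 1 - f_{\lambda,s}(I_u)\, e_{R,u}(r), \qquad (3.3)$$
where
$$e_{R,u} = \begin{cases} \frac{1}{|I_u|} \sum_{i \in I_u} |R_{ui} - r_i^k|^p, \\ \max_{i \in I_u} |R_{ui} - r_i^k|^p, \\ \min_{i \in I_u} |R_{ui} - r_i^k|^p. \end{cases}$$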
The user's reputation is chosen as a function of the average, maximum, or minimum disagreement
between the individual user's ratings, R_ui, and the rankings of the rated items, r_i. In order to
control the penalization a user incurs for not rating according to the ranking, we define a
decay function f_{λ,s}. We consider four different decay functions:
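i) $f^1_{\lambda,s}(x) = \lambda$,
ii) $f^2_{\lambda,s}(x) = \lambda \left(1 - e^{-\frac{x}{2}}\right)$,
iii) $f^3_{\lambda,s}(x) = \lambda \left[1 - (1 - \upsilon)\left(1 + e^{s - x}\right)^{-1}\right]$,
iv) $f^4_{\lambda,s}(x) = \begin{cases} 1 & \text{if } x \geq 10, \\ 1/2 & \text{otherwise,} \end{cases}$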
where λ ∈ [0, 1[ determines the penalization a user incurs for rating differently than the
ranking, υ ∈ ]0, 1[ is the lowest penalization a user can incur, and s ∈ N is a parameter based
on the number of rated items such that the penalization is decreased by a half. The role of
the decay function is to control the penalization a user u suffers if his/her rating of the item,
R_ui, is not close to its ranking r_i. The first, constant, function f^1_{λ,s} above is proposed in [Li et al.,
2012], the second is an exponential decrease function, the third is a logistic function, and the
fourth is a threshold function. In the second and third cases the penalization increases and
decreases, respectively, with the number of rated products. In the remainder of the chapter we
fix for h_R the average and for f_{λ,s} the constant function, f^1_{λ,s}, denoting by bipartite weighted
average (BWA) the resulting iterative scheme in equations (3.2) and (3.3). We fix
these two functions because they are easy to compute and, considering the different f_{λ,s} and
the datasets used to evaluate our proposal, there is not much difference between the functions
(more details are provided in Section 3.5). Hence, they represent a good trade-off between
efficiency and effectiveness.
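A minimal Python sketch of the BWA scheme may help to fix ideas. It alternates (3.3), with the constant decay f = λ and the average disagreement, and (3.2). Several details are assumptions made only for illustration: the function name `bwa`, the starting point (the arithmetic average of the ratings), the exponent p = 1, and the stopping rule (largest ranking change below `tol`). Ratings are assumed to be normalized to [0, 1], and every user is assumed to have rated at least one item.

```python
def bwa(ratings, lam=0.3, p=1, tol=1e-9, max_iter=1000):
    """Iterate (3.3) then (3.2) until the rankings stop changing.

    ratings: {user: {item: rating in [0, 1]}}; returns (rankings, reputations).
    """
    raters = {}
    for u, items in ratings.items():
        for i in items:
            raters.setdefault(i, []).append(u)
    # Assumed start: plain arithmetic average of the ratings of each item.
    rank = {i: sum(ratings[u][i] for u in us) / len(us) for i, us in raters.items()}
    rep = {u: 1.0 for u in ratings}
    for _ in range(max_iter):
        # (3.3) with f = lam and the average disagreement e_{R,u}.
        rep = {
            u: 1.0 - lam * sum(abs(r - rank[i]) ** p for i, r in items.items()) / len(items)
            for u, items in ratings.items()
        }
        # (3.2): reputation-weighted average of the ratings of each item.
        new_rank = {
            i: sum(ratings[u][i] * rep[u] for u in us) / sum(rep[u] for u in us)
            for i, us in raters.items()
        }
        change = max(abs(new_rank[i] - rank[i]) for i in rank)
        rank = new_rank
        if change < tol:
            break
    return rank, rep
```

With three users rating one item 1.0, 1.0 and 0.2, the dissenting user ends with the lowest reputation, so the resulting ranking lies above the arithmetic average.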
3.3.1 Convergence
Here, we prove the convergence of the proposed method. In what follows, for a given vector
$x \in \mathbb{R}^n$ and $p \in \mathbb{Z}^+$, the p-norm of x is $\|x\|_p = \left(\sum_{j=1}^{n} |x_j|^p\right)^{1/p}$, and the ∞-norm is $\|x\|_\infty = \max_{j \in \{1,\ldots,n\}} |x_j|$.
Lemma 3.3.1. For all $\lambda \in [0, (1 + \Delta_R)^{-1}[$, the iterative method in (3.1) with functions $g_R$
and $h_R$ defined as in (3.2) and (3.3) converges.
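Proof. Between iterations, $r^{k+1}$ and $r^k$, we get
$$\|r_i^{k+1} - r_i^k\|_\infty = \left\| \frac{R_i \cdot c^{k+1}}{\|c^{k+1}\|_1} - \frac{R_i \cdot c^k}{\|c^k\|_1} \right\|_\infty.$$
Here, $R_i \in [0, 1]^{|U|}$ denotes a vector that contains the rating $R_{ui}$ that each user u gave to item
i (the element corresponding to a user is 0 if s/he did not rate the item).
Without loss of generality, assume that $\|c^{k+1}\|_1 \geq \|c^k\|_1$; then the above difference is equal
to
$$\begin{aligned}
& \left\| \frac{R_i \cdot c^{k+1}}{\|c^{k+1}\|_1} - \frac{R_i \cdot c^k}{\|c^{k+1}\|_1} + \frac{R_i \cdot c^k}{\|c^{k+1}\|_1} - \frac{R_i \cdot c^k}{\|c^k\|_1} \right\|_\infty \\
&\leq \left\| \frac{R_i \cdot c^{k+1}}{\|c^{k+1}\|_1} - \frac{R_i \cdot c^k}{\|c^{k+1}\|_1} + \frac{R_i \cdot c^k}{\|c^k\|_1} - \frac{R_i \cdot c^k}{\|c^k\|_1} \right\|_\infty \\
&\leq \frac{R_\top}{\|c^{k+1}\|_1} \left| c_\gamma^{k+1} - c_\gamma^k \right|,
\end{aligned}$$
where $\left| c_\gamma^{k+1} - c_\gamma^k \right| = \max_{u \in U_i} \left| c_u^{k+1} - c_u^k \right|$. The iteration step for the reputation, c, gives us
$$|c_u^{k+1} - c_u^k| \leq \frac{|f_{\lambda,s}(I_u)|}{|I_u|} \sum_i \left| \, |R_{ui} - r_i^k|^p - |R_{ui} - r_i^{k-1}|^p \, \right| \leq \lambda\, |r_\beta^k - r_\beta^{k-1}|,$$
where $\left| r_\beta^{k+1} - r_\beta^k \right| = \max_{i \in I_u} \left| r_i^{k+1} - r_i^k \right|$, and where we used the triangular inequality, the translation
invariance of norms and the fact that $|f_{\lambda,s}(I_u)| \leq 1$. Combining the previous inequalities we
get
$$|r_i^{k+1} - r_i^k| \leq \frac{\lambda}{\|c^{k+1}\|_1}\, |r_\beta^k - r_\beta^{k-1}|, \qquad (3.4)$$
which is a contraction for $\lambda < (1 + \Delta_R)^{-1}$, since $1 - \Delta_R \lambda \leq \|c\|_1 \leq 1$. Therefore (3.1)
converges.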
In this work, we consider $\Delta_R = 1 - 0.2$. Therefore, if $\lambda < 5/9$ then the algorithm converges.
However, we may ensure convergence for any $\lambda \in [0, 1[$ by changing the denominator of (3.3) to
$\max\{\|c^{k+1}\|_1, 1\}$.
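As a quick check of the bound in Lemma 3.3.1 for the value of $\Delta_R$ considered here (the value 0.2 presumably being the smallest normalized rating, $R_\perp / R_\top = 1/5$, for the datasets of Section 3.5):

```latex
\Delta_R = 1 - 0.2 = 0.8,
\qquad
(1 + \Delta_R)^{-1} = \frac{1}{1.8} = \frac{5}{9} \approx 0.556 .
```

The value λ = 0.3 used in all experiments (Section 3.4.2) lies below this bound, so the lemma applies to the reported results without modifying the scheme.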
3.3.2 Computational complexity analysis
The time complexity of Algorithm 1 is given by the sum of the complexities of each step. Let
G = (V, E) denote a graph, where V is a set of vertices and E a set of edges. Step 3 consists in
building G, where V = U; this is done by computing its sparse adjacency matrix, M, where each
rating is used once. Hence, the time complexity is O(|C||R|), where |C| = O(1) for the similarities
LS and KS, and where, for the CS, |C| is the worst-case complexity of compressing the
concatenation of pairs of users. Step 4 can be performed using Tarjan's Algorithm [Hopcroft
and Tarjan, 1973], with worst-case time complexity O(|V| + |E|). Step 5 has, in
the worst case, the same time complexity as [Li et al., 2012], i.e., O(κ|R|). In summary,
Algorithm 1 has worst-case time complexity O((κ + |C|)|R| + |V| + |E|). In theory, |E|
can be, in the worst case, |U|^2, leading to a time complexity of O((κ + |C|)|R| + |U|^2). In
practice, since (often) the users are sparsely connected in G, |E| = O(|U|), resulting in a
time complexity of O((κ + |C|)|R| + |U|). In all cases, the space complexity of Algorithm 1
is O(|R|).
3.4 Experimental setup
Next, we detail the metrics we use to evaluate the ranking systems we propose. Further, we
detail the type of attacks and spam that we consider and that we explore in two datasets, in
Section 3.5.
3.4.1 Evaluation metrics
To quantitatively assess the quality of the ranking systems, we compute the Kendall rank
correlation coefficient, a.k.a. Kendall's tau (see footnote 1), τ [Kendall, 1938]. This statistic measures the
ordinal association between two quantities. Intuitively, the Kendall correlation between two
variables is higher when observations are identical and lower otherwise.
The effectiveness is given by the Kendall tau of the rankings' vector, r, versus a ground
truth, r̂, that is τ(r, r̂). Usually the ground truth used is the AA, due to its simplicity and
its popularity among ranking systems [De Kerchove and Van Dooren, 2010, Jurczyk and
Agichtein, 2007]. However, evaluating the discrepancy between the ranking vector, r, and the
AA might not be very informative, since it does not capture the possible multimodal behavior
of ratings, and therefore might not be very useful to evaluate the quality of a ranking system.
For this reason, we opt for the robustness metric. Notice that, in the multipartite case, the
effectiveness is helpful to check for homogeneity within the clusters. We generalize the Kendall
tau as
$$\tau = \frac{1}{|M|} \sum_{j=1}^{N} |M_j|\, \tau_{M_j}, \qquad M = \bigcup_{j=1}^{N} M_j, \qquad M_j \cap M_q = \emptyset,$$
where M_j is a subgraph of M. We denote the effectiveness of a cluster, M_j, by $\tau(r_{M_j}, r_{AA}|_{M_j})$.
Footnote 1: Given two sets X and Y, let C and D denote the sets of concordant and discordant pairs of elements in
X × Y, respectively. The Kendall's tau is defined as τ = (|C| − |D|)/(|C| + |D|).
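The two quantities can be sketched in Python as follows: the Kendall tau from its definition in terms of concordant and discordant pairs, and the generalized tau as the cluster-size-weighted average. The function names are hypothetical, and two choices are assumptions of this sketch rather than part of the definition: tied pairs are skipped, and a vector with no comparable pair gets τ = 0.

```python
from itertools import combinations


def kendall_tau(x, y):
    """tau = (|C| - |D|) / (|C| + |D|) over all pairs of positions; ties are skipped."""
    concordant = discordant = 0
    for a, b in combinations(range(len(x)), 2):
        sign = (x[a] - x[b]) * (y[a] - y[b])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


def generalized_tau(cluster_sizes, cluster_taus):
    """Cluster-size-weighted average of the per-cluster Kendall taus."""
    return sum(s * t for s, t in zip(cluster_sizes, cluster_taus)) / sum(cluster_sizes)
```

Since the clusters are disjoint, the sum of the cluster sizes equals |M|, which is why the denominator is computed from `cluster_sizes`.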
The robustness evaluates the ability of the system to cope with noise or spamming
attacks. A noisy user gives random ratings to a random set of products [Aggarwal, 2016].
A spamming attacker targets a set of items with the intent of increasing (Push Attack) or
decreasing (Nuke Attack) their rankings.
For the multipartite case, the robustness Kendall tau is τ = τ(r, r_spam), where r is a vector
of r_i's given by
$$r_i = \frac{1}{|M|} \sum_{m} |M_m|\, r_{i,M_m}, \quad \text{where } M = \bigcup_{m} M_m \qquad (3.5)$$
is the union of subnetworks where users rated item i, and ri,Mm denotes the ranking of
item i for the subnetwork Mm (if any user in the subnetwork rated the item, otherwise it
is undefined). This measure is useful to assess the quality of the partition of the original
network, and it can be used to tune the affinity level, α, between users so that they are in the
same cluster. For items not ranked in a subnetwork, or for new users, we average the rankings
among subnetworks, r, using weights proportional to the size of the subnetworks. Because
the weighted average is not a sufficient statistic, this protects the system against attacks.
We do not present the analysis of the personalization perspective in this chapter, that
is, the analysis of how much closer to the real user preferences our cluster-based ranking
system is, compared to the ranking systems that use just the AA or a weighted average of
the ratings. We make this choice because it is easy to see that a ranking produced by considering
the preferences of highly similar users is more personalized than a global one. Hence, in this
work, we focus on the robustness perspective and leave the personalization aspect as future
work.
3.4.2 Spamming and Attacks
In the bipartite graph scenario, the information available to a new user is the ranking of every
product. This information can be used by malicious users to tamper with the ranking of
an item (push or nuke it). For instance, in a reputation-based system,
an attacker can give ratings matching the ranking of items to increase his/her reputation before
attacking an item.
When allowing for subnetworks, either the user is already classified into a cluster and
he/she accesses the item's ranking within that cluster, or he/she is a new user. In the latter case,
the displayed ranking, r_j, of the item, j, is the weighted average of its ranking within each
subnetwork. Both of these scenarios mitigate the spamming effect. Since the information
made available is not a sufficient statistic, a user cannot fully recover all the information needed to
efficiently attack the underlying ranking system.
Next, we discuss the robustness of the algorithm to different kinds of spamming/attacks:
• Random spamming : A set of spammers gives random ratings, uniformly distributed on
R, to a random number of items, following a Poisson distribution, starting at 1 with
parameter λP = 5. The rated items are randomly sampled from the initial dataset
distribution of ratings’ number per item.
• Love/hate attack : A set of spammers targets one item to push/nuke and selects another
set of items to nuke/push. In our simulations, each attacker nukes the most voted item
and pushes another random set of nine filler items.
• Reputation attack : In this case, a set of spammers targets one item to push/nuke its
ranking. They randomly select another fixed number of items, from the initial dataset,
typically the most popular ones, and give them the closest ratings to their rankings.
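As an illustration of how such attack profiles could be generated, the following Python sketch builds the love/hate attack used in the simulations: each attacker nukes the most voted item and pushes nine random filler items. The function name `love_hate_profiles` is hypothetical, and the sketch assumes normalized ratings (0.2 as the lowest rating and 1.0 as the highest), a fresh random filler set per attacker, and a fixed seed for reproducibility; none of these details is fixed by the description above.

```python
import random


def love_hate_profiles(item_votes, n_attackers, r_min=0.2, r_max=1.0, n_fillers=9, seed=0):
    """Build attacker profiles that nuke the most voted item and push random fillers.

    item_votes: {item: number of ratings}; returns ({attacker: {item: rating}}, target).
    """
    rng = random.Random(seed)
    target = max(item_votes, key=item_votes.get)
    others = sorted(i for i in item_votes if i != target)
    profiles = {}
    for k in range(n_attackers):
        fillers = rng.sample(others, n_fillers)      # assumed: new filler set per attacker
        profile = {item: r_max for item in fillers}  # push the filler items
        profile[target] = r_min                      # nuke the target item
        profiles[f"attacker_{k}"] = profile
    return profiles, target
```

The returned profiles can be merged into the original ratings before recomputing the rankings, giving the vector r_spam that is compared against r.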
In all experiments we set λ = 0.3, α = 0.8, and for the LS method the confidence level
function $\ell(|I_{u,v}|) = \theta^{-1}$ if $|I_{u,v}| \leq \theta$ and 1 otherwise. The parameter θ sets the number of
common rated items of users u and v from which we are confident that they can be similar.
We choose θ = 3. To evaluate the effect of the attacks/spamming, we compute the robustness
Kendall tau, τ(r, r_spam).
3.5 Experimental results
We run all experiments on MATLAB 2016, using macOS 10.11 (2.8 GHz Intel Core 2 Duo
and 4 GB RAM) (see footnote 2).
Datasets. In this work, we use two real world datasets obtained from the Stanford Large
Network Dataset Collection [Leskovec and Krevl, 2014]. We use, as the first dataset, the 5-core
version of the “Amazon Instant Video” dataset (Dataset A), which consists of users that rated
at least 5 items, as in [McAuley et al., 2015]. It has 5,130 users, 1,685 items and 37,126
ratings, with R_⊥ = 1 and R_⊤ = 5, see Table 3.1. We use, as the second dataset, the 5-core
version of “Tools and Home Improvement” (Dataset B), also in [McAuley et al., 2015], see
Table 3.1. This dataset has 16,638 users, 10,217 items and 134,476 ratings, also with R_⊥ = 1
and R_⊤ = 5. Both datasets consist of tuples of the form (user, item, rating, timestamp), and
for both we normalize the ratings by dividing them by R_⊤.

          “Amazon Instant Video”   “Tools and Home Improvement”
Users     5,130                    16,638
Items     1,685                    10,217
Ratings   37,126                   134,476
Table 3.1: Details of the datasets A and B.

Footnote 2: The code will be made available as soon as the anonymity restrictions are no longer required.
The choice to employ the 5-core version of the datasets is intrinsically related to the
scenario we consider in the evaluation, i.e., a ranking system where users are clustered based
on their similarity. Therefore, having information about the user preferences is key to measure
effective similarities (indeed, if two users did not rate any common items, their similarity would
be 0).
Benchmarks. We compare our results with the reputation-based ranking system in [Li et al.,
2012]. The authors already compared their algorithm with state-of-the-art algorithms,
namely HITS [Kleinberg, 1999], Mizz [Mizzaro, 2003], YZLM [Yu et al., 2006a]
and dKVD [De Kerchove and Van Dooren, 2010], showing that their algorithm
outperforms all the others in the standard metrics.
Next, we test the robustness of the ranking systems against spamming (noise) and attacks
using the two real datasets. First, we analyze the behavior of the ranking system in the
presence of noise for the two datasets, in Section 3.5.1. Next, we evaluate the robustness of
the algorithms against Love/Hate and Reputation attacks, in Section 3.5.2. Finally, we discuss
how the robustness of the proposed ranking systems responds to changes in the parameters
of the system, namely the parameter α, in Section 3.5.3. We also test this response for the
different decay functions fλ,s and different parameters λ, but since the gains are small, we
omit the tests.
3.5.1 Robustness against random spamming (noise)
We test the random spamming (noise) by simulating a proportion of spammers ranging from
0 to 0.75 of the total number of ratings. The results are reported in Figure 3.1. Using the
multipartite ranking systems, we notice an increase of robustness for the LS, whereas for the
KS and CS we obtain similar robustness to the bipartite methods, because these similarities
accommodate new users by rearranging the clusters, and this degrades the τ.
[Figure 3.1: Evolution of the τ for random spamming with the proportion of spammers. Panels: (a) Dataset “Amazon Instant Video”; (b) Dataset “Tools and Home Improvement”. Axes: proportion of spammers (0 to 0.75) versus τ.]
[Figure 3.2: Evolution of the τ for the love/hate attack with proportion of spammers. Panels: (a) Dataset “Amazon Instant Video”; (b) Dataset “Tools and Home Improvement”. Axes: proportion of spammers (0 to 0.75) versus τ.]
[Figure 3.3: Evolution of the ranking of the targeted item, r_target, for love/hate attack with proportion of spammers. Panels: (a) Dataset “Amazon Instant Video”; (b) Dataset “Tools and Home Improvement”. Axes: proportion of spammers (0 to 0.75) versus r_target.]
[Figure 3.4: Evolution of the ranking of the targeted item, r_target, for reputation attack with proportion of spammers. Panels: (a) Dataset “Amazon Instant Video”; (b) Dataset “Tools and Home Improvement”. Axes: proportion of spammers (0 to 0.75) versus r_target.]
[Figure 3.5: Evolution of the τ for the reputation attack with proportion of spammers. Panels: (a) Dataset “Amazon Instant Video”; (b) Dataset “Tools and Home Improvement”. Axes: proportion of spammers (0 to 0.75) versus τ.]
3.5.2 Robustness against attacks
Now, we simulate two different attacks on the most voted item, r_target, varying the proportion
of attackers from 0 to 0.75 of the total number of voters, in this case, of the target item.
Love/Hate attack. For the love/hate attack, we obtain the results in Figures 3.2 and 3.3.
We can see that, using the multipartite ranking systems, the attack is less effective on both
datasets. In the case of the variation of τ, in Figure 3.2, the results are significantly better
when we perform the clustering with the LS. In both datasets, the effect of the attack on the
ranking of the target item, r_target, in Figure 3.3, is more attenuated in the multipartite scenario,
thus less effective. The best similarity measure to avoid the effect of the attack on the target
item's ranking is the KS.
A remarkable property is that while the KS is effective in deterring the attack on the ranking of
the product, it has the most adverse effect on τ. This is a consequence of the reorganization
of the subnetworks, to minimize the effect of the attack on r_target, and our generalization of
the Kendall tau does not account for this repercussion. This indicates that the attackers are
not grouped with normal users and thus do not affect the rankings of items in the cluster.
In the multipartite cases, the ranking is not nuked as it is when using the ranking system in [Li
et al., 2012] and BWA.
Reputation attack. In both datasets, using subnetworks, the effect of the reputation attack
on the ranking of the targeted item is dimmed, see Figure 3.4. Because the intelligent attacker
chooses the closest rating to the ranking of the filler items, the attack should not drastically affect
the rankings of the filler items, while the attackers increase their reputation. In fact, in the
multipartite ranking systems, the ranking of the nuked item drops less than in the bipartite
ranking systems, and the best case is when using LS. The robustness τ, see Figure 3.5, has
a similar behavior as in the love/hate attack, and the best robustness is achieved in the
multipartite scenario when using the LS.
The organization of subnetworks changes with the increasing number of spammers; this is
a collateral effect of the system that helps to cope with the attack. This effect produces
a bigger change in τ, because it drastically reduces the effect of the targeted attack. Since
the ranking of the filler items does not change drastically (the attackers rate those items with
their weighted average ranking), this is not an important side effect. Moreover, in the larger
cluster, containing users who rated the targeted item, the ranking of the item is kept almost
unchanged when using LS. For the KS, it has a small variation, and it has a large variation
for the CS. Both variations reflect the opposite effect on the ranking of the targeted item to
what is intended by the attacker, see Figure 3.6. The clusterings produced by the KS and the
CS aggregate attackers with legitimate users (that gave smaller ratings to the target item) in a
separate cluster, leaving raters who gave high ratings in the biggest cluster. Recall that for
new users, the displayed rankings are a weighted average of the ratings by users' reputations,
whereas in each cluster they are the weighted average within the users of the cluster.
An important observation is that the experiments, in both datasets, are coherent in the
sense that we obtain similar results for the different attacks and spam. This indicates that
when we scale the size of the dataset, we expect to get similar robustness to the attacks for
the evaluated metrics, τ and r_target. As we pointed out, the multipartite scenario allows us
to present rankings of items to users that allow a multimodal behavior, and this can also
be explored for the item recommendation scenario, where we expect to get recommendations
more tailored to the users.
[Figure 3.6: Evolution of r with proportion of attackers, for reputation attack, in the largest cluster. Panels: (a) Dataset “Amazon Instant Video”; (b) Dataset “Tools and Home Improvement”. Axes: proportion of spammers (0 to 0.75) versus r_target in the largest cluster.]
3.5.3 Sensitivity to parameters
Here, we discuss the response of our reputation-based ranking system to the variation of its
parameters, using Dataset B (the results for Dataset A are almost the same, so they have not
been reported due to space constraints).
In Figure 3.7, we look for the affinity level, α, such that the variation of r_target with the
increase of the number of attackers is smaller, in the case of the love/hate attack. For the LS
case, Figure 3.7a, we have the best results for α ∈ [0.4, 0.6]. For the CS, Figure 3.7b, we obtain
the best results for α ∈ [0.5, 0.6]. Finally, for KS, Figure 3.7c, the best affinity level is
α ∈ [0.6, 0.9]. To choose a satisfactory level of affinity we need to analyze the effect of α, not
only on the ranking of the attacked item, r_target, but also on the robustness, τ. In this case,
we look for values of τ close to one.
When using LS for clustering and comparing both Figures 3.7a and 3.8a, we see that the
best affinity level lies in the interval α ∈ [0.4, 0.6]. Choosing some α in this interval allows the
system to protect the ranking of an item, r_target, while maintaining the robustness of the system, τ,
close to one. In both the CS and KS cases, the affinity level that better protects the ranking
of the attacked item produces worse robustness to attacks, not only within the clustering
method, but also when compared to LS. These results are in line with the previous discussion
in Section 3.5.2. We suspect that the effect of parameter α on the τ metric is due to the
fact that CS and KS produce more clusters (without a bigger one) and the users tend to be
regrouped as the proportion of attackers changes, and this effect is not captured by τ.
[Figure 3.7: Variation of r_target with the affinity parameter, α, for different proportions of attackers (0.05, 0.15, 0.25). Panels: (a) r_target versus α, using LS; (b) r_target versus α, using CS; (c) r_target versus α, using KS.]
3.6 Concluding Remarks
In this chapter, we advanced the state of the art in ranking systems, both theoretically and
algorithmically. We developed a new multipartite ranking system that allows the coexistence
of multiple preferences by enabling different rankings for the same item for different users. This
is achieved by automatically clustering similar users, based on their given ratings. For each
cluster, we used a bipartite reputation-based ranking system, for which we proved convergence
and efficiency in a more general setting than previous results. Our method favors the creation
of bubbles, i.e., segregates users into groups, which, as we show, makes the ranking system
more robust to attacks and spamming.
As future work, we will analyze the impact of our approach regarding personalization. We
will also investigate the effect of bribing users to influence the ranking of items, as in [Grandi
and Turrini, 2016]. Also, in order to reduce the rate of change in the clusters, we will
explore the use of steadiness functions, based on a timestamp, so that established clusters do
not change so easily. Another possible extension of the proposed algorithm is to use it for
recommendation systems.
[Figure 3.8: Variation of τ with the affinity parameter, α, for different proportions of attackers (0.05, 0.15, 0.25). Panels: (a) τ versus α, using LS; (b) τ versus α, using CS; (c) τ versus α, using KS.]
Chapter 4
Reputation-based Ranking Systems
and their Resistance to Bribery
This chapter studies the effect of bribing in two classes of reputation-based ranking systems,
the bipartite and the multipartite reputation-based ranking systems. For both scenarios,
we define bribing strategies and the profit of playing a bribing strategy. We compute the
optimal bribing strategies, and we test the bipartite and multipartite reputation-based ranking
systems introduced in Chapter 3. As expected, the latter is more robust to bribery than the
former. Hence, we endorse the step forward we took in Chapter 3, when designing a ranking
system that assures more information security. This work is published in [Saude et al., 2017].
4.1 Introduction
The evolution of contemporary society towards an information economy boosted the development
of e-commerce. The fast pace of information spreading around the world has been
reconfiguring our social interactions. Social networks and online fora have encouraged the exchange
of opinions and the online word of mouth (WOM), which in turn gained the potential
to drive e-commerce sales, see, for instance, [Davis and Khazanchi, 2008, Kietzmann and Canhoto, 2013]
and [Maslowska et al., 2017].
Nowadays, the importance of reviews and rankings has become paramount for sellers, as the
visibility and the sales numbers are related to them [Chevalier and Mayzlin, 2006, Dellarocas
et al., 2007] and [De Maeyer, 2012]. Further, studies pointed out that, in several cases, online
reviews may be more influential than traditional marketing, see [Bickart and Schindler, 2001].
This strong influence increased the attempts to manipulate them [Hu et al., 2012], and it
fostered the need for designing ranking systems robust to spam and attacks, as in [Li et al.,
2012, Saude et al., 2017] and references therein. Companies invest money to convince
users to vouch for their products/services, either by giving samples of them so that users can
comment on them or by directly paying users to provide positive feedback on their products
and negative feedback on competitors' products, see [Cialdini and Garde, 1987]. Aware of the importance
and influence of bribing to manipulate rankings, we model this phenomenon and characterize
it quantitatively, so that its impact can be better understood. We mitigate the impact of such
behaviors by showing that reputation-based ranking systems using clusters are, in general,
more robust to bribery.
Previous work. The influence of individual decisions on global properties in network-
based rating systems was studied in several works. In [Apt and Markakis, 2014], the authors
investigate how to turn a product into a tendency among users by changes on a social network.
In [Simon and Apt, 2015], the authors explore how to design an impartial mechanism for peer
review to mitigate the effect of a reviewer interfering with the likelihood of its work being
accepted.
In Chapter 3, we proposed a reputation-based ranking system that clusters users by their
ranking pattern similarities, an idea that we also explore in the context of recommender
systems in the subsequent chapters of the dissertation. We showed that, by doing so, our
approach is more robust to both spamming users and users trying to attack the ranking
system to change the ranking of a set of items.
The authors of [Grandi and Turrini, 2016] analyzed the resistance of two ranking systems,
one that simply averages the ratings of users (AA), another that takes into account the
influence network of a given user, using the AA to compute the ranking for each network.
They showed that the AA ranking system is bribable, and, in particular, bribing users who did
not rate is profitable. When considering social networks of users, they show that the bribery
effect is diminished. Their work assumes a fixed set of users with only one item to rate. The
AA does not capture a possible multimodal ratings’ behavior, as noticed in [Hu et al., 2006].
This motivated us to study bribing in reputation-based ranking systems and to explore the
case where users are clustered in groups which, intuitively, must lessen the bribing effect.
Our contribution. Here, we study the resistance to bribery of a class of ranking systems
that assign reputations to users. We show that a ranking system computing the items' rankings as
the average of the users' ratings weighted by their reputations is bribable, since users that rated
the item with a reputation above the users' average reputation are bribable. By clustering
users by their rating pattern and assigning possibly different rankings for the same item for
each cluster, we increase the bribery resistance of the ranking system. This makes the ranking
system that we propose in Chapter 3 much more robust to bribing. Further, a user is bribable
if its reputation is larger than the average reputation of the users that rate the item. This
bound increases in the clustering scenario, since within each cluster the number of users that
rated the item is smaller than in the non-clustering scenario. Our model also applies to evaluating
marketing strategies, where a company is willing to invest money to augment its sales, either by
increasing the user base or by boosting positive reviews.
4.2 Definitions
Here, we set up the notational conventions and definitions to keep the chapter self-contained.
Recall that reputation-based ranking systems assign weights to users to aggregate their rank-
ings on given items, see Section 2. Further, we discuss two classes of ranking systems, namely,
the bipartite reputation-based ranking systems and the multipartite reputation-based ranking
systems, that include the instances presented in Chapter 3.
Schematically, as we stated in Chapter 3, we represent these ranking systems as a bipartite
graph, in which a set of vertices corresponds to users and the other to items. The edges
connecting vertices are weighted by the ratings that users gave to items and only connect
users to items, recall Figures 2.1 and 2.2 from the Chapter 2 that we condensate in Figure 4.1.
The bipartite graph corresponds to Figure 4.1 when not considering the dashed lines.
Recall that a multiparite ranking takes two steps. In the first step, the users are clustered
by their rating pattern similarities. In the second step, the ranking of each item is iteratively
computed, as in the bipartite ranking system case, only using information from each cluster,
producing (possibly) different rankings for different clusters. Since the ranking of an item may
differ from cluster to cluster, it is worth recalling that every user on the same cluster accesses
the local ranking of a given item, in the case where the item was rated by, at least, one user
belonging to that cluster. If for a given item, within a given cluster, no user rated that item,
the available ranking for that item is the weighted average of that item’s rankings among
clusters where the item was rated. Each user can link to several items, by edges weighted by
its ratings, as before. Now, we allow for edges between users (encoding similarities between
them), with clusters of connected users, forming a multipartite graph, see Figure 4.1. In the
bipartite ranking, recalling (3.2) and writing it without the iteration index, the ranking of
item $i$ is computed by
\[ r_i = \frac{1}{\alpha} \sum_{u \in U_i} c_u R_{ui}, \quad \text{where } \alpha = \sum_{u \in U_i} c_u, \]
and $U_i$ denotes the set of users that rated item $i \in I$, where $I$ is the set of all items.
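As a minimal sketch of the formula above, the following Python snippet computes the reputation-weighted ranking of a single item. The function name, the dictionaries `ratings_i` and `reputation`, and their values are illustrative assumptions, not data or code from the thesis.

```python
def bipartite_ranking(ratings_i, reputation):
    """Reputation-weighted ranking r_i of one item.

    ratings_i: dict user -> R_ui, only for the users that rated the item (U_i).
    reputation: dict user -> c_u > 0.
    """
    alpha = sum(reputation[u] for u in ratings_i)  # alpha = sum of c_u over U_i
    return sum(reputation[u] * r for u, r in ratings_i.items()) / alpha


ratings_i = {"u1": 0.8, "u2": 0.4, "u3": 0.6}    # hypothetical ratings R_ui
reputation = {"u1": 0.9, "u2": 0.3, "u3": 0.6}   # hypothetical reputations c_u
r_i = bipartite_ranking(ratings_i, reputation)
print(round(r_i, 4))
```

With equal reputations the same function reduces to the arithmetic average of the ratings.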
Figure 4.1: Bipartite/multipartite graph representation of users and items, with edges
interconnecting them weighted by the users' ratings for items, not considering/considering the
dashed links.
Next, we recall the definition of rankings for the multipartite ranking systems, introduced
in (3.5). In the multipartite scenario, let $M_1, \ldots, M_N$ be a partition of the set of users $U$
into $N$ disjoint groups of users, that is,
\[ U = \bigcup_{n=1}^{N} M_n \quad \text{and, for } m \neq n, \quad M_m \cap M_n = \emptyset. \]
We denote the set of items rated by users from cluster $M_n$ by $I_{M_n}$, with $I_{M_n} = \bigcup_{u \in M_n} I_u$,
where $I_u$ is the set of items rated by user $u$. The set of users in the cluster $M_n$ that rated item
$i$ is denoted by $U_i^{M_n}$, where $U_i^{M_n} = U_i \cap M_n$. Now, the ranking is computed independently
for each cluster as
\[ r_i^{M_n} = \frac{1}{\alpha} \sum_{u \in U_i^{M_n}} c_u R_{ui}, \]
where $\alpha = \sum_{u \in U_i^{M_n}} c_u$. As described in Chapter 3, note that for users belonging to a cluster
$M_n$, the displayed ranking of item $i$ can be one of the following two possibilities: (i) the
ranking of the item for that cluster, $r_i^{M_n}$, whenever there are users in the cluster that rated
item $i$; (ii) otherwise, the ranking of item $i$ is the weighted average of the rankings of $i$ for
the clusters with users that rated item $i$, that is,
\[ r_i = \sum_{n \in X_i} |U_i^{M_n}| \, r_i^{M_n} \Big/ \sum_{n \in X_i} |U_i^{M_n}|, \]
where $X_i = \{ m : i \in I_{M_m} \text{ and } m = 1, \ldots, N \}$. In what follows, for a set of users $U' \subseteq U$,
$c_{U'} = \sum_{u \in U'} c_u / |U'|$.
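The two display rules above can be sketched in Python as follows. This is only an illustration under assumptions: clusters are given as sets of user identifiers, the item is assumed to have at least one rater overall, and all names and values are hypothetical.

```python
def cluster_ranking(ratings_i, reputation, cluster):
    """Return (r_i^{M_n}, |U_i^{M_n}|) using only the raters inside `cluster`."""
    raters = [u for u in ratings_i if u in cluster]   # U_i intersected with M_n
    if not raters:
        return None, 0
    alpha = sum(reputation[u] for u in raters)
    r = sum(reputation[u] * ratings_i[u] for u in raters) / alpha
    return r, len(raters)


def displayed_ranking(ratings_i, reputation, clusters, n):
    """Ranking shown to users of cluster n: local value (i), else fallback (ii)."""
    local, count = cluster_ranking(ratings_i, reputation, clusters[n])
    if count > 0:
        return local
    pairs = [cluster_ranking(ratings_i, reputation, m) for m in clusters]
    rated = [(r, k) for r, k in pairs if k > 0]       # clusters in X_i
    return sum(k * r for r, k in rated) / sum(k for _, k in rated)


clusters = [{"u1", "u2"}, {"u3"}, {"u4"}]              # hypothetical partition
ratings_i = {"u1": 0.8, "u2": 0.4, "u3": 0.6}          # u4 did not rate the item
reputation = {"u1": 0.9, "u2": 0.3, "u3": 0.6, "u4": 0.5}
for n in range(len(clusters)):
    print(n, round(displayed_ranking(ratings_i, reputation, clusters, n), 4))
```

The third cluster has no rater of the item, so its users see the average of the other clusters' rankings weighted by their numbers of raters.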
Suppose the seller of item $i$ has an initial wealth proportional to the item's ranking and
to the number of customers that rated the item. To boost the sales of $i$, the seller may invest
his resources (wealth) to promote the item's popularity so that users like it more, and/or to
expand his consumer base by making people buy it, like it, but not necessarily love it. Here,
we model this setting assuming that the popularity is an increasing function of the ranking
of the item, $r_i$, and supposing that the number of consumers that bought the item is an
increasing function of the number of users that rated the product, $|U_i|$.
We define the reward function or wealth, in the bipartite and multipartite ranking systems,
for the seller of item $i$ as
\[ J_i = |U_i| \, r_i \quad \text{and} \quad \tilde{J}_i = \sum_{n \in X_i} J_i^{M_n}, \]
respectively, where $J_i^{M_n} = |U_i^{M_n}| \, r_i^{M_n}$.
We define the strategy of the seller of item $i$ as a vector $\sigma^i \in S_i$, with size $|U|$, where the
$u$-th entry is the value of the invested wealth to convince user $u$ to increase his rating by $\rho_u$,
and $S_i \subseteq [0, 1]^{|U|} \setminus \{\mathbf{0}\}$, where $\mathbf{0}$ is the null strategy that does not bribe any user. If user $u$
rated item $i$ with $R_{ui}$, then $\rho_u \leq 1 - R_{ui}$. If $\rho_u = 0$, the seller does not try to
persuade user $u$ to rate or to change his rating on item $i$.
For the seller of item $i$, we denote by
\[ \Xi_i = \{ \sigma^i \in S_i : \sigma^i(u) = \rho_u = 0 \text{ for all } u \notin U_i \} \]
the set of strategies that consist, exclusively, in bribing users that already rated the item $i$.
Analogously, we denote by
\[ \bar{\Xi}_i = S_i \setminus \Xi_i = \{ \sigma^i \in S_i : \sigma^i(u) = \rho_u = 0 \text{ for all } u \in U_i \} \]
the set of strategies of bribing users that did not rate item $i$. We say that a bribing strategy
$\sigma^i$ is an elementary strategy if for some user $u \in U$ we have that $\sigma^i_u > 0$ and, for all $v \in U$
with $v \neq u$, $\sigma^i_v = 0$. To ease notation, instead of denoting by $\sigma^i(u)$ the strategy of the seller of
item $i$ to bribe user $u$, we write $\sigma^i_u$. Further, the wealth spent by playing strategy $\sigma^i$ is given
by
\[ \|\sigma^i\|_1 = \sum_{u \in U} \sigma^i_u. \]
After strategy $\sigma^i$, the wealth of seller $i$ becomes
\[ J_{\sigma^i} = |U_{\sigma^i}| \, r_{\sigma^i} - \sum_{u \in U_{\sigma^i}} \rho_u \quad \text{and} \quad \tilde{J}_{\sigma^i} = \sum_{n \in X_i} |U^{M_n}_{\sigma^i}| \, r^{M_n}_{\sigma^i} - \sum_{u \in U_{\sigma^i}} \rho_u, \]
respectively, for the bipartite and the multipartite ranking systems, where $r_{\sigma^i}$ is the new value
of $r_i$ after $\sigma^i$.
The profit of playing the strategy $\sigma^i$ is
\[ \pi_{\sigma^i} = J_{\sigma^i} - J_i \quad \text{and} \quad \tilde{\pi}_{\sigma^i} = \tilde{J}_{\sigma^i} - \tilde{J}_i, \]
respectively, for the bipartite and multipartite ranking systems.
4.3 Bribing in ranking systems
Here we study the resistance to bribery of reputation-based ranking systems. To simplify
the analysis, we assume a fixed assignment of reputations to users. First, we describe the
set of decomposable bribing strategies. Then, we find the conditions for the strategies to be
profitable. Lastly, we compare bipartite ranking systems with multipartite ranking systems
and show that, by using clusters, the multipartite system is more robust to bribery.
4.3.1 Properties of strategies and their profit in the bipartite ranking systems
First, we investigate what particular conditions allow us to decompose a strategy into elementary
ones. We start by considering the case where the seller of item $i \in I$ bribes users that already
rated the item, proving that all strategies bribing several users at once are decomposable into
several elementary ones.
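Proposition 1. Let $u, v \in U_i$ be two users that rated the item $i \in I$. If two strategies, $\sigma^i_u$ and
$\sigma^i_v$, consist in bribing users to change their ratings from $R_{ui}$ and $R_{vi}$ to $R_{ui} + \rho_u$ and $R_{vi} + \rho_v$,
respectively, then we have that $\pi_{\sigma^i_u + \sigma^i_v} = \pi_{\sigma^i_u} + \pi_{\sigma^i_v}$.
Proof. When the seller of item $i$ plays the strategy $\sigma^i_u$, the ranking of item $i$ changes according
to $r_{\sigma^i_u} = r_i + \alpha^{-1} c_u \rho_u$. Thus, an elementary strategy's profit is
\[ \pi_{\sigma^i_u} = |U_i| \, r_{\sigma^i_u} - \rho_u - |U_i| \, r_i = \left( \frac{c_u}{c_{U_i}} - 1 \right) \rho_u. \tag{4.1} \]
The profit of the sum of strategies is given by
\[ \pi_{\sigma^i_u + \sigma^i_v} = |U_i| \, r_{\sigma^i_u + \sigma^i_v} - (\rho_u + \rho_v) - |U_i| \, r_i = \pi_{\sigma^i_u} + \pi_{\sigma^i_v}, \]
the sum of the profits of the elementary strategies $\sigma^i_u$ and $\sigma^i_v$.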
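A quick numerical check of (4.1) and of the additivity stated in Proposition 1 can be sketched as follows. The helper names and all numbers are assumptions made for illustration; reputations are kept fixed, as in the analysis.

```python
def wealth(ratings, reputation):
    """J_i = |U_i| * r_i, with r_i the reputation-weighted average rating."""
    alpha = sum(reputation[u] for u in ratings)
    r_i = sum(reputation[u] * x for u, x in ratings.items()) / alpha
    return len(ratings) * r_i


def profit(ratings, reputation, bribes):
    """Profit of a strategy {user: rho_u}; every bribed user is already a rater."""
    after = {u: x + bribes.get(u, 0.0) for u, x in ratings.items()}
    return wealth(after, reputation) - sum(bribes.values()) - wealth(ratings, reputation)


ratings = {"u1": 0.5, "u2": 0.4, "u3": 0.6}      # hypothetical R_ui
reputation = {"u1": 0.9, "u2": 0.3, "u3": 0.6}   # hypothetical c_u
avg_reputation = sum(reputation.values()) / len(reputation)   # c_{U_i}

p_u = profit(ratings, reputation, {"u1": 0.2})
p_v = profit(ratings, reputation, {"u2": 0.3})
p_uv = profit(ratings, reputation, {"u1": 0.2, "u2": 0.3})
closed_form_u = (reputation["u1"] / avg_reputation - 1) * 0.2  # equation (4.1)
print(p_u, closed_form_u, p_uv, p_u + p_v)
```

In this example the first user has above-average reputation and yields a positive profit, while the second has below-average reputation and yields a loss.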
Now, we consider the case when a seller opts to bribe users that did not rate the item i.
Proposition 2. Consider a user that did not rate the item $i$, i.e., $u \notin U_i$, and any other user
$v \in U$. The strategy that bribes both users, $u$ and $v$, does not carry the same profit as
the sum of the profits of bribing each user, i.e., $\pi_{\sigma^i_u + \sigma^i_v} \neq \pi_{\sigma^i_u} + \pi_{\sigma^i_v}$, unless both elementary
strategies have zero profit.
Proof. If both users did not rate the item $i$, their strategies change the ranking of the product
in the same way,
\[ r_{\sigma^i_u} = \frac{\sum_{v \in U_i} c_v R_{vi} + c_u \rho_u}{\alpha + c_u} = \frac{\alpha r_i + c_u \rho_u}{\alpha + c_u}. \]
Thus, we have
\[ \pi_{\sigma^i_u} = \frac{(\alpha - |U_i| \, c_u)(r_i - \rho_u)}{\alpha + c_u}. \tag{4.2} \]
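Hence, the profit for the sum of strategies, $\sigma^i_v + \sigma^i_u$, is
\[ \begin{aligned}
\pi_{\sigma^i_u + \sigma^i_v} &= |U_{\sigma^i_u + \sigma^i_v}| \, r_{\sigma^i_u + \sigma^i_v} - (\rho_u + \rho_v) - |U_i| \, r_i \\
&= \frac{\alpha + c_u}{\bar{\alpha}} \pi_{\sigma^i_u} + \frac{\alpha + c_v}{\bar{\alpha}} \pi_{\sigma^i_v} + \frac{1}{\bar{\alpha}} (\rho_u - \rho_v)(c_u - c_v),
\end{aligned} \]
where $\bar{\alpha} = \alpha + c_u + c_v$. To have a positive profit of the sum of strategies that is equal to the
sum of the profits of each elementary strategy, we need the following conditions to hold:
\[ \frac{\alpha + c_u}{\bar{\alpha}} = \frac{\alpha + c_v}{\bar{\alpha}} = 1 \quad \text{and} \quad (\rho_u - \rho_v)(c_u - c_v) = 0. \]
The first condition forces $c_u = c_v = 0$, which contradicts the fact that $c_u, c_v > 0$. However, in the
case that $c_u = c_v = c_{U_i}$, the sum of the strategies' profit (each being zero) is zero.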
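The case where one of the users to be bribed did not rate the item, $v \notin U_i$, but the other
user did, $u \in U_i$, yields a profit
\[ \begin{aligned}
\pi_{\sigma^i_u + \sigma^i_v} &= |U_{\sigma^i_v}| \, r_{\sigma^i_u + \sigma^i_v} - (\rho_u + \rho_v) - |U_i| \, r_i \\
&= \frac{|U_i|}{\alpha + c_v} c_v (\rho_v - r_i) + \frac{|U_i|}{\alpha + c_v} c_u \rho_u + \frac{\alpha}{\alpha + c_v} (r_i - \rho_v) + \frac{1}{\alpha + c_v} \rho_u (c_u - c_v) - \frac{\alpha}{\alpha + c_v} \rho_u \\
&= \frac{\alpha}{\alpha + c_v} \pi_{\sigma^i_u} + \pi_{\sigma^i_v} + \frac{1}{\alpha + c_v} \rho_u (c_u - c_v),
\end{aligned} \tag{4.3} \]
which carries the same conclusion as above.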
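The non-additivity for non-raters can be checked numerically with a short sketch. It assumes, as in the proof, that a bribed non-rater ends up rating the item with value $\rho_u$; helper names and numbers are hypothetical.

```python
def wealth(ratings, reputation):
    """J_i = |U_i| * r_i for the bipartite system."""
    alpha = sum(reputation[u] for u in ratings)
    return len(ratings) * sum(reputation[u] * x for u, x in ratings.items()) / alpha


def profit_new_raters(ratings, reputation, bribes):
    """Profit when every user in `bribes` is a non-rater who then rates rho_u."""
    after = {**ratings, **bribes}
    return wealth(after, reputation) - sum(bribes.values()) - wealth(ratings, reputation)


ratings = {"a": 0.5, "b": 0.7}                            # hypothetical raters U_i
reputation = {"a": 0.4, "b": 0.6, "u": 0.9, "v": 0.2}     # hypothetical c_u
alpha = reputation["a"] + reputation["b"]
r_i = wealth(ratings, reputation) / len(ratings)


def closed_form(c, rho):
    """Equation (4.2) for a single bribed non-rater."""
    return (alpha - len(ratings) * c) * (r_i - rho) / (alpha + c)


p_u = profit_new_raters(ratings, reputation, {"u": 0.9})
p_v = profit_new_raters(ratings, reputation, {"v": 0.3})
p_uv = profit_new_raters(ratings, reputation, {"u": 0.9, "v": 0.3})
print(p_u, p_v, p_uv, p_u + p_v)
```

Here both elementary strategies are profitable, yet the joint profit differs from the sum of the two, in line with Proposition 2.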
As we noted in the previous proof, special conditions on the users’ reputation make the
profit zero, hence decomposable into elementary strategies. We discuss this in the next result.
Proposition 3. Pick an item $i \in I$. Consider the following case: the seller of $i$ bribes
users that already rated the item $i$, $u, v \in U_i$, and all the users have the same reputation
$c_u = c_v = c_{U_i}$. In this case, the strategy is not profitable, and the sum of the elementary
strategies is zero, $\pi_{\sigma^i_u + \sigma^i_v} = \pi_{\sigma^i_u} + \pi_{\sigma^i_v} = 0$.
Proof. For the composition of strategies, the profit is given by (4.3), with $c_v = c_w = c_{U_i}$, thus
$\pi_{\sigma^i_v + \sigma^i_w} = 0$. The strategy $\sigma^i_v$ has profit given by (4.1), with $c_v = c_{U_i}$, thus $\pi_{\sigma^i_v} = 0$.
The strategy $\sigma^i_w$ has profit given by (4.2), where $c_w = c_{U_i}$, hence $\pi_{\sigma^i_w} = 0$.
Next, we analyze strategies regarding the profit they carry, in order to classify users into
bribable and non-bribable ones, based on their reputation. We assume that all information
is publicly available to sellers, both users’ ratings and reputations. First, we analyze bribing
users that already rated the item.
Proposition 4. If user $v$ rated item $i$, $v \in U_i$, a valid strategy, $\sigma^i \in \Xi_i$, s.t. $\|\sigma^i\|_1 = \sigma^i_v = \rho_v$,
is profitable if $c_v > c_{U_i}$.
Proof. Since $\rho_v > 0$, the profit of such a strategy, $\sigma^i_v$, is given by (4.1), which is positive whenever
$c_v > \alpha / |U_i| = c_{U_i}$.
We obtain, as a corollary of Proposition 4, the result of Lemma 2 in [Grandi and Turrini,
2016]: if $v \in U_i$ and $c_v = c_u = c_{U_i}$ for all $u \in U_i$ (the ranking is given by the arithmetic
average), then $\pi_{\sigma^i_v} = 0$.
Suppose the seller of item $i$ wants to bribe a user who did not rate the item. We study what
conditions make this action profitable.
Proposition 5. Let $v \notin U_i$. The strategy $\sigma^i_v$ is profitable whenever one of the following holds:
1) $c_v < c_{U_i}$ and $\rho_v < r_i$, or
2) $c_v > c_{U_i}$ and $\rho_v > r_i$.
Proof. The result follows from (4.2).
Note that this marks a difference from the work in [Grandi and Turrini, 2016], where, in
Example 1, the authors showed that a user that did not rate the item can be bribed and that this
always increases the wealth.
Further, notice that the result of Proposition 5 means that, in case 1), if a seller bribes a
user (that did not rate item $i$) whose reputation is below the average, then the bribing value,
$\rho_v$, must be smaller than $r_i$. This happens because bringing a new rater into the
set of raters increases the wealth, as long as we do not pay a high price, $\rho_v$: since the reputation
of the user is small, the effect on the ranking is also small. In the case where the bribed
user has a reputation above the average, case 2), its effect on the ranking of the item is large,
so bribing with a value below the ranking degrades it, thus lessening the wealth. Hence, if
the ranking is computed by the arithmetic average (AA), it is not profitable to bribe a user that
did not rate the item.
4.3.2 Optimal Strategies in the Bipartite Ranking Systems
Here, we investigate what is the optimal investment strategy that the seller of item i should
use to increase his/her initial wealth, by influencing the opinion of customers. First, we
consider two simpler cases where a vendor either tries to change the opinion of users that
already rated item i or tries to persuade users that did not rate it. Then, we analyze the
more complex case when a seller influences both raters and non-raters. We obtain the optimal
bribing strategies in closed form.
To model these problems, we consider a common setup. The seller of item $i$ has an initial
wealth of $J_i$, and we consider two reference customers, $u$ and $v$, with reputations $c_u > c_v$. We
compute the profit per amount of invested wealth, $\pi_\sigma / \|\sigma\|_1$, so we can design the optimal bribing
strategy.
Bribing users that already rated item i. Let us consider the case where the seller wants
to bribe users that already rated item $i$, i.e., $u \in U_i$. We formulate this problem as
\[ \text{maximize: } \pi_{\sigma^i}, \quad \text{subject to: } \|\sigma^i\|_1 \leq J_i, \; \sigma^i \in \Lambda_i, \tag{4.4} \]
where $\Lambda_i = \Xi_i$. As we showed in Proposition 4, to have a positive profit $\pi_{\sigma^i_u}$ when bribing user
$u$, we need to have $c_u > \alpha / |U_i| = c_{U_i}$. Therefore, we do not consider strategies that bribe users
$v$ s.t. $c_v < c_{U_i}$, since this would not increase the wealth $J_i$.
Let $c_u > c_v > c_{U_i}$. We look into the profit per unit of invested resources, $\pi_{\sigma^i_u}/\rho_u - \pi_{\sigma^i_v}/\rho_v =
(c_u - c_v)/c_{U_i} > 0$. Hence, the profit per unit of invested wealth is larger for user $u$ than for
user $v$. The optimal strategy is then to bribe users by decreasing order of their reputation,
investing all the available wealth until either the exhaustion of available profitable users
($c_u > c_{U_i}$) or the depletion of funds.
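The greedy rule described above can be sketched in a few lines of Python. The sketch assumes fixed reputations, ratings in $[0, 1]$, and that raising a rating by $\rho_u$ costs exactly $\rho_u$; the function name, the `budget` argument and the data are hypothetical.

```python
def greedy_bribe_raters(ratings, reputation, budget):
    """Bribe raters by decreasing reputation while c_u exceeds the average."""
    avg = sum(reputation[u] for u in ratings) / len(ratings)   # c_{U_i}
    strategy, left = {}, budget
    for u in sorted(ratings, key=lambda w: reputation[w], reverse=True):
        if reputation[u] <= avg or left <= 0:
            break                                # no profitable user or no funds left
        rho = min(1.0 - ratings[u], left)        # rho_u <= 1 - R_ui
        if rho > 0:
            strategy[u] = rho
            left -= rho
    return strategy


ratings = {"u1": 0.5, "u2": 0.4, "u3": 0.6, "u4": 0.9}      # hypothetical R_ui
reputation = {"u1": 0.9, "u2": 0.3, "u3": 0.6, "u4": 0.8}   # hypothetical c_u
strategy = greedy_bribe_raters(ratings, reputation, budget=0.55)
print(strategy)
```

Users whose rating is already at the maximum are skipped at no cost, which corresponds to investing nothing in them.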
Bribing users that did not rate the item i before. Suppose that the seller of item $i$
wants to bribe users that did not rate the item, i.e., $u \notin U_i$. We formulate this problem as
(4.4) with $\Lambda_i = \bar{\Xi}_i$, the set of strategies that bribe only non-raters. Let users $u, v \notin U_i$ be
s.t. $c_u > c_v$, and let
\[ \alpha = \sum_{w \in U_i} c_w, \quad \gamma = \frac{|U_i| \, c_u - \alpha}{c_u + \alpha} \quad \text{and} \quad \delta = \frac{|U_i| \, c_v - \alpha}{c_v + \alpha}. \]
The profit is given by (4.2); hence, we have, for users $u$ and $v$,
\[ \frac{\rho_u - r_i}{c_u + \alpha} (|U_i| \, c_u - \alpha) \quad \text{and} \quad \frac{\rho_v - r_i}{c_v + \alpha} (|U_i| \, c_v - \alpha), \]
respectively. The difference of profits is
\[ (\rho_u - r_i) \gamma - (\rho_v - r_i) \delta, \]
and hence, for the same amount of wealth spent,
\[ \pi_{\sigma^i_u} / (\rho_u - r_i) > \pi_{\sigma^i_v} / (\rho_v - r_i), \]
because $\gamma > \delta$.
Again, the optimal strategy is to bribe users by decreasing order of reputation, investing all
the available wealth until either the exhaustion of profitable users ($c_u > c_{U_i}$) or of funds.
General case. Again, under the same conditions for the seller of item $i$, we now consider
that all users, $u \in U$, are bribable. The problem of finding the best bribing strategy is (4.4)
with $\Lambda_i = S_i = \Xi_i \cup \bar{\Xi}_i$, that is, raters and non-raters alike. Next, we investigate when it is
better to bribe a user $u \in U_i$ or a non-rater user $v \notin U_i$. For this, we consider the profit change
rates, which are
\[ \pi_{\sigma^i_u} / \rho_u = \delta \quad \text{and} \quad \pi_{\sigma^i_v} / (\rho_v - r_i) = \gamma, \]
respectively. In the case $c_u \geq c_v$, we always have $\delta \geq \gamma$. In the other case, $c_u < c_v$, we have
$\gamma < \delta$ whenever either $c_{U_i} < 1/|U_i|$ and $c_u < \alpha$, or $c_{U_i} \geq 1/|U_i|$. Again, the optimal strategy
consists in ordering bribable users by decreasing reputation for each of the sets $U_i$ and $U \setminus U_i$,
and allocating wealth first to $U_i$.
4.3.3 Properties of strategies and their profit in Multipartite Ranking Systems
Now, we explore the profit of bribing in the multipartite case. To simplify the analysis, we
assume that, when a user is bribed and changes his/her rating for an item, his/her reputation
remains unchanged. This assumption is not unrealistic: when a user has rated several items,
the change in reputation is small if only one of the ratings changes, and in real systems the
reputations are often recomputed only from
time to time. We assume that the users' ratings and reputations are publicly available, but
the network of users, i.e., the partition into clusters, is private.
Proposition 6 (Bribing a user in a cluster that already rated the item). Suppose that
$v \in U_i^{M_s}$, for some cluster $s \in \{1, \ldots, N\}$. If $c_v > c_{U_i^{M_s}}$, then any $\sigma_v \in \Xi_v$ is profitable.
Proof. Following the same steps as in the proof of Proposition 4, replacing $U_i$ by $U_i^{M_s}$, we
have that
\[ \pi_{\sigma^i_v} = J_{\sigma^i_v} - J_i = \rho_v \left( c_v / c_{U_i^{M_s}} - 1 \right) > 0. \]
This result is Proposition 4 applied to $M_s$.
Proposition 7 (Bribing a user in a cluster to rate a non-rated item in the cluster). Suppose
that $v \in M_s$, for a cluster $s \in \{1, \ldots, N\}$, and consider an item, $i$, that was not rated by any
member of the cluster, that is, $i \notin I_{M_s}$. In this case, any $\sigma_v \in \Xi_v$ is non-profitable.
Proof. Since $|U_i^{M_s}| = 0$, then
\[ \pi_{\sigma^i_v} = \sum_{m \in X_i} |U_i^{M_m}| \, r_i^{M_m} + (|U_i^{M_s}| + 1) \frac{c_v \rho_v}{c_v} - \rho_v - \sum_{m \in X_i} |U_i^{M_m}| \, r_i^{M_m} = 0. \]
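The zero-profit result can be reproduced numerically with a small sketch of the multipartite wealth. The clusters, ratings and reputations below are hypothetical, and the helper name is an assumption introduced for illustration.

```python
def multipartite_wealth(ratings, reputation, clusters):
    """Sum over clusters of |U_i^{M_n}| * r_i^{M_n}; clusters without raters add 0."""
    total = 0.0
    for cluster in clusters:
        raters = [u for u in ratings if u in cluster]
        if raters:
            alpha = sum(reputation[u] for u in raters)
            r = sum(reputation[u] * ratings[u] for u in raters) / alpha
            total += len(raters) * r
    return total


clusters = [{"a", "b"}, {"v", "w"}]                       # hypothetical partition
ratings = {"a": 0.5, "b": 0.7}                            # nobody in {v, w} rated the item
reputation = {"a": 0.4, "b": 0.6, "v": 0.9, "w": 0.2}
rho_v = 0.8                                               # bribe paid to user v

before = multipartite_wealth(ratings, reputation, clusters)
after = multipartite_wealth({**ratings, "v": rho_v}, reputation, clusters)
profit_v = after - rho_v - before
print(profit_v)
```

The new rater is alone in the cluster, so the cluster ranking equals $\rho_v$ and the gain in wealth is cancelled by the bribe, whatever the reputation of the user.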
Proposition 8 (Bribing a user in a cluster to rate an item that he did not rate before, but
$i \in I_{M_s}$). Suppose that we want to bribe a user that did not rate item $i$ and the user belongs
to a cluster where some user already rated item $i$, in other words, $v \in M_s$, $v \notin U_i^{M_s}$ and
$i \in I_{M_s}$. The strategy $\sigma^i_v$ is profitable whenever one of the following holds:
1) $c_v < c_{U_i^{M_s}}$ and $\rho_v < r_i^{M_s}$,
2) $c_v > c_{U_i^{M_s}}$ and $\rho_v > r_i^{M_s}$.
Proof. By an adaptation of (4.2), the profit of $\sigma^i_v$ is
\[ \pi_{\sigma^i_v} = (|U_i^{M_s}| + 1) \, r^{M_s}_{\sigma^i_v} - \rho_v - |U_i^{M_s}| \, r_i^{M_s} = \frac{(\alpha - |U_i^{M_s}| \, c_v)(r_i^{M_s} - \rho_v)}{\alpha + c_v}, \]
where $\alpha = \sum_{u \in U_i^{M_s}} c_u$. It is profitable if 1) or 2) holds.
4.3.4 Optimal Strategies in Multipartite Ranking Systems
Next, we study the optimal bribing strategies for the multipartite ranking system, as we did
in Section 4.3.2 for the bipartite ranking systems. Again, we consider three scenarios: (i)
bribing users that rated the item; (ii) bribing users that did not rate the item; (iii) bribing
users from the set of all users. We compute the closed form of the optimal strategies for some
cases; for the others, linear programming (LP) can be used.
To model these problems, we assume that the seller of item $i$ has an initial wealth
given by $J_i$, and we consider two reference customers, $u$ and $v$, with reputations s.t. $c_u > c_v$.
Bribing users that rated item i. Consider that the seller wants to bribe users that
already rated item $i$, $u \in U_i$, i.e.,
\[ \text{maximize: } \pi_{\sigma^i}, \quad \text{subject to: } \|\sigma^i\|_1 \leq J_i, \; \sigma^i \in \Upsilon_i, \tag{4.5} \]
where $\Upsilon_i = \Xi_i$. There are two cases to explore: (i) both users are in the same cluster; (ii)
each user is in a different cluster.
(i) Suppose that $u, v \in M_s$ are two users that already rated item $i$. By Proposition 6, to
have a positive profit $\pi_{\sigma^i_u}$ when bribing user $u$, we need to have $c_u > c_{U_i^{M_s}}$. Thus, we do
not consider strategies that bribe a user $v$ s.t. $c_v < c_{U_i^{M_s}}$, because this would not increase the
wealth $J_i$.
Let $c_u > c_v > c_{U_i^{M_s}}$. We compute the profit per unit of invested resources,
\[ \pi_{\sigma^i_u}/\rho_u - \pi_{\sigma^i_v}/\rho_v = (c_u - c_v)/c_{U_i^{M_s}} > 0. \]
Thus, the profit per unit of invested wealth is larger for user $u$. Hence, as we obtained for the
bipartite ranking systems, the optimal strategy is to bribe users by decreasing reputation,
investing all the wealth until either the lack of available profitable users ($c_u > c_{U_i^{M_s}}$) or the
exhaustion of funds to bribe profitable users.
(ii) When the reference users belong to distinct clusters, $u \in M_s$, $v \in M_t$ and $s \neq t$, we have
that if $|U_i^{M_s}| \geq |U_i^{M_t}|$, then the profit per unit of invested wealth ($\pi_{\sigma^i_u}/\rho_u$ versus $\pi_{\sigma^i_v}/\rho_v$) is
larger for user $u$. If $|U_i^{M_s}| < |U_i^{M_t}|$, then the profit per unit of invested wealth is larger for
user $u$ if $|U_i^{M_s}| > (c_u - c_v)^{-1}$ and $|U_i^{M_t}| < (|U_i^{M_s}| c_u - 1)/c_v$, and larger for user $v$ otherwise.
Bribing users that did not rate the item i. Under the same conditions for the seller of item $i$,
suppose that he wants to bribe users that did not rate $i$, i.e., $u \notin U_i$. We formulate this as
(4.5) with $\Upsilon_i = \bar{\Xi}_i$, the set of strategies that bribe only non-raters. Recalling Proposition 7,
we only need to explore the case where the seller
of item $i$ wants to bribe users belonging to clusters with users that already rated the item,
clusters $m$ s.t. $i \in I_{M_m}$; otherwise the profit is zero. Let users $u, v \in M_s$ and $u, v \notin U_i$ be
s.t. $c_u > c_v$, and let
\[ \alpha = \sum_{w \in U_i^{M_s}} c_w, \quad \gamma = \frac{|U_i^{M_s}| \, c_u - \alpha}{c_u + \alpha} \quad \text{and} \quad \delta = \frac{|U_i^{M_s}| \, c_v - \alpha}{c_v + \alpha}. \]
By Proposition 8, we have that the profits for bribing users $u$ and $v$ are
\[ \frac{\rho_u - r_i^{M_s}}{c_u + \alpha} (|U_i^{M_s}| \, c_u - \alpha) \quad \text{and} \quad \frac{\rho_v - r_i^{M_s}}{c_v + \alpha} (|U_i^{M_s}| \, c_v - \alpha), \]
respectively. The difference of profits is
\[ (\rho_u - r_i^{M_s}) \gamma - (\rho_v - r_i^{M_s}) \delta, \]
hence, for the same amount of spent wealth,
\[ \pi_{\sigma^i_u} / (\rho_u - r_i^{M_s}) > \pi_{\sigma^i_v} / (\rho_v - r_i^{M_s}), \]
because $\gamma > \delta$.
Again, the optimal strategy is to bribe users by decreasing order of reputation, investing all
the available wealth until either the exhaustion of profitable users ($c_u > c_{U_i^{M_s}}$) or of funds.
In the case where both users are in distinct clusters and did not rate item $i$, we cannot derive
simple conditions, and we need to solve an LP for each instance.
General case. Under the same conditions for the seller of item $i$, we consider that all users,
$u \in U$, can be bribed. The problem of finding the best bribing strategy is written as (4.5)
with $\Upsilon_i = S_i = \Xi_i \cup \bar{\Xi}_i$, that is, raters and non-raters alike. Next, we investigate when it
is better to bribe a user $u \in U_i^{M_s}$ or a
non-rater user $v \notin U_i^{M_s}$. The result is the adaptation of the one for the general case in Section 4.3.2.
We consider the profit change rates, which are $\pi_{\sigma^i_u}/\rho_u = \delta$ and $\pi_{\sigma^i_v}/(\rho_v - r_i^{M_s}) = \gamma$,
respectively. In the case $c_u \geq c_v$, we always have $\delta \geq \gamma$. In the other case, $c_u < c_v$, we have
$\gamma < \delta$ whenever either $c_{U_i^{M_s}} < 1/|U_i^{M_s}|$ and $c_u < \alpha$, or $c_{U_i^{M_s}} \geq 1/|U_i^{M_s}|$.
Again, the optimal strategy is to order bribable users by decreasing reputation for each of the
sets $U_i^{M_s}$ and $U \setminus U_i^{M_s}$, and to allocate wealth first to $U_i^{M_s}$ and, afterward, to $U \setminus U_i^{M_s}$. If
the reference users are in different clusters, we cannot draw simple conditions, and we need
to solve the LP for each instance.
4.3.5 Bipartite vs. Multipartite Ranking Systems
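Here, we compare the profits obtained in the multipartite case and in the bipartite case, under the same
conditions. In the case where the user rated the item, we have the following result:
Proposition 9. Suppose that the seller of item $i$ wants to bribe a user $v$ that already rated
the item, i.e., $v \in U_i$. Let the user $v$ be in cluster $M_s$. Then the profit is larger in the bipartite
ranking system, $\tilde{\pi}_{\sigma^i} < \pi_{\sigma^i}$ (with $\tilde{\pi}_{\sigma^i}$ the multipartite profit), if and only if
$c_{(U_i \setminus U_i^{M_s})} < c_{U_i^{M_s}}$, the averages of the reputations in
$U_i \setminus U_i^{M_s}$ and $U_i^{M_s}$, respectively.
Proof. By definition, $\tilde{\pi}_{\sigma^i} < \pi_{\sigma^i}$ is the same as
\[ \left( \frac{|U_i^{M_s}| \, c_v}{\sum_{u \in U_i^{M_s}} c_u} - 1 \right) \rho_v < \left( \frac{|U_i| \, c_v}{\sum_{u \in U_i} c_u} - 1 \right) \rho_v, \]
which is equivalent to
\[ |U_i^{M_s}| \sum_{u \in U_i} c_u < |U_i| \sum_{u \in U_i^{M_s}} c_u. \]
Noticing that
\[ U_i = U_i^{M_s} \cup (U_i \setminus U_i^{M_s}), \]
we can rewrite it as
\[ |U_i^{M_s}| \left( \sum_{u \in U_i^{M_s}} c_u + \sum_{u \in U_i \setminus U_i^{M_s}} c_u \right) < \left( |U_i^{M_s}| + |U_i \setminus U_i^{M_s}| \right) \sum_{u \in U_i^{M_s}} c_u. \]
This is
\[ c_{(U_i \setminus U_i^{M_s})} < c_{U_i^{M_s}}. \]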
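The equivalence in Proposition 9 can be illustrated with a short numerical sketch that applies (4.1) once to all raters and once to the raters of the cluster only. The function names and the reputation values are hypothetical.

```python
def rater_profit(c_v, rho_v, reputations):
    """Equation (4.1) applied to a group of raters given by their reputations."""
    avg = sum(reputations) / len(reputations)
    return (c_v / avg - 1) * rho_v


def compare(cluster_reps, other_reps, c_v, rho_v):
    """Return (multipartite profit, bipartite profit, condition of Proposition 9)."""
    multipartite = rater_profit(c_v, rho_v, cluster_reps)
    bipartite = rater_profit(c_v, rho_v, cluster_reps + other_reps)
    condition = (sum(other_reps) / len(other_reps)
                 < sum(cluster_reps) / len(cluster_reps))
    return multipartite, bipartite, condition


# User v (first entry of cluster_reps) sits in a cluster of reputable raters.
print(compare([0.9, 0.7], [0.3, 0.5, 0.4], c_v=0.9, rho_v=0.2))
# User v sits in a cluster of low-reputation raters.
print(compare([0.5, 0.3], [0.9, 0.8], c_v=0.5, rho_v=0.2))
```

In the first call the bipartite profit is the larger one; in the second call the inequality is reversed, matching the condition on the average reputations.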
Figure 4.2: Profit of bribing strategies of the most rated item's sellers in (a) bipartite ranking
system (σ1 – σ4), and (b) multipartite ranking system (σ1 and σ2).
Hence, there are cases where bribing a user in the multipartite ranking system is more
profitable than in the bipartite ranking system. Since the clusters' partition is assumed to be
unknown to the sellers, they cannot determine the users that satisfy the previous condition,
unlike users' reputations, which are often public. Now, we compare the profit of bribing a user
that did not rate the item $i$ in the case where the bribed user $v$ belongs to a cluster where no users
rated the item, $v \in M_s$ and $i \notin I_{M_s}$. In this case, bribing user $v$ in the multipartite ranking
system yields zero profit, but in the bipartite one the strategy can be profitable, as we showed in
Proposition 5. In the case where the bribed user did not rate the item, but he/she belongs to a
cluster where some user rated the item, we cannot draw simple conditions as in the previous
cases. We need to check, for each concrete case, which one is the most profitable.
4.4 Simulations
Here, we illustrate the main results of the chapter with real data, using the 5-core version of
“Amazon Instant Video” data set, [McAuley et al., 2015], as in Chapter 3, with 5130 users,
1685 items, 37126 ratings, and where each user rated, at least, 5 items.
We simulate bribing strategies for the seller of the most rated item (455 ratings). This
item allows us to have more data to explore (the results would be similar for other items).
We study the effect of four strategies in bipartite and multipartite ranking systems, which
are: σ1 – bribe users that rated the item, in random order; σ2 – bribe users that rated the
item, by decreasing reputation; σ3 – bribe users uniformly at random, from all users (only
for bipartite ranking systems); σ4 – bribe users in decreasing order of reputation (only for
bipartite ranking systems). In Figure 4.2 (a) and (b), we show the results of the different bribing
strategies for bipartite and multipartite ranking systems, respectively. The steps where the
rewards are constant, in Figures 4.2 (a) and (b), represent choosing users that already rated
the item with the maximum allowed rating. For the bipartite ranking system, Figure 4.2 (a),
after bribing the same users in strategies σ1 and σ2, both strategies yield the same profit,
as stated in Proposition 1. Finally, the strategy σ3 of Figure 4.2 (a) is to bribe users, from
the set of all users, by decreasing reputation. As expected, bribing users among the ones
who rated the item and are more influential (have a larger reputation) results in a faster
increase of reward, whereas random bribing among the item's raters has an expected profit
close to zero, and does not increase wealth. The strategy σ4 is the most profitable, but only
after a certain number of bribed users, in comparison to strategy σ2. For all strategies, the profit
is positive. In the multipartite ranking system scenario, Figure 4.2 (b), for σ1 and σ2, the
wealth is strictly smaller at the end of the bribing strategy. This illustrates the fact that
the multipartite ranking system is more robust to bribery than the bipartite one, which
agrees with the discussion in Section 4.3.5. Lastly, we apply strategy σ2 to the bipartite ranking
system, either assuming that the users' reputations are fixed or without this restriction (each time
a user is bribed, both rankings and reputations are updated), as in Chapter 3. The results of this
experiment are depicted in Figure 4.3. We see that the reputations of the bribed
users decrease; therefore, the impact of their ratings, and hence the profit, is smaller than when
the reputations are fixed.
Figure 4.3: Profit of bribing strategy σ2 in the bipartite ranking system: fixed users' reputations
versus dynamic reputations, recomputed after each user is bribed.
4.5 Concluding Remarks
We model bribing in two reputation-based ranking systems. The first ranking system does
not aggregate users, while the second one clusters users by their similarities and, therefore,
presents a dedicated ranking of items for each cluster. In both settings, we show which users
to bribe to get positive profit, and we show that clustering users decreases the number of
profitable bribing strategies. We illustrate our results with a real-world dataset. In future
work, we would like to study the interactions between bigger and smaller players, and the
scenario where sellers bribe users to decrease a competitor item's ranking, through a game
theory model with the sellers as players. Another aspect we want to explore and incorporate
into the bribery analysis is the impact, on the profit of strategies, of dynamic reputations that
change when the ratings change, and thereby to study new conditions for designing profitable
bribing strategies.
Chapter 5
Recommendation via Matrix Completion Using Kolmogorov Complexity
This chapter explores the two Kolmogorov complexity based similarity measures introduced
in Chapter 3, KS and CS, to design a recommender system. The proposed system performs
neighborhood-based collaborative filtering (CF), and it allows computing individual user
recommendation lists with low time and space complexity costs, faster than the
state-of-the-art approaches to which we compare. Further, when tested, our algorithm presents
results comparable to those of the algorithms we test against, sometimes even better. This work
was submitted for publication, see [Ramos et al., 2018c].
5.1 Introduction
Recommender systems suggest products that might be interesting for the users [Ricci et al.,
2015]. Collaborative Filtering (CF) is by far the most popular and effective recommendation
technique [Ning et al., 2015,Koren and Bell, 2015].
The CF approaches divide into two classes: model-based and neighborhood-based [Ricci
et al., 2015]. The first tries to model latent factors of both users and items, and it is widely
employed due to its success for movie recommendation in the Netflix prize [Bennett et al.,
2007]. The second class does recommendation based on users with similar preferences or
items that are similar to the users’ preferences. This class further divides into three main
approaches, user-based, item-based, and hybrid. User-based methods select a set of similar
users based on similarity among them to recommend items [Zhao and Shang, 2010]. Item-based
methods are analogous, but performed using similarities among the items [Sarwar et al., 2001a].
The hybrid approaches combine the previous two [Wang et al., 2006].
In general, in order to choose the CF algorithm that will implement the recommender
system, one needs to make several assumptions. Moreover, in order to produce effective
suggestions, several parameters inside the algorithm have to be set. Model-based algorithms
assume that there is an underlying model that uncovers latent features and explains unobserved
ratings [Koren and Bell, 2015]; however, in order to employ this class of algorithms,
it is necessary to assume which model will fit the data (and thus will lead to effective predictions),
and its parameters have to be learned. When a neighborhood-based approach is
chosen, it is necessary to make an assumption on the ratio between the number of users and
the number of items. Indeed, if there are more users, item-based CF leads to more accurate
predictions [Ning et al., 2015, Fouss et al., 2007, Sarwar et al., 2001b]. Moreover, both user-
and item-based approaches require setting the number of neighbors at the prediction stage, which
is key to building accurate predictions [Ning et al., 2015].
It should be clear that it might be hard to make these assumptions in advance, especially
for a new business and for a system that will grow over time.
Contributions. In this chapter, we propose a hybrid (user and item) neighborhood-based
CF approach. We chose a neighborhood-based approach because such approaches
are known to be simple, justifiable, and stable [Ning et al., 2015]. Moreover, with respect to
approaches in the literature, our solution is:
• efficient: the algorithm that we propose is modular, and the recommendation for each
user can be computed independently. Moreover, the computations of the algorithm can
be done in a distributed fashion, making it scalable;
• assumption-free: our algorithm works with a small number of data points, it works for
both low-rank and high-rank matrix completion, without the need for any initialization;
• model-free: the entries are not assumed to be a function of some latent variables;
• parameter-free: our algorithm does not need a parameter to select the neighbors, since
they are represented by the users who rated items in common with the target users
(more details on this will be provided in Sections 5.3 and 5.4).
While these problems might have been tackled individually, this is the first approach
that combines efficiency with the absence of assumptions on the data, models, and
parameters.
Our method explores Kolmogorov complexity to construct a similarity measure from information
theory [Cover and Thomas, 2012], and to propose a new similarity measure. A large
evaluation of our approach on 9 datasets shows that it outperforms or competes
with 10 state-of-the-art approaches chosen as baselines.
Chapter structure. We organized the rest of the chapter as follows. In Section 5.2, we
present related work. In Section 5.3, we introduce some notation and present our setup spec-
ification. In Section 5.4, we use our matrix completion algorithm to evaluate its performance,
with both synthetic data and real-world datasets. In Section 5.5, we conclude the chapter,
and we draw avenues for further research.
5.2 Related Work
Several approaches to the matrix completion problem reformulate it into an optimization
problem, assuming that the matrix to recover has low rank and that the observed entries’
positions are sampled according to a uniform distribution, see [Candes and Tao, 2010]. Although
the rank minimization problem is NP-hard, approaches following the ideas in [Candes
and Tao, 2010] are used with relative success. They consist of relaxing the problem so that it
becomes convex, and then minimizing the nuclear norm of the matrix. These methods are
widely used in practice. In other approaches, authors assume that the matrix to complete has a
high rank. These approaches also lead to an NP-hard problem. Nonetheless, under
certain assumptions, we can complete some incomplete high-rank or even full-rank matrices,
as in [Balzano et al., 2012]. In their work, the authors assume that the columns of the matrix
to complete belong to a union of multiple low-rank subspaces. In a recent work by [Ganti
et al., 2015], the authors addressed the matrix completion problem without assuming that
the matrix is low rank, as it is usual. They considered recovering the entries of a low-rank
matrix through a Lipschitz monotonic function. In [Song et al., 2016], the authors address
the matrix completion problem using a novel framework for nonparametric regression over
latent variable models. They propose to model the unknown matrix entries as a Lipschitz
function of two latent variables, one for users and another for items. In [Wang et al., 2006], the
authors presented a generative probabilistic framework that considers the similarity between
users and between items. The prediction of each unknown matrix entry is the average of the
individual ratings weighted by the users’ confidence.
5.3 Setup
Next, we present our matrix completion algorithm and its computational complexity analysis.
5.3.1 Setup specification
We propose a recommender system that performs matrix completion as in hybrid neighborhood-based
CF approaches. Our approach computes two matrices of similarities, one between users,
$S^U$, and another between items, $S^I$. Afterwards, we complete each entry of user u and item i by
assigning a convex combination of two quantities, governed by a parameter α¹. The first quantity is a
weighted average of the ratings that user u gave to other items, weighted by the similarities of those
items with item i. The second is a weighted average of the ratings of item i given by other
users, weighted by their similarity to user u, see Figure 5.1.

Figure 5.1: Graph representing n users, $u_1, \ldots, u_n$, and m items, $i_1, \ldots, i_m$. The solid
edges between users and items represent the products each user rated, weighted by the rating.
The top dashed edges (between users) represent the weights computed in the matrix $S^U$. The
bottom dashed edges (between items) represent the weights computed in the matrix $S^I$.
To build the matrices $S^U$ and $S^I$, we use the two compression similarities based on
Kolmogorov complexity [Cover and Thomas, 2012] that we introduced in Chapter 3. Recall
that, given the description of a string, x, its Kolmogorov complexity, K(x), is the length
of the smallest computer program that outputs x (i.e., K(x) is the length of the smallest
compressor for x). Although Kolmogorov complexity is non-computable, there are efficient
and computable approximations by compressors. Let C be a compressor and C(x) denote the
length of the output string resulting from the compression of x using C. The first measure
we proposed in Chapter 3 is the following.
¹ Note that parameter α is a way to weigh the importance that each component (i.e., user- and item-based)
should take, in classic hybrid fashion. As stated in the Introduction, the individual components can perform
matrix completion without parameters.
Compression similarity. Using the normalized compression distance, see [Li et al.,
2004], we define the compression similarity as:
\[
\mathrm{CS}(x, y) = 1 - \frac{C(xy) - \min\{C(x), C(y)\}}{\max\{C(x), C(y)\}},
\]
where string xy is the concatenation of x and y. We implement the description of users/items
as the string composed by the index of rated items/rating users and respective rating. For
instance, if user u rated the items $i_u^1, i_u^2, \ldots, i_u^l$, $l \le M$, then we write the description of user
u as the string “$i_u^1 R_{u i_u^1}\, i_u^2 R_{u i_u^2} \ldots i_u^l R_{u i_u^l}$”.
Inspired by CS, in order to reduce the computational complexity, we proposed, also in
Chapter 3, another similarity measure.
Kolmogorov similarity. We define the Kolmogorov similarity as:
\[
\mathrm{KS}(x, y) = \left(1 + |C(x) - C(y)|\right)^{-1}.
\]
Here, to compress the description strings, we use the standard compression tools from the
zlib library². Intuitively, both similarity measures quantify how related the most compact
descriptions of a pair of users or a pair of items are.
The compression similarity measures are used to compute the two similarity matrices, $S^U$
and $S^I$.
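As a minimal sketch of the two measures above, the following Python code approximates C with `zlib.compress`. The helper names (`describe`, `compressed_len`, `cs`, `ks`) and the "index:rating" layout of the description string are assumptions made for illustration, not necessarily the implementation used in the experiments.

```python
import zlib


def describe(ratings):
    """Description string of a user/item: indices followed by their ratings.

    The 'index:rating' layout and the space separator are assumptions
    made for this sketch.
    """
    return " ".join(f"{k}:{r}" for k, r in sorted(ratings.items())).encode()


def compressed_len(x: bytes) -> int:
    """C(x): length in bytes of the zlib-compressed string."""
    return len(zlib.compress(x))


def cs(x: bytes, y: bytes) -> float:
    """Compression similarity: 1 minus the normalized compression distance."""
    cx, cy = compressed_len(x), compressed_len(y)
    return 1 - (compressed_len(x + y) - min(cx, cy)) / max(cx, cy)


def ks(x: bytes, y: bytes) -> float:
    """Kolmogorov similarity: depends only on the two compressed lengths."""
    return 1 / (1 + abs(compressed_len(x) - compressed_len(y)))


# Two hypothetical users, given as {item index: rating}.
u = describe({1: 5, 4: 3, 9: 4})
v = describe({1: 4, 4: 3, 7: 2, 9: 5})
print(cs(u, v), ks(u, v))
```

Since `ks` needs only the two pre-computed lengths, it avoids compressing the concatenation xy that `cs` requires for every pair.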
To complete the rating matrix R, we set each non-filled entry $\hat{R}_{ui}$ of the completed matrix,
$\hat{R}$, as a convex combination of two quantities governed by the parameter α. The first quantity is
based on users that are similar to the user for which we are estimating the rating. A known
phenomenon, as pointed out in [Schafer et al., 2007b], is that users may vary the scale of the ratings
that they use. For example, an optimistic user may only give high ratings and a pessimistic one
only low ratings. To compensate for this effect, we use the following rating prediction, based
on the similarities between users, which accounts for each user’s rating average $\bar{R}_u$, as proposed
in [Schafer et al., 2007b].
\[
\mathrm{pred}_U(u, i) = \bar{R}_u + \frac{\sum_{v \in U_i} |I_{u,v}|^2 \, S^U_{uv} \, (R_{vi} - \bar{R}_v)}{\sum_{v \in U_i} |I_{u,v}|^2 \, S^U_{uv}}.
\]
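The second quantity, $\mathrm{pred}_I(u, i)$, is the average of the ratings that user u gave to each item
j ≠ i, weighted by the square of the number of users who rated both items, $|U_{i,j}|^2$, together with $S^I_{ij}$.
\[
\mathrm{pred}_I(u, i) = \frac{\sum_{j \in I_u} |U_{i,j}|^2 \, S^I_{ij} \, R_{uj}}{\sum_{j \in I_u} |U_{i,j}|^2 \, S^I_{ij}}.
\]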
² https://tools.ietf.org/html/rfc1950
Lastly, for a fixed parameter 0 ≤ α ≤ 1, we estimate each non-filled matrix entry as
\[
\hat{R}_{ui} = \alpha \, \mathrm{pred}_U(u, i) + (1 - \alpha) \, \mathrm{pred}_I(u, i).
\]
Observe that if α = 1, it corresponds to user-based CF, and if α = 0, it corresponds to
item-based CF. The previous steps are summarized in Algorithm 2.
Our approach allows us to decouple the problem into a set of independent user-by-user subproblems.
Hence, to generate recommendations for a user, we do not need to complete the
entire rating matrix, but only the corresponding matrix row.
Algorithm 2 Matrix completion algorithm: KolMaC
1: input: α, training set R
2: compute SU from the training set
3: compute SI from the training set
4: set $\hat{R} = R$
5: for each user u do
6:   for each item i such that $R_{ui} = \perp$ do
7:     set $\hat{R}_{ui} = \alpha \, \mathrm{pred}_U(u, i) + (1 - \alpha) \, \mathrm{pred}_I(u, i)$
8:   end for
9: end for
10: output: $\hat{R}$
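The steps of Algorithm 2 can be sketched in Python as follows, here with the KS similarity. This is a sketch under stated assumptions rather than the implementation used in the experiments: the marker `MISSING = 0` for ⊥, the "index:rating" description strings, the reading of $I_{u,v}$ as the items rated by both u and v (by analogy with $U_{i,j}$), and the fallbacks used when a denominator is zero are all choices made for illustration. All function names are hypothetical.

```python
import zlib
import numpy as np

MISSING = 0  # assumed marker for an unrated entry


def ks_matrix(R):
    """KS between every pair of rows of R, from pre-computed compressed sizes."""
    sizes = []
    for row in R:
        # Hypothetical description string: 'index:rating' pairs of rated entries.
        desc = " ".join(f"{j}:{r}" for j, r in enumerate(row) if r != MISSING)
        sizes.append(len(zlib.compress(desc.encode())))
    c = np.array(sizes)
    return 1.0 / (1.0 + np.abs(c[:, None] - c[None, :]))


def row_mean(R, u):
    """Average of the ratings given by user u (assumes at least one rating)."""
    return R[u][R[u] != MISSING].mean()


def pred_user(R, SU, u, i):
    rated = R != MISSING
    num = den = 0.0
    for v in np.flatnonzero(rated[:, i]):            # users who rated item i
        w = np.sum(rated[u] & rated[v]) ** 2 * SU[u, v]
        num += w * (R[v, i] - row_mean(R, v))
        den += w
    # Assumption: with no usable neighbor, fall back to the user's own mean.
    return row_mean(R, u) + (num / den if den > 0 else 0.0)


def pred_item(R, SI, u, i):
    rated = R != MISSING
    num = den = 0.0
    for j in np.flatnonzero(rated[u]):               # items rated by user u
        w = np.sum(rated[:, i] & rated[:, j]) ** 2 * SI[i, j]
        num += w * R[u, j]
        den += w
    # Assumption: with no usable neighbor, fall back to the user's own mean.
    return num / den if den > 0 else row_mean(R, u)


def kolmac(R, alpha=0.5):
    R = np.asarray(R, dtype=float)
    SU, SI = ks_matrix(R), ks_matrix(R.T)            # steps 2 and 3
    R_hat = R.copy()                                 # step 4
    for u, i in zip(*np.nonzero(R == MISSING)):      # steps 5 to 9
        R_hat[u, i] = (alpha * pred_user(R, SU, u, i)
                       + (1 - alpha) * pred_item(R, SI, u, i))
    return R_hat


R = np.array([[5, 3, 0], [4, 0, 2], [0, 1, 5]])
print(kolmac(R, alpha=0.5))
```

To recommend for a single user, only the loop iterations for that user's row are needed, which reflects the user-by-user decoupling described above.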
5.3.2 Complexity analysis
To build the user similarity matrix $S^U$, we pre-compute the quantity C(u) for each user
u ∈ U. Afterwards, we do not need to build an n × n matrix with entries $S^U_{uv}$ = KS(u, v)
for each u, v ∈ U, because we can compute each of these entries in O(1) time by accessing the
pre-computed values. The pre-computation amounts to compressing strings
that form a partition of the string with all the ratings, which takes O(|R|) time. Hence, to compute
the similarity of each pair of users, we need time complexity $O(\max\{n^2, |R|\})$ and space
complexity O(n). Mutatis mutandis, the time and space complexities for the items’
similarities are $O(\max\{m^2, |R|\})$ and O(m), respectively. Note that |R| ≤ n × m, but
usually $|R| \ll n^2$ and $|R| \ll m^2$.
For the CS measure, we perform the same pre-computations, but to build SU and SI ,
we further need to compute the compression of the concatenation of pairs of users and pairs
of items, respectively. Hence, the time complexity is $O(n^2 m)$ and $O(n m^2)$, whilst the space
complexity is $O(n^2)$ and $O(m^2)$, respectively for $S^U$ and $S^I$.
For the matrix completion problem, steps 4-9 of Algorithm 2, the time complexity is
$O(\max\{n, m\})$ (to compute the weighted averages in step 7) times the number of elements of
the matrix, nm. This yields a time complexity of $O(\max\{n^2 m, n m^2\})$. The space complexity
of those steps is O(nm).
Hence, the time complexity of Algorithm 2, when using KS and CS, is $O(\max\{n^2 m, n m^2\})$.
Note that, in fact, for the KS case it is $O(\max\{n^2 m, n m^2\}) + O(\max\{n^2, |R|\} + \max\{m^2, |R|\})$,
where the first term dominates the second. Hence, it is strictly less than
for the CS case, which is $2 \times O(n^2 m + n m^2)$. The space complexity when using KS is O(nm),
and when using CS it is $O(\max\{n^2, m^2\})$.
5.4 Experimental setup
Next, we describe our experimental settings and analyze the results.
5.4.1 Datasets
We test Algorithm 2 on synthetic and real-world datasets. All experiments were done on a
3.33GHz Six-core Intel Xeon, with 6GB 1333MHz RAM, using Matlab 2016 and Python 3,
and with OS X 10.13. For the synthetic data, we randomly generate four full-rank matrices,
with dimension 20 × 30, and with entries in [1, 5].
For the real-world datasets, we use MovieLens 100k (ML–100k) and MovieLens 1M (ML–1M)³;
both have ratings in [1, 5], with ⊥ = 0. Further, we use the Jester datasets
in “Dataset 1” of http://eigentaste.berkeley.edu/dataset/. These datasets consist of
ratings of a set of 100 jokes, with continuous ratings in ]−10, 10[ and with ⊥ = 99. Jester-1
has 24,983 users who rated 36 or more jokes. Jester-2 has 23,500 users who rated
36 or more jokes. Jester-3 has 24,938 users who rated between 15 and 35 jokes. Table 5.1
contains a compact description of the datasets.
       ML–100k    ML–1M       Jester-1    Jester-2    Jester-3
|U|    983        6040        24,983      23,500      24,938
|I|    1682       3952        100         100         100
|R|    100,000    1,000,000   1,810,455   1,708,993   616,912
Table 5.1: Details of the MovieLens (ML–100k and ML–1M) and Jester datasets.
³ http://movielens.umn.edu
5.4.2 Evaluation metric
To evaluate and compare the performance of the proposed algorithm, Algorithm 2, we use
5-fold cross-validation on both synthetic and real data. For the ML–100k, the
dataset already provides a set of 5 train and test files. For the ML–1M, we randomly split
the original dataset into a set of 5 train/test files. In the synthetic data, the four randomly
generated full-rank matrices, with dimension 20 × 30, were split as in the ML–1M case.
We use the root-mean-square error (RMSE) [Koren, 2008] to evaluate the accuracy of our
algorithm, by measuring the difference between the estimated and the original values. Let R
be the original matrix, let R∗ be equal to R except on the entries of the test set T, where
it has the value ⊥, and let $\hat{R}$ be the estimate of R produced by a matrix completion method when
applied to R∗. The RMSE is given by
\[
\mathrm{RMSE}(R, \hat{R}) = \sqrt{\frac{1}{|T|} \sum_{(u,i) \in T} \left(R_{ui} - \hat{R}_{ui}\right)^2}.
\]
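For concreteness, the RMSE over a test set can be computed as in the following sketch. The function name `rmse`, the representation of T as a list of (user, item) index pairs, and the toy matrices are assumptions made for illustration.

```python
import numpy as np


def rmse(R, R_hat, test_pairs):
    """Root-mean-square error over the held-out (user, item) pairs in T."""
    errors = [R[u, i] - R_hat[u, i] for u, i in test_pairs]
    return float(np.sqrt(np.mean(np.square(errors))))


# Toy data: original ratings, hypothetical estimates, and a hypothetical test set.
R = np.array([[5.0, 3.0], [4.0, 1.0]])
R_hat = np.array([[4.0, 3.0], [4.0, 3.0]])
T = [(0, 0), (1, 1)]
print(rmse(R, R_hat, T))  # sqrt((1 + 4) / 2)
```

Only the entries indexed by T contribute, so estimates outside the test set do not affect the score.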
5.4.3 Experimental results
We compare our algorithm, using both similarity measures KS and CS, against the following
algorithms: NormalPredictor, BaselineOnly [Koren, 2010], KNNBasic [Altman, 1992],
KNNWithMeans [Altman, 1992], KNNBaseline [Koren, 2010], SVD [Salakhutdinov and Mnih,
2007], SVD++ [Koren, 2008], NMF [Lee and Seung, 2001], Slope One [Lemire and Maclachlan,
2005] and Co-clustering [George and Merugu, 2005]. The Python toolkit Surprise⁴ provides
implementations of these algorithms. We summarize the results of the experiments in
Table 5.2, for the synthetic data, and in Table 5.3, for the real datasets. In Table 5.3, the “–”
in the competing neighborhood-based algorithms means we could not get the results with
the available RAM (Jester case), and in the SVD++ case it means we could not get
a result in reasonable time. Hence, these methods suffer from scalability problems. For the
synthetic data, the best result is obtained by our approach with the similarity CS. When using
similarity KS, the result is the second or third best, depending on the matrix. We obtain these
results because the majority of the compared methods assume that the matrix they are completing
is low rank, which might be the case in these datasets, but might not be the case in general.
With real data, using both KS and CS similarity measures, our algorithm does not present
the lowest RMSE, except for the Jester-1 dataset. The reason may be the fact that most
of the compared methods assume that the completed matrix is low rank. However, the
results are comparable and of the same order as the best-reported ones. In conclusion, our
proposal represents an effective and efficient solution, since it combines the intrinsic values of
neighborhood-based CF (i.e., its simplicity, justifiability, and stability) with the advantages
offered by our algorithm (i.e., its efficiency, and being assumption-, model-, and parameter-free).

⁴ http://surpriselib.com/

Method            M1       M2       M3       M4
NormalPredictor   1.8692   1.8944   1.7140   1.9263
BaselineOnly      1.4667   1.4663   1.4306   1.4803
SVD               1.5155   1.5120   1.4660   1.5222
SVD++             1.5205   1.5176   1.4698   1.5279
NMF               1.6999   1.6703   1.7052   1.7686
Slope One         1.5270   1.5287   1.4760   1.5310
Co-clustering     1.5808   1.5630   1.5461   1.6442
KNNBasic          1.4665   1.4840   1.4383   1.5049
KNNWithMeans      1.5107   1.5150   1.4721   1.5269
KNNBaseline       1.4838   1.4998   1.4549   1.5126
KolMaC KS         1.4663   1.4676   1.4303   1.4848
KolMaC CS         1.4530   1.4520   1.4260   1.4714
Table 5.2: RMSE of a 5-fold cross-validation on four synthetic, random, full-rank 20 × 30
matrices.
Method            ML–100k   ML–1M    Jester-1   Jester-2   Jester-3
NormalPredictor   1.5228    1.5037   7.4572     7.2695     7.4490
BaselineOnly      0.9445    0.9086   4.5877     4.3139     4.5971
SVD               0.9396    0.8936   4.4957     4.5594     4.4957
SVD++             0.9200    –        4.5192     4.7277     4.5192
NMF               0.9634    0.9155   6.2372     7.1256     6.2768
Slope One         0.9454    0.9065   4.5187     4.2517     4.5187
Co-clustering     0.9678    0.9155   4.6634     4.3627     4.6693
KNNBasic          0.9789    0.9207   –          –          –
KNNWithMeans      0.9514    0.9292   –          –          –
KNNBaseline       0.9306    0.8949   –          –          –
KolMaC KS         0.9582    0.9330   4.4719     4.5164     4.7160
KolMaC CS         0.9465    0.9216   4.4582     4.5027     4.7107
Table 5.3: RMSE for the real-world datasets (ML–100k, ML–1M, Jester-1, Jester-2, and Jester-3).
A “–” marks a run for which no result was obtained.
5.5 Concluding Remarks
We presented a novel hybrid neighborhood-based CF recommender system. Our system performs
independent, user-by-user, matrix completion, utilizing Kolmogorov complexity. Our method
does not require assumptions about the rank of the matrix, we do not need to specify dimensions
of subspaces, and it is model-free. Therefore, it is more general. We present experimental
results on both synthetic and real datasets which show that our approach is comparable with
state-of-the-art approaches. The avenues for further research include exploring matrix completion
in the presence of noise and extending this work with an initial step that clusters
both users and items by similarity.
Chapter 6
A Novel Similarity Measure for
Group Recommender Systems with
Optimal Time Complexity
In line with the previous chapter, and given that KS is very fast to compute, in this
chapter, we explore applying this similarity in the context of group recommender systems. A
crucial phase of a group recommender system is to generate the groups of users. By using KS
in this step of a group recommender system, we obtain statistically the same results as when
employing the standard Pearson similarity. Moreover, we get a considerable gain in
terms of time complexity when using KS. In the phase that computes the users’ similarities,
this gain translates to spending a few minutes instead of a few hours with Pearson’s
similarity. This work was submitted for publication, see [Ramos et al., 2018a].
6.1 Introduction
Our online experience can tell much about our preferences. Indeed, from the analysis of
browsing sessions to the comments, likes, and ratings we leave, lots of implicit and explicit
traces are available on the Web. These preferences are usually stored in a user profile and
can be exploited by those running services, such as e-commerce websites or social media
platforms, to turn them into actionable knowledge, and provide us tailored services, like
recommendations.
Given that the experience of users on the Web is usually individual, these services are
tailored to single users. Group recommendation operates in contexts in which more than one
person is involved in the recommendation process [Boratto and Carta, 2011]. This area usually
focuses on offline scenarios, in which people have to experience something together (e.g., a
group that goes to dinner, or watches a movie).
Suppose that a brand wants to run an advertising campaign specific to its customers.
The usual way of doing it would be to target a group of users and recommend to them a set of
possibly interesting products. Usually, we base this targeting on users’ global preferences, i.e.,
on the whole user’s profile. However, it is not trivial to understand how the users interacted
with the items of a specific brand. Indeed, the preferences contained in the user profiles
are available only to those who are running an e-commerce website. This website should
provide a personalized service to the brand, by extracting segments of users, based only on
how they interacted with that brand.
We can effectively employ group recommendation in this scenario, by first analyzing how
the users interacted with a brand and detecting groups of users, and by providing group
recommendation to these groups, treating them as a target.
Open problem and scientific contribution. Group recommendation is naturally considered
a challenging area [Jameson and Smyth, 2007, Ricci, 2014], due to the fact that we have
to take into account multiple preferences in the recommendation process. Hence, the problem
we are tackling is even more challenging. Indeed, the information about the user preferences
becomes even more sparse than in the usual recommendation scenarios, and detecting
brand-specific groups is not trivial. This problem is due to the fact that the similarity for each pair
of users does not have to be detected just once, considering the whole profile, but once for
each brand. Hence, we need a fast similarity measure, able to deal both with the fact that
the group recommendation problem has to be solved multiple times, and with the continuous
evolution of the user preferences.
In this chapter, we propose a new similarity measure to group users. This similarity has
lower time and space complexity than the state-of-the-art Pearson correlation similarity measure,
while presenting statistically the same root-mean-square error (RMSE) results when tested on
offline datasets. More specifically, our contributions are the following:
• we propose a novel similarity measure based on Kolmogorov complexity that detects
the similarity for the users in an efficient and effective way;
• we show that our measure has lower time complexity than the state-of-the-art measure
used to compute the similarities between users (Pearson’s correlation), and that it is
optimal whenever n ≥ m;
• we embed our similarity measure in a group recommender system, and test its effective-
ness on two real-world datasets;
• our group recommender system is the first in the literature in which the recommendations
are both provided and meant to be consumed online¹. This contrasts with
classic group recommender systems, in which the users consume the items together in
real-world scenarios.
Chapter structure. The chapter is organized as follows. In Section 6.2, we present re-
lated work in group recommendation. In Section 6.3, we propose a novel similarity measure.
In Section 6.4, we analyze the computational complexity of the proposed similarity metric.
In Section 6.5, we describe the setup of our group recommender system, and we test it in
Section 6.6. In Section 6.7, we conclude the chapter and draw avenues for further research.
6.2 Background and Related Work
Group recommender systems provide suggestions in contexts in which the target of the
recommendations is not an individual, but multiple users [Boratto and Carta, 2011, Masthoff,
2015].
Group recommender systems naturally adapt to any scenario that involves a group of users.
Indeed, approaches have been developed for people who perform activities together, such as
going to the cinema [O’Connor et al., 2001], planning a trip [Ardissono et al., 2003, McCarthy
et al., 2006, De Pessemier et al., 2015], watching TV [Goren-Bar and Glinansky, 2004, Yu et al.,
2006b], or working out in a gym [McCarthy and Anagnost, 1998] (to name a few).
Providing group recommendations in online scenarios is an approach that has been ex-
plored, mostly taking advantage of social networks [Sanchez et al., 2014]. However, as men-
tioned in the Introduction, the group is meant to consume the items offline (e.g., in the
previously mentioned paper, movies are recommended based on user preferences and social
interactions). Recent literature has also shown that the offline interaction between the users
can help moving from individual to group preferences [Delic et al., 2016].
It is also worth highlighting that Ntoutsi et al. [Ntoutsi et al., 2012] previously
introduced the concept of fast group recommendation. By “fast” the authors meant that the
users are clustered in order to speed up the computation when the neighbors are selected
at the prediction stage. However, the group recommender system is run once for the whole
dataset, so this would not solve the problem tackled in this chapter. Nevertheless, a comparison
with this approach will be presented in Section 6.6.
As the analysis of the literature shows, no existing approach performs group
recommendation on subsets of a dataset, and thus none faces the efficiency and effectiveness problems
at the same time. Moreover, our approach is the first where recommendations may be provided
and consumed online.

¹ It is worth noting that, even if the users do not consume the recommendation together, this is still a group
recommendation, since the same set of items is recommended to a group of users.
6.3 The Kolmogorov-based similarity
First, we introduce a standard similarity measure to group users for group recommender
systems. The Pearson product-moment correlation coefficient [Lee Rodgers and Nicewander,
1988], or Pearson similarity, is the standard to measure the similarities between users [Schafer
et al., 2007a]. Given two vectors $X, Y \in \mathbb{R}^n$, the Pearson similarity between the two vectors is
given by
\[
\mathrm{Pearson}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X}) \cdot (Y_i - \bar{Y})}{(n - 1) \cdot \sigma_X \cdot \sigma_Y}. \tag{6.1}
\]
To keep the chapter self-contained, in this section, we recall the similarity measure we
proposed in Chapter 3 and that we used in Chapter 5 to design a recommender system. The
similarity is inspired by the notion of Kolmogorov complexity [Cover and Thomas, 2012], from
information theory. Given the description of a string, x, its Kolmogorov complexity, K(x),
is the length of the smallest computer program that outputs x. In other words, K(x) is the
length of the smallest compressor for x. Although Kolmogorov complexity is non-computable,
there are efficient and computable approximations by compressors. Let C be a compressor
and C(x) denote the length of the output string resulting from the compression of x using C.
Kolmogorov-based similarity. We define the Kolmogorov-based similarity between strings
x and y as
\[
\mathrm{KS}(x, y) = \frac{1}{1 + |C(x) - C(y)|}. \tag{6.2}
\]
In the context of this work, unlike in the previous chapters, the string x is the
string with the pairs of items and ratings given by a user, or the pairs of items and ratings
estimated/predicted for that user. To compress the description strings, we use the standard
Python function savez_compressed, from the numpy package. Intuitively, the Kolmogorov
similarity KS measures how related the most compact descriptions of a pair of users or a
pair of items are.
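A minimal sketch of this computation, assuming that C(x) is taken as the number of bytes that `numpy.savez_compressed` writes for a user's array of (item, rating) pairs; the in-memory buffer, the helper names, and the toy users are assumptions made for illustration.

```python
import io
import numpy as np


def compressed_size(ratings):
    """C(x): bytes produced by numpy.savez_compressed for a user's ratings."""
    buffer = io.BytesIO()
    np.savez_compressed(buffer, ratings=np.asarray(ratings))
    return buffer.getbuffer().nbytes


def ks(x, y):
    """Kolmogorov-based similarity of expression (6.2)."""
    return 1.0 / (1.0 + abs(compressed_size(x) - compressed_size(y)))


# Hypothetical users, each described by (item, rating) pairs.
u = [(1, 5), (4, 3), (9, 4)]
v = [(1, 4), (4, 3), (7, 2), (9, 5), (12, 1)]
print(ks(u, v))
```

In practice, `compressed_size` would be evaluated once per user and stored, so that each pairwise similarity reduces to a subtraction and a division.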
The presented Pearson similarity and the KS have, here, the same purpose. However, it
is not easy to compare them mathematically, although we are able to compare them in terms
of computational complexity.
6.4 Complexity Analysis
Now, we compare the theoretical time and space complexity that we need to compute the
Pearson and the KS similarities between every pair of users.
Lemma 6.4.1. Let $U = \{u_1, \ldots, u_n\}$ be a set of users, $I = \{i_1, \ldots, i_m\}$ a set of items and
R a set of ratings given by users to items. The time and space complexity of computing the
Pearson similarity of each pair of users are $O(mn^2)$ and O(n), respectively.
Proof. First, we compute and store the mean and standard deviation of the ratings for each
user. It takes O(m) time to compute expression (2.1) for each user. Therefore, we need
O(nm) time to compute the mean vector of the ratings of all users and O(n) space to store
it. Having the mean vector, we may compute the standard deviation, expression (2.2), also
in O(nm) time and store it in O(n) space. Now, we need to compute expression (6.1). For a
pair of users’ ratings, having the vector of means and the vector of standard deviations stored,
we need O(m) time to compute the Pearson similarity between the pair of users. To compute
the Pearson similarity of every pair of users, $O(n^2)$ pairs, we need $O(mn^2)$ time.
Lemma 6.4.2. Let $U = \{u_1, \ldots, u_n\}$ be a set of users, $I = \{i_1, \ldots, i_m\}$ a set of items and
R a set of ratings given by users to items. The time and space complexity of computing
the Kolmogorov-based similarity (KS) of each pair of users are $O(\max\{n^2, nm\})$ and O(n),
respectively.
Proof. First, we compute, and store, the compression size of each set of ratings of each user.
For each user, the set of ratings has O(m) size, and we can compute its compression in O(m)
time, using [Williams, 1991], for instance. The O(n) compressions, one for each user, take O(nm)
time and O(n) space to store them. Now, we can compute the KS between a pair of users
in O(1) time, using expression (6.2) and the stored values of the previous step. Finally, to
compute the KS similarity for each pair of users, $O(n^2)$ pairs, we need $O(n^2)$ time. Hence, in
total, we need $O(\max\{n^2, nm\})$ time.
Notice that for the Pearson similarity we need 2n space to store the vector of means and
the vector of the standard deviations of the users’ ratings. For KS, we need only n space
to store the size of the compressions of the set of ratings that each user gave. The time
complexity of KS contrasts with that of the Pearson similarity, because it requires one order
fewer operations, as we summarize in Table 6.1.
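As an illustrative worked example, take the ML–1M dimensions reported in Table 5.1, n = 6040 users and m = 3952 items, and count only the dominant terms of the two lemmas (constants and lower-order terms are ignored, so these are orders of magnitude rather than measured operation counts):

\[
\begin{aligned}
\text{Pearson:}\quad & m n^2 = 3952 \times 6040^2 = 144{,}175{,}283{,}200 \approx 1.44 \times 10^{11},\\
\text{KS:}\quad & \max\{n^2, nm\} = \max\{36{,}481{,}600,\; 23{,}870{,}080\} = 36{,}481{,}600 \approx 3.65 \times 10^{7}.
\end{aligned}
\]

Under this rough count, the two dominant terms differ by a factor of m = 3952.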
Observe that KS has optimal time complexity whenever n ≥ m, since to compute the
similarity between each pair of users we always need to compute $O(n^2)$ values, which is the
minimum possible time complexity and the complexity of computing these similarities with
KS.

        Pearson      KS
Time    O(mn^2)      O(max{n^2, nm})
Space   O(n)         O(n)
Table 6.1: Time and space complexity of the similarities.
6.5 The group recommender system
Here, we present the group recommender algorithm we use in this work, Algorithm 3.
Algorithm 3 Group Recommender System.
1: input: ratings’ matrix R, number of clusters k, a similarity measure s, and rating pre-
diction function Pred
2: compute P = Pred(R)
3: compute similarity of each pair of users with function s
4: compute G the k groups of users, clustered by similarities
5: group estimated ratings in P for each group as the average of predictions
6: output: Recommendation list for each group in G
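Steps 5 and 6 of Algorithm 3 can be sketched as follows, assuming that the prediction matrix P (step 2) and the group label of each user (step 4) are already available. The function name, the `top_n` parameter, and the toy data are hypothetical and serve only as illustration.

```python
import numpy as np


def group_recommendations(P, labels, top_n=3):
    """Average the predictions per group and rank the items by group rating."""
    recs = {}
    for g in np.unique(labels):
        group_rating = P[labels == g].mean(axis=0)    # step 5: average of predictions
        ranked = np.argsort(-group_rating)[:top_n]    # step 6: sort by decreasing rating
        recs[int(g)] = ranked.tolist()
    return recs


# Hypothetical predictions for 4 users and 3 items, and a hypothetical grouping.
P = np.array([[4.0, 2.0, 5.0],
              [4.0, 1.0, 5.0],
              [1.0, 5.0, 2.0],
              [2.0, 4.0, 3.0]])
labels = np.array([0, 0, 1, 1])
print(group_recommendations(P, labels, top_n=2))
```

Every user in a group receives the same ranked list, which is what makes this a group recommendation even when items are consumed individually.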
In particular, in this work, we use as the prediction function Pred the benchmark SVD
algorithm for matrix completion, see [Mnih and Salakhutdinov, 2008]. It has time complexity
of $O(\min\{mn^2, m^2 n\})$ [Holmes et al., 2007]. Also, for comparison, we test the k-nearest
neighbors algorithm (KNN) as the prediction function Pred, see [Koren, 2010], with time
complexity of $O(m^2 n)$.
Further, we use a polynomial-time approximation of the k-means clustering algorithm.
The k-means problem [MacQueen et al., 1967] is, in general, NP-hard for a generic number
of clusters k, even in the plane, see [Mahajan et al., 2009]. In practice, k-means may be
approximated by Lloyd’s heuristic algorithm [Hartigan and Wong, 1979], which has time
complexity of O(nkdi), where k is the number of clusters, d is the dimension of the elements
that we are clustering, and i is the number of iterations needed until convergence, which is
usually small. In our case d = 1, and the number of clusters k is a parameter of Algorithm 3.
Theorem 6.5.1. Let i denote the number of iterations that Lloyd’s algorithm takes to compute
the users’ clusters. Let n be the number of users, m the number of items and k the number
of users’ groups. The time complexity of Algorithm 3, using SVD as the Pred algorithm of
step 2, is:
• $O(mn^2 + nki + km \log m)$, using Pearson similarity;
• $O(\min\{mn^2, m^2 n\} + \max\{n^2, mn\} + nki + km \log m)$, using KS.
Proof. The time complexity of Algorithm 3 is the sum of the time complexities of each
step. The SVD step, step 2, takes $O(\min\{mn^2, m^2 n\})$ time. Lloyd’s algorithm,
the clustering step 4, takes O(nki) time, where i is the number of iterations needed until
convergence. If we use the Pearson similarity in step 3, by Lemma 6.4.1, we need $O(mn^2)$
time to compute the similarities between every pair of users. This yields a total time of
$O(n^2 m + nki)$ for steps 2 to 4, because $mn^2 \ge \min\{mn^2, m^2 n\}$. If we use the KS in step 3, by Lemma 6.4.2,
we need $O(\max\{n^2, mn\})$ time to compute the similarities between every pair of users, and
$\max\{n^2, mn\} < \min\{mn^2, m^2 n\}$. Step 5 takes O(mn) time to average the ratings’ estimations
for the users of each group. Producing the output, step 6, takes $O(km \log m)$ time to sort the
ratings’ predictions for each of the k groups. Hence, the total amount of time is
$O(n^2 m + nki + km \log m)$ and
$O(\min\{mn^2, m^2 n\} + \max\{n^2, mn\} + nki + km \log m)$ when using Pearson similarity and KS,
respectively.
In fact, Theorem 6.5.1 states that the complexity order of Algorithm 3 is strictly better
when using KS whenever n > m; otherwise, it has the same order as when using the Pearson
similarity. However, if we compare not only the complexity order but also the
exact complexity, we have the following. When we use Pearson similarity, the total amount
of time is $c_1 mn^2 + c_2 \min\{mn^2, m^2 n\} + c_3 nki + c_4 km \log m$, for some non-zero constants
$c_1, c_2, c_3, c_4 \in \mathbb{R}^+$. When we use KS, the total amount of time is
$c_2 \min\{mn^2, m^2 n\} + c_3 nki + c_4 km \log m + c' mn + c'' \max\{n^2, nm\}$, for the same
$c_2, c_3, c_4 \in \mathbb{R}^+$ as in the Pearson scenario,
and $c', c'' \in \mathbb{R}^+$. Therefore, the time complexity is always strictly better in the case that we
use KS.
Observe that if, instead of using the SVD algorithm in step 2 of Algorithm 3, we use the
KNN algorithm, then the time complexity is the following. Using the Pearson similarity, we
get $O(mn^2 + m^2 n + nki + km \log m)$, and using KS we always obtain a better complexity
order of $O(m^2 n + nki + km \log m + \max\{n^2, nm\})$.
6.6 Experimental Setup
In this section, we test our similarity measure on two real-world datasets. We use the MovieLens
100k (ML–100k) and the MovieLens 1M (ML–1M), available at http://movielens.umn.edu;
both datasets have ratings in {1, . . . , 5}, with ⊥ = 0. Recall Table 5.1, which
we repeat in Table 6.2. The choice of two relatively small and very sparse datasets was made
to simulate our scenario, in which group recommendations have to be computed for medium-
and large-sized companies.
All experiments were done on a 3.33GHz Six-core Intel Xeon, with 6GB 1333MHz RAM,
using Python 3, and with OS X 10.13. We use the Surprise scikit [Hug, 2017] to compute the
individual predictions with the SVD algorithm and the KNN algorithm, step 2 of Algorithm 3.
Further, to compute the Pearson similarity we use the pearsonr function from the Python
package scipy.stats.
       ML–100k    ML–1M
|U|    983        6040
|I|    1682       3952
|R|    100,000    1,000,000
Table 6.2: Details of datasets MovieLens 100k and 1M.
6.6.1 Evaluation metric
To evaluate and compare the performance of the proposed algorithm, Algorithm 3, we use,
again, 5-fold cross-validation. For the ML–100k, the dataset already provides a
set of 5 train and test files. For the ML–1M, we randomly split the original dataset into a set of
5 train/test files.
We use, as in the previous chapter, the root-mean-square error (RMSE) [Koren, 2008]
to evaluate the performance of the proposed group recommender algorithm. Recall that it
measures the difference between the estimated missing values and the original values, as we
detail next. Let R be the original ratings matrix, and let R∗ be the train set, equal to R
except on the entries of the test set T, where it has the value ⊥. Let $\hat{R}_{ui}$ denote the
estimated rating, for item i, of the group to which user u belongs, and $\hat{R}$ the matrix with all the
estimated ratings. The RMSE is given by
\[
\mathrm{RMSE}(R, \hat{R}) = \sqrt{\frac{1}{|T|} \sum_{(u,i) \in T} \left(R_{ui} - \hat{R}_{ui}\right)^2}. \tag{6.3}
\]
Observe that, here, the RMSE is not measuring the same as in Chapter 5. Here, it is measuring
the difference between the real rating that a user gave to an item and the estimated group
rating of that item, for the group that user belongs to.
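The distinction can be made concrete with a short sketch in which the estimate for user u and item i is the rating of u's group, taken here as the average of the individual predictions in that group (step 5 of Algorithm 3). The function name, the label array, and the toy data are assumptions made for illustration.

```python
import numpy as np


def group_rmse(R, P, labels, test_pairs):
    """RMSE between each real test rating and the rating of the user's group."""
    # Group rating of every item: average of the members' individual predictions.
    group_rating = {g: P[labels == g].mean(axis=0) for g in np.unique(labels)}
    errors = [R[u, i] - group_rating[labels[u]][i] for u, i in test_pairs]
    return float(np.sqrt(np.mean(np.square(errors))))


# Hypothetical real ratings, individual predictions, grouping, and test set.
R = np.array([[5.0, 2.0], [3.0, 2.0]])
P = np.array([[4.0, 2.0], [2.0, 2.0]])
labels = np.array([0, 0])
T = [(0, 0), (1, 0)]
print(group_rmse(R, P, labels, T))  # group rating of item 0 is 3.0
```

With a single group per user the group rating coincides with the individual prediction, and the measure reduces to the RMSE used in Chapter 5.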
6.6.2 Experimental Results
Now, we present the experimental results of Algorithm 3. We test the users’ clustering/grouping
phase using the Pearson similarity versus our proposed Kolmogorov-based similarity (KS). For
the ratings’ prediction phase, step 2 of Algorithm 3, we test both the SVD and the KNN
algorithms².
Figure 6.1 and Figure 6.2 depict the evolution of the RMSE (6.3) (y-axis) with the number of
users’ groups (x-axis), as the average of a 5-fold cross-validation. The blue points
correspond to using the Pearson similarity and the yellow points to using the KS.
Figure 6.1: RMSE evolution with the number of users’ groups of a 5-fold-cross-validation
method, using Algorithm 3 with SVD for its step 2, with Pearson similarity (blue points) and
KS (yellow points) for the ML-100K.
Next, we check that the results depicted in Figure 6.1 and Figure 6.2 do not depend on
the prediction algorithm (SVD), step 2 of Algorithm 3. For this purpose, we replace the SVD
by the KNN algorithm, and we obtain the results in Figure 6.3 and Figure 6.4.
We obtain better RMSE results when using SVD for the ratings’ prediction phase than
when using KNN, which is expected. More importantly, we get the same behavior in the
evolution of the RMSE with the number of groups for the KS and the Pearson similarity,
when using either the SVD or the KNN in the prediction step.
2In order to speed up the process furthermore and embrace the concept of fast group recommendation
proposed by Ntoutsi et al. [Ntoutsi et al., 2012], we also considered an alternative to the KNN approach, in
which the neighbors were only selected inside the cluster of the target user. However, results show that, in our
context, the effectiveness decreases. These results are not presented, to improve the readability of the chapter.
[Figure 6.2 (plot): "Movielens 1M with SVD"; RMSE (y-axis) versus number of users' groups (x-axis); series: Pearson, KS.]
Figure 6.2: RMSE evolution with the number of users’ groups of a 5-fold-cross-validation
method, using Algorithm 3 with SVD for its step 2, with Pearson similarity (blue points) and
KS (yellow points) for the ML-1M.
We can see in Figures 6.1–6.4 that the RMSE results when using the Pearson similarity
or the KS are very close. Hence, we test the null hypothesis (H0) that the differences we
obtain in the results are due to randomness. We compare the two means for the 5-fold tests
of each different group size with the Student’s t-test [O’Mahony, 1986]. For both datasets,
we obtained p-values larger than 0.05 and, thus, we cannot reject H0. In other words, the
differences between the RMSE results obtained with the two similarities, in Figure 6.1 and
Figure 6.2 as well as in Figure 6.3 and Figure 6.4, are not statistically significant.
Finally, in Table 6.3, we present the average and standard deviation of the time that
Algorithm 3 spends in step 3, the computation of the similarities between each pair of users, also
using a 5-fold cross-validation. Table 6.3 compares, in practice, the time complexity
results of Lemma 6.4.1 and Lemma 6.4.2, which are part of Algorithm 3 and responsible for
the difference between the two cases of time complexity in Theorem 6.5.1. Recall that, to compute
the similarity of each pair of users, we need $O(mn^2)$ time using the Pearson similarity, and
$O(n^2)$ using KS. In practice, we notice that for the ML-100k the KS takes a few seconds
against the 2 minutes needed in the Pearson similarity case. Further, for the ML-1M the
gain is even more noticeable because KS takes only about 1 minute versus more than 3 hours
needed for the Pearson similarity scenario.

            ML-100k                   ML-1M
Pearson     2'5.1710" ± 1.1862"       3h23'20.3505" ± 10'29.6983"
KS          4.5608" ± 0.0688"         1'13.9046" ± 3.1560"
Table 6.3: Average and standard deviation of the computation time of the similarities between
every pair of users in a 5-fold cross validation.

[Figure 6.3 (plot): "Movielens 100K with KNN"; RMSE (y-axis) versus number of users' groups (x-axis); series: Pearson, KS.]
Figure 6.3: RMSE evolution with the number of users’ groups of a 5-fold-cross-validation
method, using Algorithm 3 with KNN for its step 2, with Pearson similarity (blue points) and
KS (yellow points) for the ML-100K.
6.7 Concluding Remarks
In this chapter, we tackled the problem of producing brand-specific group recommendations,
i.e., recommendations to groups of users, considering only the preferences expressed for a
specific brand. Since we need to compute the similarity for a pair of users multiple times,
we devised a novel and fast-to-compute similarity measure, the Kolmogorov-based similarity
(KS). Our similarity measure has better (and optimal) theoretical computational complexity
than the state-of-the-art Pearson similarity, which is widely used in the group recommendation
community. We tested these similarity measures in the context of group recommendation on
two real-world datasets. The RMSE that we obtained for both similarities is statistically the
same, up to some randomness. For the larger dataset, the computation of users’ similarities
took about 1 minute using the KS, while it took more than 3 hours using the Pearson similarity.
In future work, we will analyze the obtained clusters. This analysis would allow us to explain to
a brand what the characteristics of each targeted group are, in terms of users’ preferences.
Also as future work, we would like to study the relation of the KS with known Kolmogorov-based
distances, see [Li et al., 2004] and references therein, and to explore using different
compressors to compute KS.
[Figure 6.4 (plot): "Movielens 1M with KNN"; RMSE (y-axis) versus number of users' groups (x-axis); series: Pearson, KS.]
Figure 6.4: RMSE evolution with the number of users’ groups of a 5-fold-cross-validation
method, using Algorithm 3 with KNN for its step 2, with Pearson similarity (blue points) and
KS (yellow points) for the ML-1M.
Part II
Control of dynamical systems
Chapter 7
Preliminaries and Notation
In this part of the thesis, we solve the robust minimal controllability problem for both linear
time-invariant (LTI) systems and switched LTI systems, with some additional assumptions.
Under the scenario that a specified number of actuators may fail over time, the problem
consists in determining a placement of the minimal number of actuators that ensures that
the system is controllable. Then, we use a digraph decomposition, used in structural control
theory, to present a more general bound for the index of convergence of Boolean matrices.
The outline of this part is the following: In Chapter 8, we solve the robust minimal
controllability problem for LTI systems. In Chapter 9, we extend the result of the previous
chapter for switched LTI systems. Finally, in Chapter 10, we present a more general bound
for the index of convergence of Boolean matrices.
Now, we introduce the notation used in the next two chapters, to avoid repeating
definitions and notation. The notation of Chapter 10 is introduced within that chapter, to
improve the readability of the manuscript.
We denote vectors by lowercase letters such as $v$, $w$, $b$, and their corresponding entries by
subscripts. A collection of vectors is denoted by $\{v^j\}_{j \in \mathcal{J}}$, where the superscript indicates an
enumeration of the vectors using indices from a set such as $\mathcal{I}, \mathcal{J} \subset \mathbb{N}$. We use square brackets
in vectors or matrices to separate an enumeration of those from their entries, e.g., $[B^k]_{i,j}$
stands for the entry in the $i$th row and $j$th column of the $k$th matrix of an enumeration $\{B^k\}$, and $[v^j_i]_k$
stands for the $k$th entry of the vector $v^j_i$. The number of elements of a set $\mathcal{S}$ is denoted by $|\mathcal{S}|$.
We denote by $I_n$ the $n$-dimensional identity matrix. Given a matrix $A$, $\sigma(A)$ denotes the set
of eigenvalues of $A$, the spectrum of $A$. Given two matrices $M_1 \in \mathbb{C}^{n \times m_1}$ and $M_2 \in \mathbb{C}^{n \times m_2}$,
the matrix $[M_1 \; M_2]$ is the $n \times (m_1 + m_2)$ concatenated complex matrix. If $\mathcal{I} = \{i_1, \dots, i_k\}$
and $B \in \{0, 1\}^{n \times m}$, with $m, n \geq k$, $B(\mathcal{I})$ is the matrix where $[B]_{j,i} = 1$ for $i \in \mathcal{I}$ and the
remaining entries of $B$ are equal to zero.
The structural pattern of a vector/matrix, or a structural vector/matrix, has its entries
in $\{0, \star\}$, where $\star$ denotes a non-zero entry, and it is denoted by a vector/matrix with a
bar on top of it, e.g., $\bar{v}$. We denote by $A^\top$ the transpose of $A$. The function $\cdot : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ denotes
the usual inner product in $\mathbb{C}^n$, i.e., $v \cdot w = v^\dagger w$, where $v^\dagger$ denotes the adjoint of $v$ (the conjugate
of $v^\top$). With some abuse of notation, $\cdot : \{0, \star\}^n \times \{0, \star\}^n \to \{0, \star\}$ also denotes the map where
$\bar{v} \cdot \bar{w} \neq 0$, with $\bar{v}, \bar{w} \in \{0, \star\}^n$, if and only if there exists $i \in \{1, \dots, n\}$ such that $\bar{v}_i = \bar{w}_i = \star$.
Similarly, given $\bar{u}, \bar{v} \in \{0, \star\}^n$, we extend the plus operation, $+ : \{0, \star\}^n \times \{0, \star\}^n \to \{0, \star\}^n$, to be
$\bar{w} = \bar{u} + \bar{v}$, with $\bar{w}_i = 0$ if $\bar{u}_i = \bar{v}_i = 0$ and $\bar{w}_i = \star$ otherwise, for $i = 1, \dots, n$. Additionally,
$\|v\|_0$ denotes the number of non-zero entries of the vector $v$ in either $\{0, \star\}^n$ or $\mathbb{R}^n$. Given a
subspace $\mathcal{H} \subset \mathbb{C}^n$, we denote by $\mathcal{H}^c$ its complement with respect to $\mathbb{C}^n$, i.e., $\mathcal{H}^c = \mathbb{C}^n \setminus \mathcal{H}$. With
abuse of notation, we will use inequalities involving structural vectors as well. For instance,
we say $\bar{v} \geq \bar{w}$ for two structural vectors $\bar{v}$ and $\bar{w}$ if and only if the following two conditions
hold: (i) if $\bar{w}_i = 0$, then $\bar{v}_i \in \{0, \star\}$, and (ii) if $\bar{w}_i = \star$, then $\bar{v}_i = \star$.
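A multiset is a set where each element may occur more than once. Consider multisets $\mathcal{X}$
and $\mathcal{Y}$. The multiset $\mathcal{X} \sqcup \mathcal{Y}$ is such that, if $a \in \mathcal{X}$ or $a \in \mathcal{Y}$, then $a \in \mathcal{X} \sqcup \mathcal{Y}$. If $a$ occurs $n_1$
times in $\mathcal{X}$ and $n_2$ times in $\mathcal{Y}$, then $a$ occurs $\max\{n_1, n_2\}$ times in $\mathcal{X} \sqcup \mathcal{Y}$. Although not very
intuitive, this ‘unusual’ union will be useful to address (ii), in Section 9.3.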
Chapter 8
The Robust Minimal Controllability
Problem
In this chapter, we solve the robust minimal controllability problem for LTI systems, with
some additional assumptions. The problem is to find the minimal number of actuators and
their placement, ensuring that the system is controllable in the scenario where a specified
number of actuators may fail. This may happen due to an external agent tampering with
the system or due to natural causes. We show that the problem at hand is
NP-complete, and we present an algorithm that solves it explicitly. Further, we provide
polynomial-time algorithms that approximately solve the problem.
Hence, we gain ground on the topic of information security applied to the area of control
systems. We published this work in [Pequito et al., 2016b].
8.1 Introduction
The problem of guaranteeing that a dynamical system can be driven toward the desired state
regardless of its initial position is a fundamental question studied in control systems and it is
referred to as controllability. Several applications, for instance, control processes, multi-agents
networks, control of large flexible structures, systems biology and power systems [Egerstedt,
2011, Siljak, 2007, Skogestad, 2004] rely on the notion of controllability to safeguard their
proper functioning. Subsequently, it is important to identify which subsets of state variables
need to be actuated, or what is the placement of actuators required, to ensure controllabil-
ity [van de Wal and de Jager, 2001,Olshevsky, 2014,Pequito et al., 2016a].
Moreover, actuators may malfunction over time due to the adverse nature of the environments
where they are deployed, e.g., due to the wear and tear of the materials, or
due to the external (adversarial) influence of an agent aiming to disrupt the proper functioning
of the dynamical system. In fact, a classical example of such a malicious attack is the Stuxnet
malware incident [Langner, 2011], in which the controller’s input response to a tampered measured
output led the system away from its normal operating conditions. Thus, the control
designer needs to consider such scenarios, while accounting for the actuator placement [Velde
and Carignan, 1984]. Additionally, as the systems become larger (i.e., as the dimension of their
state space grows), we aim to identify a relatively small subset of state variables that ensures the
controllability of the system, for instance, due to economic constraints [Olshevsky, 2014].
Consequently, in this chapter we address the following natural design question:
Q1: What is the minimum number of actuated state variables, and what is the configuration
of actuators, that we need to consider to ensure the controllability of a dynamical system if a specified
number of actuator failures occur?
To formally capture Q1, we introduce and study the robust minimal controllability problem
(rMCP), which aims to determine the minimum number of state variables that need to be
actuated to ensure the system’s controllability, under the possible failure of a specified number of
actuators. This is a generalization of the minimal controllability problem (MCP) [Olshevsky, 2014],
which can be obtained as a particular case of the rMCP when no actuator
fails. Therefore, the MCP is the first step to understand resilience and robustness properties
of dynamical systems, since it unveils which variables need to be actuated.
Finally, it is important to mention that the rMCP can be stated in terms of observability,
by invoking the duality between controllability and observability in LTI systems [Hespanha,
2009]. In particular, [Shoukry and Tabuada, 2014, Chen et al., 2015, Fawzi et al., 2012] provide
necessary and sufficient conditions concerning the sensor deployment to ensure that a reliable
estimate of the system is recovered. More importantly, those conditions can be achieved by
design, when solving the rMCP, hence guaranteeing the design of stable observers to properly
monitor the state evolution of an LTI system. Furthermore, the results presented in this
chapter are for discrete-time LTI systems, but they are readily applicable to continuous-time LTI systems.
Related Work: This chapter follows up and subsumes previous literature by consider-
ing the deployment of actuators to ensure controllability under possible actuation failures.
When no actuators fail, it extends the results available for the MCP, as we overview next.
In [Nabi-Abdolyousefi and Mesbahi, 2013] the controllability of circulant networks is ana-
lyzed by exploring the Popov-Belevitch-Hautus eigenvalue criterion, where the eigenvalues
are characterized using the Cauchy-Binet formula. The controllability in multi-agents with
Laplacian dynamics was initially explored in [Tanner, 2004]. Later, in [Rahmani et al., 2009]
and [Egerstedt et al., 2012], necessary and sufficient conditions are given in terms of partitions
of the Laplacian graph. In [Parlangeli and Notarstefano, 2012], the controllability is explored
for paths and cycles, and later extended by the same authors to the controllability of grid
graphs by means of reductions and symmetries of the graph [Notarstefano and Parlangeli,
2013], and considering dynamics that are scaled Laplacians. In [Kibangou and Commault,
2014] and [Zhang et al., 2011], the controllability is studied for strongly regular graphs and
distance-regular graphs. Recently, new insights on the controllability of Laplacian dynamics
are given regarding the uncontrollable subspace, in [Aguilar and Gharesifard, 2014] and [Chap-
man and Mesbahi, 2014]. In addition, in [Pasqualetti and Zampieri, 2014] the controllability
of isotropic and anisotropic networks is analyzed.
Furthermore, [Aguilar and Gharesifard, 2014] concludes by pointing out that further study
of non-symmetric dynamics and the controllability is required – which we address in the
present chapter. Therefore, we consider a much less restrictive assumption: A is a simple
matrix, i.e., all of its eigenvalues are distinct. Moreover, there are several applications where
A satisfies this assumption, for instance, all dynamical systems modeled as random networks
of the Erdős–Rényi type [Tao and Vu, 2014], as well as several known dynamical systems used
as benchmarks in control systems engineering [Ogata, 2001,Siljak, 1991,Siljak, 2007].
Observe that the MCP presents both continuous and discrete optimization properties,
captured by the controllability property and the number of non-zero entries, respectively.
To sidestep the mixed nature of this problem, in [Olshevsky, 2014], the non-zero entries of the
input matrix were randomly generated. In the present chapter, we ‘decouple’ the continuous
and discrete optimization properties, and show that by first solving the discrete part of
the problem, it is always possible to deterministically obtain a solution to the MCP in a second
phase. The first step reduces the MCP to the set covering problem, which is well known to be
NP-hard. Nonetheless, the set covering problem is one of the most studied NP-hard problems
(probably second only to the SAT problem). Consequently, although the set covering problem
is NP-hard, some subclasses of the problem are equipped with sufficient structure that can
be leveraged to invoke a polynomial-time algorithm that approximates the solution with ‘almost’
optimality guarantees [Brönnimann and Goodrich, 1995]. This contrasts with the approach
proposed in [Olshevsky, 2014], where an approximate solution particular to the MCP
was provided. In addition, we study the rMCP, which has not been previously addressed
in the literature. Similarly to the MCP, we show that the rMCP can be polynomially reduced
to the set multi-covering problem, i.e., a set covering problem that requires the same
elements to be covered a predefined number of times. Furthermore, extensions of polynomial
approximation algorithms are also available with similar optimality guarantees.
Alternatively, when the parameters of the LTI system are not exactly known, and are assumed
to be independent, structural systems theory [Dion et al., 2003] can be used to address the
MCP and the rMCP while ensuring structural controllability, see [Pequito et al., 2016a] and [Liu
et al., 2015], respectively. Notwithstanding, the tools and conditions to ensure structural
controllability are quite different from those adopted in this chapter, and a solution to the
minimal structural controllability problem (MSCP) is not necessarily a solution to the MCP when
the dynamics’ matrix is simple [Pequito et al., 2016b].
Main Contributions of the present chapter are as follows: (i) we characterize the exact
solutions to the MCP; (ii) we show that for a given dynamics’ matrix almost all input vectors
satisfying a specified structure are solutions to the MCP; (iii) we show that the rMCP is an
NP-hard problem; (iv) we characterize the exact solutions to the rMCP; (v) we prove that
the decision versions of both MCPs are NP-complete; (vi) we provide approximate solutions
to the rMCP and discuss their optimality guarantees; and, finally, (vii) we discuss the
limitations of the proposed methodology.
The remainder of this chapter is organized as follows. In Section 8.2, we formally state
the rMCP addressed in this chapter. Next, in Section 8.3, we review some concepts required
to prove the main results of this chapter. In Section 8.4, we present the main results of this
chapter, i.e., we characterize the solutions to the rMCP, its complexity, and a polynomial
algorithm that approximates the solutions. Finally, in Section 8.5, we provide some examples
that illustrate the main results of the chapter and discuss the limitations of the proposed
methodology.
8.2 Problems Statement
Under the adverse scenarios of failure of, or malicious tampering with, the actuators, the dynamics of
the system can be modeled by
$$x(k+1) = A x(k) + B_{\mathcal{M} \setminus \mathcal{A}} u(k), \tag{8.1}$$
where $x(k) \in \mathbb{R}^n$ is the state of the system, $u(k) \in \mathbb{R}^p$ is the input signal exerted by the
actuators, and $k \in \mathbb{N}$ denotes the time instant. The matrix $A \in \mathbb{R}^{n \times n}$, which is referred to
as the system dynamics’ matrix, describes the coupling between state variables. In addition,
$B_{\mathcal{M} \setminus \mathcal{A}}$ consists of the subset of columns of $B$ with indices in $\mathcal{M} \setminus \mathcal{A}$, where $\mathcal{M} = \{1, \dots, p\}$ is the
set of inputs’ labeling indices and $\mathcal{A}$ is the set of indices of malfunctioning actuators. Therefore,
an extra set of actuators should be in place to ensure that it is still possible to control the
system if some inputs fail. By identifying the system (8.1) with the pair $(A, B_{\mathcal{M} \setminus \mathcal{A}})$, we aim
to ensure that this pair is controllable, so the rMCP can be posed as follows.
P: Given a dynamics’ matrix $A \in \mathbb{R}^{n \times n}$ and the number of possible input failures $s$, determine
the matrix $B^* \in \mathbb{R}^{n \times (s+1)n}$ such that
$$\begin{aligned} B^* = \arg\min_{B \in \mathbb{R}^{n \times (s+1)n}} \quad & \|B\|_0 \\ \text{s.t.} \quad & (A, B_{\mathcal{M} \setminus \mathcal{A}}) \text{ is controllable}, \\ & |\mathcal{A}| \leq s, \ \mathcal{A} \subset \mathcal{M}, \end{aligned} \tag{8.2}$$
where $\mathcal{M} \subset \{1, \dots, (s+1)n\}$ are the indices of the non-zero columns of the matrix $B$. Notice
that the dimension of $B$ is $n \times (s+1)n$, to ensure that a solution always exists. In particular,
in the worst-case scenario, the matrix $B$ that concatenates $s+1$ copies of the identity matrix is a
feasible solution. In practice, only the non-zero columns of $B$ matter, which we refer to as
effective inputs. Notice that when $s = 0$, we recover the MCP, so we first provide
the solution to the MCP, which we later extend to characterize the solutions to
the rMCP.
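To make the feasibility condition in (8.2) concrete, the following minimal sketch checks by brute force whether a pair stays controllable under every failure of at most s inputs. It is not the method developed in this chapter: it assumes the standard Kalman rank test (equivalent to the PBH test for LTI systems) with exact rational arithmetic, it enumerates all failure sets, which is exponential in s, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def is_controllable(A, columns):
    """Kalman test: the vectors b, Ab, ..., A^(n-1) b over all columns b span R^n."""
    n = len(A)
    krylov = []
    for b in columns:
        v = list(b)
        for _ in range(n):
            krylov.append(v)
            v = mat_vec(A, v)
    return bool(krylov) and rank(krylov) == n


def is_robustly_controllable(A, columns, s):
    """True if the pair stays controllable after removing any set of at most s columns."""
    indices = range(len(columns))
    for size in range(s + 1):
        for failed in combinations(indices, size):
            surviving = [columns[j] for j in indices if j not in failed]
            if not is_controllable(A, surviving):
                return False
    return True
```

For instance, with A = diag(1, 2), the single input column (1, 1) makes the pair controllable, but two copies of that column are needed to tolerate one failure.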
The main assumptions in this chapter are as follows:
Assumption 1: The dynamics’ matrix is simple, i.e., all the eigenvalues of A are distinct.
Observe that Assumption 1 is not very restrictive since there are several applications where
$A$ satisfies this assumption. For example, dynamical systems modeled as random networks
of the Erdős–Rényi type [Tao and Vu, 2014], as well as known dynamical systems used as
benchmarks in control systems engineering [Ogata, 2001, Siljak, 1991, Siljak, 2007].
Assumption 2: A left-eigenbasis of $A$ is available, i.e., the eigenbasis consisting of
left-eigenvectors of $A$.
The second assumption is required for technical reasons, since an eigenbasis is determined
using numerical methods. Therefore, in practice, it may be composed of eigenvectors
approximated up to a given floating-point error; see Section 8.4.1 for further discussion.
8.3 Preliminaries and Terminology
In this section, we introduce some basic concepts of computational complexity required
to characterize the rMCP using the following NP-hard problem.
Definition 8.3.1 ([Chekuri et al., 2012]). (Minimum Set Multi-covering Problem) Consider a set
of $m$ elements $\mathcal{U} = \{1, 2, \dots, m\}$, referred to as the universe, a collection of $n$ sets $\mathcal{S} = \{\mathcal{S}_1, \dots, \mathcal{S}_n\}$,
with $\mathcal{S}_j \subset \mathcal{U}$ for $j \in \{1, \dots, n\}$ and $\bigcup_{j=1}^{n} \mathcal{S}_j = \mathcal{U}$, and a demand function $d : \mathcal{U} \to \mathbb{N}$ that
indicates the number of times an element $i$ needs to be covered. In other words, $d(i)$ is
the minimum number of selected sets of $\mathcal{S}$ that must contain $i$.
The minimum set multi-covering problem consists of finding a set of indices
$\mathcal{J}^* \subseteq \{1, 2, \dots, n\}$ corresponding to the minimum number of sets covering $\mathcal{U}$, where every
element $i \in \mathcal{U}$ is covered at least $d(i)$ times, i.e.,
$$\begin{aligned} \mathcal{J}^* = \arg\min_{\mathcal{J} \subseteq \{1, 2, \dots, n\}} \quad & |\mathcal{J}| \\ \text{s.t.} \quad & |\{j \in \mathcal{J} : i \in \mathcal{S}_j\}| \geq d(i) \ \text{for all } i \in \mathcal{U}. \end{aligned}$$
In particular, if $d(i) = 1$ for all $i \in \mathcal{U}$, then we obtain the minimum set covering
problem.
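As an illustration of Definition 8.3.1, the sketch below implements the classical greedy heuristic for set multi-covering: it repeatedly picks the unused set that covers the most elements whose demand is still unmet. This is only an assumed, generic approximation scheme and not necessarily the approximation algorithm used later in the chapter; the function name and the 0-based set labels are choices made for this example.

```python
def greedy_multicover(sets, demand):
    """Greedy heuristic for the minimum set multi-covering problem.

    sets   -> list of sets; the position j in the list labels the set S_j
    demand -> dict mapping each element i of the universe to d(i)
    Returns the list of chosen labels, or None if the demands cannot be met.
    """
    residual = dict(demand)          # coverage still owed to each element
    unused = set(range(len(sets)))   # each set may be selected at most once
    chosen = []

    def gain(j):
        # Number of elements of S_j whose demand is not yet satisfied.
        return sum(1 for i in sets[j] if residual.get(i, 0) > 0)

    while any(r > 0 for r in residual.values()):
        # Ties are broken in favor of the smallest label.
        best = max(sorted(unused), key=gain, default=None)
        if best is None or gain(best) == 0:
            return None
        chosen.append(best)
        unused.remove(best)
        for i in sets[best]:
            if residual.get(i, 0) > 0:
                residual[i] -= 1
    return chosen
```

With unit demands the procedure reduces to the usual greedy heuristic for minimum set covering. The returned cover is feasible whenever one exists, but it is not guaranteed to be minimum.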
The minimum set multi-covering problem plays a double role in this chapter: (i) we reduce
the rMCP to a minimum set multi-covering problem; and (ii) by polynomially reducing [Garey
and Johnson, 1979] it to the rMCP, we show the latter to be NP-hard. Such reduction is useful
to determine the qualitative complexity class a particular problem belongs to, see [Garey and
Johnson, 1979] for an introduction to the topic.
8.4 Robust Minimum Controllability Problem
In this section, we present the main results of this chapter. First, notice that when there
are no input failures (i.e., $s = 0$) in the rMCP, we recover the MCP. Therefore, we
first provide the solution to the MCP, which we later extend to characterize the
solutions to the rMCP.
To obtain the solution to the MCP, we perform the following two steps: (i) we polynomially
reduce the structural optimization problem in (8.3) to a set-covering problem using
Algorithm 4, and (ii) we determine a numerical parametrization of an input matrix with a specific
input structure in a deterministic polynomial fashion, by solving (8.4). Simply speaking, by
performing these two steps, we are ‘decoupling’ the discrete and continuous properties of the
MCP without losing optimality. In fact, in Theorem 8.4.2, we provide a generic characterization
of the solutions to the MCP, and a particular instance can be found using Theorem 8.4.6.
Next, we design a procedure similar to that used to solve the MCP to obtain the solution to
the rMCP, which we show to be NP-hard (Theorem 8.4.8). Specifically, we determine the
sparsity of an input matrix by polynomially reducing the problem to a minimum set
multi-covering problem (see Theorem 8.4.9), which is later used to characterize the solutions to the
rMCP (Theorem 8.4.11).
Complementary to the solutions to the MCPs, in what follows, we show that (under
Assumption 1) the decision version of the rMCP is NP-complete (Theorem 8.4.12).
Subsequently, we provide a polynomial approximation algorithm (see Algorithm 5), whose solution
is feasible (see Theorem 8.4.15) and has sub-optimality guarantees (see Theorem 8.4.16).
Finally, in Section 8.4.1, we explore numerical implications of waiving Assumption 2.
Let us start by considering the MCP and only one input, i.e., instead of an input matrix
B, we only consider an input vector b. The first set of results provides necessary conditions
on the structure that an input vector b must satisfy to ensure that (A, b) is controllable, and
a polynomial complexity procedure (Algorithm 4) that reduces the problem of obtaining such
necessary structural patterns to a minimum set covering problem.
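Lemma 8.4.1. Given a collection of non-zero vectors $\{\bar{v}^j\}_{j \in \mathcal{J}}$ with $\bar{v}^j \in \{0, \star\}^n$, the procedure
of finding $\bar{b}^* \in \{0, \star\}^n$ such that
$$\begin{aligned} \bar{b}^* = \arg\min_{\bar{b} \in \{0, \star\}^n} \quad & \|\bar{b}\|_0 \\ \text{s.t.} \quad & \bar{v}^j \cdot \bar{b} \neq 0, \ \text{for all } j \in \mathcal{J} \end{aligned} \tag{8.3}$$
is polynomially (in $|\mathcal{J}|$ and $n$) reducible to a minimum set covering problem with universe
$\mathcal{U}$ and a collection $\mathcal{S}$ of sets by applying Algorithm 4.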
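Proof. Consider the sets $\mathcal{S}$ and $\mathcal{U}$ obtained in Algorithm 4. The following equivalences hold:
let $\mathcal{I} \subset \{1, \dots, n\}$ be a set of indices and $\bar{b}_{\mathcal{I}}$ the structural vector whose $i$-th component is
non-zero if and only if $i \in \mathcal{I}$. Then, the collection of sets $\{\mathcal{S}_i\}_{i \in \mathcal{I}}$ in $\mathcal{S}$ covers $\mathcal{U}$ if and only
if $\forall j \in \mathcal{J}$, $\exists k \in \mathcal{I}$ such that $j \in \mathcal{S}_k$, which is the same as $\forall j \in \mathcal{J}$, $\exists k \in \mathcal{I}$ such that
$\bar{v}^j_k \neq 0$ and $[\bar{b}_{\mathcal{I}}]_k \neq 0$. This can be rewritten as $\forall j \in \mathcal{J}$, $\exists k \in \mathcal{I}$ such that $\bar{v}^j_k [\bar{b}_{\mathcal{I}}]_k \neq 0$, and therefore
$\forall j \in \mathcal{J}$, $\bar{v}^j \cdot \bar{b}_{\mathcal{I}} \neq 0$. In summary, $\bar{b}_{\mathcal{I}}$ is a feasible solution to the problem in (8.3). In addition,
it can be seen that, by such reduction, the optimal solution $\bar{b}^*$ of (8.3) corresponds to the
structural vector $\bar{b}_{\mathcal{I}^*}$, where $\{\mathcal{S}_i\}_{i \in \mathcal{I}^*}$ is the minimal collection of sets that cover $\mathcal{U}$, i.e., $\mathcal{I}^*$
solves the minimum set covering problem associated with $\mathcal{S}$ and $\mathcal{U}$. Hence, the result follows
by observing that Algorithm 4 has polynomial complexity, namely $O(\max\{|\mathcal{J}|, n\}^3)$.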
Next, we show that given the structure obtained in Lemma 8.4.1, almost all possible real
numerical realizations lead to a vector $b \in \mathbb{R}^n$ that is a solution to the MCP.
Theorem 8.4.2. Let $\{v^i\}_{i \in \mathcal{J}}$ be the set of left-eigenvectors of $A$, and $\bar{b}$ a solution to (8.3).
Then, almost all numerical realizations $b$ of $\bar{b}$ are solutions to the MCP.
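Proof. The proof follows by showing that if $\{v^i\}_{i \in \mathcal{J}}$, with countable $\mathcal{J}$, is such that $v^i \neq 0$ for
all $i \in \mathcal{J}$, and $\bar{b}$ is a solution to (8.3), then the set
$$\Omega = \{b \in \mathbb{R}^n : v^i \cdot b = 0 \text{ for some } i \in \mathcal{J}, \text{ and } b \text{ is a numerical instance of } \bar{b}\}$$
has zero Lebesgue measure. The proof follows similar
steps to those proposed in [Wonham, 1985], but due to the additional sparsity constraint we
devise an independent proof. Let $\{v^i\}_{i \in \mathcal{J}}$, with countable $\mathcal{J}$, be given and let $\bar{b}$ be a solution to
problem (8.5). For $b \in \mathbb{R}^n$, the equation $v^i \cdot b = 0$ represents a hyperplane $\mathcal{H}_i \subset \mathbb{C}^n$ (provided
$v^i \neq 0$ for all $i$); thus, the condition $v^i \cdot b \neq 0$ defines the set $\mathbb{C}^n \setminus \mathcal{H}_i$. Therefore, the set of $b$
that satisfies $v^i \cdot b \neq 0$ for all $i \in \mathcal{J}$ is given by
$$\bigcap_{i \in \mathcal{J}} \left(\mathbb{C}^n \setminus \mathcal{H}_i\right) = \mathbb{C}^n \setminus \Big(\bigcup_{i \in \mathcal{J}} \mathcal{H}_i\Big),$$
and the set $\Omega$ of values that do not verify these conditions is the complement, i.e.,
$$\Big(\mathbb{C}^n \setminus \bigcup_{i \in \mathcal{J}} \mathcal{H}_i\Big)^c = \bigcup_{i \in \mathcal{J}} \mathcal{H}_i,$$
which is a set with zero Lebesgue measure in $\mathbb{C}^n$, since $\mathcal{J}$ is countable.
Now, if $\{v^i\}_{i \in \mathcal{J}}$ is taken to be the set of left-eigenvectors of $A$ and $\bar{b}$ the corresponding
solution to problem (8.5), each numerical instance of $\bar{b}$ outside the set $\Omega$ constitutes a solution to (8.5) and hence
the MCP. Since, by the preceding arguments, $\Omega$ has Lebesgue measure zero in $\mathbb{C}^n$, it follows
readily that almost all numerical instances of $\bar{b}$ are solutions to the MCP.

Algorithm 4 Polynomial reduction of the structural optimization problem (8.3) to a set-covering problem
Input: $\{\bar{v}^j\}_{j \in \mathcal{J}}$, a collection of $|\mathcal{J}|$ vectors in $\{0, \star\}^n$.
Output: $\mathcal{S} = \{\mathcal{S}_i\}_{i \in \{1, \dots, n\}}$ and $\mathcal{U}$, a set of $n$ sets and the universe of the sets, respectively.
Step 1. set $\mathcal{S}_i = \emptyset$ for $i = 1, \dots, n$
Step 2. for $j = 1, \dots, |\mathcal{J}|$
            for $i = 1, \dots, n$
                if $\bar{v}^j_i \neq 0$ then
                    $\mathcal{S}_i = \mathcal{S}_i \cup \{j\}$;
                end if
            end for
        end for
Step 3. set $\mathcal{S} = \{\mathcal{S}_1, \dots, \mathcal{S}_n\}$ and $\mathcal{U} = \bigcup_{i=1}^{n} \mathcal{S}_i$.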
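The reduction performed by Algorithm 4 can be sketched in a few lines. The encoding is an assumption made for this example: a structural vector is a list whose truthy entries stand for the non-zero ($\star$) entries, indices are 0-based, and the function name is hypothetical.

```python
def reduce_to_set_cover(patterns):
    """Build the set-covering instance (S, U) of Algorithm 4.

    patterns -> list of |J| structural vectors of length n; an entry is
                truthy where the pattern has a non-zero (star) entry.
    Returns (sets, universe), where sets[i] = {j : entry i of pattern j is non-zero}.
    """
    if not patterns:
        return [], set()
    n = len(patterns[0])
    sets = [set() for _ in range(n)]            # Step 1
    for j, pattern in enumerate(patterns):      # Step 2
        for i in range(n):
            if pattern[i]:
                sets[i].add(j)
    universe = set().union(*sets)               # Step 3
    return sets, universe
```

For the two patterns ($\star$, 0, $\star$) and (0, $\star$, $\star$), the third set alone covers the universe, so the sparsest structural vector satisfying (8.3) for this instance is (0, 0, $\star$).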
Remark 8.4.3. The generic properties that characterize structural controllability [Dion et al.,
2003] imply that, for almost all parameters of both the dynamics and input matrices satisfying a
given structural pattern, the system is controllable. In contrast, in Theorem 8.4.2 the simple
dynamics’ matrix $A$ is fixed, i.e., it is a numerical instance with specified structure, and density
arguments are provided only for the numerical realizations of the input vector with a certain
structure that ensure controllability of the system.
Although Theorem 8.4.2 ensures that almost all parameterizations provide a feasible solution
to the MCP, we need to determine one parameterization that guarantees controllability,
which can be determined by solving the following optimization problem.
$$\begin{aligned} B^* = \arg\min_{B \in \mathbb{R}^{n \times m}} \quad & 0 \\ \text{s.t.} \quad & B_{l,k} = 0 \ \text{if } \bar{B}_{l,k} = 0, \quad l, k = 1, \dots, n. \end{aligned} \tag{8.4}$$
Remark 8.4.4. In fact, suppose the objective function in the optimization problem (8.4) is
given by $f(B)$. Then, $f$ can be chosen to satisfy additional design constraints. For instance,
$f(B) = c^\top B \mathbf{1}$, where $c$ could capture an actuation cost, i.e., entry $c_i$ captures how desirable it is
to actuate $x_i$, and $\mathbf{1}$ is a vector of ones with appropriate dimensions. Subsequently, one may
need additional constraints such that the total actuation budget $r$ available is bounded, for
instance, $|f(B)| \leq r$ and $B_{i,j} \geq 0$ to avoid negative entries that would work against the objective.
Alternatively, $f(B)$ can also be considered to be nonlinear, while capturing control-theoretic
properties; in particular, it can be a function of the controllability Gramian [Pasqualetti
et al., 2014], with some appropriate constraints to ensure that the problem is well defined.
Next, we show that the (sparsest) pattern given by Lemma 8.4.1 with the optimization
problem (8.4) leads to a numerical realization that is a solution to the MCP.
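Lemma 8.4.5. Given $\{v^i\}_{i \in \mathcal{J}}$ with $v^i \in \mathbb{C}^n$, the procedure of finding $b^* \in \mathbb{R}^n$ that is a solution
to
$$\begin{aligned} b^* = \arg\min_{b \in \mathbb{R}^n} \quad & \|b\|_0 \\ \text{s.t.} \quad & v^i \cdot b \neq 0, \ \text{for all } i \in \mathcal{J}, \end{aligned} \tag{8.5}$$
is polynomially (in $|\mathcal{J}|$ and $n$) reducible (by Algorithm 4) to a minimum set covering
problem, with numerical entries determined using the optimization problem (8.4).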
Proof. By Lemma 8.4.1, given $\{v^i\}_{i \in \mathcal{J}}$, problem (8.5) is polynomially (in $|\mathcal{J}|$ and $n$) reducible
to a minimum set covering problem. Now, given a solution $\bar{b}$ to (8.3), the optimization
problem (8.4) can be used to obtain a numerical instantiation $b$ with the same structure as
$\bar{b}$ such that $v^i \cdot b \neq 0$ for all $i \in \mathcal{J}$, which incurs polynomial complexity (in $|\mathcal{J}|$ and $n$).
Furthermore, it is readily seen that any feasible solution $b'$ to (8.5) satisfies $\|b'\|_0 \geq \|\bar{b}\|_0 = \|b\|_0$.
Hence, $b$ obtained by the above recipe is a solution to (8.5), and the desired assertion
follows by observing that all steps in the construction have polynomial complexity (in $|\mathcal{J}|$ and
$n$).
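The two-step recipe (fix the sparsity pattern, then pick numerical values) can be illustrated as follows. This is only a sketch under stated assumptions: instead of solving (8.4) deterministically, it draws the non-zero entries at random and checks the conditions of (8.5), relying on the fact that almost all realizations are valid (Theorem 8.4.2). The function name, the tolerance, and the sampling interval are hypothetical.

```python
import random


def numerical_realization(support, left_eigenvectors, n,
                          tol=1e-9, seed=0, max_tries=100):
    """Draw a real vector b with the given support such that v . b != 0 for all v.

    support           -> set of 0-based indices where b may be non-zero
    left_eigenvectors -> list of vectors (real or complex entries) of length n
    Returns b as a list of floats, or None if no valid draw was found.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        b = [0.0] * n
        for k in support:
            b[k] = rng.uniform(1.0, 2.0)   # non-zero by construction
        # Inner product v . b = v^dagger b, as in the notation of Chapter 7.
        products = [sum(v[k].conjugate() * b[k] for k in range(n))
                    for v in left_eigenvectors]
        if all(abs(p) > tol for p in products):
            return b
    return None
```

If the support does not satisfy the structural condition (8.3), every draw fails and the function returns None, which mirrors the necessity of the pattern obtained in Lemma 8.4.1.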
Now, we state one of the main results of the chapter.
Theorem 8.4.6. The solution to the MCP can be determined by first identifying the sparsity
of the input vector as in Lemma 8.4.1, followed by determining the numerical realization of
the non-zero entries as in Lemma 8.4.5.
Proof. The proof follows by invoking the PBH eigenvector test: since $A$ is simple (Assumption 1)
and a left-eigenbasis is available (Assumption 2), the problem in (8.5) is a restatement of the MCP.
Next, based on the previous solution to the MCP, we extend the result to find a dedicated
solution to the MCP.
Theorem 8.4.7. Let $b \in \mathbb{R}^n$ be a solution to the MCP as described in Theorem 8.4.6, $\bar{b}$ its
sparsity and $\mathcal{N} \subset \{1, \dots, n\}$ the indices where $\bar{b}$ is non-zero, i.e., $\mathcal{N} = \{i : \bar{b}_i = \star, \text{ and } i = 1, \dots, n\}$.
If $\bar{B} \in \{0, \star\}^{n \times n}$ has exactly one non-zero entry in the $i$-th row, where $i \in \mathcal{N}$,
then the output $B \in \mathbb{R}^{n \times n}$ of (8.4), when $\bar{B}$ and the left-eigenbasis of $A$ are considered, is a
solution to the MCP.
Proof. The feasibility of the solution is ensured by proceeding similarly to Theorem 8.4.2,
when the left-eigenbasis of the dynamics’ matrix is considered to invoke the PBH eigenvector
criterion. The optimality follows similar steps to those presented in Lemma 8.4.5.
Before characterizing the solutions to the rMCP, we notice that this problem is computationally
challenging. Specifically, we obtain the following result, which follows from noticing
that a particular instance of the rMCP is the MCP (an NP-hard problem).
Theorem 8.4.8. The rMCP is NP-hard.
Therefore, without incurring additional computational complexity and similarly to the
reduction proposed from the MCP to the set covering problem, we can characterize the dedicated
solutions to the rMCP as follows.
Theorem 8.4.9. Let $\{v^1, \dots, v^n\}$ be a left-eigenbasis of $A$, and $s$ the number of possible
input failures. Further, consider the set multi-covering problem $(\{\mathcal{S}_1, \dots, \mathcal{S}_{(s+1)n}\}, \mathcal{U} \equiv \{1, \dots, n\}; d)$,
where the demand is $d(i) = s + 1$ for $i \in \mathcal{U}$, and $\mathcal{S}_k = \{j : [v^j]_l \neq 0, \text{ and } l - 1 = k \bmod n\}$
for $k \in \mathcal{K} \equiv \{1, \dots, (s+1)n\}$. Then, the following statements are equivalent:
(i) $\mathcal{M}^*$ is a solution to the set multi-covering problem $(\{\mathcal{S}_1, \dots, \mathcal{S}_{(s+1)n}\}, \mathcal{U} \equiv \{1, \dots, n\}; d)$;
(ii) $B_n(\mathcal{M}^*)$ is a dedicated solution to the rMCP, where $[B_n(\mathcal{M}^*)]_{i,l} = 1$ for $l = i \bmod n$ and
$i \in \mathcal{M}^* \subset \mathcal{K}$, and zero otherwise.
Proof. First, we observe that, by construction of the sets $\mathcal{S}_1, \dots, \mathcal{S}_{(s+1)n}$ and the demand
function $d(i)$, for $i \in \{1, \dots, n\}$, there always exist $s + 1$ entries matching every non-zero
entry of the vectors in a left-eigenbasis. This implies that, if at most $s$ inputs fail, then for
each left-eigenvector $v$ at least one column $c$ of $B$ is such that $v \cdot c \neq 0$, implying $v_i^\top B \neq 0$
for $i \in \{1, \dots, n\}$. Hence, the system is controllable by the PBH eigenvector test, and we
have a feasible solution. Now we need to show that the solution is optimal, i.e., there is no
other solution to the rMCP with fewer dedicated inputs. We proceed by contradiction,
so assume that there is a solution for a demand function $d(i) = w$ for $i \in \{1, \dots, n\}$ and some
$w < s + 1$. Then, for some entry of a left-eigenvector $v$, only the existence of $w$ columns in $B$
whose inner product with $v$ is not zero is ensured. Therefore, if $w$ dedicated inputs fail, i.e., the
corresponding columns of $B$ are now zero, then $B$ is such that $v^\top B = 0$ for some eigenvector
$v$. This contradicts the assumption that there is a sparser solution to the rMCP.
Remark 8.4.10. A matrix Bn(M′) described by the concatenation of (s+ 1) solutions to the
MCP achieves feasibility to the rMCP, but it is not necessarily an optimal solution – see
Section 8.5 for a counterexample.
Next, we characterize the solutions of the rMCP, i.e., not only the ones that are dedicated.
Towards this goal, we introduce the following merging procedure. Let two distinct effective
inputs $i$ and $j$, associated with two non-zero columns of the input matrix, $b_i$ and $b_j$, be such
that they do not share non-zero entries $k$, i.e., $[b_i]_k \neq [b_j]_k$ for $k \in \{1, \dots, n\}$. These two
inputs are said to be merged into one input $b_i'$, where $[b_i']_k = [b_i]_k$ when $[b_i]_k \neq 0$, and
$[b_i']_k = [b_j]_k$ when $[b_j]_k \neq 0$, for $k \in \{1, \dots, n\}$. Further, we implicitly assume that $b_i'$ takes
the place of $b_i$, and $b_j$ is set to zero. In other words, the effective input $i$ is associated with
$b_i'$ and the effective input $j$ is discarded.
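The merging procedure can be illustrated with a minimal Python sketch. It assumes that "do not share non-zero entries" means the two columns have disjoint supports; the function name `merge_inputs` and the sample columns are hypothetical and only serve the illustration.

```python
from typing import List, Optional


def merge_inputs(b_i: List[float], b_j: List[float]) -> Optional[List[float]]:
    """Merge two input columns; return None when their supports overlap."""
    if len(b_i) != len(b_j):
        raise ValueError("columns must have the same length")
    # Assumption: the columns may be merged only if no index is non-zero in both.
    if any(x != 0 and y != 0 for x, y in zip(b_i, b_j)):
        return None
    # Entry-wise, keep the non-zero value from whichever column has one.
    return [x if x != 0 else y for x, y in zip(b_i, b_j)]


b1 = [0, 1, 0, 0, 0]
b2 = [0, 0, 0, 1, 0]
print(merge_inputs(b1, b2))  # [0, 1, 0, 1, 0]
```

After a successful merge, the merged column would replace `b1` and `b2` would be set to zero, so the number of effective inputs drops by one.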
Theorem 8.4.11. Let $B_n(\mathcal{M}^*) \in \mathbb{R}^{n \times (s+1)n}$ be a dedicated solution to the rMCP as described
in Theorem 8.4.9. In addition, let $\bar{B} \in \{0, \star\}^{n \times (s+1)n}$ be the sparsity of the matrix resulting
from the merging procedure between any of the effective inputs in $B_n(\mathcal{M}^*)$. Then, the matrix
$B \in \mathbb{R}^{n \times n}$ obtained using the optimization problem (8.4), with $\bar{B}$ and the left-eigenbasis of $A$,
is a solution to the rMCP.
Proof. The proof follows similar steps to those presented in Theorem 8.4.7. In particular,
recall the merging procedure, and the guarantees obtained in Theorem 8.4.9.
Although we reduced the rMCP to a set multi-covering problem, it is interesting to notice
that these are ‘equivalent’ in the sense that the decision version of the rMCP is NP-complete.
Theorem 8.4.12. The MCP and rMCP are NP-complete.
Proof. From [Olshevsky, 2014], we have that the MCP is NP-hard, and, in particular, the
minimum set covering problem can be polynomially reduced to it. Therefore, we just need to
show that the MCP assuming that A comprises only simple eigenvalues and the left-eigenbasis
is known, i.e., under our assumptions, can be reduced polynomially to the minimum set
covering problem.
To this end, note that, given the set $\{v_i\}_{i \in \mathcal{J}}$ of left-eigenvectors of $A$, the MCP is equivalent
to problem (8.5), the latter being polynomially (in $|\mathcal{J}|$ and $n$) reducible to the minimum set
covering problem (see Lemma 8.4.5). Since $|\mathcal{J}| = n$, the overall reduction to the minimum
set covering problem is polynomial in $n$.
Similar arguments hold for the rMCP. It was shown to be NP-hard, in Theorem 8.4.8, and
a reduction to the minimum set multi-covering problem can be obtained by Theorem 8.4.9.
Therefore, from Theorem 8.4.12, we have the following observation.
Remark 8.4.13. A solution of the MCP almost always coincides with a numerical realization
of a solution to the associated minimal structural controllability problem. Combining this with the
fact that the MCP is NP-complete when the eigenvalues of A are simple (see Theorem 8.4.12),
it follows that the set of simple dynamics’ matrices that lead to NP-complete problems has
zero Lebesgue measure.
Also, we notice that a problem being NP-hard does not mean that none of its instances is
polynomially solvable; in any case, instances can still be solved exactly [Hua et al., 2009, Hua
et al., 2010].
Remark 8.4.14. The NP-completeness, stated in Theorem 8.4.12, allows us to consider the
subclasses of the set multi-covering problem that are known to be polynomially solvable, to
identify polynomially solvable subclasses of the rMCP. This enables a new characterization of
solutions to the question posed in [Aguilar and Gharesifard, 2014], regarding the existence of
polynomial algorithms to determine controllable graph structures.
Additionally, by the construction proposed in Theorem 8.4.9 and the result in
Theorem 8.4.12, if the set multi-covering problem obtained possesses additional structure, then this
can be leveraged to use polynomial algorithms to approximate the solutions with
close-to-optimal solutions (see Algorithm 5).
Furthermore, Algorithm 5 leverages the submodularity properties [Bach, 2011] of the set
multi-covering problem to obtain a dedicated solution to the rMCP. Submodularity
properties ensure that the associated polynomial greedy algorithms have sub-optimality guarantees
while performing well in practice [Bach, 2011]. Subsequently, we can obtain the following
result.
Theorem 8.4.15. The matrix Bn(M′) obtained using Algorithm 5, with B and the left-
eigenbasis of A, is a feasible solution to the rMCP. Further, the computational complexity of
Algorithm 5 is O(sn), and it ensures an approximation optimality bound of O(log n).
Proof. Algorithm 5 terminates when each element of the universe set U is covered s+ 1 times
(steps 4-5) by the sets of the set multi-covering problem indexed by J . In other words,
it terminates when we obtain a solution to the set multi-covering problem. By designing
Bn(M′), withM′ = J , we build a matrix that corresponds to dedicated inputs. Thus, using
Theorem 8.4.9, since J is a solution to the set multi-covering problem, then Bn(M′) is a
dedicated solution to the rMCP.
First, notice that the output of Algorithm 5, i.e., Bn(M′), is a feasible solution since the
algorithm stops when each of the elements in the universe of the set multi-cover is s+ 1 times
covered.
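Algorithm 5 Approximate Solution to the rMCP
Input: Left-eigenbasis $v_1, \dots, v_n$ associated with $A \in \mathbb{R}^{n \times n}$ and the number $s$ of possible
input failures.
Output: Dedicated solution $B_n(\mathcal{M}') \in \mathbb{R}^{n \times (s+1)n}$.
Step 1. Let $\mathcal{S}_1, \dots, \mathcal{S}_{(s+1)n}$, where $\mathcal{S}_k = \{ j : [v_j]_l \neq 0 \}$, and $l - 1 = k \bmod n$ for $k \in \mathcal{K} \equiv \{1, \dots, n(s+1)\}$.
Step 2. set $\mathcal{U}^i = \emptyset$, with $i = 1, \dots, s$   ▷ denote the indices in $\mathcal{U}$ that are covered $i$ times and
        the indices of the sets covering them, respectively.
Step 3. set $\mathcal{J} = \emptyset$
Step 4. for $i = 1, \dots, s + 1$
            set $\mathcal{U}^i = \{ k \in \mathcal{U} : |\{ j \in \mathcal{J} : k \in \mathcal{S}_j \}| \geq i \}$   ▷ the indices that are already covered by at
            least $i$ sets
Step 5.     while $\mathcal{U}^i \neq \mathcal{U}$
                select $\mathcal{S}_j$ with largest number of indices in $\mathcal{U} \setminus \mathcal{U}^i$
                set $\mathcal{J} \leftarrow \mathcal{J} \cup \{j\}$
                set $\mathcal{U}^i \leftarrow \mathcal{U}^i \cup \mathcal{S}_j$
            end while; end for
        set $\mathcal{M}' \leftarrow \mathcal{J}$
Step 6. set $B_n(\mathcal{M}')$, where $[B_n(\mathcal{M}')]_{i,l} = 1$ for
        $l = i \bmod n$ and $i \in \mathcal{M}' \subset \mathcal{K}$, and zero otherwise.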
The computational complexity of Algorithm 5 is obtained from the overall complexity of
steps 1, 4 and 5. In step 1, we need to compute $(s + 1)n$ sets; in step 5, we need to consider
at most $n$ sets; and, in step 4, $(s + 1)$ iterations are performed, each with the number of
steps of step 5, yielding $(s + 1)n$ computational steps. Summing up the complexity of each
step, Algorithm 5 has, in the worst case, complexity of order $O(sn)$. In addition, notice that
the performance attained in a set multi-covering problem is the same as in the rMCP, as a
consequence of Theorem 8.4.12. Furthermore, the solution obtained incurs an optimality
gap of at most $O(\log n)$, since the algorithm implements the greedy algorithm associated with
submodular functions, as is the case of the set multi-covering problem, and the result
follows.
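The greedy strategy behind Algorithm 5 can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the thesis' implementation: the sets are given once (a set may be selected in several rounds, which plays the role of its $s + 1$ copies), ties are broken by the smallest index, and the name `greedy_multicover` is hypothetical. The returned cover is feasible but, as with any greedy heuristic, not necessarily optimal.

```python
from typing import Dict, List, Set


def greedy_multicover(sets: Dict[int, Set[int]], universe: Set[int], s: int) -> List[int]:
    """Pick set indices (repetitions allowed) until every element is covered s + 1 times."""
    count = {u: 0 for u in universe}  # how many chosen sets contain each element
    chosen: List[int] = []
    for level in range(1, s + 2):  # rounds i = 1, ..., s + 1
        while True:
            lacking = {u for u in universe if count[u] < level}
            if not lacking:
                break
            # Greedy choice: the set covering most lacking elements (ties: smallest index).
            best = max(sorted(sets), key=lambda j: len(sets[j] & lacking))
            if not sets[best] & lacking:
                raise ValueError("the sets do not cover the universe")
            chosen.append(best)
            for u in sets[best] & universe:
                count[u] += 1
    return chosen


# Sets of the three-state example of Section 8.5, with one possible failure.
example_sets = {1: {1, 2}, 2: {2, 3}, 3: {1, 3}}
print(greedy_multicover(example_sets, {1, 2, 3}, s=1))  # [1, 2, 3]
```

Within one round a selected set has no lacking elements left, so each set is picked at most once per round and hence at most $s + 1$ times overall.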
Finally, by invoking Theorem 8.4.11 and Theorem 8.4.15, we obtain the following result.
Theorem 8.4.16. Let $B_n(\mathcal{M}') \in \mathbb{R}^{n \times (s+1)n}$ be a dedicated solution to the rMCP as described
in Theorem 8.4.15. In addition, let $\bar{B} \in \{0, \star\}^{n \times (s+1)n}$ be the sparsity of the matrix resulting
from the merging procedure between any of the effective inputs in $B_n(\mathcal{M}')$. Then, the matrix
$B \in \mathbb{R}^{n \times n}$ obtained using the optimization problem (8.4), with $\bar{B}$ and the left-eigenbasis of $A$,
is feasible for the rMCP and is computed in polynomial time.
8.4.1 Numerical and Computational Remarks
Now, for the sake of completeness, we discuss the implications of waiving Assumption 2 and
the impact on the input vector in the MCP. The results readily extend to the general solution
to the rMCP. Towards this goal, we need the following result.
Theorem 8.4.17 ([Pan and Chen, 1999]). Let $A \in \mathbb{C}^{n \times n}$ be a matrix with simple eigenvalues.
The deterministic arithmetic complexity of finding the eigenvalues and the eigenvectors of $A$
is bounded by $O(n^3) + t(n, m)$ operations, where $t(n, m) = O\big((n \log^2 n)(\log m + \log^2 n)\big)$,
for a required upper bound of $2^{-m}\|A\|$ on the absolute output error of the approximation of
the eigenvalues and eigenvectors of $A$ and for any fixed matrix norm $\| \cdot \|$.
More precisely, Theorem 8.4.17 states that, in practice, only a numerical approximation
of the left-eigenbasis is possible in polynomial time. In this case, let $\varepsilon = 2^{-m}\|A\|$ be as in
Theorem 8.4.17; then the results stated in Lemma 8.4.1 and Lemma 8.4.5 (see also Algorithm 4
and the optimization problem (8.4)) can only be applied to an ε-approximation of the
left-eigenbasis of the dynamics’ matrix. Therefore, the ε-approximation of the left-eigenbasis may
lead to the following issues:
(i) an entry of a left-eigenvector is treated as zero when in fact it is a non-zero
value that (in norm) is smaller than ε. Consequently, the sets generated using Algorithm 4
(see also Lemma 8.4.1) do not contain the indices associated with those non-zero entries. Thus,
additional sets need to be considered in the minimum set covering, which implies that the
structure of the input vector may contain more non-zero entries than the sparsest input vector
that is a solution to the MCP. In other words, we obtain an over-approximation of the sparsest
input vector that is a solution to the MCP.
(ii) an entry of the ε-approximation of a left-eigenvector of the left-eigenbasis is non-zero.
This is not an issue when computing the structure of the input vector as
described in Lemma 8.4.1 (see also Algorithm 4), but it can be a problem when
determining the numerical realization by resorting to the optimization problem (8.4). Nonetheless,
by Theorem 8.4.2, such an issue is unlikely to occur.
To gain a deeper understanding of which entries fall under the first issue presented above,
several methods to compute eigenvectors can be used and their solutions subsequently compared;
see [Demmel et al., 2000] for a survey on different methods and the computational issues associated
with them.
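Issue (i) can be made concrete with a small Python sketch. The data below are hypothetical (they are not taken from the thesis), and `covering_sets` is an illustrative helper that builds the covering sets from approximate eigenvectors using a threshold `eps` in place of an exact zero test.

```python
from typing import Dict, List, Set


def covering_sets(eigvecs: List[List[float]], eps: float) -> Dict[int, Set[int]]:
    """S_l = {j : |[v_j]_l| > eps}, with 1-based indices for both l and j."""
    n = len(eigvecs[0])
    return {
        l + 1: {j + 1 for j, v in enumerate(eigvecs) if abs(v[l]) > eps}
        for l in range(n)
    }


# Hypothetical approximate left-eigenvectors: the first entry of v2 is tiny but non-zero.
V = [[1.0, 0.0], [1e-9, 1.0]]
exact = covering_sets(V, eps=0.0)     # keeps the tiny entry
rounded = covering_sets(V, eps=1e-6)  # treats the tiny entry as zero
print(exact)    # {1: {1, 2}, 2: {2}}
print(rounded)  # {1: {1}, 2: {2}}
```

With the exact sets, the single set S_1 covers both eigenvectors, whereas after thresholding two sets are needed, which is the over-approximation of the sparsest input vector described in (i).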
8.5 Illustrative Examples
To illustrate the first main result of this chapter, i.e., how to find a solution to the MCP, consider the
dynamics’ matrix
$$
A = \begin{bmatrix}
 6 & -3 &  3 &  2 & -1 \\
 0 &  8 &  0 &  0 &  0 \\
 4 &  3 &  7 &  2 &  1 \\
 0 &  0 &  0 &  6 &  0 \\
-4 & -3 & -3 & -2 &  3
\end{bmatrix}, \tag{8.6}
$$
where $\sigma(A) = \{2, 4, 6, 8, 10\}$ consists of distinct eigenvalues, so the matrix $A$ is simple and
our results are applicable. Consequently, to obtain the solution to the MCP, we first compute
the left-eigenvectors of $A$, which are as follows:
v1 = [ 1 1 0 0 1 ]ᵀ, v2 = [ 0 0 1 0 1 ]ᵀ, v3 = [ 0 0 0 1 0 ]ᵀ,
v4 = [ 0 1 0 0 0 ]ᵀ and v5 = [ 1 0 1 1 0 ]ᵀ.
Using Algorithm 4 with $v_i$ for $i = 1, \dots, 5$, we obtain $\{\mathcal{S}_j\}_{j = 1, \dots, 5}$, where the $j$-th set
corresponds to the set of indices of the left-eigenvectors which have a non-zero entry in the $j$-th
position. In particular, we obtain
$\mathcal{S}_1 = \{1, 5\}$, $\mathcal{S}_2 = \{1, 4\}$, $\mathcal{S}_3 = \{2, 5\}$, $\mathcal{S}_4 = \{3, 5\}$, $\mathcal{S}_5 = \{1, 2\}$,
and the universe set is given by $\mathcal{U} = \{1, 2, 3, 4, 5\}$. Now, it is easy to see that a solution to
this minimum set covering problem is the set of indices $\mathcal{I}^* = \{2, 3, 4\}$, since $\mathcal{U} = \mathcal{S}_2 \cup \mathcal{S}_3 \cup \mathcal{S}_4$
and there is no pair of sets, i.e., $\mathcal{I}' = \{i, i'\}$ with $i, i' \in \{1, \dots, 5\}$, such that $\mathcal{U} = \mathcal{S}_i \cup \mathcal{S}_{i'}$.
Therefore, a possible structure of the vector $b$ that is a solution to the MCP is
$\bar{b} = [\, 0 \ \star \ \star \ \star \ 0 \,]^\top$. (8.7)
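The construction of the sets and the resulting cover can be reproduced with a short Python sketch. It uses the left-eigenvectors listed above; the exhaustive search in `min_set_cover` is an illustrative choice (exponential in the number of sets, adequate only for such a small instance) and is not claimed to be the procedure of Algorithm 4.

```python
from itertools import combinations

# Left-eigenvectors v1, ..., v5 of the matrix A in (8.6).
V = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
]
n = len(V)
# S_j: indices of the left-eigenvectors with a non-zero j-th entry (1-based).
S = {j + 1: {i + 1 for i, v in enumerate(V) if v[j] != 0} for j in range(n)}
U = set(range(1, n + 1))


def min_set_cover(sets, universe):
    """Smallest family of sets covering the universe, found by exhaustive search."""
    for size in range(1, len(sets) + 1):
        for idx in combinations(sorted(sets), size):
            if set().union(*(sets[j] for j in idx)) >= universe:
                return idx
    return None


print(S)                    # {1: {1, 5}, 2: {1, 4}, 3: {2, 5}, 4: {3, 5}, 5: {1, 2}}
print(min_set_cover(S, U))  # (2, 3, 4)
```

The returned indices are the positions of the non-zero entries of the structure in (8.7).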
Additionally, to find the numerical parametrization of $b$, under the sparsity pattern $\bar{b}$,
we have to solve the following system with three unknowns: $b_2, b_3, b_4 \neq 0$ and $b_3 + b_4 \neq 0$. By
inspection, a possible choice is $b = [\, 0 \ 1 \ 1 \ 1 \ 0 \,]^\top$, but the numerical parametrization can
be obtained by invoking the optimization problem (8.4), with the set of left-eigenvectors of $A$
given by $\{v_j\}_{j \in \{1, \dots, 5\}}$ and the structure of $b$ given by $\bar{b}$ in (8.7). For the sake of completeness,
the controllability matrix is given by
$$
C = [\, b \ \ Ab \ \ A^2 b \ \ A^3 b \ \ A^4 b \,] =
\begin{bmatrix}
0 &  2 &   44 &   608 &   7184 \\
1 &  8 &   64 &   512 &   4096 \\
1 & 12 &  120 &  1176 &  11520 \\
1 &  6 &   36 &   216 &   1296 \\
0 & -8 & -104 & -1112 & -11264
\end{bmatrix},
$$
and $\operatorname{rank}(C) = 5$, implying that $(A, b)$ is controllable.
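The controllability matrix and its rank can be checked with exact arithmetic. The following Python sketch uses only the standard library; the helpers `matvec`, `controllability_columns` and `rank` are illustrative names, and the rank is computed by Gaussian elimination over the rationals to avoid floating-point issues.

```python
from fractions import Fraction

A = [
    [6, -3, 3, 2, -1],
    [0, 8, 0, 0, 0],
    [4, 3, 7, 2, 1],
    [0, 0, 0, 6, 0],
    [-4, -3, -3, -2, 3],
]
b = [0, 1, 1, 1, 0]


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def controllability_columns(A, b):
    """Return the columns b, Ab, ..., A^(n-1) b."""
    cols, x = [], list(b)
    for _ in range(len(A)):
        cols.append(x)
        x = matvec(A, x)
    return cols


def rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        pivot = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            M[r] = [a - f * p for a, p in zip(M[r], M[rk])]
        rk += 1
        if rk == len(M):
            break
    return rk


cols = controllability_columns(A, b)
print(cols[1])     # [2, 8, 12, 6, -8]
print(rank(cols))  # 5
```

The columns are stored as rows of a list, which does not affect the rank.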
Observe that the single-input solution obtained with $b = [\, 0 \ 1 \ 1 \ 1 \ 0 \,]^\top$ can be
immediately translated into a solution with two effective inputs, by Theorem 8.4.7. In
particular, two possible solutions are $B = [\, b_1 \ b_2 \,]$, with $b_1 = [\, 0 \ 1 \ 1 \ 0 \ 0 \,]^\top$ and
$b_2 = [\, 0 \ 0 \ 0 \ 1 \ 0 \,]^\top$, and $B = [\, b_1 \ b_2 \ b_3 \,]$, with $b_1 = [\, 0 \ 1 \ 0 \ 0 \ 0 \,]^\top$,
$b_2 = [\, 0 \ 0 \ 1 \ 0 \ 0 \,]^\top$ and $b_3 = [\, 0 \ 0 \ 0 \ 1 \ 0 \,]^\top$, where the latter is a dedicated solution.
Alternatively, if we consider, for instance, $B = [\, b_1 \ b_2 \,]$, with $b_1 = [\, 0 \ 1 \ 0 \ 0 \ 0 \,]^\top$ and
$b_2 = [\, 0 \ 0 \ {-1} \ 1 \ 0 \,]^\top$, then $v^\top B = 0$ for the left-eigenvector $v = [\, 1 \ 0 \ 1 \ 1 \ 0 \,]^\top$, and
the pair $(A, B)$ is uncontrollable. Thus, as prescribed in Theorem 8.4.7, by the optimization
problem (8.4), one can obtain a new realization of $B$ that ensures controllability of $(A, B)$;
e.g., the same b1, and b2 = [ 0 0 1210 1 0 ]ᵀ.
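The PBH eigenvector condition for a multi-input matrix can be checked directly. The Python sketch below lists the left-eigenvectors of (8.6) that are orthogonal to every column of a candidate $B$; the helper name `violating_eigenvectors` is hypothetical, and a plain dot product is used because the eigenvectors in this example are real.

```python
# Left-eigenvectors v1, ..., v5 of the matrix A in (8.6).
V = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
]


def violating_eigenvectors(V, columns):
    """1-based indices i with v_i^T B = 0, i.e., v_i orthogonal to every column of B."""
    return [
        i + 1
        for i, v in enumerate(V)
        if all(sum(a * c for a, c in zip(v, col)) == 0 for col in columns)
    ]


bad_B = [[0, 1, 0, 0, 0], [0, 0, -1, 1, 0]]
good_B = [[0, 1, 1, 0, 0], [0, 0, 0, 1, 0]]
print(violating_eigenvectors(V, bad_B))   # [5]
print(violating_eigenvectors(V, good_B))  # []
```

An empty list means that every left-eigenvector has a non-zero product with at least one column, which is the feasibility condition used throughout this section.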
Notice that a systematic polynomial approximation to the MCP can be obtained by
considering the rMCP with the number of input failures $s = 0$. By doing so, we obtain the same
sparsity of $b$, i.e., $\bar{b}$, as in the aforementioned example, and the subsequent analysis follows.
We also observe that the approximate solution is a solution to the MCP.
Now, we illustrate how to find a solution to P. Let us apply the developments of
Section 8.4 to the dynamics’ matrix in (8.6). First, if we consider that at most
one input fails, we use Algorithm 4, where a set multi-covering problem is considered with
the sets as in Section 8.4, universe $\mathcal{U} = \{1, \dots, 5\}$ and demand function $d(i) = 2$
for $i = 1, \dots, 5$, i.e., each element must be covered twice. Subsequently, by inspection, we
conclude that the sets $\mathcal{S}_2$ and $\mathcal{S}_4$ need to be considered twice, since the elements 4 and 3
only appear in these sets, respectively. After this, we need to cover the element 2 twice, and to
this end we can choose $\mathcal{S}_3$ and $\mathcal{S}_5$, or one of them twice, so a possible solution to the set
multi-covering problem is $\mathcal{M}^* = \{2, 3, 4, 2, 3, 4\}$. Therefore, $B_n(\mathcal{M}^*)$ is a solution to the rMCP,
and, in particular, the solution is the same as concatenating twice a dedicated solution to the
MCP, see Remark 8.4.10. Further, Algorithm 5 produces an optimal solution, as often occurs
in practice.
In fact, if we apply our results when $s$ inputs are allowed to fail, i.e., $d(i) = s + 1$ for
$i = 1, \dots, 5$, we notice that the sets $\mathcal{S}_2$ and $\mathcal{S}_4$ need to be considered $s + 1$ times, since
the elements 4 and 3 only appear in these sets, respectively. Besides, we need to cover the
element 2, so we can choose either $\mathcal{S}_3$ or $\mathcal{S}_5$ $s + 1$ times, which implies that $B_n(\mathcal{M}^*)$, with
$\mathcal{M}^* = \{2, 3, 4, \dots, 2, 3, 4\}$, where the elements 2, 3 and 4 appear $s + 1$ times, is a solution.
Similarly, the solution consists of concatenating $s + 1$ times a dedicated solution to the MCP,
and the same remarks are applicable, i.e., Remark 8.4.10.
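Robustness of a dedicated solution can be verified by enumerating the failures. The Python sketch below assumes dedicated inputs, so that the input with index $j$ actuates only state $j$ and $v^\top e_j = [v]_j$; the function name `is_robust` is hypothetical. Since losing fewer inputs can only help, it suffices to test every failure pattern of size exactly $s$.

```python
from itertools import combinations

# Left-eigenvectors v1, ..., v5 of the matrix A in (8.6).
V = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
]


def is_robust(V, chosen, s):
    """True if, after any s dedicated inputs fail, every eigenvector is still actuated."""
    k = min(s, len(chosen))
    for failed in combinations(range(len(chosen)), k):
        alive = {chosen[p] for p in range(len(chosen)) if p not in failed}
        # An eigenvector is lost when it is zero at every state that is still actuated.
        if any(all(v[j - 1] == 0 for j in alive) for v in V):
            return False
    return True


print(is_robust(V, [2, 3, 4, 2, 3, 4], s=1))  # True
print(is_robust(V, [2, 3, 4], s=1))           # False
```

The first call corresponds to concatenating a dedicated MCP solution twice; the second shows that a single copy does not tolerate one failure.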
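However, the concatenation of $s + 1$ solutions to the MCP is not always a solution to the
rMCP when at most $s$ inputs are allowed to fail. Let us consider the dynamics’ matrix and
associated left-eigenvectors as follows:
$$
A = \begin{bmatrix}
 4 & -2 & 2 \\
-1 &  3 & 1 \\
 1 & -1 & 5
\end{bmatrix}
\quad \text{and} \quad
V = \begin{bmatrix}
| & | & | \\
v_1 & v_2 & v_3 \\
| & | & |
\end{bmatrix}
= \begin{bmatrix}
1 & 0 & 1 \\
1 & 1 & 0 \\
0 & 1 & 1
\end{bmatrix}. \tag{8.8}
$$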
First, we note that $\sigma(A) = \{2, 4, 6\}$, so $A$ is simple, and we can apply our results. Secondly,
the structure of the left-eigenvectors of $A$ is given by $\bar{v}_1 = [\, \star \ \star \ 0 \,]^\top$, $\bar{v}_2 = [\, 0 \ \star \ \star \,]^\top$ and
$\bar{v}_3 = [\, \star \ 0 \ \star \,]^\top$. Further, we consider that at most one input failure may occur, i.e.,
$s = 1$. Then, we can invoke Algorithm 4 to build the sets for the set multi-covering problem,
which are as follows: $\mathcal{S} = \{\mathcal{S}_1, \mathcal{S}_2, \mathcal{S}_3\}$, with $\mathcal{S}_1 = \{1, 2\}$, $\mathcal{S}_2 = \{2, 3\}$ and $\mathcal{S}_3 = \{1, 3\}$, and
$\mathcal{U} = \bigcup_{i=1}^{3} \mathcal{S}_i = \{1, 2, 3\}$. By inspection, we obtain that $\mathcal{M}' = \{1, 2, 3\}$ is the optimal solution,
where the indexed sets cover each element of $\mathcal{U}$ twice. Further, observe that a solution to the
dedicated-input MCP always has size equal to two, and in this case, the concatenation of
two solutions leads to a solution that has one more input than the optimal solution obtained.
Observe that this is a small-dimensional example that already incurs a solution that is
33% worse than the optimal. Alternatively, if we apply Algorithm 5 to approximate the
solution to the rMCP, we obtain one that is optimal, i.e., $B(\mathcal{M}')$ where $\mathcal{M}' = \{1, 2, 3\}$.
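The gap between concatenation and the optimal multi-cover in this example can be confirmed by exhaustive search. The Python sketch below enumerates how many copies of each set are used; this brute-force approach is an illustrative choice for a three-set instance only, and `min_multicover` is a hypothetical helper name.

```python
from itertools import product

sets = {1: {1, 2}, 2: {2, 3}, 3: {1, 3}}
universe = {1, 2, 3}


def min_multicover(sets, universe, demand):
    """Cheapest multiplicities (0..demand copies per set) covering each element `demand` times."""
    keys = sorted(sets)
    best = None
    for mult in product(range(demand + 1), repeat=len(keys)):
        covered = all(
            sum(m for k, m in zip(keys, mult) if u in sets[k]) >= demand
            for u in universe
        )
        if covered and (best is None or sum(mult) < sum(best)):
            best = mult
    return None if best is None else dict(zip(keys, best))


robust = min_multicover(sets, universe, demand=2)  # one possible failure, s = 1
single = min_multicover(sets, universe, demand=1)  # plain set cover, s = 0
print(robust)                    # {1: 1, 2: 1, 3: 1}
print(2 * sum(single.values()))  # 4
```

The optimal robust solution uses three dedicated inputs, whereas concatenating two minimum covers uses four.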
8.6 Concluding Remarks
In this chapter, we addressed two minimal controllability problems, with the goal of
characterizing the input configurations that actuate the minimal subset of variables yielding
controllability, under a specified number of failures. The problems explored were shown to be
NP-complete, and a polynomial reduction of these to a set multi-covering problem was
provided. In particular, the strategies we followed separate the discrete and continuous nature
of the minimal controllability problems. Subsequently, we discussed greedy solutions to the
minimal controllability problems that yield feasible (but sub-optimal) solutions to the rMCP.
Directions for future research in this line of work include using the obtained inputs’
structure together with methods such as coordinate gradient descent to minimize an energy
cost, and considering the case where the model is not exactly known. Additionally, it would be
interesting to assess the computational complexity of the rMCP without the assumption on
the spectrum of the dynamics’ matrix, as well as to provide polynomial algorithms to obtain
approximated solutions with suboptimality guarantees.
Chapter 9
The robust minimal controllability
problem for switched linear
continuous-time systems
Now, we extend the results of Chapter 8 to switched LTI systems. These systems have a set
of discrete modes among which they may switch. Similarly to the previous chapter, the goal
is to find the minimal number of actuators and their placement, such that the system
is controllable in the scenario where a specified number of actuators may fail. We solve two
versions of the problem. In the first version, we may have a different actuators’ placement for
each mode, and, in the second version, we aim to find the same actuators’ placement across all
modes that ensures controllability of the system. We present algorithms to solve both versions
of the problem. Moreover, we provide algorithms to approximate the solution of the problem.
However, due to the combinatorial nature of the second version of the problem, we may only
approximate the first version in polynomial time.
Again, we achieve the purpose of ensuring information security in the area of control
systems. We published this work in [Ramos et al., 2018b].
9.1 Introduction
Switched systems are paramount in an extensive number of applications, such as control
of mechanical systems, process control, automotive industry, power systems, aircraft/traffic
control, see for instance [Lin and Antsaklis, 2009, Sun, 2006]. The systems belonging to the
subclass of switched systems whose subsystems are described by linear differential equations
are called switched linear systems. These systems alone consist of a line of research with
growing attention [Lin and Antsaklis, 2009], and several works aim to study the properties
of this class such as controllability, observability and reachability [Cheng, 2005, Ji et al.,
2007,Sun, 2006,Sun et al., 2002].
Recent works studied controllability under the scope of uncertain switched linear systems,
where the state matrices’ entries of each mode are only known to be zero or non-zero [Liu et al.,
2013]. A switched linear system is said to be structurally controllable whenever there exists
a numerical realization of the non-zero entries of the state matrices leading to a controllable
switched linear system. In [Ramos et al., 2013], the authors introduced a framework to
model check structural properties of switched linear systems, and propose its use to check
the structural controllability of each subsystem. In [Pequito and Pappas, 2017], the authors
addressed the structural minimal controllability problem for switched linear continuous-time
systems, finding the minimum number of inputs that need to be considered to attain structural
controllability of the system. However, the state matrices’ entries may be linearly dependent,
in which case the system may be structurally controllable but not controllable by the same set of actuators.
In contrast, in this chapter, we address the scenario where we know the
state matrices’ entries and these matrices are simple. We aim to ensure the
controllability of the system, i.e., the ability to drive the system from an initial state to a
desired state, extending the results of Chapter 8, published in [Pequito et al., 2017].
We assume that either we have access to ‘common’ transitions and knowledge of the
existing modes of the switching system, or that the controller is equipped with supervisory
capabilities enabling the system to switch between modes, as considered in some engineering
applications, as in [Pequito and Pappas, 2017, Petreczky et al., 2015]. More specifically, given a
switched linear system with continuous time, we address the problem of finding the minimum
number of inputs/actuators and state variables we need to actuate, ensuring the system’s
controllability under two scenarios (when a specified number of inputs may fail):
(i) design an input matrix, for each system’s mode, that controls the system actuating a small
number of state variables;
(ii) design a common input matrix that controls the system actuating a small number of state
variables.
The main contributions of this chapter consist in addressing (i) and (ii) while providing
insights on how the obtained solutions can be exploited to improve the computational complexity
of the proposed algorithms. We reduce both problems to the well-studied set multi-covering
problem [Chekuri et al., 2012]. Also, we show that (i) is NP-complete when we use a
sufficient condition for controllability. These results allow us to use known polynomial-complexity
algorithms that approximate the set multi-covering problem to get approximations for (i)
and (ii). However, due to the combinatorial nature of (ii), only (i) may be approximated in
polynomial time with those approximation algorithms.
Chapter structure. Section 9.2 states the problems we aim to solve, Section 9.3 presents the
main results, and Section 9.4 illustrates them with examples. Finally, Section 9.5 concludes the chapter.
9.2 Problems Statement
Consider a large-scale dynamical system with dynamics modeled by a switched linear
continuous-time system (SLCS). Conceptually, we can see an SLCS
as a set of linear continuous-time systems (LCS), where each element of the set is
called a mode, together with a set of discrete events that cause the system to switch between
modes. Subsequently, an SLCS for which some actuators may fail, due to either a
malicious entity tampering with the actuators or natural phenomena, may be described
as follows:
$$
\dot{x}(t) = A_{\sigma(t)} x(t) + B_{\sigma(t)}^{\mathcal{M} \setminus \mathcal{A}_{\sigma(t)}} u(t), \tag{9.1}
$$
where $\sigma : \mathbb{R}^+ \to M = \{1, \dots, m\}$ is a piecewise switching signal that only switches once in a
given dwell-time, $x(t) \in \mathbb{R}^n$ is the state of the system, and $u(t) \in \mathbb{R}^p$ is a piecewise continuous
input signal. Moreover, $B_{\sigma(t)}^{\mathcal{M} \setminus \mathcal{A}_{\sigma(t)}}$ consists of the subset of columns with indices in $\mathcal{M} \setminus \mathcal{A}_{\sigma(t)}$,
where $\mathcal{M} = \{1, \dots, p\}$ is the set of inputs’ labeling indices and $\mathcal{A}_{\sigma(t)}$ is the set of indices of
affected (i.e., malfunctioning) inputs, for each mode $\sigma(t)$. Additionally, as discussed in the
introduction, we focus on the scenario where we have knowledge of the switching signal,
as well as of the dwell-time, as in [Pequito and Pappas, 2017, Petreczky et al., 2015] and references
therein.
To ease the notation, we refer to the system in (9.1) by the pair $(A_{\sigma(t)}, B_{\sigma(t)}^{\mathcal{M} \setminus \mathcal{A}_{\sigma(t)}})$. Each
mode of the system corresponds to the time interval where the switching signal is constant,
$\sigma(t) = i$ and $i \in \{1, \dots, m\}$. In other words, it corresponds to an LCS, which we
denote by the pair $(A_i, B_i^{\mathcal{M} \setminus \mathcal{A}_i})$. It is worth noticing that in each mode the dynamics’ matrix
could have a different dimension. For instance, we may want to model a power system such
that only some of its components (e.g., the generators) may be working, depending on the mode
the system is in. Hence, for a mode where some generators are not working, the dynamics’
matrix may be designed with a smaller number of state variables. Moreover, we can include
this behavior by taking $n$ as the maximum of the dimensions of each mode’s dynamics’ matrix
$A_i$, $i \in M$, and assuming that the order of the system variables is fixed. Hence, when a variable does not
play a role in a mode, its dynamics’ matrix has zeros in the respective row and column.
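The embedding of a lower-dimensional mode into the common dimension $n$ can be sketched in Python as follows. The function name `embed_mode` and the 0-based list `active` of the variables that play a role in the mode are assumptions made for this illustration.

```python
from typing import List


def embed_mode(A_small: List[List[float]], active: List[int], n: int) -> List[List[float]]:
    """Place A_small, indexed by the `active` variables, inside an n x n zero matrix."""
    A = [[0] * n for _ in range(n)]
    for r, i in enumerate(active):
        for c, j in enumerate(active):
            A[i][j] = A_small[r][c]
    return A


# A mode in which only the first and third of three variables play a role.
print(embed_mode([[1, 2], [3, 4]], active=[0, 2], n=3))
# [[1, 0, 2], [0, 0, 0], [3, 0, 4]]
```

The row and column of the inactive variable are zero, as described above.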
Furthermore, from a systems’ engineering perspective, we often want to ensure that the
systems possess properties such as controllability. The SLCS (9.1) is controllable if, for any
initial state $x(0) = x_0$ and any desired state $x_d$, there exist a time instance $t_f > 0$, a switching
signal $\sigma : [0, t_f[ \to M$ and an input $u : [0, t_f[ \to \mathbb{R}^p$ s.t. $x(t_f) = x_d$. In other words, we
can always design a control law that drives the system from an initial state to any desired
state in a finite amount of time. Thus, for each mode, an extra set of actuators must be
in place to ensure that the system is still controllable if some inputs fail. Besides, due to
economic restrictions, i.e., since more actuation capabilities incur a higher cost, it is of utmost
importance to deploy the minimum number of actuators that can still control (9.1) whenever
some specified maximum number of actuators may fail for each mode, as in (9.1). We refer
to this problem as the robust minimal controllability problem for SLCS (rMCPS). Subsequently,
given the system (9.1), the rMCPS can be posed as follows:
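Problem Statement 1. Determine matrices $B_1 \in \mathbb{R}^{n \times (s_1+1)n}, \dots, B_m \in \mathbb{R}^{n \times (s_m+1)n}$ that
are a solution to the minimization problem
$$
\begin{aligned}
\min_{B_1, \dots, B_m} \quad & \sum_{i=1}^{m} \|B_i\|_0 \\
\text{s.t.} \quad & (A_i, B_i^{\mathcal{M} \setminus \mathcal{A}_i}) \text{ is controllable for all } i \in M, \text{ and} \\
& |\mathcal{A}_i| \le s_i \text{ and } \mathcal{A}_i \subset \mathcal{M} \text{ for all } i \in M,
\end{aligned}
\tag{9.2}
$$
where the dimension of $B_i$ is $n \times (s_i + 1)n$ to guarantee that there exists a solution to the
problem.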
Similarly, we can model the case where we want to design a common input matrix, i.e.,
$B_i = B_j = B$ for all $i, j \in M$, that controls the SLCS and for which a certain number of
inputs may fail, as finding an input matrix $B^{\mathcal{M} \setminus \mathcal{A}}$ such that
$$
\dot{x}(t) = A_{\sigma(t)} x(t) + B^{\mathcal{M} \setminus \mathcal{A}} u(t) \tag{9.3}
$$
is controllable. Notice that $\mathcal{A}$ can be seen as $\mathcal{A}_{\sigma(t)}$, i.e., the inputs that may fail are the same
across the switching of the dynamics. Therefore, another problem we are interested in is as follows:
given a common actuator placement, i.e., (9.3), the common rMCPS (crMCPS) consists in
the following problem.
Problem Statement 2. Determine matrix $B$ from the minimization problem
$$
\begin{aligned}
\min_{B \in \mathbb{R}^{n \times (s+1)n}} \quad & \|B\|_0 \\
\text{s.t.} \quad & (A_i, B^{\mathcal{M} \setminus \mathcal{A}}) \text{ is controllable for all } i \in M, \text{ and} \\
& |\mathcal{A}| \le s \text{ and } \mathcal{A} \subset \mathcal{M} \text{ for all } i \in M,
\end{aligned}
\tag{9.4}
$$
where the dimension of $B$ is $n \times (s + 1)n$ to ensure that there exists a solution to the problem.
Note that concatenating $s + 1$ times the identity matrix results in an input matrix that is
a feasible solution to both problems, where the only relevant columns of $B$ are the non-zero
ones. Although problems (9.2) and (9.4) seem to be very similar, the proposed solutions are
quite diverse, and in fact, they exhibit different computational complexity issues.
Additionally, to solve Problem 1 and Problem 2, we require two technical assumptions
that we now detail.
Assumption 1. The dynamics’ matrix of each mode, Ai with i ∈ M, is simple, i.e., Ai has
distinct eigenvalues.
Note that many applications have dynamics’ matrices satisfying Assumption 1, e.g.,
dynamical systems modeled as random networks of the Erdős-Rényi type [Tao and Vu, 2016], or
benchmark dynamical systems in control system engineering [Ogata and Yang, 1970, Siljak,
2011].
Assumption 2. A left-eigenbasis of Ai is known for each mode i ∈ M (the set of the left-
eigenvectors of Ai).
Assumption 2 is a technical restriction. In general, the left-eigenbasis is acquired by
numerical methods and, hence, we obtain approximated eigenvectors up to a floating-point
error.
9.3 The Robust Minimal Controllability Problem
In this section, we present the main results of the chapter. We address problem rMCPS (9.2)
and crMCPS (9.4), by ‘decoupling’ the problems into their discrete and continuous optimiza-
tion properties. We start by identifying the structure of the solutions and, after, a numerical
realization of them that ensures controllability under the possible input failures. We introduce
the minimum set multi-covering problem, that we use in Algorithm 6 to build a solution to the
rMCPS and the crMCPS. Then, we find a dedicated solution to the rMCPS, in Theorem 9.3.6,
that is used together with Algorithm 8 to find a general solution to the problem, in Theo-
rem 9.3.7. A solution to the crMCPS is constructed, in Corollary 9.3.8, using Algorithm 7.
Finally, we show that the rMCPS is NP-complete, in Theorem 9.3.9.
We start by noticing that the Popov-Belevitch-Hautus (PBH) eigenvalues controllability
test gives us a sufficient controllability condition for SLCS (9.1) or (9.3).
Proposition 9.3.1. Given an SLCS, if for each mode i ∈M the (Ai, Bi) is controllable, then
the SLCS is controllable.
Hence, Proposition 9.3.1 provides a polynomial method (in $m$ and $n$) to check a sufficient
condition for the controllability of an SLCS. In other words, for each mode $i \in M$, and for
each eigenvalue $\lambda \in \sigma(A_i)$, we only need to compute the rank of $[A_i - \lambda I_n \ \ B_i]$.
However, this criterion does not inform about which entries of each $B_i$ should be non-zero,
nor about which particular values they should take, to ensure the rank condition. That is, we can only verify
in polynomial time that each $B_i$ is a solution. Moreover, we notice that the rMCPS is
computationally challenging to solve: for a particular instance of the rMCPS, i.e., when (9.1)
has one mode, we get the minimal controllability problem (MCP), which is known to be
NP-hard. Hence, we can polynomially reduce an NP-hard problem to this problem (recall Theorem 8.4.8).
Thus, the rMCPS is at least as difficult as the latter, which leads to the following result.
Corollary 9.3.2. The rMCPS (9.2) and the crMCPS (9.4) are NP-hard.
Instead of a naive usage of the PBH eigenvalue test, which leads to a strictly
combinatorial procedure for solving the SLCS, we may consider the PBH test for controllability using
eigenvectors, which allows us to design a sufficient condition for the controllability of an SLCS.
Proposition 9.3.3. Given (9.1), if for each mode $i \in M$ and for each left-eigenvector $v$ of
$A_i$ we have that $v^\dagger B_i \neq 0$, then the system is controllable.
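The sufficient condition of Proposition 9.3.3 amounts to a loop over modes and left-eigenvectors. The Python sketch below is a minimal illustration: the data layout (a list of pairs of left-eigenvectors and input columns per mode), the tolerance `tol`, and the two hypothetical modes are assumptions, not data from this chapter.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[complex]
Mode = Tuple[List[Vector], List[Vector]]  # (left-eigenvectors of A_i, columns of B_i)


def satisfies_eigenvector_condition(modes: List[Mode], tol: float = 0.0) -> bool:
    """True if, in every mode, each left-eigenvector v has v^dagger B_i != 0."""
    for left_eigvecs, b_columns in modes:
        for v in left_eigvecs:
            products = [
                sum(complex(a).conjugate() * c for a, c in zip(v, col))
                for col in b_columns
            ]
            if all(abs(p) <= tol for p in products):
                return False
    return True


# Hypothetical modes with n = 3.
mode1 = ([[1, 1, 0], [0, 1, 1], [1, 0, 1]], [[1, 0, 0], [0, 1, 0]])
mode2 = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [[1, 1, 0]])
print(satisfies_eigenvector_condition([mode1]))         # True
print(satisfies_eigenvector_condition([mode1, mode2]))  # False
```

With numerically computed eigenvectors, a positive `tol` would replace the exact comparison with zero.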
Proposition 9.3.3 plays a central role in this chapter’s main results.
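A minimal sketch of the eigenvector test for one mode, assuming NumPy and a simple matrix A (distinct eigenvalues, as assumed for the state matrices in this chapter). The name `pbh_eigenvector_test` is hypothetical. Left-eigenvectors are obtained as right eigenvectors of the transpose.

```python
import numpy as np

def pbh_eigenvector_test(A, B, tol=1e-9):
    """Return True if no left-eigenvector of A is orthogonal to all columns of B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float).reshape(A.shape[0], -1)
    # Each column w of W satisfies A^T w = lambda w, i.e. w^T A = lambda w^T.
    _, W = np.linalg.eig(A.T)
    return all(np.linalg.norm(W[:, k] @ B) > tol for k in range(W.shape[1]))
```

Applying the function to every pair (Ai, Bi) and requiring all results to be true gives the sufficient condition of Proposition 9.3.3.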
As previously mentioned, we first address the discrete part, which requires us to introduce
the following constructs.
Definition 9.3.4. (Minimum Set Multi-covering Problem [Chekuri et al., 2012]) Consider a
universe with m elements, $U = \{1, 2, \dots, m\}$, a collection of n sets $S = \{S_1, \dots, S_n\}$, where
$S_j \subset U$ for $j \in \{1, \dots, n\}$, s.t. $\bigcup_{j=1}^{n} S_j = U$, and a demand function $d : U \to \mathbb{N}$ that imposes
the number of times each element i needs to be covered. The minimum set multi-covering
problem consists in finding a smallest set of indices $J^* \subseteq \{1, \dots, n\}$ s.t. $\bigcup_{j \in J^*} S_j = U$ and
every element $i \in U$ is covered at least $d(i)$ times, i.e.,
\[ J^* = \arg\min_{J \subseteq \{1,\dots,n\}} |J| \quad \text{s.t.} \quad |\{ j \in J : i \in S_j \}| \ge d(i) \ \text{for all } i \in U. \]
A particular case of the problem in Definition 9.3.4 is when each element needs to be cov-
ered once, d(i) = 1 for all i ∈ U , called the minimum set covering problem, see [Chekuri et al.,
2012]. These two problems are ubiquitous in the fields of combinatorics, computer science,
and complexity theory. They are NP-complete problems for which efficient approximation
algorithms are known and well studied [Vazirani, 2013].
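To make the problem concrete, the following sketch implements the standard greedy heuristic for set multi-covering, in which each set may be picked at most once. It is an approximation, not an exact solver, and it is not claimed to be the method used in the references. The name `greedy_multicover` is hypothetical.

```python
def greedy_multicover(sets, demand):
    """sets: list of Python sets; demand: dict element -> required coverage.
    Returns the sorted 0-based indices of the chosen sets, or None if infeasible."""
    residual = dict(demand)
    chosen, available = [], set(range(len(sets)))
    while any(r > 0 for r in residual.values()):
        # Gain of a set: number of its elements that still need coverage.
        gain = lambda j: sum(1 for e in sets[j] if residual.get(e, 0) > 0)
        best = max(sorted(available), key=gain, default=None)
        if best is None or gain(best) == 0:
            return None  # the remaining demand cannot be met
        chosen.append(best)
        available.remove(best)
        for e in sets[best]:
            if residual.get(e, 0) > 0:
                residual[e] -= 1
    return sorted(chosen)
```

With d(i) = 1 for every element, the routine reduces to the usual greedy set-cover heuristic.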
These problems are particularly useful to exploit the PBH eigenvector criterion and
obtain the sparsest feasible input. In particular, for the LTI case,
we have the following approach.
Lemma 9.3.5 ([Pequito et al., 2017]). Given a non-empty collection of non-zero vectors
$\{v^j\}_{j \in J}$, with $v^j \in \mathbb{R}^n$, the procedure of finding $b^* \in \mathbb{R}^n$ that is a solution to
\[ b^* = \arg\min_{b \in \mathbb{R}^n} \|b\|_0 \quad \text{s.t.} \quad v^j \cdot b \neq 0, \ \text{for all } j \in J \tag{9.5} \]
is polynomially reducible (in |J| and n) to a minimum set covering problem.
Next, we can build upon this problem and the notion of controllability for the SLCS, and
under the assumptions posed in Section 9.2, to find the sparsest set of vectors that ensure
controllability. Specifically, since we have different modes, the goal is to consider the sparsest
sets of inputs across the different eigenvectors of the left-eigenbasis.
Subsequently, we start by presenting an algorithm (Algorithm 6) that receives a collection
of structural vectors and outputs a setup for a set-covering problem. The sets we build, $S^i_k$,
have two indices, i and k: the first matches the mode the structural vectors belong to, and
the second ranges from 1 up to the number of such vectors in that mode. A pair (i, j) belongs
to $S^i_k$ whenever the jth structural vector of mode i is non-zero at index k.
Algorithm 6 Polynomial reduction of the structural optimization problem (9.5) to a set-covering problem
Input: The eigenvectors of the left-eigenbasis of the different modes, $\{v_i^j\}_{i \in M,\, j \in \{1,\dots,|J_i|\}}$, and $J = \bigcup_{i \in M} J_i$.
Output: The setup for a set multi-covering problem, $S = \{S^i_k\}_{i \in M,\, k \in \{1,\dots,n\}}$ and $U$, a collection of sets
and the universe of these sets, respectively.
1: set $S^i_k = \emptyset$, for $i \in M$ and $k \in \{1, \dots, n\}$
2: for $i \in M$
     for $j = 1, \dots, |J_i|$
       for $k = 1, \dots, n$
         if $[v_i^j]_k \neq 0$ then
           $S^i_k = S^i_k \cup \{(i, j)\}$
3: set $S = \{S^i_k\}_{i \in M,\, k \in \{1,\dots,n\}}$ and $U = \bigcup_{V \in S} V$.
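The construction of Algorithm 6 can be sketched in a few lines. We assume that the structural vectors of each mode are supplied as the rows of a 0/1 pattern; with the pattern matrices of Section 9.4 read row-wise, this reproduces the sets listed there. The name `build_cover_sets` is hypothetical.

```python
def build_cover_sets(patterns):
    """patterns: dict mode -> list of structural vectors (sequences of numbers).
    Returns (sets, universe); sets[(i, k)] holds the pairs (i, j) such that the
    j-th vector of mode i is non-zero at index k (all indices are 1-based)."""
    sets = {}
    for i, vectors in patterns.items():
        n = len(vectors[0])
        for k in range(1, n + 1):
            sets[(i, k)] = set()
        for j, v in enumerate(vectors, start=1):
            for k, entry in enumerate(v, start=1):
                if entry != 0:
                    sets[(i, k)].add((i, j))
    universe = set().union(*sets.values())
    return sets, universe
```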
To make our approach easier to follow, we first characterize the dedicated solutions to the
rMCPS, i.e., the solution where each input actuates a unique state variable.
Theorem 9.3.6. Given a left-eigenbasis, $v_i^1, \dots, v_i^n$, of $A_i$ for each mode $i \in M$, and
given the number of possible input failures for each mode, $s_1, \dots, s_m$, consider the set
multi-covering problem $(S, U; d)$, where
• $S = \bigcup_{i \in M} \{S^i_1, \dots, S^i_{(s_i+1)n}\}$;
• $U = \bigcup_{i \in M} U_i$, with $U_i = \{(i, 1), \dots, (i, n)\}$;
• $d(i, j) = s_i + 1$, for $(i, j) \in U$;
where $S^i_k = \{(i, j) : [v_i^j]_l \neq 0 \text{ and } l - 1 = k \bmod n\}$, for $i \in M$ and $k \in K \equiv \{1, \dots, (s_i + 1)n\}$.
Then, the following conditions are equivalent:
(i) $I^*$ is a solution to the set multi-covering problem $(S, U; d)$ and $I_i^* = \{(i, a) : (i, a) \in I^*\}$;
(ii) the set of matrices $\{B_1(I_1^*), \dots, B_m(I_m^*)\}$, where $B_i \in \{0, 1\}^{n \times (s_i+1)n}$, is a dedicated
solution to the rMCPS (9.2), with $[B_i(I_i^*)]_{j,k} = 1$ for $k = j \bmod n$ and $(i, j) \in I^* \subset K$, and
$[B_i(I_i^*)]_{j,k} = 0$ otherwise.
Proof. By Theorem 8.4.9, $B_i(I_i^*)$ is an optimal feasible dedicated solution that controls mode i,
equivalent to a solution $I_i^*$ that covers $U_i$, for demand $d(i, j) = s_i + 1$. By Proposition 9.3.3,
the set of matrices $\{B_1(I_1^*), \dots, B_m(I_m^*)\}$ is a feasible dedicated solution to the rMCPS (9.2),
equivalent to a solution of the set multi-covering problem $(S, U; d)$.
Note that the first entry of the pairs constituting the universe of the set multi-covering
problem identifies the mode of the input matrix when translating a solution of the set
multi-covering problem into a solution of the rMCPS (9.2).
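As an illustration of how a cover translates into dedicated inputs, the sketch below builds one unit column per selected state index of a given mode. It is a compact variant that keeps only the non-zero columns, rather than the padded $n \times (s_i+1)n$ layout of Theorem 9.3.6; the name `dedicated_input_matrix` is hypothetical and NumPy is assumed.

```python
import numpy as np

def dedicated_input_matrix(actuated_states, n):
    """One unit column per entry of actuated_states (1-based, repeats allowed)."""
    B = np.zeros((n, len(actuated_states)), dtype=int)
    for col, state in enumerate(actuated_states):
        B[state - 1, col] = 1  # this input actuates exactly one state variable
    return B
```

A repeated index yields two identical dedicated inputs, which is how a demand larger than one shows up in the input matrix.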
The crMCPS (9.4) requires a different approach because we want to design an input
matrix that is common to every mode. Intuitively, this is computationally more demanding,
since we want to minimize the size of the union of the solutions for each mode, i.e., we want
to actuate a small number of state variables that, across the modes, ensure that the system is
controllable. The first step to solve the problem consists in applying Algorithm 6 to build
the sets of S. The second step is to find a solution for the crMCPS (9.4), using S, U and the
demand function d(i) = s + 1. We need to carefully choose one set of indices of the state
variables to actuate that makes each mode not only controllable but
also robust to s input failures, while maximizing the state variables in common across the modes.
To achieve this, we build an algorithm that finds all the solutions to several set
multi-covering problems, which translate to all possible solutions to the crMCPS (9.4), and afterward
selects one that actuates the smallest number of state variables across all modes.
We summarize this procedure in Algorithm 7, which selects a minimal number of state variables
across all modes that we need to actuate such that Proposition 9.3.3 holds, yielding a solution
to the crMCPS (9.4). Note that Algorithm 7 has worst-case complexity exponential
(in m and n), since each set multi-covering problem may have an exponential number of
solutions.
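The selection step described above can be sketched by exhaustive enumeration, which mirrors the exponential worst case just mentioned. The names `all_covers` and `common_actuation` are hypothetical. The usage below takes the sets listed for the two modes in the rMCPS scenario of Section 9.4.1, with the mode label dropped.

```python
from itertools import combinations, product

def all_covers(sets, universe, demand=1):
    """All index subsets (1-based) whose sets cover each element >= demand times."""
    covers = []
    for r in range(1, len(sets) + 1):
        for X in combinations(range(1, len(sets) + 1), r):
            if all(sum(e in sets[k - 1] for k in X) >= demand for e in universe):
                covers.append(frozenset(X))
    return covers

def common_actuation(mode_sets, universe, demand=1):
    """Pick one cover per mode so that the union of the chosen indices is smallest."""
    per_mode = [all_covers(s, universe, demand) for s in mode_sets]
    if not all(per_mode):
        return None  # some mode cannot be covered
    best = min(product(*per_mode), key=lambda Xs: len(frozenset().union(*Xs)))
    return sorted(frozenset().union(*best))

mode1 = [{3, 4, 5}, {1, 4}, {2}, {1, 2, 3}, {4}]
mode2 = [{3, 4, 5}, {1, 4}, {2, 5}, {1, 3}, {4}]
actuated = common_actuation([mode1, mode2], {1, 2, 3, 4, 5})
```

For these sets, `actuated` contains three state indices, since no two sets of the second mode cover the universe.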
Algorithm 7 Find a minimal set of state variables of a problem (9.4) that need to be actuated
Input: An instance of the problem (9.4); a collection of sets $S$ and a universe $U$, the output of
Algorithm 6 applied to the set of eigenvectors $\{v_i^j\}_{i \in M,\, j \in \{1,\dots,n\}}$ of each dynamics matrix $A_i$; and a demand function d,
with $d(i) = s + 1$.
Output: A set of state variables' indices to actuate s.t. the problem instance is controllable,
$I \subseteq \{1, \dots, n\}$.
1: for $i \in M$
     set $S_i = \{S^t_k : t = i \text{ and } k \in \{1, \dots, n\}\}$
     set $U_i = \{k : (i, k) \in U\}$
     for $j = 1, \dots, n$
       set $B_i$ as the set of all possible covers for $U_i$ with the collection of sets $S_i$ and demand d
2: set $\{X_1^*, \dots, X_m^*\} = \arg\min_{X_1 \in B_1, \dots, X_m \in B_m} \left| \bigsqcup_{i \in M} X_i \right|$
3: set $I = \bigsqcup_{i \in M} X_i^*$ and $B = \{B_i\}_{i \in M}$
Now, we go further and characterize the general solutions to the rMCPS (9.2) and the
crMCPS (9.4), not only dedicated ones. We derive the general solutions from the dedicated
solutions by combining them. Towards this goal, we use the merging procedure proposed in Chapter 8,
which we summarize in Algorithm 8. The procedure tries to combine the entries of the dedicated
inputs into the smallest possible number of inputs, while ensuring that the PBH eigenvector
criterion holds. Specifically, Algorithm 8 picks two compatible inputs, i.e., inputs with different
structure and structural inner product 0. In this procedure, when we combine two compatible
inputs, we set the first one to actuate the variables actuated by either of them, and we discard the
second one (the respective column is set to zero).
Algorithm 8 Merging procedure
Input: An input matrix $B \in \{0, ?\}^{n \times m}$.
Output: The matrix $B = [\, b_1 \ \dots \ b_m \,]$ with inputs merged.
1: while $\exists\, i, j : b_i \neq b_j \neq 0$ and $b_i \cdot b_j = 0$
     set $b_i = b_i + b_j$ and $b_j = 0$
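A minimal sketch of the merging loop on a 0/1 pattern, assuming NumPy. Columns are merged whenever both are non-zero and their supports are disjoint, which is the structural inner-product condition of Algorithm 8. The name `merge_inputs` is hypothetical, and the result depends on the order in which pairs are visited.

```python
import numpy as np

def merge_inputs(B):
    """B: n x m 0/1 array with the non-zero pattern of each input column."""
    B = np.array(B, dtype=bool)
    m = B.shape[1]
    merged = True
    while merged:
        merged = False
        for i in range(m):
            for j in range(m):
                if (i != j and B[:, i].any() and B[:, j].any()
                        and not (B[:, i] & B[:, j]).any()):
                    B[:, i] |= B[:, j]  # b_i = b_i + b_j
                    B[:, j] = False     # b_j = 0
                    merged = True
    return B.astype(int)
```

Each merge zeroes one column, so the loop stops after at most m − 1 merges.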
Up to this point, Algorithm 8 only concerns the structure of the input matrix $\bar{B}$, and we now
build a numerical realization of it. Subsequently, we need to solve the following problem
to perform the second step required to obtain a solution to the rMCPS (9.2) and the
crMCPS (9.4), i.e., a parametrization $B^*$ of the structural matrix $\bar{B}$ (a feasible solution must
not be orthogonal to any of a given set of $m = |J|$ vectors, $\{v^j\}_{j \in J}$):
\[ B^* = \arg\min_{B \in \mathbb{R}^{n \times m}} 0 \quad \text{s.t.} \quad B \cdot v^j \neq 0, \ \text{for all } j \in J, \ \text{and } B \text{ has the structure of } \bar{B}. \tag{9.6} \]
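Since (9.6) is a feasibility problem, one simple heuristic is to draw random values on the support of the pattern and test the non-orthogonality constraints. This sampling strategy is our assumption for illustration, not the procedure of the original work; the name `realize_pattern` is hypothetical and NumPy is assumed.

```python
import numpy as np

def realize_pattern(pattern, vectors, seed=0, max_tries=100, tol=1e-9):
    """Draw random non-zero values on the support of `pattern` (n x m, 0/1) until
    every vector v in `vectors` satisfies v @ B != 0. Returns B, or None."""
    rng = np.random.default_rng(seed)
    V = np.asarray(vectors, dtype=float)
    P = np.asarray(pattern, dtype=float).reshape(V.shape[1], -1)
    for _ in range(max_tries):
        B = P * rng.uniform(1.0, 2.0, size=P.shape)  # zeros of the pattern stay zero
        if np.all(np.linalg.norm(V @ B, axis=1) > tol):
            return B
    return None
```

If some vector vanishes on the whole support of the pattern, no realization exists and the function returns None.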
As a consequence, we can obtain a solution for the rMCPS (9.2) as described in the
following result.
Theorem 9.3.7. Let $\{B_i(I_i^*)\}_{i=1}^{m}$, where $B_i \in \{0, 1\}^{n \times (s_i+1)n}$, be a dedicated solution to the
rMCPS (9.2), obtained with Theorem 9.3.6. Further, let $\{\bar{B}_i\}_{i=1}^{m}$, where $\bar{B}_i \in \{0, ?\}^{n \times (s_i+1)n}$,
be the sparsities of the matrices resulting from the merging procedure, Algorithm 8, between any
of the effective inputs, for each $B_i(I_i^*)$. Then, the set of matrices $\{B_i^*\}_{i=1}^{m}$, where each matrix
$B_i^* \in \mathbb{R}^{n \times n}$ is obtained using the optimization task (9.6) for inputs $\bar{B}_i$ and the left-eigenbasis of
$A_i$, is a solution to the rMCPS (9.2).
Proof. By Theorem 9.3.6, we have a feasible solution to the rMCPS (9.2), and, by
construction, Algorithm 8 preserves feasibility by merging only compatible inputs.
Also, by construction, the optimization problem (9.6) ensures that the obtained numerical
realization of the solution's sparsity verifies the PBH eigenvector criterion. Hence, we obtain
a solution to the rMCPS (9.2).
In fact, the above also applies to the crMCPS (9.4), considering a matrix B, instead of
the set of matrices $\{B_i\}_{i=1}^{m}$. Hence, the next result readily follows.
Corollary 9.3.8. Let $B(I_i^*)$, with $B \in \{0, 1\}^{n \times (s+1)n}$, be a dedicated solution to the
crMCPS (9.4), obtained with Theorem 9.3.6. Further, let $\bar{B} \in \{0, ?\}^{n \times (s+1)n}$ be the sparsity of
the matrix resulting from the merging procedure, Algorithm 8, between any of the effective
inputs. Then, the matrix $B^*$ obtained using the optimization task (9.6) for inputs $\bar{B}$ and the
left-eigenbasis of $A_1, \dots, A_m$ is a solution to the crMCPS (9.4).
It is worth noticing that the decision version of the rMCPS (9.2), when using the sufficient
condition of Proposition 9.3.3, is equivalent to the set multi-covering problem and the rM-
CPS (9.2) is NP-complete for that controllability sufficient condition. For the crMCPS (9.4),
since we lose the information about the modes when we build the input matrix B, the deci-
sion version of the problem is not equivalent to the decision version of the set multi-covering
problem when using the sufficient condition of Proposition 9.3.3.
Theorem 9.3.9. By using the sufficient condition for controllability in Proposition 9.3.3, the
rMCPS (9.2) is NP-complete.
Proof. Using Proposition 9.3.3 as the controllability condition, the rMCPS (9.2) is equivalent
to the set multi-covering problem, by Theorem 9.3.6 and Theorem 9.3.7, and the result follows.
9.4 Illustrative Examples
In this section, we illustrate the main results from this chapter. In order to do so, we fix an
SLCS with two modes and dynamics matrices A1 and A2 given by
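\[ A_1 = \begin{bmatrix} 2 & 0 & 0 & -3 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 2 & 1 & 4 & 3 & 0 \\ 0 & 0 & 0 & 5 & 0 \\ -2 & -1 & -3 & -7 & 1 \end{bmatrix}; \quad A_2 = \begin{bmatrix} 2 & -1 & 0 & -3 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 2 & 1 & 4 & 3 & 0 \\ 0 & 0 & 0 & 5 & 0 \\ -2 & -1 & -3 & -7 & 1 \end{bmatrix}. \]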
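The left-eigenvectors of the dynamics matrices are, respectively,
\[ V_1 = \begin{bmatrix} | & & | \\ v_1^1 & \cdots & v_1^5 \\ | & & | \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}; \quad V_2 = \begin{bmatrix} | & & | \\ v_2^1 & \cdots & v_2^5 \\ | & & | \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \end{bmatrix}. \]
The eigenvalues of each mode are $\sigma(A_1) = \sigma(A_2) = \{1, 2, 3, 4, 5\}$. By design, the spectra are
equal, but they do not need to be, as the only assumption is that the state matrices are
simple.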
9.4.1 Example I
Next, we explore this example when we consider that no input can fail (s = 0) as an instance
of both the rMCPS (9.2) and the crMCPS (9.4), and when, in both cases, some inputs may
fail.
The rMCPS (9.2) scenario
First, we consider we want to design input matrices for each mode that actuate the minimal
number of inputs that, by Proposition 9.3.3, ensures the system to be controllable. Applying
Algorithm 6, we obtain the following sets: $S^1_1 = \{(1, 3), (1, 4), (1, 5)\}$, $S^1_2 = \{(1, 1), (1, 4)\}$,
$S^1_3 = \{(1, 2)\}$, $S^1_4 = \{(1, 1), (1, 2), (1, 3)\}$ and $S^1_5 = \{(1, 4)\}$, which correspond to the first mode
of the system. For the second mode of the system, we get the sets $S^2_1 = \{(2, 3), (2, 4), (2, 5)\}$,
$S^2_2 = \{(2, 1), (2, 4)\}$, $S^2_3 = \{(2, 2), (2, 5)\}$, $S^2_4 = \{(2, 1), (2, 3)\}$ and $S^2_5 = \{(2, 4)\}$. This yields
the universe $U = \bigcup_{i=1}^{2} \{(i, 1), (i, 2), (i, 3), (i, 4), (i, 5)\}$. Now, solving the associated set
multi-covering (set covering) problem $(S, U; d = 1)$, we obtain as a solution that
$U = S^1_1 \cup S^1_4 \cup S^2_1 \cup S^2_2 \cup S^2_3$.
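This leads to a structure of a solution to the rMCPS (9.2) of $B_1 = [\, ?\ 0\ 0\ ?\ 0 \,]^\top$ and
$B_2 = [\, ?\ ?\ ?\ 0\ 0 \,]^\top$, since the cover uses the sets with indices 1 and 4 in the first mode and
1, 2 and 3 in the second mode. Subsequently, we can check that by setting every non-zero entry of
$B_1$ and $B_2$ as 1, by Proposition 9.3.3, we obtain a solution to the problem.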
The crMCPS (9.4) scenario
Now, our aim is to design a common input matrix that controls the system. First, we
need to compute all possible solutions, for each mode of the system. For both modes,
the universe of the associated set covering problem is $U = \{1, 2, 3, 4, 5\}$. Now, we apply
Algorithm 7 and we get, for the first mode, $S_1 = \{\{1, 4, 5\}, \{1, 4\}, \{2\}, \{1, 2, 3\}, \{4\}\}$,
and, thus, each of the following sets of indices constitutes a solution that covers the universe:
$B_1 = \{\{1, 4\}, \{1, 2, 4\}, \{1, 3, 4\}, \{1, 4, 5\}, \{1, 2, 3, 4\}, \{1, 2, 4, 5\}, \{1, 3, 4, 5\}, \{1, 2, 3, 4, 5\}\}$,
that is, for each set $X \in B_1$, we have that $U = \bigsqcup_{i \in X} [S_1]_i$. Note that
5 only belongs to the first set of $S_1$, and 3 is only in the fourth set of $S_1$. Hence, these two
sets need to belong to the solution. Analogously, for the second mode, we have that
$S_2 = \{\{3, 4, 5\}, \{1, 2, 4\}, \{2, 5\}, \{1, 3\}, \{4\}\}$, and all solutions of the associated set covering
problem are $B_2 = \{\{1, 2, 3\}, \{1, 3, 4\}, \{3, 4, 5\}, \{1, 2, 3, 4\}, \{1, 2, 3, 5\}, \{1, 3, 4, 5\}, \{1, 2, 3, 4, 5\}\}$. By
combining the two sets of possible solutions, we obtain the result that consists in selecting, for
instance, $\{1, 4\} \in B_1$ together with $\{1, 3, 4\} \in B_2$. Thus, $\{1, 3, 4\}$ is the set of state variables
that we need to actuate, in both modes, to attain controllability. Then, an input matrix with
dedicated inputs may have the following sparsity
\[ B = \begin{bmatrix} ? & 0 & 0 & 0 & 0 \\ 0 & 0 & ? & 0 & 0 \\ 0 & 0 & 0 & ? & 0 \end{bmatrix}^\top . \]
Now, resorting to Algorithm 8, we get the input matrix pattern with merged columns
$B = [\, ?\ 0\ ?\ ?\ 0 \,]^\top$, and we can check that $B = [\, 1\ 0\ 1\ 1\ 0 \,]^\top$ is a solution to the problem.
Note that the solutions to the problem (9.2) and (9.4) instances are distinct. In fact, if we
set B1 and B2 in Section 9.4.1 as B, we get a solution to the problem (9.2) instance, that is
not minimal in each mode.
9.4.2 Example II
Now, we explore the scenario where a set of inputs may malfunction in the switching system.
We want to account for them when designing the inputs, and the respective variables that
they control, to still be able to control the system.
The rMCPS (9.2) scenario
Suppose now that, in the first mode, no input may fail, but, in the second
mode, one input may fail. In other words, s1 = 0 and s2 = 1, which translates to having, in the
corresponding set multi-covering problem, d(1, j) = 1 and d(2, j) = 2 for (1, j), (2, j) ∈ U, with U and
the collection of sets S as in Section 9.4.1.
Now, a solution to the problem is $U = S^1_1 \cup S^1_4 \cup S^2_1 \cup S^2_2 \cup S^2_3 \cup S^2_4$, translating to the
pattern for the input matrices, when considering dedicated inputs,
\[ B_1 = [\, ?\ 0\ 0\ ?\ 0 \,]^\top, \quad \text{and} \quad B_2 = \begin{bmatrix} ? & 0 & 0 & 0 & 0 \\ 0 & ? & 0 & 0 & 0 \\ 0 & 0 & ? & 0 & 0 \\ 0 & 0 & 0 & ? & 0 \end{bmatrix}^\top . \]
We apply Algorithm 8 and obtain that a single input controls the second mode, and setting each
non-zero value of the input matrices to 1 verifies Proposition 9.3.3, i.e., the input matrices
$B_1 = [\, 1\ 0\ 0\ 1\ 0 \,]^\top$ and $B_2 = [\, 1\ 1\ 1\ 1\ 0 \,]^\top$ are a solution to the problem instance.
The crMCPS (9.4) scenario
Finally, suppose the objective is to design a single input matrix that not only ensures that
the system is controllable, but also that the system remains controllable whenever, at most,
one input fails. In other words, s = 1, which is reflected in the demand function for the set
multi-covering problems of Algorithm 7, d = 2.
By recalling the sets $B_1$ and $B_2$ from Section 9.4.1, we know that a solution for the first
mode must have twice the indices 1 and 4 so that we cover elements 3 and 5 twice. Hence,
a possible and minimal solution is the set of indices $I = \{1, 1, 2, 3, 4, 4\}$. In fact, we can
check that this also produces a solution for the second mode. Hence, the pattern of the
solution, when considering dedicated inputs, and a solution, after applying Algorithm 8, are,
respectively,
\[ B = \begin{bmatrix} ? & ? & 0 & 0 & 0 & 0 \\ 0 & 0 & ? & 0 & 0 & 0 \\ 0 & 0 & 0 & ? & 0 & 0 \\ 0 & 0 & 0 & 0 & ? & 0 \\ 0 & 0 & 0 & 0 & ? & 0 \end{bmatrix}, \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix}^\top . \]
We can find other solutions by changing the merging order. If we set B1 and B2 in
Section 9.4.2 as B, we get a solution to the problem (9.2) instance, the same we obtained.
However, here we want to ensure the system to be robust to one input failure for each mode, while
in the rMCPS (9.2) scenario of Section 9.4.2, we want to ensure that the system is robust to one
input failure only in the second mode.
9.5 Concluding Remarks
In this chapter, we addressed two robust minimal controllability problems for switched linear
continuous-time systems, extending the results of Chapter 8. The first is to design an input
matrix for each mode that ensures the switched system to be controllable. The second is
to design a common input matrix guaranteeing the switched system to be controllable. We
showed that the first problem is NP-complete when the sufficient condition of Proposition 9.3.3
is used as the controllability criterion. We reduced both problems to set multi-covering
problems in two steps, solving first the discrete part of the problem and afterward the
continuous one. These reductions allow deploying efficient approximation algorithms for
set multi-covering problem instances, to get solutions to rMCPS (9.2) or crMCPS (9.4)
instances that are feasible and have sub-optimality guarantees.
Future work involves exploring the structure of the problems to attain optimal solutions,
leveraging the eigenbasis’ structure. Further, we want to consider, besides the number of
inputs for controllability, obtaining a certain controllability index, minimizing the number of
inputs and the number of times that we need to actuate the system. Last, we want to relate
macroscopic interconnections between dynamical systems, leading to a modular approach to
the actuation placement that ensures controllability.
Chapter 10
On the index of convergence of
Boolean matrices with commutative
SD-decomposition
Finally, we explore a digraph decomposition used in structural control systems, to propose a
new bound for the index of convergence of Boolean matrices that have a digraph decomposition
with certain properties. This work is submitted for publication, see [Ramos and Caleiro, 2018].
10.1 Introduction
In this chapter, we present a new bound for the index of convergence (transient) for a large
class of Boolean matrices that emerge in several application domains. The index of conver-
gence and the period of a matrix are paramount in applications such as transportation systems,
production plants cyclic scheduling, network synchronizers, and distributed algorithms in the
scope of routing or resource allocation, see [Akian et al., 2006]. For instance, the termination
time of the Full Reversal algorithm for message routing in computer networks is equivalent
to the index of convergence of the dynamical system’s matrix [Charron-Bost et al., 2011].
10.2 Preliminaries & Terminology
We next introduce the notation and preliminaries specific to this chapter.
10.2.1 Boolean Matrices
We start by recalling Boolean matrices and the algebra we use to operate them. Next, in
tandem with Boolean matrices, we also overview directed graphs, and we present known
results for Boolean matrices based on directed graphs.
Let $\mathbb{B} = \{0, 1\}$. We use the max-min algebra [Gavalec, 1997] to operate elements of $\mathbb{B}$. If
$a, b \in \mathbb{B}$, then $a \oplus b = \max\{a, b\}$, and $a \otimes b = \min\{a, b\}$. If $k, n, m \in \mathbb{N}_1$ ($\mathbb{N}_1 \equiv \mathbb{N} \setminus \{0\}$),
then a vector $v \in \mathbb{B}^k$ and a matrix $A \in \mathbb{B}^{n \times m}$ are a Boolean vector and a Boolean
matrix, respectively. We denote the inner product of $u, v \in \mathbb{B}^n$ by $u \odot v$, using the max-min
algebra, i.e., $u \odot v = \bigoplus_{i=1}^{n} u_i \otimes v_i$. We denote by $I_n$ the $n \times n$ identity matrix, dropping the
n when it is obvious from the context. Given two matrices $A \in \mathbb{B}^{n \times m}$ and $B \in \mathbb{B}^{m \times k}$, the
product of A and B is a matrix $C = A \otimes B \in \mathbb{B}^{n \times k}$, obtained by the usual matrix product,
but using the inner product $\odot$. Let $s \in \mathbb{N}$; we denote the s-th power of the matrix $A \in \mathbb{B}^{n \times n}$
as $A^{\otimes s}$. It is inductively defined as $A^{\otimes s} = I_n$, if $s = 0$, and $A^{\otimes s} = A \otimes A^{\otimes (s-1)}$, otherwise.
We say that two square matrices $A, B \in \mathbb{B}^{n \times n}$ commute whenever $A \otimes B = B \otimes A$.
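The max-min product and the inductive definition of the powers can be sketched as follows, assuming NumPy and 0/1 integer arrays. The names `bool_product` and `bool_power` are hypothetical.

```python
import numpy as np

def bool_product(A, B):
    """Max-min product: C[i, j] = max over k of min(A[i, k], B[k, j])."""
    A, B = np.asarray(A, dtype=bool), np.asarray(B, dtype=bool)
    return (A[:, :, None] & B[None, :, :]).any(axis=1).astype(int)

def bool_power(A, s):
    """s-th max-min power of a square matrix; the 0-th power is the identity."""
    A = np.asarray(A, dtype=int)
    result = np.eye(A.shape[0], dtype=int)
    for _ in range(s):
        result = bool_product(A, result)
    return result
```

On 0/1 entries, max and min coincide with logical OR and AND, which is what the implementation relies on.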
Boolean matrices emerge in several applications. For instance, they emerge in [Liu and
Wang, 2007, Satta, 1994, Prosser, 1959, Cheng, 2011], designated structured matrices, and they
are used to encode the structure of real/complex matrices, to model physical systems where
we do not know the exact values of system variables, but we know the location of zeros. We
used structured matrices in Chapter 8 and Chapter 9, and introduced this notion in Chapter 7,
where we used ? to denote a non-zero entry. In this chapter, because we only use Boolean
matrices, we no longer need ? to distinguish a non-zero entry from a real entry
equal to 1. Therefore, we use $\mathbb{B} = \{0, 1\}$ to improve readability, and to follow the standard
notation from this area.
The set $\mathbb{B}^{n \times n}$, with the $\otimes$ operation, is a semigroup of order $2^{n^2}$. Also, given
$A \in \mathbb{B}^{n \times n}$, the sequence of powers $(A^{\otimes i})_{i \in \mathbb{N}}$ forms a finite subsemigroup $\langle A \rangle$ of $\mathbb{B}^{n \times n}$. Thus,
there is a least positive integer $k = k(A)$ such that $A^{\otimes k} = A^{\otimes (k+t)}$, for some integer $t > 0$.
Further, there is a least positive integer $p = p(A)$ such that $A^{\otimes k} = A^{\otimes (k+p)}$. Such k and
p are called the index (of convergence) or transient, and the period (of convergence) of A,
respectively.
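Both quantities can be computed directly from the definition by iterating the powers until one repeats. The sketch below assumes NumPy; the name `index_and_period` is hypothetical, and the procedure is only practical for small matrices.

```python
import numpy as np

def index_and_period(A):
    """Return (k, p): the least k >= 1 whose power reappears later in the
    sequence of max-min powers, and the least p >= 1 with power k = power k + p."""
    A = np.asarray(A, dtype=bool)
    step = lambda X: (X[:, :, None] & A[None, :, :]).any(axis=1)  # X times A
    seen, power, t = {}, A.copy(), 1
    while power.tobytes() not in seen:
        seen[power.tobytes()] = t
        power, t = step(power), t + 1
    k = seen[power.tobytes()]
    return k, t - k
```

For the 2 × 2 permutation matrix that swaps the two vertices, the powers alternate between the matrix and the identity, so the index is 1 and the period is 2.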
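A Boolean matrix $A \in \mathbb{B}^{n \times n}$ is said to be reducible if there exists a permutation matrix $P \in \mathbb{B}^{n \times n}$
such that $P \otimes A \otimes P^\top = \begin{bmatrix} B & 0 \\ D & C \end{bmatrix}$, where B, C are square matrices. A is irreducible if it is
not reducible.
A Boolean matrix $A \in \mathbb{B}^{n \times n}$ is primitive if and only if $A^{\otimes k} = \mathbf{1}$ for some $k \in \mathbb{N}_1$, where $\mathbf{1}$ is the $n \times n$
matrix with every entry equal to 1. If A is primitive and irreducible then $p(A) = 1$. The least k such that
$A^{\otimes k} = \mathbf{1}$ is called the primitive exponent of A, denoted by $\gamma(A)$. If A is primitive
then $k(A) = \gamma(A)$.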
10.2.2 Digraphs
Boolean matrices may be represented as directed graphs (digraphs). A digraph G is a pair
(V, E), where V is a finite set of points in $\mathbb{N}_1$, called vertices, and $E \subseteq V \times V$ is a relation
between vertices such that if $u, v \in V$ and $(u, v) \in E$, then we say that the ordered pair (u, v)
is an edge that starts in u and ends in v. Given a Boolean matrix $A \in \mathbb{B}^{n \times n}$, its digraph
representation $G(A) = (V, E)$ has $V = \{1, \dots, n\}$ and $E = \{(i, j) : A_{ij} = 1 \text{ and } 1 \le i, j \le n\}$.
In turn, A is called the adjacency matrix of G(A). Similarly, given a digraph G = (V, E), we
can obtain its equivalent Boolean matrix.
Next, we introduce some graph theoretic notions [Bollobas, 2012] to make the manuscript
self-contained. Let G be a digraph. A walk of size $k \in \mathbb{N}$ from vertex u to vertex v is a sequence
of vertices $\langle v_1, v_2, \dots, v_k, v_{k+1} \rangle$, with $v_1 = u$ and $v_{k+1} = v$, such that $(v_i, v_{i+1}) \in E$ for $1 \le i \le k$.
A path is a walk that does not repeat vertices. A cycle in G, $\langle v_1, v_2, \dots, v_k, v_1 \rangle$, is a path
$\langle v_1, v_2, \dots, v_k \rangle$ such that $(v_k, v_1) \in E$. The girth of G is the size of a shortest cycle contained in G. If G does not
have cycles, then it is a directed acyclic graph (DAG). A self-loop DAG is a digraph with a
self-loop in each node such that, if we remove the self-loops, we obtain a DAG. A directed
tree is a DAG in which a node is assigned to be the root, and there is exactly one path from
the root to each node. A directed r-tree is a directed tree such that each node has, at most, r
edges to other nodes. In a directed tree, a leaf is a node that does not have outgoing edges.
The height of a directed tree is the maximum of the lengths of paths from the root to any leaf.
A balanced directed r-tree is a directed r-tree such that the lengths of any two paths from
the root to a leaf differ by at most 1. A self-loop directed r-tree is a digraph with a self-loop in each
node such that, if we remove the self-loops, we obtain a directed r-tree.
The shortest path problem is the problem of finding a path between a source vertex s
and a target vertex t in G that minimizes the number of its edges. The problem of
finding the path with a maximum number of edges from vertex s to vertex t in G is the longest
path problem, lp(G). We may solve the shortest path problem in polynomial time [Cormen
et al., 2001]. The longest path problem is NP-hard, though for DAGs it has a linear time
solution [Sedgewick and Wayne, 2011].
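For DAGs, the linear-time solution amounts to one pass over a topological order. The sketch below returns the number of edges of a longest path over all pairs of vertices, which is the quantity needed later for the bound on the index. It assumes Python 3.9+ for `graphlib`; the name `longest_path_length` is hypothetical.

```python
from graphlib import TopologicalSorter

def longest_path_length(n, edges):
    """Number of edges of a longest path in a DAG on vertices 1..n."""
    preds = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        preds[v].add(u)
    best = {}
    # static_order() lists every vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        best[v] = max((best[u] + 1 for u in preds[v]), default=0)
    return max(best.values(), default=0)
```

If the input contains a cycle, `TopologicalSorter` raises `graphlib.CycleError`, so the function is only meaningful for DAGs.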
A trail in a digraph G is a walk that does not repeat edges. A circuit is a trail that starts
and ends in the same vertex. The diameter of G is the size of the longest shortest path between
any pair of vertices. A sub-graph of a digraph G = (V,E) is a digraph H = (V ′, E′) such that
E′ ⊆ E, V ′ ⊆ V and E′ ⊆ V ′×V ′. A digraph G = (V,E) is strongly connected (SCD) if there
exists a path between every pair of vertices (u, v) ∈ V × V . A strongly connected component
(SCC) of a digraph is a maximal sub-graph that is strongly connected. The cyclicity of a
strongly connected digraph G is the greatest common divisor of the lengths of all circuits of
the graph. The cyclicity of a digraph G is the least common multiple of the cyclicities of its
SCCs.
A Boolean matrix A is irreducible if and only if G(A) is strongly connected.
10.2.3 Known Results
Now, we present a set of results that explore properties of Boolean matrices by means of
digraphs properties. A more detailed survey may be found in [Li and Shao, 1993]. The first
result about the index of a Boolean matrix A is the following.
Proposition 10 ([Wielandt, 1950]). If the matrix $A \in \mathbb{B}^{n \times n}$ is primitive and G(A) is strongly
connected, then $k(A) \le (n - 1)^2 + 1$.
The next result provides a tighter bound for the index of a Boolean matrix, using the
girth of its associated digraph.
Proposition 11 ( [Denardo, 1977,Dulmage et al., 1964]). If A ∈ Bn×n is a primitive matrix
such that G(A) is strongly connected with girth g, then the value of k(A) is O(g · n).
Another result sharpening the bounds of the index of a Boolean matrix, using the cyclicity
of the digraph representation, is the following.
Proposition 12 ([Schwarz, 1970]). Let G(A) be a non-primitive strongly connected digraph,
then the index of A verifies $k(A) \le (n - 1)^2 + 1$. Further, if the cyclicity of G(A) is $\delta$, then
k(A) is $O(n^2/\delta)$.
The previous result suggests that the higher the cyclicity of G(A), the lower the index
of A. Notwithstanding, the girth of a strongly connected digraph is always greater than or equal
to the cyclicity, and Propositions 11 and 12 suggest a necessary trade-off between these two
quantities to attain a small index of convergence. A further upper bound was provided
in [Kim, 1979], generalizing Propositions 11 and 12, as follows.
Proposition 13 ([Kim, 1979]). Let G(A) be a strongly connected digraph, with n vertices,
girth g and cyclicity $\delta$, then the index of A is at most $n + g \cdot \left( \lfloor n/\delta \rfloor - 2 \right)$.
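To compare the bounds numerically, the two closed-form expressions of Propositions 10 and 13 can be evaluated as below. The function names `wielandt_bound` and `kim_bound` are hypothetical, and the girth and cyclicity are taken as given rather than computed from the digraph.

```python
def wielandt_bound(n):
    """Bound of Proposition 10: (n - 1)^2 + 1."""
    return (n - 1) ** 2 + 1

def kim_bound(n, girth, cyclicity):
    """Bound of Proposition 13: n + g * (floor(n / delta) - 2)."""
    return n + girth * (n // cyclicity - 2)
```

For instance, with n = 5, girth 2 and cyclicity 1, the bound of Proposition 13 gives 5 + 2 · 3 = 11, against 17 from Proposition 10.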
Even more recently, in [Merlet et al., 2017] the authors identify the matrices that actually
achieve two particular bounds on the indices of Boolean matrices, which are generalizations
of the bounds of Wielandt and Dulmage-Mendelsohn.
10.3 Index of Boolean matrices with commutative SD-decomposition
Here, we present a new bound for a class of Boolean matrices that we next detail. First, we
introduce a decomposition of a digraph, inspired by results from structured control systems,
see [Ramos et al., 2015].
Definition 10.3.1. Given a digraph G with adjacency matrix A, its SCC and DAG
decomposition, SD-decomposition, consists of S, the adjacency matrix of the subgraph of its SCCs,
and D, the adjacency matrix of the subgraph of the DAG that results from removing the SCCs
of G, such that $A = S \oplus D$. A has a commutative SD-decomposition (CSDD) whenever
$S \otimes D = D \otimes S$.
Observe that for $G(A) = (V, E)$, $A \in \mathbb{B}^{n \times n}$, we can compute an SD-decomposition in
$\Theta(|V| + |E|)$ time, using Tarjan's algorithm to compute the SCCs of G(A), see [Tarjan, 1972].
Further, we can check if an SD-decomposition commutes in $O(n^{2.3728639})$, see [Le Gall, 2014].
Hence, given $A \in \mathbb{B}^{n \times n}$, we can check if its SD-decomposition commutes in $O(n^{2.3728639})$.
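A small sketch of the decomposition and of the commutativity check, assuming NumPy. For brevity it uses a transitive closure instead of Tarjan's algorithm, so it does not attain the complexities quoted above. We also assume that S collects the edges whose endpoints lie in the same SCC (self-loops included) and that D collects the remaining edges. The names `bmul`, `sd_decomposition` and `has_csdd` are hypothetical.

```python
import numpy as np
from itertools import product

def bmul(A, B):
    """Max-min product of two Boolean arrays."""
    return (A[:, :, None] & B[None, :, :]).any(axis=1)

def sd_decomposition(A):
    """Split A into S (edges inside an SCC) and D (edges between SCCs)."""
    A = np.asarray(A, dtype=bool)
    n = A.shape[0]
    R = A | np.eye(n, dtype=bool)   # reachability in at most one step
    for _ in range(n):              # iterate up to the transitive closure
        R = R | bmul(R, R)
    S = A & R.T                     # edge i -> j such that j also reaches i
    D = A & ~S
    return S, D

def has_csdd(A):
    S, D = sd_decomposition(A)
    return bool(np.array_equal(bmul(S, D), bmul(D, S)))

# Count the 2 x 2 Boolean matrices with a CSDD under the assumption above.
count_2 = sum(has_csdd(np.array(bits).reshape(2, 2)) for bits in product([0, 1], repeat=4))
```

Under this reading of the definition, `count_2` evaluates to 12.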
Next, we characterize the digraphs that have a CSDD.
Proposition 14. Let A ∈ Bn×n, let G(A) = (V,E) be its digraph representation and let
A = S⊕D be its SD-decomposition, with digraph representations G(S) = (V,ES) and G(D) =
(V,ED), respectively. Then, A has a CSDD if and only if for all i, j ∈ V there is a path of
size two starting in i and ending in j that passes in some vertex k ∈ V with (i, k) ∈ ES and
(k, j) ∈ ED if and only if there is a path of size two starting in i and ending in j that passes
in some vertex k′ ∈ V with (i, k′) ∈ ED and (k′, j) ∈ ES.
Proof. Let (S ⊗ D)ij = 1. This means that there exists k s.t. Sik = 1 and Dkj = 1 and,
thus, there is a path starting in i and ending in j that passes in some vertex k ∈ V with
(i, k) ∈ ES and (k, j) ∈ ED. (S ⊗D)ij = 0, otherwise. Hence, A has a CSDD if and only if
(S ⊗D)ij = (D ⊗ S)ij for all i, j ∈ V .
To illustrate Proposition 14, we consider the two digraphs depicted in Figure 10.1 and
their respective SD-decompositions. In each digraph, the red edges correspond to S and the
black edges correspond to D. In digraph (a) there is a path between vertices 1 and 4 with a
red edge before a black edge, but there does not exist a path with a black edge before a red
edge. For this reason, (a) does not have CSDD. In digraph (b) there is a path with a black
edge before a red edge between two distinct vertices, if, and only if, there is a path with a red
edge before a black edge. Therefore, (b) has CSDD.
In Table 10.1, we present the number of Boolean matrices with CSDD in Bn×n. Although
we do not have a closed form for the proportion of size n CSDD Boolean matrices, we notice
that all strongly connected digraphs correspond to matrices with a CSDD. Furthermore, up
to n = 5, around half of the matrices have a CSDD.
[Figure 10.1: Two digraphs, (a) and (b), and their respective SD-decompositions.]
n | 1 | 2 | 3 | 4 | 5 | 6
SCD | 1 | 4 | 144 | 25 696 | 18 082 560 | 47 025 585 664
CSDD | 1 | 12 | 260 | 30 444 | 18 819 092 | 47 543 429 052
$\mathbb{B}^{n \times n}$ | 1 | 16 | 512 | 65 536 | 33 554 432 | 68 719 476 736
Table 10.1: Number of size n CSDD Boolean matrices vs. size n Boolean matrices.
In fact, the class of Boolean matrices that commute has interesting properties, as studied
in [Katz et al., 2012]. In this work, the authors investigate the Frobenius normal forms of
commuting matrices, and they show how the intersection of eigencones of commuting matrices
can be described, considering connections with Boolean algebra, to prove that commuting
irreducible matrices in the max-min algebra have a common eigennode.
Notwithstanding, several applications use digraphs that may have a CSDD. In [Gao et al.,
2010], the authors make use of DAGs for which they add self-loops in some vertices in the
context of automata theory. In [Lin, 2012], the authors use the same kind of digraphs to
model and verify multithreaded programs. Also, in [Fletcher et al., 2012] the authors defined
an extended DAG (EDAG), a digraph that becomes a DAG when the self-loops are deleted
and where for each path there is, at most, one self-loop. They used EDAGs to study the
expressive power of navigational query languages on graphs that represent binary relations.
Theorem 10.3.2. Let $A \in \mathbb{B}^{n \times n}$, with $G \equiv G(A)$ its digraph representation. Let $A =
S \oplus D$ be its SD-decomposition, with $\{SCC_j\}_{j=1}^{l}$ the strongly connected components of G(S),
and G(D) be the DAG that corresponds to G when we remove its SCCs. Further, let $m =
\mathrm{lcm}(k(SCC_1), \dots, k(SCC_l))$ and $d = |lp(G(D))| + 1$. If $[S, D] = 0$ then
\[ k(A) \le \begin{cases} \min\{2^{(m+1)\cdot(d+1)} - 1,\ 2^{n^2}\} & \text{if } S, D \neq 0, \\ m & \text{if } D = 0, \\ d & \text{if } S = 0. \end{cases} \]
Proof. Let $A \in \mathbb{B}^{n \times n}$ and $G \equiv G(A)$ be its associated digraph representation. Further, let
$A = S \oplus D$ be the respective SD-decomposition, where the matrix $S$ corresponds to the
adjacency matrix of $\{SCC_j\}_{j=1}^{l}$, the strongly connected components of $G$, and $D$ corresponds
to the remaining digraph, which is a DAG. In fact, without loss of generality, we can see $S$
as a block diagonal matrix, where each block corresponds to an SCC. In other words, we can
rename the vertices of the digraph in such a way that the matrix is block diagonal, and this
corresponds to applying the same permutation to the rows and columns of $A$. The matrix $D$ is the
adjacency matrix of the DAG built from $G$ when we remove the SCCs. Let us compute $A^{\otimes t}$ for $t \in \mathbb{N}_1$.
Let $S, D \neq 0$. Since $A$ has a CSDD, we can use the binomial expansion
\[
A^{\otimes t} = (S \oplus D)^{\otimes t} = \bigoplus_{i=0}^{t} S^{\otimes (t-i)} \otimes D^{\otimes i}.
\]
Now, we explore the behavior of the powers of $D$. Since $D$ is the adjacency matrix of a DAG, we know
there exists $d \le n$ such that $d - 1$ is the size of the longest path between any pair of vertices.
Therefore, we have that $D^{\otimes d} = D^{\otimes (d+1)} = 0$; in other words, there are no walks in a DAG
with size larger than $d - 1$, and $k(D) = d$.
Finally, we study the behavior of the powers of $S$. Let $k_j \equiv k(SCC_j)$ denote the index of
the block of $S$ that corresponds to $SCC_j$. Then $k(S) = \operatorname{lcm}(k_1, \ldots, k_l)$. Let $m$ denote the value of $k(S)$.
Putting all pieces together, to compute $A^{\otimes t}$, we only need to consider products of ordered
pairs in $X = \{I, S, \ldots, S^{\otimes m}\} \times \{I, D, \ldots, D^{\otimes d}\} \setminus \{(I, I)\}$. More precisely, we can eventually
have a different power of $A$ for each sum of products of the pairs in a subset of $X$. There are
$M = (m+1) \cdot (d+1)$ ordered pairs to consider, and $\sum_{i=1}^{M} \binom{M}{i} = 2^{M} - 1$ such subsets. Thus,
we have that $k(A) \le 2^{M} - 1 = 2^{(m+1)\cdot(d+1)} - 1$, and since $k(A) \le 2^{n^2}$, the number of different
$n \times n$ Boolean matrices, it follows that $k(A) \le \min\{2^{(m+1)\cdot(d+1)} - 1,\; 2^{n^2}\}$. If $S = 0$, then
$k(A) \le d$, and if $D = 0$, then $k(A) \le m$.
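The binomial expansion used in the proof can be checked numerically on a small commuting pair. The following is only a minimal sketch: the helper names (`bool_mul`, `bool_add`, `bool_pow`, `binomial_side`) and the 3-vertex example are assumptions chosen for illustration, not part of the thesis.

```python
# Illustrative sketch: Boolean matrices stored as tuples of tuples of 0/1.
# All helper names are hypothetical.

def bool_mul(X, Y):
    """Boolean product: entry (i, j) is OR over k of (X[i][k] AND Y[k][j])."""
    n = len(X)
    return tuple(
        tuple(int(any(X[i][k] and Y[k][j] for k in range(n))) for j in range(n))
        for i in range(n)
    )

def bool_add(X, Y):
    """Entrywise OR, i.e. the Boolean sum of two matrices."""
    return tuple(tuple(x | y for x, y in zip(rx, ry)) for rx, ry in zip(X, Y))

def identity(n):
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))

def bool_pow(X, t):
    """t-th Boolean power, with the 0-th power taken to be the identity."""
    P = identity(len(X))
    for _ in range(t):
        P = bool_mul(P, X)
    return P

def binomial_side(S, D, t):
    """OR over i = 0..t of S^(t-i) * D^i (right-hand side of the expansion)."""
    n = len(S)
    acc = tuple(tuple(0 for _ in range(n)) for _ in range(n))
    for i in range(t + 1):
        acc = bool_add(acc, bool_mul(bool_pow(S, t - i), bool_pow(D, i)))
    return acc

# Assumed example: a 2-cycle on vertices 0 and 1, a self-loop on vertex 2,
# and DAG edges 0 -> 2 and 1 -> 2.
S = ((0, 1, 0), (1, 0, 0), (0, 0, 1))
D = ((0, 0, 1), (0, 0, 1), (0, 0, 0))
A = bool_add(S, D)

commutes = bool_mul(S, D) == bool_mul(D, S)
expansion_holds = all(bool_pow(A, t) == binomial_side(S, D, t) for t in range(1, 8))
print(commutes, expansion_holds)
```

The check only covers powers 1 to 7 of one small matrix, so it illustrates the identity rather than proving it.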
The result of Theorem 10.3.2 allows us to apply one of the known bounds, Propositions 10-13,
to each SCC, and to obtain a bound for the index of convergence of a Boolean matrix that,
contrary to Theorem 10.3.2, does not depend on the index of convergence of other matrices.
Corollary 10.3.3. Let $A \in \mathbb{B}^{n \times n}$, with $G \equiv G(A)$ its digraph representation. Let
$A = S \oplus D$ be its SD-decomposition, with $\{SCC_j\}_{j=1}^{l}$ the strongly connected components of $G(S)$ and
$G(D)$ be the DAG that corresponds to $G$ when we remove its SCCs. Further, let $k(SCC_i) \le k_i$
(a bound for the index of $SCC_i$), for $1 \le i \le l$, let $m = \prod_{i=1}^{l} k_i$, and let $d = |lp(G(D))| + 1$.
If $[S, D] = 0$ then
\[
k(A) \le
\begin{cases}
\min\{2^{(m+1)\cdot(d+1)} - 1,\; 2^{n^2}\} & \text{if } S, D \neq 0, \\
m & \text{if } D = 0, \\
d & \text{if } S = 0.
\end{cases}
\]
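As a reading aid, the case distinction of the bound can be written as a small function. This is a sketch under stated assumptions: the name `csdd_bound` and its two Boolean flags are hypothetical, and the caller is assumed to supply m and d already computed as in the statement.

```python
def csdd_bound(n, m, d, s_is_zero=False, d_is_zero=False):
    """Upper bound on k(A) following the three cases of the statement.

    n: matrix size.
    m: value attached to the SCC part (an lcm or a product of per-SCC values).
    d: size of the longest path of the DAG part, plus one.
    The flags select the degenerate cases D = 0 and S = 0; this sketch
    assumes that at most one of them is set.
    """
    if d_is_zero:
        return m
    if s_is_zero:
        return d
    # General case: the smaller of the CSDD bound and the number of
    # distinct n x n Boolean matrices.
    return min(2 ** ((m + 1) * (d + 1)) - 1, 2 ** (n * n))

print(csdd_bound(n=5, m=2, d=2))  # 2**9 - 1 = 511
```

Python integers are unbounded, so the expression stays exact even when the exponent runs into the hundreds.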
We can identify some classes of digraphs for which the bound from Theorem 10.3.2, approximated
by Corollary 10.3.3, is clearly better than the size of all possible powers, i.e., when
$2^{(m+1)\cdot(d+1)} - 1 \ll 2^{n^2}$.
• balanced directed r-trees ($r \ge 2$): $m = 0$, $d = \log_r n + 1$ and $k(A) \le 2^{\log_r n + 2} - 1 \le 4n - 1 = O(n)$;
• self-loop balanced directed r-trees with n vertices: $m = 1$, $d = \log_r n + 1$ and
$k(A) \le 2^{2(\log_r n + 2)} - 1 \le 16n^2 - 1 = O(n^2)$;
• DAGs with n vertices: $m = 0$, $d \le n + 1$ and $k(A) \le 2^{n+2} - 1 = O(2^n)$;
• self-loop DAGs with n vertices: $m = 1$, $d \le n + 1$ and $k(A) \le 2^{2(n+2)} - 1 = O(2^{2n})$;
• Digraphs with a giant SCC and an edge from a vertex in the giant SCC to each
vertex (outside the giant SCC) that only has a self-loop, see Figure 10.2(c):
$k(A) \in O\left(2^{\,n + g \cdot (\lfloor n/\delta \rfloor - 2)}\right)$.
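For the first class in the list above, the step from the exponential expression to a linear bound can be spelled out as follows. This is a worked version of the stated inequality; it assumes only $r \ge 2$ and $n \ge 1$.

```latex
\begin{align*}
k(A) &\le 2^{(m+1)\cdot(d+1)} - 1 \\
     &= 2^{(0+1)\cdot(\log_r n + 1 + 1)} - 1
        && (m = 0,\ d = \log_r n + 1) \\
     &= 4 \cdot 2^{\log_r n} - 1 \\
     &\le 4 \cdot 2^{\log_2 n} - 1 = 4n - 1
        && (r \ge 2 \ \Rightarrow\ \log_r n \le \log_2 n).
\end{align*}
```

The self-loop variant follows the same pattern with $m = 1$, which doubles the exponent and gives $16n^2 - 1$.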
10.4 Illustrative Examples
To illustrate the bounds of Theorem 10.3.2 and Corollary 10.3.3, we consider some examples
of Boolean matrices having a CSDD.
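Consider the family of matrices $\{A_1^i\}_{i=1}^{\infty}$, where $A_1^i \in \mathbb{B}^{(2i+1) \times (2i+1)}$, $i \in \mathbb{N}$, with CSDD
$S_1^i \oplus D_1^i$, as the adjacency matrices of the digraphs depicted in Figure 10.2 (a):
\[
A_1^i = S_1^i \oplus D_1^i, \quad
S_1^i =
\begin{bmatrix}
B &        &   &   \\
  & \ddots &   &   \\
  &        & B &   \\
  &        &   & 1
\end{bmatrix},
\quad
D_1^i =
\begin{bmatrix}
0 &        &   & 1      \\
  & \ddots &   & \vdots \\
  &        & 0 & 1      \\
0 & \cdots & 0 & 0
\end{bmatrix},
\quad \text{where }
B =
\begin{bmatrix}
0 & 1 \\
1 & 0
\end{bmatrix}.
\]
The digraph $G(A_1^i)$ has $n = 2i + 1$ vertices and $i + 1$ SCCs: one self-loop and $i$ cycles
of size 2. Hence, using Theorem 10.3.2, we have that $m = 1 + 1$, using Proposition 10, and
$d = 1 + 1$, which implies that $k(A_1^i) \le 2^{(2+1)(2+1)} - 1 = 2^9 - 1 = 511$, instead of the known
bound of $2^{(2i+1)^2}$.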
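To see how this family behaves in practice, one can build the matrices for small i and follow the sequence of Boolean powers until it repeats. The sketch below is illustrative only: `family_matrix`, `bool_mul` and `index_and_period` are hypothetical helpers, and the function reports the first repeated power and the cycle length of the power sequence, which may differ from the exact definition of k(A) used earlier in the thesis.

```python
def family_matrix(i):
    """Adjacency matrix on 2i+1 vertices: i two-cycles, a self-loop on the
    last vertex, and an edge from every other vertex to the last vertex."""
    n = 2 * i + 1
    A = [[0] * n for _ in range(n)]
    for c in range(i):
        a, b = 2 * c, 2 * c + 1
        A[a][b] = A[b][a] = 1       # block B: a cycle of size 2
    for v in range(n - 1):
        A[v][n - 1] = 1             # DAG part: edge into the last vertex
    A[n - 1][n - 1] = 1             # the single self-loop
    return tuple(map(tuple, A))

def bool_mul(X, Y):
    """Boolean product: entry (i, j) is OR over k of (X[i][k] AND Y[k][j])."""
    n = len(X)
    return tuple(
        tuple(int(any(X[r][k] and Y[k][c] for k in range(n))) for c in range(n))
        for r in range(n)
    )

def index_and_period(A):
    """Return (k, p) with k >= 1 smallest such that A^k == A^(k+p), p >= 1."""
    seen = {}
    P, t = A, 1
    while P not in seen:
        seen[P] = t                 # remember the exponent of each new power
        P = bool_mul(P, A)
        t += 1
    return seen[P], t - seen[P]

for i in (1, 2, 3):
    print(i, index_and_period(family_matrix(i)))
```

For these small cases the power sequence alternates between two matrices, so the number of distinct powers is far below the value 511 given by the bound.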
Figure 10.2: A family of digraphs $\{G(A_1^i)\}_{i=1}^{\infty}$ in (a); digraph $G(A_2)$ in (b); and digraph $G(A_3)$
in (c). In (a), the SCCs are represented by the red edges, and each cycle is an SCC. In (b) and (c),
the SCCs are represented by different (non-black) colors, one per SCC. The DAG of each
digraph is represented by the black edges.
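Now, consider the matrix $A_2$ with SD-decomposition $S_2 \oplus D_2$:
\[
A_2 =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix},
\]
\[
S_2 =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}
\quad \text{and} \quad
D_2 =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & 1 & 0
\end{bmatrix}.
\]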
In this example, $G(A_2)$ has two SCCs. They are depicted by the red edges, $SCC_1$, and by
the blue edges, $SCC_2$, of Figure 10.2 (b), corresponding to $G(S_2)$. The black edges of
Figure 10.2 (b) represent the DAG subgraph of $G(A_2)$, $G(D_2)$. The girth of both SCCs is 1
because both have a self-loop, a cycle of size 1. Hence, by Proposition 11,
$k(SCC_1) \le 2 \times 1 = 2$ and $k(SCC_2) \le 5 \times 1 = 5$. Moreover, the longest path of $G(D_2)$ has size
$d = 1$. By Corollary 10.3.3, we have that $k(A_2) \le 2^{(2 \times 5 + 1) \times (1+1)} - 1 = 4\,194\,303$, much smaller
(more than 10 000 times smaller) than the obvious bound of $2^{6^2} = 68\,719\,476\,736$. Last, we consider
the 5-bus electrical power system [Ramos et al., 2013], depicted in Figure 10.3, whose
dynamics state matrix has the digraph representation of Figure 10.2 (c).
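The properties claimed for the second example can be checked directly from the printed matrices. The sketch below is a hedged illustration: the helper names (`parse`, `bool_mul`, `bool_add`) are hypothetical, and the rows are transcribed from the matrices $A_2$, $S_2$ and $D_2$ as printed.

```python
def parse(rows):
    """Turn strings of 0/1 characters into a tuple-of-tuples Boolean matrix."""
    return tuple(tuple(int(ch) for ch in row) for row in rows)

def bool_mul(X, Y):
    """Boolean product: entry (i, j) is OR over k of (X[i][k] AND Y[k][j])."""
    n = len(X)
    return tuple(
        tuple(int(any(X[i][k] and Y[k][j] for k in range(n))) for j in range(n))
        for i in range(n)
    )

def bool_add(X, Y):
    """Entrywise OR of two Boolean matrices."""
    return tuple(tuple(x | y for x, y in zip(rx, ry)) for rx, ry in zip(X, Y))

A2 = parse(["1000010", "1100100", "1110111", "0100100",
            "1101010", "0101110", "0111111"])
S2 = parse(["1000010", "1100100", "0010001", "0100100",
            "1101010", "0101110", "0010001"])
D2 = parse(["0000000", "0000000", "1100110", "0000000",
            "0000000", "0000000", "0101110"])
ZERO = parse(["0000000"] * 7)

is_decomposition = bool_add(S2, D2) == A2            # A2 = S2 (+) D2
is_commutative = bool_mul(S2, D2) == bool_mul(D2, S2)  # [S2, D2] = 0
dag_paths_have_size_one = bool_mul(D2, D2) == ZERO   # no walk of size 2 in G(D2)

# Bound with m = 2 * 5 and d = 1, as in the text.
bound = 2 ** ((2 * 5 + 1) * (1 + 1)) - 1
print(is_decomposition, is_commutative, dag_paths_have_size_one, bound)
```

The script confirms that the printed decomposition is commutative and that the numeric value of the bound is 4 194 303; it does not compute the exact index of $A_2$.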
Figure 10.3: Graph representation of a 5-bus power system.
The SD-decomposition, $A_3 = S_3 \oplus D_3$, is commutative and $G(A_3)$ has $l = 3$ SCCs, see
Figure 10.2 (c). The girth and cyclicity of $G(S_3)$ are $g = 2$ and $\delta = 1$, respectively.
Thus, using Proposition 13 with Corollary 10.3.3, we have that $m = 16 + 2 \times (16 - 2) = 44$, $d = 1$
and $k(A_3) \le 2^{(44+1) \times (1+1)} - 1 = 2^{90} - 1$, instead of the previously known bound of $2^{18^2} = 2^{324}$.
10.5 Concluding Remarks
In this chapter, we explored the index of convergence of Boolean matrices, and we presented
a new bound for the class of Boolean matrices that have a CSDD. Our result applies
previously known bounds to each of the constituents of the commutative SD-decomposition.
Future work includes determining the exact proportion of size
n square Boolean matrices that have a CSDD, and finding more important classes of digraphs
for which the new bound is helpful.
Chapter 11
Conclusions and Future Work
The main goal of this dissertation was to study information security aspects in ranking,
recommender, and control systems, and to develop systems in these classes that are more
robust to attacks. We used information theory (in particular, Kolmogorov
complexity theory), graph theory, clustering, collaborative filtering, and control
systems theory as valuable tools to approach the fields of ranking and recommender systems.
We improved the state-of-the-art in both fields by designing systems that are more robust to
attacks and noise, and that are more efficient in terms of computational complexity.
Moreover, we used ideas from control systems theory and applied them in the context of
ranking systems. We studied the bribing effect in ranking systems and the optimal and
profitable strategies to attack them.
Part I of this dissertation used information theory and ideas from control systems
(the subject of Part II) in the context of e-commerce. In Chapter 3, we created a multipartite
reputation-based ranking system that clusters users using similarity measures, based on users' tastes.
The system is more personalized to each user. Further, we proved the convergence and efficiency
of the system, and we tested it against state-of-the-art ranking systems, obtaining more
robustness to noise and attacks. In Chapter 4, we studied the effect of bribing in ranking
systems, we calculated the optimal bribing strategies, and we tested the resistance to bribery of
bipartite ranking systems versus multipartite ranking systems, the latter being more resistant
than the former.
As future work, we would like to evaluate the personalization perspective of our ranking
system, i.e., to measure how much better the proposed ranking system produces items' rankings
for each user when compared to bipartite ranking systems. Hence, we would like to measure,
through different metrics, whether our ranking generates preferences that are close to the individual
preferences. Also, we would like to consider a game theory model with the sellers as players, to
study the interactions between bigger and smaller players (big and small companies), and the
scenario where sellers can bribe users not only to increase, but also to decrease, a competitor
item's ranking. Moreover, it would be interesting to study the optimal bribing strategies in
the cases where the reputations are dynamically computed (change when the ratings change),
as in the ranking system we propose in Chapter 3.
By exploring the similarity measures that we proposed, we introduced, in Chapter 5, a
recommender system that is highly parallelizable. Besides, the KS similarity measure, which
we introduced to compute how related each pair of users, and each pair of items, is, has optimal
time complexity. By evaluating its performance with synthetic and real datasets, we obtained
results similar to, and sometimes better than, the state-of-the-art recommender systems. In
Chapter 6, we presented a novel group recommender system, by exploring the KS similarity
measure that we proposed in the grouping phase. When evaluated with real data, our method
revealed results that are statistically the same as the ones obtained with the standardly used
similarity, the Pearson correlation. However, a big advantage of our similarity measure is
that it achieves measurable gains in terms of time complexity. In the future, we would like to
provide a description of the groups that we generate, so that the companies/sellers know what
characterizes users in terms of their preferences. Also, it would be interesting to study the
optimal bribing strategies such that the bribing company can push the recommendation order
of their products to the top-N recommendation list presented to users or groups of users.
In Part II of this manuscript, we studied the security aspects of control of both linear
time-invariant (LTI) systems and switched LTI systems. Chapter 8 addresses the robust minimal
controllability problem (rMCP) in the context of LTI systems. The problem consists in finding
a smallest set of state variables that we need to actuate such that the underlying
dynamical system is controllable, even in the event of failure of a subset of controllers, for
instance, due to an external agent attacking the system. We characterized the exact solutions
to this problem. We showed that the problem is NP-complete and provided approximation
algorithms with polynomial complexity to solve it. In Chapter 9, we extended the results of
Chapter 8 to switched LTI systems. Under this scenario, we identified two relevant minimal
controllability problems, depending on whether we want to place a fixed set of actuators or one
set of actuators for each mode of the system. Again, we provided approximation algorithms
to solve both versions of the problem. Notwithstanding, the approximation algorithms may have
polynomial complexity only for the second version of the problem. Future research
includes exploring the structure of the inputs and considering methods, such as coordinate
gradient descent, to minimize an energy cost.
Figure 11.1: Future work directions.
Moreover, it would be interesting to find solutions for the rMCP of both Chapter 8 and
Chapter 9 without the assumption about the eigenvalues of the state matrices. Another inter-
esting avenue for future research is to model the dynamics of ranking systems, or recommender
systems, by means of a dynamical system. By doing this, we may study the smallest subset
of users that a company/seller needs to bribe in order to control the ranking of their products
or of competitor products, or the order of the appearance of an item in a recommendation
list. These future research lines correspond to the dashed edges of Figure 11.1.
Last, because the area of structural control theory is strongly connected with Boolean
matrices and graph theory, we used ideas from the first area and explored a digraph
decomposition to provide a novel and more general bound for the index of convergence of Boolean
matrices. In the future, we would like to explore more digraph decompositions to extend
our new bound.
Bibliography
[Aggarwal, 2016] Aggarwal, C. C. (2016). Recommender Systems. Springer.
[Aguilar and Gharesifard, 2014] Aguilar, C. and Gharesifard, B. (2014). Graph controllabil-
ity classes for the laplacian leader-follower dynamics. IEEE Transactions on Automatic
Control, PP(99):1–1.
[Akian et al., 2006] Akian, M., Bapat, R., and Gaubert, S. (2006). Max-plus algebra. Hand-
book of linear algebra. Chapman and Hall, London.
[Altman, 1992] Altman, N. S. (1992). An introduction to kernel and nearest-neighbor non-
parametric regression. The American Statistician, 46(3):175–185.
[Apt and Markakis, 2014] Apt, K. R. and Markakis, E. (2014). Social networks with compet-
ing products. Fundamenta Informaticae, 129(3):225–250.
[Ardissono et al., 2003] Ardissono, L., Goy, A., Petrone, G., Segnan, M., and Torasso, P.
(2003). Intrigue: Personalized recommendation of tourist attractions for desktop and hand
held devices. Applied Artificial Intelligence, 17(8-9):687–714.
[Bach, 2011] Bach, F. (2011). Learning with Submodular Functions: A Convex Optimization
Perspective. ArXiv e-prints.
[Balzano et al., 2012] Balzano, L., Eriksson, B., and Nowak, R. (2012). High rank matrix
completion and subspace clustering with missing data. In Proceedings of the conference on
Artificial Intelligence and Statistics (AIStats).
[Bennett et al., 2007] Bennett, J., Lanning, S., et al. (2007). The netflix prize. In Proceedings
of KDD cup and workshop, volume 2007, page 35. New York, NY, USA.
[Bickart and Schindler, 2001] Bickart, B. and Schindler, R. M. (2001). Internet forums as
influential sources of consumer information. Journal of interactive marketing, 15(3):31–40.
[Bollobas, 2012] Bollobas, B. (2012). Graph theory: an introductory course, volume 63.
Springer Science & Business Media.
[Bollobas, 2013] Bollobas, B. (2013). Modern graph theory, volume 184. Springer Science &
Business Media.
[Boratto and Carta, 2011] Boratto, L. and Carta, S. (2011). State-of-the-Art in Group Rec-
ommendation and New Approaches for Automatic Identification of Groups, pages 1–20.
Springer Berlin Heidelberg, Berlin, Heidelberg.
[Bronnimann and Goodrich, 1995] Bronnimann, H. and Goodrich, M. T. (1995). Almost opti-
mal set covers in finite VC-dimension. Discrete & Computational Geometry, 14(4):463–479.
[Candes and Tao, 2010] Candes, E. J. and Tao, T. (2010). The power of convex relaxation:
Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5):2053–
2080.
[Case, 2016] Case, D. U. (2016). Analysis of the cyber attack on the Ukrainian power grid.
Electricity Information Sharing and Analysis Center (E-ISAC).
[Chapman and Mesbahi, 2014] Chapman, A. and Mesbahi, M. (2014). On symmetry and
controllability of multi-agent systems. In 53rd IEEE Conference on Decision and Control.
[Charron-Bost et al., 2011] Charron-Bost, B., Fugger, M., Welch, J., and Widder, J. (2011).
Full reversal routing as a linear dynamical system. Structural Information and Communi-
cation Complexity, pages 101–112.
[Chekuri et al., 2012] Chekuri, C., Clarkson, K. L., and Har-Peled, S. (2012). On the set
multicover problem in geometric settings. ACM Trans. Algorithms, 9(1):9:1–9:17.
[Chen et al., 2015] Chen, Y., Kar, S., and Moura, J. M. F. (2015). Dynamic Attack Detection
in Cyber-Physical Systems with Side Initial State Information. ArXiv e-prints.
[Cheng, 2005] Cheng, D. (2005). Controllability of switched bilinear systems. IEEE Trans-
actions on Automatic Control, 50(4):511–515.
[Cheng, 2011] Cheng, D. (2011). Disturbance decoupling of Boolean control networks. IEEE
Transactions on Automatic Control, 56(1):2–10.
[Chevalier and Mayzlin, 2006] Chevalier, J. A. and Mayzlin, D. (2006). The effect of word of
mouth on sales: Online book reviews. Journal of marketing research, 43(3):345–354.
[Cialdini and Garde, 1987] Cialdini, R. B. and Garde, N. (1987). Influence, volume 3. A.
Michel.
[Cormen et al., 2001] Cormen, T. H., Stein, C., Rivest, R. L., and Leiserson, C. E. (2001).
Introduction to Algorithms. McGraw-Hill Higher Education, 2nd edition.
[Cover and Thomas, 2012] Cover, T. M. and Thomas, J. A. (2012). Elements of information
theory. John Wiley & Sons.
[Davis and Khazanchi, 2008] Davis, A. and Khazanchi, D. (2008). An empirical study of
online word of mouth as a predictor for multi-product category e-commerce sales. Electronic
Markets, 18(2):130–141.
[De Kerchove and Van Dooren, 2010] De Kerchove, C. and Van Dooren, P. (2010). Itera-
tive filtering in reputation systems. SIAM Journal on Matrix Analysis and Applications,
31(4):1812–1834.
[De Maeyer, 2012] De Maeyer, P. (2012). Impact of online consumer reviews on sales and
price strategies: A review and directions for future research. Journal of Product &
Brand Management, 21(2):132–139.
[De Pessemier et al., 2015] De Pessemier, T., Dhondt, J., Vanhecke, K., and Martens, L.
(2015). Travelwithfriends: a hybrid group recommender system for travel destinations. In
Proceedings of the Workshop on Tourism Recommender Systems, in conjunction with the
9th ACM Conference on Recommender Systems, pages 51–60.
[Delic et al., 2016] Delic, A., Neidhardt, J., Nguyen, T. N., Ricci, F., Rook, L., Werthner, H.,
and Zanker, M. (2016). Observing group decision making processes. In Proceedings of the
10th ACM Conference on Recommender Systems, RecSys ’16, pages 147–150, New York,
NY, USA. ACM.
[Dellarocas et al., 2007] Dellarocas, C., Zhang, X. M., and Awad, N. F. (2007). Exploring the
value of online product reviews in forecasting sales: The case of motion pictures. Journal
of Interactive marketing, 21(4):23–45.
[Demmel et al., 2000] Demmel, J., Dongarra, J., Ruhe, A., and van der Vorst, H. (2000).
Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Society
for Industrial and Applied Mathematics, Philadelphia, PA, USA.
[Denardo, 1977] Denardo, E. V. (1977). Periods of connected networks and powers of non-
negative matrices. Mathematics of Operations Research, 2(1):20–24.
[Dion et al., 2003] Dion, J.-M., Commault, C., and der Woude, J. V. (2003). Generic prop-
erties and control of linear structured systems: a survey. Automatica, pages 1125–1144.
[Dulmage et al., 1964] Dulmage, A. L., Mendelsohn, N. S., et al. (1964). Gaps in the exponent
set of primitive matrices. Illinois Journal of Mathematics, 8(4):642–656.
[Egerstedt, 2011] Egerstedt, M. (2011). Complex networks: Degrees of control. Nature,
473(7346):158–159.
[Egerstedt et al., 2012] Egerstedt, M., Martini, S., Cao, M., Camlibel, K., and Bicchi, A.
(2012). Interacting with networks: How does structure relate to controllability in single-
leader, consensus networks? Control Systems Magazine, 32(4):66 – 73.
[Fawzi et al., 2012] Fawzi, H., Tabuada, P., and Diggavi, S. (2012). Secure estimation and
control for cyber-physical systems under adversarial attacks. ArXiv e-prints.
[Fletcher et al., 2012] Fletcher, G. H., Gyssens, M., Leinders, D., Van den Bussche, J.,
Van Gucht, D., Vansummeren, S., and Wu, Y. (2012). The impact of transitive closure
on the boolean expressiveness of navigational query languages on graphs. In FoIKS, pages
124–143. Springer.
[Flick and Morehouse, 2010] Flick, T. and Morehouse, J. (2010). Securing the smart grid:
next generation power grid security. Elsevier.
[Forman et al., 2008] Forman, C., Ghose, A., and Wiesenfeld, B. (2008). Examining the
relationship between reviews and sales: The role of reviewer identity disclosure in electronic
markets. Information Systems Research, 19(3):291–313.
[Fouss et al., 2007] Fouss, F., Pirotte, A., Renders, J.-M., and Saerens, M. (2007). Random-
walk computation of similarities between nodes of a graph with application to collaborative
recommendation. IEEE Trans. on Knowl. and Data Eng., 19(3):355–369.
[Ganti et al., 2015] Ganti, R. S., Balzano, L., and Willett, R. (2015). Matrix completion under
monotonic single index models. In Advances in Neural Information Processing Systems,
pages 1873–1881.
[Gao et al., 2010] Gao, Y., Lu, H., Seki, S., and Yu, S. (2010). Developments in Language
Theory: 14th International Conference, DLT 2010, London, ON, Canada, August 17-20,
2010, Proceedings, volume 6224. Springer.
[Garey and Johnson, 1979] Garey, M. R. and Johnson, D. S. (1979). Computers and In-
tractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New
York, NY, USA.
[Gavalec, 1997] Gavalec, M. (1997). Computing matrix period in max-min algebra. Discrete
Applied Mathematics, 75(1):63–70.
[George and Merugu, 2005] George, T. and Merugu, S. (2005). A scalable collaborative filter-
ing framework based on co-clustering. In Data Mining, Fifth IEEE international conference
on, pages 4–pp. IEEE.
[Goren-Bar and Glinansky, 2004] Goren-Bar, D. and Glinansky, O. (2004). Fit-recommending
TV programs to family members. Computers & Graphics, 28(2):149–156.
[Grandi and Turrini, 2016] Grandi, U. and Turrini, P. (2016). A network-based rating sys-
tem and its resistance to bribery. In Proceedings of the Twenty-Fifth International Joint
Conference on Artificial Intelligence (IJCAI-16).
[Hartigan and Wong, 1979] Hartigan, J. A. and Wong, M. A. (1979). Algorithm AS 136: A
k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied
Statistics), 28(1):100–108.
[Hespanha, 2009] Hespanha, J. P. (2009). Linear Systems Theory. Princeton Press, Princeton,
New Jersey.
[Hodges, 2012] Hodges, A. (2012). Alan Turing: the enigma. Random House.
[Holmes et al., 2007] Holmes, M., Gray, A., and Isbell, C. (2007). Fast svd for large-scale
matrices. In Workshop on Efficient Machine Learning at NIPS, volume 58, pages 249–252.
[Hopcroft and Tarjan, 1973] Hopcroft, J. and Tarjan, R. (1973). Algorithm 447: efficient
algorithms for graph manipulation. Communications of the ACM, 16(6):372–378.
[Hu et al., 2012] Hu, N., Bose, I., Koh, N. S., and Liu, L. (2012). Manipulation of online
reviews: An analysis of ratings, readability, and sentiments. Decision Support Systems,
52(3):674–684.
[Hu et al., 2006] Hu, N., Pavlou, P. A., and Zhang, J. (2006). Can online reviews reveal a
product’s true quality?: empirical findings and analytical modeling of online word-of-mouth
communication. In Proceedings of the 7th ACM conference on Electronic commerce, pages
324–330. ACM.
[Hua et al., 2010] Hua, Q.-S., Wang, Y., Yu, D., and Lau, F. C. (2010). Dynamic program-
ming based algorithms for set multicover and multiset multicover problems. Theoretical
Computer Science, 411(26–28):2467 – 2474.
[Hua et al., 2009] Hua, Q.-S., Yu, D., Lau, F. C. M., and Wang, Y. (2009). Proceedings
of the Algorithms and Computation: 20th International Symposium, ISAAC 2009, Hon-
olulu, Hawaii, USA, chapter Exact Algorithms for Set Multicover and Multiset Multicover
Problems, pages 34–44. Springer Berlin Heidelberg, Berlin, Heidelberg.
[Hug, 2017] Hug, N. (2017). Surprise, a Python library for recommender systems.
http://surpriselib.com.
[Jain and Dubes, 1988] Jain, A. K. and Dubes, R. C. (1988). Algorithms for clustering data.
[Jameson and Smyth, 2007] Jameson, A. and Smyth, B. (2007). Recommendation to groups.
In The Adaptive Web, Methods and Strategies of Web Personalization, volume 4321 of
Lecture Notes in Computer Science, pages 596–627. Springer, Berlin.
[Ji et al., 2007] Ji, Z., Wang, L., and Guo, X. (2007). Design of switching sequences for
controllability realization of switched linear systems. Automatica, 43(4):662–668.
[Jurczyk and Agichtein, 2007] Jurczyk, P. and Agichtein, E. (2007). Discovering authorities
in question answer communities by using link analysis. In Proceedings of the sixteenth
ACM conference on Conference on information and knowledge management, pages 919–
922. ACM.
[Katz et al., 2012] Katz, R. D., Schneider, H., et al. (2012). On commuting matrices in
max algebra and in classical nonnegative algebra. Linear Algebra and its Applications,
436(2):276–292.
[Kendall, 1938] Kendall, M. G. (1938). A new measure of rank correlation. Biometrika,
30(1/2):81–93.
[Kibangou and Commault, 2014] Kibangou, A. Y. and Commault, C. (2014). Observability
in connected strongly regular graphs and distance regular graphs. IEEE Transactions on
Control of Network Systems.
[Kietzmann and Canhoto, 2013] Kietzmann, J. and Canhoto, A. (2013). Bittersweet! under-
standing and managing electronic word of mouth. Journal of Public Affairs, 13(2):146–159.
[Kim, 1979] Kim, K.-H. (1979). An extension of the Dulmage-Mendelsohn theorem. Linear
Algebra and its Applications, 27:187–197.
[Kleinberg, 1999] Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environ-
ment. Journal of the ACM (JACM), 46(5):604–632.
[Koren, 2008] Koren, Y. (2008). Factorization meets the neighborhood: a multifaceted collab-
orative filtering model. In Proceedings of the 14th ACM SIGKDD international conference
on Knowledge discovery and data mining, pages 426–434. ACM.
[Koren, 2010] Koren, Y. (2010). Factor in the neighbors: Scalable and accurate collaborative
filtering. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(1):1.
[Koren and Bell, 2015] Koren, Y. and Bell, R. M. (2015). Advances in collaborative filtering.
In Ricci, F., Rokach, L., and Shapira, B., editors, Recommender Systems Handbook, pages
77–118. Springer.
[Kreyszig, 1989] Kreyszig, E. (1989). Introductory functional analysis with applications,
volume 1. Wiley, New York.
[Langner, 2011] Langner, R. (2011). Robust Control System Networks: How to Achieve Reli-
able Control After Stuxnet. Momentum Press.
[Le Gall, 2014] Le Gall, F. (2014). Powers of tensors and fast matrix multiplication. In
Proceedings of the 39th international symposium on symbolic and algebraic computation,
pages 296–303. ACM.
[Lee and Seung, 2001] Lee, D. D. and Seung, H. S. (2001). Algorithms for non-negative matrix
factorization. In Advances in neural information processing systems, pages 556–562.
[Lee Rodgers and Nicewander, 1988] Lee Rodgers, J. and Nicewander, W. A. (1988). Thir-
teen ways to look at the correlation coefficient. The American Statistician, 42(1):59–66.
[Lemire and Maclachlan, 2005] Lemire, D. and Maclachlan, A. (2005). Slope one predictors
for online rating-based collaborative filtering. In Proceedings of the 2005 SIAM Interna-
tional Conference on Data Mining, pages 471–475. SIAM.
[Leskovec and Krevl, 2014] Leskovec, J. and Krevl, A. (2014). SNAP Datasets: Stanford
large network dataset collection. http://snap.stanford.edu/data.
[Li et al., 2004] Li, M., Chen, X., Li, X., Ma, B., and Vitanyi, P. M. (2004). The similarity
metric. IEEE transactions on Information Theory, 50(12):3250–3264.
[Li and Shao, 1993] Li, Q. and Shao, J. (1993). The index set problem for Boolean (or
nonnegative) matrices. Discrete Mathematics, 123(1):75–92.
[Li et al., 2012] Li, R.-H., Yu, J. X., Huang, X., and Cheng, H. (2012). Robust reputation-
based ranking on bipartite rating networks. In SDM, volume 12, pages 612–623. SIAM.
[Lin, 2012] Lin, A. W. (2012). Weakly-synchronized ground tree rewriting. In International
Symposium on Mathematical Foundations of Computer Science, pages 630–642. Springer.
[Lin and Antsaklis, 2009] Lin, H. and Antsaklis, P. J. (2009). Stability and stabilizability
of switched linear systems: a survey of recent results. IEEE Transactions on Automatic
control, 54(2):308–322.
[Liu and Wang, 2007] Liu, H. and Wang, B. (2007). An association rule mining algorithm
based on a Boolean matrix. Data Science Journal, 6:S559–S565.
[Liu et al., 2013] Liu, X., Lin, H., and Chen, B. M. (2013). Structural controllability of
switched linear systems. Automatica, 49(12):3531–3537.
[Liu et al., 2015] Liu, X., Pequito, S., Kar, S., Sinopoli, B., and Aguiar, A. P. (2015). Mini-
mum Sensor Placement for Robust Observability of Structured Complex Networks. ArXiv
e-prints. arXiv:1507.07205.
[MacQueen et al., 1967] MacQueen, J. et al. (1967). Some methods for classification and
analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on
mathematical statistics and probability, volume 1, pages 281–297. Oakland, CA, USA.
[Mahajan et al., 2009] Mahajan, M., Nimbhorkar, P., and Varadarajan, K. (2009). The planar
k-means problem is np-hard. In International Workshop on Algorithms and Computation,
pages 274–285. Springer.
[Maslowska et al., 2017] Maslowska, E., Malthouse, E. C., and Bernritter, S. F. (2017). The
effect of online customer reviews’ characteristics on sales. In Advances in Advertising Re-
search (Vol. VII), pages 87–100. Springer.
[Masthoff, 2015] Masthoff, J. (2015). Group recommender systems: Aggregation, satisfaction
and group attributes. In Ricci, F., Rokach, L., and Shapira, B., editors, Recommender
Systems Handbook, pages 743–776. Springer.
[McAuley et al., 2015] McAuley, J., Pandey, R., and Leskovec, J. (2015). Inferring networks
of substitutable and complementary products. In Proceedings of the 21th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, pages 785–794. ACM.
[McCarthy and Anagnost, 1998] McCarthy, J. F. and Anagnost, T. D. (1998). Musicfx: An
arbiter of group preferences for computer supported collaborative workouts. In Poltrock,
S. E. and Grudin, J., editors, CSCW ’98, Proceedings of the ACM 1998 Conference on
Computer Supported Cooperative Work, Seattle, WA, USA, November 14-18, 1998, pages
363–372. ACM.
[McCarthy et al., 2006] McCarthy, K., Salamo, M., Coyle, L., McGinty, L., Smyth, B., and
Nixon, P. (2006). Cats: A synchronous approach to collaborative group recommendation.
In Sutcliffe, G. and Goebel, R., editors, Proceedings of the Nineteenth International Florida
Artificial Intelligence Research Society Conference, Melbourne Beach, Florida, USA, May
11-13, 2006, pages 86–91. AAAI Press.
[Merlet et al., 2017] Merlet, G., Nowak, T., and Sergeev, S. (2017). On the tightness of
bounds for transients and weak csr expansions in max-plus algebra. arXiv preprint
arXiv:1705.04104.
[Ming and Vitanyi, 1990] Ming, L. and Vitanyi, P. M. (1990). Kolmogorov complexity and
its applications. In Algorithms and Complexity, pages 187–254. Elsevier.
[Mizzaro, 2003] Mizzaro, S. (2003). Quality control in scholarly publishing: A new proposal.
Journal of the American Society for Information Science and Technology, 54(11):989–1005.
[Mnih and Salakhutdinov, 2008] Mnih, A. and Salakhutdinov, R. R. (2008). Probabilistic
matrix factorization. In Advances in neural information processing systems, pages 1257–
1264.
[Nabi-Abdolyousefi and Mesbahi, 2013] Nabi-Abdolyousefi, M. and Mesbahi, M. (2013). On
the controllability properties of circulant networks. IEEE Transactions on Automatic Con-
trol, 58(12):3179–3184.
[Ning et al., 2015] Ning, X., Desrosiers, C., and Karypis, G. (2015). A comprehensive survey
of neighborhood-based recommendation methods. In Ricci, F., Rokach, L., and Shapira,
B., editors, Recommender Systems Handbook, pages 37–76. Springer.
[Notarstefano and Parlangeli, 2013] Notarstefano, G. and Parlangeli, G. (2013). Controlla-
bility and observability of grid graphs via reduction and symmetries. IEEE Transactions
on Automatic Control, 58(7):1719–1731.
[Ntoutsi et al., 2012] Ntoutsi, E., Stefanidis, K., Nørvag, K., and Kriegel, H.-P. (2012). Fast
group recommendations by applying user clustering. In Atzeni, P., Cheung, D. W., and
Ram, S., editors, Conceptual Modeling - 31st International Conference ER 2012, Florence,
Italy, October 15-18, 2012. Proceedings, volume 7532 of Lecture Notes in Computer Science,
pages 126–140. Springer.
[O’Connor et al., 2001] O’Connor, M., Cosley, D., Konstan, J. A., and Riedl, J. (2001).
PolyLens: A recommender system for groups of users. In Prinz, W., Jarke, M., Rogers,
Y., Schmidt, K., and Wulf, V., editors, Proceedings of the Seventh European Conference
on Computer Supported Cooperative Work, 16-20 September 2001, Bonn, Germany, pages
199–218. Kluwer.
[Ogata, 2001] Ogata, K. (2001). Modern Control Engineering. Prentice Hall PTR, Upper
Saddle River, NJ, USA, 4th edition.
[Ogata and Yang, 1970] Ogata, K. and Yang, Y. (1970). Modern control engineering.
[Olshevsky, 2014] Olshevsky, A. (2014). Minimal controllability problems. IEEE Transactions
on Control of Network Systems, 1(3):249–258.
[O’Mahony, 1986] O’Mahony, M. (1986). Sensory evaluation of food: statistical methods and
procedures, volume 16. CRC Press.
[Pan and Chen, 1999] Pan, V. Y. and Chen, Z. Q. (1999). The complexity of the matrix
eigenproblem. In Proceedings of the thirty-first annual ACM symposium on Theory of
computing, STOC ’99, pages 507–516, New York, NY, USA. ACM.
[Parlangeli and Notarstefano, 2012] Parlangeli, G. and Notarstefano, G. (2012). On the
reachability and observability of path and cycle graphs. IEEE Transactions on Automatic
Control, 57(3):743–748.
[Pasqualetti and Zampieri, 2014] Pasqualetti, F. and Zampieri, S. (2014). On the controlla-
bility of isotropic and anisotropic networks. In 53rd IEEE Conference on Decision and
Control.
[Pasqualetti et al., 2014] Pasqualetti, F., Zampieri, S., and Bullo, F. (2014). Controllability
metrics, limitations and algorithms for complex networks. IEEE Transactions on Control
of Network Systems, 1(1):40–52.
[Pequito et al., 2016a] Pequito, S., Kar, S., and Aguiar, A. P. (2016a). A framework for
structural input/output and control configuration selection of large-scale systems. IEEE
Transactions on Automatic Control, 61(2):303–318.
[Pequito and Pappas, 2017] Pequito, S. and Pappas, G. J. (2017). Structural minimum con-
trollability problem for switched linear continuous-time systems. Automatica, 78:216–222.
[Pequito et al., 2016b] Pequito, S., Ramos, G., Kar, S., Aguiar, A. P., and Ramos, J. (2016b).
The robust minimal controllability problem. arXiv preprint arXiv:1401.4209.
[Pequito et al., 2017] Pequito, S., Ramos, G., Kar, S., Aguiar, A. P., and Ramos, J. (2017).
The robust minimal controllability problem. Automatica, 82:261–268.
[Petreczky et al., 2015] Petreczky, M., Tanwani, A., and Trenn, S. (2015). Observability of
switched linear systems. In Hybrid Dynamical Systems, pages 205–240. Springer.
[Prosser, 1959] Prosser, R. T. (1959). Applications of Boolean matrices to the analysis of flow
diagrams. In Papers presented at the December 1-3, 1959, eastern joint IRE-AIEE-ACM
computer conference, pages 133–138. ACM.
[Rahmani et al., 2009] Rahmani, A., Ji, M., Mesbahi, M., and Egerstedt, M. (2009). Con-
trollability of multi-agent systems from a graph-theoretic perspective. SIAM Journal on
Control and Optimization, 48(1):162–186.
[Ramos et al., 2018a] Ramos, G., Borrato, L., and Caleiro, C. (2018a). A novel similarity
metric for group recommender systems with optimal time complexity. Submitted to
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining. ACM.
[Ramos and Caleiro, 2018] Ramos, G. and Caleiro, C. (2018). On the index of convergence
of Boolean matrices with commutative sd-decomposition. Submitted to Discrete Applied
Mathematics.
[Ramos et al., 2015] Ramos, G., Pequito, S., Aguiar, A. P., and Kar, S. (2015). Analysis and
design of electric power grids with p-robustness guarantees using a structural hybrid system
approach. In Control Conference (ECC), 2015 European, pages 3542–3547. IEEE.
[Ramos et al., 2013] Ramos, G., Pequito, S., Aguiar, A. P., Ramos, J., and Kar, S. (2013).
A model checking framework for linear time invariant switching systems using structural
systems analysis. In Communication, Control, and Computing (Allerton), 2013 51st Annual
Allerton Conference on, pages 973–980. IEEE.
[Ramos et al., 2018b] Ramos, G., Pequito, S., and Caleiro, C. (2018b). The robust minimal
controllability problem for switched linear continuous-time systems. In American Control
Conference (ACC), 2018.
[Ramos et al., 2018c] Ramos, G., Saude, J., Borrato, L., and Caleiro, C. (2018c).
Recommendation via matrix completion using Kolmogorov complexity. Submitted to Proceedings of
the 26th Conference on User Modeling, Adaptation and Personalization. ACM.
[Ricci, 2014] Ricci, F. (2014). Recommender systems: Models and techniques. In Encyclopedia
of Social Network Analysis and Mining, pages 1511–1522. Springer New York.
[Ricci et al., 2015] Ricci, F., Rokach, L., and Shapira, B. (2015). Recommender systems: In-
troduction and challenges. In Ricci, F., Rokach, L., and Shapira, B., editors, Recommender
Systems Handbook, pages 1–34. Springer.
[Salakhutdinov and Mnih, 2007] Salakhutdinov, R. and Mnih, A. (2007). Probabilistic matrix
factorization. In NIPS, volume 1, pages 2–1.
[Sanchez et al., 2014] Sanchez, L. Q., Díaz-Agudo, B., and Recio-García, J. A. (2014).
Development of a group recommender application in a social network. Knowl.-Based Syst.,
71:72–85.
[Sarwar et al., 2001a] Sarwar, B., Karypis, G., Konstan, J., and Riedl, J. (2001a). Item-based
collaborative filtering recommendation algorithms. In Proceedings of the 10th international
conference on World Wide Web, pages 285–295. ACM.
[Sarwar et al., 2001b] Sarwar, B., Karypis, G., Konstan, J., and Riedl, J. (2001b). Item-based
collaborative filtering recommendation algorithms. In Proceedings of the 10th International
Conference on World Wide Web, WWW ’01, pages 285–295, New York, NY, USA. ACM.
[Satta, 1994] Satta, G. (1994). Tree-adjoining grammar parsing and Boolean matrix multi-
plication. Computational linguistics, 20(2):173–191.
[Saude et al., 2018] Saude, J., Ramos, G., Borrato, L., Caleiro, C., and Kar, S. (2018). A
robust- and cluster-based ranking system. Submitted to Proceedings of the 41st
International ACM SIGIR Conference on Research and Development in Information Retrieval.
ACM.
[Saude et al., 2017] Saude, J., Ramos, G., Caleiro, C., and Kar, S. (2017). Reputation-based
ranking systems and their resistance to bribery. In Data Mining (ICDM), 2017 IEEE 17th
International Conference on.
[Saude et al., 2017] Saude, J., Ramos, G., Caleiro, C., and Kar, S. (2017). Robust reputation-
based ranking on multipartite rating networks. arXiv preprint arXiv:1705.00947.
[Schafer et al., 2007a] Schafer, J. B., Frankowski, D., Herlocker, J., and Sen, S. (2007a).
The adaptive web. chapter Collaborative Filtering Recommender Systems, pages 291–324.
Springer-Verlag, Berlin, Heidelberg.
[Schafer et al., 2007b] Schafer, J. B., Frankowski, D., Herlocker, J., and Sen, S. (2007b).
Collaborative filtering recommender systems. In The adaptive web, pages 291–324. Springer.
[Schwarz, 1970] Schwarz, S. (1970). On a sharp estimation in the theory of binary relations
on a finite set. Czechoslovak Mathematical Journal, 20(4):703–714.
[Sedgewick and Wayne, 2011] Sedgewick, R. and Wayne, K. (2011). Algorithms. Pearson
Education.
[Shiels, 2009] Shiels, M. (2009). Spies ‘infiltrate US power grid’. BBC News, Apr.
[Shoukry and Tabuada, 2014] Shoukry, Y. and Tabuada, P. (2014). Event-triggered projected
Luenberger observer for linear systems under sparse sensor attacks. In 53rd IEEE Confer-
ence on Decision and Control, pages 3548–3553. IEEE.
[Siljak, 1991] Siljak, D. D. (1991). Decentralized control of complex systems. Academic Press,
Boston.
[Siljak, 2007] Siljak, D. D. (2007). Large-Scale Dynamic Systems: Stability and Structure.
Dover Publications.
[Siljak, 2011] Siljak, D. D. (2011). Decentralized control of complex systems. Courier Corpo-
ration.
[Simon and Apt, 2015] Simon, S. and Apt, K. R. (2015). Social network games. Journal of
Logic and Computation, 25(1):207–242.
[Singer and Friedman, 2014] Singer, P. W. and Friedman, A. (2014). Cybersecurity: What
everyone needs to know. Oxford University Press.
[Singh, 2000] Singh, S. (2000). The code book: the science of secrecy from ancient Egypt to
quantum cryptography. Anchor.
[Skogestad, 2004] Skogestad, S. (2004). Control structure design for complete chemical plants.
Computers and Chemical Engineering, 28(1-2):219–234.
[Song et al., 2016] Song, D., Lee, C. E., Li, Y., and Shah, D. (2016). Blind regression: Non-
parametric regression for latent variable models via collaborative filtering. In Advances in
Neural Information Processing Systems, pages 2155–2163.
[Sparks and Browning, 2011] Sparks, B. A. and Browning, V. (2011). The impact of on-
line reviews on hotel booking intentions and perception of trust. Tourism Management,
32(6):1310–1323.
[Sun, 2006] Sun, Z. (2006). Switched linear systems: control and design. Springer Science &
Business Media.
[Sun et al., 2002] Sun, Z., Ge, S. S., and Lee, T. H. (2002). Controllability and reachability
criteria for switched linear systems. Automatica, 38(5):775–786.
[Symeonidis et al., 2011] Symeonidis, P., Tiakas, E., and Manolopoulos, Y. (2011). Product
recommendation and rating prediction based on multi-modal social networks. In Proceedings
of the fifth ACM conference on Recommender systems, pages 61–68. ACM.
[Tanner, 2004] Tanner, H. (2004). On the controllability of nearest neighbor interconnections.
In 43rd IEEE Conference on Decision and Control, volume 3, pages 2467–2472.
[Tao and Vu, 2014] Tao, T. and Vu, V. (2014). Random matrices have simple spectrum.
ArXiv e-prints. arXiv:1412.1438.
[Tao and Vu, 2016] Tao, T. and Vu, V. (2016). Random matrices have simple spectrum.
Combinatorica, pages 1–15.
[Tarjan, 1972] Tarjan, R. (1972). Depth-first search and linear graph algorithms. SIAM
journal on computing, 1(2):146–160.
[van de Wal and de Jager, 2001] van de Wal, M. and de Jager, B. (2001). A review of methods
for input/output selection. Automatica, 37(4):487–510.
[Vazirani, 2013] Vazirani, V. V. (2013). Approximation algorithms. Springer Science & Busi-
ness Media.
[Velde and Carignan, 1984] Velde, W. E. V. and Carignan, C. R. (1984). Number and place-
ment of control system components considering possible failures. Journal of Guidance,
Control, and Dynamics, 7(6):703–709.
[Wang et al., 2006] Wang, J., De Vries, A. P., and Reinders, M. J. (2006). Unifying user-
based and item-based collaborative filtering approaches by similarity fusion. In Proceedings
of the 29th annual international ACM SIGIR conference on Research and development in
information retrieval, pages 501–508. ACM.
[Wielandt, 1950] Wielandt, H. (1950). Unzerlegbare, nicht negative Matrizen. Mathematische
Zeitschrift, 52(1):642–648.
[Williams, 1991] Williams, R. N. (1991). An extremely fast Ziv-Lempel data compression
algorithm. In Data Compression Conference, 1991. DCC ’91, pages 362–371. IEEE.
[Wonham, 1985] Wonham, W. M. (1985). Linear multivariable control: a geometric approach.
Applications of mathematics. Springer, New York, Berlin, Tokyo.
[Yu et al., 2006a] Yu, Y.-K., Zhang, Y.-C., Laureti, P., and Moret, L. (2006a). Decoding in-
formation from noisy, redundant, and intentionally distorted sources. Physica A: Statistical
Mechanics and its Applications, 371(2):732–744.
[Yu et al., 2006b] Yu, Z., Zhou, X., Hao, Y., and Gu, J. (2006b). TV program
recommendation for multiple viewers based on user profile merging. User Modeling and User-Adapted
Interaction, 16(1):63–82.
[Zhang et al., 2011] Zhang, S., Camlibel, M., and Cao, M. (2011). Controllability of
diffusively-coupled multi-agent systems with general and distance regular coupling topolo-
gies. In 50th IEEE Conference on Decision and Control and European Control Conference,
pages 759–764.
[Zhao and Shang, 2010] Zhao, Z.-D. and Shang, M.-S. (2010). User-based
collaborative-filtering recommendation algorithms on Hadoop. In Knowledge Discovery and Data Mining,
2010. WKDD ’10. Third International Conference on, pages 478–481. IEEE.