Lorentz Center

International center for scientific workshops

International center for scientific workshops

Current Workshop | Overview | Back | Home | Search | | ||||||||||

## Clusters, Games and Axioms |

Social choice theory is the study of the aggregation
of the views of several individuals into
a single collective view, with the prime
example being the aggregation of preferences in an election. Axioms
in this context are mathematically
rigorous encodings of philosophically
motivated normative requirements on aggregation methods, such as
the requirement to give equal weight to
every individual. Social choice
theorists have explored the consequences of imposing such axioms to fully characterise certain aggregation
methods or to show that certain
combinations of axioms are impossible to satisfy. Famous examples,
in preference aggregation, are May's
characterisation of the simple majority
rule, Arrow's Theorem on the impossibility of "reasonable" preference aggregation, and the Gibbard-Satterthwaite Theorem showing that all "reasonable" voting rule
are subject to strategic manipulation.
This talk will be an introduction to the axiomatic method in social choice theory, with examples drawn mostly
from the well established areas of
preference aggregation and judgment aggregation. I will also briefly report on recent work using the
axiomatic method in graph aggregation,
where we try to compute a single graph that is a good compromise between the graphs reported by
several individuals.
This tutorial will provide an introduction to some of
the fundamental questions and ideas
studied in computational game theory. The field can be roughly divided into several main branches
-- computation of equilibria, bounding
the inefficiency of equilibria, and mechanism
design -- and the tutorial will highlight a selection of central concepts and analysis tools from these areas.
We'll also show how to apply some of
them to analyze games related to clustering.
In the first part of this tutorial, I will given an
overview of the many dimensions of clustering: hard vs soft, vectors
vs. similarities, cost-based vs probability based, parametric vs
non-parametric. In the second part, I will consider clustering of
weighted graphs, also known as similarity based clustering, and I will
describe several paradigms for this problem. Among them, spectral clustering
maps the graph data back to real-valued vectors, and from a graph clustering problem
to the well-know K-means/Least squares problem. I will show that in fact this
is a two-way relationship, and highlight some properties of the venerable but
widely used K-means and EM algorithms.
Abstract: Clustering
is an area of huge practical relevance but very meager
theoretical foundations. I will start
by discussing some fundamental challenges that the development of a theory of
clustering faces. I will then describe two different approaches to addressing
those challenges; an axiomatic
approach and a statistical/machine-learning perspective. I will
outline recent progress made along these directions, as well as allude to some
common misconceptions and potential pitfalls. The talk is geared towards
stimulating discussions and highlighting open questions more than to providing
answers or boasting definitive results.
Algorithms as selection procedures for mathematical
structures are often exposed to uncertainty in the input or randomness during
computation. The achievable precision of an algorithm, i.e., the attainable
resolution in its output space, is determined by the predictive information in
the input of an algorithm relative to its output space. Therefore, algorithms
can be compared and optimized w.r.t. its runtime, its memory consumption and
its informativeness, meaning its robustness to
stochastic influences in the input and during computation. I will present an
information theoretic framework for algorithm analysis where an algorithm is
characterized as computational evolution of a (possibly contracting) posterior
distribution over the output space. The tradeoff between
precision and stability is controlled by an input sensitive generalization
capacity (GC). GC measures how much the posteriors on two different prpoblem instances agree despite the noise in the input.
Thereby, GC objectively ranks different
algorithms for the same data processing task based on the bit rate of their
respective capacities. Information theoretic algorithm selection is rigorously
demonstrated for minimum spanning tree algorithms and for greedy MaxCut algorithms. The method also allows us to rank
centroid based and spectral clustering methods, e.g. k-means, pairwise
clustering, normalized cut, adaptive ratio cut and dominant set clustering.
It would be
desirable if, as a society, we could reduce the amount of landfill trash we
create, the amount of carbon dioxide we emit, the amount of forest we clear,
etc. Since we cannot (or are in any case
not willing to) simultaneously pursue all these objectives to their maximum
extent, we must prioritize among them.
Currently, this is done mostly in an ad-hoc manner, with people,
companies, local governments, and other entities deciding on an individual
basis which of these objectives to pursue, and to what extent. A more
systematic approach would be to set, at a global level, exact numerical
tradeoffs: using one gallon of gasoline is as bad as creating x bags of
landfill trash. Having such tradeoffs
available would greatly facilitate decision making, and reduce inefficiencies resulting
from inconsistent decisions across agents.
But how could we arrive at a reasonable value for x? I will discuss this from the viewpoint of
computational social choice. (joint work
with Markus Brill and Rupert Freeman)
We consider approval-based committee voting, i.e., the
setting where each voter approves a subset of candidates, and these votes are
then used to select a fixed-size set of winners (committee). We propose a
natural axiom for this setting, which we call justified representation (JR).
This axiom requires that if a large enough group of voters exhibits agreement
by supporting the same candidate, then at least one voter in this group has an
approved candidate in the winning committee. We show that for every list of
ballots it is possible to select a committee that provides JR. We then check if
this axiom is fulfilled by well-known approval-based voting rules. We show that
the answer is negative for most of these rules, with notable exceptions of PAV
(Proportional Approval Voting), an extreme version of RAV (Reweighted Approval
Voting), and, for a restricted preference domain, MAV (Minimax
Approval Voting). We then introduce a stronger version of the JR axiom, which
we call extended justified representation (EJR), and show that PAV satisfies
EJR, while other rules do not. We also consider several other questions related
to JR and EJR, including the relationship between JR/EJR and unanimity, and the
complexity of the associated algorithmic problems. Based on joint work with Haris
Aziz, Markus Brill, Vincent Conitzer, Rupert Freeman,
and Toby Walsh.
Many graph clustering quality functions suffer from a
resolution limit, the inability to find
small clusters in large graphs. This limit stems from a dependence of the quality function on global
properties of the graph. In this talk I
will discuss locality, which roughly states that changing one part of the graph does not affect the clustering of other
parts of the graph. Local quality
functions are a way to avoid the resolution limit. I will look at locality in different graph
clustering settings. In particular, I
will look at hard clustering, where a clustering is a partition of the
set of nodes. Then I will extend the
notion of locality to soft clustering, where nodes can belong to more than one cluster, with
varying degrees of membership. Several
common methods do not satisfy locality, but local variants exist.
We propose and study a natural class of games, which
we call coordination games on graphs: We are given a finite graph whose nodes
correspond to the players of the game. Each player chooses a colour from a set
of colours available to her. The payoff of a player is the number of neighbours
who choose the same colour. These games
capture the idea of decentralized coordination in a local setting. We study
their equilibria when groups of players may jointly deviate. We analyze the question of when those equilibria exist; how to
recognize and compute them; and, finally, their inefficiency compared to an
optimal colouring.
Similarity-based clustering refers to the
process of extracting maximally coherent groups from a set of data points using
their observed, similarity relations. Traditional approaches to this problem
are based on the idea of partitioning the input data into a predetermined
number of classes, thereby obtaining the clusters as a by-product of the
partitioning process. In this talk, I present a radically different view of the
clustering problem that attempts to derive a rigorous formulation of the very
notion of a cluster using game-theoretic principles. Clusters are derived as a
result of the competition of individuals playing the so-called “clustering
game", which is a non-cooperative game involving two or more players
according to whether similarities are pairwise relations or higher-order ones.
The competition fostered by this game induces the players engaged in the
dispute to agree on a common notion of cluster, which turns out to be
equivalent to the concept of evolutionary stable strategy (ESS), a classical
equilibrium concept from evolutionary game theory. Computationally,
ESS-clusters can be found using, e.g., replicator dynamics, a well-known
formalization of natural selection processes. The proposed game-theoretic
framework has already found applications in a variety of application fields,
including bioinformatics, computer vision, image processing, security and video
surveillance, etc., and it is general by applying also to asymmetric, negative,
or high-order similarity relations.
Abstract. Many desirable goals are under-specified. A first such goal is to deal with a non-digital native
in a friendly and effective way. In order to do so, the computational agent
should build a model of the human in the loop (HiL), estimate his
preferences and account for his unavoidable inconsistencies, and behave in a
way that i) complies with the estimated HiL
preferences and ii) enables to refine the HiL preference
model. A second such goal is to control the dynamics of two
interacting processes (e.g. identifying and repairing faults in search-based softare engineering), in a sustainable way: each process should make progress and
support the other process' progress. The talk will propose some principles for tackling
such under specified goals: make it a
game; define rewards. Based on joint work with Riad
Akrour and Francois-Michel de Rainville:
Programming by Feedback, Int. Conf. on Machine Learning 2014, R. Akrour et al. Sustainable cooperative coevolution with a
multi-armed bandit, GECCO 2013, FM De Rainville et
al. [Back] |