**WORKSHOP LNMB (LORENTZ CENTER, JUNE 17-20, 1997)**

**Organizers: A. Hordijk and L.C.M. Kallenberg**

**1. GENERAL INFORMATION**

The ninth summer workshop of the LNMB (Dutch for National Network on the
Mathematics of Operations Research) was held in the Lorentz Center of the Leiden
University, June 17-20, 1997, and was sponsored by the Lorentz Center and the
Thomas Stieltjes Institute for Mathematics.

The workshop consisted of a number of lectures on the following themes:

- Multimodularity and sequential optimization (six lectures)
- Markovian control problems and queueing (ten lectures)
- Combinatorial games (six lectures)

On Friday, June 20, the afternoon session was devoted to the special event of the 10th anniversary of the LNMB. This special program includes:

- Lecture by Jan Karel Lenstra
- Presentation of the Gijs de Leve Prize
- Presentation of the Jubilee Book
- Reception

The workshop was attended by 45 persons, mostly Dutch researchers in the
Mathematics of Operations Research, particularly graduate students.

**2. THE PROGRAM**

**Tuesday, June 17**

09.30 - 10.15 Registration

10.15 - 11.00 Hordijk 1

11.15 - 12.00 Altman 1

12.15 - 13.00 Gaujal 1

13.00 - 14.15 Lunch

14.15 - 15.00 Gaujal 2

15.15 - 16.00 Altman 2

16.15 - 17.00 Hordijk 2

17.00 - 18.30 Reception sponsored by the Lorentz Center

**Wednesday, June 18**

09.30 - 10.00 Kallenberg 1

10.15 - 11.00 Puterman 1

11.15 - 12.00 Thomas 1

12.15 - 13.00 Vrieze 1

13.00 - 14.15 Lunch

14.15 - 15.00 Koole 1

15.15 - 16.00 Puterman 2

16.15 - 17.30 Practical work 1

17.30 - 19.30 Informal dinner in 'De Nieuwe Wereld'

19.30 - 21.00 Practical work 2

**Thursday, June 19**

09.15 - 10.00 Thomas 2

10.15 - 11.00 Puterman 3

11.15 - 12.00 Vrieze 2

12.15 - 13.00 Koole 2

13.00 - 14.15 Lunch

14.15 - 15.00 Hamers 1

15.15 - 16.00 Feltkamp 1

16.15 - 17.00 Feltkamp 2

18.00 Conference dinner in 'Koetshuis De Burcht'

**Friday, June 20**

09.15 - 10.00 Hamers 2

10.15 - 11.00 Hamers 3

11.15 - 12.00 Feltkamp 3

12.00 - 13.30 Lunch

13.30 - 14.30 Registration 10th Anniversary LNMB

14.30 - 15.15 Lecture Jan Karel Lenstra

15.30 - 16.00 Presentation of the Gijs de Leve Prize

16.00 - 16.15 Presentation of the Jubilee Book

16.15 - 18.00 Reception

**3. SPEAKERS AND TITLES OF THE LECTURES**

**Minicourse Multimodularity and Sequential Optimization**

Eitan Altman (INRIA, Sophia-Antipolis, France)

- Multimodularity, convexity and optimization properties

- Applications to the control of admission and of vacation in a single queue

Bruno Gaujal (INRIA, Sophia-Antipolis, France)

- Multimodularity in (max,+) linear systems

- Balanced sequences

Arie Hordijk (Leiden University)

- Introduction

- Markov decision chains with partial information

**Markovian control problems and queueing**

Lodewijk Kallenberg (Leiden University, Groningen University/LNMB)

- Introduction

Ger Koole (Free University Amsterdam)

- Stochastic bounds for queueing systems with multiple on-off sources

- Structural results for the control of queueing systems using event-based dynamic
programming

Martin Puterman (University of British Columbia, Canada)

- Bias optimality and the expected total reward criterion in Markov decision processes

- Bias optimality in controlled queueing systems

- Stopping times and Markov reward processes

Lyn Thomas (University of Edinburgh, United Kingdom)

- Parallel computation algorithms for solving Markov Decision Processes

- A comparison of dynamic programming and linear programming for reservoir control

Koos Vrieze (Maastricht University)

- Evaluation criteria in competitive Markov decision processes

- Economical and political applications of competitive Markov decision processes

**Combinatorial games**

Vincent Feltkamp (University of Alicante, Spain)

- Network construction and cost allocation I, II and III

Herbert Hamers (Tilburg University)

- Introduction

- Some topics in sequencing games and delivery games I and II

**4. INFORMATION ABOUT THE SPEAKERS**

**Eitan Altman** received the B.Sc. degree in electrical engineering (1984), the B.A. degree
in physics (1984) and the Ph.D. degree in electrical engineering (1990), all from the
Technion-Israel Institute, Haifa. In 1990 he further received his B.Mus. degree in music
composition in Tel-Aviv University. Since 1990, he has been with INRIA (National research
institute in informatics and control) in Sophia-Antipolis, France. His current research
interests include performance evaluation and control of telecommunication networks,
stochastic control and dynamic games.

**Vincent Feltkamp** graduated in 1995 at Tilburg University on the Ph.D. thesis
"Cooperation in controlled network structures". During 1996 he worked at the Hebrew
University in Jerusalem, and now he has a post-doc position at the University of Alicante.
His research interests are in the fields of cooperative and non-cooperative game theory
and microeconomics.

**Bruno Gaujal** graduated from ENS Lyon in 1991 and received his Ph.D. from University
of Nice, France, in 1994 under the supervision of François Baccelli. Since then, he is with
INRIA Sophia-Antipolis. His research activities are mainly focused on discrete event
systems. In particular, he has been working on stochastic properties of Petri nets and
parallel simulation based on dynamic equations written in the so called (max,+) algebra.

**Herbert Hamers** graduated in 1995 at Tilburg University on the Ph.D. thesis "Sequencing
and delivery situations: a game theoretic approach". Currently, he is assistant professor
in Mathematics at the Department of Econometrics of Tilburg University. His primary
research interests are on the interaction between combinatorial optimization and
cooperative game theory.

**Arie Hordijk** received his Ph.D. in mathematics from the University of Amsterdam in 1974.
He was research associate at the Mathematical Centre and associate professor at the
Free University in Amsterdam. Since 1976 he has been professor of Operations Research
at Leiden University. He is (co)author of 120 research papers and one book. In 1980 he
received the Van Dantzig Prize and in 1990 a Fulbright grant. He is editor of the journals
Advances in Applied Probability, Journal of Applied Probability, Mathematical Methods of
Operations Research and Probability in the Engineering and Informational Sciences.

**Ger Koole** received his M.Sc. and Ph.D. from Leiden University, in the field of stochastic
Operations Research. His supervisor was Arie Hordijk. After that he held post-doc
positions at CWI Amsterdam and INRIA Sophia Antipolis. Currently he is assistant
professor in the Mathematics Department of the Free University Amsterdam. His research
is centered around stochastic control and its applications to queueing networks.

**Lodewijk Kallenberg** received his Ph.D. from Leiden University on the thesis "Linear
programming and finite Markovian control problems" with Arie Hordijk as thesis advisor.
Since 1992 he is Director of the Dutch Network on the Mathematics of Operations
Research (LNMB). His research interests are on algorithmic aspects of stochastic dynamic
programming problems.

**Martin Puterman** is Advisory Council Professor of Management in UBC's Faculty of
Commerce. He received his Ph.D. in Operations Research from Stanford University in
1971. He received the prestigious INFORMS Lanchester Prize in 1995 for his book Markov
Decision Processes and is on the editorial board of Mathematics of Operations Research.

**Lyn Thomas** received a Ph.D. from University of Oxford in 1971. Researched and taught
subsequently at the University of Wales, Swansea, University of Manchester and Naval
Postgraduate School, Monterey. Research interests are the application of dynamic
programming to large scale problems; credit scoring and behavioural scoring.

**Koos Vrieze** is since 1987 chairman of the Department of Mathematics at the Maastricht
University. In 1983 he finished his Ph.D. under the supervision of Henk Tijms and Stef Tijs
and the title of his thesis is "Stochastic games with finite state and action spaces". His
main research topics concern: noncooperative game theory, stochastic games and
mathematical simulation. More recently he works on applications of operational research
techniques to practical situations in environmental areas and in public health.

**5. ABSTRACTS**

**Minicourse Multimodularity and Sequential Optimization**

__Part 1: Introduction (Arie Hordijk; lecture 1)__

In his paper of 1985 "Extremal splitting of point processes", B. Hajek introduced a convexity property, which he called multimodularity. It is defined for functions on the integer points of a multi-dimensional Euclidean space. Multimodularity appears to be a key property in the optimal control of queueing networks.

In this series of presentations we will introduce and characterize multimodularity, and we
will show how it applies to optimal control of queues and (max,+) linear systems. Since the
structure of the optimal policy is a balanced sequence, we will give an overview of
properties of balanced sequences. Finally we will discuss an algorithm and its
convergence to a global optimal policy.

__Part 2: Convexity and optimization properties (Eitan Altman; lecture 1)__

We present in this talk multimodular functions. This property generalizes integer convexity.
We show the relation between convexity and multimodularity, which allows us to restrict
the study of multimodular functions to convex subsets of integer pionts in an m-dimensional space. We then obtain general sequential optimization results for discounted
and average costs criteria. We assume that the immediate costs are convex increasing
functions of multimodular functions. In particular, we establish lower bounds for the value
of the discounted problem, and show that the expected average problem is optimized by
using Hajek's most regular sequences. We consider first single objective problems, and
then extend the results to multi-objective problems.

__Part 3: Applications in (max,+) linear systems (Bruno Gaujal; lecture 1)__

In this talk, we present the concept of (max,+) linear systems through a general definition
as well as several concrete examples. We show the interest of the (max,+) formalization
by proving that the expected traveling time of a customer from its entrance in the system
to any node *i* is multimodular. Instead of using ad hoc methods for every system we may
be interested in this general approach; it solves optimal admission problems for the whole
class of (max,+) linear systems.

__Part 4: Balanced sequences (Bruno Gaujal; lecture 2)__

In this talk, we give an overview on the notion of balanced sequences, which can be
defined informally as the solution of the following simple problem: "Given *k* slots, how to
distribute *p* tokens as evenly as possible among the slots?" We provide several
characterizations of the balanced sequences and we show some curious properties of
such sequences. In the second part of the talk, we establish the link with multimodular
functions and we apply these two concepts to optimize the admission control in several
queues. In particular, new results such as the optimality of the round robin policy for
homogeneous G/G/1 queues come out as simple corollaries of the theory.

__Part 5: Applications to the control of admission and of vacation in a single queue. (Eitan
Altman; lecture 2)__

We consider optimal control problems related to a single queue; we study both admission
control and control of vacations, in a setting in which the controller has no information on
the state of the queue. We first present examples of a deterministic control (in which
interarrival times and service times are deterministic). We show that the Hajek policy
minimizes the average queue length for the case of an infinite queue, but not for the case
of a finite queue. When further adding a constraint on the losses, it is shown that Hajek's
policy is also optimal for the case of a finite queue. We then study the stochastic G/G/1
queue, allowing for general (not necessarily i.i.d.) arrival processes. Our analysis is based
on Lindley's equations and some coupling ideas, which leads to multimodularity properties
of expected functions of waiting times and workload. We finally present the optimal control
of stochastic polling models with no state information.

__Part 6: Markov decision chains with partial information (Arie Hordijk; lecture 2)__

In this talk Markov decision chains with partial information will be presented. This
Markovian model is especially suited for analyzing the optimal routing of customers to
exponential queues, in case the information on the queue sizes is not complete. We give
an algorithm to compute quasi-optimal policies. The problem whether a quasi-optimal
policy is also a global-optimal policy, can be solved through multimodularity of the routing
model.

**Markovian control problems and queueing**

__Introduction (Lodewijk Kallenberg)__

In this talk an introduction is given to Markovian control problems. The model is introduced
and several optimality criteria are discussed, e.g. discounted optimality, average
optimality, bias optimality and the total reward criterion.

__Bias optimality and the expected total reward criterion in Markov decision processes
(Martin Puterman; lecture 1)__

This talk provides an overview of the bias optimality criterion for infinite horizon MDPs and
investigates its relationship to average optimality and expected total reward optimality. It
shows through simple examples that an obvious policy iteration algorithm may fail to find
optimal policies under the expected total reward criterion. It then shows how a nested
policy iteration algorithms which find bias optimal policies can be used to solve MDPs
under the expected total reward criterion.

__Parallel computation algorithms for solving Markov Decision Processes (Lyn Thomas;
lecture 1)__

The paper reviews the different ways Markov decision processes solution algorithms can
be modified to work both on dedicated parallel machines and on local area networks. It
concentrates on numerical experiments with value iteration algorithms that suggest a new
multi-phase method.

__Evaluation criteria in competitive Markov decision processes (Koos Vrieze; lecture 1)__

Competitive Markov decision models in which two or more decision makers with conflicting
interests have the possibility to influence the coarse of a system at discrete time moments,
can be modelled as stochastic games. For the infinite horizon case optimal strategies
depend on the choice of the way an infinite stream of immediate rewards is evaluated,
namely discounted reward, average reward, weighted reward or total reward. The
relationship between these criteria will be explained as well as the consequence of the
choice of the criterion on optimal behaviour of the decision makers.

__Stochastic bounds for queueing systems with multiple on-off sources (Ger Koole; lecture 1)__

Performance measures for systems with multiple on-off sources are compared with
systems with a single on-off source and with systems with Poisson arrivals. Multiple
servers, and both infinite and finite capacity buffers are analyzed. We use dynamic
programming, for initially stationary on-off processes, to prove our results. This concerns
joint work with Zhen Liu.

__Bias optimality in controlled queueing systems (Martin Puterman; lecture 2)__

This talk investigates the bias optimality of control limit policies in an admission control
M/M/1 queueing system. It shows that the only gain (average) optimal stationary policies
with gain and bias which satisfy the optimality equation are of control limit type, that there
are at most two and if there are two, they occur consecutively. Conditions are provided
which ensure the existence of two gain optimal control limit policies and are illustrated
with an example. The main result is that bias optimality distinguishes these two gain
optimal policies and that the larger of the two control limits is the unique bias optimal
stationary policy. Consequently it is also Blackwell optimal. This result is established by
appealing to the third optimality equation of the Markov decision process and some
observations concerning the structure of solutions of the second optimality equation.

__A comparison of dynamic programming and linear programming for reservoir control (Lyn
Thomas; lecture 2)__

Multi-reservoir control problems are classes of problems that can be solved using standard
dynamic programming algorithms, dynamic programming algorithms applied to
decomposition models, linear programming, and Benders decomposition variants of linear
programming applied to multi-reservoir control problems. This paper highlights the relative
computatonal advantages of the different approaches ( and how well they each parallelise)
as well as pointing out the similarity between Benders decomposition and dynamic
programming.

__Stopping times and Markov reward processes (Martin Puterman; lecture 3)__

This talk describes the relationship between negative binomial stopping times and the
expected discounted reward in discrete time Markov Reward processes and between
gamma stopping times and the expected discounted reward in continuous time Markov
reward processes. It shows how this perspective provides some interesting interpretations
for commonly used criteria in MDPs and presents and interprets differential equations
relating the returns of different randomly stopped sums of rewards. It describes how these
results can be applied to simulate the expected discounted reward and raises issues about
the relationship between these quantities and Laurent series expansions.

__Economical and Political Applications of Competitive Markov Decision Processes (Koos
Vrieze; lecture 2)__

Many economical and political situations of conflict can be modelled as stochastic games.
During the talk three examples will be worked out. The first one is the Presidency game
in which a Republican and Democratic candidate repeatedly fight for the presidentship.
The second one concerns an optimal advertisement strategy in a dynamic duopoly model.
And the third one treats a pollution tax model and concentrates on production strategies
of firms related to the tax level.

__Structural results for the control of queueing systems using event-based dynamic
programming (Ger Koole; lecture 2)__

We study monotonicity results for optimal policies of various queueing and resource
sharing models. By concentrating on the events and the properties instead of on the value
function we propose a unified treatment of monotonicity results for Markovian models. This
is illustrated with one and two-dimensional models. Recent results are discussed.

**Combinatorial games**

__Introduction (Hamers; lecture 1)__

In combinatorial optimization an optimal action with respect to a given objective function
and a given, in general very large, finite set has to be chosen by one decision maker. In
combinatorial games there are more decision makers which can save costs by
cooperation. The questions that arise are which coalition(s) will be formed and how the
costs of a coalition should be allocated to the members of this coalition. Solution concepts
are given based on game theoretic concepts. Also some special models are discussed:
minimum cost spanning tree games, linear programming games, flow games, traveling
salesman games and routing games.

__Network construction and cost allocation (Feltkamp; lectures 1,2, and 3)__

When a group of users has to be connected to a central supplier of a good, a minimal cost
spanning network (often a tree) has to be constructed, but also the cost of the network has
to distributed among the users in fair or reasonable way. It will be shown that the two
problems of construction and cost allocation can be solved simulateously. Axiomatic
characterizations are provided. Methods from cooperative game theory are used, e.g. the
core of the associated games. We also present a non-cooperative approach in which each
player strategically connects himself to the supplier and we will prove that the Nash
equilibria of these games coincide with the so-called Bird's tree allocations. Furthermore,
we present some applications.

__Some topics in sequencing games and delivery games (Hamers; lectures 2 and 3)__

In sequencing problems jobs have to be processed on a number of machines such that some objective, e.g. weighted completion time, is satisfied. In sequencing games a player is assigned to each job. Now, before a machine starts processing, these players can arrange their jobs to save costs. The concepts of admissible rearrangements and cooperative sequencing games are analyzed.

In delivering problems as the postman problem, a route with minimal travel costs has to be found which visits all streets at least once. In delivery games a player is assigned to each street. In a delivery game the worth of each coalition S is equal to the costs of a S-tour (i.a. a tour which visits the streets of S at least once) with minimal travel costs. Balanced, totally balanced and submodular delivery games will be characterized by several classes of (directed) graphs.