Over the past 30 years, phylogenetic analysis has become a standard part of all biological sciences, from ecology through development, physiology, biochemistry, immunology, genetics, parasitology, and bacteriology, and on into medicine. It plays a central part in all of these disciplines, and several new areas of research have arisen that explicitly use phylogenies as a central component (e.g., evolutionary ecology and evolutionary development).
The traditional model for representing evolution is the phylogenetic tree. Mathematically, this is a tree whose leaves are labelled by a set of contemporary species and whose internal nodes each represent a hypothetical ancestor. The phylogenetic tree provides an adequate model for capturing mutation and speciation events. However, it is less suited to capturing common “reticulate” evolutionary phenomena such as hybridization, recombination or lateral gene transfer, i.e. events in which lineages combine rather than diversify. Phylogenetic networks, which generalize phylogenetic trees, are directed acyclic graphs used to display such complex evolutionary scenarios.
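To make the distinction between a tree and a network concrete, the following is a minimal sketch in Python, not a reconstruction method. It assumes a network is given as a list of (parent, child) edges; the node names, the example edge lists, and the helper functions are all hypothetical and chosen only for illustration. A node with two or more parents is treated as a reticulation, i.e. a point where lineages combine, and a tree is simply a network without such nodes.

```python
from collections import defaultdict, deque


def build_network(edges):
    """Return (nodes, children, parents) for a list of (parent, child) edges."""
    children = defaultdict(list)
    parents = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        parents[child].append(parent)
        nodes.update((parent, child))
    return nodes, children, parents


def is_acyclic(nodes, children, parents):
    """Check acyclicity with Kahn's algorithm (repeatedly remove parentless nodes)."""
    indegree = {n: len(parents.get(n, [])) for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node can be removed only if the graph has no directed cycle.
    return removed == len(nodes)


def leaves(nodes, children):
    """Leaves have no children; they carry the contemporary species labels."""
    return sorted(n for n in nodes if not children.get(n))


def reticulations(nodes, parents):
    """Reticulation nodes have two or more parents (lineages combining)."""
    return sorted(n for n in nodes if len(parents.get(n, [])) >= 2)


# Hypothetical tree on species A, B, C: only speciation events.
tree_edges = [("root", "u"), ("root", "C"), ("u", "A"), ("u", "B")]

# Hypothetical network: B descends from a hybrid node h with parents u and v.
network_edges = [
    ("root", "u"), ("root", "v"),
    ("u", "A"), ("u", "h"),
    ("v", "h"), ("v", "C"),
    ("h", "B"),
]

for name, edges in [("tree", tree_edges), ("network", network_edges)]:
    nodes, children, parents = build_network(edges)
    print(name,
          "acyclic:", is_acyclic(nodes, children, parents),
          "leaves:", leaves(nodes, children),
          "reticulations:", reticulations(nodes, parents))
```

Both examples have the same leaf set, but only the second contains a reticulation node, which is the feature a tree cannot represent.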
Even though biologists have long been interested in networks as representations of genealogical patterns among organisms, it is only recently that mathematicians and computer scientists have started studying this topic in order to develop algorithms that can construct phylogenetic networks from biological data. Until very recently, there were no network reconstruction methods that were entirely suitable for the purposes of biologists studying the phylogenetic history of organisms, and so biological publications continue to use phylogenetic trees to display evolutionary relationships even in studies where a network would be more suitable.
In October 2012, we organized a Lorentz workshop on “The future of phylogenetic networks”, whose long-term goals were to develop practical algorithms for phylogenetic networks and to steer mathematical research on this topic in the directions that are most relevant for biologists. During that workshop, we identified the key features of phylogenetic networks that are important for biologists and thus must be addressed by any successful network algorithm.
The current workshop is an applied follow-up to the 2012 workshop. Its goal is to apply the techniques developed by the mathematicians to empirical data provided by the biologists, and to use this to (1) answer the specific scientific questions posed by the biologists; (2) gain insight into which tools work best on which datasets, and why; and (3) learn which tools are still needed.
During the workshop, we aim to i) discuss and compare the practical aspects of networks as they relate to both biology and computation, ii) analyze specific datasets to quantitatively assess the potential and limitations of various algorithmic approaches, and iii) lay the groundwork for future collaborations between biological and computational scientists. The “gold standard” datasets that we are going to analyze have been preselected, and the participants will be familiar with them before the workshop starts. Participants may also propose their own datasets.