Pan-genomics is the study of larger collections of genomes from related individuals or cells, rather than of single genomes. Exemplary collections represent human populations or ensembles of cancer or pathogen genomes (for the latter, "viral quasispecies"). A local example is the "Genome of the Netherlands" project with its family- and space-oriented arrangement of genomes.
Pan-genomics studies have profited particularly from the latest developments in sequencing technology, which offer high throughput at low cost per experiment. Studies at this scale were hardly conceivable only one to two years ago. Although research questions vary substantially among the different communities, the underlying sequencing technology provides a common denominator, insofar as the computational issues are often closely analogous.
Processing DNA sequencing data is still far from straightforward: the data are massive ("big") and noisy, and they pose computational puzzles because they come as fragments rather than as whole-length genomes. Several computationally involved steps immediately follow the data generation step. Even after computational processing, the output can leave the researcher with hard-to-interpret results. Last but not least, sequencing technology is under active and rapid development. This, in turn, gives rise to new research opportunities, but also to new computational challenges.
The idea of this workshop is to systematically bring together efforts that have so far often remained separate, and to create maximum synergy among all participating researchers.
The final goal of this workshop is to jointly write a white paper that summarizes these challenges and opportunities, and their overlap among the different fields, and to publish it in a high-impact journal as a future guideline and roadmap for computational pan-genomics research.