ARES: Advanced Networking for EU genomic research

Tuesday, 11 March 2014
The completion of the human genome sequence was a major milestone in the biological and medical sciences. It was achieved about ten years ago within the US Human Genome Project, as the result of years of expensive research. At that (recent) time, although the importance of the result was clear, handling the human genome as a commodity was hard to imagine, owing to the cost and complexity of sequencing and analyzing complex genomes.
Today the situation is very different: the cost of sequencing a human genome is falling rapidly, by orders of magnitude.

In particular, the cost of sequencing a human genome has been falling faster than Moore’s law would predict. This cost evolution, referred to as “the big drop”, began in 2008 and is due to the introduction of novel sequencing machines. In practical terms, the cost of producing a unit of genomic data is decreasing more rapidly than the cost of storing that unit and, more importantly, of distributing it. Hence, if the trend continues for some years, the bottleneck in making effective use of genome information will lie on the ICT side.
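The argument above can be illustrated with a small back-of-the-envelope sketch. It assumes that both costs decay exponentially and starts them at the same normalized value; the halving times used here are hypothetical placeholders chosen only to show the effect, not measured figures, and the function names are illustrative.

```python
def cost_after(years: float, initial_cost: float, halving_years: float) -> float:
    """Cost after `years`, assuming it halves every `halving_years` years."""
    return initial_cost * 0.5 ** (years / halving_years)


def ict_share(years: float, seq_halving_years: float, ict_halving_years: float,
              seq_cost0: float = 1.0, ict_cost0: float = 1.0) -> float:
    """Fraction of the total per-unit cost that is due to storage/distribution."""
    seq = cost_after(years, seq_cost0, seq_halving_years)
    ict = cost_after(years, ict_cost0, ict_halving_years)
    return ict / (seq + ict)


# Hypothetical halving times, for illustration only (assumptions, not data):
SEQ_HALVING_YEARS = 0.5   # sequencing cost halves every six months
ICT_HALVING_YEARS = 2.0   # storage/distribution cost halves every two years

if __name__ == "__main__":
    for year in range(6):
        share = ict_share(year, SEQ_HALVING_YEARS, ICT_HALVING_YEARS)
        print(f"year {year}: ICT share of per-unit cost = {share:.1%}")
```

Whatever the actual rates, as long as sequencing cost falls faster than storage and distribution cost, the ICT share of the total tends toward 100%, which is the bottleneck the text describes.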

Against this background, the strategic objective of ARES is to create an advanced content delivery network (CDN) service, based on the cloud computing paradigm, to support medical and research systems that make heavy use of genomic data. The proposal leverages the new functions offered by NetServ, a programmable service platform that allows both caches and management services to be deployed at runtime in routers and servers, together with the associated NSIS signaling and the powerful cloud management functions of OpenStack. ARES will implement a pilot project to gain a more detailed understanding of the network problems raised by a sustainable increase in the use of genome data sets for diagnostic purposes. Suitable management policies for very large sets of big files will be identified in terms of efficiency, resiliency, scalability, and QoS in a distributed CDN environment. In addition, ARES will make available to the GÉANT network suitably designed tools for deploying any CDN service that handles large data sets, beyond genomes.
