So much genomics (DNA-related) and bioinformatics data is now being produced that it cannot practically be transferred over commercial Internet networks; instead, organizations are using FedEx and other sneakernets to ship the data.
The same crisis in data volumes is occurring in climate modelling and other fields.
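To give a rough feel for why shipping disks can beat the network, here is a back-of-the-envelope sketch of transfer time versus link speed. The 100 TB dataset size, the link speeds, and the 80% efficiency factor are illustrative assumptions, not figures from any particular facility, and `transfer_time_hours` is a hypothetical helper written for this example.

```python
def transfer_time_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move size_tb terabytes over a link.

    size_tb    -- dataset size in decimal terabytes (1 TB = 1e12 bytes)
    link_gbps  -- nominal link speed in gigabits per second
    efficiency -- assumed fraction of the nominal speed actually achieved (0 < e <= 1)
    """
    if size_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8                      # terabytes -> bits
    usable_bps = link_gbps * 1e9 * efficiency      # usable bits per second
    return bits / usable_bps / 3600                # seconds -> hours


if __name__ == "__main__":
    # Illustrative only: a hypothetical 100 TB dataset at an assumed 80% efficiency.
    for gbps in (0.1, 1, 10, 100):
        hours = transfer_time_hours(100, gbps, efficiency=0.8)
        print(f"{gbps:>5} Gbps: {hours:8.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, a transfer that takes days or weeks on a typical commercial connection drops to a few hours on a high-capacity path, which is roughly the point at which an overnight courier stops being competitive.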
Research and Education (R&E) networks have been warning about this coming data tsunami for many years. For the most part, they have the capacity and the tools to transfer these large data volumes easily. No commercial networks have this capability at this time. The biggest problem, however, is that much of this data is being generated not by universities or R&E organizations but by commercial facilities closely aligned with the R&E community. Numerous bioinformatics companies, such as SoftGenetics, DNAStar, DNAnexus and NextBio, have sprung up, having found the life sciences a fertile market for products that handle large amounts of information.
This poses a real dilemma for many R&E networks, especially those that receive public funding. They cannot be seen to be competing with the private sector (even though commercial networks do not yet have the capability or technology to deliver such data volumes), and in many cases their stated public policies do not allow them to connect commercial facilities. Compounding the problem, most of the modern computational tools needed to analyse this data are available only on commercial clouds. Academic HPC facilities and university-based cloud solutions generally cannot scale as quickly as commercial cloud providers to supply as many cores as are required, on demand, to analyse this data. In addition, many grad students and small innovative businesses are developing the necessary analysis tools to work only on commercial clouds, driven by the revenue opportunity of the “click compute” models offered by many commercial cloud providers.
R&E networks are thus conflicted. Academic institutions and commercial organizations need access to commercial clouds to analyse this torrent of data, yet an R&E network's acceptable use policy may prohibit interconnection with commercial facilities, especially if the other end of the connection is also a commercial organization. This is where Open LightPath Exchanges can play a critical role, much as the NAPs did in the early days of the commercialization of the Internet.
Open LightPath Exchanges are, by their very definition, policy free. That means anyone can cross-connect to anyone else, regardless of whether they are commercial organizations or academic institutions. Open LightPath Exchanges are being established all around the world, and many more are expected to be deployed in the coming year. A good background paper on Open LightPath Exchanges, “Open Exchanges for Open Science”, can be downloaded here [PDF].
I believe there’s some work on moving portions of Internet2 to 100 Gbps. Is that insufficient for these R&E institutions?