High-capacity DNA data storage: could all your digital photos be stored as DNA?

Wednesday, July 7, 2021

More and more of the world is moving online. As our lives become more digital, we need ever larger amounts of storage to keep up with the strain we place on digital ecosystems. People who want to save their own digital world currently have a couple of options, the most popular being a hard drive or the cloud. But at the global scale discussed in this article, no hard drive or cloud is large enough.


To give an example, there are roughly 10 trillion gigabytes of digital data currently in circulation. On top of this, a further 2.5 million gigabytes are created every day in the form of photos, emails, digital files, and social media activity. At this scale, much of the data is stored in exabyte data centres: huge buildings that can be larger than several football fields put together.


For a few years now, an idea has been circulating that there is another way to store data. For some scientists, the solution to our ever-increasing storage needs can be found in DNA. They believe DNA could improve storage in the future because it already stores huge amounts of information at extreme density. Whereas a current exabyte data centre can store roughly 1 billion gigabytes of data, a glass full of DNA would, in theory, be able to hold all of the world’s data.
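To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this article; the function names are made up for illustration and decimal units are assumed.

```python
# Figures quoted in the article (decimal units assumed).
WORLD_DATA_GB = 10 * 10**12        # roughly 10 trillion gigabytes in circulation
NEW_DATA_PER_DAY_GB = 2_500_000    # roughly 2.5 million gigabytes created each day
EXABYTE_CENTRE_GB = 10**9          # one exabyte data centre: about 1 billion gigabytes


def centres_needed(total_gb: int, centre_gb: int = EXABYTE_CENTRE_GB) -> float:
    """How many exabyte-scale data centres it would take to hold total_gb."""
    return total_gb / centre_gb


def days_to_fill_centre(daily_gb: int = NEW_DATA_PER_DAY_GB,
                        centre_gb: int = EXABYTE_CENTRE_GB) -> float:
    """How many days of newly created data it takes to fill one centre."""
    return centre_gb / daily_gb


print(centres_needed(WORLD_DATA_GB))   # 10000.0
print(days_to_fill_centre())           # 400.0
```

On these numbers, holding all the data currently in circulation would take around 10,000 exabyte data centres.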


On this subject, Mark Bathe, a professor of biological engineering at MIT, said, ‘We need new solutions for storing these massive amounts of data that the world is accumulating, especially the archival data.’ One of the big issues is that a lot of data is building up and, while it isn’t necessarily being used, it is vital that it is kept and remains accessible. Bathe continues, ‘DNA is a thousandfold denser than even flash memory, and another property that’s interesting is that once you make the DNA polymer, it doesn’t consume any energy. You can write the DNA and then store it forever.’


While this might sound like a far-off idea closer to science fiction, it has already been done. Scientists have taken images and pages of text and encoded them as DNA.
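To give a feel for what encoding a file as DNA can mean, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per DNA base, which is an illustration only and not the scheme used by the researchers mentioned here; real systems add error correction and avoid sequences that are hard to synthesise or read.

```python
# Illustrative mapping only (an assumption, not the researchers' scheme):
# every pair of bits becomes one of the four DNA bases.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn raw bytes into a string of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = "Hi".encode("utf-8")
strand = encode(message)
print(strand)                      # CAGACGGC
print(decode(strand) == message)   # True
```

The same round trip works for any file, since images and text are all just bytes once they are stored digitally.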


One of the difficulties in designing data storage at this scale is not just keeping the files in a secure location, but making sure they can be navigated and accessed easily. Discussing this topic, Bathe says, ‘assuming that the technologies for writing DNA get to a point where it’s cost effective to write an exabyte or zettabyte of data in DNA, then what? You’re going to have a pile of DNA, which is a gazillion files, images or movies and other stuff, and you need to find the one picture or movie you’re looking for [...] it’s like trying to find a needle in a haystack.’


One of the ways Bathe and his colleagues have tackled this issue is by encapsulating each data file in a 6-micrometer particle of silica. To show what is contained inside, each particle is labelled with a DNA sequence. This works in much the same way as the barcode on each item of food you buy from a shop.
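The barcode idea can be pictured with a small software analogy. The Python sketch below is not how the lab process works (there the particles are physical objects); the `Capsule` class, the `find_by_barcode` function and all the sequences are hypothetical, made up purely to illustrate picking a file by its label without reading its contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capsule:
    """A stand-in for one silica particle (hypothetical model)."""
    barcode: str   # short DNA label on the outside (made-up sequence)
    payload: str   # DNA strand holding the file itself (made-up sequence)


def find_by_barcode(pool: list[Capsule], wanted: str) -> list[Capsule]:
    """Return the capsules whose outer label matches the wanted barcode."""
    return [capsule for capsule in pool if capsule.barcode == wanted]


pool = [
    Capsule(barcode="ACGT", payload="CAGACGGC"),
    Capsule(barcode="TTGA", payload="GATTACA"),
    Capsule(barcode="CCAT", payload="TTAGGC"),
]

for capsule in find_by_barcode(pool, "TTGA"):
    print(capsule.payload)   # GATTACA
```

The point of the label is that the payload never has to be read to decide whether a capsule is the one being looked for.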


Using this approach, Bathe and his fellow researchers demonstrated a proof of concept in which they were able to pick out individual images stored as DNA from a set of 20. Bathe explains, ‘At the current state of our proof-of-concept, we’re at the 1 kilobyte per second search rate. Our file system’s search rate is determined by the data size per capsule, which is currently limited by the prohibitive cost to write even megabytes worth of data on DNA, and the number of sorters we can use in parallel. If DNA synthesis becomes cheap enough, we would be able to maximise the data size we can store per file with our approach.’


As Bathe explains, a current downside of DNA storage is the cost of synthesising DNA. As it stands, writing one petabyte of data to DNA costs $1 trillion. To become truly competitive with current storage models, the cost needs to fall dramatically, which Bathe believes will happen within ten to twenty years.
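To see what that price means for a single photo, here is a quick calculation in Python. The $1 trillion per petabyte figure is the one quoted above; the 5-megabyte photo size is an assumption chosen for illustration, and decimal units are used throughout.

```python
COST_PER_PETABYTE_USD = 1_000_000_000_000   # $1 trillion, as quoted in the article
MB_PER_PETABYTE = 1_000_000_000             # decimal units assumed
GB_PER_PETABYTE = 1_000_000


def cost_usd_for_megabytes(size_mb: int) -> float:
    """Cost of writing size_mb megabytes to DNA at the quoted price."""
    return size_mb * COST_PER_PETABYTE_USD / MB_PER_PETABYTE


cost_per_gigabyte = COST_PER_PETABYTE_USD / GB_PER_PETABYTE
print(cost_per_gigabyte)            # 1000000.0 (about $1 million per gigabyte)

PHOTO_SIZE_MB = 5                   # assumed size of one digital photo
print(cost_usd_for_megabytes(PHOTO_SIZE_MB))   # 5000.0
```

At today's quoted price, storing one such photo would cost about $5,000, which shows how far costs have to fall before DNA can hold a personal photo library.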


The potential of high-capacity DNA storage is immense. While there is still a long way to go before we can use it in our everyday lives, once the cost comes down and the process becomes more efficient, the implications for society could be huge.