Big Data Coming Live with DNA!

Just think about it for a moment: one gram of DNA can store 700 terabytes of data. DNA may finally rescue big data from its storage crisis.

"You can have data without information, but you cannot have information without data." - Daniel Keys Moran

This thought from a great seer provoked me to ponder some deeper insights about big data. The analogy struck me hard. The library, which has served human minds for decades, can be thought of as a metaphor for big data storage.

Every bit of information is a thought. Irrespective of its utility, it is unique and holds some value. Stories may be for students of the arts and formulas for students of mathematics, but all big data is important to someone. Its importance can be realized only when it is available, that is, when it has been stored and conserved.

When we talk about storing every bit of information, the result is going to be voluminous, complex and varied. A large chunk of data will not make sense unless it is processed and broken down into smaller pieces from which useful information can be extracted.

Big data is like a solution that arrives with a problem. Once you understand it, the question of data storage naturally follows. Let us understand the gravity of that question before reaching for the solution:

Data storage, a great challenge!

Data is the lifeblood of a business, so it is important to conserve it. Managing data storage for performance, integrity, and scalability is the next summit in information technology management and planning.

The main challenges for data storage are:

Volumes of Data

The biggest challenge in handling big data is its sheer volume. Big data never comes in small amounts; it arrives in huge volumes that require large servers to store.


Security

You could have the best storage facilities, but without good security they will not achieve much. Hackers never rest: they work with powerful technology and target unsuspecting individuals to wreak havoc in their lives. Security is especially important if you are handling people's personal information, such as credit card details or even addresses. You cannot get away with poor technology.

Data Management

Data management involves everything from adding new data to the database and processing it to retrieving and using it. It means deciding what kind of data should flow into your database, cleaning out junk, and improving the quality of the data you store. There is always a temptation to keep data that is no longer important to your business, and it should be resisted.


It is said in the Gitopanishad that answers should be sought within. Researchers, too, found the solution for data storage within, in their own DNA! 🙂

Microsoft has struck a deal with the startup Twist Bioscience to encode large amounts of information on synthetic DNA and test its potential as a new medium for data storage.

Because conventional storage media have a finite shelf life and are costly to maintain, researchers have turned to storing digital data in synthetic DNA, which can remain intact for thousands of years.

Twist Bioscience will provide Microsoft with 10 million DNA strands for the purpose of encoding data. In other words, Microsoft is trying to figure out how the same molecules that make up humans' genetic code can be used to encode digital information.

Although a commercial product is years away, initial tests have shown that it is possible to encode and recover 100 percent of digital data from synthetic DNA, said Doug Carmean, a Microsoft partner architect, in a statement.

Twist CEO Emily Leproust added: “Today, the vast majority of digital data is stored on media that has a finite shelf life and periodically needs to be re-encoded. DNA is a promising storage media, as it has a known shelf life of several thousand years, offers a permanent storage format and can be read for continuously decreasing costs.”

Beyond the news, some well-known facts about DNA help shed light on its appeal as a digital storage medium. Consider the error rate of DNA polymerase: for every 10 billion base pairs copied, it makes an average of a single mistake, and it does so in the very tough conditions of the human body, exposed to a myriad of biological threats like polluted water, viruses, and McDonald’s takeout.

Not only is DNA remarkably effective at retrieving and copying data, it is also extremely efficient at scale. It is estimated that a diploid cell in the human body contains about 1.5 gigabytes of information, which it can store and retrieve with frightening accuracy.
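As a rough sanity check on the 1.5 gigabyte figure, the arithmetic can be sketched in a few lines. The base-pair count and the two-bits-per-base assumption below are not from this article; they are round-number assumptions (about 3 billion base pairs per genome copy, two copies per diploid cell, four possible bases).

```python
# Back-of-the-envelope estimate of the raw information in one diploid human cell.
# Both constants are illustrative assumptions, not figures from the article.
BASE_PAIRS_DIPLOID = 6_000_000_000  # assumed: ~3 billion base pairs x 2 copies
BITS_PER_BASE = 2                   # four bases (A, C, G, T) -> log2(4) = 2 bits


def genome_bytes(base_pairs=BASE_PAIRS_DIPLOID, bits_per_base=BITS_PER_BASE):
    """Return the raw information content in bytes."""
    return base_pairs * bits_per_base / 8


if __name__ == "__main__":
    print(f"{genome_bytes() / 1e9:.1f} GB per cell")
```

Under those assumptions the estimate lands on the same 1.5 GB per cell quoted above.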

At 1.5GB per cell, the cells in your hand could provide a storage medium bigger than the largest mechanical hard drive in existence. And it’s easy to see why this should be the case. Storing and retrieving genetic information is fundamental to evolution, which has had a long time to perfect the process, much longer than we have been making hard drives. As a result, it’s understandable that Microsoft is betting on DNA becoming the ultimate storage solution.

Real Time Work

A bioengineer and a geneticist at Harvard’s Wyss Institute have successfully stored 5.5 petabits of data (around 700 terabytes) in a single gram of DNA, smashing the previous DNA data density record by a factor of a thousand.

The work, carried out by George Church and Sri Kosuri, basically treats DNA as just another digital storage device. Instead of binary data being encoded as magnetic regions on a hard drive platter, strands of DNA that store 96 bits are synthesized, with each of the bases (TGAC) representing a binary value (T and G = 1, A and C = 0).
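The mapping described above (T and G for 1, A and C for 0, in 96-bit strands) can be illustrated with a minimal sketch. This is not Church and Kosuri's actual pipeline: the function names are hypothetical, the rule for choosing between the two candidate bases is an assumption made here to avoid repeating the same base twice in a row, and the per-strand addressing that a real system would need to reassemble strands in order is left out.

```python
ZERO_BASES = "AC"  # either base stands for a 0 bit
ONE_BASES = "GT"   # either base stands for a 1 bit
BLOCK_BITS = 96    # data bits per strand, as stated in the text


def encode(data):
    """Turn bytes into a list of base strings, one base per bit."""
    bits = "".join(f"{byte:08b}" for byte in data)
    strands = []
    for start in range(0, len(bits), BLOCK_BITS):
        strand = []
        for bit in bits[start:start + BLOCK_BITS]:
            choices = ONE_BASES if bit == "1" else ZERO_BASES
            # Assumed rule: take the first candidate unless it would repeat
            # the previous base, in which case take the second.
            if strand and strand[-1] == choices[0]:
                strand.append(choices[1])
            else:
                strand.append(choices[0])
        strands.append("".join(strand))
    return strands


def decode(strands):
    """Recover the original bytes, assuming strands arrive in order."""
    bits = "".join(
        "1" if base in ONE_BASES else "0"
        for strand in strands
        for base in strand
    )
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    message = b"Hello, DNA!"
    strands = encode(message)
    print(strands)
    print(decode(strands))
```

Because two bases are available for each bit value, the encoder has some freedom in which one to write; here that freedom is spent on avoiding runs of identical bases, which is one plausible use of it.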


One gram of DNA can store 700 terabytes of data. That is 14,000 50-gigabyte Blu-ray discs in a droplet of DNA that would fit on the tip of your pinky. To store the same amount of data on hard drives (the densest storage medium in use today) you would need 233 3TB drives, weighing a total of 151 kilos. In Church and Kosuri’s case, they have successfully stored around 700 kilobytes of data in DNA (Church’s latest book, in fact) and proceeded to make 70 billion copies, which they claim, jokingly, makes it the best-selling book of all time, totaling 44 petabytes of data stored.
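The capacity comparisons can be checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes); the helper names are made up for illustration.

```python
# Decimal storage units (an assumption; binary units would shift the numbers slightly).
PETABIT = 10**15   # bits
TERABYTE = 10**12  # bytes
GIGABYTE = 10**9   # bytes


def petabits_to_terabytes(petabits):
    """Convert petabits to terabytes (8 bits per byte)."""
    return petabits * PETABIT / 8 / TERABYTE


def units_needed(total_bytes, unit_bytes):
    """How many storage units of a given size hold total_bytes."""
    return total_bytes / unit_bytes


if __name__ == "__main__":
    one_gram = 700 * TERABYTE
    print(petabits_to_terabytes(5.5))                # 5.5 petabits in TB
    print(units_needed(one_gram, 50 * GIGABYTE))     # 50 GB Blu-ray discs
    print(int(units_needed(one_gram, 3 * TERABYTE))) # whole 3 TB drives
```

So 5.5 petabits works out to 687.5 TB, which the article rounds to 700 TB, and the Blu-ray and hard-drive counts follow from dividing 700 TB by the capacity of each medium.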


The future holds real potential for biological storage. We could live in a world that captures every moment and records it for all eternity and human posterity, because synthetic DNA storage would provide that amount of capacity.

It would not depend on hard drives that must be kept in warehouses for a finite time, under constant threat of failure. If the entirety of human knowledge (every book, uttered word, and funny cat video) can be stored in a few hundred kilos of DNA, though... well, it might just be possible to record everything.