Friday, July 6, 2012

DNA as a Data Storage Device

In this day and age, we are surrounded by technology: CDs, iPods, phones, computers and USB drives all reflect an ongoing quest for new and better ways to store information. Within the past few years, scientists have been investigating every possibility, ranging from semiconductors to carbon “nanoballs” to even our very own DNA!
Deoxyribonucleic acid, or DNA, possesses many of the characteristics of an ideal data storage device for the future. Present in all living organisms, DNA has the capacity to store very large amounts of information in its nucleotide sequences. Each nucleotide consists of a phosphate group and a deoxyribose sugar, which together form the sugar-phosphate backbone of the strand, attached to one of four nitrogenous bases – Adenine, Thymine, Cytosine and Guanine.
Figure 1: The structure of a nucleotide, consisting of a phosphate group, deoxyribose (sugar) and a nitrogenous base.





Using DNA synthesis, these nucleotides can be connected to form synthetic oligonucleotide sequences containing data stored in the form of specifically ordered nitrogenous bases.
In a study conducted by Yachie et al. (2007) at Keio University, the practicality of using bacterial DNA for long-term, large-volume data storage was investigated. The researchers were able to store a short alphanumeric message in loci of a Bacillus subtilis genome and retrieve it successfully. To do so, their chosen message “E=mc2” was first translated into dinucleotides, using a 4-bit binary code encryption key.
Figure 2: Encryption keys used in the Yachie et al. (2007) study at Keio University, Japan.

These dinucleotides were joined into long sequences, which were then introduced into the Bacillus subtilis cells. After an overnight incubation period, the data was recovered.
Figure 3: The 4-bit binary codes translate into dinucleotides which make up synthetic oligonucleotide sequences.
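
To make the encoding step concrete, here is a minimal Python sketch of how a message could be turned into dinucleotides. The key used below is hypothetical: it simply assigns the 16 possible dinucleotides (AA, AC, ... TT) to the 4-bit values 0 to 15, and it assumes each character is stored as an 8-bit ASCII code split into two 4-bit halves. The actual keys and character codes used by Yachie et al. are those shown in Figure 2 and are not reproduced here.

```python
BASES = "ACGT"

# Hypothetical key (an assumption, not the key from the study):
# 4-bit value 0..15 -> dinucleotide "AA", "AC", "AG", ..., "TT".
NIBBLE_TO_DINUC = {i: BASES[i // 4] + BASES[i % 4] for i in range(16)}


def encode_message(message: str) -> str:
    """Encode an ASCII message as a DNA string, two dinucleotides per character."""
    dna = []
    for byte in message.encode("ascii"):
        high, low = byte >> 4, byte & 0x0F  # split the byte into two 4-bit codes
        dna.append(NIBBLE_TO_DINUC[high])
        dna.append(NIBBLE_TO_DINUC[low])
    return "".join(dna)


if __name__ == "__main__":
    print(encode_message("E=mc2"))
```

Under this made-up key, each character becomes four bases, so the five-character message “E=mc2” becomes a 20-base sequence.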

The most common data storage and recovery method for DNA is based on polymerase chain reaction (PCR), which uses primers to amplify the coded regions of DNA. Encryption keys are then employed to decode each dinucleotide into its corresponding bit code and, if necessary, into alphanumeric code for convenient use or interpretation.
Figure 4: Bacillus subtilis under a microscope.
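
The decoding step described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key is a hypothetical one in which the 16 dinucleotides AA, AC, ... TT stand for the 4-bit values 0 to 15, and every pair of 4-bit codes is read as one 8-bit ASCII character. It starts from a sequence that has already been read out, so the PCR amplification and sequencing themselves are not modelled.

```python
BASES = "ACGT"

# Hypothetical key (an assumption, not the key from the study):
# dinucleotide "AA", "AC", ..., "TT" -> 4-bit value 0..15.
DINUC_TO_NIBBLE = {
    a + b: 4 * i + j
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}


def decode_sequence(dna: str) -> str:
    """Decode a DNA string back into ASCII text, four bases per character."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4 bases")
    # Each dinucleotide becomes one 4-bit code.
    nibbles = [DINUC_TO_NIBBLE[dna[i:i + 2]] for i in range(0, len(dna), 2)]
    # Each pair of 4-bit codes becomes one byte (one character).
    data = bytes(
        (nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2)
    )
    return data.decode("ascii")


if __name__ == "__main__":
    print(decode_sequence("CACCATTC"))
```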

Not only can DNA store significantly more data than existing storage media, but it is also praised for its durability in long-term data storage. DNA is naturally passed down from generation to generation of living organisms, and because of this, scientists postulate that any data inserted into an organism’s genome could last as long as the line of the host organism, potentially hundreds of thousands of years.
However, if the organism undergoes mutation as it evolves or adapts, several problems may occur, including corruption or loss of the data. Several methods have been suggested to reduce the effect of mutation, such as selecting a robust host organism that can survive in harsh environments. In addition, the study by Yachie et al. (2007) suggests an “alignment-based” method, in which several back-up copies of the data are inserted along with the original to increase the stability of the DNA data and reduce the chances of data deletion.
The use of genomic DNA to archive information is considered a significant advancement in genetics. According to recent studies, the natural characteristics of DNA, such as compactness, heritability and durability, make it an ideal data storage device – one that may ultimately blur the line between nature and technology forever.
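
The idea of keeping several back-up copies can be illustrated with a simple majority vote, sketched below in Python. This is a simplified stand-in rather than the method from the study: it assumes all recovered copies have the same length and differ only by single-base substitutions, whereas an alignment-based approach would first align the copies so that insertions and deletions can also be handled. The function name `consensus` is made up for this example.

```python
from collections import Counter
from typing import List


def consensus(copies: List[str]) -> str:
    """Return the most common base at each position across redundant copies."""
    if not copies:
        raise ValueError("at least one copy is required")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("this sketch assumes all copies have the same length")
    # zip(*copies) walks through the copies one position (column) at a time.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )


if __name__ == "__main__":
    # The third copy carries one mutated base, which the other two outvote.
    print(consensus(["CACC", "CACC", "CTCC"]))
```

With three copies, a mutation at any one position in a single copy is outvoted by the other two, which is why more back-ups make the stored data more stable.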
