TechRadar Pro recently caught up with David Turek, CTO at Catalog, a pioneering company in DNA-based data storage.
The practice is seen by many as the next frontier in recording (and reading back) bits and bytes at exascale, beyond the memristor and other exotic data recording methods.
Here’s what Turek had to say about his company’s Shannon prototype:
About the machine
“It is called Shannon in honor of Claude Shannon, who is known as the ‘father of information theory’. Shannon is still a relatively early stage prototype being used to explore the storage and computational issues associated with the ability to encode data in DNA. It is not yet ready for commercial sale.”
“Shannon is a writer. It can currently write data at about 10Mb/sec, but we have discovered ways to improve the design to reach gigabit-per-second speeds. Reading the data back is accomplished with the types of machines used for DNA sequencing. In our case, we are using machines from Oxford Nanopore Technologies.”
How it works
“In conventional computing, data is represented as a string of 1s and 0s. A compressed version of data is also represented as a string of 1s and 0s and can be encoded into DNA. Therefore, Shannon can handle any conventional data compression scheme. As such, any standard digital compression algorithm can be implemented in our encoding scheme.”
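To make the idea concrete, here is a minimal Python sketch of compressing data conventionally and then mapping the resulting bits onto DNA bases. It assumes a textbook mapping of two bits per nucleotide (the `BASES` table and both function names are hypothetical, chosen for illustration); it is not Catalog's actual encoding scheme, which Turek does not detail here.

```python
import zlib

# Hypothetical mapping of 2 bits per base (00=A, 01=C, 10=G, 11=T).
# This is an illustrative assumption, not Catalog's encoding.
BASES = "ACGT"


def encode_to_dna(data: bytes) -> str:
    """Compress the data conventionally, then spell the bits out as bases."""
    compressed = zlib.compress(data)
    bases = []
    for byte in compressed:
        # Take the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def decode_from_dna(strand: str) -> bytes:
    """Turn bases back into bytes, then decompress."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return zlib.decompress(bytes(out))


message = b"In conventional computing, data is a string of 1s and 0s. " * 4
strand = encode_to_dna(message)
assert decode_from_dna(strand) == message
```

Because the compressor only ever hands over bytes, any standard algorithm could stand in for `zlib` in this sketch without changing the base-mapping step.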
“However, we can further reduce the amount of data stored in DNA by virtue of the way we encode data to reduce the number of molecules used to represent the data. In a sense we are compressing data in both a conventional and novel way at the same time; we perform conventional compression on input data in a standard way, and then we take that compressed representation and further compact it in its DNA encoding scheme by reducing the number of molecules used to represent the previously compressed digital input data. In total, [as it stands] we can produce 186 GB of compressed data with Shannon.”
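For a sense of scale, the short Python sketch below estimates how long writing 186 GB would take at the quoted rate of about 10Mb/sec and at a gigabit per second. It assumes decimal units (1 GB = 10^9 bytes), that "Mb" means megabits, and that the whole 186 GB is written in one continuous run; none of these is confirmed in the interview, and the function name is hypothetical.

```python
def write_time_seconds(size_gb: float, rate_megabits_per_s: float) -> float:
    """Seconds needed to write size_gb gigabytes at the given megabit rate.

    Assumes decimal units: 1 GB = 1e9 bytes, 1 Mb = 1e6 bits.
    """
    total_bits = size_gb * 1e9 * 8
    return total_bits / (rate_megabits_per_s * 1e6)


current = write_time_seconds(186, 10)     # about 10 Mb/sec today
future = write_time_seconds(186, 1000)    # 1 Gb/sec target

print(f"{current / 3600:.1f} hours")      # 41.3 hours
print(f"{future / 60:.1f} minutes")       # 24.8 minutes
```

Under those assumptions, the hundredfold speed-up would cut the write from well over a day to under half an hour.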
The future of Shannon
“Increasing capacity can be accomplished in many ways. For example, we can adjust the chemistry inside Shannon, we can speed up the mechanical aspects of Shannon (e.g. have the webbing move through the machine faster), and we can adjust the print head configuration.”
“We expect these to be just a few of the levers we can adjust on Shannon to increase throughput and capacity. Mass production issues will be addressed on the basis of what our early customers teach us with respect to their use cases. In general, we would expect that implementing greater degrees of automation will be the key to mass acceptance of the solution.”
“Miniaturization of Shannon is going to be the byproduct of in-house innovation of chemistry and engineering, along with a clear understanding of how clients would want to make use of the device. It is premature to speculate about the form factor of the device.”
How much will it cost?
“The selling price of the final version will be determined by expected cost reductions from design improvements and informed by market demand. This will become clearer over the next year.”
“The media will be either DNA in a solution or a desiccated form in something like a very small pebble. In either form, it is likely that the amount of data you write will be contained in a volume smaller than a single test tube. There is no concept equivalent to the write protection found in conventional media; data is preserved by generating copies of the encoded DNA at almost no cost.”