Last week I had the honor of attending and speaking at the Library of Congress’s “Designing Storage Architectures for Digital Collections” event. The two-day meeting was full of fascinating ideas and conversations, and I had a great time.
I spoke on OpenStack Swift, the storage engine behind SwiftStack. As people store more and more data, we need a storage system that is capable of handling the volume of data that’s being created. SwiftStack is specifically designed to support these workloads and to give users and operators the tools they need to consume and manage storage. I covered user features like metadata search and filesystem interfaces and operator features like geo-distributed clusters and seamless upgrades.
The event also included a fascinating look at the current state of the storage media industry. Robert Fontana of IBM presented this analysis, and it was the single best presentation of the event. I love talking about new storage tech and cool new ideas as much as anyone else, but Robert’s analysis was a clear-headed look at the current state of the storage industry without any hype. I learned two main things in this talk. First, while the archiving and preservation world is still predominantly using tape, that is changing. Second, looking at the overall market trends, flash storage is rapidly displacing spinning drives in cost, capacity, and performance. Anyone who wants to be relevant in five years needs to understand these trends and prepare now.
One interesting prediction made during the first morning is that in 10 years we will see the end of the current “hot, warm, cold” tiering of data. I loved hearing this. While tape is still prevalent because of its very low cost, it’s not good for data that needs to be accessible. The greatest thing about modern storage systems like SwiftStack is that they enable things that weren’t previously possible: store your archives in Swift, and the data is still available. This data availability lets you do things that aren’t possible if you’re storing the data on tape and have best-case access times measured in minutes.
As new storage tech gets used for storing digital archives, the idea of what a “digital repository” even is changes. When archived data stays readily available, a digital repository transforms from a historic data fossil into data that can continue to be used and provide value.
The afternoon presentations on the first day dove into some new (and one old) storage tech. For the old, one Library of Congress researcher presented her team’s work on recreating old wax cylinders, accurate down to impurities introduced by the mold. By recreating these cylinders, Library researchers are able to understand how they behave in different environmental conditions (temperature, humidity, etc.) and discover why Library artifacts have been breaking. These wax cylinders preserve some irreplaceable sound recordings, and the wax cylinder research gives clues on how to keep these treasures for future generations.
Speaking of preserving for future generations… Research on some crazy new technology was presented: digital storage using DNA and digital storage using etchings in glass. DNA storage is quite interesting because of its relatively long shelf-life (on the order of 30k-50k years) and fantastic storage density. DNA storage could be used to store as much as 10 exabytes in one cubic centimeter. Of course, the problem with using DNA to store digital information is that writing requires creating synthetic DNA and reading requires sequencing a genome. Neither of these is fast, and, as such, this technology is likely several decades away from being in practical use.
Another new technology presented during the event is storing data in glass crystals. Using femtosecond laser pulses and precisely focused lenses, you can write data into glass, and that data is stable with zero further power input for 100 million years. Talk about long-term storage! Like DNA storage, this tech is in its infancy, but I found the presentation both fascinating and exciting.
The final bit of “new tech” discussed is in production today at Facebook. Facebook presented on the optical storage technology that they’re using for archiving old photos. They’ve stored “only” tens of petabytes in their optical storage so far. Panasonic, the company making the optical storage media system, has been looking at getting their tech working with other storage systems, including Swift, and I’m looking forward to helping them get involved in the community.
There were several other topics discussed, including storage for law enforcement body cams, document archiving systems, the economics of archival storage, and how to make data available for researchers.
I was excited to spend a couple of days in the U.S. capital hearing more about the use cases for today’s storage needs and learning more about new storage media tech on the horizon. Participating in and paying attention at these sorts of events helps me grow personally and helps Swift continue to be the best open source object storage system in the world.
If you’re interested in using Swift for archival storage, sign up for a SwiftStack trial and get started today.