A Blueprint: Multi-Region, Multi-Cloud, Metadata-Searchable Storage

Fred Hutchinson Cancer Research Center. HudsonAlpha Institute for Biotechnology. A top-10 global pharmaceutical company. Counsyl. Oklahoma Medical Research Foundation. Ploid. University of Virginia. Georgia Tech. SURFsara. And more. These companies and institutions, ranging from leading pharmaceutical companies to globally recognized not-for-profit researchers and myriad organizations providing specific services to the industry, have been moving rapidly into the future of scientific computing and research, and for each of them, managing ever-growing repositories of data has been a challenge.

Ever-Expanding Life Science Data

This is the current reality we hear about all too often: Individual departments or researchers maintain their own servers and storage; some use Amazon’s or Google’s cloud services; some leverage a central IT department; most do some of both. Important data gets shipped around the world on disk drives, so collaboration moves at the pace of freight shipments, and extra copies take up unnecessary space. The sum of all the laptop, workstation, GPFS/Lustre, NAS/SAN, and AWS/Google storage keeps growing, but no one has a clear view into what is stored where. Most of the applications still require the NFS or SMB/CIFS protocols, but the vendors of scalable, flexible, cost-optimized cloud storage support only the S3 or Swift APIs, or try to bridge the gap with an expensive gateway that can’t scale like the storage itself. Centralized search of the “whole archive of data” is a pipe dream; metadata must be part of the answer, but there are no standards for what metadata to generate or how to store it to make global search a reality. Meanwhile, the promise of things like personalized medicine only raises expectations for fast turnaround times, so project deadlines squeeze closer, and no one has the time to think about overhauling an entire infrastructure to proactively address all of this for the future.

Demands on Data: Collaboration, Security, Metadata, Time

If that resonates, you are not alone; every company mentioned above, and many others, has been where you are. By itself, SwiftStack can’t cure cancer, prevent heart attacks, understand juvenile diabetes, or help prepare new parents to care for a special-needs child, but we have made it a fundamental part of our business to simplify storage and data management so that you can. This blueprint is intended to introduce a strategy and architecture that has worked well for many SwiftStack clients to date.

Research Paper

Modernizing Data Management in Scientific Research

Who else is using SwiftStack for this? What problems are they solving?

We have included in the paper many details about solutions we deployed in close collaboration with our clients, along with ideas for how we hope those solutions can help you as well. If you would prefer a conversation, we’d love to speak with you directly.

About Author

Chris Nelson

Chris Nelson is the Vice President of Solutions at SwiftStack and leads the technical arm of the sales organization—including both the technical sales team of Solutions Architects and a Solutions Engineering team focused on developing and demonstrating integrations of SwiftStack software with third-party products and common industry workflows. Prior to joining SwiftStack, he spent several years leading sales engineering teams in storage software companies after spending over 10 years in a variety of storage product design and development roles at Sun Microsystems. Chris holds a B.S. in Computer Engineering, summa cum laude, from Kettering University and an M.S. in Computer Engineering from San Jose State University.