Tuesday, October 15, 2013

Backing up and archiving your multiphoton data: the MCS 2.0.6 cloud-based solution

Optical imaging instruments such as multiphoton microscopes generate large amounts of data. Because MOMs are often used to collect two-photon images many times per second over periods ranging from minutes to hours, typical files routinely exceed hundreds of megabytes. The trend towards producing tens or hundreds of gigabytes of data per day is accelerating with the use of video-rate resonant scanning imaging in behavioral experiments. As a result, the problem of backing up and archiving terabytes of two-photon datasets is becoming particularly acute. One could use a local file server, but this requires funds to buy the hardware, expertise to set up the server, and some form of periodic administration. And of course, when the file server is full, someone will have to take care of the problem.
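
To get a feel for these numbers, here is a rough back-of-the-envelope estimate in Python. The frame size (512 x 512 pixels), bit depth (16 bits per pixel), frame rate (30 frames per second) and single channel are illustrative assumptions, not MScan specifications; plug in your own acquisition settings.

```python
def bytes_per_second(width, height, bytes_per_pixel, frames_per_second, channels=1):
    """Raw, uncompressed data rate of an image stream, in bytes per second."""
    return width * height * bytes_per_pixel * frames_per_second * channels

# Assumed acquisition settings (hypothetical, for illustration only).
rate = bytes_per_second(width=512, height=512, bytes_per_pixel=2, frames_per_second=30)

gb_per_hour = rate * 3600 / 1e9  # decimal gigabytes
print(f"{rate / 1e6:.1f} MB/s, about {gb_per_hour:.1f} GB per hour of continuous imaging")
```

With these assumed settings, a single hour of continuous acquisition comes to roughly 57 GB, so a few sessions per day are enough to reach the volumes mentioned above.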

The computer industry already has a solution, called "cloud storage", for safely storing large amounts of data without having to invest in costly infrastructure. In exchange for a small fee, Web behemoths such as Amazon, Google and Microsoft offer disk space on their servers to end users, along with a Web-based interface to administer accounts and programming tools to upload or download files. The advantages of cloud storage include:

  • virtually unlimited or extremely large storage capacity.
  • reliability: the data is usually stored in triplicate and, optionally, at sites that are geographically distinct to prevent losing data in case of a catastrophic hardware failure.
  • universal availability through an Internet connection.
  • flexible pricing that scales up with the amount of data you store.

MCS 2.0.6 now takes advantage of cloud storage in two ways:
  • in MScan, newly created data files are uploaded automatically to a user-specific location on the cloud at the end of an imaging session. You no longer have to back up (or forget to archive...) your data files manually.
  • in MView, data files can be downloaded from the cloud to your local disk for analysis. Your data is available anywhere and at any time, as long as you have an Internet connection to access your cloud account.
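
For readers curious about what an end-of-session upload to a user-specific location involves, here is a minimal Python sketch. It is not how MScan is implemented; the function names, the `user/session/filename` key layout and the file names are hypothetical. The `client` argument is assumed to be an S3 client such as the one returned by `boto3.client("s3")`, whose `upload_file(Filename, Bucket, Key)` method performs the transfer.

```python
from pathlib import Path

def object_key(user, session_dir, file_path):
    """Build a user-specific key such as 'alice/2013-10-15/stack01.dat'."""
    return f"{user}/{Path(session_dir).name}/{Path(file_path).name}"

def upload_session(client, bucket, user, session_dir):
    """Upload every file in session_dir; return the list of keys written."""
    keys = []
    for path in sorted(Path(session_dir).iterdir()):
        if path.is_file():
            key = object_key(user, session_dir, path)
            # Assumed S3-style client: upload_file(Filename, Bucket, Key)
            client.upload_file(str(path), bucket, key)
            keys.append(key)
    return keys
```

With boto3 installed and credentials configured, a call might look like `upload_session(boto3.client("s3"), "my-lab-bucket", "alice", "D:/data/2013-10-15")`, where the bucket, user and folder names are placeholders.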

MCS 2.0.6 currently supports two services, Amazon S3 (Simple Storage Service) and Microsoft Azure. If you opt for their fee-based plans, Amazon places no restriction on the amount of data you can store, while Microsoft now caps each storage account at 200 TB. However, you can create as many storage accounts as you want. The maximum file size is 200 GB for Azure and 5 TB for S3.

There are many other cloud storage companies. Google has a service called Drive, similar to S3 and Azure, but files cannot be larger than 10 GB, which could be a problem with data collected in resonant scanning mode. For those who are accustomed to Dropbox, it appears that this company uses Amazon S3 as its underlying infrastructure.
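
The per-file limits quoted above lend themselves to a quick pre-upload check. The sketch below is illustrative only: the limits are the ones stated in this post and may change, the decimal interpretation of GB and TB is an assumption, and the function and dictionary names are hypothetical.

```python
GB = 10**9   # assumed decimal units
TB = 10**12

# Per-file limits as quoted in this post.
MAX_FILE_SIZE = {
    "Amazon S3": 5 * TB,
    "Microsoft Azure": 200 * GB,
    "Google Drive": 10 * GB,
}

def services_accepting(file_size_bytes):
    """Return the services whose per-file limit can hold a file of this size."""
    return sorted(name for name, limit in MAX_FILE_SIZE.items()
                  if file_size_bytes <= limit)

# A hypothetical 50 GB resonant-scanning file exceeds the Drive limit only.
print(services_accepting(50 * GB))
```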

