TL;DR 

  • The hardware underlying Mass Storage (/ms) is being replaced with a new system
  • It is critical to understand a few things about the migration to ensure you do not lose any data
  • You may need to take action, especially if you use scripts to copy data into or out of /ms
  • Migrating the existing data in /ms (petabytes, tens of millions of files) will take roughly a year and is proceeding by tape, not by onyen
  • Things look different to Mass Storage users as of October 27, 2020

 

Detailed Version:

ITS Research Computing is replacing the hardware underlying the service known as Mass Storage, which appears as /ms at the command prompt on Longleaf, Dogwood, and Data Mover nodes.

On October 27, 2020, the old /ms system was placed into READ ONLY mode and remounted under a different name (/ms-old). The new system is now available READ/WRITE and is mounted with the names previously used by the old system (/ms/home and /ms/depts).

 

As of October 27, 2020 (8am EST):

  • Old system (where all current /ms data resides)
    • /ms-old/home (read only)
    • /ms-old/depts (read only)
  • New system (where new data is written and to which data from the old system is being migrated)
    • /ms/home (read/write)
    • /ms/depts (read/write)

Once migration completes (approx. Fall 2021):

  • Old system
    • Not available
  • New system
    • /ms/home (read/write)
    • /ms/depts (read/write)

 

Beginning October 27, 2020:

To retrieve a file from Mass Storage: check the new system first (/ms). If the file is there, retrieval is significantly faster and has zero impact on the ongoing migration. If the file is not yet available on the new system, retrieve it from the old system (/ms-old). If you copy a file from /ms-old to /ms, consider renaming it: the migration script does not check whether a file already exists on the new system before moving it there, so it will overwrite the existing copy (see the NOTE about Potential for Data Loss below).
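For scripts that read from Mass Storage, the lookup order described above can be sketched in Python. This is only an illustration under stated assumptions: the function names `locate` and `renamed_copy_target`, the `.from-ms-old` suffix, and the example path are hypothetical, and the sketch assumes that a simple existence check is enough to tell whether a file has been migrated.

```python
from pathlib import Path
from typing import Optional

NEW_ROOT = Path("/ms")      # new system (read/write)
OLD_ROOT = Path("/ms-old")  # old system (read only)


def locate(relative_path: str,
           new_root: Path = NEW_ROOT,
           old_root: Path = OLD_ROOT) -> Optional[Path]:
    """Return where the file currently lives, preferring the new system.

    relative_path must be relative to the mount point,
    e.g. "home/onyen/data.tar" (a hypothetical path).
    """
    for root in (new_root, old_root):
        candidate = root / relative_path
        if candidate.exists():
            return candidate
    return None  # not found on either system


def renamed_copy_target(relative_path: str,
                        new_root: Path = NEW_ROOT,
                        suffix: str = ".from-ms-old") -> Path:
    """Build a destination on the new system under a different name, so a
    later migration of the original file does not land on top of the copy."""
    original = new_root / relative_path
    return original.with_name(original.name + suffix)
```

A script would call `locate("home/onyen/data.tar")` and read from whatever path it returns; if it needs to copy a file from /ms-old into /ms by hand, `renamed_copy_target` gives a destination name that differs from the one the migration will use.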

To write to Mass Storage: you must write to the new system (/ms), because the old system is in read-only mode.

 

 

File Sizes in /ms:

Your efforts to package data into appropriate file sizes and counts before copying it into /ms are greatly appreciated by the entire community of /ms users.

File sizes are an important consideration in archival storage. Because /ms is a large holding tank shared by many UNC researchers and departments, its performance can be severely affected by how it is used. If a number of users attempt to store millions of small files, the file system will be severely degraded. The cost of data storage varies widely based on its characteristics. /ms is designed to be “cold storage”, or archival storage: data that is not frequently accessed. The system is optimized for high capacity, not for high performance or throughput.

 

Recommended Minimum File Size for the new /ms: 10 GB
One of the greatest threats to the long-term health and performance of a large archival storage system supporting a community of researchers is too many small files. Please use the tar command to combine files, directories, directories of files, etc. It is not necessary to compress files using tar/zip; however, it is fine to do so. If your data totals less than 10 GB, our recommendation for long-term storage is to retain it on /proj instead of copying it to /ms.
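As an illustration of combining many small files into a single archive, here is a minimal sketch using Python's standard `tarfile` module, which produces the same format as the tar command. The function name `bundle_directory` is hypothetical, and the archive is written uncompressed since compression is optional.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir: Path, archive_path: Path) -> int:
    """Pack source_dir (recursively) into one uncompressed tar file and
    return the archive's size in bytes.

    archive_path should live outside source_dir so the archive does not
    try to include itself.
    """
    with tarfile.open(archive_path, mode="w") as archive:
        # arcname stores paths relative to the directory's own name
        # instead of the full absolute path.
        archive.add(source_dir, arcname=source_dir.name)
    return archive_path.stat().st_size
```

The returned size can then be compared against the recommended minimum before the archive is copied into /ms.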

Recommended Maximum File Size for the new /ms: 3 TB
While the system is capable of handling files larger than 3 TB, we recommend that any individual file copied into the archive be no larger than 3 TB. If you need to store files larger than 3 TB, please contact us at research@unc.edu
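The two size recommendations can be folded into a small pre-copy check. The sketch below is only illustrative: the function name `placement_advice` and its return strings are hypothetical, and it assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes), which this page does not specify.

```python
GB = 10**9   # assumption: decimal gigabyte
TB = 10**12  # assumption: decimal terabyte

MIN_RECOMMENDED = 10 * GB  # recommended minimum file size for /ms
MAX_RECOMMENDED = 3 * TB   # recommended maximum file size for /ms


def placement_advice(size_bytes: int) -> str:
    """Map a file size to the recommendation given for the new /ms."""
    if size_bytes < MIN_RECOMMENDED:
        return "keep on /proj"
    if size_bytes <= MAX_RECOMMENDED:
        return "ok for /ms"
    return "contact research@unc.edu"
```

For example, `placement_advice(500 * GB)` returns `"ok for /ms"`, while a 2 GB archive would be better kept on /proj.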

 

Order of file migration:

The files are being moved by tape and in the order they appear on each tape, not by onyen or directory name. This is because it is much faster to move an entire tape’s set of files once it is mounted and also produces much less wear and tear on the tape (so it is less likely to break). What this means to you is that some of your files from the old mass storage may get migrated at the beginning of the migration period while others may appear at the end, but they probably won’t appear all at once unless they are all on the same tape.

 

Migrated files’ last modified date:

Once a file is migrated, it will show its migration date as its last modified date. Preserving last modified information across the migration requires a manual reset by a system administrator for each file, which would prohibitively lengthen the migration project timeline. At the request of file owners, exceptions have been made for a few files, and plans are in place to preserve last modified information for those files only.

 

Hardware details:

  • Old system: Tape library with robot; two copies of each file stored: ITS Manning, Iron Mountain
  • New system: Disk-based; two copies of each file stored: ITS Manning, ITS Franklin