Transferring Data#
This guide covers methods for transferring data between your local system and the NMTHPC cluster.
Before You Transfer#
Planning Your Transfer#
Consider:
Data size: Small (< 1 GB), medium (1-100 GB), or large (> 100 GB)
Number of files: Few large files vs. many small files
Frequency: One-time transfer vs. regular synchronization
Location: Where you’re transferring from (on-campus, off-campus, another HPC center)
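The first two questions can be answered with standard tools before starting. The following is a minimal sketch, run on the system that holds the data; `summarize_dir` is a hypothetical helper name, not a tool provided on NMTHPC.

```shell
# Summarize a directory before transfer: total size and number of regular files.
# "summarize_dir" is a hypothetical helper name used only for illustration.
summarize_dir() {
    dir="$1"
    # du -sh prints "<size><TAB><path>"; keep only the size column.
    size=$(du -sh "$dir" | cut -f1)
    # Count regular files; tr strips the padding some wc implementations add.
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "size=$size files=$count"
}
```

For example, `summarize_dir my_directory` prints one line with the total size and file count. A large count relative to the size suggests archiving with tar before transferring.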
Destination Filesystems#
Choose the appropriate destination:
Home directory (/home/username): Small datasets, code, scripts
ZFS: Active project data, large datasets
BeeGFS: Long-term storage, archival data
See Nodes and Filesystems for more information.
Command-Line Tools#
SCP (Secure Copy)#
Best for: Small to medium-sized files, simple one-time transfers
Copy file TO NMTHPC:
$ scp localfile.txt username@nmthpc.id.nmt.edu:~/
Copy file FROM NMTHPC:
$ scp username@nmthpc.id.nmt.edu:~/remotefile.txt ./
Copy directory recursively:
$ scp -r local_directory username@nmthpc.id.nmt.edu:~/destination/
Copy multiple files:
$ scp file1.txt file2.txt username@nmthpc.id.nmt.edu:~/data/
Rsync (Recommended for Most Transfers)#
Best for: Synchronizing directories, resumable transfers, large datasets
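Basic rsync to NMTHPC (without a trailing slash on the source, local_directory itself is created inside remote_directory; add a trailing slash to copy only its contents):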
$ rsync -avzP local_directory username@nmthpc.id.nmt.edu:~/remote_directory/
Flags explained:
-a: Archive mode (preserves permissions, timestamps)
-v: Verbose output
-z: Compress during transfer
-P: Show progress and keep partially transferred files so an interrupted transfer can resume
Rsync from NMTHPC:
$ rsync -avzP username@nmthpc.id.nmt.edu:~/remote_directory ./local_directory/
Dry run (see what would be transferred):
$ rsync -avzPn local_directory username@nmthpc.id.nmt.edu:~/remote_directory/
Exclude files:
$ rsync -avzP --exclude='*.log' --exclude='tmp/' local_dir username@nmthpc.id.nmt.edu:~/
Delete files on the destination that no longer exist in the source (use carefully; try a dry run with -n first):
$ rsync -avzP --delete local_directory username@nmthpc.id.nmt.edu:~/remote_directory/
Tip
Rsync only transfers changed files, making it ideal for synchronizing directories and resuming interrupted transfers.
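Because rerunning the same rsync command picks up where the last attempt stopped, an unreliable connection can be handled by retrying. The following is a minimal sketch; `retry` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary values, not site recommendations.

```shell
# Retry a command up to N times, pausing between attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
# "retry" is a hypothetical helper name used only for illustration.
retry() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the command; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
        # Only wait if another attempt will follow.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (commented out so that sourcing this snippet starts no transfer):
# retry 3 30 rsync -avzP local_directory username@nmthpc.id.nmt.edu:~/remote_directory/
```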
Tar and Compress Before Transfer#
For many small files, compress into a single archive first:
Create compressed archive:
$ tar -czf mydata.tar.gz my_directory/
Transfer archive:
$ rsync -avP mydata.tar.gz username@nmthpc.id.nmt.edu:~/
Extract on NMTHPC:
$ tar -xzf mydata.tar.gz
Tip
Transferring a single compressed archive is usually much faster than transferring thousands of small files individually, because the per-file overhead is paid only once.
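It can be worth confirming that an archive is readable before sending it, since a truncated archive is only discovered at extraction time otherwise. The following is a minimal sketch; `pack_and_check` is a hypothetical helper name, and absolute paths are assumed for both arguments.

```shell
# Create a compressed archive and confirm it can be read back.
# Usage: pack_and_check <source_directory> <archive_path>
# "pack_and_check" is a hypothetical helper name used only for illustration.
pack_and_check() {
    src_dir="$1"
    archive="$2"
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    # tar -tzf lists the contents; a non-zero exit means the archive is unreadable.
    tar -tzf "$archive" > /dev/null || return 1
    echo "OK: $archive is readable"
}
```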
Transfer Best Practices#
Optimize Transfer Speed#
Compress data: Use -z with rsync or create tar.gz archives
Use multiple connections: Some tools support parallel transfers
On-campus transfers: Faster when on NMT network
Off-campus: Use VPN for security and potentially better routing
Avoid peak hours: Large transfers are faster during off-peak times
Many Small Files#
For directories with many small files:
Option 1: Create archive first:
$ tar -czf archive.tar.gz directory/
$ rsync -avP archive.tar.gz username@nmthpc.id.nmt.edu:~/
Option 2: Use rsync with appropriate options:
$ rsync -avzP --info=progress2 directory/ username@nmthpc.id.nmt.edu:~/directory/
Verification#
Verify transfer with checksums:
Local system:
$ md5sum largefile.dat > largefile.md5
After transferring both largefile.dat and largefile.md5 to NMTHPC (run in the directory that contains them):
$ md5sum -c largefile.md5
Compare directory sizes (a rough check only; reported sizes can differ between filesystems):
$ du -sh directory
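The single-file checksum approach extends to a whole directory by recording one checksum per file. The following is a minimal sketch, assuming GNU `md5sum` is available on both systems; `make_manifest`, `verify_manifest`, and the manifest file name are hypothetical.

```shell
# Write a checksum manifest covering every regular file under a directory.
# Usage: make_manifest <directory> <manifest_file>
# Keep the manifest outside the directory so it does not list itself.
make_manifest() {
    dir="$1"
    manifest="$2"
    # Run from inside the directory so the recorded paths are relative
    # and stay valid after the directory is copied elsewhere.
    ( cd "$dir" && find . -type f -exec md5sum {} + | sort -k 2 ) > "$manifest"
}

# Check a copy of the directory against the manifest.
# Usage: verify_manifest <directory> <manifest_file>
# Returns non-zero if any file is missing or differs.
verify_manifest() {
    dir="$1"
    manifest="$2"
    # The manifest is read on standard input, so a relative manifest path still works.
    ( cd "$dir" && md5sum --quiet -c - ) < "$manifest"
}
```

For example, run `make_manifest my_directory my_directory.md5` locally, transfer both, then run `verify_manifest my_directory my_directory.md5` on NMTHPC. With `--quiet`, only files that fail the check are printed.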
Transferring from Other HPC Systems#
Direct Transfer Between HPC Systems#
From NMTHPC to another HPC:
# On NMTHPC
$ scp -r data/ username@other-hpc.edu:~/destination/
From another HPC to NMTHPC:
# On the other HPC system
$ scp -r data/ username@nmthpc.id.nmt.edu:~/destination/
Troubleshooting#
Permission Denied#
Verify destination directory exists
Check write permissions on destination
Ensure you have quota space available:
$ quota -s
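The first two checks can be run on NMTHPC before retrying the transfer. The following is a minimal sketch; `check_dest` is a hypothetical helper name, not a tool provided on the cluster.

```shell
# Report whether a destination directory exists and is writable by the current user.
# Usage: check_dest <directory>
# "check_dest" is a hypothetical helper name used only for illustration.
check_dest() {
    dest="$1"
    if [ ! -d "$dest" ]; then
        echo "missing: $dest" >&2
        return 1
    fi
    if [ ! -w "$dest" ]; then
        echo "not writable: $dest" >&2
        return 1
    fi
    echo "ok: $dest"
}
```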
Too Many Small Files#
Create tar archive first
Use rsync with --info=progress2 for better progress reporting
Consider parallel transfer tools for very large numbers of files
Automated Transfers#
Using SSH Keys for Automation#
Set up SSH keys to avoid entering passwords:
# Generate key (if you haven't already)
$ ssh-keygen -t ed25519
# Copy to NMTHPC
$ ssh-copy-id username@nmthpc.id.nmt.edu
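Once key authentication works, a transfer can run unattended, for example from a scheduled job. The following is a minimal sketch; `sync_to_cluster` and the `RSYNC` override variable are hypothetical names, and the paths in the usage line are placeholders. `BatchMode=yes` is a standard OpenSSH option that makes ssh fail immediately rather than prompt for a password, so a job does not hang if the key stops working.

```shell
# Unattended sync sketch for use in scripts or scheduled jobs.
# Usage: sync_to_cluster <source> <destination>
# "sync_to_cluster" and the RSYNC variable are hypothetical names.
sync_to_cluster() {
    src="$1"
    dest="$2"
    # RSYNC may be set to a stand-in command to preview the arguments
    # without starting a transfer; it defaults to the real rsync.
    "${RSYNC:-rsync}" -az -e "ssh -o BatchMode=yes" "$src" "$dest"
}

# Example (commented out so that sourcing this snippet starts no transfer):
# sync_to_cluster local_directory/ username@nmthpc.id.nmt.edu:~/remote_directory/
```

The function returns rsync's exit status, so a calling script can log or retry on failure.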
Questions?#
For questions about data transfer or issues with file transfers, contact hpc@nmthpc.atlassian.net.