pcp is similar to cp -r: give it a source directory and a destination, and pcp recursively copies the source directory to the destination in parallel. It has several features that make it useful for copying data in a cluster computing environment with Lustre distributed filesystems:
- files are chunked and each chunk is copied in parallel
- Lustre striping information can be copied (or optionally set based on file size)
- built-in checkpointing to safely resume transfers that are interrupted by nodes failing, jobs being killed for running over queue limits, etc.
- checksumming of files after transfer to verify integrity of the copied data
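
To make the chunking and checksumming ideas above concrete, here is a minimal single-machine sketch in Python. It is not pcp's implementation: a local thread pool stands in for the parallel workers, and the function names, the 1 MiB chunk size, and the choice of SHA-1 are all assumptions made for illustration.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size (1 MiB), not pcp's default


def chunk_ranges(size, chunk_size=CHUNK_SIZE):
    """Split a file of `size` bytes into (offset, length) pairs."""
    return [(off, min(chunk_size, size - off)) for off in range(0, size, chunk_size)]


def copy_chunk(src, dst, offset, length):
    """Copy one byte range from src to dst at the same offset."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))


def sha1_of(path, block=1024 * 1024):
    """Hash a file in blocks so large files are not read into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for data in iter(lambda: f.read(block), b""):
            digest.update(data)
    return digest.hexdigest()


def chunked_copy(src, dst, workers=4, chunk_size=CHUNK_SIZE):
    """Copy src to dst chunk by chunk, then report whether the checksums match."""
    size = os.path.getsize(src)
    # Pre-size the destination so each worker can write at its own offset.
    with open(dst, "wb") as f:
        f.truncate(size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(copy_chunk, src, dst, off, n)
                   for off, n in chunk_ranges(size, chunk_size)]
        for fut in futures:
            fut.result()  # re-raise any error from a worker
    return sha1_of(src) == sha1_of(dst)
```

Because every chunk is written at a fixed offset into a pre-sized destination file, the chunks can complete in any order; the final checksum comparison is what confirms the copy is intact.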
pcp works best on a cluster with MPI support and a Lustre filesystem, but it can also copy files to and from other filesystems, without Lustre-specific functionality such as stripe awareness.
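
The checkpointing feature listed above can be illustrated in the same spirit. The sketch below records each finished chunk offset in a small JSON file so that a restarted copy skips work already done. The checkpoint format, the function names, and the sequential loop are hypothetical and are not taken from pcp; the sketch also assumes the same chunk size is used on every run.

```python
import json
import os


def load_done(checkpoint_path):
    """Return the set of chunk offsets recorded as finished, or an empty set."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return set(json.load(f))


def save_done(checkpoint_path, done):
    """Write the checkpoint atomically: write a temp file, then rename it."""
    tmp = checkpoint_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, checkpoint_path)


def resume_copy(src, dst, checkpoint_path, chunk_size=1024 * 1024):
    """Copy src to dst chunk by chunk, skipping chunks already checkpointed.

    Returns the number of chunks copied during this call.
    """
    size = os.path.getsize(src)
    done = load_done(checkpoint_path)
    copied = 0
    # Open for update if a partial destination exists, otherwise create it.
    mode = "r+b" if os.path.exists(dst) else "w+b"
    with open(src, "rb") as fin, open(dst, mode) as fout:
        for offset in range(0, size, chunk_size):
            if offset in done:
                continue  # finished in an earlier, interrupted run
            fin.seek(offset)
            fout.seek(offset)
            fout.write(fin.read(chunk_size))
            fout.flush()
            done.add(offset)
            save_done(checkpoint_path, done)
            copied += 1
    return copied
```

Calling resume_copy again with the same checkpoint file after an interruption copies only the chunks that were not yet recorded; a second call after a complete copy does no work.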