Hello.
I need to copy ~40,000 files from a server to my computer, and I'm wondering what the best approach is. Here are the options I'm considering, with rough command sketches further down:
Using scp:
- slow
- consumes lots of bandwidth

Using rsync:
- slow
- consumes less bandwidth
- can resume the copy after a network problem

Using tar, then scp:
- less slow
- consumes less bandwidth

Using tar, then rsync:
- less slow
- consumes less bandwidth
- can resume the copy after a network problem

Using tar, then split, then parallel with scp:
- fast
- consumes less bandwidth

Using tar, then split, then parallel with rsync:
- fast
- consumes less bandwidth
- can resume the copy after a network problem
I think I will opt for the last one, but what would you do in my case?
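For reference, a minimal sketch of the plain scp and plain rsync options; host@server and ~/path/to/folder are placeholders, matching the commands in the edit below:

# plain recursive copy with scp: simple, but cannot resume if the connection drops
scp -r host@server:~/path/to/folder .
# rsync with on-the-fly compression; --partial keeps partly transferred files so the copy can resume
rsync -az --partial host@server:~/path/to/folder .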
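And a sketch of the tar-then-scp / tar-then-rsync variants under the same placeholder assumptions; packing the ~40,000 files into one archive first avoids the per-file overhead:

# on the server: pack the folder into a single compressed archive
ssh host@server 'tar czf files.tar.gz ~/path/to/folder/'
# on the local machine: fetch the archive, then unpack it
scp host@server:files.tar.gz .                  # tar + scp variant
# rsync --partial host@server:files.tar.gz .    # tar + rsync variant (resumable)
tar xvf files.tar.gz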
Edit: bash commands for using tar, then split, then parallel with rsync:
Prerequisite: install parallel and silence its citation notice:
sudo apt install parallel && echo "will cite" | parallel --citation &>/dev/null
# on server
tar czf files.tar.gz ~/path/to/folder/       # pack the folder into one compressed archive
split -b 20M files.tar.gz fragment_          # cut it into 20 MB pieces: fragment_aa, fragment_ab, ...
# on local machine
ssh host@server 'ls -1 fragment_*' | parallel rsync -z host@server:{} .   # pull the pieces in parallel
cat fragment_* > files.tar.gz                # reassemble the archive
tar xvf files.tar.gz                         # unpack it
Edit 2: I ended up using a simple rsync command, since it can compress files on the fly and resume the transfer from where it stopped. Since rsync already uses all the bandwidth available, the bandwidth itself is the bottleneck, and that is not something parallel can solve.
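For what it's worth, a minimal sketch of what such a single rsync command could look like, using the same host@server and ~/path/to/folder placeholders as above; -z compresses on the fly, and -P (short for --partial --progress) lets an interrupted transfer pick up where it stopped:

# resumable, compressed copy of the whole folder in one command
rsync -azP host@server:~/path/to/folder .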