GCP Increase Download Speed for gsutil

If you’re downloading large files, you can speed up downloads by tuning gsutil's parallelism settings. Decrease the number of threads per process and increase the number of processes.

gsutil \
-o "GSUtil:parallel_thread_count=1" \
-o "GSUtil:parallel_process_count=8" \
cp gs://bucket/source.dat /download/dest/file.dat
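
gsutil can also split a single large object into components that download in parallel (sliced object download). The sketch below adds those settings to the command above. The threshold and component count are illustrative assumptions, not tuned recommendations, and sliced downloads generally need a compiled crcmod to perform well. It prints the command for review instead of running it.

```shell
#!/bin/bash
# Illustrative values (assumptions, not tuned recommendations).
threshold="150M"      # size threshold for slicing an object
components=8          # upper bound on parallel slices per object

opts=(
    -o "GSUtil:parallel_thread_count=1"
    -o "GSUtil:parallel_process_count=8"
    -o "GSUtil:sliced_object_download_threshold=${threshold}"
    -o "GSUtil:sliced_object_download_max_components=${components}"
)

# Print the full command for review; drop the leading "echo" to run it.
echo gsutil "${opts[@]}" cp gs://bucket/source.dat /download/dest/file.dat
```

Keeping the options in an array makes it easy to try different values and compare timings.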

Elapsed Time in a Bash Script

Here’s a small Bash function that displays elapsed time in days, hours, minutes, and seconds. It’s handy for showing how long a process ran.

#!/bin/bash
# pass number of seconds as argument. 
# Example below calculates 1000s.
# elapse.sh 1000
function show_time () {
    num=$1
    min=0
    hour=0
    day=0
    if((num>59));then
        ((sec=num%60))
        ((num=num/60))
        if((num>59));then
            ((min=num%60))
            ((num=num/60))
            if((num>23));then
                ((hour=num%24))
                ((day=num/24))
            else
                ((hour=num))
            fi
        else
            ((min=num))
        fi
    else
        ((sec=num))
    fi
    echo "$day"d "$hour"h "$min"m "$sec"s
}
show_time "$1"
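
The same breakdown can be computed without nested conditionals by deriving each unit directly from the total number of seconds. This is a compact alternative sketch, not a replacement for the function above; the name show_time_compact is made up for illustration.

```shell
#!/bin/bash
# Compact alternative: derive each unit directly from the total seconds.
# show_time_compact is a hypothetical name used only for this sketch.
show_time_compact() {
    local total=${1:-0}    # default to 0 when no argument is given
    printf '%dd %dh %dm %ds\n' \
        $((total / 86400)) \
        $((total % 86400 / 3600)) \
        $((total % 3600 / 60)) \
        $((total % 60))
}

show_time_compact 1000    # prints: 0d 0h 16m 40s
```

Using local keeps the variable from leaking into the calling script.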

Fpsync

Fpsync is a command-line tool for synchronizing directories in parallel using the fpart and rsync tools. You can specify the number of concurrent sync jobs, the number of files per sync job, and the maximum byte size per sync job, among other things. Fpsync is believed to be 4 to 5 times faster than plain rsync. It makes the most sense when syncing massive drives with thousands of directories and small files.

To install fpsync on Debian or Ubuntu (it ships with the fpart package):

apt install fpart

Run fpsync with 8 parallel jobs:

log='/root/fpsync.log'
fpsync -n 8 -v /root/tmp1/ /root/tmp2/ >> $log
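
The limits on files and bytes per sync job mentioned earlier map to fpsync's -f and -s options. The sketch below builds such a command. The values are illustrative assumptions, not recommendations, and the command is printed for review rather than executed.

```shell
#!/bin/bash
# Illustrative limits (assumed values; tune for your data set).
jobs=8                                      # -n: concurrent sync jobs
files_per_job=2000                          # -f: max files per sync job
bytes_per_job=$((4 * 1024 * 1024 * 1024))   # -s: max bytes per sync job (4 GiB)

cmd=(fpsync -n "$jobs" -f "$files_per_job" -s "$bytes_per_job" -v /root/tmp1/ /root/tmp2/)

# Show the command; run it with "${cmd[@]}" once it looks right.
echo "${cmd[@]}"
```

Smaller per-job limits spread the work more evenly across jobs, at the cost of starting more rsync processes.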

A sample script with timestamps that logs the elapsed time:

#!/bin/bash
log='/root/fpsync.log'
start=$(date)
begin=$(date +%s)
echo 'Start: '$start > $log
fpsync -n 8 -v /root/tmp1/ /root/tmp2/ >> $log
stop=$(date)
end=$(date +%s)
echo 'Stop: '$stop >> $log
elapse=$((end-begin))
 
function show_time () {
    num=$1
    min=0
    hour=0
    day=0
    if((num>59));then
        ((sec=num%60))
        ((num=num/60))
        if((num>59));then
            ((min=num%60))
            ((num=num/60))
            if((num>23));then
                ((hour=num%24))
                ((day=num/24))
            else
                ((hour=num))
            fi
        else
            ((min=num))
        fi
    else
        ((sec=num))
    fi
    echo "$day"d "$hour"h "$min"m "$sec"s
}
show_time $elapse >> $log
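
Bash also provides a built-in SECONDS counter, which avoids calling date twice to compute the difference. A minimal sketch, with sleep standing in for the fpsync command:

```shell
#!/bin/bash
# SECONDS is a Bash built-in that counts seconds since it was last assigned.
SECONDS=0
sleep 1                  # placeholder for the real work, e.g. the fpsync command
elapse=$SECONDS          # whole seconds elapsed since the reset
echo "Elapsed: ${elapse}s"
```

The resulting value can be passed to show_time in the same way as the date-based difference.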

For comparison, you can substitute rsync for fpsync and see the performance difference.

fpsync -n 8 -v /root/tmp1/ /root/tmp2/ >> $log
# or
rsync -av /root/tmp1/ /root/tmp2/ > /dev/null