Remote Copy

Copy files to a remote server, archiving them on the fly for optimal throughput

— Binary Adventures

There are many options for transferring files to remote servers: SMB, (S)FTP(S), WebDAV, etc. In this post, we’ll look at transferring files with the Copy-Item cmdlet in PowerShell over a PSSession.

What is a (PS)Session? In short:

Technically, a session is an execution environment in which PowerShell runs. […] From a Windows perspective, a session is a Windows process on the target computer.

If you’re interested in the technical background, check this excellent blog post.
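
To make the "a session is a process" idea concrete, here is a small sketch that opens a session and asks the remote end to describe its own process. The function name Get-RemoteSessionProcess and its parameters are made up for illustration. On a WinRM-based session, the reported process is typically wsmprovhost.

```powershell
# Hypothetical helper, not part of the original post.
function Get-RemoteSessionProcess {
    param(
        [Parameter(Mandatory)] [string] $ComputerName,
        [Parameter(Mandatory)] [pscredential] $Credential
    )

    $session = New-PSSession -ComputerName $ComputerName -Credential $Credential
    try {
        # $PID is evaluated on the remote side, so it is the ID of the
        # process that hosts this session on the target computer.
        Invoke-Command -Session $session -ScriptBlock {
            Get-Process -Id $PID | Select-Object -Property Name, Id
        }
    }
    finally {
        # Always close the session so the remote process is released.
        Remove-PSSession -Session $session
    }
}
```

Calling it as `Get-RemoteSessionProcess -ComputerName 'server01' -Credential (Get-Credential)` (with 'server01' as a placeholder) should return one row per call, each with a different process ID, since every call creates a fresh session.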

So what does the code below do?

  1. It starts by looking at the local folder you pass it. If it contains any files, it creates a temporary archive.
  2. Given a server name and a credential, a remote session (PSSession) is established to the server.
  3. The archive is copied to the remote server, and extracted in a newly created subfolder (with a random name to avoid overwriting existing objects).
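
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the post's actual listing: the function names New-TemporaryArchive and Copy-FolderToRemote and all parameter names are hypothetical, and it assumes PowerShell 5.0 or later on both ends (for Compress-Archive, Expand-Archive and Copy-Item -ToSession).

```powershell
# Hypothetical helper: step 1. Returns the path of a temporary zip file,
# or nothing if the folder contains no files.
function New-TemporaryArchive {
    param([Parameter(Mandatory)] [string] $LocalFolder)

    $anyFile = Get-ChildItem -LiteralPath $LocalFolder -File -Recurse |
        Select-Object -First 1
    if (-not $anyFile) {
        Write-Warning "No files found in '$LocalFolder'; nothing to archive."
        return
    }

    $archive = Join-Path ([System.IO.Path]::GetTempPath()) ("{0}.zip" -f [guid]::NewGuid())
    # The wildcard archives the folder's contents, including subfolders.
    Compress-Archive -Path (Join-Path $LocalFolder '*') -DestinationPath $archive
    $archive
}

# Hypothetical wrapper: steps 2 and 3.
function Copy-FolderToRemote {
    param(
        [Parameter(Mandatory)] [string] $LocalFolder,
        [Parameter(Mandatory)] [string] $ComputerName,
        [Parameter(Mandatory)] [pscredential] $Credential,
        [Parameter(Mandatory)] [string] $RemoteFolder
    )

    $archive = New-TemporaryArchive -LocalFolder $LocalFolder
    if (-not $archive) { return }

    $session = $null
    try {
        # Step 2: establish the remote session.
        $session = New-PSSession -ComputerName $ComputerName -Credential $Credential

        # Step 3a: create a randomly named subfolder on the remote side.
        $target = Invoke-Command -Session $session -ArgumentList $RemoteFolder -ScriptBlock {
            param($Root)
            $dir = Join-Path $Root ([System.IO.Path]::GetRandomFileName())
            (New-Item -ItemType Directory -Path $dir -Force).FullName
        }

        # Step 3b: copy the single archive through the session.
        Copy-Item -Path $archive -Destination $target -ToSession $session

        # Step 3c: extract remotely, then delete the remote copy of the archive.
        $zipName = Split-Path -Path $archive -Leaf
        Invoke-Command -Session $session -ArgumentList $target, $zipName -ScriptBlock {
            param($Dir, $ZipName)
            $zip = Join-Path $Dir $ZipName
            Expand-Archive -LiteralPath $zip -DestinationPath $Dir
            Remove-Item -LiteralPath $zip
        }

        $target   # report where the files ended up
    }
    finally {
        if ($session) { Remove-PSSession -Session $session }
        Remove-Item -LiteralPath $archive -ErrorAction SilentlyContinue
    }
}
```

The try/finally block is a design choice in this sketch: it makes sure the session is closed and the local temporary archive is deleted even if the copy or the extraction fails halfway.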

Rationale

Why is this “better” or more optimal than just copying the files individually using Copy-Item?

If you’ve ever tried copying a large number of relatively small files over the network, you’ll have noticed that the transfer speed drops drastically, regardless of your bandwidth. In contrast, copying a single file as large as the sum of all the individual files yields a much higher transfer speed.

By archiving the files before we transfer them, and extracting them on the remote server, we avoid having to copy many individual files and thereby optimise the transfer (here’s a good post on Server Fault about this).

Proof and pudding

The proof of the pudding is in the eating, so here’s a quick test to show the difference between copying the files individually and copying them as an archive.

Given a set of folders with 538 files, totalling 198 MB, this is the time it took to transfer the files:

  • Using Copy-Item: 4 minutes, 33 seconds
  • Using Copy-Item in combination with archiving: 25 seconds

While by no means a scientific test, it does show that there’s a big difference in throughput time, even when taking into account the overhead of archiving and extracting the files before and after the copy operation.
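
A comparison like this can be timed with Measure-Command. The sketch below is one way to do it, not necessarily how the numbers above were obtained: the helper name Compare-CopyDuration and its parameters are hypothetical, and the paths in the usage comment are placeholders.

```powershell
# Hypothetical helper: runs two script blocks and reports how long each took.
function Compare-CopyDuration {
    param(
        [Parameter(Mandatory)] [scriptblock] $Individual,
        [Parameter(Mandatory)] [scriptblock] $Archived
    )

    # Measure-Command returns a TimeSpan and discards the block's output.
    [pscustomobject]@{
        Individual = Measure-Command -Expression $Individual
        Archived   = Measure-Command -Expression $Archived
    }
}

# Example usage, assuming an open PSSession in $session, an existing
# D:\Incoming folder on the server, and placeholder paths:
#
# Compare-CopyDuration -Individual {
#     Copy-Item -Path 'C:\Data' -Destination 'D:\Incoming\plain' -ToSession $session -Recurse
# } -Archived {
#     Compress-Archive -Path 'C:\Data\*' -DestinationPath 'C:\Temp\data.zip' -Force
#     Copy-Item -Path 'C:\Temp\data.zip' -Destination 'D:\Incoming\data.zip' -ToSession $session
#     Invoke-Command -Session $session -ScriptBlock {
#         Expand-Archive -LiteralPath 'D:\Incoming\data.zip' -DestinationPath 'D:\Incoming\archived'
#     }
# }
```

Because the archived block includes the compress and extract steps, the reported time covers the full overhead of that approach and not just the network transfer.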

Code

Reference