Amy Whitehead's Research

the ecological musings of a conservation biologist

Copying files with R


Following on from my recent experience with deleting files using R, I found myself needing to copy a large number of raster files from a folder on my computer to a USB drive so that I could post them to a colleague (yes, snail mail – how old and antiquated!).  While this is not typically a difficult task to do manually, I didn’t want to copy all of the files within the folder and there was no way to sort the folder in a sensible manner that meant I could select out the files that I wanted without individually clicking on all 723 files (out of ~4,300) and copying them over.  Not only would this have been incredibly tedious(!), it’s highly likely that I would have made a mistake and missed something important or copied over files that they didn’t need. So enter my foray into copying files using R.

R has a nice set of file manipulation commands in the base package that make it really easy to find out if a file exists (file.exists), rename files (file.rename), copy them to a different directory (file.copy) or delete them (file.remove).  Basically, you point R at the directory where your files live, identify the files that you want to manipulate and then tell it what you want to do to them.  In my case, I wanted to identify all GeoTIFF-formatted rasters whose filenames ended in “SDM” and copy them to a new directory.
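As a rough illustration of those four commands, here is a minimal sketch that works entirely inside a temporary folder, so nothing real gets touched. The folder and file names are made up for the demo. Note that the base R function for deleting is file.remove() (unlink() also works).

```r
# work in a throwaway folder (hypothetical names, created just for this demo)
demo.folder <- tempfile("file_demo_")
dir.create(demo.folder)

old.name  <- file.path(demo.folder, "old_name.txt")
new.name  <- file.path(demo.folder, "new_name.txt")
copy.name <- file.path(demo.folder, "copy.txt")

# create a small file to play with
writeLines("example", old.name)

file.exists(old.name)            # does the file exist?
file.rename(old.name, new.name)  # rename it
file.copy(new.name, copy.name)   # copy it under a new name
file.remove(new.name)            # delete the renamed original

# only the copy should be left in the folder
list.files(demo.folder)
```

Each of these functions returns TRUE or FALSE for every file it was given, which makes it easy to check afterwards whether anything went wrong.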

# identify the folders
current.folder <- "C:/Where my files currently live"
new.folder <- "H:/Where I want my files to be copied to"

# find the files that you want
list.of.files <- list.files(current.folder, pattern = "SDM\\.tif$", full.names = TRUE)

# copy the files to the new folder
file.copy(list.of.files, new.folder)

This will chug away for a bit (time for a coffee, anyone?) and then produce a vector of TRUE/FALSE the same length as your list.of.files that identifies whether it was able to copy them or not.  Pretty simple really.  The only tricky bit can be getting the regex pattern right to pull out the files you want to manipulate. There are many regex guides online – I often head over to Rubular to test a pattern if I’m having trouble (note that it’s actually designed for Ruby and not R but they seem similar enough that it has always worked so far).
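If you want to see that TRUE/FALSE vector in action, the sketch below repeats the workflow on a few empty dummy files in temporary folders. The file names are invented for the demo. It also shows one way to pull out the files that failed to copy, assuming the default overwrite = FALSE, under which file.copy() returns FALSE for any file that already exists at the destination.

```r
# temporary source and target folders (hypothetical, for illustration only)
current.folder <- tempfile("source_")
new.folder <- tempfile("target_")
dir.create(current.folder)
dir.create(new.folder)

# three empty dummy files; only two end in "SDM.tif"
file.create(file.path(current.folder,
                      c("speciesA_SDM.tif", "speciesB_SDM.tif", "speciesA_raw.tif")))

# find the files and check the selection before copying anything
list.of.files <- list.files(current.folder, pattern = "SDM\\.tif$", full.names = TRUE)
print(basename(list.of.files))

# copy, keeping the logical result so that failures can be inspected
copied <- file.copy(list.of.files, new.folder)
failed <- list.of.files[!copied]   # empty if everything was copied

# a second attempt returns FALSE for each file, because the files
# already exist in new.folder and overwrite defaults to FALSE
copied.again <- file.copy(list.of.files, new.folder)
```

Keeping the result in an object like copied (rather than just letting it print) means you can list the failures with list.of.files[!copied] and deal with only those.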

9 thoughts on “Copying files with R”

  1. A minor point, but in regex, a period (“.”) matches any character, not just the period itself. In your case, it typically wouldn’t be a problem, but for a literal period, you need to escape the “.” with a double backslash “SDM\\.tif$”, or by specifying fixed=TRUE. 🙂

    • Good point, John! That’s why I always check the list of files before I commit to copying (or deleting!) them to make sure I haven’t inadvertently done something stupid with my regex!
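To make the difference concrete, here is a small sketch using grepl() on three made-up file names. One caveat worth flagging: fixed = TRUE is an argument of grepl() and its relatives rather than of list.files(), and a fixed pattern treats "$" literally too, so the end-of-name anchor has to be dropped.

```r
# hypothetical file names chosen to show the difference
files <- c("speciesA_SDM.tif", "speciesA_SDMxtif", "speciesA_SDM.tiff")

# unescaped: "." matches any character, so "SDMxtif" also matches
unescaped <- grepl("SDM.tif$", files)    # TRUE TRUE FALSE

# escaped: "\\." matches only a literal period
escaped <- grepl("SDM\\.tif$", files)    # TRUE FALSE FALSE

# fixed = TRUE treats the whole pattern literally, anywhere in the name,
# so the ".tiff" file matches as well
literal <- grepl("SDM.tif", files, fixed = TRUE)   # TRUE FALSE TRUE
```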

  2. You can also cheat and use glob2rx() to translate file-system wildcarding syntax to regular expressions. Also note that you can recurse through subdirectories in list.files, and retain or ignore subdirectories on output with file.copy.

    Many protected areas and field stations have minimal bandwidth. I’m with US NPS and have a couple of large USB sticks just for mailing back & forth. Alas, inconsistent file naming conventions make targeting files difficult, but that’s where the power of regular expressions shines through!

    • Thanks for your comments, Tom. I haven’t looked into glob2rx() but that sounds like it could be helpful – I’ll have to investigate further as I don’t always use sensible naming systems for my files! And I understand your pain with minimal bandwidth – definitely a problem that I come across often.
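A minimal sketch of the glob2rx() idea, combined with a recursive search through subfolders, might look like the following. The folder layout and file names are invented for the demo, and the wildcard "*SDM.tif" is only an assumption about what you might want to match.

```r
# translate a file-system wildcard into a regular expression
pattern <- glob2rx("*SDM.tif")

# temporary folder with one subfolder (hypothetical layout)
source.folder <- tempfile("glob_demo_")
dir.create(file.path(source.folder, "subfolder"), recursive = TRUE)
file.create(file.path(source.folder,
                      c("a_SDM.tif", "a_raw.tif", "subfolder/b_SDM.tif")))

# recursive = TRUE also searches subfolders; the pattern is matched
# against the file names, not the folder part of the path
found <- list.files(source.folder, pattern = pattern,
                    recursive = TRUE, full.names = TRUE)
print(basename(found))
```

Printing pattern shows the regular expression that glob2rx() generated, which can be a handy way to learn the regex equivalent of a wildcard you already know.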

  3. I think you may want to get full file paths (with full.names=TRUE under the list.files()) otherwise you will be copying from paths relative to your $PWD. If your files are in your $PWD it’s fine, but if not, it won’t find them. Just an idea! :O)

    • Thanks Vanessa and good point! When I actually ran this code on my files, I used paste0 to stick the current.folder in front of the file names but full.names=TRUE would be more sensible. I’ll amend the code in the post to add this in.

  4. Why didn’t you just use the search string “SDM.tif” in Windows explorer? Maybe I’ve missed some aspect of this challenge.

    • Hi Chris,
      While that would work, it is MUCH faster to do it using R (or the command line). It’s also useful if you want to rename/delete/move/copy a large number of files at once, particularly if this is part of an analytical process.

  5. Hi Amy
    I have a csv with all the paths (and names) of the images I want to copy, but those come from different folders. If I run this, it copies everything into the same folder and only the first image with each name (the image names are repeated).
    thanks for the post
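One possible way to handle this case is to keep each file's parent folder name in the destination, so that files with the same name no longer collide. The sketch below is only an illustration under some assumptions: the CSV has a single column called path (a made-up name), and the immediate parent folder is enough to tell the duplicates apart. It builds its own small demo in a temporary folder.

```r
# demo setup: two folders that each contain a file called "image.tif"
root <- tempfile("csv_demo_")
dir.create(file.path(root, "site1"), recursive = TRUE)
dir.create(file.path(root, "site2"))
paths <- file.path(root, c("site1", "site2"), "image.tif")
file.create(paths)

# hypothetical CSV with one column, "path", holding the full file paths
csv.file <- file.path(root, "files_to_copy.csv")
write.csv(data.frame(path = paths), csv.file, row.names = FALSE)

# read the paths back in
files <- read.csv(csv.file, stringsAsFactors = FALSE)$path

# build one destination per file, keeping the parent folder name
new.folder <- file.path(root, "copied")
destination <- file.path(new.folder, basename(dirname(files)), basename(files))

# create the destination subfolders, then copy file by file
for (d in unique(dirname(destination))) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}
copied <- file.copy(files, destination)
```

Because file.copy() is given a destination path for every file rather than a single folder, both copies of image.tif survive, one under site1 and one under site2.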
