Amy Whitehead's Research

the ecological musings of a conservation biologist

Converting shapefiles to rasters in R

22 Comments

I’ve been doing a lot of analyses recently that need rasters representing features in the landscape. In most cases, these data have been supplied as shapefiles, so I needed to quickly extract parts of a shapefile dataset and convert them to a raster in a standardised format. Preferably with as little repetitive coding as possible. So I created a simple and relatively flexible function to do the job for me.

The function requires two main input files: the shapefile (shp) that you want to convert and a raster that represents the background area (mask.raster), with your desired extent and resolution. The value of the background raster should be set to a constant value that will represent the absence of the data in the shapefile (I typically use zero).

The function steps through the following:

  1. Optional: If shp is not in the same projection as the mask.raster, set the current projection (proj.from) and then transform the shapefile to the new projection (proj.to) using transform=TRUE.
  2. Convert shp to a raster based on the specifications of mask.raster (i.e. same extent and resolution).
  3. Set the value of the cells of the raster that represent the polygon to the desired value.
  4. Merge the raster with mask.raster, so that the background values are equal to the value of mask.raster.
  5. Export as a tiff file in the working directory with the label specified in the function call.
  6. If desired, plot the new raster using map=TRUE.
  7. Return as an object in the global R environment.

The function is relatively quick, although is somewhat dependant on how complicated your shapefile is. The more individual polygons that need to filtered through and extracted, the longer it will take.

shp2raster <- function(shp, mask.raster, label, value, transform = FALSE, proj.from = NA,
    proj.to = NA, map = TRUE) {
    require(raster, rgdal)

    # use transform==TRUE if the polygon is not in the same coordinate system as
    # the output raster, setting proj.from & proj.to to the appropriate
    # projections
    if (transform == TRUE) {
        proj4string(shp) <- proj.from
        shp <- spTransform(shp, proj.to)
    }

    # convert the shapefile to a raster based on a standardised background
    # raster
    r <- rasterize(shp, mask.raster)
    # set the cells associated with the shapfile to the specified value
    r[!is.na(r)] <- value
    # merge the new raster with the mask raster and export to the working
    # directory as a tif file
    r <- mask(merge(r, mask.raster), mask.raster, filename = label, format = "GTiff",
        overwrite = T)

    # plot map of new raster
    if (map == TRUE) {
        plot(r, main = label, axes = F, box = F)
    }

    names(r) <- label
    return(r)
}

Below is a trivial example based on some readily available data in the maptools and biomod2 packages. Here I load a raster and a shapefile that represent our background of interest and foreground feature, respectively, and then plot them.

library(maptools)
library(raster)

## example: import world raster from package biomod2 and set the background
## values to zero
worldRaster <- raster(system.file("external/bioclim/current/bio3.grd", package = "biomod2"))
worldRaster[!is.na(worldRaster)] <- 0
plot(worldRaster, axes = F, box = F, legend = F, main = "The world")

# import world polygon shapefile from package maptools
data(wrld_simpl, package = "maptools")
plot(wrld_simpl, add = T)


The World

One of the nice things about working with shapefiles in R is that you can subset the data based on attribute data the same way that you would any dataframe. This is really useful when combined with the shp2raster function as it means that we only need to convert the parts of the shapefile that we are actually interested in. For example, you may wish to create separate rasters for different landuse types that are contained in one shapefile as polygons with different attributes. You can see this in the trivial example below where I create two rasters from our world polygon data that select specific countries to convert based on the attribute NAME in the wrld_simpl shapefile.

# extract all Australian polygons and convert to a world raster where cells
# associated with Australia have a value of 1 and everything else has a
# value of 0.
australia <- shp2raster(shp = wrld_simpl[grepl(c("Australia";), wrld_simpl$NAME), ],
    mask.raster = worldRaster, label = "Where Amy currently lives", transform = FALSE, value = 1)

## Found 1 region(s) and 97 polygon(s)

Australia

# extract Australia, NZ & USA and convert to a world raster where cells
# associated with these countries have a value of 3 and everything
# else has a value of 0.
aus.nz.us <- shp2raster(shp = wrld_simpl[grepl(c("Australia|New Zealand|United States"),
    wrld_simpl$NAME), ], mask.raster = worldRaster, label = "All countries Amy has lived in",
    transform = FALSE, value = 3)

## Found 5 region(s) and 384 polygon(s)

NZ, USA & Australia

Clearly these are quite trivial examples, where the raster and polygon layers don’t match very well (I mean, what is that blob of four pixels that represents my home country?!). But you hopefully get the general idea. In this case, I haven’t needed to do any transformation of the projections.

Below is an example where I was interested in creating a layer that represented protected areas within the Hunter Valley, Australia, for a conservation planning exercise. The available shapefile represented four different reserve types (National Parks, Nature Reserves, Regional parks and State Conservation Areas). In this case, the initial mask and polygon layers were in different projections, so I converted the shapefile to the correct projection by using transform=TRUE and proj.from and proj.to in the shp2raster call. I also wanted to extract a subset of the reserve types contained within the shapefile that related to National Parks (NP) and Nature Reserves (NR). The protected areas of interest needed to have a value of 3, while the background raster has a value of 0. Unfortunately I can’t supply the original layers but a real example of the code in action and the output map are below.

# set relevant projections
GDA94.56 <- CRS("+proj=utm +zone=56 +south +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")
GDA94 <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")

LH.mask <- raster("LH.mask.tif")
# set the background cells in the raster to 0
LH.mask[!is.na(LH.mask)] <- 0

NPWS.reserves <- readShapePoly("NPWSReserves.shp", proj4 = GDA94)

# convert the NPWS.reserves polygon data for National Parks & Nature
# Reserves to a raster, after changing the projection.
NPWS.raster <- shp2raster(shp = NPWS.reserves[grepl(c("NP|NR"), NPWS.reserves$reservetyp),],
    mask.raster = LH.mask, label = "National Parks & Nature Reserves", value = 3,
    transform = TRUE, proj.from = GDA94, proj.to = GDA94.56)

## Found 111 region(s) and 837 polygon(s)

Hunter Valley

The function should work with both polygon and polyline data. You can also use point data if you buffer it by a small amount first to create tiny polygons around the points. I’d pick a buffer size that makes sense relative to the resolution of the mask.raster.

Go forth and convert! Feedback on improvements definitely welcome – please leave a comment.

22 thoughts on “Converting shapefiles to rasters in R

  1. Pingback: Converting shapefiles to rasters in R | R for J...

  2. Hello Amy,

    Haven’t tested this, but couldn’t the rasterizing and writing lines of your function be simplified to:

    out.r <- rasterize(shp, mask.raster, field=value, filename=label)

    which will set all cells inside polygons to the specified value and write the raster to a file with the format inferred from the file extension.

    • Thanks for the suggestion, Michael. In my case, I wanted to make sure that my final raster has two values – a value inside the shapefile and a background value that represents the area in my original raster that is outside the shapefile. This is why the function uses the mask function to set the background values. Your suggestion produces a raster that only represents the data inside the shapefile. However, I could have nested the rasterize function inside the mask function to simplfy things a bit.

      • Could you just use the background arg of the rasterize function ?

        rasterize(shp, mask.raster, field=value, background=bkg.value, filename=label)

      • That’s a very good point, Michael – I hadn’t thought of doing that. That’s what happens I guess when you don’t thoroughly read the documentation that goes with existing functions! Thanks for the suggestion.

  3. Thanks Amy, this is great! I work with vector data in various datums (GDA94, AGD66) and projections (Geodetic, MGA 54/55, Australian Map Grid 55, Victoria Grid 1994) and I’m trying to make rasters of the same datum/projection. I thought the following might help anyone trying to adapt the code to fit region data projections etc.

    To find out what code to use for I should use for my input datum/projection I looked up the EPSG codes here: http://ttmaps.free.fr/en/projections.html

    Then I searched that code here: http://www.spatialreference.org/

    And this gave me the Proj4js format code to put in Amy’s code.

    For example, for my parochial DEM on

    EPSG: 3111 Datum: GDA94 Projection: VICTORIA GRID 1994

    Proj4js.defs[“EPSG:3111”] = “+proj=lcc +lat_1=-36 +lat_2=-38 +lat_0=-37 +lon_0=145 +x_0=2500000 +y_0=2500000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs”;

  4. Thanks a lot, your post was nice and clear.
    Incidentally, I am doing hons at the University of Canterbury, and I was in a first-year lecture that you took for Angus!
    I have been working on a community ecology project for one of my papers (with rasters of all NZ’s terrestrial birds), and after all the struggles of getting used to GIS in R rather than ArcMap, I have found it a lot better to work with. Good posts like this make life a lot easier, so cheers and I will keep checking your blog for other helpful tips!

    • Thanks David – glad to hear that you found it useful. I have definitely switched most of my geoprocessing work over to R as it seems an easier way to go (once you figure out how the code works!). It certainly makes it much simpler when you need to repeat analyses. Glad to hear that you survived my lecturing and are now doing honours! All the best for your project – it sounds really interesting.

  5. Hello Amy, I have a question about converting a shapefile into a raster. Thank you in advance!

    I have a shapefile (sp dataframe) which contains a contribute (say, the population). Now, I want to convert this sp dataframe into a raster file, and the value of each cell should be equal to a corresponding population (this can be calculated by weighted sum of population from each cells which intersect with the polygons)

    I think maybe rasterize does not work here. Do you have any suggestion about this problem?

    • Hi Xijia,
      If I understand the question correctly (& please correct me if I’m wrong!), this should be possible with rasterize and then some post-processing of your new raster layer. My understanding is that you would like the value in a given cell to be equal to the value of the intersecting polygon, divided by the total number of cells that intersect the polygon(?). Assuming that each of your polygons has a different initial value (i.e. population size), then this should be relatively simple. First, rasterize your initial polygon using rasterize or my shp2raster function. Then, within your new raster, identify each of the unique cell values (i.e. population sizes). For set of unique cells, work out how many cells there are and then divide the value of each cell by the number of cells. The code might look something like that below.

      unique.values <- unique(myNewRaster)

      for(i in 1:length(unique.values)){
      myNewRaster[myNewRaster==unique.values[i]] <- myNewRaster[myNewRaster==unique.values[i]]/length(myNewRaster[myNewRaster==unique.values[i]])
      }

      Hope that helps. But, as I said, I may have misinterpreted your question, so let me know if that wasn’t what you were after.

  6. Hi Amy, as I was able to adapt your code to my files and get it running (I am not really good with R) but after some time of calculation I got this error message in R: Error in par(plt = bigplot) : invalid value specified for graphical parameter “plt” . I am reading about it but I have no clue what I can do to fix this… Do you have any idea what is the problem? Thank you in advance!

    • Hi Thales, sorry for the slow reply – I’m currently cycle-touring in the Candian Rockies and have limited internet. I suspect that the problem is something to do with the line of code that plots the output so that you can check that it’s converting the right part of the shapefile. I’d try commenting that out and see if that fixes the problem. Good luck, Amy

  7. If you don’t want to do any masking or whatever, here’s an automated conversion:
    https://github.com/brry/misc/blob/master/shp2raster.R

    • Thanks Berry – that’s really useful. I was just wondering how to implement a solution when I didn’t have an existing raster for smooshing my shapefile into.

  8. Hi Amy,
    Is it possible to choose the size of the raster cells?
    Thank you in advance
    Marta

    • Hi Marta,
      That’s a really great question. In this function the cell size is set by resolution of the mask raster. You can alter the cell size of the mask raster by using aggregate (if you want a coarser resolution) or resample if you want a finer resolution. Another option is to check out Berry’s comment where he has a function that allows you to set the cell size as one of the parameters (https://github.com/brry/misc/blob/master/shp2raster.R).
      Good luck!

      • Hi Amy,

        I used your first and the second suggestion, to alter the cell size with success.
        Thank you for your answer.

        Marta

  9. Hi Amy,

    I am trying to use this function to rasterize a road shapefile. It all seems to work but I am getting a final raster in which the polylines for the roads are not completely represented and it’s more of a patchy network. Any ideas what could be causing this?

    Thanks,
    Ale

    • Hi Ale,
      Sometimes it doesn’t work very well with line features. Try buffering your roads by a width that corresponds with your cell resolution and then try again. From memory this worked for me.
      Good luck!
      Amy

      • Hi Amy!

        I have no idea what happened because when I plotted the raster in R it showed as incomplete but when I brought it into my modelling environment it seemed to be fine. So all good! Thank you for the help 🙂

      • I hate it when that sort of stuff happens! Glad it worked out in the end.

Leave a reply to Amy Whitehead Cancel reply