Since I mentioned kernels last week, I figured this would be a good time to go over the geographically weighted regression (GWR) procedure in SAGA. Once again, I don’t have good example data I can post online for this topic, so you’ll have to forgive the lack of pictures.
GWR is an interesting method with lots of applications. It differs from global regression techniques because it allows the relationships between the predictors and the outcome to vary spatially. For some research questions, it is helpful to use GWR to see where certain predictor variables are more or less important for predicting an outcome. GWR is also flexible enough that you can analyze your data, and see error estimates, at many scales.
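If it helps to see the idea outside of SAGA, here is a minimal sketch of what "letting the relationship vary spatially" means: a weighted least squares line is fit at each target location, with nearby samples counting more than distant ones. Everything here is an assumption for illustration (the function names, the Gaussian weighting, the single predictor, and the made-up sample values); it is not SAGA's implementation.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Nearby samples get weights near 1; distant samples get weights near 0."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(target, samples, bandwidth):
    """Weighted least squares fit of outcome = intercept + slope * predictor,
    centered on `target`. Each sample is (easting, northing, predictor, outcome)."""
    tx, ty = target
    sw = swx = swy = swxx = swxy = 0.0
    for px, py, x, y in samples:
        w = gaussian_weight(math.hypot(px - tx, py - ty), bandwidth)
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    denom = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept, slope

# Made-up data: in the western cluster the outcome is 2x the predictor,
# in the eastern cluster it is 5x the predictor.
samples = [
    (0.0, 0.0, 1.0, 2.0), (1.0, 0.0, 2.0, 4.0),
    (0.0, 1.0, 3.0, 6.0), (1.0, 1.0, 4.0, 8.0),
    (100.0, 0.0, 1.0, 5.0), (101.0, 0.0, 2.0, 10.0),
    (100.0, 1.0, 3.0, 15.0), (101.0, 1.0, 4.0, 20.0),
]

west_intercept, west_slope = local_fit((0.5, 0.5), samples, bandwidth=10.0)
east_intercept, east_slope = local_fit((100.5, 0.5), samples, bandwidth=10.0)
print(round(west_slope, 3), round(east_slope, 3))
```

A single global regression would blend the two clusters into one slope; the local fits recover a different slope on each side of the map, which is the behavior GWR is built to expose.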
Today we’re using GWR for interpolation. If you haven’t used SAGA before, I recommend starting with this SAGA Basics post. You’ll need to know how to open raster data and point data.
In today’s hypothetical interpolation example, let’s say you’re trying to get a map of soil lead concentration surrounding the site of an old factory. You have soil test results with lead levels stored as point data, and raster layers of historic rainfall and elevation. After opening your data sources, you need to open the GWR pop up menu by going to Geoprocessing > Spatial and Geostatistics > GWR. Since we have two different raster layers (rainfall and elevation), you’ll want to use GWR for multiple grid predictors.
Once you’re in the GWR window, it looks pretty similar to other procedures in SAGA. Select your grid system and your data layers. Be careful to select not just the right point file, but also the correct attribute within this point data. I also like to create a layer of residual values at the study spot– this is useful if you want to calculate RMSE or other global error metrics. If you want to see how the parameters for each variable change across your study space, click the checkbox next to “Output of Regression Parameters.”
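As a quick illustration of the RMSE step, here is a small sketch that assumes you have already exported the residual values from SAGA into a plain list of numbers. The `rmse` helper and the residual values are hypothetical, not something SAGA produces for you.

```python
import math

def rmse(residuals):
    """Root mean squared error: square each residual, average, take the root."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Made-up residuals (observed minus predicted lead level at each sample point).
example_residuals = [2.0, -2.0, 2.0, -2.0]
example_rmse = rmse(example_residuals)
```

Because the residuals are squared before averaging, over- and under-predictions do not cancel each other out, and large misses count more heavily than small ones.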
From here you’re into the kernel bit! The weighting function has a variety of options like we talked about in this post about kernels. SAGA will suggest a bandwidth based on the grid system size and resolution, but you can also input your own. The search range is a cutoff that helps determine how far the algorithm will look for points to include if you use a variable bandwidth. I’ve always used global searches, but I bet a limited search range would be useful for very large maps.
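To make the bandwidth and search range a little more concrete, here is a rough sketch of how distance-based weights behave. The two kernels below are common textbook forms, and the `search_radius` cutoff is my simplified stand-in for a limited search; the names and numbers are assumptions for illustration, and SAGA's own formulas and defaults may differ.

```python
import math

def gaussian_kernel(distance, bandwidth):
    """Weight decays smoothly with distance and never quite reaches zero."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def bisquare_kernel(distance, bandwidth):
    """Weight falls to exactly zero once distance reaches the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2

def point_weights(distances, bandwidth, kernel, search_radius=None):
    """Weight each sample point by its distance from the target cell.
    Points farther than search_radius are ignored (weight 0);
    search_radius=None stands in for a global search."""
    weights = []
    for d in distances:
        if search_radius is not None and d > search_radius:
            weights.append(0.0)
        else:
            weights.append(kernel(d, bandwidth))
    return weights

# Made-up distances, in the same units as the projection (say, meters).
distances_m = [0.0, 50.0, 100.0, 400.0]
global_weights = point_weights(distances_m, 100.0, gaussian_kernel)
local_weights = point_weights(distances_m, 100.0, gaussian_kernel,
                              search_radius=200.0)
```

With a global search, the point 400 units away still gets a tiny nonzero Gaussian weight; with the 200-unit search radius it is dropped entirely, which is where the time savings on a big map would come from.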
From here you can run the GWR procedure, output your data as a TIFF if you so choose (directions buried in here), and look at your pretty map in SAGA. Thanks for reading, and good luck map making!
5 thoughts on “Geographically Weighted Regression in SAGA”
What is the unit of the bandwidth? And how exactly is the bandwidth calculated?
Also, I try to run GWR Gaussian for multiple predictor grids, but apparently the bandwidth changes depending on the Points (vector database) not based on the grid system size. Any suggestion?
Thanks a lot!
Hi Nicola– apologies for taking a while to get back to you. It looks like in SAGA 2.2.0 the default bandwidth is 1 (http://www.saga-gis.org/saga_tool_doc/2.2.0/statistics_regression_7.html). I’m having trouble finding documentation on the process for newer versions of SAGA GIS. When I have run this tool before, the units seemed to be the units from the projection used in my grid system and point layer. I am not sure what happens if you run this tool with unprojected data or with point and grid data that have different projections from each other. If you try running this tool in the SAGA plugin for QGIS instead of in the stand alone SAGA app, the units may be easier to figure out since the error messages tend to be easier to interpret. Sorry I can’t be more help!
Thanks a lot! It was useful.
Hi, this is very useful information. I was wondering how we can interpret the results of GWR. Is there any resource on that too?
Thanks for reaching out– I do not think I’ve blogged on that topic before, and will definitely have to write about my interpretation process in the future! In the meantime, I would encourage you to look at the ArcGIS website to learn about the different metrics that are a part of GWR output (link: https://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-statistics-toolbox/interpreting-gwr-results.htm). There’s also a really great paper from the Journal of Geographical Systems (“Multicollinearity and correlation among local regression coefficients in geographically weighted regression” by Wheeler and Tiefelsdorf, 2005– DOI: 10.1007/s10109-005-0155-6) that includes a case study with very detailed interpretation descriptions– if you have access to that journal, I highly recommend reading it.