In agronomy research, it’s very common to collect 10 plants from a single plot, take measurements on them, and average those measurements. The plot average is the value we run statistics on, whether that means building a regression or running an ANOVA.
I’ve recently seen a lot of my fellow grad students averaging measurements by hand or in Excel. This works, but it takes a long time and leaves lots of room for human error. I know I promised you more posts about SAGA and writing, but watching this process is very painful, so I’m channeling my energy into writing this blog post. Assuming I don’t get distracted by other exciting or frustration-inducing things again, I’ll get back on topic next post.
To get started, here is the short, silly example spreadsheet I will be using today. I saved it as a CSV file, mostly out of habit but also because CSV files work well in SAGA, QGIS, and R. I also don’t have spaces or punctuation in my column/variable names, which helps prevent confusion later on.
I made an R project file for this example. This is not a necessary step, but if you’re not familiar with project files, you may want to learn more here.
I will attach my full script at the end, and include screenshots throughout. Let’s pretend I wanted to know the average height of my plants by rep. The first 10 lines just get the data from the CSV file into R. The 11th line gives you the mean for each rep, and you can see that output in the console panel at the bottom. Learn more about the by function in the R Documentation here.
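For anyone reading without the screenshots, here is a minimal sketch of the same idea. The column names (Rep, Plot, Height) and the values are invented stand-ins, not the contents of the example spreadsheet, and the data frame is built inline so the snippet runs without a file.

```r
# Hypothetical stand-in for the example spreadsheet.
# With a real file you would use something like:
#   plants <- read.csv("plants.csv")   # file name is an assumption
plants <- data.frame(
  Rep    = rep(c(1, 2), each = 4),
  Plot   = rep(c(101, 102, 201, 202), each = 2),
  Height = c(10, 12, 14, 16, 20, 22, 24, 26)
)

# Mean height for each rep; the result prints in the console
rep_means <- by(plants$Height, plants$Rep, mean)
print(rep_means)
```

The first argument to by is the data to summarize, the second is the grouping variable, and the third is the function to apply to each group.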
I use the by statement when I just need to get a quick look at my data. You can also make a bar chart of the means by rep by saving the output of the by statement to a variable.
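A rough sketch of that bar chart, again using invented column names and values rather than the real spreadsheet:

```r
# Hypothetical stand-in data (names and values are assumptions)
plants <- data.frame(
  Rep    = rep(c(1, 2), each = 4),
  Height = c(10, 12, 14, 16, 20, 22, 24, 26)
)

# Save the output of by to a variable...
rep_means <- by(plants$Height, plants$Rep, mean)

# ...then hand it to barplot; the bars are labeled with the rep numbers
bar_mids <- barplot(rep_means,
                    xlab = "Rep",
                    ylab = "Mean height",
                    main = "Mean plant height by rep")
```

barplot also returns the x positions of the bars (saved here as bar_mids), which can be handy if you want to add error bars or text later.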
When I’m averaging data by plot, I usually use an aggregate statement. You can average multiple columns at once, save the result to a data frame, and then write that data frame to a file. You can see this in line 15, where I use cbind to group all the variables I want to average and aggregate to say how to group the data. Line 18 is an example of how to write a CSV file in R.
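Here is a minimal sketch of that pattern using the formula form of aggregate. The columns (Plot, Rep, Height, LeafCount), the values, and the output file name are all made up for illustration, so this is not a copy of the script in the screenshots.

```r
# Hypothetical stand-in data: two plants per plot, two plots per rep
plants <- data.frame(
  Rep       = rep(c(1, 2), each = 4),
  Plot      = rep(c(101, 102, 201, 202), each = 2),
  Height    = c(10, 12, 14, 16, 20, 22, 24, 26),
  LeafCount = c(4, 6, 5, 7, 8, 10, 9, 11)
)

# cbind() lists the columns to average; the right side of ~ says
# how to group the rows (one output row per Plot within Rep)
plot_means <- aggregate(cbind(Height, LeafCount) ~ Plot + Rep,
                        data = plants, FUN = mean)
print(plot_means)

# Write the averaged data to a CSV file. A temporary folder is used
# here so the sketch does not clutter your working directory.
out_file <- file.path(tempdir(), "plot_means.csv")
write.csv(plot_means, out_file, row.names = FALSE)
```

Setting row.names = FALSE keeps R from adding an extra column of row numbers to the file, which is usually what you want if the CSV is headed to another program.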
Learn more about aggregate here, learn more about write.csv here, and a copy of the whole script is available here. These functions work for many applications beyond averaging plot data, and there are also many other ways to do precisely this thing. This is a brief example simply meant to introduce an efficient solution to a common problem. Please don’t keep averaging things by hand; there are a bunch of better ways.