R data types: data.frames
I've been harping on about R programming basics - but a solid foundation is essential as you progress in a language. Review clears up misconceptions you may have picked up along the way. Let's spend a moment reviewing data.frames...
Data frames are like spreadsheets
Vectors, matrices, and arrays can only contain one data type: numeric, character, logical, and so on. An R list and a data.frame may contain multiple types. Lists keep the relationships between data sets; data.frames present that information in a row/column format. For example...
> numericVector <- c(1,2,3,4,5,6)
> characterVector <- c("twas","brillig","and","the","slithey","toves")
> monthVector <- month.abb[1:6] # i.e. Jan, Feb, Mar, Apr, May, Jun
> I.am.a.dataframe <- data.frame(numericVector,characterVector,monthVector)
> I.am.a.dataframe
  numericVector characterVector monthVector
1             1            twas         Jan
2             2         brillig         Feb
3             3             and         Mar
4             4             the         Apr
5             5         slithey         May
6             6           toves         Jun
So I.am.a.dataframe shows how the three vectors have been placed in columns in the data.frame. We should look at the structure as well...
> str(I.am.a.dataframe)
'data.frame':   6 obs. of  3 variables:
 $ numericVector  : num  1 2 3 4 5 6
 $ characterVector: chr  "twas" "brillig" "and" "the" ...
 $ monthVector    : chr  "Jan" "Feb" "Mar" "Apr" ...
This shows that numericVector is a column of numbers (num), while characterVector and monthVector are columns of characters (chr). Incidentally, obs. (observations) is another term for rows, and variables is another term for columns.
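Here's a small sketch of the difference between a vector and a data.frame when types are mixed. The names mixedVector and columnTypes are just placeholders I've made up for illustration. I've also passed stringsAsFactors = FALSE explicitly, on the assumption that you might be running a version of R older than 4.0, where character columns were converted to factors by default.

```r
# A vector holds a single type: mixing types coerces everything to one type.
mixedVector <- c(1, "twas", TRUE)
class(mixedVector)   # every element has become a character string

# Rebuild the data.frame from the example above.
numericVector   <- c(1, 2, 3, 4, 5, 6)
characterVector <- c("twas", "brillig", "and", "the", "slithey", "toves")
monthVector     <- month.abb[1:6]

# stringsAsFactors = FALSE is an assumption added here so that older
# versions of R (before 4.0) also keep the text columns as characters.
I.am.a.dataframe <- data.frame(numericVector, characterVector, monthVector,
                               stringsAsFactors = FALSE)

# A data.frame keeps a separate type for each column.
columnTypes <- sapply(I.am.a.dataframe, class)
columnTypes

# obs. are rows, variables are columns.
nrow(I.am.a.dataframe)
ncol(I.am.a.dataframe)
```

Running this, class(mixedVector) reports a single type for the whole vector, while columnTypes reports one type per column. nrow() and ncol() give the same counts that str() labels as obs. and variables.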
Want to know more?
I publish a weekly video series on the R language. Here's a session on data.frames...
By the way...
I write science fiction. If you're a member of Goodreads, you can win a copy (between August 19 and September 17, 2021).
It's told in an unusual way, and it's mostly interesting. I'm a hard scifi fan this worked for me. The author has a good imagination, and puts it to good use here. I look forward to reading more from him. Recommended.
This is great Mark Niemann-Ross! I love this hack in #rstats for making datasets, where you make the columns as a set of vectors, then assemble them together into a dataframe! You can't easily do things like that in SAS; SAS is very picky about how to handle arrays, which are the closest thing to a vector in SAS. I have a student taking a SAS class who does all the analysis in R first to "get the right answer", then redoes them in SAS for the homework! 🤣 Thanks for another great post!