Some love for Base R. Part 4

Following on parts 1, 2 & 3—yes, a series—we arrive to part 4 revisiting Base R. See part 1 for the rationale, in case you’re wondering Whyyyy?

A typical question going back to Base from the tidyverse: How do I join datasets? What do I use instead of bind_rows() and bind_cols()? Easy, rbind() and cbind(), yes, r for rows and c for cols, because base is concise.

By rows

If we have a couple of data frames with the same variables (columns), then using rbind() binds/glues/stitches the data frames one after the other.

example_df1 <- data.frame(record = 1:24,
                          treatment = rep(LETTERS[1:3], each = 8))

example_df2 <- data.frame(record = 25:48,
                          treatment = rep(LETTERS[4:6], each = 8))

example_df3 <- data.frame(record = 49:72)

# This one works
example_bound <- rbind(example_df1, example_df2)

# This one doesn't as they don't have the same variables
example_bound <- rbind(example_df1, example_df3)

# If we redefine the data frame we can join more than two data frames
example_df3 <- data.frame(record = 49:72,
                          treatment = rep(LETTERS[7:9], each = 8))

example_bound <- rbind(example_df1, example_df2, example_df3)

Of course we can use pipes too:

example_df1 |> rbind(example_df2) -> example_bound2

By columns

If we have a couple of data frames with the same number of rows (cases), then using cbind() binds/glues/stitches the data frames side by side.

example_df4 <- data.frame(record = 1:24,
                          treat1 = rep(LETTERS[1:3], each = 8))

example_df5 <- data.frame(treat2 = rep(LETTERS[4:5], 12),
                          meas = rnorm(24))

example_cbound <- cbind(example_df4, example_df5)
example_cbound

   record treat1 treat2       meas
1       1      A      D -2.1158479
2       2      A      E  0.7784022
3       3      A      D -0.0112054
4       4      A      E -0.1986594
...

When you are working with data frames you get pretty much what you’d expect in dplyr. However, if you are not working with data frames but, instead, you’re dealing with vectors you end up with matrices, in which all elements have the same type. Coercing different types may produce unexpected results

# Binding columns
x <- 1:26
y <- sqrt(x)

example_1 <- cbind(x, y)

# What do we get?
is.matrix(example_1)
[1] TRUE

example_1
       x        y
 [1,]  1 1.000000
 [2,]  2 1.414214
 [3,]  3 1.732051
 [4,]  4 2.000000
 ...

# Perhaps unexpected result. Variable x
# was coerced to character
example_2 <- cbind(x, letters)
example_2
      x    letters
 [1,] "1"  "a"    
 [2,] "2"  "b"    
 [3,] "3"  "c"    
 [4,] "4"  "d"  
 ...

By one or more indices

When you have data frames with one or more variables “in common” the function to use is merge(), which may work like left_join() and right_join() in dplyr.

merge(x, y, by =)
# which you can read as
merge(left, right, by = )

Think of x as left and y as right. Using all.x = TRUE extra rows will be added to the output, one for each row in x that has no matching row in y. Using all.y = TRUE extra rows will be added to the output, one for each row in y that has no matching row in x.

As an example, I have two data frames with a tree id (ids) and a derived variable (first tree ring to achieve a technical threshold for microfibril angle and modulus of elasticity). I would like to join them by ids:

head(firstmfa)
    ids assess
1 DM001      3
2 DM002      5
3 DM003      4
4 DM004      6
5 DM005      5
6 DM006      7

head(firstmoe)
    ids ring
1 DM001    8
2 DM002    8
3 DM003    8
4 DM004    8
5 DM005    9
6 DM006   12

# Merging keeping all observations
gendata <- merge(firstmfa, firstmoe, by = 'ids', all = TRUE)

Another example using more than one joining variable. Actual wood density (in kg/m³) and microfibril angle (in degrees) assessments per tree ring, joined by tree code and ring number

> head(densdataT)
    ids ring density
1 DM001    1      NA
2 DM001    2      NA
3 DM001    3  327.96
4 DM001    4  325.37
5 DM001    5  336.59
6 DM001    6  360.82
...

> head(mfadataT)
    ids ring   mfa
1 DM001    1    NA
2 DM001    2    NA
3 DM001    3 31.93
4 DM001    4 31.70
5 DM001    5 33.21
6 DM001    6 27.98

assess <- merge(densdataT, mfadataT, by = c('tree', 'ring'), all = TRUE)

Some love for Base R. Part 4

By rows

By columns

By one or more indices

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112