Join operations.
left_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
right_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
inner_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
full_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
anti_join_dt(x, y, by = NULL)
semi_join_dt(x, y, by = NULL)
Argument | Description |
---|---|
x | A data.frame. |
y | A data.frame. |
by | A character vector of variables to join by. If NULL (the default), the *_join_dt() functions perform a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they are right; to suppress the message, list the join variables explicitly. To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x.a to y.b. |
suffix | A character vector of length 2. If x and y contain non-joined variables with the same name, these suffixes are appended to those names in the output to disambiguate them. |
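
The `by` and `suffix` semantics described above can be illustrated with data.table's own `merge()`. This is a minimal sketch, not the package's implementation: it assumes the `*_join_dt()` functions behave like a `merge()` call with matching arguments, and the tables `workers` and `offices` are hypothetical.

```r
library(data.table)

# Hypothetical tables: both carry a non-key column called "city"
workers <- data.table(name = c("Nick", "John"), city = c("Oslo", "Rome"))
offices <- data.table(worker = c("John", "Cathie"), city = c("Paris", "Lima"))

# A natural join would use every shared column name as a key.
# Here that is "city", which is probably not what is wanted.
common <- intersect(names(workers), names(offices))

# Left join on differently named keys; by.x / by.y play the role of
# by = c("name" = "worker"), and suffixes plays the role of suffix.
res <- merge(workers, offices,
             by.x = "name", by.y = "worker",
             all.x = TRUE, suffixes = c(".x", ".y"))

# The non-joined duplicate column "city" comes back as city.x and city.y
print(names(res))
```

Because the only shared column name is `city`, this also shows why it is worth reading the "Joining by" message before trusting a natural join.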
A data.table.
# use the examples from `maditr` package
library(data.table)
library(tidydt)

workers = fread("
    name company
    Nick Acme
    John Ajax
    Daniela Ajax
")

positions = fread("
    name position
    John designer
    Daniela engineer
    Cathie manager
")

workers %>% inner_join_dt(positions)
#> Joining by: name
#>
#>       name company position
#> 1: Daniela    Ajax engineer
#> 2:    John    Ajax designer

workers %>% left_join_dt(positions)
#> Joining by: name
#>
#>       name company position
#> 1: Daniela    Ajax engineer
#> 2:    John    Ajax designer
#> 3:    Nick    Acme     <NA>

workers %>% right_join_dt(positions)
#> Joining by: name
#>
#>       name company position
#> 1:  Cathie    <NA>  manager
#> 2: Daniela    Ajax engineer
#> 3:    John    Ajax designer

workers %>% full_join_dt(positions)
#> Joining by: name
#>
#>       name company position
#> 1:  Cathie    <NA>  manager
#> 2: Daniela    Ajax engineer
#> 3:    John    Ajax designer
#> 4:    Nick    Acme     <NA>

# filtering joins
workers %>% anti_join_dt(positions)
#>    name company
#> 1: Nick    Acme

workers %>% semi_join_dt(positions)
#>       name company
#> 1: Daniela    Ajax
#> 2:    John    Ajax

# To suppress the message, supply 'by' argument
workers %>% left_join_dt(positions, by = "name")
#>       name company position
#> 1: Daniela    Ajax engineer
#> 2:    John    Ajax designer
#> 3:    Nick    Acme     <NA>

# Use a named 'by' if the join variables have different names
positions2 = setNames(positions, c("worker", "position")) # rename first column in 'positions'
workers %>% inner_join_dt(positions2, by = c("name" = "worker"))
#>       name company position
#> 1: Daniela    Ajax engineer
#> 2:    John    Ajax designer
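
For readers who want to see what the two filtering joins compute, the sketch below expresses them in plain data.table syntax. It is an illustration of the same row filters, not necessarily how `anti_join_dt()` and `semi_join_dt()` are implemented, and the row order of the results may differ from the output shown above.

```r
library(data.table)

workers <- data.table(name = c("Nick", "John", "Daniela"),
                      company = c("Acme", "Ajax", "Ajax"))
positions <- data.table(name = c("John", "Daniela", "Cathie"),
                        position = c("designer", "engineer", "manager"))

# Anti join: rows of workers that have no match in positions
anti <- workers[!positions, on = "name"]

# Semi join: rows of workers that have a match in positions;
# no columns from positions are added
semi <- workers[name %in% positions$name]

print(anti)
print(semi)
```

Both results keep only the columns of `workers`, which is what distinguishes filtering joins from the mutating joins earlier in the examples.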