Join operations, wrapped directly from maditr.
left_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
right_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
inner_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
full_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))
anti_join_dt(x, y, by = NULL)
semi_join_dt(x, y, by = NULL)
Argument | Description |
---|---|
x | data.frame |
y | data.frame |
by | A character vector of variables to join by. If NULL (the default), the join is a natural join that uses all variables with common names across the two tables. A message lists those variables so that you can check they are right; to suppress the message, list the join variables explicitly. To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x.a to y.b. |
suffix | If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2. |
Returns a data.table.
# use the examples from `maditr` package
workers = fread("
    name company
    Nick Acme
    John Ajax
    Daniela Ajax
")
positions = fread("
    name position
    John designer
    Daniela engineer
    Cathie manager
")

workers %>% inner_join_dt(positions)
#>
#>       name company position
#> 1:    John    Ajax designer
#> 2: Daniela    Ajax engineer

workers %>% left_join_dt(positions)
#>
#>       name company position
#> 1:    Nick    Acme     <NA>
#> 2:    John    Ajax designer
#> 3: Daniela    Ajax engineer

workers %>% right_join_dt(positions)
#>
#>       name company position
#> 1:    John    Ajax designer
#> 2: Daniela    Ajax engineer
#> 3:  Cathie    <NA>  manager

workers %>% full_join_dt(positions)
#>
#>       name company position
#> 1:    Nick    Acme     <NA>
#> 2:    John    Ajax designer
#> 3: Daniela    Ajax engineer
#> 4:  Cathie    <NA>  manager

# filtering joins
workers %>% anti_join_dt(positions)
#>
#>    name company
#> 1: Nick    Acme

workers %>% semi_join_dt(positions)
#>
#>       name company
#> 1:    John    Ajax
#> 2: Daniela    Ajax

# To suppress the message, supply 'by' argument
workers %>% left_join_dt(positions, by = "name")
#>       name company position
#> 1:    Nick    Acme     <NA>
#> 2:    John    Ajax designer
#> 3: Daniela    Ajax engineer

# Use a named 'by' if the join variables have different names
positions2 = setNames(positions, c("worker", "position")) # rename first column in 'positions'
workers %>% inner_join_dt(positions2, by = c("name" = "worker"))
#>       name company position
#> 1:    John    Ajax designer
#> 2: Daniela    Ajax engineer
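The examples above do not exercise the `suffix` argument. The following is a minimal sketch of how it might be used, assuming the package providing these functions is `tidyfst` and that `data.table` is installed. The tables `sales_2019` and `sales_2020` and their columns are hypothetical and exist only for illustration.

```r
# Assumption: the *_join_dt functions come from the 'tidyfst' package.
library(tidyfst)
library(data.table)

# Two hypothetical tables that share a key column ('id') and also share
# a non-key column name ('amount').
sales_2019 = data.table(id = c(1L, 2L, 3L), amount = c(10, 20, 30))
sales_2020 = data.table(id = c(2L, 3L, 4L), amount = c(25, 35, 45))

# Join on 'id' only. 'amount' exists in both inputs but is not a join
# variable, so the two suffixes are appended to tell the copies apart.
res = inner_join_dt(sales_2019, sales_2020,
                    by = "id", suffix = c("_2019", "_2020"))

# Inspect the resulting column names; the duplicated 'amount' column
# should appear once per input table, with the matching suffix.
print(names(res))
print(res)
```

Naming `by` explicitly matters here: with `by = NULL` the natural join would also match on `amount`, since that name is common to both tables, and no suffix would be needed.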