Join operations, wrapped directly from maditr.

left_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))

right_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))

inner_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))

full_join_dt(x, y, by = NULL, suffix = c(".x", ".y"))

anti_join_dt(x, y, by = NULL)

semi_join_dt(x, y, by = NULL)

Arguments

x

data.frame

y

data.frame

by

A character vector of variables to join by. If NULL (the default), the *_join_dt() functions perform a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they're right; to suppress it, list the join variables explicitly. To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x.a to y.b.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.
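
The suffix behaviour described above can be sketched as follows. This is a minimal illustration, not taken from the package examples: it calls maditr's dt_left_join() (the function this page says left_join_dt() wraps) so that the snippet does not depend on which package provides the wrapper, and it assumes the wrapper passes suffix through unchanged. The tables workers and offices, their columns, and the suffix strings are invented for illustration.

```r
# Sketch only: uses maditr::dt_left_join(), which this page says the
# *_join_dt() functions wrap. Tables and column names are hypothetical.
library(data.table)
library(maditr)

workers = data.table(name = c("Nick", "John"), city = c("Oslo", "Rome"))
offices = data.table(name = c("John", "Cathie"), city = c("Lima", "Cairo"))

# 'city' appears in both tables but is not a join key, so the two copies
# are told apart by appending the supplied suffixes
res = dt_left_join(workers, offices, by = "name",
                   suffix = c("_worker", "_office"))

# inspect the resulting column names
names(res)
```

Choosing descriptive suffixes instead of the default c(".x", ".y") can make it easier to see which table each duplicated column came from.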

Value

data.table

Examples

# use the examples from `maditr` package
workers = fread("
    name company
    Nick Acme
    John Ajax
    Daniela Ajax
")

positions = fread("
    name position
    John designer
    Daniela engineer
    Cathie manager
")

workers %>% inner_join_dt(positions)
#> dt_inner_join: joining, by = "name"
#>       name company position
#> 1:    John    Ajax designer
#> 2: Daniela    Ajax engineer
workers %>% left_join_dt(positions)
#> dt_left_join: joining, by = "name"
#>       name company position
#> 1:    Nick    Acme     <NA>
#> 2:    John    Ajax designer
#> 3: Daniela    Ajax engineer
workers %>% right_join_dt(positions)
#> dt_right_join: joining, by = "name"
#>       name company position
#> 1:    John    Ajax designer
#> 2: Daniela    Ajax engineer
#> 3:  Cathie    <NA>  manager
workers %>% full_join_dt(positions)
#> dt_full_join: joining, by = "name"
#>       name company position
#> 1:    Nick    Acme     <NA>
#> 2:    John    Ajax designer
#> 3: Daniela    Ajax engineer
#> 4:  Cathie    <NA>  manager
# filtering joins
workers %>% anti_join_dt(positions)
#> dt_anti_join: joining, by = "name"
#>    name company
#> 1: Nick    Acme
workers %>% semi_join_dt(positions)
#> dt_semi_join: joining, by = "name"
#>       name company
#> 1:    John    Ajax
#> 2: Daniela    Ajax
# To suppress the message, supply 'by' argument
workers %>% left_join_dt(positions, by = "name")
#>       name company position
#> 1:    Nick    Acme     <NA>
#> 2:    John    Ajax designer
#> 3: Daniela    Ajax engineer
# Use a named 'by' if the join variables have different names
positions2 = setNames(positions, c("worker", "position")) # rename first column in 'positions'
workers %>% inner_join_dt(positions2, by = c("name" = "worker"))
#>       name company position
#> 1:    John    Ajax designer
#> 2: Daniela    Ajax engineer