Work like the `*_join_dt` series functions, joining tables with common or customized keys in various ways. The only difference is the joining is case insensitive like SQL.
sql_join_dt(x, y, by = NULL, type = "inner", suffix = c(".x", ".y"))
A data.table
A data.table
(Optional) A character vector of variables to join by.
If `NULL`, the default, `*_join_dt()` will perform a natural join, using all variables in common across `x` and `y`. A message lists the variables so that you can check they're correct; suppress the message by supplying `by` explicitly.
To join by different variables on `x` and `y`, use a named vector. For example, `by = c("a" = "b")` will match `x$a` to `y$b`.
To join by multiple variables, use a vector with length > 1. For example, `by = c("a", "b")` will match `x$a` to `y$a` and `x$b` to `y$b`. Use a named vector to match different variables in `x` and `y`. For example, `by = c("a" = "b", "c" = "d")` will match `x$a` to `y$b` and `x$c` to `y$d`.
Notice that in `sql_join`, the joining variables would turn to upper case in the output table.
Which type of join would you like to use? Default uses "inner", other options include "left", "right", "full", "anti", "semi".
If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.
A data.table
dt1 = data.table(x = c("A","b"),y = 1:2)
dt2 = data.table(x = c("a","B"),z = 4:5)
sql_join_dt(dt1,dt2)
#> Key: <x>
#> x y z
#> <char> <int> <int>
#> 1: A 1 4
#> 2: B 2 5