Select only unique/distinct rows from a data frame.

distinct_dt(.data, ..., .keep_all = FALSE, fromLast = FALSE)

Arguments

.data

data.frame

...

Optional variables to use when determining uniqueness. If several rows share the same combination of values, only the first is kept (or the last, when fromLast = TRUE). If omitted, all variables are used.

.keep_all

If TRUE, keep all variables in .data. If a combination of ... is not distinct, the values from the first matching row are kept.

fromLast

Logical. If TRUE, duplicates are identified starting from the last row, so the last occurrence of each combination is kept instead of the first. Defaults to FALSE.
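
The interaction of ..., .keep_all and fromLast can be pictured with base R's duplicated() and unique(). This is only a sketch of the semantics on a made-up data frame (df, g and x are hypothetical names), not the actual implementation of distinct_dt, and it returns a data.frame rather than a data.table.

```r
# Toy data (hypothetical): two groups, each appearing twice.
df <- data.frame(g = c("a", "b", "a", "b"), x = 1:4)

# Roughly .keep_all = TRUE: keep every column of the first row per value of g.
first_rows <- df[!duplicated(df["g"]), ]

# Roughly .keep_all = TRUE, fromLast = TRUE: scan from the end,
# so the last row per value of g is kept instead.
last_rows <- df[!duplicated(df["g"], fromLast = TRUE), ]

# Roughly .keep_all = FALSE: only the columns named in ... are returned.
keys_only <- unique(df["g"])

print(first_rows)
print(last_rows)
print(keys_only)
```

In this sketch the rows that survive keep their original order, which matches the fromLast = TRUE output shown in the examples.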

Value

data.table

Examples

iris %>% distinct_dt()
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 145:          6.7         3.0          5.2         2.3 virginica
#> 146:          6.3         2.5          5.0         1.9 virginica
#> 147:          6.5         3.0          5.2         2.0 virginica
#> 148:          6.2         3.4          5.4         2.3 virginica
#> 149:          5.9         3.0          5.1         1.8 virginica
iris %>% distinct_dt(Species)
#>       Species
#>        <fctr>
#> 1:     setosa
#> 2: versicolor
#> 3:  virginica
iris %>% distinct_dt(Species, .keep_all = TRUE)
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#>           <num>       <num>        <num>       <num>     <fctr>
#> 1:          5.1         3.5          1.4         0.2     setosa
#> 2:          7.0         3.2          4.7         1.4 versicolor
#> 3:          6.3         3.3          6.0         2.5  virginica
mtcars %>% distinct_dt(cyl, vs)
#>      cyl    vs
#>    <num> <num>
#> 1:     6     0
#> 2:     4     1
#> 3:     6     1
#> 4:     8     0
#> 5:     4     0
mtcars %>% distinct_dt(cyl, vs, .keep_all = TRUE)
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear
#>    <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1:  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4
#> 2:  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4
#> 3:  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3
#> 4:  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3
#> 5:  26.0     4 120.3    91  4.43 2.140 16.70     0     1     5
#> 1 variable not shown: [carb <num>]
mtcars %>% distinct_dt(cyl, vs, .keep_all = TRUE, fromLast = TRUE)
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear
#>    <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1:  17.8     6 167.6   123  3.92  3.44  18.9     1     0     4
#> 2:  26.0     4 120.3    91  4.43  2.14  16.7     0     1     5
#> 3:  19.7     6 145.0   175  3.62  2.77  15.5     0     1     5
#> 4:  15.0     8 301.0   335  3.54  3.57  14.6     0     1     5
#> 5:  21.4     4 121.0   109  4.11  2.78  18.6     1     1     4
#> 1 variable not shown: [carb <num>]