Select only unique/distinct rows from a data frame.
distinct_dt(.data, ..., .keep_all = FALSE, fromLast = FALSE)
data.frame
Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row is preserved (or the last row when fromLast = TRUE). If omitted, all variables are used.
If TRUE, keep all variables in data.frame. If a combination of ... is not distinct, this keeps the first row of values.
Logical indicating whether duplication should be considered from the reverse side. Defaults to FALSE.
data.table
library(tidyfst)
iris %>% distinct_dt()
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 145: 6.7 3.0 5.2 2.3 virginica
#> 146: 6.3 2.5 5.0 1.9 virginica
#> 147: 6.5 3.0 5.2 2.0 virginica
#> 148: 6.2 3.4 5.4 2.3 virginica
#> 149: 5.9 3.0 5.1 1.8 virginica
iris %>% distinct_dt(Species)
#> Species
#> <fctr>
#> 1: setosa
#> 2: versicolor
#> 3: virginica
iris %>% distinct_dt(Species, .keep_all = TRUE)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 7.0 3.2 4.7 1.4 versicolor
#> 3: 6.3 3.3 6.0 2.5 virginica
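For comparison, the effect of .keep_all can be sketched in base R. This is only an illustration of the semantics with `unique()` and `duplicated()`, not a description of how distinct_dt is implemented, and it returns plain data.frames rather than a data.table.

```r
# Base R sketch of the .keep_all semantics (illustrative only).

# Comparable to .keep_all = FALSE: only the key column survives.
species_only <- unique(iris["Species"])

# Comparable to .keep_all = TRUE: the first row of each Species
# is kept together with all of its other columns.
first_per_species <- iris[!duplicated(iris$Species), ]

nrow(species_only)
first_per_species$Sepal.Length
```

The three rows selected by the second expression are the same first-occurrence rows shown in the output above.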
mtcars %>% distinct_dt(cyl, vs)
#> cyl vs
#> <num> <num>
#> 1: 6 0
#> 2: 4 1
#> 3: 6 1
#> 4: 8 0
#> 5: 4 0
mtcars %>% distinct_dt(cyl, vs, .keep_all = TRUE)
#> mpg cyl disp hp drat wt qsec vs am gear
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4
#> 2: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4
#> 3: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3
#> 4: 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3
#> 5: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5
#> 1 variable(s) not shown: [carb <num>]
mtcars %>% distinct_dt(cyl, vs, .keep_all = TRUE, fromLast = TRUE)
#> mpg cyl disp hp drat wt qsec vs am gear
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 17.8 6 167.6 123 3.92 3.44 18.9 1 0 4
#> 2: 26.0 4 120.3 91 4.43 2.14 16.7 0 1 5
#> 3: 19.7 6 145.0 175 3.62 2.77 15.5 0 1 5
#> 4: 15.0 8 301.0 335 3.54 3.57 14.6 0 1 5
#> 5: 21.4 4 121.0 109 4.11 2.78 18.6 1 1 4
#> 1 variable(s) not shown: [carb <num>]
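The difference between the last two examples can be made concrete with base R's `duplicated()`, which also takes a `fromLast` argument. The sketch below is an assumption about the semantics, not tidyfst's actual implementation; the names `keys`, `first_rows`, and `last_rows` are introduced here purely for illustration.

```r
# Base R sketch of keep-first versus keep-last (illustrative only).
keys <- mtcars[, c("cyl", "vs")]

# Default direction: keep the first row of each (cyl, vs) combination.
first_rows <- mtcars[!duplicated(keys), ]

# Reverse direction: keep the last row of each combination instead.
# Rows stay in their original order; only the choice of row changes.
last_rows <- mtcars[!duplicated(keys, fromLast = TRUE), ]

rownames(first_rows)
rownames(last_rows)
```

Both results have five rows, one per (cyl, vs) combination, but they are drawn from different cars, which is why the mpg column differs between the two outputs above.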