The package provides a pipe-style interface for data.table. It preserves
all data.table features without significant impact on performance. The 'let'
and 'take' functions are simplified interfaces for the most common data
manipulation tasks.
To select rows from data: take_if(mtcars, am==0)
To select columns from data: take(mtcars, am, vs, mpg)
To aggregate data: take(mtcars, mean_mpg = mean(mpg), by = am)
To aggregate all non-grouping columns: take(mtcars, fun = mean, by = am)
To aggregate several columns with one summary: take(mtcars, mpg, hp, fun = mean, by = am)
To get a total summary, omit the 'by' argument: take(mtcars, fun = mean)
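For orientation, the selection and aggregation calls above can be compared with plain data.table expressions. This is a minimal sketch, assuming the 'maditr' and 'data.table' packages are installed; the object names (dt, rows_take, rows_dt, by_take, by_dt) are illustrative only.

```r
library(data.table)
library(maditr)

data(mtcars)
dt = as.data.table(mtcars)

# row selection: take_if() next to the data.table 'i' argument
rows_take = take_if(mtcars, am == 0)
rows_dt = dt[am == 0]

# grouped aggregation: take() next to the data.table 'j' and 'by' arguments
by_take = take(mtcars, mean_mpg = mean(mpg), by = am)
by_dt = dt[, .(mean_mpg = mean(mpg)), by = am]

# compare the two approaches
nrow(rows_take) == nrow(rows_dt)
all.equal(as.data.frame(by_take), as.data.frame(by_dt))
```

The comparison goes through as.data.frame() so that only column names and values are checked, not data.table-specific attributes.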
Use magrittr pipe '%>%' to chain several operations:
mtcars %>% let(mpg_hp = mpg/hp) %>% take(mean(mpg_hp), by = am)
To modify variables or add new variables:
mtcars %>% let(new_var = 42, new_var2 = new_var*hp) %>% head()
To drop a variable, assign NULL to it: let(mtcars, am = NULL) %>% head()
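The adding, modifying and dropping steps can be combined in one chain. The following is a small sketch, assuming 'maditr' is installed; 'res' is an illustrative name, and the checks at the end only inspect the returned object.

```r
library(maditr)

data(mtcars)

res = mtcars %>%
    let(new_var = 42, new_var2 = new_var * hp) %>%  # add two columns
    let(am = NULL)                                   # then drop 'am'

# columns present in the result but not in the original data
setdiff(names(res), names(mtcars))
# 'am' should no longer be among the result columns
"am" %in% names(res)
# new_var2 was computed from new_var created in the same call
all(res$new_var2 == 42 * res$hp)
```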
For parametric assignment use ':=':
new_var = "my_var"
old_var = "mpg"
mtcars %>% let((new_var) := get(old_var)*2) %>% head()
For more sophisticated operations see 'query'/'query_if': these
functions translate their arguments one-to-one to the '[.data.table'
method. Additionally, there are some conveniences, such as automatic
conversion of a 'data.frame' to a 'data.table'.
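To illustrate the one-to-one translation, here is a hedged sketch that places a 'query_if' call next to the bracket expression it is expected to correspond to. It assumes 'maditr' and 'data.table' are installed and that 'query_if' takes its arguments in the order data, i, j, by; the object names are illustrative.

```r
library(data.table)
library(maditr)

data(mtcars)

# i, j and by are written as they would appear inside dt[i, j, by]
res_query = query_if(mtcars, am == 0, .(mean_mpg = mean(mpg)), by = vs)

# the same expression with an explicit conversion and the bracket method
res_dt = as.data.table(mtcars)[am == 0, .(mean_mpg = mean(mpg)), by = vs]

all.equal(as.data.frame(res_query), as.data.frame(res_dt))
```

Note that 'mtcars' is passed as a plain data.frame in the first call, relying on the automatic conversion mentioned above.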
# examples from 'dplyr' package
data(mtcars)

# Newly created variables are available immediately
mtcars %>%
    let(
        cyl2 = cyl * 2,
        cyl4 = cyl2 * 2
    ) %>%
    head()
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb cyl2 cyl4
#> 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   12   24
#> 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   12   24
#> 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    8   16
#> 4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   12   24
#> 5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   16   32
#> 6: 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   12   24

# You can also use let() to remove variables and
# modify existing variables
mtcars %>%
    let(
        mpg = NULL,
        disp = disp * 0.0163871 # convert to litres
    ) %>%
    head()
#>    cyl     disp  hp drat    wt  qsec vs am gear carb
#> 1:   6 2.621936 110 3.90 2.620 16.46  0  1    4    4
#> 2:   6 2.621936 110 3.90 2.875 17.02  0  1    4    4
#> 3:   4 1.769807  93 3.85 2.320 18.61  1  1    4    1
#> 4:   6 4.227872 110 3.08 3.215 19.44  1  0    3    1
#> 5:   8 5.899356 175 3.15 3.440 17.02  0  0    3    2
#> 6:   6 3.687098 105 2.76 3.460 20.22  1  0    3    1

# window functions are useful for grouped computations
mtcars %>%
    let(rank = rank(-mpg, ties.method = "min"), by = cyl) %>%
    head()
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
#> 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    2
#> 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    2
#> 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    8
#> 4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    1
#> 5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    2
#> 6: 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    6

#>     mpg disp  hp drat    wt  qsec vs am gear carb
#> 1: 21.0  160 110 3.90 2.620 16.46  0  1    4    4
#> 2: 21.0  160 110 3.90 2.875 17.02  0  1    4    4
#> 3: 22.8  108  93 3.85 2.320 18.61  1  1    4    1
#> 4: 21.4  258 110 3.08 3.215 19.44  1  0    3    1
#> 5: 18.7  360 175 3.15 3.440 17.02  0  0    3    2
#> 6: 18.1  225 105 2.76 3.460 20.22  1  0    3    1

#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb  displ_l
#> 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 2.621932
#> 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 2.621932
#> 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 1.769804
#> 4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 4.227866
#> 5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 5.899347
#> 6: 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 3.687092

#>      displ_l
#>  1: 2.621932
#>  2: 2.621932
#>  3: 1.769804
#>  4: 4.227866
#>  5: 5.899347
#>  6: 3.687092
#>  7: 5.899347
#>  8: 2.403984
#>  9: 2.307300
#> 10: 2.746474
#> 11: 2.746474
#> 12: 4.519556
#> 13: 4.519556
#> 14: 4.519556
#> 15: 7.734700
#> 16: 7.538055
#> 17: 7.210313
#> 18: 1.289663
#> 19: 1.240502
#> 20: 1.165121
#> 21: 1.968088
#> 22: 5.211090
#> 23: 4.981671
#> 24: 5.735477
#> 25: 6.554830
#> 26: 1.294579
#> 27: 1.971365
#> 28: 1.558411
#> 29: 5.751864
#> 30: 2.376126
#> 31: 4.932510
#> 32: 1.982836
#>      displ_l

# can refer to both contextual variables and variable names:
var = 100
mtcars %>% let(cyl = cyl * var) %>% head()
#> Error in cyl * var: non-numeric argument to binary operator

#>      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#>  1: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#>  2: 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#>  3: 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#>  4: 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
#>  5: 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#>  6: 22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#>  7: 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#>  8: 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#>  9: 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
#> 10: 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
#> 11: 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
#> 12: 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
#> 13: 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
#> 14: 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
#> 15: 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> 16: 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
#> 17: 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
#> 18: 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
#> 19: 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2

#>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> 1: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> 2: 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> 3: 22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#> 4: 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1

# A 'take' with summary functions applied without the 'by' argument returns aggregated data
mtcars %>% take(mean = mean(disp), n = .N)
#>        mean  n
#> 1: 230.7219 32

#>    am     mean  n
#> 1:  1 143.5308 13
#> 2:  0 290.3789 19

#>    am vs     mean  n
#> 1:  1  0 206.2167  6
#> 2:  1  1  89.8000  7
#> 3:  0  1 175.1143  7
#> 4:  0  0 357.6167 12

#>    eval(var)
#> 1:    6.1875

#>    vsam      mpg      cyl     disp        hp     drat       wt     qsec
#> 1:    1 20.28462 5.692308 189.4692 138.46154 3.738462 3.038846 18.04231
#> 2:    2 28.37143 4.000000  89.8000  80.57143 4.148571 2.028286 18.70000
#> 3:    0 15.05000 8.000000 357.6167 194.16667 3.120833 4.104083 17.14250
#>        gear     carb
#> 1: 4.076923 3.307692
#> 2: 4.142857 1.428571
#> 3: 3.000000 3.083333