Overview

tidyft is an extension of data.table. Using modification by reference whenever possible, this toolkit is designed for big data analysis in high-performance desktop or laptop computers. The syntax of the package is similar or identical to tidyverse. It is user friendly, memory efficient and time saving. For more information, check its ancestor package tidyfst.

This design is best for big data manipulation on out of memory data using facilities provided by fst. In such ways, you can handle the most quantity of data in the least time and space on your computer.

Installation

You can install the released version of tidyft via:

install.packages("tidyft") 

Tutorial

See vignettes.

Performance



rm(list = ls())

library(profvis)
library(dplyr)
library(tidyft)
as.data.frame(starwars) -> starwars
starwars[sample.int(1:nrow(starwars),1e6,replace = T),] -> starwars
copy(starwars) -> dat1
copy(starwars) -> dat2
copy(starwars) -> dat3

profvis({
  dat1 %>%
    dplyr::as_tibble() %>%
    dplyr::select(name, dplyr::ends_with("color")) %>%
    dplyr::arrange(hair_color,skin_color,eye_color) -> a

  setorder(setDT(dat2)[,.SD,.SDcols = patterns("name|color$")],
           hair_color,skin_color,eye_color) -> b

  dat3 %>%
    tidyft::setDT() %>%
    tidyft::select("name|color$") %>%
    tidyft::arrange(hair_color,skin_color,eye_color) -> c


})

all.equal(a,b)
#> [1] TRUE
all.equal(b,c)
#> [1] TRUE