Count the number of times each pair of items appears together within a group. For example, this can count how often two words co-occur within the same document. This function is modeled on pairwise_count in the widyr package, but several parameters have very different defaults.

pairwise_count_dt(
  .data,
  .group,
  .value,
  upper = FALSE,
  diag = FALSE,
  sort = TRUE
)

Arguments

.data

A data.frame.

.group

Name of the column that defines the groups within which pairs are counted.

.value

Name of the column holding the items to pair. The paired items end up in the V1 and V2 columns.

upper

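When FALSE (default), duplicated combinations are removed, so each unordered pair is reported once. When TRUE, both orderings of a pair (V1, V2 and V2, V1) are returned.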

diag

Whether to include the diagonal (V1 == V2) in the output. Defaults to FALSE.

sort

Whether to sort rows by count in descending order. Defaults to TRUE.

Value

A data.table with three columns named "V1", "V2" and "n". The item combinations are in "V1" and "V2", and their counts are in "n".

See also

pairwise_count in the widyr package

Examples


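library(tidyfst)     # provides pairwise_count_dt()
library(data.table)  # provides data.table()

# Five groups with two letters each
dat <- data.table(group = rep(1:5, each = 2),
                  letter = c("a", "b",
                             "a", "c",
                             "a", "c",
                             "b", "e",
                             "b", "f"))
pairwise_count_dt(dat, group, letter)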
#>        V1     V2     n
#>    <char> <char> <int>
#> 1:      a      c     2
#> 2:      a      b     1
#> 3:      b      e     1
#> 4:      b      f     1
pairwise_count_dt(dat, group, letter, sort = FALSE)
#>        V1     V2     n
#>    <char> <char> <int>
#> 1:      a      b     1
#> 2:      a      c     2
#> 3:      b      e     1
#> 4:      b      f     1
pairwise_count_dt(dat, group, letter, upper = TRUE)
#>        V1     V2     n
#>    <char> <char> <int>
#> 1:      a      c     2
#> 2:      c      a     2
#> 3:      a      b     1
#> 4:      b      a     1
#> 5:      b      e     1
#> 6:      e      b     1
#> 7:      b      f     1
#> 8:      f      b     1
pairwise_count_dt(dat, group, letter, diag = TRUE)
#>        V1     V2     n
#>    <char> <char> <int>
#> 1:      a      a     3
#> 2:      b      b     3
#> 3:      a      c     2
#> 4:      c      c     2
#> 5:      a      b     1
#> 6:      b      e     1
#> 7:      e      e     1
#> 8:      b      f     1
#> 9:      f      f     1
pairwise_count_dt(dat, group, letter, diag = TRUE, upper = TRUE)
#>         V1     V2     n
#>     <char> <char> <int>
#>  1:      a      a     3
#>  2:      b      b     3
#>  3:      a      c     2
#>  4:      c      a     2
#>  5:      c      c     2
#>  6:      a      b     1
#>  7:      b      a     1
#>  8:      b      e     1
#>  9:      e      b     1
#> 10:      e      e     1
#> 11:      b      f     1
#> 12:      f      b     1
#> 13:      f      f     1

# Column names can also be given as character strings.
pairwise_count_dt(dat, "group", "letter")
#>        V1     V2     n
#>    <char> <char> <int>
#> 1:      a      c     2
#> 2:      a      b     1
#> 3:      b      e     1
#> 4:      b      f     1
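
To make the counting rule concrete, the following is a minimal sketch of how such pair counts could be computed with a self-join in data.table. It is not the package's implementation: the helper name count_pairs_sketch and the fixed column names group and item are hypothetical, and deduplicating items within a group before pairing is an assumption.

```r
library(data.table)

# Hypothetical helper (not part of any package): count co-occurring pairs
# for a data.table with columns `group` and `item`.
count_pairs_sketch <- function(dt, upper = FALSE, diag = FALSE, sort = TRUE) {
  # Assumption: an item repeated within one group is counted once.
  d <- unique(dt[, .(group, item)])
  # A self-join on `group` produces every ordered pair of items in each group.
  pairs <- merge(d, d, by = "group", allow.cartesian = TRUE)
  setnames(pairs, c("item.x", "item.y"), c("V1", "V2"))
  # Drop the diagonal (an item paired with itself) unless requested.
  if (!diag) pairs <- pairs[V1 != V2]
  # Keep one ordering per pair unless both orderings are requested.
  if (!upper) pairs <- pairs[V1 <= V2]
  # Count the groups in which each pair occurs.
  res <- pairs[, .(n = .N), by = .(V1, V2)]
  # Optionally put the most frequent pairs first.
  if (sort) setorder(res, -n)
  res[]
}

dat2 <- data.table(group = rep(1:5, each = 2),
                   item = c("a", "b", "a", "c", "a", "c", "b", "e", "b", "f"))
count_pairs_sketch(dat2)
```

On this data the sketch returns four pairs, with a and c counted twice, which agrees with the counts in the first example above. The order of tied rows may differ from the output of pairwise_count_dt.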