Create a tbl_graph (a class provided by tidygraph) from a tidy table of document IDs and keywords. Each row should contain exactly one keyword, in tidy format. The function automatically computes the frequency and the community (group) number of each node, where each node represents a keyword.

keyword_group(
  dt,
  id = "id",
  keyword = "keyword",
  top = 200,
  min_freq = 1,
  com_detect_fun = group_fast_greedy
)

Arguments

dt

A data.frame with at least two columns: one holding the document ID and one holding the keyword.

id

A character string giving the name of the document ID column. Defaults to "id".

keyword

A character string giving the name of the keyword column. Defaults to "keyword".

top

The number of most frequent keywords to select. If there are ties, more than top keywords may be selected. Defaults to 200.

min_freq

Minimum number of occurrences a keyword needs in order to be selected. Defaults to 1.

com_detect_fun

Community detection function provided by tidygraph (these are wrappers around the clustering functions in igraph). See group_graph for the other available algorithms. Defaults to group_fast_greedy.

Value

A tbl_graph representing the keyword co-occurrence network, with the frequency and group number of each keyword stored in the node data.
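The node and edge tables of a tbl_graph can be pulled out with tidygraph's activate() and as_tibble(). The sketch below illustrates this on a small hand-made graph rather than on real output: the data frames node_df and edge_df and all values in them are hypothetical stand-ins that only mimic the column layout (name, freq, group for nodes; from, to, n for edges) seen in the examples on this page.

```r
library(tidygraph)
library(dplyr)

# Hypothetical toy data mimicking the structure of the returned object
node_df <- data.frame(
  name = c("a", "b", "c", "d"),
  freq = c(3L, 2L, 2L, 1L)
)
edge_df <- data.frame(
  from = c(1L, 1L, 2L, 3L),
  to   = c(2L, 3L, 3L, 4L),
  n    = c(2L, 1L, 1L, 1L)   # co-occurrence counts (made up)
)

# Build an undirected graph and attach a community label to each node
g <- tbl_graph(nodes = node_df, edges = edge_df, directed = FALSE) %>%
  mutate(group = group_fast_greedy())

# Extract the node table (name, freq, group) and the edge table (from, to, n)
node_tbl <- g %>% activate(nodes) %>% as_tibble()
edge_tbl <- g %>% activate(edges) %>% as_tibble()
```

The same two extraction calls should apply to the object returned by keyword_group, since it is also a tbl_graph.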

Details

This function takes a tidy table of document IDs and keywords. Only the top keywords with the largest frequencies are kept, and a minimum number of occurrences can also be required. For guidance on choosing a community detection algorithm, see the references below.
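The selection rule described above (keep the most frequent keywords, allow ties, enforce a minimum frequency) can be sketched with dplyr. This is an illustration under assumptions, not the package's actual implementation: the toy table and the order of the filtering steps are hypothetical, and the variables top and min_freq simply stand in for the arguments of the same name.

```r
library(dplyr)

# Hypothetical toy data: one keyword per row
toy <- tibble(
  id      = c(1, 1, 2, 2, 3, 3, 4, 4),
  keyword = c("a", "b", "a", "c", "a", "b", "c", "d")
)

top      <- 2  # stand-in for the `top` argument
min_freq <- 2  # stand-in for the `min_freq` argument

selected <- toy %>%
  count(keyword, name = "freq") %>%            # frequency of each keyword
  filter(freq >= min_freq) %>%                 # drop rare keywords
  slice_max(freq, n = top, with_ties = TRUE)   # keep the top ones, ties included

selected
```

Here "b" and "c" are tied for second place, so three keywords are kept even though top is 2, which matches the tie behaviour described for the top argument.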

References

de Sousa, F. B., & Zhao, L. (2014). Evaluating and comparing the igraph community detection algorithms. 2014 Brazilian Conference on Intelligent Systems. IEEE.

Yang, Z., Algesheimer, R., & Tessone, C. J. (2016). A comparative analysis of community detection algorithms on artificial networks. Scientific Reports, 6, 30750.

Examples

library(akc)
bibli_data_table %>%
  keyword_clean(id = "id", keyword = "keyword") %>%
  keyword_group(id = "id", keyword = "keyword")
#> # A tbl_graph: 203 nodes and 1223 edges
#> #
#> # An undirected simple graph with 1 component
#> #
#> # Node Data: 203 × 3 (active)
#>   name                  freq group
#>   <chr>                <int> <int>
#> 1 information literacy    58     4
#> 2 academic libraries     133     1
#> 3 archives                12     4
#> 4 higher education        16     4
#> 5 bibliometrics           31     3
#> 6 assessment              15     2
#> # … with 197 more rows
#> #
#> # Edge Data: 1,223 × 3
#>    from    to     n
#>   <int> <int> <int>
#> 1     1   116    14
#> 2     1     2    12
#> 3     2    29     8
#> # … with 1,220 more rows

# use the Louvain algorithm for community detection

bibli_data_table %>%
  keyword_clean(id = "id", keyword = "keyword") %>%
  keyword_group(id = "id", keyword = "keyword",
                com_detect_fun = group_louvain)
#> # A tbl_graph: 203 nodes and 1223 edges
#> #
#> # An undirected simple graph with 1 component
#> #
#> # Node Data: 203 × 3 (active)
#>   name                  freq group
#>   <chr>                <int> <int>
#> 1 information literacy    58     7
#> 2 academic libraries     133     4
#> 3 archives                12     2
#> 4 higher education        16     6
#> 5 bibliometrics           31     5
#> 6 assessment              15     7
#> # … with 197 more rows
#> #
#> # Edge Data: 1,223 × 3
#>    from    to     n
#>   <int> <int> <int>
#> 1     1   116    14
#> 2     1     2    12
#> 3     2    29     8
#> # … with 1,220 more rows

# for other algorithms, see ?tidygraph::group_graph