Construct a network of documents based on keyword co-occurrence

Create a tbl_graph (a class provided by tidygraph) from a tidy table of document IDs and keywords. Each row should contain exactly one document-keyword pair, in tidy format. The function also groups the documents using the supplied community detection function.

doc_group(
  dt,
  id = "id",
  keyword = "keyword",
  com_detect_fun = group_fast_greedy
)

Arguments

dt: A data.frame containing at least two columns with document ID and keyword.
id: A character string giving the name of the document ID column. Defaults to "id".
keyword: A character string giving the name of the keyword column. Defaults to "keyword".
com_detect_fun: Community detection function provided by tidygraph (wrappers around clustering functions provided by igraph). See group_graph for other available algorithms. Defaults to group_fast_greedy.

Value

A tbl_graph, representing the document relation network based on keyword co-occurrence.

Details

Just as keywords can be classified by the documents they appear in, documents can be classified by the keywords they contain. In the output network, the nodes are documents, and an edge between two documents means that they share at least one keyword.
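
The co-occurrence idea can be sketched in base R on a small made-up table. This is only an illustration of counting shared keywords between documents, not the package's actual implementation. The `toy` data and the column names `id_from`, `id_to`, and `n` are assumptions chosen for this example.

```r
# Toy tidy table (made-up data): one document-keyword pair per row
toy <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4),
  keyword = c("network", "graph", "network", "graph",
              "graph", "cluster", "cluster"),
  stringsAsFactors = FALSE
)

# Self-join on keyword: every pair of documents that share a keyword
pairs <- merge(toy, toy, by = "keyword", suffixes = c("_from", "_to"))

# Keep each unordered pair once and drop self-pairs
pairs <- pairs[pairs$id_from < pairs$id_to, ]

# Count how many keywords each pair of documents has in common
edges <- aggregate(keyword ~ id_from + id_to, data = pairs, FUN = length)
names(edges)[names(edges) == "keyword"] <- "n"

edges
```

In this sketch, documents 1 and 2 share two keywords, so their edge has the largest count, while document 4 is connected only to document 3. A community detection function would then be applied to a graph built from such an edge list.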

Examples

 library(akc)
 bibli_data_table %>%
   keyword_clean(id = "id", keyword = "keyword") %>%
   doc_group(id = "id", keyword = "keyword") -> grouped_doc

 grouped_doc
#> # A tbl_graph: 894 nodes and 20317 edges
#> #
#> # An undirected simple graph with 2 components
#> #
#> # Node Data: 894 × 2 (active)
#>    id    group
#>    <chr> <int>
#>  1 647       1
#>  2 219       1
#>  3 264       4
#>  4 1095      1
#>  5 830       1
#>  6 227       1
#>  7 356       4
#>  8 15        2
#>  9 981       3
#> 10 10        2
#> # ℹ 884 more rows
#> #
#> # Edge Data: 20,317 × 3
#>    from    to     n
#>   <int> <int> <int>
#> 1     1   186     9
#> 2     2   440     8
#> 3     3   298     6
#> # ℹ 20,314 more rows