Create a document-class matrix from a PatentsView bulk table.

patentsview_class_matrix(table, outFile = NULL, class_id = NULL,
  doc_id = "patent_id", rank = "sequence", type = NULL, ...,
  sparse = TRUE, overwrite = FALSE)

Arguments

table

A matrix-like object with the columns named by class_id, doc_id, and rank. Alternatively, the path to a tab-separated file containing such a table, an Arrow dataset, or the name of a table, which is passed to download_patentsview_bulk.

outFile

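Path to an rds file in which to save the result. If the file already exists, it is loaded instead of being recreated, unless overwrite is TRUE.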

class_id, doc_id, rank

Names of the columns in table that hold the class, the document ID, and the class rank.

type

Table name, used to set a default class_id.

...

Additional arguments to be passed to download_patentsview_bulk, if table is a table name.

sparse

Logical; if FALSE, returns a regular, dense matrix.

overwrite

Logical; if TRUE, overwrites an existing outFile rather than loading it.
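
The interaction between outFile and overwrite described above can be sketched as a load-or-build pattern. This is a minimal illustration, not the package's implementation: load_or_build and its build argument are hypothetical names introduced here, and only base R's readRDS and saveRDS are assumed.

```r
# Hypothetical helper illustrating the outFile / overwrite behavior.
load_or_build <- function(build, outFile = NULL, overwrite = FALSE) {
  # Reuse a previously saved result unless the caller asks to overwrite it.
  if (!is.null(outFile) && file.exists(outFile) && !overwrite) {
    return(readRDS(outFile))
  }
  result <- build()
  # Save the fresh result only when a path was given.
  if (!is.null(outFile)) saveRDS(result, outFile)
  result
}

out <- tempfile(fileext = ".rds")
first <- load_or_build(function() 1, outFile = out)
# The file now exists, so this call loads it and never runs build.
second <- load_or_build(function() 2, outFile = out)
# With overwrite = TRUE, the result is rebuilt and the file replaced.
third <- load_or_build(function() 3, outFile = out, overwrite = TRUE)
```

Under this pattern, a stale file is silently reused until overwrite is set to TRUE, which is worth keeping in mind after the source table changes.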

Value

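A sparse matrix (or a regular, dense matrix if sparse is FALSE) with documents in rows, classes in columns, and the class rank as values. Judging by the example below, the rank is stored as sequence + 1, so a sequence of 0 remains a non-zero entry.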
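
To make the shape of the returned value concrete, the following sketch builds a comparable matrix by hand with the Matrix package. It is an assumption-laden illustration rather than the function's actual code: the sequence + 1 offset is inferred from the example output in the Examples section, and the variable names are introduced here.

```r
library(Matrix)

# Long-format table: one row per document-class assignment.
table <- data.frame(
  patent_id = c("a", "a", "b"),
  class = c(1, 3, 2),
  sequence = c(1, 0, 0)
)

# Map document and class IDs to row and column indices.
docs <- factor(table$patent_id)
classes <- factor(table$class)

# Assumption: ranks are stored as sequence + 1, so that a sequence
# of 0 is kept as an explicit non-zero entry in the sparse matrix.
m <- sparseMatrix(
  i = as.integer(docs),
  j = as.integer(classes),
  x = table$sequence + 1,
  dimnames = list(levels(docs), levels(classes))
)

# Dense equivalent, comparable to what sparse = FALSE would return.
dense <- as.matrix(m)
```

Here the empty cells of m are the document-class pairs with no assignment, which become zeros in dense.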

Examples

table <- data.frame(
  patent_id = c("a", "a", "b"),
  class = c(1, 3, 2),
  sequence = c(1, 0, 0)
)
patentsview_class_matrix(table, class_id = "class")
#> 2 x 3 sparse Matrix of class "dgCMatrix"
#>   1 2 3
#> a 2 . 1
#> b . 1 .

if (FALSE) {

# get a matrix of WIPO class assignments
wipo_fields <- patentsview_class_matrix("wipo")

# get a subset without creating the full matrix
wipo <- download_patentsview_bulk("wipo", make_db = TRUE)
wipo_fields_sub <- patentsview_class_matrix(dplyr::compute(dplyr::filter(
  wipo, patent_id %in% c("10000002", "10000015", "10000017")
)))
}