patentsview_class_matrix.Rd
Create a document-class matrix from a PatentsView bulk table.
patentsview_class_matrix(table, outFile = NULL, class_id = NULL,
  doc_id = "patent_id", rank = "sequence", type = NULL, ...,
  sparse = TRUE, overwrite = FALSE)
A matrix-like object with columns specified by class_id, doc_id, and rank.
Alternatively, the path to a tab-separated file containing such a matrix, an Arrow dataset, or the name of a table, to be passed to download_patentsview_bulk.
Path to an rds file to save the results to.
Column names of the class, document ID, and class rank to be pulled from table.
Table name, used to set a default class_id.
Additional arguments to be passed to download_patentsview_bulk, if table is a table name.
Logical; if FALSE, returns a regular, dense matrix.
Logical; if TRUE, overwrites an existing outFile rather than loading it.
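The interaction between outFile and overwrite described above follows a common load-or-compute pattern. The sketch below illustrates that pattern in base R only; cached_result is a hypothetical helper written for this illustration, not a function of the package, and the package's actual implementation may differ.

```r
# Hypothetical helper (not part of the package): return the saved result if
# outFile exists and overwrite is FALSE; otherwise compute and save it.
cached_result <- function(compute, outFile = NULL, overwrite = FALSE) {
  if (!is.null(outFile) && file.exists(outFile) && !overwrite) {
    return(readRDS(outFile))
  }
  result <- compute()
  if (!is.null(outFile)) saveRDS(result, outFile)
  result
}

path <- tempfile(fileext = ".rds")

# First call computes the result and writes it to path
first <- cached_result(function() matrix(1:4, 2), path)

# Second call finds the file and loads it; the new compute function is ignored
second <- cached_result(function() matrix(0, 2, 2), path)

# With overwrite = TRUE, the result is recomputed and the file is replaced
third <- cached_result(function() matrix(0, 2, 2), path, overwrite = TRUE)
```

Here second is identical to first because the existing file is loaded, while third reflects the recomputed result.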
A sparse matrix (or regular matrix if sparse is FALSE) with documents in rows, classes in columns, and the class rank (sequence) as values.
In the example below, the stored values are sequence + 1, so a sequence of 0 appears as 1.
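To make the shape of the returned object concrete, the following is a minimal sketch of how a long table of (document, class, rank) rows could be turned into such a matrix with Matrix::sparseMatrix. It is an illustration under stated assumptions, not the package's implementation: the variable names are made up for this sketch, and the offset of one added to the rank is inferred from the example output rather than confirmed by the source.

```r
library(Matrix)

# Toy long-format table with one row per document-class assignment
tab <- data.frame(
  patent_id = c("a", "a", "b"),
  class = c(1, 3, 2),
  sequence = c(1, 0, 0)
)

# Map documents and classes to integer row and column indices
docs <- factor(tab$patent_id)
classes <- factor(tab$class)

# Assumption: store sequence + 1 so that a rank of 0 stays distinct from
# an absent entry, which a sparse matrix represents as zero
m <- sparseMatrix(
  i = as.integer(docs),
  j = as.integer(classes),
  x = tab$sequence + 1,
  dims = c(nlevels(docs), nlevels(classes)),
  dimnames = list(levels(docs), levels(classes))
)

# Dense equivalent, analogous to requesting sparse = FALSE
dense <- as.matrix(m)
```

In this sketch, m is a dgCMatrix with documents "a" and "b" in rows and classes "1", "2", and "3" in columns.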
table <- data.frame(
  patent_id = c("a", "a", "b"),
  class = c(1, 3, 2),
  sequence = c(1, 0, 0)
)
patentsview_class_matrix(table, class_id = "class")
#> 2 x 3 sparse Matrix of class "dgCMatrix"
#>   1 2 3
#> a 2 . 1
#> b . 1 .
if (FALSE) {
  # get a matrix of WIPO class assignments
  wipo_fields <- patentsview_class_matrix("wipo")

  # get a subset without creating the full matrix
  wipo <- download_patentsview_bulk("wipo", make_db = TRUE)
  wipo_fields_sub <- patentsview_class_matrix(dplyr::compute(dplyr::filter(
    wipo, patent_id %in% c("10000002", "10000015", "10000017")
  )))
}