Title: | Access to the 'DraCor' API |
---|---|
Description: | Provide an interface for 'Drama Corpora Project' ('DraCor') API: <https://dracor.org/documentation/api>. |
Authors: | Ivan Pozdniakov [aut, cre] |
Maintainer: | Ivan Pozdniakov <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.4 |
Built: | 2025-02-24 05:49:29 UTC |
Source: | https://github.com/dracor-org/rdracor |
The function dracor_api() sends a GET request to the DraCor API with a
specified expected type and parses the result according to that type.
dracor_api( request, expected_type = c("application/json", "application/xml", "text/csv", "text/plain"), parse = TRUE, default_type = FALSE, split_text = TRUE, as_tibble = TRUE, ... )
request |
Character, valid GET request. |
expected_type |
Character, 'MIME' type: one of
|
parse |
Logical, if |
default_type |
Logical, if |
split_text |
Logical, if |
as_tibble |
Logical, if |
... |
Other arguments passed to a parser function. |
There are four different 'MIME' types (aka internet media types) that can be
retrieved from the DraCor API; which of them are available depends on the
API command. When parse = TRUE is used, the content is parsed according to
the 'MIME' type selected in expected_type:
application/json
application/xml
text/csv
text/plain: no need for additional preprocessing
The content of the response to a GET request to the 'DraCor' API. If
parse = FALSE or default_type = TRUE, a single character value is returned.
Otherwise, the result is parsed according to the value of the expected_type
parameter. The structure of the output depends on the selected expected_type
value, the corresponding parsing function (see expected_type) and any
additional parameters that are passed to that function.
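As a rough illustration of the request argument, the sketch below assembles a request URL from a base URL and an endpoint path and then hands it to dracor_api(). build_request() is a hypothetical helper and not part of the package. The network call is guarded so that it is only attempted when rdracor is installed, and request errors are swallowed.

```r
# Hypothetical helper (not part of rdracor): join a base URL and an endpoint
# path without doubling or dropping the slash between them.
build_request <- function(base_url, endpoint) {
  paste0(sub("/+$", "", base_url), "/", sub("^/+", "", endpoint))
}

request <- build_request("https://dracor.org/api/v1/", "/info")

# Only attempt the request when rdracor is available; return NULL on failure.
info <- NULL
if (requireNamespace("rdracor", quietly = TRUE)) {
  info <- tryCatch(
    rdracor::dracor_api(request, expected_type = "application/json"),
    error = function(e) NULL
  )
}
```

Keeping URL assembly separate from the request makes it easy to check the request string before anything is sent.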
dracor_api("https://dracor.org/api/v1/info", expected_type = "application/json")
dracor_api_info() returns information about the 'DraCor' API: name of
the API, status, existdb version, API version etc.
dracor_api_info(dracor_api_url = NULL)
get_dracor_api_url()
set_dracor_api_url(new_dracor_api_url)
dracor_api_url |
Character, 'DraCor' API URL. If NULL (default), the current 'DraCor' API URL is used. |
new_dracor_api_url |
Character, 'DraCor' API URL that will replace the current 'DraCor' API URL. |
get_dracor_api_url(): Returns the 'DraCor' API URL in use.
set_dracor_api_url(): Sets a new 'DraCor' API URL (globally), returns NULL.
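Because set_dracor_api_url() changes the URL globally, it can be convenient to switch only for the duration of one call and then restore the previous value. The sketch below is one way to do that. with_api_url() is a hypothetical helper, not part of rdracor, and the getter and setter are passed as arguments so that the pattern can be tried with stand-ins, without the package or a network connection.

```r
# Hypothetical helper (not part of rdracor): evaluate `code` against another
# API URL, then restore the previous URL even if `code` fails.
with_api_url <- function(new_url, code,
                         getter = rdracor::get_dracor_api_url,
                         setter = rdracor::set_dracor_api_url) {
  old_url <- getter()
  on.exit(setter(old_url), add = TRUE)
  setter(new_url)
  force(code) # `code` is evaluated lazily, i.e. after the URL was switched
}

# Stand-in getter/setter pair that mimics a global setting.
state <- new.env()
state$url <- "https://dracor.org/api/v1"
mock_get <- function() state$url
mock_set <- function(url) {
  assign("url", url, envir = state)
  invisible(NULL)
}

seen <- with_api_url("https://staging.dracor.org/api", mock_get(),
                     getter = mock_get, setter = mock_set)

# With rdracor installed, the defaults would be used instead:
# with_api_url("https://staging.dracor.org/api", rdracor::dracor_api_info())
```

After the call, `seen` holds the URL that was active while `code` ran, and the stand-in setting is back at its previous value.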
dracor_api_info()
dracor_api_info("https://staging.dracor.org/api")
get_dracor_api_url()
dracor_sparql() submits SPARQL queries and parses the result.
dracor_sparql(sparql_query = NULL, parse = TRUE, ...)
sparql_query |
Character, SPARQL query. |
parse |
Logical, if |
... |
Additional arguments passed to |
SPARQL xml parsed.
dracor_sparql("SELECT * WHERE {?s ?p ?o} LIMIT 10")
# If you want to avoid parsing by xml2::read_xml():
dracor_sparql("SELECT * WHERE {?s ?p ?o} LIMIT 10", parse = FALSE)
get_character_plays() requests plays that include a character that can
be found in 'Wikidata' by its ID. get_character_plays() sends a
request and parses the result to get those plays as a data frame.
get_character_plays(char_wiki_id)
char_wiki_id |
Character value with 'Wikidata ID' for a character. 'Wikidata ID' can be found on https://www.wikidata.org/wiki/Wikidata:Main_Page. Character vector (longer than 1) is not supported. |
Data frame, in which one row represents one play. Information on author(s) name, character name, play name, URL and ID is represented in separate columns.
wiki_id <- "Q131412"
get_character_plays(wiki_id)
get_dracor_meta() returns metadata on available corpora as a dracor_meta
object that inherits from data frame (and can be used as such).
Use summary() and plot() on this object to get an even more
condensed summary.
get_dracor_meta()

## S3 method for class 'dracor_meta'
summary(object, ...)

## S3 method for class 'dracor_meta'
plot(x, ...)
object |
An object of class |
... |
Other arguments to be passed. |
x |
A |
dracor_meta
object that inherits data frame (and can be used
as such).
summary(dracor_meta): Meaningful summary for a dracor_meta object.
plot(dracor_meta): Plots how many plays are available for each corpus.
corpora_meta <- get_dracor_meta()
corpora_meta
summary(corpora_meta)
plot(corpora_meta)
get_net_cooccur_edges() requests an edge list for a play network, given
corpus and play names. Each row represents the co-occurrence of two characters
in a play, i.e. the number of scenes in which the two characters appear
together. This edge list can be used to construct a network for a play.
get_net_cooccur_edges(play = NULL, corpus = NULL, ...)
get_net_relations_edges(play = NULL, corpus = NULL, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
... |
Additional arguments passed to |
data frame with edges (each row = one edge of a network).
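To make the edge-list idea concrete, here is a minimal base R sketch that derives per-character degree and weighted degree from an edge list. The data frame is a made-up toy, and its column names (source, target, weight) are assumptions for illustration; the columns actually returned by get_net_cooccur_edges() may be named differently.

```r
# Toy edge list (invented values; column names are assumptions).
edges <- data.frame(
  source = c("emilia", "emilia", "odoardo"),
  target = c("odoardo", "claudia", "claudia"),
  weight = c(2, 3, 1),
  stringsAsFactors = FALSE
)

# Co-occurrence is undirected, so every edge counts for both endpoints.
long <- data.frame(
  character = c(edges$source, edges$target),
  weight = rep(edges$weight, 2),
  stringsAsFactors = FALSE
)

# Degree: number of co-occurring partners per character.
degree <- table(long$character)

# Weighted degree: total number of shared scenes per character.
weighted_degree <- tapply(long$weight, long$character, sum)
```

The same reshaping step (stacking both endpoint columns) works for any undirected edge list, whatever the column names turn out to be.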
get_net_relations_edges(): Retrieves kinship and other relationship
data, following the encoding scheme proposed in (Wiedmer et al. 2020).
Wiedmer N, Pagel J, Reiter N (2020). “Romeo, Freund des Mercutio: Semi-Automatische Extraktion von Beziehungen zwischen dramatischen Figuren.” In Konferenz Digital Humanities im deutschsprachigen Raum. doi:10.5281/zenodo.4621778.
get_net_cooccur_igraph
get_net_cooccur_gexf
get_net_cooccur_graphml
get_net_cooccur_metrics
get_net_relations_igraph
get_net_cooccur_edges(play = "lessing-emilia-galotti", corpus = "ger")
get_net_cooccur_gexf() requests a play co-occurrence network in 'GEXF'
(Graph Exchange XML Format), given play and corpus names. 'GEXF' is a format
used in 'Gephi', an open source software for network analysis and
visualization.
get_net_cooccur_gexf(play = NULL, corpus = NULL, parse = TRUE, ...)
get_net_relations_gexf(play = NULL, corpus = NULL, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
parse |
Logical, if |
... |
Additional arguments passed to |
'GEXF' data.
get_net_relations_gexf(): Retrieves kinship and other relationship
data, following the encoding scheme proposed in (Wiedmer et al. 2020).
Wiedmer N, Pagel J, Reiter N (2020). “Romeo, Freund des Mercutio: Semi-Automatische Extraktion von Beziehungen zwischen dramatischen Figuren.” In Konferenz Digital Humanities im deutschsprachigen Raum. doi:10.5281/zenodo.4621778.
get_net_cooccur_igraph
get_net_cooccur_metrics
get_net_cooccur_graphml
get_net_cooccur_edges
get_net_relations_igraph
get_net_cooccur_gexf(play = "lessing-emilia-galotti", corpus = "ger")
# If you want 'GEXF' without parsing by xml2::read_xml():
get_net_cooccur_gexf(
  play = "lessing-emilia-galotti",
  corpus = "ger",
  parse = FALSE
)
get_net_cooccur_graphml()
requests a play co-occurrence network in
'GraphML', given play and corpus names. 'GraphML' is a common format for
graphs based on XML.
get_net_cooccur_graphml(play = NULL, corpus = NULL, parse = TRUE, ...)
get_net_relations_graphml(play = NULL, corpus = NULL, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
parse |
Logical, if |
... |
Additional arguments passed to |
'GraphML' data.
get_net_relations_graphml(): Retrieves kinship and other relationship
data, following the encoding scheme proposed in (Wiedmer et al. 2020).
Wiedmer N, Pagel J, Reiter N (2020). “Romeo, Freund des Mercutio: Semi-Automatische Extraktion von Beziehungen zwischen dramatischen Figuren.” In Konferenz Digital Humanities im deutschsprachigen Raum. doi:10.5281/zenodo.4621778.
get_net_cooccur_igraph
get_net_cooccur_gexf
get_net_cooccur_metrics
get_net_cooccur_edges
get_net_relations_igraph
get_net_cooccur_graphml(play = "lessing-emilia-galotti", corpus = "ger")
# If you want 'GraphML' without parsing by xml2::read_xml():
get_net_cooccur_graphml(play = "lessing-emilia-galotti", corpus = "ger", parse = FALSE)
get_net_cooccur_igraph() returns a play network, given play and corpus
names. The play network is constructed from the characters' co-occurrence
matrix. Each node (vertex) is a character (circle) or a group of characters
(square); edge width is proportional to the number of play segments in which
two characters occur together.
get_net_cooccur_igraph(play = NULL, corpus = NULL, as_igraph = FALSE)

## S3 method for class 'cooccur_igraph'
plot(
  x,
  layout = igraph::layout_with_kk,
  vertex.label = label_cooccur_igraph(x),
  gender_colors = c(MALE = "#0073C2", FEMALE = "#EFC000", UNKNOWN = "#99979D"),
  vertex_size_metric = c("numOfWords", "numOfScenes", "numOfSpeechActs", "degree",
    "weightedDegree", "closeness", "betweenness", "eigenvector"),
  vertex_size_scale = c(5, 20),
  edge_size_scale = c(0.5, 4),
  vertex_label_adjust = TRUE,
  vertex.label.color = "#03070f",
  vertex.label.family = "sans",
  vertex.label.font = 2L,
  vertex.frame.color = "white",
  ...
)

## S3 method for class 'cooccur_igraph'
summary(object, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
as_igraph |
Logical, if |
x |
A |
layout |
Function, an algorithm used for the graph layout. See igraph.plotting. |
vertex.label |
Character vector of character names. By default,
function |
gender_colors |
Named vector with 3 values with colors for
MALE, FEMALE and UNKNOWN respectively. Set |
vertex_size_metric |
Character value, one of |
vertex_size_scale |
Numeric vector with two values. The first number sets
the mean size of a node (vertex), the second sets the node size variance. If
you specify vertex size yourself using parameter
|
edge_size_scale |
Numeric vector with two values. The first number
sets the average edge size, the second sets the edge size variance.
If you specify edge size yourself using parameter
|
vertex_label_adjust |
Logical. If |
vertex.label.color |
See igraph.plotting. |
vertex.label.family |
See igraph.plotting. |
vertex.label.font |
See igraph.plotting. |
vertex.frame.color |
See igraph.plotting. |
... |
Other arguments to be passed to plot.igraph |
object |
An object of class |
A cooccur_igraph object, which inherits from igraph and can be treated as such.
plot(cooccur_igraph): Plots cooccur_igraph using plot.igraph with slightly modified defaults.
summary(cooccur_igraph): Meaningful summary for a "cooccur_igraph" object: network properties, gender distribution.
get_net_relations_igraph
label_cooccur_igraph
emilia_igraph <- get_net_cooccur_igraph(
  play = "lessing-emilia-galotti",
  corpus = "ger"
)
igraph::diameter(emilia_igraph)
plot(emilia_igraph)
summary(emilia_igraph)
get_net_cooccur_metrics() requests network metrics for a specific play,
given play and corpus names. The play network is constructed from the
characters' co-occurrence matrix.
get_net_cooccur_metrics(play = NULL, corpus = NULL, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
... |
Additional arguments passed to |
List with network metrics for a specific play.
get_net_cooccur_igraph
get_net_cooccur_gexf
get_net_cooccur_graphml
get_net_cooccur_edges
get_net_relations_igraph
get_net_cooccur_metrics(play = "lessing-emilia-galotti", corpus = "ger")
get_net_relations_igraph() returns a play network, given play and corpus
names. The network represents kinship and other relationship data, following
the encoding scheme proposed in (Wiedmer et al. 2020).
get_net_relations_igraph(play = play, corpus = corpus, as_igraph = FALSE)

## S3 method for class 'relations_igraph'
summary(object, ...)

## S3 method for class 'relations_igraph'
plot(
  x,
  layout = igraph::layout_nicely,
  gender_colors = c(MALE = "#0073C2", FEMALE = "#EFC000", UNKNOWN = "#99979D"),
  show_others = c("vertex", "vertex_label", "none"),
  vertex_size = c(13, 4),
  vertex_label_size = c(0.8, 0.5),
  vertex_label_adjust = TRUE,
  vertex.label.color = "#03070f",
  vertex.label.family = "sans",
  vertex.label.font = 2L,
  vertex.frame.color = "white",
  edge.arrow.size = 0.25,
  edge.arrow.width = 1.5,
  edge.curved = 0.15,
  edge.label.family = "sans",
  edge.label.font = 4L,
  edge.label.cex = 0.75,
  ...
)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
as_igraph |
Logical, if |
object |
An object of class |
... |
Other arguments to be passed to plot.igraph |
x |
A |
layout |
Function, an algorithm used for graph layout. See layout_. |
gender_colors |
Named vector with 3 values with colors for
MALE, FEMALE and UNKNOWN respectively. Set |
show_others |
Character value. What to do with vertices without relations?
The default is |
vertex_size |
Numeric vector with two values. The first number is for nodes with relations, the second number is for all other nodes. |
vertex_label_size |
Numeric vector with two values. The first number defines label sizes for nodes with relations, the second number for nodes without relations. |
vertex_label_adjust |
Logical value. If |
vertex.label.color |
See igraph.plotting. |
vertex.label.family |
See igraph.plotting. |
vertex.label.font |
See igraph.plotting. |
vertex.frame.color |
See igraph.plotting. |
edge.arrow.size |
See igraph.plotting. |
edge.arrow.width |
See igraph.plotting. |
edge.curved |
See igraph.plotting. |
edge.label.family |
See igraph.plotting. |
edge.label.font |
See igraph.plotting. |
edge.label.cex |
See igraph.plotting. |
A relations_igraph object, which inherits from igraph and can be treated as such.
summary(relations_igraph): Meaningful summary for a "relations_igraph" object: relationships and their type.
plot(relations_igraph): Plots relations_igraph using plot.igraph with slightly modified defaults.
Wiedmer N, Pagel J, Reiter N (2020). “Romeo, Freund des Mercutio: Semi-Automatische Extraktion von Beziehungen zwischen dramatischen Figuren.” In Konferenz Digital Humanities im deutschsprachigen Raum. doi:10.5281/zenodo.4621778.
galotti_relations <- get_net_relations_igraph(
  play = "lessing-emilia-galotti",
  corpus = "ger"
)
plot(galotti_relations)
summary(galotti_relations)
get_play_characters()
requests miscellaneous information for characters in
a play, given play and corpus names: name, number and size of their lines,
gender, some network metrics etc.
get_play_characters(play = NULL, corpus = NULL, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
... |
Additional arguments passed to |
Data frame, every row represents one character in the play.
get_play_characters(play = "lessing-emilia-galotti", corpus = "ger")
get_play_metadata()
requests metadata for a specific play, given play
and corpus names.
get_play_metadata(play = NULL, corpus = NULL, full_metadata = TRUE, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
full_metadata |
Logical: if |
... |
Additional arguments passed to |
List with the play metadata.
get_net_cooccur_edges
get_play_rdf
get_play_characters
get_play_metadata(
  play = "lessing-emilia-galotti",
  corpus = "ger",
  full_metadata = FALSE
)
get_play_rdf() requests RDF (Resource Description Framework) data
for a play, given play and corpus names. RDF for plays can be useful for
extracting data for a play from
https://www.wikidata.org/wiki/Wikidata:Main_Page.
get_play_rdf(play = NULL, corpus = NULL, parse = TRUE, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
parse |
Logical, if |
... |
Additional arguments passed to |
RDF data parsed by xml2::read_xml().
get_play_metadata
get_play_characters
get_play_rdf(play = "lessing-emilia-galotti", corpus = "ger")
# If you want RDF without parsing by xml2::read_xml():
get_play_rdf(play = "lessing-emilia-galotti", corpus = "ger", parse = FALSE)
get_text_chr_spoken() requests lines and stage directions for a play,
given play and corpus names.
get_text_chr_spoken(
  play = NULL,
  corpus = NULL,
  gender = NULL,
  split_text = TRUE,
  ...
)

get_text_chr_spoken_bych(
  play = NULL,
  corpus = NULL,
  split_text = TRUE,
  as_data_frame = FALSE,
  ...
)

get_text_chr_stage(play = NULL, corpus = NULL, split_text = TRUE, ...)

get_text_chr_stage_with_sp(play = NULL, corpus = NULL, split_text = TRUE, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
gender |
Character, optional parameter to extract lines for characters
of specified gender: |
split_text |
If |
... |
Additional arguments passed to |
as_data_frame |
If |
For get_text_chr_spoken(), get_text_chr_stage() and
get_text_chr_stage_with_sp(): a character vector (if split_text = TRUE, the
default value) or a single character value (if split_text = FALSE).
For get_text_chr_spoken_bych():
split_text = TRUE and as_data_frame = FALSE (default): a named list with character vectors for every character
split_text = FALSE and as_data_frame = FALSE: a named character vector (one value = one character)
split_text = TRUE and as_data_frame = TRUE: a data frame in which every row represents a character; the text of the play is stored in a "text" column, which is a list column with a character vector of lines
split_text = FALSE and as_data_frame = TRUE: a data frame in which every row represents a character; the text of the play is stored in a "text" column, which is a simple character column
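The four return shapes listed for get_text_chr_spoken_bych() can be mimicked with toy data in base R, which may help when deciding which combination of split_text and as_data_frame to request. The sketch below is not the package's implementation. The lines are invented, the "character" column name is an assumption, and joining lines with a newline is an assumption about how unsplit text is collapsed.

```r
# Shape 1: named list, one character vector of lines per character.
by_character <- list(
  emilia = c("Line one.", "Line two."),
  odoardo = c("Line three.")
)

# Shape 2: named character vector, one value per character
# (the "\n" separator is an assumption).
collapsed <- vapply(by_character, paste, character(1), collapse = "\n")

# Shape 3: data frame with a list column "text".
df_list <- data.frame(character = names(by_character), stringsAsFactors = FALSE)
df_list$text <- unname(by_character)

# Shape 4: data frame with a plain character column "text".
df_chr <- data.frame(
  character = names(collapsed),
  text = unname(collapsed),
  stringsAsFactors = FALSE
)
```

The list-column form keeps individual lines addressable (for example df_list$text[[1]]), while the plain character column is easier to write to a flat file.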
get_text_chr_spoken_bych(): Retrieves lines grouped by characters in a play, given play and corpus names.
get_text_chr_stage(): Retrieves all stage directions of a play, given play and corpus names.
get_text_chr_stage_with_sp(): Retrieves all stage directions of a play including speakers (if applicable), given play and corpus names.
get_text_chr_spoken(play = "lessing-emilia-galotti", corpus = "ger")
get_text_chr_spoken(
  play = "lessing-emilia-galotti",
  corpus = "ger",
  gender = "FEMALE"
)
get_text_chr_spoken(
  play = "lessing-emilia-galotti",
  corpus = "ger",
  gender = "FEMALE",
  split_text = FALSE
)
get_text_chr_spoken_bych(
  play = "lessing-emilia-galotti",
  corpus = "ger"
)
get_text_chr_stage(
  play = "lessing-emilia-galotti",
  corpus = "ger"
)
get_text_chr_stage_with_sp(
  play = "lessing-emilia-galotti",
  corpus = "ger"
)
get_text_tei()
requests a text for a play in 'TEI' format, given play
and corpus names. 'TEI' is an XML vocabulary, which makes it easy to extract
structural information (Fischer et al. 2019).
get_text_tei(play = NULL, corpus = NULL, ...)
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
... |
Additional arguments passed to |
TEI data parsed by xml2::read_xml().
Fischer F, Börner I, Göbel M, Hechtl A, Kittel C, Milling C, Trilcke P (2019). “Programmable corpora: Introducing DraCor, an infrastructure for the research on European drama.” In Digital Humanities 2019: "Complexities" (DH2019). doi:10.5281/zenodo.4284002.
get_text_df
get_text_chr_spoken
tei_to_df
get_text_tei(play = "lessing-emilia-galotti", corpus = "ger")
# If you want a text in TEI without parsing by xml2::read_xml():
get_text_tei(play = "lessing-emilia-galotti", corpus = "ger", parse = FALSE)
label_cooccur_igraph() returns labels for plotting a cooccur_igraph
object. It controls overplotting of labels (i.e. character names) by dropping
extra labels when there are too many of them, and thus highlights the most
significant characters of the selected play. This function can be used to set
the vertex.label parameter of plot.cooccur_igraph.
label_cooccur_igraph(
  graph,
  max_graph_size = 30L,
  top_nodes = 3L,
  label_size_metric = c("betweenness", "numOfWords", "numOfScenes", "numOfSpeechActs",
    "degree", "weightedDegree", "closeness", "eigenvector")
)
graph |
|
max_graph_size |
Integer, maximum network size for plotting all labels.
If you don't want to delete any labels, set |
top_nodes |
Integer, number of labels to be plotted. Characters with the highest number of words will be selected. |
label_size_metric |
Character, a metric that is used to rank characters in a play. |
label_cooccur_igraph takes labels from the "name" column of the vertices
data frame and checks whether the network size is more than max_graph_size;
if it is, it returns names for the top top_nodes characters and NA for the
rest.
Character vector of character names.
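The label-thinning rule described for label_cooccur_igraph can be sketched in a few lines of base R. This is an independent illustration of the rule, not the package's code: thin_labels() is a hypothetical function, and it takes the labels and the ranking metric as plain vectors instead of reading them from a graph object.

```r
# Hypothetical re-implementation of the described rule on plain vectors.
thin_labels <- function(labels, metric, max_graph_size = 30L, top_nodes = 3L) {
  stopifnot(length(labels) == length(metric))
  # Small networks keep all labels.
  if (length(labels) <= max_graph_size) {
    return(labels)
  }
  # Otherwise keep only the labels of the highest-ranked nodes.
  n_keep <- min(top_nodes, length(labels))
  keep <- order(metric, decreasing = TRUE)[seq_len(n_keep)]
  out <- rep(NA_character_, length(labels))
  out[keep] <- labels[keep]
  out
}

thinned <- thin_labels(
  labels = c("a", "b", "c", "d"),
  metric = c(1, 5, 3, 4),
  max_graph_size = 3,
  top_nodes = 2
)
```

Here the network has four nodes, more than max_graph_size, so only the two nodes with the largest metric values ("b" and "d") keep their labels.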
emilia_igraph <- get_net_cooccur_igraph(
  play = "lessing-emilia-galotti",
  corpus = "ger"
)
label_cooccur_igraph(emilia_igraph, max_graph_size = 10, top_nodes = 4)
get_dracor() requests data on all plays in selected (or all) corpora.
get_dracor() returns a dracor object that inherits from data frame (and can
be used as such) but has its own summary method.
## S3 method for class 'dracor'
summary(object, ...)

get_dracor(corpus = "all", full_metadata = TRUE)
object |
An object of class |
... |
Other arguments to be passed to |
corpus |
Character vector with names of the corpora (you can find all
corpora names in |
full_metadata |
Logical: if |
You need to provide a vector with valid names of the corpora, e.g.
"rus", "ger" or "shake". Use the function get_dracor_meta to extract names
for all available corpora.
dracor object that inherits from data frame (and can be used as such).
summary(dracor): Meaningful summary for a dracor object.
tat <- get_dracor("tat")
summary(tat)
get_dracor(c("ita", "span", "greek"))
get_dracor()
The function get_text_df() returns a data frame with the text of
the selected play. tei_to_df() converts an existing 'TEI'
object to a data frame.
tei_to_df(tei)
get_text_df(play, corpus)
tei |
A TEI object stored as an object of class |
play |
Character, name of a play (you can find all play names in
|
corpus |
Character, name of the corpus (you can find all corpus names in
|
Text of a play as a data frame in tidy text format.
Each row represents one token. The text is tokenised by lines, notes and stage
directions (<p>, <l>, <stage> or <note>).
Column text contains the text of the line, other columns contain metadata
for the line.
get_text_df(): Retrieves the text of a play as a data frame,
given play and corpus names.
get_text_df(play = "lessing-emilia-galotti", corpus = "ger")
emilia_tei <- get_text_tei(play = "lessing-emilia-galotti", corpus = "ger")
tei_to_df(emilia_tei)