Download data from the 'layer' ("camada") table of one or more datasets published in the Free Brazilian Repository for Open Soil Data (FEBR), https://www.pedometria.org/febr/. This table includes sampling depth, horizon designation, and soil variables such as pH, carbon content, and clay content, among many others.
layer(
  data.set,
  variable,
  stack = FALSE,
  missing = list(depth = "keep", data = "keep"),
  standardization = list(
    plus.sign = "keep", plus.depth = 2.5,
    lessthan.sign = "keep", lessthan.frac = 0.5,
    repetition = "keep", combine.fun = "mean",
    transition = "keep", smoothing.fun = "mean",
    units = FALSE, round = FALSE
  ),
  harmonization = list(harmonize = FALSE, level = 2),
  progress = TRUE,
  verbose = TRUE,
  febr.repo = NULL
)
data.set | Character vector indicating the identification code of one or more data sets.
Use |
---|---|
variable | (optional) Character vector indicating one or more variables. Accepts only general
identification codes, e.g. |
stack | (optional) Logical value indicating if tables from different datasets should be stacked on a
single table for output. Requires |
missing | (optional) List with named sub-arguments indicating what should be done with a
layer missing data on sampling depth, |
standardization | (optional) List with named sub-arguments indicating how to perform data standardization.
|
harmonization | (optional) List with named sub-arguments indicating if and how to perform data harmonization.
|
progress | (optional) Logical value indicating if a download progress bar should be displayed. |
verbose | (optional) Logical value indicating if informative messages should be displayed. Generally useful to identify datasets with inconsistent data. Please report to febr-forum@googlegroups.com if you find any issue. |
febr.repo | (optional) Defaults to the remote file directory of the Federal University of Technology - Paraná at https://cloud.utfpr.edu.br/index.php/s/Df6dhfzYJ1DDeso. Alternatively, a local directory path can be provided if the user has a local copy of the data repository. |
A list of data frames or a data frame with data on the chosen variable(s) of the chosen dataset(s).
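Because the return value can be either a single data frame or a list of data frames, downstream code may want to handle both shapes uniformly. The following is a minimal sketch in base R; as_table_list() is a hypothetical helper that is not part of the febr package, and the two small tables are made-up stand-ins for real downloads.

```r
# Hypothetical helper (not part of febr): wrap a single data frame in a list
# so that both return shapes can be processed the same way.
# The data frame check must come first, because a data frame is itself a list.
as_table_list <- function(res) {
  if (is.data.frame(res)) list(res) else res
}

# Made-up stand-ins for the two possible return shapes
single <- data.frame(dataset_id = "ctb0004", profund_sup = 0, profund_inf = 10)
several <- list(
  single,
  data.frame(dataset_id = "ctb0005", profund_sup = 0, profund_inf = 20)
)

# The same code now works on either shape
rows_single  <- vapply(as_table_list(single), nrow, integer(1))
rows_several <- vapply(as_table_list(several), nrow, integer(1))
```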
Standard identification variables and their content are as follows:
dataset_id. Identification of the dataset in FEBR to which an observation belongs.
observacao_id. Identification code of an observation in a dataset.
camada_id. Sequential layer number, from top to bottom.
camada_nome. Layer designation according to some standard description guide.
amostra_id. Laboratory number of a sample.
profund_sup. Upper boundary of a layer (cm).
profund_inf. Lower boundary of a layer (cm).
Further details about the content of the standard identification variables can be found in https://docs.google.com/document/d/1Bqo8HtitZv11TXzTviVq2bI5dE6_t_fJt0HE-l3IMqM (in Portuguese).
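To illustrate how the standard identification variables fit together, here is a minimal sketch in base R. The table is a toy example with made-up values, not actual FEBR data, and the thickness column is a hypothetical name introduced only for this illustration.

```r
# Toy layer table with made-up values, using the standard identification
# variables listed above
toy <- data.frame(
  dataset_id    = "ctb0003",
  observacao_id = c("P1", "P1", "P2"),
  camada_id     = c(1L, 2L, 1L),
  camada_nome   = c("A", "Bt", "A"),
  amostra_id    = c("101", "102", "103"),
  profund_sup   = c(0, 20, 0),
  profund_inf   = c(20, 55, 15),
  stringsAsFactors = FALSE
)

# Layer thickness (cm) from the lower and upper boundaries
# ('thickness' is a hypothetical column name)
toy$thickness <- toy$profund_inf - toy$profund_sup

# Maximum sampled depth (cm) of each observation
max_depth <- tapply(toy$profund_inf, toy$observacao_id, max)
```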
Data harmonization consists of converting the values of a variable determined using some method B so that they are (approximately) equivalent to the values that would have been obtained if the standard method A had been used instead. For example, converting carbon content values obtained using a wet digestion method to the standard dry combustion method is data harmonization.
A heuristic data harmonization procedure is implemented in the febr package. It consists of grouping variables based on a chosen number of levels of their identification code. For example, consider a variable with an identification code composed of four levels, aaa_bbb_ccc_ddd, where aaa is the first level and ddd is the fourth level. Now consider a related variable, aaa_bbb_eee_fff. If the harmonization is to consider all four coding levels (level = 4), then these two variables will remain coded as separate variables. But if level = 2, then both variables will be re-coded as aaa_bbb, thus becoming the same variable.
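The re-coding step described above can be sketched in base R as follows. This is only an illustration of the idea, assuming that levels are separated by underscores; truncate_code() is a hypothetical helper and not the function used internally by the febr package.

```r
# Hypothetical helper (not part of febr): keep only the first `level`
# underscore-separated levels of each identification code.
truncate_code <- function(code, level = 2) {
  parts <- strsplit(code, "_", fixed = TRUE)
  vapply(
    parts,
    function(p) paste(p[seq_len(min(level, length(p)))], collapse = "_"),
    character(1)
  )
}

codes <- c("aaa_bbb_ccc_ddd", "aaa_bbb_eee_fff")
four_levels <- truncate_code(codes, level = 4)  # codes remain distinct
two_levels  <- truncate_code(codes, level = 2)  # both become "aaa_bbb"
```

Variables that share the same truncated code would then be treated as one variable.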
Check the new core data download function readFEBR().
Alessandro Samuel-Rosa alessandrosamuelrosa@gmail.com
res <- layer(data.set = "ctb0003")

if (interactive()) {
  # Download two data sets and standardize units
  res <- layer(
    data.set = paste("ctb000", 4:5, sep = ""),
    variable = "carbono", stack = TRUE,
    standardization = list(units = TRUE))

  # Try to download a data set that is not available yet
  res <- layer(data.set = "ctb0020")

  # Try to download a non existing data set
  # res <- observation(data.set = "ctb0000")

  # Try to read all files from local directory
  febr.repo <- "~/ownCloud/febr-repo/publico"
  # Use if/else here: ifelse() fails when asked to return NULL
  febr.repo <- if (dir.exists(febr.repo)) febr.repo else NULL
  res <- layer(data.set = "all", febr.repo = febr.repo)
}