Download data from the 'observation' ("observacao") table of one or more datasets published in the Free Brazilian Repository for Open Soil Data (FEBR), https://www.pedometria.org/febr/. This table includes data such as latitude, longitude, date of observation, underlying geology, land use and vegetation, local topography, soil classification, and much more.
observation(
  data.set,
  variable,
  stack = FALSE,
  missing = list(coord = "keep", time = "keep", data = "keep"),
  standardization = list(crs = NULL, time.format = NULL, units = FALSE, round = FALSE),
  harmonization = list(harmonize = FALSE, level = 2),
  progress = TRUE,
  verbose = TRUE,
  febr.repo = NULL
)
Argument | Description |
---|---|
data.set | Character vector indicating the identification code of one or more data sets. Use |
variable | (optional) Character vector indicating one or more variables. Accepts only general identification codes, e.g. |
stack | (optional) Logical value indicating if tables from different datasets should be stacked on a single table for output. Requires |
missing | (optional) List with named sub-arguments indicating what should be done with an observation missing spatial coordinates, |
standardization | (optional) List with named sub-arguments indicating how to perform data standardization. |
harmonization | (optional) List with named sub-arguments indicating if and how to perform data harmonization. |
progress | (optional) Logical value indicating if a download progress bar should be displayed. |
verbose | (optional) Logical value indicating if informative messages should be displayed. Generally useful to identify datasets with inconsistent data. Please report to febr-forum@googlegroups.com if you find any issue. |
febr.repo | (optional) Defaults to the remote file directory of the Federal University of Technology - Paraná at https://cloud.utfpr.edu.br/index.php/s/Df6dhfzYJ1DDeso. Alternatively, a local directory path can be provided if the user has a local copy of the data repository. |
A list of data frames or a data frame with data on the chosen variable(s) of the chosen dataset(s).
Standard identification variables and their content are as follows:
dataset_id. Identification code of the dataset in the FEBR to which an observation belongs.
observacao_id. Identification code of an observation in a dataset.
sisb_id. Identification code of an observation in the Brazilian Soil Information System maintained by the Brazilian Agricultural Research Corporation (EMBRAPA).
ibge_id. Identification code of an observation in the database of the Brazilian Institute of Geography and Statistics (IBGE).
observacao_data. Date (dd-mm-yyyy) on which an observation was made.
coord_sistema. EPSG code of the coordinate reference system.
coord_x. Longitude (deg) or easting (m).
coord_y. Latitude (deg) or northing (m).
coord_precisao. Precision with which x- and y-coordinates were determined (m).
coord_fonte. Source of the x- and y-coordinates.
pais_id. Country code (ISO 3166-1 alpha-2).
estado_id. Code of the Brazilian federative unit where an observation was made.
municipio_id. Name of the Brazilian municipality where an observation was made.
amostra_tipo. Type of sample taken.
amostra_quanti. Number of samples taken.
amostra_area. Sampling area.
Further details about the content of the standard identification variables can be found at https://docs.google.com/document/d/1Bqo8HtitZv11TXzTviVq2bI5dE6_t_fJt0HE-l3IMqM (in Portuguese).
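As a minimal sketch of how some of these variables might be handled after download, the snippet below builds a small toy table, parses observacao_data using the dd-mm-yyyy format, and keeps only the observations that have both coordinates. All values in the toy table are invented for illustration and do not come from any FEBR dataset; the object names obs and obs_xy are likewise assumptions.

```r
# Toy table mimicking a few standard identification variables.
# All values are invented for illustration only.
obs <- data.frame(
  dataset_id = c("ctb0000", "ctb0000", "ctb0000"),
  observacao_id = c("P1", "P2", "P3"),
  observacao_data = c("15-03-2009", "02-11-2010", NA),
  coord_sistema = c("EPSG:4674", "EPSG:4674", NA),
  coord_x = c(-53.80, -53.72, NA),
  coord_y = c(-29.70, -29.65, NA),
  stringsAsFactors = FALSE
)

# Parse dates written as dd-mm-yyyy into Date objects
obs$observacao_data <- as.Date(obs$observacao_data, format = "%d-%m-%Y")

# Keep only observations that have both x- and y-coordinates
has_coords <- !is.na(obs$coord_x) & !is.na(obs$coord_y)
obs_xy <- obs[has_coords, ]

# Year of observation, e.g. for grouping or filtering by time
obs_xy$year <- as.integer(format(obs_xy$observacao_data, "%Y"))
```

Filtering on coord_x and coord_y by hand, as done here, is only one way to deal with incomplete records; it is shown to make the role of each column concrete.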
Data harmonization consists of converting the values of a variable determined using some method B so that they are (approximately) equivalent to the values that would have been obtained if the standard method A had been used instead. For example, converting carbon content values obtained using a wet digestion method to the standard dry combustion method is data harmonization.
A heuristic data harmonization procedure is implemented in the febr package. It consists of grouping variables based on a chosen number of levels of their identification code. For example, consider a variable with an identification code composed of four levels, aaa_bbb_ccc_ddd, where aaa is the first level and ddd is the fourth level. Now consider a related variable, aaa_bbb_eee_fff. If the harmonization is to consider all four coding levels (level = 4), then these two variables will remain coded as separate variables. But if level = 2, then both variables will be re-coded as aaa_bbb, thus becoming the same variable.
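The grouping idea can be sketched in a few lines of base R. This is only an illustration of truncating identification codes to a chosen number of levels, not the package's actual implementation; truncate_code is a hypothetical helper that does not exist in febr, and the underscore separator is assumed from the example codes above.

```r
# Truncate a variable identification code to its first `level` levels.
# `truncate_code` is a hypothetical helper, not a function of the febr package.
truncate_code <- function(code, level = 2, sep = "_") {
  parts <- strsplit(code, split = sep, fixed = TRUE)
  vapply(
    parts,
    function(p) paste(p[seq_len(min(level, length(p)))], collapse = sep),
    character(1)
  )
}

codes <- c("aaa_bbb_ccc_ddd", "aaa_bbb_eee_fff")

truncate_code(codes, level = 4)  # the two codes stay distinct
truncate_code(codes, level = 2)  # both codes collapse to "aaa_bbb"
```

Variables that share a truncated code would then be treated as one variable, which is why a lower level groups more aggressively than a higher one.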
Check the new core data download function readFEBR().
Alessandro Samuel-Rosa alessandrosamuelrosa@gmail.com
res <- observation(data.set = "ctb0013")

if (interactive()) {
  # Download two data sets and standardize CRS
  res <- observation(
    data.set = paste("ctb000", 4:5, sep = ""),
    variable = "taxon",
    standardization = list(crs = "EPSG:4674"))

  # Try to download a data set that is not available yet
  res <- observation(data.set = "ctb0020")

  # Try to download a non-existent data set
  # res <- observation(data.set = "ctb0000")

  # Try to read all files from a local directory
  febr.repo <- "~/ownCloud/febr-repo/publico"
  # Use if/else here: ifelse() fails when the selected value is NULL
  febr.repo <- if (dir.exists(febr.repo)) febr.repo else NULL
  res <- observation(data.set = "all", febr.repo = febr.repo)
}