Download data from the 'observation' ("observacao") table of one or more datasets published in the Free Brazilian Repository for Open Soil Data (FEBR), https://www.pedometria.org/febr/. This table includes data such as latitude, longitude, date of observation, underlying geology, land use and vegetation, local topography, soil classification, and much more.
observation(
  data.set,
  variable,
  stack = FALSE,
  missing = list(coord = "keep", time = "keep", data = "keep"),
  standardization = list(crs = NULL, time.format = NULL, units = FALSE, round = FALSE),
  harmonization = list(harmonize = FALSE, level = 2),
  progress = TRUE,
  verbose = TRUE,
  febr.repo = NULL
)
| Argument | Description |
|---|---|
| data.set | Character vector indicating the identification code of one or more data sets. Use |
| variable | (optional) Character vector indicating one or more variables. Accepts only general identification codes, e.g. |
| stack | (optional) Logical value indicating if tables from different datasets should be stacked on a single table for output. Requires |
| missing | (optional) List with named sub-arguments indicating what should be done with an observation missing spatial coordinates, |
| standardization | (optional) List with named sub-arguments indicating how to perform data standardization. |
| harmonization | (optional) List with named sub-arguments indicating if and how to perform data harmonization. |
| progress | (optional) Logical value indicating if a download progress bar should be displayed. |
| verbose | (optional) Logical value indicating if informative messages should be displayed. Generally useful to identify datasets with inconsistent data. Please report to febr-forum@googlegroups.com if you find any issue. |
| febr.repo | (optional) Defaults to the remote file directory of the Federal University of Technology - Paraná at https://cloud.utfpr.edu.br/index.php/s/Df6dhfzYJ1DDeso. Alternatively, the path to a local directory can be provided if the user has a local copy of the data repository. |
A list of data frames or a data frame with data on the chosen variable(s) of the chosen dataset(s).
Standard identification variables and their content are as follows:
dataset_id. Identification code of the dataset in the FEBR to which an observation
belongs.
observacao_id. Identification code of an observation in a dataset.
sisb_id. Identification code of an observation in the Brazilian Soil Information System
maintained by the Brazilian Agricultural Research Corporation (EMBRAPA).
ibge_id. Identification code of an observation in the database of the Brazilian Institute
of Geography and Statistics (IBGE).
observacao_data. Date (dd-mm-yyyy) on which an observation was made.
coord_sistema. EPSG code of the coordinate reference system.
coord_x. Longitude (deg) or easting (m).
coord_y. Latitude (deg) or northing (m).
coord_precisao. Precision with which x- and y-coordinates were determined (m).
coord_fonte. Source of the x- and y-coordinates.
pais_id. Country code (ISO 3166-1 alpha-2).
estado_id. Code of the Brazilian federative unit where an observation was made.
municipio_id. Name of the Brazilian municipality where an observation was made.
amostra_tipo. Type of sample taken.
amostra_quanti. Number of samples taken.
amostra_area. Sampling area.
Further details about the content of the standard identification variables can be found at https://docs.google.com/document/d/1Bqo8HtitZv11TXzTviVq2bI5dE6_t_fJt0HE-l3IMqM (in Portuguese).
Data harmonization consists of converting the values of a variable determined using some method B so that they are (approximately) equivalent to the values that would have been obtained if the standard method A had been used instead. For example, converting carbon content values obtained using a wet digestion method to the standard dry combustion method is data harmonization.
A heuristic data harmonization procedure is implemented in the febr package. It consists of
grouping variables based on a chosen number of levels of their identification code. For example,
consider a variable with an identification code composed of four levels, aaa_bbb_ccc_ddd, where
aaa is the first level and ddd is the fourth level. Now consider a related variable,
aaa_bbb_eee_fff. If the harmonization is to consider all four coding levels (level = 4),
then these two variables will remain coded as separate variables. But if level = 2, then both
variables will be re-coded as aaa_bbb, thus becoming the same variable.
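
The level-based grouping described above can be illustrated with a minimal sketch. The helper `truncate_code()` is hypothetical and is not part of the febr package. It only assumes that the levels of an identification code are separated by underscores, as in the example codes above, and it may differ from what the package does internally.

```r
# Hypothetical helper (not a febr function): keep only the first `level`
# underscore-separated levels of each identification code.
truncate_code <- function(code, level = 2) {
  parts <- strsplit(code, "_", fixed = TRUE)
  vapply(
    parts,
    function(p) paste(p[seq_len(min(level, length(p)))], collapse = "_"),
    character(1)
  )
}

codes <- c("aaa_bbb_ccc_ddd", "aaa_bbb_eee_fff")

# All four levels kept: the two variables remain distinct
truncate_code(codes, level = 4)

# Only two levels kept: both codes collapse to "aaa_bbb"
truncate_code(codes, level = 2)
```

With `level = 2`, columns that end up sharing a code would be treated as the same variable, which is why the procedure is heuristic: it relies on the coding scheme to place comparable methods under a common prefix.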
Check the new core data download function readFEBR().
Alessandro Samuel-Rosa alessandrosamuelrosa@gmail.com
res <- observation(data.set = "ctb0013")
#>   |======================================================================| 100%

if (interactive()) {
  # Download two data sets and standardize CRS
  res <- observation(
    data.set = paste("ctb000", 4:5, sep = ""),
    variable = "taxon",
    standardization = list(crs = "EPSG:4674"))

  # Try to download a data set that is not available yet
  res <- observation(data.set = "ctb0020")

  # Try to download a non-existing data set
  # res <- observation(data.set = "ctb0000")

  # Try to read all files from a local directory.
  # `if ... else` is used because ifelse() cannot return NULL.
  febr.repo <- "~/ownCloud/febr-repo/publico"
  febr.repo <- if (dir.exists(febr.repo)) febr.repo else NULL
  res <- observation(data.set = "all", febr.repo = febr.repo)
}