Package: bdc 1.1.5

Bruno Ribeiro

bdc: Biodiversity Data Cleaning

bdc brings together several aspects of biodiversity data cleaning in one place. It is organized in thematic modules related to different biodiversity dimensions: 1) Merge datasets: standardization and integration of different datasets; 2) Pre-filter: flagging and removal of invalid or non-interpretable information, followed by data amendments; 3) Taxonomy: cleaning, parsing, and harmonization of scientific names from several taxonomic groups against locally stored taxonomic databases, using exact and partial matching algorithms; 4) Space: flagging of erroneous, suspect, and low-precision geographic coordinates; and 5) Time: flagging and, whenever possible, correction of inconsistent collection dates. In addition, it contains features to visualize, document, and report data quality, which is essential for making data quality assessment transparent and reproducible. The reference for the methodology is Ribeiro et al. (2022) <doi:10.1111/2041-210X.13868>.
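The flag-then-filter idea behind the modules can be illustrated with a minimal base R sketch. This is not the package's implementation: the column names `.name_ok` and `.coord_ok` are hypothetical, the data are invented, and the convention that TRUE means "passes the test" is an assumption made for this example.

```r
# Toy occurrence records; column names follow Darwin Core, values are made up.
occ <- data.frame(
  scientificName   = c("Puma concolor", "", "Panthera onca"),
  decimalLatitude  = c(-15.6, -95.2, 10.4),
  decimalLongitude = c(-47.9, -60.0, -200.1),
  stringsAsFactors = FALSE
)

# Hypothetical flag columns: TRUE = record passes the test, FALSE = flagged.
occ$.name_ok <- !is.na(occ$scientificName) & nzchar(trimws(occ$scientificName))
occ$.coord_ok <- !is.na(occ$decimalLatitude) & !is.na(occ$decimalLongitude) &
  abs(occ$decimalLatitude) <= 90 & abs(occ$decimalLongitude) <= 180

# A record is kept only if it passes every test.
occ$.summary <- occ$.name_ok & occ$.coord_ok
clean <- occ[occ$.summary, ]
print(clean)
```

Keeping the flags as separate columns, rather than dropping records immediately, lets each test be reported on its own before any record is removed.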

Authors: Bruno Ribeiro [aut, cre], Santiago Velazco [aut], Karlo Guidoni-Martins [aut], Geiziane Tessarolo [aut], Lucas Jardim [aut], Steven Bachman [ctb], Rafael Loyola [ctb]

bdc_1.1.5.tar.gz
bdc_1.1.5.zip (r-4.5), bdc_1.1.5.zip (r-4.4), bdc_1.1.5.zip (r-4.3)
bdc_1.1.5.tgz (r-4.4-any), bdc_1.1.5.tgz (r-4.3-any)
bdc_1.1.5.tar.gz (r-4.5-noble), bdc_1.1.5.tar.gz (r-4.4-noble)
bdc_1.1.5.tgz (r-4.4-emscripten), bdc_1.1.5.tgz (r-4.3-emscripten)
bdc.pdf | bdc.html
bdc/json (API)
NEWS

# Install 'bdc' in R:
install.packages('bdc', repos = c('https://bdc-proj.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/brunobrr/bdc/issues

bdc, biodiversity-data, workflow

8.04 score 23 stars 63 scripts 397 downloads 23 exports 123 dependencies

Last updated 13 hours ago from: 97d51175e9 (on master). Checks: OK: 1, ERROR: 6. Indexed: no.

Target | Result | Date
Doc / Vignettes | OK | Oct 30 2024
R-4.5-win | ERROR | Oct 30 2024
R-4.5-linux | ERROR | Oct 30 2024
R-4.4-win | ERROR | Oct 30 2024
R-4.4-mac | ERROR | Oct 30 2024
R-4.3-win | ERROR | Oct 30 2024
R-4.3-mac | ERROR | Oct 30 2024

Exports: %>%, bdc_basisOfRecords_notStandard, bdc_clean_names, bdc_coordinates_country_inconsistent, bdc_coordinates_empty, bdc_coordinates_from_locality, bdc_coordinates_outOfRange, bdc_coordinates_precision, bdc_coordinates_transposed, bdc_country_from_coordinates, bdc_country_standardized, bdc_create_figures, bdc_create_report, bdc_eventDate_empty, bdc_filter_out_flags, bdc_filter_out_names, bdc_query_names_taxadb, bdc_quickmap, bdc_scientificName_empty, bdc_standardize_datasets, bdc_summary_col, bdc_year_from_eventDate, bdc_year_outOfRange

Dependencies: askpass, base64enc, BH, bit, bit64, blob, bslib, cachem, class, classInt, cli, clipr, codetools, colorspace, contentid, CoordinateCleaner, cpp11, crayon, crosstalk, crul, curl, data.table, DBI, dbplyr, digest, doParallel, dplyr, DT, duckdb, e1071, evaluate, fansi, farver, fastmap, fontawesome, foreach, fs, generics, geosphere, ggplot2, glue, gtable, here, highr, hms, htmltools, htmlwidgets, httpcode, httpuv, httr, isoband, iterators, jquerylib, jsonlite, KernSmooth, knitr, labeling, later, lattice, lazyeval, lifecycle, magrittr, MASS, Matrix, memoise, mgcv, mime, munsell, nlme, oai, openssl, pillar, pkgconfig, plyr, prettyunits, progress, promises, proxy, purrr, qs, R6, RApiSerialize, rappdirs, RColorBrewer, Rcpp, RcppParallel, readr, rgbif, rgnparser, rlang, rmarkdown, rnaturalearth, rprojroot, s2, sass, scales, sf, sp, stringdist, stringfish, stringi, stringr, sys, taxadb, terra, tibble, tidyr, tidyselect, tinytex, triebeard, tzdb, units, urltools, utf8, vctrs, viridisLite, vroom, whisker, withr, wk, xfun, xml2, yaml

Pre-filter

Rendered from prefilter.Rmd using knitr::rmarkdown on Oct 30 2024.

Last update: 2023-09-08
Started: 2021-03-18

Space

Rendered from space.Rmd using knitr::rmarkdown on Oct 30 2024.

Last update: 2022-04-21
Started: 2021-03-23

Standardization and integration of different datasets

Rendered from integrate_datasets.Rmd using knitr::rmarkdown on Oct 30 2024.

Last update: 2022-05-24
Started: 2021-03-23

Taxonomy

Rendered from taxonomy.Rmd using knitr::rmarkdown on Oct 30 2024.

Last update: 2022-05-24
Started: 2021-03-23

Time

Rendered from time.Rmd using knitr::rmarkdown on Oct 30 2024.

Last update: 2022-04-21
Started: 2021-03-23

Readme and manuals

Help Manual

Help page | Topics
Identify records from doubtful source (e.g., 'fossil', 'MachineObservation') | bdc_basisOfRecords_notStandard
Clean and parse scientific names | bdc_clean_names
Identify records within a reference country | bdc_coordinates_country_inconsistent
Identify records with empty geographic coordinates | bdc_coordinates_empty
Identify records lacking or with invalid coordinates but containing locality information | bdc_coordinates_from_locality
Identify records with out-of-range geographic coordinates | bdc_coordinates_outOfRange
Flag low-precise geographic coordinates | bdc_coordinates_precision
Identify transposed geographic coordinates | bdc_coordinates_transposed
Get country names from coordinates | bdc_country_from_coordinates
Standardizes country names and gets country code | bdc_country_standardized
Create figures reporting the results of the bdc package | bdc_create_figures
Create a report summarizing the results of data quality tests | bdc_create_report
Identify records with empty event date | bdc_eventDate_empty
Remove columns with the results of data quality tests | bdc_filter_out_flags
Filter out records according to their taxonomic status | bdc_filter_out_names
Harmonizing taxon names against local stored taxonomic databases | bdc_query_names_taxadb
Create a map of points using ggplot2 | bdc_quickmap
Identify records with empty scientific names | bdc_scientificName_empty
Standardize datasets columns based on metadata | bdc_standardize_datasets
Create or update the column summarizing the results of data quality tests | bdc_summary_col
Extract year from eventDate | bdc_year_from_eventDate
Identify records with year out-of-range | bdc_year_outOfRange
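
As a rough usage sketch, several of the functions listed above can be chained so that each one adds its own flag column before a summary column is built. The argument names (`data`, `sci_names`, `lat`, `lon`, `eventDate`, `year_threshold`) reflect my reading of the help pages and should be checked against the manual (for example `?bdc_year_outOfRange`). The data are invented, and the code is skipped when 'bdc' is not installed.

```r
# Toy occurrence records; all values are invented for illustration.
occ <- data.frame(
  scientificName   = c("Puma concolor", "Panthera onca", "Tapirus terrestris"),
  decimalLatitude  = c(-15.6, -95.2, -3.1),
  decimalLongitude = c(-47.9, -60.0, -60.0),
  eventDate        = c("2015-06-01", "1999-03-12", "1850-01-01"),
  stringsAsFactors = FALSE
)

if (requireNamespace("bdc", quietly = TRUE)) {
  # Each call is expected to append one flag column to the data frame.
  checked <- bdc::bdc_scientificName_empty(
    data = occ,
    sci_names = "scientificName"
  )
  checked <- bdc::bdc_coordinates_outOfRange(
    data = checked,
    lat = "decimalLatitude",
    lon = "decimalLongitude"
  )
  checked <- bdc::bdc_year_outOfRange(
    data = checked,
    eventDate = "eventDate",
    year_threshold = 1900
  )
  # Combine the individual flags into a single summary column.
  checked <- bdc::bdc_summary_col(data = checked)
  print(checked)
} else {
  message("Package 'bdc' is not installed; skipping the example.")
}
```

Inspecting `names(checked)` after the run shows which flag columns were added, which is a quick way to confirm the behaviour on the installed version.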