labtaxa provides reproducible access to the USDA-NRCS Kellogg Soil Survey Laboratory (KSSL) database snapshots in R.
The package automatically downloads, caches, and loads the complete Lab Data Mart database (~65,000 soil profiles with detailed laboratory analyses) for soil science research and education.
To get up and running quickly you can use the Docker container. The
labtaxa container is based on rocker/rstudio with a pinned R
version (see Dockerfile for current version) for reproducibility. In
addition to the standard RStudio tools, the container has:
- Cached Lab Data Mart GeoPackage - Complete KSSL database (lab and spatial data)
- Morphologic database - Derived from NASIS pedon field descriptions
- Pre-processed data - SoilProfileCollection objects cached as RDS files
- Curated packages - All dependencies for soil science analysis
- RStudio Server - Full IDE accessible via web browser
The data and packages are pinned to exact versions (not floating), so a dated Docker tag always gives the same environment and data. Only the latest tag moves as new snapshots are published.
From Docker Hub:
docker pull brownag/labtaxa:latest
Or from GitHub:
docker pull ghcr.io/brownag/labtaxa:latest
The easiest way to run the container is with Docker Compose. A
docker-compose.yml file is included in this repository:
# Clone the repository
git clone https://github.com/brownag/labtaxa.git
cd labtaxa
# Start the container (downloads image on first run)
docker-compose up -d
# Stop the container
docker-compose down
With the container running, open your web browser and navigate to http://localhost:8787. The
default username is rstudio and the default password is soilscience.
Features:
- Persistent projects/ directory for your work
- Automatic volume management for package cache
- Pre-configured resource limits (16GB memory, 4 CPUs)
- Health checks to monitor container status
Alternatively, you can run the container directly:
docker run -d -p 8787:8787 -e PASSWORD=mypassword -v ~/Documents:/home/rstudio/Documents -e ROOT=TRUE brownag/labtaxa
Then open your web browser and navigate to http://localhost:8787. The
username is rstudio and the password is the value passed via PASSWORD (mypassword in this example).
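The direct invocation can also be wrapped so that the image tag is pinned and the password never appears on the command line. The following is only a sketch: the helper name labtaxa_run_cmd is hypothetical and not part of labtaxa. It relies on Docker's behavior of copying a variable from the calling environment when -e is given a name without a value, and it prints the command rather than running it so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of labtaxa): print a docker run command
# that uses a pinned image tag and takes the password from the environment.
labtaxa_run_cmd() {
  tag="${1:-latest}"    # image tag, e.g. 2026.02
  port="${2:-8787}"     # host port mapped to RStudio Server
  if [ -z "${PASSWORD:-}" ]; then
    echo "error: export PASSWORD before running" >&2
    return 1
  fi
  # "-e PASSWORD" (no value) tells docker to pass the caller's PASSWORD
  # through, so the secret is not embedded in the printed command.
  printf 'docker run -d -p %s:8787 -e PASSWORD -v %s/Documents:/home/rstudio/Documents brownag/labtaxa:%s\n' \
    "$port" "$HOME" "$tag"
}
```

After exporting PASSWORD, calling labtaxa_run_cmd 2026.02 prints a command equivalent to the one above, but with a dated tag in place of the implicit latest.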
You can install the development version of {labtaxa} from GitHub:
if (!require("labtaxa"))
  remotes::install_github("brownag/labtaxa")
Download (and cache) the latest Lab Data Mart SQLite snapshot from https://ncsslabdatamart.sc.egov.usda.gov/ like so:
library(labtaxa)
ldm <- get_LDM_snapshot()
#> Loading required namespace: RSQLite
#> Using cached database: /home/andrew/.local/share/R/labtaxa/ncss_labdata.gpkg
#> Patching Lab Data Mart database...
#> Patching morphologic database...
#> Loading Lab Data Mart data...
#> converting profile IDs from integer to character
#> Warning: Horizon top depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> Warning: Horizon bottom depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> Warning: One or more horizon bottom depths are shallower than top depth. Check
#> depth logic with aqp::checkHzDepthLogic()
#> Loaded 65403 soil profiles
#> Loading morphologic data...
#> NOTE: some siteobsiid have surface fragment cover >= 100%
#> NOTE: some phiid have multiple lab sample IDs (labsampnum)
#> NOTE: some phiid have rock fragment volume >= 100%
#> replacing missing lower horizon depths with top depth + 1cm ... [4948 horizons]
#> top/bottom depths equal, adding 1cm to bottom depth ... [5317 horizons]
#> Warning: Horizon top depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> Warning: Horizon bottom depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> Warning: One or more horizon bottom depths are shallower than top depth. Check
#> depth logic with aqp::checkHzDepthLogic()
#> -> QC: duplicate pedons:
#> Use `get('dup.pedon.ids', envir=get_soilDB_env())` for pedon record IDs (peiid)
#> -> QC: pedon horizons with rock fragment volume >=100%:
#> Use `get('rock.fragment.volume.gt100.phiid', envir=get_soilDB_env())` for pedon horizon record IDs (phiid)
#> -> QC: surface fragment records from multiple site observations:
#> Use `get('multisiteobs.surface', envir=get_soilDB_env())` for site (siteiid) and site observation (siteobsiid)
#> -> QC: pedons with surface fragment cover >=100%:
#> Use `get('surface.fragment.cover.gt100.siteobsiid', envir=get_soilDB_env())` for site observation record IDs (siteobsiid)
#> -> QC: horizons with multiple lab samples:
#> Use `get('multiple.labsampnum.per.phiid', envir=get_soilDB_env())` for pedon horizon record IDs (phiid)
#> Warning in aqp::`hzidname<-`(`*tmp*`, value = "phiid"): horizon ID name (phiid)
#> not unique. unique ID not changed.
#> -> QC: pedons missing bottom horizon depths:
#> Use `get('missing.bottom.depths', envir=get_soilDB_env())` for User Pedon IDs (upedonid)
#> -> QC: equal horizon top and bottom depths:
#> Use `get('top.bottom.equal', envir=get_soilDB_env())` for User Pedon IDs (upedonid)
#> Caching results...
Downloaded and derived files are cached by cache_labtaxa() in the
platform-specific directory returned by ldm_data_dir().
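To check from a terminal whether a snapshot is already cached, something like the sketch below may help. It assumes the cache sits in the per-user R data directory on Linux, which is inferred only from the path shown in the output above (~/.local/share/R/labtaxa); the function names are hypothetical, and ldm_data_dir() in R remains the authoritative source for the location on any platform.

```shell
#!/bin/sh
# Assumption: on Linux the cache directory is $XDG_DATA_HOME/R/labtaxa,
# falling back to ~/.local/share/R/labtaxa (inferred from the README output).
labtaxa_cache_dir() {
  printf '%s/R/labtaxa\n' "${XDG_DATA_HOME:-$HOME/.local/share}"
}

# Report whether the Lab Data Mart GeoPackage is present in that directory.
labtaxa_cache_status() {
  f="$(labtaxa_cache_dir)/ncss_labdata.gpkg"
  if [ -f "$f" ]; then
    echo "cached: $f"
  else
    echo "missing: $f"
    return 1
  fi
}
```

A non-zero exit status from labtaxa_cache_status means get_LDM_snapshot() has probably not been run yet for this user, or the cache lives somewhere else.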
In the Docker container the snapshot has already been created and cached from the latest data available when the container was built. The cache is refreshed on a monthly schedule, and also whenever the method used to create it changes.
The cached data help you get up and running quickly with analysis of the entire KSSL database using the {aqp} R package toolchain.
The lab data are pre-loaded in a large SoilProfileCollection object (over 65,000 profiles). Within a few seconds of loading the Docker container, you can be filtering and processing the lab data object. Downloading archives of the complete databases can take anywhere from tens of minutes to a couple of hours, depending on your internet connection. A cache update is only required when the absolute most recent data are needed.
The downloaded databases (GeoPackage, SQLite) are queried locally using
the {soilDB} functions fetchLDM() and fetchNASIS(). These functions can
take a couple of minutes to process databases of this size, so the
container build front-loads these more costly processing steps. A query
like this precedes essentially every analysis. {soilDB} provides standard
aggregation methods that produce {aqp} SoilProfileCollection objects, a
convenient data structure for working with the horizon- and site-level
data associated with specific soil profiles.
When you start up {labtaxa} in the Docker container you will have the latest database and the first-step data object (as if you had run the {soilDB} functions) readily available for post-processing to answer specific questions.
# Check the loaded object
length(ldm)
#> [1] 65403
str(ldm)
#> Formal class 'SoilProfileCollection' [package "aqp"] with 8 slots
#> ..@ idcol : chr "pedon_key"
#> ..@ hzidcol : chr "hzID"
#> ..@ depthcols : chr [1:2] "hzn_top" "hzn_bot"
#> ..@ metadata :List of 6
#> .. ..$ aqp_df_class : chr "data.frame"
#> .. ..$ aqp_group_by : chr ""
#> .. ..$ aqp_hzdesgn : chr "hzn_desgn"
#> .. ..$ aqp_hztexcl : chr ""
#> .. ..$ depth_units : chr "cm"
#> .. ..$ stringsAsFactors: logi FALSE
#> ..@ horizons :'data.frame': 403461 obs. of 435 variables:
#> .. ..$ OBJECTID : int [1:403461] 70 71 72 73 74 75 715 716 717 718 ...
#> .. ..$ objectid_1 : int [1:403461] 66 67 68 69 70 71 711 712 713 714 ...
#> .. ..$ layer_key : int [1:403461] 66 67 68 69 70 71 711 712 713 714 ...
#> .. ..$ labsampnum : chr [1:403461] "40A00124" "40A00125" "40A00126" "40A00127" ...
#> .. ..$ project_key : int [1:403461] 1 1 1 1 1 1 1 1 1 1 ...
#> .. ..$ pedon_key : chr [1:403461] "10" "10" "10" "10" ...
#> .. ..$ layer_sequence : int [1:403461] 1 2 3 4 5 6 1 2 3 4 ...
#> .. ..$ layer_type : chr [1:403461] "horizon" "horizon" "horizon" "horizon" ...
#> .. ..$ layer_field_label_1 : chr [1:403461] "2124" "2125" "2126" "2127" ...
#> .. ..$ layer_field_label_2 : chr [1:403461] NA NA NA NA ...
#> .. ..$ layer_field_label_3 : chr [1:403461] NA NA NA NA ...
#> .. ..$ hzn_top : num [1:403461] 0 18 46 71 91 117 0 13 30 48 ...
#> .. ..$ hzn_bot : num [1:403461] 18 46 71 91 117 152 13 30 48 81 ...
#> .. ..$ hzn_desgn_old : chr [1:403461] "A1p" "ACca" "Cca" "C1" ...
#> .. ..$ hzn_desgn : chr [1:403461] "Ap" "ABk" "Bk" "C" ...
#> .. ..$ hzn_discontinuity : int [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ hzn_master : chr [1:403461] "A" "AB" "B" "C" ...
#> .. ..$ hzn_prime : chr [1:403461] NA NA NA NA ...
#> .. ..$ hzn_vert_subdvn : int [1:403461] NA NA NA NA 1 2 NA NA 1 2 ...
#> .. ..$ hzn_desgn_other : chr [1:403461] NA NA NA NA ...
#> .. ..$ non_hzn_desgn : chr [1:403461] NA NA NA NA ...
#> .. ..$ stratified_textures_flag : int [1:403461] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ texture_description : chr [1:403461] NA NA NA NA ...
#> .. ..$ result_source_key : int [1:403461] 66 67 68 69 70 71 711 712 713 714 ...
#> .. ..$ prep_code : chr [1:403461] "S" "S" "S" "S" ...
#> .. ..$ texture_lab : chr [1:403461] "cl" "cl" "sicl" "sicl" ...
#> .. ..$ particle_size_method : chr [1:403461] "3A1a1a" "3A1a1a" "3A1a1a" "3A1a1a" ...
#> .. ..$ clay_total : num [1:403461] 32.1 37.5 37.9 33.9 35.5 31.4 25.3 25.6 28.9 26.7 ...
#> .. ..$ silt_total : num [1:403461] 40.2 41.1 55.8 62.3 59.7 64.5 43.9 32.7 35.1 32.1 ...
#> .. ..$ sand_total : num [1:403461] 27.7 21.4 6.3 3.8 4.8 4.1 30.8 41.7 36 41.2 ...
#> .. ..$ clay_fine : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ clay_caco3 : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ silt_fine : num [1:403461] 21.8 21.1 37.1 43.2 41.2 40.3 24 16 18.8 18 ...
#> .. ..$ silt_coarse : num [1:403461] 18.4 20 18.7 19.1 18.5 24.2 19.9 16.7 16.3 14.1 ...
#> .. ..$ sand_very_fine : num [1:403461] 17.3 17.4 4.3 2.5 1.4 2.2 10.3 11.5 9.3 10 ...
#> .. ..$ sand_fine : num [1:403461] 6 2.5 1 0.7 0.9 1.1 9.5 13.7 12 12.6 ...
#> .. ..$ sand_medium : num [1:403461] 2 0.7 0.4 0.3 0.6 0.5 4.8 7.2 5.9 6.7 ...
#> .. ..$ sand_coarse : num [1:403461] 1.8 0.6 0.4 0.2 1.1 0.3 4.4 6.6 5.5 6.7 ...
#> .. ..$ sand_very_coarse : num [1:403461] 0.6 0.2 0.2 0.1 0.8 0 1.8 2.7 3.3 5.2 ...
#> .. ..$ frag_2_5_mm_wt_pct_lt_75 : num [1:403461] 0 0 0 0 0 0 1 1 1 4 ...
#> .. ..$ frag__2_20_mm_wt_pct_lt_75 : num [1:403461] 0 0 0 0 0 0 1 1 2 6 ...
#> .. ..$ frag_5_20_mm_wt_pct_lt_75 : num [1:403461] 0 0 0 0 0 0 0 0 1 2 ...
#> .. ..$ frag_20_75_mm_wt_pct_lt_75 : num [1:403461] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ total_frag_wt_pct_gt_2_mm_ws : num [1:403461] 0 0 0 0 0 0 1 1 2 6 ...
#> .. ..$ wt_pct_1_tenth_to_75_mm : num [1:403461] 10 4 2 1 3 2 21 30 27 31 ...
#> .. ..$ bulk_density_tenth_bar : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bulk_density_tenth_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ bulk_density_third_bar : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bulk_density_third_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ bulk_density_oven_dry : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bulk_density_oven_dry_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ bulk_density_lt_2_mm_air_dry : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bulk_density_air_dry_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ bd_third_bar_lt2_reconstituted : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bd_thirdbar_reconstituted_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ bulk_den_ovendry_reconstituted : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bulk_density_odreconstituted_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ bulk_density_field_moist : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ bulk_density_field_moist_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ particle_density_less_than_2mm : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ particle_density_lt_2mm_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ particle_density_gt_2_mm : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ particle_density_gt_2mm_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ cole_whole_soil : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ cole_whole_soil_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ le_third_fifteen_lt2_mm : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ le_third_fifteen_lt2_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ le_third_ovendry_lt_2_mm : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ le_third_ovendry_lt_2_mm_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ le_field_moist_to_oben_dry : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ le_fm_to_od_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_0_bar_sieve : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_0_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_6_hundredths : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_6_hund_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_10th_bar : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_10th_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_third_bar : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_thirdbar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_1_bar : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_1_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_2_bar : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_2_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_3_bar_sieve : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_3_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_5_bar_sieve : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_5_bar_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ water_retention_15_bar : num [1:403461] 16.6 15.2 14.3 14 15.9 14.4 18.4 10.8 9.4 9.2 ...
#> .. ..$ water_retention_15_bar_method : chr [1:403461] "3C2a1a" "3C2a1a" "3C2a1a" "3C2a1a" ...
#> .. ..$ water_retention_field_state : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ water_retention_field_state_me : chr [1:403461] NA NA NA NA ...
#> .. ..$ airdry_ovendry_ratio : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ atterberg_liquid_limit : chr [1:403461] NA NA NA NA ...
#> .. ..$ atterberg_liquid_limit_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ atterberg_plasticity_index : chr [1:403461] NA NA NA NA ...
#> .. ..$ plastic_limit : chr [1:403461] NA NA NA NA ...
#> .. ..$ plastic_limit_method : chr [1:403461] NA NA NA NA ...
#> .. ..$ aggregate_stability_05_2_mm : num [1:403461] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ aggregate_stability_05_2_method : chr [1:403461] NA NA NA NA ...
#> .. .. [list output truncated]
#> ..@ site :'data.frame': 65403 obs. of 116 variables:
#> .. ..$ pedon_key : chr [1:65403] "10" "100" "1000" "10000" ...
#> .. ..$ Shape : blob [1:65403]
#> .. .. ..$ : raw [1:29] 47 50 00 01 ...
#> .. .. ..$ : raw [1:29] 47 50 00 01 ...
#> .. .. ..$ : raw [1:29] 47 50 00 01 ...
#> .. .. .. [list output truncated]
#> .. .. ..@ ptype: raw(0)
#> .. ..$ site_key : int [1:65403] 10 100 1000 10000 10001 10002 10003 10004 10005 10006 ...
#> .. ..$ pedlabsampnum : chr [1:65403] "40A0017" "40A0107" "40A1039" "83P0141" ...
#> .. ..$ peiid : int [1:65403] 47360 47278 106853 524594 524595 524596 524597 524598 524599 524600 ...
#> .. ..$ upedonid : chr [1:65403] "1954ND067001" "S1950ND075006" "S1957WA063005" "S1982NY061003" ...
#> .. ..$ labdatadescflag : int [1:65403] 0 1 0 1 1 1 1 1 1 1 ...
#> .. ..$ priority : chr [1:65403] "B" "B" "D" "B" ...
#> .. ..$ priority2 : chr [1:65403] "C" "A" "C" "A" ...
#> .. ..$ samp_name : chr [1:65403] "Bearden" "Hamlet" "COUSE" "Snd" ...
#> .. ..$ samp_class_type : chr [1:65403] "series" NA "series" NA ...
#> .. ..$ samp_classdate : chr [1:65403] "1954-06-02T00:00:00.0Z" "1950-09-19T00:00:00.0Z" "1957-09-11T00:00:00.0Z" "1982-11-01T00:00:00.0Z" ...
#> .. ..$ samp_classification_name : chr [1:65403] "Fine-silty, mixed, superactive, frigid Aeric Calciaquolls" "Fine-loamy, mixed Aquic Haploborolls" NA "Loamy Aquic Udorthent" ...
#> .. ..$ samp_taxorder : chr [1:65403] "mollisols" "mollisols" NA "entisols" ...
#> .. ..$ samp_taxsuborder : chr [1:65403] "aquolls" "borolls" NA "orthents" ...
#> .. ..$ samp_taxgrtgroup : chr [1:65403] "calciaquolls" "haploborolls" NA "udorthents" ...
#> .. ..$ samp_taxsubgrp : chr [1:65403] "aeric calciaquolls" "aquic haploborolls" NA "aquic udorthents" ...
#> .. ..$ samp_taxpartsize : chr [1:65403] "fine-silty" "fine-loamy" NA "loamy" ...
#> .. ..$ samp_taxpartsizemod : chr [1:65403] NA NA NA NA ...
#> .. ..$ samp_taxceactcl : chr [1:65403] "superactive" NA NA NA ...
#> .. ..$ samp_taxreaction : chr [1:65403] NA NA NA NA ...
#> .. ..$ samp_taxtempcl : chr [1:65403] "frigid" NA NA NA ...
#> .. ..$ samp_taxmoistscl : chr [1:65403] NA NA NA "aquic" ...
#> .. ..$ samp_taxtempregime : chr [1:65403] NA NA NA NA ...
#> .. ..$ samp_taxminalogy : chr [1:65403] "mixed" "mixed" NA NA ...
#> .. ..$ samp_taxother : chr [1:65403] NA NA NA NA ...
#> .. ..$ samp_osdtypelocflag : int [1:65403] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ corr_name : chr [1:65403] "Bearden" "Hamlet" NA "North Meadow" ...
#> .. ..$ corr_class_type : chr [1:65403] "series" "series" NA "taxadjunct" ...
#> .. ..$ corr_classdate : chr [1:65403] "1977-07-01T00:00:00.0Z" "2017-02-07T00:00:00.0Z" NA "2012-03-26T17:07:54.0Z" ...
#> .. ..$ corr_classification_name : chr [1:65403] "Fine-silty, mixed, superactive, frigid Aeric Calciaquolls" "Fine-loamy, mixed, superactive, frigid Aquic Hapludolls" NA "Coarse-loamy, mixed, superactive, nonacid, mesic Aquic Udorthents" ...
#> .. ..$ corr_taxorder : chr [1:65403] "mollisols" "mollisols" NA "entisols" ...
#> .. ..$ corr_taxsuborder : chr [1:65403] "aquolls" "udolls" NA "orthents" ...
#> .. ..$ corr_taxgrtgroup : chr [1:65403] "calciaquolls" "hapludolls" NA "udorthents" ...
#> .. ..$ corr_taxsubgrp : chr [1:65403] "aeric calciaquolls" "aquic hapludolls" NA "aquic udorthents" ...
#> .. ..$ corr_taxpartsize : chr [1:65403] "fine-silty" "fine-loamy" NA "coarse-loamy" ...
#> .. ..$ corr_taxpartsizemod : chr [1:65403] NA NA NA NA ...
#> .. ..$ corr_taxceactcl : chr [1:65403] "superactive" "superactive" NA "superactive" ...
#> .. ..$ corr_taxreaction : chr [1:65403] NA NA NA "nonacid" ...
#> .. ..$ corr_taxtempcl : chr [1:65403] "frigid" "frigid" NA "mesic" ...
#> .. ..$ corr_taxmoistscl : chr [1:65403] NA "aquic" NA "aquic" ...
#> .. ..$ corr_taxtempregime : chr [1:65403] NA "frigid" NA "mesic" ...
#> .. ..$ corr_taxminalogy : chr [1:65403] "mixed" "mixed" NA "mixed" ...
#> .. ..$ corr_taxother : chr [1:65403] NA NA NA NA ...
#> .. ..$ corr_osdtypelocflag : int [1:65403] 1 0 NA 0 0 0 0 0 0 0 ...
#> .. ..$ SSL_name : chr [1:65403] NA NA NA "unnamed" ...
#> .. ..$ SSL_class_type : chr [1:65403] NA NA NA NA ...
#> .. ..$ SSL_classdate : chr [1:65403] NA NA NA "1993-02-03T00:00:00.0Z" ...
#> .. ..$ SSL_classification_name : chr [1:65403] NA NA NA "Coarse-loamy, mixed, mesic Aquic Udorthent" ...
#> .. ..$ SSL_taxorder : chr [1:65403] NA NA NA "entisols" ...
#> .. ..$ SSL_taxsuborder : chr [1:65403] NA NA NA "orthents" ...
#> .. ..$ SSL_taxgrtgroup : chr [1:65403] NA NA NA "udorthents" ...
#> .. ..$ SSL_taxsubgrp : chr [1:65403] NA NA NA "aquic udorthents" ...
#> .. ..$ SSL_taxpartsize : chr [1:65403] NA NA NA "coarse-loamy" ...
#> .. ..$ SSL_taxpartsizemod : chr [1:65403] NA NA NA NA ...
#> .. ..$ SSL_taxceactcl : chr [1:65403] NA NA NA NA ...
#> .. ..$ SSL_taxreaction : chr [1:65403] NA NA NA NA ...
#> .. ..$ SSL_taxtempcl : chr [1:65403] NA NA NA "mesic" ...
#> .. ..$ SSL_taxmoistscl : chr [1:65403] NA NA NA "aquic" ...
#> .. ..$ SSL_taxtempregime : chr [1:65403] NA NA NA "mesic" ...
#> .. ..$ SSL_taxminalogy : chr [1:65403] NA NA NA "mixed" ...
#> .. ..$ SSL_taxother : chr [1:65403] NA NA NA NA ...
#> .. ..$ SSL_osdtypelocflag : int [1:65403] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ siteiid : int [1:65403] 47454 103699 106870 119440 119441 119442 119443 119444 119445 119446 ...
#> .. ..$ usiteid : chr [1:65403] "1954ND067001" "S1950ND075006" "S1957WA063005" "S1982NY061003" ...
#> .. ..$ site_obsdate : chr [1:65403] "1954-06-02T00:00:00.0Z" "1950-09-19T00:00:00.0Z" "1957-09-11T00:00:00.0Z" "1982-11-01T00:00:00.0Z" ...
#> .. ..$ latitude_decimal_degrees : num [1:65403] 48.7 48.8 47.5 40.8 40.8 ...
#> .. ..$ longitude_decimal_degrees : num [1:65403] -97.4 -101.7 -117.1 -74 -74 ...
#> .. ..$ country_key : int [1:65403] 244 244 244 244 244 244 244 244 244 244 ...
#> .. ..$ state_key : int [1:65403] 3981 3981 4006 3988 3988 3988 3988 3988 3988 3988 ...
#> .. ..$ county_key : int [1:65403] 6092 6096 7399 6316 6316 6316 6316 6316 6316 6316 ...
#> .. ..$ mlra_key : int [1:65403] 7627 7624 NA 7763 7763 7763 7763 7763 7763 7763 ...
#> .. ..$ ssa_key : int [1:65403] 8322 8326 10612 9431 9431 9431 9431 9431 9431 9431 ...
#> .. ..$ npark_key : int [1:65403] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ nforest_key : int [1:65403] NA NA NA NA NA NA NA NA NA NA ...
#> .. ..$ note : chr [1:65403] "NASIS updated 5/4/2020 8:01:22 AM" "NASIS updated 5/4/2020 8:01:27 AM" "NASIS updated 5/4/2020 8:02:17 AM" "NASIS updated 5/4/2020 8:15:14 AM" ...
#> .. ..$ samp_taxfamhahatmatcl : chr [1:65403] NA NA NA NA ...
#> .. ..$ corr_taxfamhahatmatcl : chr [1:65403] NA NA NA NA ...
#> .. ..$ SSL_taxfamhahatmatcl : chr [1:65403] NA NA NA NA ...
#> .. ..$ pedobjupdate : chr [1:65403] "2014-09-10T14:01:31.0Z" "2017-02-24T20:28:27.0Z" "2016-04-24T22:08:01.0Z" "2014-03-03T20:59:42.0Z" ...
#> .. ..$ siteobjupdate : chr [1:65403] "2014-03-28T13:17:36.0Z" "2017-02-24T20:54:25.0Z" "2016-06-24T19:14:29.0Z" "2016-07-20T16:50:25.0Z" ...
#> .. ..$ wmiid : int [1:65403] 10 99 906 8099 8100 8101 8102 8103 8104 8105 ...
#> .. ..$ Series : chr [1:65403] "Bearden" "Hamlet" "Couse" "North meadow" ...
#> .. ..$ User_pedon_ID : chr [1:65403] "1954ND067001" "S1950ND075006" "S1957WA063005" "S1982NY061003" ...
#> .. ..$ pedon_Key : int [1:65403] 10 100 1000 10000 10001 10002 10003 10004 10005 10006 ...
#> .. ..$ Soil_Classification : chr [1:65403] "Fine-silty, mixed, superactive, frigid Aeric Calciaquolls" "Fine-loamy, mixed, superactive, frigid Aquic Hapludolls" NA "Coarse-loamy, mixed, superactive, nonacid, mesic Aquic Udorthents" ...
#> .. ..$ Primary_Lab_Report : chr [1:65403] "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10&r=1&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=100&r=1&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=1000&r=1&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10000&r=1&submit1=Get+Report" ...
#> .. ..$ Taxonomy_Report : chr [1:65403] "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10&r=3&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=100&r=3&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=1000&r=3&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10000&r=3&submit1=Get+Report" ...
#> .. ..$ Supplementary_Lab_Report : chr [1:65403] "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10&r=2&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=100&r=2&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=1000&r=2&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10000&r=2&submit1=Get+Report" ...
#> .. ..$ Water_Retention_Report : chr [1:65403] "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10&r=6&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=100&r=6&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=1000&r=6&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10000&r=6&submit1=Get+Report" ...
#> .. ..$ Correlation_Report : chr [1:65403] "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10&r=7&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=100&r=7&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=1000&r=7&submit1=Get+Report" "https://ncsslabdatamart.sc.egov.usda.gov/rptExecute.aspx?p=10000&r=7&submit1=Get+Report" ...
#> .. ..$ pedon_Description_Report : chr [1:65403] "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=Pedon_Site_Description_usepedoni"| __truncated__ "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=Pedon_Site_Description_usepedoni"| __truncated__ "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=Pedon_Site_Description_usepedoni"| __truncated__ "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=Pedon_Site_Description_usepedoni"| __truncated__ ...
#> .. ..$ Soil_Profile : chr [1:65403] "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=WEB-profiles-by-PEIID&pedon_peiid=47360" "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=WEB-profiles-by-PEIID&pedon_peiid=47278" "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=WEB-profiles-by-PEIID&pedon_peiid=106853" "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=WEB-profiles-by-PEIID&pedon_peiid=524594" ...
#> .. ..$ Soil_web : chr [1:65403] "https://casoilresource.lawr.ucdavis.edu/gmap/?loc=48.6608,-97.3747" "https://casoilresource.lawr.ucdavis.edu/gmap/?loc=48.8211,-101.747" "https://casoilresource.lawr.ucdavis.edu/gmap/?loc=47.4632,-117.069" "https://casoilresource.lawr.ucdavis.edu/gmap/?loc=40.7997,-73.9571" ...
#> .. ..$ lat : num [1:65403] 48.7 48.8 47.5 40.8 40.8 ...
#> .. ..$ long : num [1:65403] -97.4 -101.7 -117.1 -74 -74 ...
#> .. ..$ user_site_id : chr [1:65403] "54ND067001" "50ND075006" "57WA063005" "82NY061003" ...
#> .. ..$ horizontal_datum_name : chr [1:65403] NA NA "NAD27" "NAD27" ...
#> .. ..$ latitude_direction : chr [1:65403] "north" "north" "north" "north" ...
#> .. .. [list output truncated]
#> ..@ diagnostic :'data.frame': 0 obs. of 0 variables
#> ..@ restrictions:'data.frame': 0 obs. of 0 variables
If you are running on your own machine, you will have to run
get_LDM_snapshot() at least once (as above) before the
load_labtaxa() command works. In future runs you will not need to
re-download or prepare the data unless you need to update the cache.
labtaxa uses date-based version tags for reproducibility:
- latest - Most recent data snapshot (always updated)
- YYYY.MM - Specific month snapshot (e.g., 2026.02 for February 2026 data)
- YYYY.MM.DD - Specific day snapshot (rare, for patch builds)
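A script that pulls the image can guard against accidentally using a floating tag. The sketch below classifies a tag string according to the three forms listed above; the function name labtaxa_tag_kind is hypothetical, and the check looks only at the shape of the string, not at whether such an image has actually been published.

```shell
#!/bin/sh
# Hypothetical helper: classify an image tag by its form.
# In case patterns "." is a literal dot and [0-9] matches a single digit.
labtaxa_tag_kind() {
  case "$1" in
    latest)
      echo "floating" ;;                         # moves with each snapshot
    [0-9][0-9][0-9][0-9].[0-9][0-9])
      echo "month" ;;                            # YYYY.MM
    [0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9])
      echo "day" ;;                              # YYYY.MM.DD
    *)
      echo "unknown"
      return 1 ;;
  esac
}
```

For instance, a publication workflow might refuse to continue unless labtaxa_tag_kind reports month or day for the requested tag.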
For reproducible research, always specify a version tag:
# Use a specific month snapshot (recommended for publications)
docker pull ghcr.io/brownag/labtaxa:2026.02
Then in R, check your data version:
# Create sample metadata for demonstration
metadata <- list(
snapshot_date = "2026-02-01",
data_source = "USDA NRCS NCSS Lab Data Mart",
r_version = "4.5.2"
)
cat(sprintf("Data from: %s\n", metadata$snapshot_date))
#> Data from: 2026-02-01
cat(sprintf("R version: %s\n", metadata$r_version))
#> R version: 4.5.2
GitHub Releases contain checksums for verification:
- Each release is tagged with the data snapshot date
- Download snapshot-metadata.json to verify file integrity
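Once an expected digest is in hand, a downloaded file can be checked from the shell. This is a minimal sketch rather than the project's own tooling: verify_sha256 is a hypothetical helper, the expected value is assumed to have been copied by hand from snapshot-metadata.json (whose exact layout is not described here), and sha256sum is the GNU coreutils tool (on macOS, shasum -a 256 is the usual substitute).

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <filename>"; keep only the digest field.
  actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}
```

Called as verify_sha256 ncss_labdata.gpkg followed by the expected digest, it prints OK and returns zero on a match, and returns non-zero otherwise so it can gate later steps in a script.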
library(labtaxa)
# Checksums are included in release metadata
metadata <- list(
snapshot_date = "2026-02-01",
download_timestamp = "2026-02-01T12:00:00Z",
checksums = list(
list(file = "ncss_labdata.gpkg", sha256 = "abc123def456..."),
list(file = "ncss_morphologic.sqlite", sha256 = "xyz789uvw012...")
)
)
# Check when data was downloaded
cat(sprintf("Data snapshot: %s\n", metadata$snapshot_date))
#> Data snapshot: 2026-02-01
cat(sprintf("Downloaded: %s\n", metadata$download_timestamp))
#> Downloaded: 2026-02-01T12:00:00Z
# Display checksums
cat("\nChecksums:\n")
#>
#> Checksums:
for (file_info in metadata$checksums) {
  cat(sprintf("%s: %s\n", file_info$file, file_info$sha256))
}
#> ncss_labdata.gpkg: abc123def456...
#> ncss_morphologic.sqlite: xyz789uvw012...
When publishing research using NCSS laboratory data, you should cite:
National Cooperative Soil Survey. National Cooperative Soil Survey Soil Characterization Database. http://ncsslabdatamart.sc.egov.usda.gov/. Accessed [date].
You can cite both the package and the data version:
@misc{labtaxa2026,
author = {Brown, Andrew},
title = {labtaxa: USDA KSSL Database Snapshots},
year = {2026},
url = {https://github.com/brownag/labtaxa}
}
- Documentation: https://brownag.github.io/labtaxa/
- Report Issues: https://github.com/brownag/labtaxa/issues
- Discussions: https://github.com/brownag/labtaxa/discussions
- Lab Data Mart: https://ncsslabdatamart.sc.egov.usda.gov/
- aqp Package (SoilProfileCollection object): https://cran.r-project.org/package=aqp
- soilDB Package (soil database tools): https://cran.r-project.org/package=soilDB