Services
Elevation Service
Main elevation service orchestrator.
- class biosample_enricher.elevation.service.ElevationService(google_api_key=None, enable_google=True, enable_usgs=True, enable_osm=True, enable_open_topo_data=True, osm_endpoint='https://api.open-elevation.com/api/v1/lookup', open_topo_data_endpoint='https://api.opentopodata.org/v1')[source]
Bases: object
Orchestrates elevation lookups across multiple providers.
- Parameters:
- __init__(google_api_key=None, enable_google=True, enable_usgs=True, enable_osm=True, enable_open_topo_data=True, osm_endpoint='https://api.open-elevation.com/api/v1/lookup', open_topo_data_endpoint='https://api.opentopodata.org/v1')[source]
Initialize the elevation service.
- Parameters:
google_api_key (str | None) – Google API key (if None, reads from env)
enable_google (bool) – Whether to enable Google provider
enable_usgs (bool) – Whether to enable USGS provider
enable_osm (bool) – Whether to enable OSM provider
enable_open_topo_data (bool) – Whether to enable Open Topo Data provider
osm_endpoint (str) – OSM provider endpoint URL
open_topo_data_endpoint (str) – Open Topo Data endpoint URL
- classmethod from_env()[source]
Create elevation service from environment variables.
- Returns:
ElevationService – Configured elevation service
- classify_coordinates(lat, lon)[source]
Classify coordinates for provider routing.
- Parameters:
- Returns:
CoordinateClassification – Coordinate classification
- classify_biosample_location(lat, lon)[source]
Classify a biosample location for routing and metadata.
This method provides biosample-specific classification that can be stored with the sample metadata for efficient provider routing.
- select_providers(classification, preferred=None)[source]
Select providers based on coordinate classification.
- Parameters:
classification (CoordinateClassification) – Coordinate classification result
preferred (list[str] | None) – Preferred provider names in order
- Returns:
list[ElevationProvider] – List of providers in priority order
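The effect of a preferred list on the priority order can be illustrated with a small sketch. This does not reproduce the library's routing logic: the function `order_providers`, the provider names, and the rule "preferred names first, remaining providers afterwards" are all assumptions made for illustration.

```python
from __future__ import annotations


def order_providers(available: list[str], preferred: list[str] | None = None) -> list[str]:
    """Return provider names with preferred ones first (hypothetical rule)."""
    if not preferred:
        return list(available)
    # Keep only preferred names that are actually available, dropping duplicates
    # while preserving the caller's order.
    head = list(dict.fromkeys(name for name in preferred if name in available))
    # Append the remaining providers in their default order.
    tail = [name for name in available if name not in head]
    return head + tail


print(order_providers(["usgs", "google", "osm"], preferred=["osm", "usgs"]))
# -> ['osm', 'usgs', 'google']
```

Unknown names in `preferred` are silently ignored in this sketch; a real implementation might warn or raise instead.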
- get_elevation(request, *, read_from_cache=True, write_to_cache=True, timeout_s=20.0)[source]
Get elevation observations from multiple providers.
- Parameters:
request (ElevationRequest) – Elevation request
read_from_cache (bool) – Whether to read from cache
write_to_cache (bool) – Whether to write to cache
timeout_s (float) – Request timeout in seconds
- Returns:
list[Observation] – List of elevation observations
- get_best_elevation(observations)[source]
Select the best elevation from multiple observations.
- Parameters:
observations (list[Observation]) – List of elevation observations
- Returns:
ElevationResult | None – Best elevation result, or None if no valid observations
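The selection criterion is not documented here. As a sketch under stated assumptions, one plausible rule is to discard observations without a value and then prefer the smallest reported vertical accuracy. The `Obs` dataclass, its field names, and the ranking rule below are hypothetical stand-ins, not the package's `Observation` model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Obs:
    """Hypothetical stand-in for an elevation observation."""
    provider: str
    elevation_m: float | None
    vertical_accuracy_m: float | None = None


def pick_best(observations: list[Obs]) -> Obs | None:
    """Return the valid observation with the smallest accuracy value, or None."""
    valid = [o for o in observations if o.elevation_m is not None]
    if not valid:
        return None
    # Observations without an accuracy estimate rank last (assumed rule).
    return min(
        valid,
        key=lambda o: o.vertical_accuracy_m if o.vertical_accuracy_m is not None else float("inf"),
    )


best = pick_best([
    Obs("osm", 1609.0, 16.0),
    Obs("usgs", 1608.2, 1.0),
    Obs("google", None),
])
print(best.provider if best else None)  # -> usgs
```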
- create_output_envelope(subject_id, observations, read_from_cache=True, write_to_cache=True)[source]
Create output envelope with observations.
- Parameters:
subject_id (str) – Subject identifier
observations (list[Observation]) – List of observations
read_from_cache (bool) – Whether cache was used for reading
write_to_cache (bool) – Whether cache was used for writing
- Returns:
OutputEnvelope – Output envelope
Soil Service
Soil enrichment service orchestration.
- class biosample_enricher.soil.service.SoilService[source]
Bases: object
Multi-provider soil enrichment service.
Orchestrates multiple soil data providers with intelligent cascading:
- US locations: USDA NRCS SDA primary, SoilGrids fallback
- Global locations: SoilGrids primary
Provides static soil site characterization including taxonomy, properties, and texture classification.
- enrich_location(latitude, longitude, depth_cm='0-5cm')[source]
Enrich a single location with soil data.
- Parameters:
- Returns:
SoilResult – SoilResult with best available soil data
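The cascade described for SoilService (a US-specific primary provider with a global fallback) can be sketched as follows. The bounding box used to detect US locations is a rough approximation of the contiguous United States and is an assumption, as are the function names and the stub providers; the service may classify locations differently.

```python
from __future__ import annotations

from typing import Callable, Optional

Provider = Callable[[float, float], Optional[dict]]


def in_conus_bbox(lat: float, lon: float) -> bool:
    """Crude contiguous-US test (assumed; ignores Alaska, Hawaii, territories)."""
    return 24.5 <= lat <= 49.5 and -125.0 <= lon <= -66.5


def enrich_soil(lat: float, lon: float, usda: Provider, soilgrids: Provider) -> dict | None:
    """Try providers in cascade order and return the first non-empty result."""
    chain = [usda, soilgrids] if in_conus_bbox(lat, lon) else [soilgrids]
    for provider in chain:
        result = provider(lat, lon)
        if result is not None:
            return result
    return None


# Stub providers for illustration only.
def usda_stub(lat: float, lon: float) -> dict | None:
    return None  # simulate "no data" so the fallback is exercised


def soilgrids_stub(lat: float, lon: float) -> dict | None:
    return {"provider": "soilgrids", "lat": lat, "lon": lon}


print(enrich_soil(39.74, -104.99, usda_stub, soilgrids_stub)["provider"])  # -> soilgrids
```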
Weather Service
Weather enrichment service for biosample environmental context.
Orchestrates multiple weather providers to deliver day-specific weather data with temporal precision tracking and standardized schema mapping.
- class biosample_enricher.weather.service.ClimateNormalsProvider(*args, **kwargs)[source]
Bases: Protocol
Protocol for providers that support climate normals.
- __init__(*args, **kwargs)
- class biosample_enricher.weather.service.WeatherService(providers=None)[source]
Bases: object
Multi-provider weather enrichment service for biosample metadata.
Provides day-specific weather data using a provider fallback chain with temporal precision tracking and standardized output schema.
- get_weather_for_biosample(biosample, target_schema='nmdc')[source]
Get weather data for a biosample and map to target schema.
- get_daily_weather(lat, lon, target_date, parameters=None)[source]
Get daily weather data by integrating results from all providers.
- Parameters:
- Returns:
WeatherResult – WeatherResult with integrated data from all available providers
- get_climate_normals(lat, lon, years_back=30, providers=None)[source]
Get climate averages (normals) for a location from all available providers.
By default, queries ALL available providers and returns results from each successful provider in a MultiProviderClimateNormals object. This allows:
- Comparing values across different data sources
- Detecting provider outages/failures
- Validating data quality by cross-checking
- Computing consensus values across providers
Supported providers:
- Meteostat: Station-based 1991-2020 normals (30-year WMO standard)
- NASA POWER: Satellite-based 2001-2020 climatologies (20-year MERRA-2)
Climate normals represent typical conditions over a multi-year period, providing context for biosample environmental metadata like annual precipitation totals and average temperatures.
For biosample enrichment:
- Use this for annual_precpt, annual_temp slots
- Use get_daily_weather() for collection-date weather
Following general-purpose design (Issue #199): This method provides comprehensive climate data that ANY project can use. Use the to_submission_schema() method on the result to extract values in submission-schema format (Issue #193).
- Parameters:
lat (float) – Latitude in decimal degrees
lon (float) – Longitude in decimal degrees
years_back (int) – Number of years back from current year to request (default: 30). For example, if current year is 2025 and years_back=30, requests period 1995-2025. Providers will return whatever period they actually have available, which may differ.
providers (list[str] | None) – Optional list of provider names to query (e.g., ["meteostat", "nasa_power"]). If None, queries ALL available providers (default behavior).
- Returns:
MultiProviderClimateNormals – MultiProviderClimateNormals with results from all successful providers. The result includes both requested period and actual returned periods from each provider for transparency.
- Raises:
ValueError – If no climate data available from any provider.
Example
>>> service = WeatherService()
>>> # Request 30 years back from current year (dynamic period)
>>> normals = service.get_climate_normals(40.7128, -74.0060)
>>>
>>> # Get results from all successful providers
>>> print(f"Successful providers: {normals.successful_providers}")
Successful providers: ['meteostat', 'nasa_power']
>>>
>>> # Extract consensus values for submission-schema (Issue #191)
>>> schema_values = normals.to_submission_schema(strategy="consensus")
>>> print(f"annual_precpt: {schema_values['annual_precpt']} mm")
annual_precpt: 907.8 mm
>>>
>>> # Or get result from specific provider
>>> meteostat_result = normals.get_provider_result("meteostat")
>>> if meteostat_result:
...     print(f"Meteostat: {meteostat_result.get_annual_precipitation()} mm/year")
Meteostat: 1268.4 mm/year
>>>
>>> # Query only specific providers or different period
>>> normals = service.get_climate_normals(40.7128, -74.0060,
...                                       years_back=20, providers=["nasa_power"])
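The documented behaviour (derive the requested period from years_back, query every provider, keep each success, and raise ValueError only when all fail) can be sketched independently of the package. The helper names `requested_period` and `collect_normals`, the stub providers, and the shape of their results are hypothetical.

```python
from __future__ import annotations

from typing import Callable


def requested_period(current_year: int, years_back: int = 30) -> tuple[int, int]:
    """Return (start_year, end_year), e.g. 2025 and 30 give (1995, 2025)."""
    return current_year - years_back, current_year


def collect_normals(
    lat: float,
    lon: float,
    providers: dict[str, Callable[[float, float], dict]],
) -> dict[str, dict]:
    """Query every provider and keep the results of those that succeed."""
    results: dict[str, dict] = {}
    failures: dict[str, str] = {}
    for name, fetch in providers.items():
        try:
            results[name] = fetch(lat, lon)
        except Exception as exc:  # one provider outage must not abort the rest
            failures[name] = str(exc)
    if not results:
        raise ValueError(f"No climate data available from any provider: {failures}")
    return results


# Stub providers for illustration only; the values are made up.
def working(lat: float, lon: float) -> dict:
    return {"annual_precipitation_mm": 1000.0}


def failing(lat: float, lon: float) -> dict:
    raise RuntimeError("service unavailable")


print(requested_period(2025, 30))  # -> (1995, 2025)
print(sorted(collect_normals(40.7128, -74.0060, {"a": working, "b": failing})))  # -> ['a']
```

Collecting failures alongside successes makes provider outages visible to the caller instead of hiding them behind a single aggregate value.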
Marine Service
Marine enrichment service for biosample oceanographic context.
Orchestrates multiple marine data providers to deliver comprehensive marine environmental data with quality tracking and standardized schema mapping.
- class biosample_enricher.marine.service.MarineService(providers=None)[source]
Bases: object
Multi-provider marine enrichment service for biosample metadata.
Provides comprehensive oceanographic data using multiple providers with quality tracking and standardized output schema for marine samples.
- get_marine_data_for_biosample(biosample, target_schema='nmdc')[source]
Get marine data for a biosample and map to target schema.
- get_comprehensive_marine_data(latitude, longitude, target_date)[source]
Get comprehensive marine data from all available providers.
- Parameters:
- Returns:
MarineResult – MarineResult with combined data from all providers
Land Service
Land cover and vegetation enrichment service orchestration.
- class biosample_enricher.land.service.LandService[source]
Bases: object
Multi-provider land cover and vegetation enrichment service.
Queries ALL available providers for comprehensive data coverage:
- Land Cover: ESA WorldCover, NLCD (US), MODIS Land Cover, CGLS
- Vegetation: MODIS NDVI/EVI/LAI/FPAR, VIIRS NDVI, Sentinel-2 (selective)
Returns full provenance with exact coordinates/dates from each provider.
- enrich_location(latitude, longitude, target_date=None, time_window_days=16)[source]
Enrich a single location with land cover and vegetation data.
- Parameters:
- Returns:
LandResult – LandResult with data from ALL available providers
- enrich_batch(locations, target_date=None, time_window_days=16)[source]
Enrich multiple locations with land cover and vegetation data.
Forward Geocoding Service
Forward geocoding service for coordinating multiple providers (place names to coordinates).
- class biosample_enricher.forward_geocoding.service.ForwardGeocodingService[source]
Bases: object
Service for managing forward geocoding providers (place names to coordinates).
- geocode(query, provider=None, *, read_from_cache=True, write_to_cache=True, timeout_s=30.0, language='en', country_codes=None, max_results=10)[source]
Perform forward geocoding to convert place name to coordinates.
- Parameters:
query (str) – Place name or address to search for
provider (str | None) – Provider name (None for auto-selection)
read_from_cache (bool) – Whether to read from cache
write_to_cache (bool) – Whether to write to cache
timeout_s (float) – Request timeout in seconds
language (str) – Language code for results
country_codes (list[str] | None) – List of ISO country codes to restrict search
max_results (int) – Maximum number of results
- Returns:
ForwardGeocodeResult | None – Forward geocoding result or None if failed
- geocode_multiple(query, providers=None, *, read_from_cache=True, write_to_cache=True, timeout_s=30.0, language='en', country_codes=None, max_results=5)[source]
Perform forward geocoding using multiple providers for comparison.
- Parameters:
query (str) – Place name or address to search for
providers (list[str] | None) – List of provider names (None for all available)
read_from_cache (bool) – Whether to read from cache
write_to_cache (bool) – Whether to write to cache
timeout_s (float) – Request timeout in seconds
language (str) – Language code for results
country_codes (list[str] | None) – List of ISO country codes to restrict search
max_results (int) – Maximum results per provider
- Returns:
dict[str, ForwardGeocodeResult] – Dictionary mapping provider names to results
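Once several providers have returned coordinates for the same query, a simple comparison is the largest great-circle distance between any two of them. The sketch below uses the standard haversine formula on plain `(lat, lon)` tuples; the input dictionary is a hypothetical simplification and does not reflect the fields of ForwardGeocodeResult.

```python
from __future__ import annotations

import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_disagreement_km(coords: dict[str, tuple[float, float]]) -> float:
    """Largest pairwise distance between provider coordinates (0.0 if fewer than two)."""
    return max(
        (haversine_km(*a, *b) for a, b in combinations(coords.values(), 2)),
        default=0.0,
    )


# Hypothetical provider outputs for one place name.
by_provider = {
    "provider_a": (40.7128, -74.0060),
    "provider_b": (40.7130, -74.0059),
}
print(f"{max_disagreement_km(by_provider):.3f} km")
```

A large disagreement is a useful signal that the place name is ambiguous and that the sample may need manual review before its coordinates are trusted.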
- get_coordinates_for_place(place_name, prefer_provider=None, language='en', country_hint=None)[source]
Get coordinates and enrichment data for a biosample place name.
This is the main method for biosample enrichment - converts place names from metadata into precise coordinates.
Reverse Geocoding Service
Reverse geocoding service for coordinating multiple providers.
- class biosample_enricher.reverse_geocoding.service.ReverseGeocodingService[source]
Bases: object
Service for managing reverse geocoding providers.
- reverse_geocode(lat, lon, provider=None, *, read_from_cache=True, write_to_cache=True, timeout_s=20.0, language='en', limit=10)[source]
Perform reverse geocoding using specified or default provider.
- Parameters:
lat (float) – Latitude in decimal degrees
lon (float) – Longitude in decimal degrees
provider (str | None) – Provider name (None for auto-selection)
read_from_cache (bool) – Whether to read from cache
write_to_cache (bool) – Whether to write to cache
timeout_s (float) – Request timeout in seconds
language (str) – Language code for results
limit (int) – Maximum number of results
- Returns:
ReverseGeocodeResult | None – Reverse geocoding result or None if failed
- reverse_geocode_multiple(lat, lon, providers=None, *, read_from_cache=True, write_to_cache=True, timeout_s=20.0, language='en', limit=10)[source]
Perform reverse geocoding using multiple providers sequentially.
- Parameters:
lat (float) – Latitude in decimal degrees
lon (float) – Longitude in decimal degrees
providers (list[str] | None) – List of provider names (None for all available)
read_from_cache (bool) – Whether to read from cache
write_to_cache (bool) – Whether to write to cache
timeout_s (float) – Request timeout in seconds
language (str) – Language code for results
limit (int) – Maximum number of results per provider
- Returns:
dict[str, ReverseGeocodeResult] – Dictionary mapping provider names to results
OSM Features Service
OSM geographic features enrichment service.
- class biosample_enricher.osm_features.service.OSMFeaturesService(default_radius_m=1000, enable_google=True)[source]
Bases: object
Service for enriching locations with geographic features from multiple providers.
- __init__(default_radius_m=1000, enable_google=True)[source]
Initialize geographic features service.
- get_combined_features_for_location(latitude, longitude, radius_m=None, timeout_s=180)[source]
Get geographic features from both OSM and Google Places providers.
- get_features_for_location(latitude, longitude, radius_m=None, timeout_s=180)[source]
Get geographic features around a location from OSM only (for backward compatibility).
- enrich_biosample_location(latitude, longitude, radius_m=None, timeout_s=180, use_combined=True)[source]
Enrich a biosample location with geographic features from multiple providers.
- get_features_for_biosample(biosample, radius_m=1000, timeout_s=180)[source]
Get OSM features for a biosample dictionary.