Skip to content

Table: ndb.collectionunits#

Description#

This table stores data for Collection Units.

TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?

Table Structure#

Visual Schema

Schema: ndb | Table Comment: This table stores data for Collection Units.

Statistics#

Metric Value
Row Count 39,384
Total Size 23 MB
Table Size 5376 kB
Indexes Size 18 MB

Relationships#

Primary Key: collectionunitid

Foreign Keys:

Referenced By:

TODO: Document which tables reference this table (will be auto-detected in validation).

Data Dictionary#

Column Type Nullable Default Constraints Description
collectionunitid integer nextval('ndb.seq_collection... PRIMARY KEY An arbitrary Collection Unit identification number.
handle character varying(16) - - Code name for the Collection Unit. This code may be up to 10 characters, but an effort is made to keep these to 8 characters or less. Data are frequently distributed by Collection Unit, and the Handle is used for file names.
siteid integer - FOREIGN KEY Site where CollectionUnit was located. Field links to Sites table.
colltypeid integer - FOREIGN KEY Type of Collection Unit. Field links to the CollectionTypes table.
depenvtid integer - FOREIGN KEY Depositional environment of the CollectionUnit. Normally, this key refers to the modern environment. For example, the site may be located on a colluvial slope, in which case the Depositional Environment may be Colluvium or Colluvial Fan. However, an excavation may extend into alluvial sediments, which represent a different depositional environment. These are accounted for by the Facies of the AnalysisUnit. Field links to the DepEnvtTypes table.
collunitname character varying(255) - - Name of the Collection Unit. Examples: Core BPT82A, Structure 9, P4A Test 57. If faunal data are reported from a site or locality without explicit Collection Units, then data are assigned to a single Collection Unit with the name «Locality».
colldate date - - Date Collection Unit was collected.
colldevice character varying(255) - - Device used for obtain Collection Unit. This field applies primarily to cores, for example «Wright square-rod piston corer (5 cm)».
gpslatitude double precision - - Precise latitude of the Collection Unit, typically taken with a GPS, although may be precisely measured from a map.
gpslongitude double precision - - Precise longitude of the Collection Unit, typically taken with a GPS, although may be precisely measured from a map.
gpsaltitude double precision - - Precise altitude of the Collection Unit, typically taken with a GPS or precisely obtained from a map.
gpserror double precision - - Error in the horizontal GPS coordinates, if known.
waterdepth double precision - - Depth of water at the Collection Unit location. This field applies mainly to Collection Units from lakes.
substrateid integer - - Substrate or rock type on which the Collection Unit lies. Field links to the RockTypes table. This field is especially used for rodent middens.
slopeaspect integer - - For Collection Units on slopes, the horizontal direction to which a slope faces measured in degrees clockwise from north. This field is especially used for rodent middens.
slopeangle integer - - For Collection Units on slopes, the angle of slope from horizontal. field is especially used for rodent middens.
location character varying(255) - - Short description of the location of the Collection Unit within the site.
notes text - - Free form notes or comments about the Collection Unit.
recdatecreated timestamp without time zone timezone('UTC'::text, now()) -
recdatemodified timestamp without time zone - -

TODO: Review column descriptions and add comments where missing.

Usage Examples#

Example 1: Basic Selection#

-- Get recent records from collectionunits
SELECT *
FROM collectionunits
ORDER BY collectionunitid DESC
LIMIT 10;

Purpose: Retrieve the 10 most recent records from collectionunits

Example 2: Count Records#

-- Count total records
SELECT COUNT(*) as total_records
FROM collectionunits;

Purpose: Get the total number of records in collectionunits

Example 3: Filter by Date Range#

-- Get records within a date range
SELECT *
FROM collectionunits
WHERE colldate >= '2024-01-01'
  AND colldate < '2025-01-01'
ORDER BY colldate DESC;

Purpose: Retrieve records from collectionunits within a specific date range

Example 4: Join with collectiontypes#

-- Join with related table
SELECT 
    t1.*,
    t2.*
FROM collectionunits t1
INNER JOIN collectiontypes t2 
    ON t1.colltypeid = t2.colltypeid
LIMIT 100;

Purpose: Retrieve collectionunits records with related data from collectiontypes

Example 5: Aggregate Data#

-- Aggregate records by handle
SELECT 
    handle,
    COUNT(*) as count
FROM collectionunits
GROUP BY handle
ORDER BY count DESC
LIMIT 10;

Purpose: Count records grouped by handle

TODO: Add more specific examples relevant to common research questions or operational tasks.

Data Quality Notes#

Automated Data Quality Tests#

This table is subject to the following automated quality checks:

❌ ref_003: sites_have_collection_units

  • Severity: ERROR
  • Status: FAILED
  • Description: All sites should have at least one collection unit.

  • Suggested Remediation: - Remove "floating" sites.

  • Ensure that the collection units have not been accidentally deleted.

❌ comp_002: collectionunits_have_dates

  • Severity: WARNING
  • Status: FAILED
  • Description: collectionunits should have collection dates

  • Suggested Remediation: - Review original data sources for date information.

  • Derive dates from publications where available.
  • Record the decision making processes at a Constituent Database level.

✅ comp_004: sample_ages_for_samples

  • Severity: WARNING
  • Status: PASSED
  • Description: Samples should have sample ages, whether from a chronology or collection date.

See the Data Quality Report for details.

Maintenance#

  • Data Owner: TODO: Assign owner
  • Update Frequency: TODO: Document frequency
  • Last Major Schema Change: TODO: Document when schema last changed

TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards