Table: ndb.collectionunits#
Description#
This table stores data for Collection Units.
TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?
Table Structure#
Schema: ndb | Table Comment: This table stores data for Collection Units.
Statistics#
| Metric | Value |
|---|---|
| Row Count | 39,384 |
| Total Size | 23 MB |
| Table Size | 5376 kB |
| Indexes Size | 18 MB |
Relationships#
Primary Key: collectionunitid
Foreign Keys:
colltypeid→collectiontypes.colltypeiddepenvtid→depenvttypes.depenvtidsubstrateid→rocktypes.rocktypeidsiteid→sites.siteid
Referenced By:
TODO: Document which tables reference this table (will be auto-detected in validation).
Data Dictionary#
| Column | Type | Nullable | Default | Constraints | Description |
|---|---|---|---|---|---|
collectionunitid |
integer | ✗ | nextval('ndb.seq_collection... |
PRIMARY KEY | An arbitrary Collection Unit identification number. |
handle |
character varying(16) | ✗ | - |
- | Code name for the Collection Unit. This code may be up to 10 characters, but an effort is made to keep these to 8 characters or less. Data are frequently distributed by Collection Unit, and the Handle is used for file names. |
siteid |
integer | ✗ | - |
FOREIGN KEY | Site where CollectionUnit was located. Field links to Sites table. |
colltypeid |
integer | ✓ | - |
FOREIGN KEY | Type of Collection Unit. Field links to the CollectionTypes table. |
depenvtid |
integer | ✓ | - |
FOREIGN KEY | Depositional environment of the CollectionUnit. Normally, this key refers to the modern environment. For example, the site may be located on a colluvial slope, in which case the Depositional Environment may be Colluvium or Colluvial Fan. However, an excavation may extend into alluvial sediments, which represent a different depositional environment. These are accounted for by the Facies of the AnalysisUnit. Field links to the DepEnvtTypes table. |
collunitname |
character varying(255) | ✓ | - |
- | Name of the Collection Unit. Examples: Core BPT82A, Structure 9, P4A Test 57. If faunal data are reported from a site or locality without explicit Collection Units, then data are assigned to a single Collection Unit with the name «Locality». |
colldate |
date | ✓ | - |
- | Date Collection Unit was collected. |
colldevice |
character varying(255) | ✓ | - |
- | Device used for obtain Collection Unit. This field applies primarily to cores, for example «Wright square-rod piston corer (5 cm)». |
gpslatitude |
double precision | ✓ | - |
- | Precise latitude of the Collection Unit, typically taken with a GPS, although may be precisely measured from a map. |
gpslongitude |
double precision | ✓ | - |
- | Precise longitude of the Collection Unit, typically taken with a GPS, although may be precisely measured from a map. |
gpsaltitude |
double precision | ✓ | - |
- | Precise altitude of the Collection Unit, typically taken with a GPS or precisely obtained from a map. |
gpserror |
double precision | ✓ | - |
- | Error in the horizontal GPS coordinates, if known. |
waterdepth |
double precision | ✓ | - |
- | Depth of water at the Collection Unit location. This field applies mainly to Collection Units from lakes. |
substrateid |
integer | ✓ | - |
- | Substrate or rock type on which the Collection Unit lies. Field links to the RockTypes table. This field is especially used for rodent middens. |
slopeaspect |
integer | ✓ | - |
- | For Collection Units on slopes, the horizontal direction to which a slope faces measured in degrees clockwise from north. This field is especially used for rodent middens. |
slopeangle |
integer | ✓ | - |
- | For Collection Units on slopes, the angle of slope from horizontal. field is especially used for rodent middens. |
location |
character varying(255) | ✓ | - |
- | Short description of the location of the Collection Unit within the site. |
notes |
text | ✓ | - |
- | Free form notes or comments about the Collection Unit. |
recdatecreated |
timestamp without time zone | ✗ | timezone('UTC'::text, now()) |
- | |
recdatemodified |
timestamp without time zone | ✗ | - |
- |
TODO: Review column descriptions and add comments where missing.
Usage Examples#
Example 1: Basic Selection#
-- Get recent records from collectionunits
SELECT *
FROM collectionunits
ORDER BY collectionunitid DESC
LIMIT 10;
Purpose: Retrieve the 10 most recent records from collectionunits
Example 2: Count Records#
Purpose: Get the total number of records in collectionunits
Example 3: Filter by Date Range#
-- Get records within a date range
SELECT *
FROM collectionunits
WHERE colldate >= '2024-01-01'
AND colldate < '2025-01-01'
ORDER BY colldate DESC;
Purpose: Retrieve records from collectionunits within a specific date range
Example 4: Join with collectiontypes#
-- Join with related table
SELECT
t1.*,
t2.*
FROM collectionunits t1
INNER JOIN collectiontypes t2
ON t1.colltypeid = t2.colltypeid
LIMIT 100;
Purpose: Retrieve collectionunits records with related data from collectiontypes
Example 5: Aggregate Data#
-- Aggregate records by handle
SELECT
handle,
COUNT(*) as count
FROM collectionunits
GROUP BY handle
ORDER BY count DESC
LIMIT 10;
Purpose: Count records grouped by handle
TODO: Add more specific examples relevant to common research questions or operational tasks.
Data Quality Notes#
Automated Data Quality Tests#
This table is subject to the following automated quality checks:
❌ ref_003: sites_have_collection_units
- Severity: ERROR
- Status: FAILED
-
Description: All sites should have at least one collection unit.
-
Suggested Remediation: - Remove "floating" sites.
- Ensure that the collection units have not been accidentally deleted.
❌ comp_002: collectionunits_have_dates
- Severity: WARNING
- Status: FAILED
-
Description: collectionunits should have collection dates
-
Suggested Remediation: - Review original data sources for date information.
- Derive dates from publications where available.
- Record the decision making processes at a Constituent Database level.
✅ comp_004: sample_ages_for_samples
- Severity: WARNING
- Status: PASSED
- Description: Samples should have sample ages, whether from a chronology or collection date.
See the Data Quality Report for details.
Maintenance#
- Data Owner: TODO: Assign owner
- Update Frequency: TODO: Document frequency
- Last Major Schema Change: TODO: Document when schema last changed
Related Documentation#
TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards