Table: ndb.analysisunits#
Description#
This table stores the data for Analysis Units.
TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?
Table Structure#
Schema: ndb | Table Comment: This table stores the data for Analysis Units.
Statistics#
| Metric | Value |
|---|---|
| Row Count | 565,735 |
| Total Size | 69 MB |
| Table Size | 47 MB |
| Indexes Size | 22 MB |
Relationships#
Primary Key: analysisunitid
Foreign Keys:
collectionunitid→collectionunits.collectionunitidfaciesid→faciestypes.faciesid
Referenced By:
TODO: Document which tables reference this table (will be auto-detected in validation).
Data Dictionary#
| Column | Type | Nullable | Default | Constraints | Description |
|---|---|---|---|---|---|
analysisunitid |
integer | ✗ | nextval('ndb.seq_analysisun... |
PRIMARY KEY | An arbitrary Analysis Unit identification number. |
collectionunitid |
integer | ✗ | - |
FOREIGN KEY | Collection Unit ID number. Field links to CollectionUnits table. Every Analysis Unit belongs to a Collection Unit. |
analysisunitname |
character varying(80) | ✓ | - |
- | Optional name for an Analysis Unit. Analysis Units are usually designated with either a depth or a name, sometimes both. |
depth |
double precision | ✓ | - |
- | Optional depth of the Analysis Unit in cm. Depths are typically designated for Analysis Units from cores and for Analysis Units excavated in arbitrary (e.g. 10 cm) levels. Depths are normally the midpoints of arbitrary levels. For example, for a level excavated from 10 to 20 cm or for a core section from 10 to 15 cm, the depth is 15. Designating depths as midpoints and thicknesses facilitates calculation of ages from age models that utilize single midpoint depths for Analysis Units rather than top and bottom depths. Of course, top and bottom depths can be calculated from midpoint depths and thicknesses. For many microfossil core samples, only the midpoint depths are known or published; the diameter or width of the sampling device is often not given. |
thickness |
double precision | ✓ | - |
- | Optional thickness of the Analysis Unit in cm. For many microfossil core samples, the depths are treated as points, and the thicknesses are not given in the publications, although 0.5 to 1.0 cm would be typical. |
faciesid |
integer | ✓ | - |
FOREIGN KEY | Sedimentary facies of the Analysis Unit. Field links to the FaciesTypes table. |
mixed |
boolean | ✗ | - |
- | Indicates whether specimens in the Analysis Unit are of mixed ages, for example Pleistocene fossils occurring with late Holocene fossils. Although Analysis Units may be mixed, samples from the Analysis Unit may not be, for example individually radiocarbon dated specimens. |
igsn |
character varying(40) | ✓ | - |
- | International Geo Sample Number. The IGSN is a unique identifier for a Geoscience sample. They are assigned by the SESAR, the System for Earth Sample Registration (www.geosamples.org), which is a registry that provides and administers the unique identifiers. IGSN’s may be assigned to all types of geoscience samples, including cores, rocks, minerals, and even fluids. Their purpose is to facilitate sharing and correlation of samples and sample-based data. For data in Neotoma, their primary value would be for correlation various samples from the same Analysis Units, for example pollen, charcoal, diatoms, and geochemical analyses. Conceivably, the AnalysisUnitID could be used for this purpose; however, IGSN’s could be assigned by projects before their data are submitted to the database. Moreover, AnalysisUnitID’s are intended to be internal to the database. Although IGSN’s could be assigned to Neotoma Collection Units and Samples, their primary value lies in their assignment to Analysis Units. IGSN’s are not yet assigned to Neotoma Analysis Units; however, that may change after consultation with SESAR. |
notes |
text | ✓ | - |
- | Free form notes or comments about the Analysis Unit. |
recdatecreated |
timestamp without time zone | ✗ | timezone('UTC'::text, now()) |
- | |
recdatemodified |
timestamp without time zone | ✗ | - |
- |
TODO: Review column descriptions and add comments where missing.
Usage Examples#
Example 1: Basic Selection#
-- Get recent records from analysisunits
SELECT *
FROM analysisunits
ORDER BY analysisunitid DESC
LIMIT 10;
Purpose: Retrieve the 10 most recent records from analysisunits
Example 2: Count Records#
Purpose: Get the total number of records in analysisunits
Example 3: Filter by Date Range#
-- Get records within a date range
SELECT *
FROM analysisunits
WHERE recdatecreated >= '2024-01-01'
AND recdatecreated < '2025-01-01'
ORDER BY recdatecreated DESC;
Purpose: Retrieve records from analysisunits within a specific date range
Example 4: Join with collectionunits#
-- Join with related table
SELECT
t1.*,
t2.*
FROM analysisunits t1
INNER JOIN collectionunits t2
ON t1.collectionunitid = t2.collectionunitid
LIMIT 100;
Purpose: Retrieve analysisunits records with related data from collectionunits
Example 5: Aggregate Data#
-- Aggregate records by collectionunitid
SELECT
collectionunitid,
COUNT(*) as count
FROM analysisunits
GROUP BY collectionunitid
ORDER BY count DESC
LIMIT 10;
Purpose: Count records grouped by collectionunitid
TODO: Add more specific examples relevant to common research questions or operational tasks.
Data Quality Notes#
TODO: Document: - Known data quality issues - Validation rules - Expected data ranges - Update frequency and mechanisms - Any ETL processes that populate this table
Maintenance#
- Data Owner: TODO: Assign owner
- Update Frequency: TODO: Document frequency
- Last Major Schema Change: TODO: Document when schema last changed
Related Documentation#
TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards