Skip to content

Table: ndb.datasetsubmissions#

Description#

Submissions to the database are of Datasets. Submissions may be original submissions, resubmissions, compilations from other databases, or recompilations. See the description of the DatasetSubmissionTypes table.

TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?

Table Structure#

Visual Schema

Schema: ndb | Table Comment: Submissions to the database are of Datasets. Submissions may be original submissions, resubmissions, compilations from other databases, or recompilations. See the description of the DatasetSubmissionTypes table.

Statistics#

Metric Value
Row Count 64,278
Total Size 23 MB
Table Size 5248 kB
Indexes Size 18 MB

Relationships#

Primary Key: submissionid

Foreign Keys:

Referenced By:

TODO: Document which tables reference this table (will be auto-detected in validation).

Data Dictionary#

Column Type Nullable Default Constraints Description
submissionid integer nextval('ndb.seq_datasetsub... PRIMARY KEY An arbitrary submission identification number.
datasetid integer - FOREIGN KEY Dataset identification number. Field links to the Datasets table. Datasets may occur multiple times in this table (e.g. once for the original compilation into a different database and a second time for the recompilation into Neotoma).
databaseid integer - FOREIGN KEY
contactid integer - FOREIGN KEY Contact identification number. Field links to the Contacts table. The Contact is the person who submitted, resubmitted, compiled, or recompiled the data. This person is not necessarily the Dataset PI; it is the person who submitted the data or compiled the data from the literature.
submissiontypeid integer - FOREIGN KEY Submission Type identification number. Field links to the DatasetSubmissionsType table.
submissiondate date - - Date of the submission, resubmission, compilation, or recompilation.
notes text - - Free form notes or comments about the submission.
recdatecreated timestamp without time zone timezone('UTC'::text, now()) -
recdatemodified timestamp without time zone - -

TODO: Review column descriptions and add comments where missing.

Usage Examples#

Example 1: Basic Selection#

-- Get recent records from datasetsubmissions
SELECT *
FROM datasetsubmissions
ORDER BY submissionid DESC
LIMIT 10;

Purpose: Retrieve the 10 most recent records from datasetsubmissions

Example 2: Count Records#

-- Count total records
SELECT COUNT(*) as total_records
FROM datasetsubmissions;

Purpose: Get the total number of records in datasetsubmissions

Example 3: Filter by Date Range#

-- Get records within a date range
SELECT *
FROM datasetsubmissions
WHERE submissiondate >= '2024-01-01'
  AND submissiondate < '2025-01-01'
ORDER BY submissiondate DESC;

Purpose: Retrieve records from datasetsubmissions within a specific date range

Example 4: Join with constituentdatabases#

-- Join with related table
SELECT 
    t1.*,
    t2.*
FROM datasetsubmissions t1
INNER JOIN constituentdatabases t2 
    ON t1.databaseid = t2.databaseid
LIMIT 100;

Purpose: Retrieve datasetsubmissions records with related data from constituentdatabases

Example 5: Aggregate Data#

-- Aggregate records by datasetid
SELECT 
    datasetid,
    COUNT(*) as count
FROM datasetsubmissions
GROUP BY datasetid
ORDER BY count DESC
LIMIT 10;

Purpose: Count records grouped by datasetid

TODO: Add more specific examples relevant to common research questions or operational tasks.

Data Quality Notes#

TODO: Document: - Known data quality issues - Validation rules - Expected data ranges - Update frequency and mechanisms - Any ETL processes that populate this table

Maintenance#

  • Data Owner: TODO: Assign owner
  • Update Frequency: TODO: Document frequency
  • Last Major Schema Change: TODO: Document when schema last changed

TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards