Table: ndb.datasetsubmissions#
Description#
Submissions to the database are of Datasets. Submissions may be original submissions, resubmissions, compilations from other databases, or recompilations. See the description of the DatasetSubmissionTypes table.
TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?
Table Structure#
Schema: ndb | Table Comment: Submissions to the database are of Datasets. Submissions may be original submissions, resubmissions, compilations from other databases, or recompilations. See the description of the DatasetSubmissionTypes table.
Statistics#
| Metric | Value |
|---|---|
| Row Count | 64,278 |
| Total Size | 23 MB |
| Table Size | 5248 kB |
| Indexes Size | 18 MB |
Relationships#
Primary Key: submissionid
Foreign Keys:
databaseid→constituentdatabases.databaseidcontactid→contacts.contactiddatasetid→datasets.datasetidsubmissiontypeid→datasetsubmissiontypes.submissiontypeid
Referenced By:
TODO: Document which tables reference this table (will be auto-detected in validation).
Data Dictionary#
| Column | Type | Nullable | Default | Constraints | Description |
|---|---|---|---|---|---|
submissionid |
integer | ✗ | nextval('ndb.seq_datasetsub... |
PRIMARY KEY | An arbitrary submission identification number. |
datasetid |
integer | ✗ | - |
FOREIGN KEY | Dataset identification number. Field links to the Datasets table. Datasets may occur multiple times in this table (e.g. once for the original compilation into a different database and a second time for the recompilation into Neotoma). |
databaseid |
integer | ✓ | - |
FOREIGN KEY | |
contactid |
integer | ✓ | - |
FOREIGN KEY | Contact identification number. Field links to the Contacts table. The Contact is the person who submitted, resubmitted, compiled, or recompiled the data. This person is not necessarily the Dataset PI; it is the person who submitted the data or compiled the data from the literature. |
submissiontypeid |
integer | ✓ | - |
FOREIGN KEY | Submission Type identification number. Field links to the DatasetSubmissionsType table. |
submissiondate |
date | ✓ | - |
- | Date of the submission, resubmission, compilation, or recompilation. |
notes |
text | ✓ | - |
- | Free form notes or comments about the submission. |
recdatecreated |
timestamp without time zone | ✗ | timezone('UTC'::text, now()) |
- | |
recdatemodified |
timestamp without time zone | ✗ | - |
- |
TODO: Review column descriptions and add comments where missing.
Usage Examples#
Example 1: Basic Selection#
-- Get recent records from datasetsubmissions
SELECT *
FROM datasetsubmissions
ORDER BY submissionid DESC
LIMIT 10;
Purpose: Retrieve the 10 most recent records from datasetsubmissions
Example 2: Count Records#
Purpose: Get the total number of records in datasetsubmissions
Example 3: Filter by Date Range#
-- Get records within a date range
SELECT *
FROM datasetsubmissions
WHERE submissiondate >= '2024-01-01'
AND submissiondate < '2025-01-01'
ORDER BY submissiondate DESC;
Purpose: Retrieve records from datasetsubmissions within a specific date range
Example 4: Join with constituentdatabases#
-- Join with related table
SELECT
t1.*,
t2.*
FROM datasetsubmissions t1
INNER JOIN constituentdatabases t2
ON t1.databaseid = t2.databaseid
LIMIT 100;
Purpose: Retrieve datasetsubmissions records with related data from constituentdatabases
Example 5: Aggregate Data#
-- Aggregate records by datasetid
SELECT
datasetid,
COUNT(*) as count
FROM datasetsubmissions
GROUP BY datasetid
ORDER BY count DESC
LIMIT 10;
Purpose: Count records grouped by datasetid
TODO: Add more specific examples relevant to common research questions or operational tasks.
Data Quality Notes#
TODO: Document: - Known data quality issues - Validation rules - Expected data ranges - Update frequency and mechanisms - Any ETL processes that populate this table
Maintenance#
- Data Owner: TODO: Assign owner
- Update Frequency: TODO: Document frequency
- Last Major Schema Change: TODO: Document when schema last changed
Related Documentation#
TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards