Table: ndb.sites#
Description#
The Sites table stores information about sites or localities, including name, geographic coordinates, and description. Sites generally have an areal extent and can be circumscribed by a latitude-longitude box. However, site data ingested from legacy databases have included only point locations. The lat-long box can be used either to circumscribe the aerial extent of a site or to provide purposeful imprecision to the site location. Site location may be imprecise because the original description was vague, e.g. «a gravel bar 5 miles east of town», or because the investigators, land owner, or land management agency may not want the exact location made public, perhaps to prevent looting and vandalism. In the first case, the lat-long box can be made sufficiently large to encompass the true location and in the second case to prevent exact location.
TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?
Table Structure#
Schema: ndb | Table Comment: The Sites table stores information about sites or localities, including name, geographic coordinates, and description. Sites generally have an areal extent and can be circumscribed by a latitude-longitude box. However, site data ingested from legacy databases have included only point locations. The lat-long box can be used either to circumscribe the aerial extent of a site or to provide purposeful imprecision to the site location. Site location may be imprecise because the original description was vague, e.g. «a gravel bar 5 miles east of town», or because the investigators, land owner, or land management agency may not want the exact location made public, perhaps to prevent looting and vandalism. In the first case, the lat-long box can be made sufficiently large to encompass the true location and in the second case to prevent exact location.
Statistics#
| Metric | Value |
|---|---|
| Row Count | 26,896 |
| Total Size | 1361 MB |
| Table Size | 874 MB |
| Indexes Size | 486 MB |
Relationships#
Primary Key: siteid
No foreign key relationships.
Referenced By:
TODO: Document which tables reference this table (will be auto-detected in validation).
Data Dictionary#
| Column | Type | Nullable | Default | Constraints | Description |
|---|---|---|---|---|---|
siteid |
integer | ✗ | nextval('ndb.seq_sites_site... |
PRIMARY KEY | An arbitrary Site identification number. |
sitename |
character varying(128) | ✓ | - |
- | Name of the site. Alternative names, including archaeological site numbers, are placed in square brackets, for example: |
| New Paris #4 [Lloyd's Rock Hole] | |||||
| Modoc Rock Shelter [11RA501] | |||||
| A search of the SiteName field for any of the alternative names or for the archaeological site number will find the site. Some archaeological sites are known only by their site number. | |||||
| Modifiers to site names are placed in parentheses. Authors are added for generic sites names, especially for surface samples, that are duplicated in the database, for example: | |||||
| Site 1 (Heusser 1978) | |||||
| Site 1 (Delcourt et al. 1983) | |||||
| Site 1 (Elliot-Fisk et al. 1982) | |||||
| Site 1 (Whitehead and Jackson 1990) | |||||
| For actual site names duplicated in the database, the name is followed by the 2-letter country code and state or province, for example: | |||||
| Silver Lake (US:Minnesota) | |||||
| Silver Lake (CA:Nova Scotia) | |||||
| Silver Lake (US:Ohio) | |||||
| Silver Lake (US:Pennsylvania) | |||||
longitudeeast |
double precision | ✓ | - |
- | East bounding longitude for a site. |
latitudenorth |
double precision | ✓ | - |
- | North bounding latitude for a site. |
longitudewest |
double precision | ✓ | - |
- | West bounding longitude for a site. |
latitudesouth |
double precision | ✓ | - |
- | South bounding latitude for a site. |
altitude |
double precision | ✓ | - |
- | Altitude of a site in meters. |
area |
double precision | ✓ | - |
- | Area of a site in hectares. |
sitedescription |
text | ✓ | - |
- | Free form description of a site, including such information as physiography and vegetation around the site. |
notes |
text | ✓ | - |
- | Free form notes or comments about the site. |
recdatecreated |
timestamp without time zone | ✗ | timezone('UTC'::text, now()) |
- | |
recdatemodified |
timestamp without time zone | ✗ | - |
- | |
geog |
USER-DEFINED | ✓ | - |
- |
TODO: Review column descriptions and add comments where missing.
Usage Examples#
Example 1: Basic Selection#
Purpose: Retrieve the 10 most recent records from sites
Example 2: Count Records#
Purpose: Get the total number of records in sites
Example 3: Filter by Date Range#
-- Get records within a date range
SELECT *
FROM sites
WHERE recdatecreated >= '2024-01-01'
AND recdatecreated < '2025-01-01'
ORDER BY recdatecreated DESC;
Purpose: Retrieve records from sites within a specific date range
Example 4: Aggregate Data#
-- Aggregate records by sitename
SELECT
sitename,
COUNT(*) as count
FROM sites
GROUP BY sitename
ORDER BY count DESC
LIMIT 10;
Purpose: Count records grouped by sitename
TODO: Add more specific examples relevant to common research questions or operational tasks.
Data Quality Notes#
Automated Data Quality Tests#
This table is subject to the following automated quality checks:
❌ ref_003: sites_have_collection_units
- Severity: ERROR
- Status: FAILED
-
Description: All sites should have at least one collection unit.
-
Suggested Remediation: - Remove "floating" sites.
- Ensure that the collection units have not been accidentally deleted.
✅ valid_001: coordinates_in_valid_range
- Severity: ERROR
- Status: PASSED
- Description: Site coordinates must be within valid lat/long ranges
✅ valid_004: sites_not_on_equator
- Severity: ERROR
- Status: PASSED
- Description: Some sites seem to have been placed at the equator for one reason or another.
See the Data Quality Report for details.
Maintenance#
- Data Owner: TODO: Assign owner
- Update Frequency: TODO: Document frequency
- Last Major Schema Change: TODO: Document when schema last changed
Related Documentation#
TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards