Skip to content

Table: ndb.sites#

Description#

The Sites table stores information about sites or localities, including name, geographic coordinates, and description. Sites generally have an areal extent and can be circumscribed by a latitude-longitude box. However, site data ingested from legacy databases have included only point locations. The lat-long box can be used either to circumscribe the aerial extent of a site or to provide purposeful imprecision to the site location. Site location may be imprecise because the original description was vague, e.g. «a gravel bar 5 miles east of town», or because the investigators, land owner, or land management agency may not want the exact location made public, perhaps to prevent looting and vandalism. In the first case, the lat-long box can be made sufficiently large to encompass the true location and in the second case to prevent exact location.

TODO: Expand this description with: - What data does this table store? - What is the business/research purpose? - How is this data collected or generated? - Are there any important caveats or data quality issues?

Table Structure#

Visual Schema

Schema: ndb | Table Comment: The Sites table stores information about sites or localities, including name, geographic coordinates, and description. Sites generally have an areal extent and can be circumscribed by a latitude-longitude box. However, site data ingested from legacy databases have included only point locations. The lat-long box can be used either to circumscribe the aerial extent of a site or to provide purposeful imprecision to the site location. Site location may be imprecise because the original description was vague, e.g. «a gravel bar 5 miles east of town», or because the investigators, land owner, or land management agency may not want the exact location made public, perhaps to prevent looting and vandalism. In the first case, the lat-long box can be made sufficiently large to encompass the true location and in the second case to prevent exact location.

Statistics#

Metric Value
Row Count 26,896
Total Size 1361 MB
Table Size 874 MB
Indexes Size 486 MB

Relationships#

Primary Key: siteid

No foreign key relationships.

Referenced By:

TODO: Document which tables reference this table (will be auto-detected in validation).

Data Dictionary#

Column Type Nullable Default Constraints Description
siteid integer nextval('ndb.seq_sites_site... PRIMARY KEY An arbitrary Site identification number.
sitename character varying(128) - - Name of the site. Alternative names, including archaeological site numbers, are placed in square brackets, for example:
New Paris #4 [Lloyd's Rock Hole]
Modoc Rock Shelter [11RA501]
A search of the SiteName field for any of the alternative names or for the archaeological site number will find the site. Some archaeological sites are known only by their site number.
Modifiers to site names are placed in parentheses. Authors are added for generic sites names, especially for surface samples, that are duplicated in the database, for example:
Site 1 (Heusser 1978)
Site 1 (Delcourt et al. 1983)
Site 1 (Elliot-Fisk et al. 1982)
Site 1 (Whitehead and Jackson 1990)
For actual site names duplicated in the database, the name is followed by the 2-letter country code and state or province, for example:
Silver Lake (US:Minnesota)
Silver Lake (CA:Nova Scotia)
Silver Lake (US:Ohio)
Silver Lake (US:Pennsylvania)
longitudeeast double precision - - East bounding longitude for a site.
latitudenorth double precision - - North bounding latitude for a site.
longitudewest double precision - - West bounding longitude for a site.
latitudesouth double precision - - South bounding latitude for a site.
altitude double precision - - Altitude of a site in meters.
area double precision - - Area of a site in hectares.
sitedescription text - - Free form description of a site, including such information as physiography and vegetation around the site.
notes text - - Free form notes or comments about the site.
recdatecreated timestamp without time zone timezone('UTC'::text, now()) -
recdatemodified timestamp without time zone - -
geog USER-DEFINED - -

TODO: Review column descriptions and add comments where missing.

Usage Examples#

Example 1: Basic Selection#

-- Get recent records from sites
SELECT *
FROM sites
ORDER BY siteid DESC
LIMIT 10;

Purpose: Retrieve the 10 most recent records from sites

Example 2: Count Records#

-- Count total records
SELECT COUNT(*) as total_records
FROM sites;

Purpose: Get the total number of records in sites

Example 3: Filter by Date Range#

-- Get records within a date range
SELECT *
FROM sites
WHERE recdatecreated >= '2024-01-01'
  AND recdatecreated < '2025-01-01'
ORDER BY recdatecreated DESC;

Purpose: Retrieve records from sites within a specific date range

Example 4: Aggregate Data#

-- Aggregate records by sitename
SELECT 
    sitename,
    COUNT(*) as count
FROM sites
GROUP BY sitename
ORDER BY count DESC
LIMIT 10;

Purpose: Count records grouped by sitename

TODO: Add more specific examples relevant to common research questions or operational tasks.

Data Quality Notes#

Automated Data Quality Tests#

This table is subject to the following automated quality checks:

❌ ref_003: sites_have_collection_units

  • Severity: ERROR
  • Status: FAILED
  • Description: All sites should have at least one collection unit.

  • Suggested Remediation: - Remove "floating" sites.

  • Ensure that the collection units have not been accidentally deleted.

✅ valid_001: coordinates_in_valid_range

  • Severity: ERROR
  • Status: PASSED
  • Description: Site coordinates must be within valid lat/long ranges

✅ valid_004: sites_not_on_equator

  • Severity: ERROR
  • Status: PASSED
  • Description: Some sites seem to have been placed at the equator for one reason or another.

See the Data Quality Report for details.

Maintenance#

  • Data Owner: TODO: Assign owner
  • Update Frequency: TODO: Document frequency
  • Last Major Schema Change: TODO: Document when schema last changed

TODO: Link to: - Related API endpoints - Data collection procedures - Analysis notebooks or reports that use this table - External ontologies or standards