• 1 Acknowledgements
  • 2 Introduction
    • 2.1 Whence Neotoma
    • 2.2 Rationale
    • 2.3 History of the Constituent Databases
      • 2.3.1 Global Pollen Database
      • 2.3.2 North American Plant Macrofossil Database
      • 2.3.3 FAUNMAP
      • 2.3.4 BEETLE
    • 2.4 Who Will Use Neotoma?
  • 3 Working with the Raw Data
    • 3.1 Using the Database Locally
  • 4 Using R
  • 5 Searching for Sites
    • 5.1 Site names: sitename="%Lago%"
      • 5.1.1 Code
      • 5.1.2 3.1.1.2. Result
    • 5.2 3.1.2. Location: loc=c()
      • 5.2.1 3.1.2.1. Code
      • 5.2.2 3.1.2.2. Result
  • 6 SQL Quickly
    • 6.1 SQL Example
      • 6.1.1 SQL Query
    • 6.2 Table Keys
    • 6.3 Data Types
      • 6.3.1 Query
  • 7 Database Design Concepts
    • 7.1 Sites, Collection Units, Analysis Units, Samples, and Datasets
    • 7.2 Taxa and Variables
    • 7.3 Taxonomy and Synonymy
    • 7.4 Taxa and Ecological Groups
    • 7.5 Chronology
    • 7.6 Sediment and Depositional Environments
    • 7.7 Date Fields
  • 8 Anatomy of a Neotoma Dataset
    • 8.1 The Minimum Object
  • 9 Neotoma Tables
    • 9.1 Site Related Tables
    • 9.2 Dataset & Collection Related Tables
    • 9.3 Chronology & Age Related Tables
    • 9.4 Sample Related Tables
    • 9.5 Specimen Related Tables
    • 9.6 Taxonomy Related Tables
    • 9.7 Individual Related Tables
    • 9.8 Publication Related Tables
    • 9.9 Supporting Resources
    • 9.10 Isotope Data Management
  • 10 Contact and Individual Related Tables
    • 10.1 collectors
      • 10.1.1 SQL Example
    • 10.2 contacts
    • 10.3 contactstatuses
  • 11 Dataset & Collection Related Tables
    • 11.1 accumulationrates
    • 11.2 aggregatedatasets
    • 11.3 aggregateordertypes
    • 11.4 collectiontypes
    • 11.5 collectionunits
    • 11.6 contextsdatasettypes
    • 11.7 datasetdatabases
    • 11.8 datasetdoi
    • 11.9 datasetpis
    • 11.10 datasetpublications
    • 11.11 datasets
      • 11.11.1 SQL Example
      • 11.11.2 SQL Example
    • 11.12 datasetsubmissions
    • 11.13 datasetsubmissiontypes
      • 11.13.1 SQL Example
    • 11.14 datasettaxagrouptypes
    • 11.15 datasettaxonnotes
    • 11.16 datasettypes
    • 11.17 datasetvariables
    • 11.18 depenvttypes
      • 11.18.1 SQL Example
      • 11.18.2 SQL Example
      • 11.18.3 SQL Example
  • 12 Publication Related Tables
    • 12.1 publicationauthors
      • 12.1.1 SQL Example
    • 12.2 publicationeditors
    • 12.3 publications
    • 12.4 publicationtypes
      • 12.4.1 Legacy
      • 12.4.2 Journal Article
      • 12.4.3 Book Chapter
      • 12.4.4 Authored Book
      • 12.4.5 Edited Book
      • 12.4.6 Master’s Thesis
      • 12.4.7 Doctoral Dissertation
      • 12.4.8 Authored Report
      • 12.4.9 Edited Report
      • 12.4.10 Other Authored Publication
      • 12.4.11 Other Edited Publication
  • 13 Sample Related Tables
    • 13.1 aggregatesamples
    • 13.2 analysisunits
    • 13.3 data
      • 13.3.1 SQL Example
    • 13.4 depagents
    • 13.5 depagenttypes
    • 13.6 faciestypes
    • 13.7 keywords
    • 13.8 lithology
    • 13.9 sampleages
      • 13.9.1 SQL Example
    • 13.10 sampleanalysts
    • 13.11 samplekeywords
      • 13.11.1 SQL Example
    • 13.12 samples
      • 13.12.1 SQL Example
  • 14 Site Related Tables
    • 14.1 geopoliticalunits
      • 14.1.1 SQL Example
      • 14.1.2 SQL Example
    • 14.2 lakeparameters
      • 14.2.1 SQL Example
    • 14.3 lakeparametertypes
      • 14.3.1 SQL Example
    • 14.4 sitegeopolitical
      • 14.4.1 SQL Example
    • 14.5 siteimages
    • 14.6 sites
      • 14.6.1 SQL Example
  • 15 Taxonomy Related Tables
    • 15.1 ecolgroups
      • 15.1.1 SQL Example
      • 15.1.2 SQL Example
    • 15.2 ecolgrouptypes
    • 15.3 ecolsettypes
    • 15.4 synonyms
    • 15.5 synonymtypes
      • 15.5.1 SQL Example
    • 15.6 taxa
    • 15.7 taxagrouptypes
    • 15.8 variables
      • 15.8.1 SQL Example
      • 15.8.2 SQL Example
      • 15.8.3 SQL Example
    • 15.9 variablecontexts
    • 15.10 variableelements
    • 15.11 variablemodifications
    • 15.12 variableunits
    • 15.13 repositoryinstitutions
    • 15.14 repositoryspecimens
      • 15.14.1 SQL Example
    • 15.15 specimendates
  • 16 Chronology & Age Related Tables
    • 16.1 agetypes
    • 16.2 aggregatechronologies
    • 16.3 chroncontrols
    • 16.4 chroncontroltypes
    • 16.5 chronologies
      • 16.5.1 SQL Example
      • 16.5.2 SQL Example
    • 16.6 aggregatesampleages
      • 16.6.1 SQL Example
      • 16.6.2 SQL Example
    • 16.7 geochronology
      • 16.7.1 SQL Example
    • 16.8 geochronpublications
    • 16.9 geochrontypes
    • 16.10 relativeagepublications
    • 16.11 relativeages
      • 16.11.1 SQL Example
    • 16.12 radiocarboncalibration
    • 16.13 relativeagescales
    • 16.14 relativeageunits
    • 16.15 relativechronology
    • 16.16 tephrachronology
    • 16.17 tephras
  • 17 Views and Materialized Views
    • 17.1 About Views (Briefly)
    • 17.2 Neotoma Views
      • 17.2.1 Schema ap
      • 17.2.2 Schema da
      • 17.2.3 Schema ‘db’
      • 17.2.4 Schema ndb
      • 17.2.5 Schema ti
      • 17.2.6 Schema ts
    • 17.3 Neotoma Materialized Views
      • 17.3.1 Schema ap
  • 18 References

Neotoma Paleoecology Manual v2.0

13 Sample Related Tables

13.1 aggregatesamples

This table stores the samples in Aggregate Datasets.

  • aggregatedatasetid (primary key, foreign key): An arbitrary Aggregate Dataset identification number. Field links to the AggregateDatasets table.
  • sampleid (primary key, foreign key): Sample ID number. Field links to the Samples table.

13.2 analysisunits

This table stores the data for Analysis Units.

  • analysisunitid (primary key): An arbitrary Analysis Unit identification number.
  • collectionunitid (foreign key): Collection Unit ID number. Field links to CollectionUnits table. Every Analysis Unit belongs to a Collection Unit.
  • analysisunitname: Optional name for an Analysis Unit. Analysis Units are usually designated with either a depth or a name, sometimes both.
  • depth: Optional depth of the Analysis Unit in cm. Depths are typically designated for Analysis Units from cores and for Analysis Units excavated in arbitrary (e.g. 10 cm) levels. Depths are normally the midpoints of arbitrary levels. For example, for a level excavated from 10 to 20 cm or for a core section from 10 to 15 cm, the depth is 15. Designating depths as midpoints and thicknesses facilitates calculation of ages from age models that utilize single midpoint depths for Analysis Units rather than top and bottom depths. Of course, top and bottom depths can be calculated from midpoint depths and thicknesses. For many microfossil core samples, only the midpoint depths are known or published; the diameter or width of the sampling device is often not given.
  • thickness: Optional thickness of the Analysis Unit in cm. For many microfossil core samples, the depths are treated as points, and the thicknesses are not given in the publications, although 0.5 to 1.0 cm would be typical.
  • faciesid: Sedimentary facies of the Analysis Unit. Field links to the FaciesTypes table.
  • mixed: Indicates whether specimens in the Analysis Unit are of mixed ages, for example Pleistocene fossils occurring with late Holocene fossils. Although Analysis Units may be mixed, samples from the Analysis Unit may not be, for example individually radiocarbon dated specimens.
  • igsn: International Geo Sample Number. The IGSN is a unique identifier for a Geoscience sample. They are assigned by the SESAR, the System for Earth Sample Registration (www.geosamples.org), which is a registry that provides and administers the unique identifiers. IGSN’s may be assigned to all types of geoscience samples, including cores, rocks, minerals, and even fluids. Their purpose is to facilitate sharing and correlation of samples and sample-based data. For data in Neotoma, their primary value would be for correlation various samples from the same Analysis Units, for example pollen, charcoal, diatoms, and geochemical analyses. Conceivably, the AnalysisUnitID could be used for this purpose; however, IGSN’s could be assigned by projects before their data are submitted to the database. Moreover, AnalysisUnitID’s are intended to be internal to the database. Although IGSN’s could be assigned to Neotoma Collection Units and Samples, their primary value lies in their assignment to Analysis Units. IGSN’s are not yet assigned to Neotoma Analysis Units; however, that may change after consultation with SESAR.
  • notes: Free form notes or comments about the Analysis Unit.

13.3 data

The primary data table in the database. Each occurrence of a Variable in a sample comprises a record in the Data table.

  • sampleid (primary key, foreign key): Sample identification number. Field links to Samples table.
  • variableid (primary key, foreign key): Variable identification number. Field links to Variables table.
  • value: The value of the variable.

13.3.1 SQL Example

The following SQL example gives a list of vertebrate taxa by Analysis Unit for all sites. Also listed are Variable Measurement Units and Values.

SELECT
  au.analysisunitname,
  tx.taxonname,
  vu.variableunits,
  data.value
FROM ndb.datasettypes AS dst
  INNER JOIN ndb.datasets AS ds ON ds.datasettypeid = dst.datasettypeid
  INNER JOIN ndb.samples AS smp ON smp.datasetid = ds.datasetid
  INNER JOIN ndb.data AS data ON smp.sampleid = data.sampleid
  INNER JOIN ndb.variables AS var ON var.variableid = data.variableid
  INNER JOIN ndb.analysisunits AS au ON au.analysisunitid = smp.analysisunitid
  INNER JOIN ndb.taxa AS tx ON tx.taxonid = var.taxonid
  INNER JOIN ndb.variableunits AS vu ON vu.variableunitsid = var.variableunitsid
WHERE dst.datasettype = 'vertebrate fauna'
LIMIT 5;
Table 13.1: 5 records
analysisunitname taxonname variableunits value
Assemblage Antilocapra americana present/absent 1
Assemblage Erethizon dorsata present/absent 1
Assemblage Odocoileus hemionus present/absent 1
Assemblage Spermophilus sp. present/absent 1
Component III Antilocapra americana MNI 1

13.4 depagents

Deposition Agents for Analysis Units. Individual Analysis Units may be listed multiple times with different Deposition Agents.

  • analysisunitid (primary key): Analysis Unit identification number. Field links to AnalysisUnits table.
  • depagentid: Deposition Agent identification number. Field links to DepAgentTypes table.

13.5 depagenttypes

Lookup table of Depositional Agents. Table is referenced by the DepAgents table.

  • depagentid (primary key): An arbitrary Depositional Agent identification number.
  • depagent: Depostional Agent.

13.6 faciestypes

Lookup table of Facies Types. Table is referenced by the AnalysisUnits table.

  • faciesid (primary key): An arbitrary Facies identification number.
  • facies: Short Facies description.

13.7 keywords

Lookup table of keywords referenced by the samplekeywords table. The table provides a means to identify samples sharing a common attribute. For example, the keyword “modern sample” identifies modern surface samples in the database. These samples include individual surface samples, as well as core tops. Although not implemented, a “pre-European settlement” keyword would be a means to identify samples just predating European settlement.

  • keywordid (primary key): An arbitrary Keyword identification number.
  • keyword: A keyword for identifying samples sharing a common attribute.

13.8 lithology

This table stores the lithologic descriptions of Collection Units.

  • lithologyid (primary key): An arbitrary identification number for a lithologic unit.
  • collectionunitid (foreign key): Collection Unit identification number. Field links to the CollectionUnits table.
  • depthtop: Depth of the top of the lithologic unit in cm.
  • depthbottom: Depth of the bottom of the lithologic unit in cm.
  • lowerboundary: Description of the nature of the lower boundary of the lithologic unit, e.g. «gradual, over ca. 10 cm».
  • description: Description of the lithologic unit. These can be quite detailed, with Munsell color or Troels-Smith descriptions. Some examples:
    • interbedded gray silt and peat
    • marly fine-detritus copropel
    • humified sedge and Sphagnum peat
    • sedge peat 5YR 5/4
    • gray sandy loam with mammoth and other animal bones
    • grey-green gyttja, oxidizing to gray-brown
    • Ag 3, Ga 1, medium gray, firm, elastic
    • nig3, strf0, elas2, sicc0; Th2 T12 Tb+
    • Ld°4, , Dg+, Dh+

13.9 sampleages

This table stores sample ages. Ages are assigned to a Chronology. Because there may be more than one Chronology for a Collection Unit, samples may be assigned different ages for different Chronologies. A simple example is one sample age in radiocarbon years and another in calibrated radiocarbon years. The age units are an attribute of the Chronology.

  • sampleageid (primary key): An arbitrary Sample Age identification number.
  • sampleid (foreign key): Sample identification number. Field links to the Samples table.
  • chronologyid (foreign key): Chronology identification number. Field links to the Chronologies table.
  • age: Age of the sample
  • ageyounger: Younger error estimate of the age. The definition of this estimate is an attribute of the Chronology. Many ages do not have explicit error estimates assigned.
  • ageolder: Older error estimate of the age.

13.9.1 SQL Example

This query lists the Sample Ages for the default Chronologies for Muskox Lake. The CollectionUnit.Handle indicates that there is only one Collection Unit from this site. There are two default Chronologies, one in «Radiocarbon years BP» and the other in «Calibrated radiocarbon years BP».

SELECT
   st.sitename,
   cu.handle,
   array_agg(sa.age),
   at.agetype
FROM ndb.sites AS st
INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
INNER JOIN ndb.chronologies AS ch ON ch.collectionunitid = cu.collectionunitid
INNER JOIN ndb.sampleages AS sa ON sa.chronologyid = ch.chronologyid
INNER JOIN ndb.agetypes AS at ON at.agetypeid = ch.agetypeid
WHERE st.sitename = 'Muskox Lake'
  AND ch.isdefault = TRUE
GROUP BY st.sitename, cu.handle, at.agetype;
Table 13.2: 2 records
sitename handle array_agg agetype
Muskox Lake MUSKOX {10200,-50,604,1258,1912,2567,3221,3875,4529,5183,5838,6492,7146,7800,8280,8760,9240,9720} Calibrated radiocarbon years BP
Muskox Lake MUSKOX {-50,538,1125,1712,2300,2888,3475,4062,4650,5238,5825,6412,7000,7400,7800,8200,8600,9000} Radiocarbon years BP

13.10 sampleanalysts

This table lists the Sample Analysts.

  • analystid (primary key): An arbitrary Sample Analyst identification number.
  • sampleid (foreign key): Sample identification number. Field links to the Samples table.
  • contactid (foreign key): Contact identification number. Field links to the Contacts table.
  • analystorder: Order in which Sample Analysts are listed if more than one (rare).

13.11 samplekeywords

This table links keywords to samples. For example, it identifies modern pollen surface samples.

  • sampleid (primary key, foreign key): Sample identification number. Field links to the Samples table.
  • keywordid (primary key, foreign key): Keyword identification number. Field links to the Keywords lookup table.

13.11.1 SQL Example

This query provides a list of modern pollen surface samples from the United States. We listed the site name, collection type, the contact person, and the depositional environment from which the sample was obtained. We link sites and geopoliticalunits using the sitegeopolitical table, then we link to the samples to find samples with the modern keyword.

SELECT smp.sampleid,
  st.sitename,
  cut.colltype,
  ct.contactname,
  dpv.depenvt,
  kw.keyword
FROM ndb.geopoliticalunits AS gpu
  INNER JOIN ndb.sitegeopolitical AS sgp ON sgp.geopoliticalid = gpu.geopoliticalid
  INNER JOIN ndb.sites AS st ON st.siteid = sgp.siteid
  INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
  INNER JOIN ndb.collectiontypes AS cut ON cut.colltypeid = cu.colltypeid
  INNER JOIN ndb.depenvttypes AS dpv ON dpv.depenvtid = cu.depenvtid
  INNER JOIN ndb.datasets AS ds ON ds.collectionunitid = cu.collectionunitid
  INNER JOIN ndb.datasettypes AS dst ON dst.datasettypeid = ds.datasettypeid
  INNER JOIN ndb.samples AS smp ON smp.datasetid = ds.datasetid
  INNER JOIN ndb.samplekeywords AS skw ON skw.sampleid = smp.sampleid
  INNER JOIN ndb.keywords AS kw ON kw.keywordid = skw.keywordid
  INNER JOIN ndb.datasetpis AS dspi ON dspi.datasetid = ds.datasetid
  INNER JOIN ndb.contacts AS ct ON ct.contactid = dspi.contactid
WHERE  dst.datasettype = 'pollen' 
  AND kw.keyword = 'modern'
  AND gpu.geopoliticalname = 'United States'
ORDER BY cu.colltypeid
LIMIT 10;
Table 13.3: Displaying records 1 - 10
sampleid sitename colltype contactname depenvt keyword
39641 Ped Pond Composite Brubaker, Linda Beck Fluvial Origin Lake modern
60471 McNearney Lake Composite Shane, Linda C. K. Glacial Origin Lake modern
60505 McNearney Lake Composite Shane, Linda C. K. Glacial Origin Lake modern
38409 Otto Mielke Lake Composite Shane, Linda C. K. Glacial Origin Lake modern
139933 Little Horseshoe Lake Composite Cushing, Edward J., Jr. Kettle Lake modern
36864 Nels Lake Composite Shane, Linda C. K. Glacial Origin Lake modern
34528 McNearney Lake Composite Shane, Linda C. K. Glacial Origin Lake modern
1581 Angal Lake Core Brubaker, Linda Beck Cirque Lake modern
1476 Andrus Lake Core Shane, Linda C. K. Glacial Origin Lake modern
58893 Hells Kitchen Lake Core Swain, Albert M. Kettle Lake modern

13.12 samples

This table stores sample data. Samples are the link between analysisunits (which belong to the collectionunit) and a dataset of a particular datasettype. Thus, there can be multiple samples from the same analysis unit, one for each datasettype within the collection unit. For example one sample in an analysis unit at a collection unit might belong to a pollen dataset, the other to a plant macrofossil dataset.

  • sampleid (primary key): An arbitrary Sample identification number.
  • analysisunitid (foreign key): Analysis Unit identification number. Field links to the AnalysisUnits table.
  • datasetid (foreign key): Dataset identification number. Field links to the Datasets table.
  • samplename: Sample name if any.
  • analysisdate: Date of analysis.
  • labnumber: Laboratory number for the sample. A special case regards geochronologic samples, for which the LabNumber is the number, if any, assigned by the submitter, not the number assigned by the radiocarbon laboratory, which is in the Geochronology table.
  • preparationmethod: Description, notes, or comments on preparation methods. For faunal samples, notes on screening methods or screen size are stored here.
  • notes: Free form note or comments about the sample.

13.12.1 SQL Example

This query provides a list of samples from the site Canaleja. The collection unit name, analysis unit name, dataset type, and preparation methods are listed.

SELECT DISTINCT cu.collunitname,
  au.analysisunitname,
  dst.datasettype,
  smp.preparationmethod
FROM ndb.sites AS st
INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
INNER JOIN ndb.analysisunits AS au ON au.collectionunitid = cu.collectionunitid
INNER JOIN ndb.datasets AS ds ON ds.collectionunitid = cu.collectionunitid
INNER JOIN ndb.datasettypes AS dst ON dst.datasettypeid = ds.datasettypeid
INNER JOIN ndb.samples AS smp ON smp.analysisunitid = au.analysisunitid
WHERE st.sitename = 'Canaleja' AND dst.datasettype = 'pollen'
ORDER BY cu.collunitname, au.analysisunitname
Table 13.4: 7 records
collunitname analysisunitname datasettype preparationmethod
Cueva II NA pollen sample -0.89, level III, Old Neolithic
Cueva II NA pollen sample -1.10, level V, Epipalaeolithic
Cueva II NA pollen sample perfil A2, level III, Old Neolithic
Cueva II NA pollen sample perfil A2, level IV, Old Neolithic
Cueva II NA pollen sample perfil A2, level V, Epipalaeolithic
Cueva II NA pollen sample perfil A2, level VI
Cueva II NA pollen NA