13 Sample Related Tables

13.1 `aggregatesamples`

This table stores the samples in Aggregate Datasets.

aggregatedatasetid (primary key, foreign key): An arbitrary Aggregate Dataset identification number. Field links to the AggregateDatasets table.
sampleid (primary key, foreign key): Sample ID number. Field links to the Samples table.

13.2 `analysisunits`

This table stores the data for Analysis Units.

analysisunitid (primary key): An arbitrary Analysis Unit identification number.
collectionunitid (foreign key): Collection Unit ID number. Field links to CollectionUnits table. Every Analysis Unit belongs to a Collection Unit.
analysisunitname: Optional name for an Analysis Unit. Analysis Units are usually designated with either a depth or a name, sometimes both.
depth: Optional depth of the Analysis Unit in cm. Depths are typically designated for Analysis Units from cores and for Analysis Units excavated in arbitrary (e.g. 10 cm) levels. Depths are normally the midpoints of arbitrary levels. For example, for a level excavated from 10 to 20 cm or for a core section from 10 to 15 cm, the depth is 15. Designating depths as midpoints and thicknesses facilitates calculation of ages from age models that utilize single midpoint depths for Analysis Units rather than top and bottom depths. Of course, top and bottom depths can be calculated from midpoint depths and thicknesses. For many microfossil core samples, only the midpoint depths are known or published; the diameter or width of the sampling device is often not given.
thickness: Optional thickness of the Analysis Unit in cm. For many microfossil core samples, the depths are treated as points, and the thicknesses are not given in the publications, although 0.5 to 1.0 cm would be typical.
faciesid: Sedimentary facies of the Analysis Unit. Field links to the FaciesTypes table.
mixed: Indicates whether specimens in the Analysis Unit are of mixed ages, for example Pleistocene fossils occurring with late Holocene fossils. Although Analysis Units may be mixed, samples from the Analysis Unit may not be, for example individually radiocarbon dated specimens.
igsn: International Geo Sample Number. The IGSN is a unique identifier for a Geoscience sample. They are assigned by the SESAR, the System for Earth Sample Registration (www.geosamples.org), which is a registry that provides and administers the unique identifiers. IGSN’s may be assigned to all types of geoscience samples, including cores, rocks, minerals, and even fluids. Their purpose is to facilitate sharing and correlation of samples and sample-based data. For data in Neotoma, their primary value would be for correlation various samples from the same Analysis Units, for example pollen, charcoal, diatoms, and geochemical analyses. Conceivably, the AnalysisUnitID could be used for this purpose; however, IGSN’s could be assigned by projects before their data are submitted to the database. Moreover, AnalysisUnitID’s are intended to be internal to the database. Although IGSN’s could be assigned to Neotoma Collection Units and Samples, their primary value lies in their assignment to Analysis Units. IGSN’s are not yet assigned to Neotoma Analysis Units; however, that may change after consultation with SESAR.
notes: Free form notes or comments about the Analysis Unit.

13.3 `data`

The primary data table in the database. Each occurrence of a Variable in a sample comprises a record in the Data table.

sampleid (primary key, foreign key): Sample identification number. Field links to Samples table.
variableid (primary key, foreign key): Variable identification number. Field links to Variables table.
value: The value of the variable.

13.3.1 SQL Example

The following SQL example gives a list of vertebrate taxa by Analysis Unit for all sites. Also listed are Variable Measurement Units and Values.

SELECT
  au.analysisunitname,
  tx.taxonname,
  vu.variableunits,
  data.value
FROM ndb.datasettypes AS dst
  INNER JOIN ndb.datasets AS ds ON ds.datasettypeid = dst.datasettypeid
  INNER JOIN ndb.samples AS smp ON smp.datasetid = ds.datasetid
  INNER JOIN ndb.data AS data ON smp.sampleid = data.sampleid
  INNER JOIN ndb.variables AS var ON var.variableid = data.variableid
  INNER JOIN ndb.analysisunits AS au ON au.analysisunitid = smp.analysisunitid
  INNER JOIN ndb.taxa AS tx ON tx.taxonid = var.taxonid
  INNER JOIN ndb.variableunits AS vu ON vu.variableunitsid = var.variableunitsid
WHERE dst.datasettype = 'vertebrate fauna'
LIMIT 5;

Table 13.1: 5 records
analysisunitname	taxonname	variableunits	value
Assemblage	Antilocapra americana	present/absent	1
Assemblage	Erethizon dorsata	present/absent	1
Assemblage	Odocoileus hemionus	present/absent	1
Assemblage	Spermophilus sp.	present/absent	1
Component III	Antilocapra americana	MNI	1

13.4 `depagents`

Deposition Agents for Analysis Units. Individual Analysis Units may be listed multiple times with different Deposition Agents.

analysisunitid (primary key): Analysis Unit identification number. Field links to AnalysisUnits table.
depagentid: Deposition Agent identification number. Field links to DepAgentTypes table.

13.5 `depagenttypes`

Lookup table of Depositional Agents. Table is referenced by the DepAgents table.

depagentid (primary key): An arbitrary Depositional Agent identification number.
depagent: Depostional Agent.

13.6 `faciestypes`

Lookup table of Facies Types. Table is referenced by the AnalysisUnits table.

faciesid (primary key): An arbitrary Facies identification number.
facies: Short Facies description.

13.7 `keywords`

Lookup table of keywords referenced by the samplekeywords table. The table provides a means to identify samples sharing a common attribute. For example, the keyword “modern sample” identifies modern surface samples in the database. These samples include individual surface samples, as well as core tops. Although not implemented, a “pre-European settlement” keyword would be a means to identify samples just predating European settlement.

keywordid (primary key): An arbitrary Keyword identification number.
keyword: A keyword for identifying samples sharing a common attribute.

13.8 `lithology`

This table stores the lithologic descriptions of Collection Units.

lithologyid (primary key): An arbitrary identification number for a lithologic unit.
collectionunitid (foreign key): Collection Unit identification number. Field links to the CollectionUnits table.
depthtop: Depth of the top of the lithologic unit in cm.
depthbottom: Depth of the bottom of the lithologic unit in cm.
lowerboundary: Description of the nature of the lower boundary of the lithologic unit, e.g. «gradual, over ca. 10 cm».
description: Description of the lithologic unit. These can be quite detailed, with Munsell color or Troels-Smith descriptions. Some examples:
- interbedded gray silt and peat
- marly fine-detritus copropel
- humified sedge and Sphagnum peat
- sedge peat 5YR 5/4
- gray sandy loam with mammoth and other animal bones
- grey-green gyttja, oxidizing to gray-brown
- Ag 3, Ga 1, medium gray, firm, elastic
- nig3, strf0, elas2, sicc0; Th2 T12 Tb+
- Ld°4, , Dg+, Dh+

13.9 `sampleages`

This table stores sample ages. Ages are assigned to a Chronology. Because there may be more than one Chronology for a Collection Unit, samples may be assigned different ages for different Chronologies. A simple example is one sample age in radiocarbon years and another in calibrated radiocarbon years. The age units are an attribute of the Chronology.

sampleageid (primary key): An arbitrary Sample Age identification number.
sampleid (foreign key): Sample identification number. Field links to the Samples table.
chronologyid (foreign key): Chronology identification number. Field links to the Chronologies table.
age: Age of the sample
ageyounger: Younger error estimate of the age. The definition of this estimate is an attribute of the Chronology. Many ages do not have explicit error estimates assigned.
ageolder: Older error estimate of the age.

13.9.1 SQL Example

This query lists the Sample Ages for the default Chronologies for Muskox Lake. The CollectionUnit.Handle indicates that there is only one Collection Unit from this site. There are two default Chronologies, one in «Radiocarbon years BP» and the other in «Calibrated radiocarbon years BP».

SELECT
   st.sitename,
   cu.handle,
   array_agg(sa.age),
   at.agetype
FROM ndb.sites AS st
INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
INNER JOIN ndb.chronologies AS ch ON ch.collectionunitid = cu.collectionunitid
INNER JOIN ndb.sampleages AS sa ON sa.chronologyid = ch.chronologyid
INNER JOIN ndb.agetypes AS at ON at.agetypeid = ch.agetypeid
WHERE st.sitename = 'Muskox Lake'
  AND ch.isdefault = TRUE
GROUP BY st.sitename, cu.handle, at.agetype;

Table 13.2: 2 records
sitename	handle	array_agg	agetype
Muskox Lake	MUSKOX	{10200,-50,604,1258,1912,2567,3221,3875,4529,5183,5838,6492,7146,7800,8280,8760,9240,9720}	Calibrated radiocarbon years BP
Muskox Lake	MUSKOX	{-50,538,1125,1712,2300,2888,3475,4062,4650,5238,5825,6412,7000,7400,7800,8200,8600,9000}	Radiocarbon years BP

13.10 `sampleanalysts`

This table lists the Sample Analysts.

analystid (primary key): An arbitrary Sample Analyst identification number.
sampleid (foreign key): Sample identification number. Field links to the Samples table.
contactid (foreign key): Contact identification number. Field links to the Contacts table.
analystorder: Order in which Sample Analysts are listed if more than one (rare).

13.11 `samplekeywords`

This table links keywords to samples. For example, it identifies modern pollen surface samples.

sampleid (primary key, foreign key): Sample identification number. Field links to the Samples table.
keywordid (primary key, foreign key): Keyword identification number. Field links to the Keywords lookup table.

13.11.1 SQL Example

This query provides a list of modern pollen surface samples from the United States. We listed the site name, collection type, the contact person, and the depositional environment from which the sample was obtained. We link sites and geopoliticalunits using the sitegeopolitical table, then we link to the samples to find samples with the modern keyword.

SELECT smp.sampleid,
  st.sitename,
  cut.colltype,
  ct.contactname,
  dpv.depenvt,
  kw.keyword
FROM ndb.geopoliticalunits AS gpu
  INNER JOIN ndb.sitegeopolitical AS sgp ON sgp.geopoliticalid = gpu.geopoliticalid
  INNER JOIN ndb.sites AS st ON st.siteid = sgp.siteid
  INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
  INNER JOIN ndb.collectiontypes AS cut ON cut.colltypeid = cu.colltypeid
  INNER JOIN ndb.depenvttypes AS dpv ON dpv.depenvtid = cu.depenvtid
  INNER JOIN ndb.datasets AS ds ON ds.collectionunitid = cu.collectionunitid
  INNER JOIN ndb.datasettypes AS dst ON dst.datasettypeid = ds.datasettypeid
  INNER JOIN ndb.samples AS smp ON smp.datasetid = ds.datasetid
  INNER JOIN ndb.samplekeywords AS skw ON skw.sampleid = smp.sampleid
  INNER JOIN ndb.keywords AS kw ON kw.keywordid = skw.keywordid
  INNER JOIN ndb.datasetpis AS dspi ON dspi.datasetid = ds.datasetid
  INNER JOIN ndb.contacts AS ct ON ct.contactid = dspi.contactid
WHERE  dst.datasettype = 'pollen' 
  AND kw.keyword = 'modern'
  AND gpu.geopoliticalname = 'United States'
ORDER BY cu.colltypeid
LIMIT 10;

Table 13.3: Displaying records 1 - 10
sampleid	sitename	colltype	contactname	depenvt	keyword
39641	Ped Pond	Composite	Brubaker, Linda Beck	Fluvial Origin Lake	modern
60471	McNearney Lake	Composite	Shane, Linda C. K.	Glacial Origin Lake	modern
60505	McNearney Lake	Composite	Shane, Linda C. K.	Glacial Origin Lake	modern
38409	Otto Mielke Lake	Composite	Shane, Linda C. K.	Glacial Origin Lake	modern
139933	Little Horseshoe Lake	Composite	Cushing, Edward J., Jr.	Kettle Lake	modern
36864	Nels Lake	Composite	Shane, Linda C. K.	Glacial Origin Lake	modern
34528	McNearney Lake	Composite	Shane, Linda C. K.	Glacial Origin Lake	modern
1581	Angal Lake	Core	Brubaker, Linda Beck	Cirque Lake	modern
1476	Andrus Lake	Core	Shane, Linda C. K.	Glacial Origin Lake	modern
58893	Hells Kitchen Lake	Core	Swain, Albert M.	Kettle Lake	modern

13.12 `samples`

This table stores sample data. Samples are the link between analysisunits (which belong to the collectionunit) and a dataset of a particular datasettype. Thus, there can be multiple samples from the same analysis unit, one for each datasettype within the collection unit. For example one sample in an analysis unit at a collection unit might belong to a pollen dataset, the other to a plant macrofossil dataset.

sampleid (primary key): An arbitrary Sample identification number.
analysisunitid (foreign key): Analysis Unit identification number. Field links to the AnalysisUnits table.
datasetid (foreign key): Dataset identification number. Field links to the Datasets table.
samplename: Sample name if any.
analysisdate: Date of analysis.
labnumber: Laboratory number for the sample. A special case regards geochronologic samples, for which the LabNumber is the number, if any, assigned by the submitter, not the number assigned by the radiocarbon laboratory, which is in the Geochronology table.
preparationmethod: Description, notes, or comments on preparation methods. For faunal samples, notes on screening methods or screen size are stored here.
notes: Free form note or comments about the sample.

13.12.1 SQL Example

This query provides a list of samples from the site Canaleja. The collection unit name, analysis unit name, dataset type, and preparation methods are listed.

SELECT DISTINCT cu.collunitname,
  au.analysisunitname,
  dst.datasettype,
  smp.preparationmethod
FROM ndb.sites AS st
INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
INNER JOIN ndb.analysisunits AS au ON au.collectionunitid = cu.collectionunitid
INNER JOIN ndb.datasets AS ds ON ds.collectionunitid = cu.collectionunitid
INNER JOIN ndb.datasettypes AS dst ON dst.datasettypeid = ds.datasettypeid
INNER JOIN ndb.samples AS smp ON smp.analysisunitid = au.analysisunitid
WHERE st.sitename = 'Canaleja' AND dst.datasettype = 'pollen'
ORDER BY cu.collunitname, au.analysisunitname

Table 13.4: 7 records
collunitname	analysisunitname	datasettype	preparationmethod
Cueva II	NA	pollen	sample -0.89, level III, Old Neolithic
Cueva II	NA	pollen	sample -1.10, level V, Epipalaeolithic
Cueva II	NA	pollen	sample perfil A2, level III, Old Neolithic
Cueva II	NA	pollen	sample perfil A2, level IV, Old Neolithic
Cueva II	NA	pollen	sample perfil A2, level V, Epipalaeolithic
Cueva II	NA	pollen	sample perfil A2, level VI
Cueva II	NA	pollen	NA

13 Sample Related Tables

13.1 aggregatesamples

13.2 analysisunits

13.3 data

13.3.1 SQL Example

13.4 depagents

13.5 depagenttypes

13.6 faciestypes

13.7 keywords

13.8 lithology

13.9 sampleages