Neotoma Paleoecology Manual v2.0

15 Taxonomy Related Tables

15.1 ecolgroups

Ecological groups organize taxa within particular ecological settings, which are defined in the ecolsettypes table. Some taxa have a single associated ecological group; others are associated with multiple groups. The taxon Abies, a tree genus, belongs to a single ecological group (TRSH, trees and shrubs), while the taxon cf. Larix, also a tree genus, is associated with the TRSH group as well as the ecological group UNID. UNID represents unidentified taxa: the qualifier cf., as described for the taxa table, means that the identified pollen resembles, but is not necessarily, Larix.

This table is the join table that links each taxon (taxa table) to an ecological group (ecolgrouptypes table, which holds the full descriptive name of each group) within a given ecological set (ecolsettypes table).

  • taxonid (primary key, foreign key): Taxon identification number. The field links to the taxa table.
  • ecolsetid (primary key, foreign key): Ecological Set identification number. Field links to the ecolsettypes table.
  • ecolgroupid (foreign key): A four-letter Ecological Group identification code. Field links to the ecolgrouptypes table.
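
The key structure above can be tried out with a minimal sketch. It uses an in-memory SQLite database as a stand-in for the Neotoma database, so the ndb schema prefix is dropped. The table and column names follow the field list above, but every identification number and row is invented for illustration. The sketch assumes the composite primary key is (taxonid, ecolsetid), as the field list suggests, which means a taxon can only carry more than one ecological group by appearing in more than one ecological set.

```python
import sqlite3

# In-memory SQLite stand-in for the ecolgroups join table.
# Column names follow the manual; all ids and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ecolgroups (
    taxonid     INTEGER NOT NULL,
    ecolsetid   INTEGER NOT NULL,
    ecolgroupid TEXT    NOT NULL,
    PRIMARY KEY (taxonid, ecolsetid)
);
""")

ABIES, CF_LARIX = 1, 2   # hypothetical taxonid values
SET_A, SET_B = 10, 11    # hypothetical ecolsetid values

conn.executemany(
    "INSERT INTO ecolgroups VALUES (?, ?, ?)",
    [
        (ABIES, SET_A, "TRSH"),
        (CF_LARIX, SET_A, "TRSH"),
        (CF_LARIX, SET_B, "UNID"),
    ],
)


def groups_for(taxonid):
    """Return the ecological group codes recorded for one taxon."""
    rows = conn.execute(
        "SELECT ecolgroupid FROM ecolgroups WHERE taxonid = ? ORDER BY ecolsetid",
        (taxonid,),
    )
    return [row[0] for row in rows]


# A second group for the same taxon in the same set violates the composite key.
try:
    conn.execute("INSERT INTO ecolgroups VALUES (?, ?, ?)", (ABIES, SET_A, "UNID"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Calling groups_for(CF_LARIX) returns both group codes, while groups_for(ABIES) returns only TRSH.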

15.1.1 SQL Example

The following query produces a list of the Ecological Groups for all vascular plants (VPL) within the database.

SELECT DISTINCT
   tx.taxagroupid,
   eg.ecolgroupid,
   est.ecolsetname,
   egt.ecolgroup
FROM ndb.ecolgroups AS eg
INNER JOIN   ndb.ecolsettypes AS est ON   est.ecolsetid = eg.ecolsetid
INNER JOIN ndb.ecolgrouptypes AS egt ON egt.ecolgroupid = eg.ecolgroupid
INNER JOIN           ndb.taxa AS tx  ON      tx.taxonid = eg.taxonid
WHERE
   tx.taxagroupid = 'VPL';
Table 15.1: Displaying records 1 - 10
taxagroupid ecolgroupid ecolsetname ecolgroup
VPL PALM Default plant Palms
VPL VACR Default plant Terrestrial Vascular Cryptogams
VPL PCON Default plant form taxa Pan-Coniferae
VPL SEED Default plant Spermatophyte rank or clade above order
VPL ANAC Default plant Anachronic
VPL EMBR Default plant Embryophyta
VPL EUDI Default plant form taxa Eudicotyledoneae
VPL TRSH Default plant form taxa Trees and Shrubs
VPL UPHE Default plant Upland Herbs
VPL PLNT Default plant Plant

15.1.2 SQL Example

This query lists all the taxa in the Ecological Group Sirenia.

SELECT
   egt.ecolgroup,
   tx.taxonname
FROM
   ndb.taxa AS tx
   INNER JOIN     ndb.ecolgroups AS  eg ON      tx.taxonid = eg.taxonid
   INNER JOIN ndb.ecolgrouptypes AS egt ON egt.ecolgroupid = eg.ecolgroupid
WHERE egt.ecolgroup = 'Sirenia';
Table 15.2: 9 records
ecolgroup taxonname
Sirenia Dugongidae
Sirenia Hydrodamalis gigas
Sirenia Sirenia
Sirenia Trichechidae
Sirenia Trichechus manatus
Sirenia Hydrodamalis
Sirenia Trichechus
Sirenia Dugong
Sirenia ?Dugong sp.

15.2 ecolgrouptypes

Lookup table of Ecological Group Types. The table is referenced by the ecolgroups table.

  • ecolgroupid (primary key): A four-letter Ecological Group identification code (for example, TRSH).
  • ecolgroup: Ecological Group.

15.3 ecolsettypes

Lookup table of Ecological Set Types. The table is referenced by the ecolgroups table.

  • ecolsetid (primary key): An arbitrary Ecological Set identification number.
  • ecolsetname: The Ecological Set name.

15.4 synonyms

This table lists common synonyms for taxa in the taxa table. No effort has been made to provide a complete taxonomic synonymy, but rather to list synonyms commonly used in recent literature.

  • synonymid (primary key): An arbitrary synonym identification number.
  • synonymname: Name of the synonym.
  • taxonid (foreign key): Identification number of the accepted taxon in Neotoma. This field links to the taxa table.
  • publicationid (foreign key): Published authority for synonymy. Field links to publications table.
  • synonymtypeid (foreign key): Type of synonym. Field links to the synonymtypes lookup table.
  • notes: Free form notes or comments about the synonymy.
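
As a rough illustration of how the synonyms table can be used, the sketch below resolves a synonym to its accepted taxon name. It runs against an in-memory SQLite database rather than the Neotoma database, keeps only the columns needed for the lookup, and uses invented identification numbers. The two synonym pairs are taken from the examples in this chapter; the function name accepted_name is hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in; only the columns needed for the lookup are kept.
# All identification numbers are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxa (
    taxonid   INTEGER PRIMARY KEY,
    taxonname TEXT NOT NULL
);
CREATE TABLE synonyms (
    synonymid   INTEGER PRIMARY KEY,
    synonymname TEXT NOT NULL,
    taxonid     INTEGER NOT NULL REFERENCES taxa (taxonid)
);
INSERT INTO taxa VALUES (1, 'Poaceae'), (2, 'Dalea');
INSERT INTO synonyms VALUES (1, 'Gramineae', 1), (2, 'Petalostemon', 2);
""")


def accepted_name(name):
    """Return the accepted taxon name for a synonym, or the name unchanged."""
    row = conn.execute(
        """
        SELECT tx.taxonname
        FROM synonyms AS syn
        INNER JOIN taxa AS tx ON tx.taxonid = syn.taxonid
        WHERE syn.synonymname = ?
        """,
        (name,),
    ).fetchone()
    return row[0] if row else name
```

A name that is not listed as a synonym is returned as given, which keeps the helper safe to apply to names that are already accepted.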

15.5 synonymtypes

Lookup table of Synonym Types. Table is referenced by the synonyms table.

  • synonymtypeid (primary key): An arbitrary Synonym Type identification number.
  • synonymtype: Synonym type. Below are some examples:
    • nomenclatural, homotypic, or objective synonym: a synonym that unambiguously refers to the same taxon, particularly one with the same description or type specimen. These synonyms are particularly common above the species level. For example, Gramineae = Poaceae, Clethrionomys gapperi = Myodes gapperi. The term objective is used in zoology, whereas nomenclatural or homotypic is used in botany.
    • taxonomic, heterotypic, or subjective synonym: a synonym typically based on a different type specimen, but which is now regarded as the same taxon as the senior synonym. For example, Iva ciliata = Iva annua. The term subjective is used in zoology, whereas taxonomic or heterotypic is used in botany.
    • genus merged into another genus: heterotypic or subjective synonym; a genus has been merged into another genus and has not been retained at a subgeneric rank. This synonymy may apply to either the generic or specific level, for example: Petalostemon = Dalea, Petalostemon purpureus = Dalea purpurea.
    • family merged into another family: heterotypic or subjective synonym; a family has been merged into another family and has not been retained at a subfamilial rank. For example, the Taxodiaceae has been merged with the Cupressaceae. This synonymy creates issues for data entry, because palynologically the Taxodiaceae sensu stricto is sometimes distinguishable from the Cupressaceae sensu stricto. If a pollen type was identified as Cupressaceae/Taxodiaceae, then synonymizing to Cupressaceae results in no loss of information. However, synonymizing Taxodiaceae to Cupressaceae potentially does. In this case, consultation with the original literature or knowledge of the local biogeography may point to a logical name change that will retain the precision of the original identification. For example, in the southeastern United States, Taxodiaceae can be changed to Taxodium or Taxodium-type in most situations. If Cupressaceae was also identified, then it should be changed to Cupressaceae undiff. or possibly Juniperus-type if other Cupressaceae such as Chamaecyparis are unlikely.
    • rank change: species reduced to subspecific rank: heterotypic or subjective synonym; a species has been reduced to a subspecies or variety of another species. These synonyms may be treated in two different ways, depending on the situation or protocols of the contributing data cooperative: (1) The taxon is reduced to the subspecific rank (e.g. Alnus fruticosa = Alnus viridis subsp. fruticosa, Canis familiaris = Canis lupus familiaris), either because the fossils can be assigned to the subspecies based on morphology, as is likely the case with the domestic dog, Canis lupus familiaris, or because the subspecies can be assigned confidently based on biogeography. (2) The taxon is changed to the new taxon and the subspecific rank is dropped because the fossil is not distinguishable at the subspecific level. For example, Alnus rugosa = Alnus incana subsp. rugosa, but may simply be changed to Alnus incana because the pollen of A. incana subsp. rugosa and A. incana subsp. incana are indistinguishable morphologically.
    • rank change: genus reduced to subgenus: heterotypic or subjective synonym; a genus has been reduced to subgeneric rank in another genus. At the generic level, this synonymy is clear from the naming conventions, e.g. Mictomys = Synaptomys (Mictomys); however, at the species level it is not, e.g. Mictomys borealis = Synaptomys borealis.
    • rank change: family reduced to subfamily: heterotypic or subjective synonym; a family has been reduced to subfamily rank in another family. By botanical convention the family name is retained, e.g. Pyrolaceae = Ericaceae subf. Monotropoideae; whereas by zoological convention it is not, e.g. Desmodontidae = Desmodontinae.
    • rank change: subspecific rank elevated to species: heterotypic or subjective synonym; a subspecies or variety has been raised to the species rank, e.g. Ephedra fragilis subsp. campylopoda = Ephedra foeminea.
    • rank change: subgeneric rank elevated to genus: heterotypic or subjective synonym; a subgenus or other subgeneric rank has been raised to the generic rank. At the subgeneric level, this synonymy is clear from the naming conventions, e.g. Potamogeton subg. Coleogeton = Stuckenia; however, at the species level it is not, e.g. Potamogeton pectinatus = Stuckenia pectinata.
    • rank change: subfamily elevated to family: heterotypic or subjective synonym; a subfamily has been raised to the family rank, e.g. Liliaceae subf. Amaryllidoideae = Amaryllidaceae, Pampatheriinae = Pampatheriidae.
    • rank elevated because of taxonomic uncertainty: because the precise taxonomic identification is uncertain, the rank has been raised to a level that includes the universe of possible taxa. A common cause of such uncertainty is taxonomic splitting subsequent to the original identification, in which case the originally identified taxon is now a much smaller group. For example, the genus Psoralea has been divided into several genera; the genus Psoralea still exists, but now includes a much smaller number of species. Consequently, in the database Psoralea has been synonymized with Fabaceae tribe Psoraleeae, which includes the former Psoralea sensu lato. A zoological example is Mustela sp. The genus Mustela formerly included the minks, which have now been separated into the genus Neovison. Consequently, Mustela sp. = Mustela/Neovison sp.
    • globally monospecific genus: although identified at the genus level, specimens assigned to this genus can be further assigned to the species level because the genus is monospecific.
    • globally monogeneric family: although identified at the family level, specimens assigned to this family can be further assigned to the genus level because the family is monogeneric.

15.5.1 SQL Example

This query provides the preferred synonym in the database for Bison alleni along with the published authority for the synonymy and the notes in the database on the rationale for the synonymy. The notes indicate some potential problems with this synonymy.

SELECT 
    syntx.taxonname AS original,
       tx.taxonname AS preferred,
       pub.citation AS publication,
          syn.notes AS synonymnotes
FROM ndb.taxa AS tx
  INNER JOIN     ndb.synonymy AS syn   ON       syn.taxonid = tx.taxonid
  INNER JOIN         ndb.taxa AS syntx ON    syn.reftaxonid = syntx.taxonid
  INNER JOIN ndb.publications AS pub   ON syn.publicationid = pub.publicationid
WHERE syntx.taxonname = 'Bison alleni';
Table 15.3: 5 records
original preferred publication synonymnotes
Bison alleni Bison latifrons Hill, M.E., Jr., and J.L. Hofman. 1997. The Waugh Site: a Folsom-age bison bonebed in northwestern Oklahoma. Plains Anthropologist 42(159, Mem):63-83. NA
Bison alleni Bison latifrons Dalquest, W.W., and G.E. Schultz. 1992. Ice Age mammals of northwestern Texas. Midwestern State University Press, Wichita Falls, Texas, USA. NA
Bison alleni Bison latifrons Hill, M.E. 1996. Paleoindian bison remains from 12 Mile Creek site in western Kansas. Plains Anthropologist 41(158):359-372. NA
Bison alleni Bison latifrons Slaughter, B.H., W.W. Crook, Jr., R.K. Harris, D.C. Allen, and M. Seifert. 1962. The Hill-Shuler local faunas of the upper Trinity River, Dallas and Denton counties, Texas. Report of Investigations 48, Bureau of Economic Geology, University of Texas, Austin, Texas, USA. [DOI: 10.23967/RI0048D.RI0048D] NA
Bison alleni Bison latifrons Slaughter, B.H., W.W. Crook, Jr., R.K. Harris, D.C. Allen, and M. Seifert. 1962. The Hill-Shuler local faunas of the upper Trinity River, Dallas and Denton counties, Texas. Report of Investigations 48, Bureau of Economic Geology, University of Texas, Austin, Texas, USA. [DOI: 10.23967/RI0048D.RI0048D] NA

15.6 taxa

This table lists all taxa in the database. Most taxa are biological taxa; however, some are biometric measures and some are physical parameters.

  • taxonid (primary key): An arbitrary Taxon identification number.
  • taxoncode: A code for the Taxon. These codes are useful for other software or output for which the complete name is too long. Because of the very large number of taxa, codes can be duplicated for different Taxa Groups. In general, these various Taxa Groups are analyzed separately, and no duplication will occur within a dataset. However, if Taxa Groups are combined, unique codes can be generated by prefixing with the TaxaGroupID; for example, VPL:Cle (Clethra) and MAM:Cle (Clethrionomys). A set of conventions has been established for codes. In some cases conventions differ depending on whether the organism is covered by rules of botanical nomenclature (BN) or zoological nomenclature (ZN).
    • Genus: Three-letter code, first letter capitalized, generally the first three letters of the genus name unless already used: Ace (Acer) or Cle (Clethrionomys).
    • Subgenus: The genus code plus a two-letter subgenus code, first letter capitalized, separated by a period: Pin.Pi (Pinus subg. Pinus) or Syn.Mi (Synaptomys (Mictomys)).
    • Species: The genus code plus a two-letter, lower-case species code, separated by a period: Ace.sa (Acer saccharum), Ace.sc (Acer saccharinum), or Cle.ga (Clethrionomys gapperi)
    • Subspecies or variety: The species code plus a two-letter, lower-case subspecies code, separated by a period: Aln.vi.si (Alnus viridis subsp. sinuata), or Bis.bi.an (Bison bison antiquus)
    • Family: Six-letter code, first letter capitalized, consisting of three letters followed by eae (BN) or dae (ZN): Roseae (Rosaceae), or Bovdae (Bovidae)
    • Subfamily or tribe: (BN) Family code plus two-letter subfamily code, first letter capitalized, separated by a period. (ZN) Six-letter code, first letter capitalized, consisting of three letters followed by nae: Asteae.As (Asteraceae subf. Asteroideae), Asteae.Cy (Asteraceae tribe Cynareae), or Arvnae (Arvicolinae).
    • Order: (BN) Six-letter code, first letter capitalized, consisting of three letters followed by les. (ZN) Six-letter code, first letter capitalized, consisting of three letters, followed by the last three letters of the order name, unless the order name is ≤6 letters long, in which case the code = the order name. Zoological orders do not have a common ending: Ercles (Ericales), Artyla (Artiodactyla), or Rodtia (Rodentia).
    • Taxonomic levels higher than order: Six-letter code, first letter capitalized, consisting of three letters, followed by the last three letters of the taxon name, unless the taxon name is ≤6 letters long, in which case the code = the taxon name: Magida (Magnoliopsida), Magyta (Magnoliophyta), or Mamlia (Mammalia).
    • Types: The conventional taxon code followed by -t: Aln.in-t (Alnus incana-type), Amb-t (Ambrosia-type)
    • cf.: cf. is placed in the proper position: Odc.cf.he (Odocoileus cf. O. hemionus), cf.Odc.he (cf. Odocoileus hemionus), or cf.Odc (cf. Odocoileus).
    • aff.: aff. is abbreviated to af.: af.Can.di (aff. Canis dirus)
    • ?: ? is placed in the proper position: ?Pro.lo (?Procyon lotor)
    • Alternative names: A slash is placed between the conventional abbreviations for the alternative taxa: Ost/Cpn (Ostrya/Carpinus), or Mstdae/Mepdae (Mustelidae/Mephitidae)
    • Undifferentiated taxa: (BN) .ud is added to the code. (ZN) .sp is added to the code: Aln.ud (Alnus undiff.), Roseae.ud (Rosaceae undiff.), Mms.sp (Mammuthus sp.), or Taydae.sp (Tayassuidae sp.).
    • Parenthetic modifiers: The conventional taxon code with an appropriate abbreviation for the modifier separated by periods. Multiple modifiers also separated by periods. Abbreviations for pollen morphological modifiers follow Iversen and Troels-Smith (1950): Raneae.C3 (Ranunculaceae (tricolpate)), Raneae.Cperi (Ranunculaceae (pericolpate)), Pineae.ves.ud (Pinaceae (vesiculate) undiff.), Myteae.Csyn.psi (Myrtaceae (syncolpate, psilate)), Bet.>20µ (Betula (>20 µm))
    • Non-biological taxa: Use appropriate abbreviations: bulk.dens (Bulk density), LOI (Loss-on-ignition), Bet.pol.diam (Betula mean pollen-grain diameter).
  • taxonname: Name of the taxon. Most taxon names are biological taxa; however, some are biometric measures and some are physical parameters. In addition, some biological taxa may have parenthetic non-Latin modifiers, e.g. Betula (>20 µm) for Betula pollen grains >20 µm in diameter. In general, the names used in Neotoma are those used by the original investigator. In particular, identifications are not changed, although Dataset notes can be added to the database regarding particular identifications. However, some corrections and synonymizations are made.
    • These include:

      • Misspellings are corrected.
      • Nomenclatural, homotypic, or objective synonyms may be applied. Because these synonyms unambiguously refer to the same taxon, no change in identification is implied. For example, the old family name for the grasses Gramineae is changed to Poaceae.
      • Taxonomic, heterotypic, or subjective synonyms may be applied if the change does not effectively assign the specimen to a different taxon. Although two names may have been based on different type specimens, if further research has shown that these are in fact the same taxon, the name is changed to the accepted name. These synonymizations should not cause confusion. However, uncritical synonymization, although taxonomically correct, can result in loss of information, and should be avoided. For example, although a number of recent studies have shown that the Taxodiaceae should be merged with the Cupressaceae, simply synonymizing Taxodiaceae with Cupressaceae may expand the universe of taxa beyond that implied by the original investigator. For example, a palynologist in the southeastern United States may have used Taxodiaceae to imply Taxodium, which is the only genus of the family that has occurred in the region since the Pliocene, but used the family name because, palynologically, Taxodium cannot be differentiated from other Taxodiaceae. However, well preserved Taxodium pollen grains can be differentiated from the other cupressaceous genera in the region, Juniperus and Chamaecyparis. Thus, the appropriate synonymization for Taxodiaceae in this region would be Taxodium or Taxodium-type, which would retain the original taxonomic precision. On the other hand, the old TCT shorthand for Taxodiaceae/Cupressaceae/Taxaceae now becomes Cupressaceae/Taxaceae with no loss of information.
      • For alternative taxonomic designations, the order may be changed. For example, Ostrya/Carpinus would be substituted for Carpinus/Ostrya.

      The database has a number of conventions for uncertainty in identification. The uncertainty is included in the taxon name. Thus, Acer pensylvanicum and Acer cf. A. pensylvanicum are two different taxa.

      • cf.: Latin confer, which means compare. In taxonomy cf. generally means that the specimen compares well to or is similar to the type referred, but the identification is uncertain. Uncertainty may arise for a number of reasons. The specimen may not be well preserved. It may be nondescript. There may be other similar taxa that cannot be ruled out. The analyst may not have access to a complete reference or comparative collection for the group, so other related taxa cannot be excluded with certainty. For uncertainty at the species level, the convention in Neotoma is, for example, Odocoileus cf. O. hemionus, not Odocoileus cf. hemionus. Placement of cf. is important, because it indicates the taxonomic level of uncertainty. For example, Odocoileus cf. O. hemionus implies that the identification of Odocoileus is secure, but that the species identification is not; whereas cf. Odocoileus hemionus implies that not even the genus identification is certain. A further implication in the latter example is that if the genus identification is correct, then the specimen must also be that species, perhaps because of biogeographic considerations. Although commonly overlooked, it is also important to indicate the proper level of uncertainty in family-genus identifications. For example, Brassicaceae cf. Brassica implies that assignment to the Brassicaceae is secure; whereas cf. Brassica does not indicate that even the family identification is certain. In FAUNMAP, the uncertainty is recorded in a separate field from the taxon name, and for species it is not discernible whether the uncertainty is at the genus or species level. When data were imported from FAUNMAP, the cf. uncertainty was conservatively assigned to the genus level. Thus, if Bison bison was indicated to have cf. uncertainty, this record was imported as cf. Bison bison rather than Bison cf. B. bison. However, in many cases, the uncertainty in the original data was probably at the species level.
      • aff.: aff., Latin affinis, which means having affinity with, but distinct from, the referred taxon. This designation is often applied to a taxon thought to be undescribed. Thus, aff. Canis dirus implies an affinity to Canis dirus, but the specimen is likely from another species.
      • ?: ? is used to designate a questionable identification. It may indicate even less certainty than cf.. An example is ?Procyon lotor.
      • Types: Many pollen taxa are designated as types, e.g. Ambrosia-type. A type denotes a morphological type that is consistent with the referred taxon, but also includes other taxa that are palynologically indistinguishable. For example, Ambrosia-type includes Ambrosia and Iva axillaris. The referred name commonly indicates the sporophyte taxon thought to be the most probable source of the pollen. An analyst may choose a -type designation referring to a lower taxonomic rank rather than an inclusive higher taxonomic rank because the referred taxon is thought to be the source taxon with very high probability. For example, in eastern North America, Pinus strobus is the only species of Pinus subg. Strobus, although several other species of this subgenus occur in western North America. Consequently, some analysts refer to Pinus strobus-type rather than Pinus subg. Strobus. Ideally, a -type would comprise a well defined universe of taxa, but in practice types are often vaguely defined. For example, in eastern North America, Populus balsamifera-type includes a large proportion of P. balsamifera and probably smaller proportions of P. tremuloides, P. grandidentata, and P. deltoides; whereas Populus tremuloides-type includes larger proportions of these latter three species and a smaller proportion of P. balsamifera. However, these proportions are ill-defined.
      • Alternative taxonomic designations: In some cases, fossil specimens of two taxa are indistinguishable and are more-or-less equally likely. The names can then be separated by a slash, e.g. Ostrya/Carpinus, Mustelidae/Mephitidae. If one taxon is more likely, the analyst may choose to use a -type designation instead, e.g. Ostrya-type. Although the order of alternative names may be changed by the database, a -type designation is not substituted for alternatives. However, the use of more than two alternatives is discouraged. In cases in which taxonomic revisions have reduced the number of species within a taxon, the original universe of species may be retained with the slash designation. An example is Mustelidae, which in older literature included the skunks, which have now been placed in their own family, the Mephitidae; thus Mustelidae/Mephitidae retains the original set of possible taxa.
      • Undifferentiated taxa: Lower taxonomic ranks may not be differentiated. The convention among palynologists is to specify these by the suffix undiff. Thus, Rosaceae undiff. designates undifferentiated Rosaceae. However, palynologists have inconsistently applied the undiff. appellation, and the pollen databases established a convention that taxa must be mutually exclusive within a dataset. Thus, if a higher-rank taxon is present in a dataset, the undiff. suffix is applied only if lower-rank taxa are also present. For example, if Spiraea occurs in a dataset, Rosaceae would be changed to Rosaceae undiff., because Spiraea is a genus in the family Rosaceae. On the other hand, if Rosaceae undiff. occurs with no other Rosaceae, then Rosaceae undiff. is changed to simply Rosaceae; it is implicit that the family is not differentiated.

      Faunal analysts customarily use the appellation sp. to designate undifferentiated taxa. Thus, Microtus sp. indicates undifferentiated Microtus. In addition, faunal analysts regularly use the sp. designation even when no lower-rank taxa are identified. The sp. appellation is most frequently used with genera. The principle of taxonomic mutual exclusivity has not been applied to faunal datasets, although it should probably be considered.

  • author: Author(s) of the name. Neither the pollen database nor FAUNMAP stored author names, so these do not currently exist in Neotoma for plant and mammal names. These databases follow standard taxonomic references (e.g. Flora of North America, Flora Europaea, Wilson and Reeder's Mammal Species of the World), which, of course, do cite the original authors. However, for beetles, the standard practice is to cite original author names; therefore, this field was added to Neotoma.
  • highertaxonid: The taxonid of the next higher taxonomic rank. For example, the highertaxonid for Bison is the taxonid for Bovidae. For cf.'s and -types, the next higher rank may be much higher owing to the uncertainty of the identification; the highertaxonid for cf. Bison bison is the taxonid for Mammalia. The highertaxonid implements the taxonomic hierarchy in Neotoma.
  • extinct: Boolean (True/False) variable. The value is True if the taxon is extinct, False if extant.
  • taxagroupid (foreign key): The TaxaGroupID facilitates rapid extraction of taxa groups that are typically grouped together for analysis. Some of these groups contain taxa in different classes or phyla. For example, vascular plants include the Spermatophyta and Pteridophyta; the herps include Reptilia and Amphibia; the testate amoebae include taxa from different phyla. Field links to the TaxaGroupTypes table.
  • publicationid (foreign key): Publication identification number. Field links to the Publications table.
  • notes: Free form notes or comments about the Taxon.
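
Because highertaxonid points each taxon at its parent, the hierarchy can be walked with a recursive query. The following is a minimal sketch against an in-memory SQLite database, not the Neotoma database itself. The identification numbers are invented, the hierarchy is shortened (Bovidae is placed directly under Mammalia), and it is an assumption that the top of the hierarchy has a NULL highertaxonid; the query also stops if a taxon lists itself as its own parent. The function name lineage is hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for a simplified taxa table.
# Ids are hypothetical; the root is assumed to have a NULL highertaxonid.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxa (
    taxonid       INTEGER PRIMARY KEY,
    taxonname     TEXT NOT NULL,
    highertaxonid INTEGER
);
INSERT INTO taxa VALUES
    (1, 'Mammalia',        NULL),
    (2, 'Bovidae',         1),
    (3, 'Bison',           2),
    (4, 'Bison bison',     3),
    (5, 'cf. Bison bison', 1);
""")


def lineage(taxonname):
    """Return the taxon followed by each successively higher taxon."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain (taxonid, taxonname, highertaxonid, depth) AS (
            SELECT taxonid, taxonname, highertaxonid, 0
            FROM taxa
            WHERE taxonname = ?
            UNION ALL
            SELECT t.taxonid, t.taxonname, t.highertaxonid, chain.depth + 1
            FROM taxa AS t
            INNER JOIN chain ON t.taxonid = chain.highertaxonid
            WHERE t.taxonid <> chain.taxonid  -- stop at a self-referencing root
        )
        SELECT taxonname FROM chain ORDER BY depth
        """,
        (taxonname,),
    )
    return [row[0] for row in rows]
```

In this toy data, the lineage of cf. Bison bison jumps straight to Mammalia, mirroring the description of highertaxonid for uncertain identifications.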

15.7 taxagrouptypes

Lookup table for Taxa Group Types. This table is referenced by the taxa table.

  • taxagroupid (primary key): A three-letter Taxa Group code.
  • taxagroup: The taxa group. Below are some examples:
SELECT * FROM ndb.taxagrouptypes
LIMIT 5;
Table 15.4: 5 records
taxagroupid  taxagroup        recdatecreated       recdatemodified
ACR          Acritarchs       2013-09-30 14:03:04  2013-09-30 14:03:04
ALG          Algae            2013-09-30 14:03:04  2013-09-30 14:03:04
AMB          Ambiguous names  2013-09-30 14:03:04  2013-09-30 14:03:04
ANL          Annelids         2013-09-30 14:03:04  2013-09-30 14:03:04
ANM          Animals undiff.  2013-09-30 14:03:04  2013-09-30 14:03:04
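
As a sketch of how the taxa group code supports group-level extraction, the query below counts the taxa in each group and how many of them are flagged extinct. It assumes PostgreSQL (for the FILTER clause) and that extinct is stored as a true Boolean, as described in the taxa table; the output aliases are illustrative.

```sql
SELECT tgt.taxagroupid,
       tgt.taxagroup,
       COUNT(*)                           AS taxa,          -- all taxa in the group
       COUNT(*) FILTER (WHERE tx.extinct) AS extinct_taxa   -- only taxa flagged extinct
FROM ndb.taxa AS tx
INNER JOIN ndb.taxagrouptypes AS tgt ON tgt.taxagroupid = tx.taxagroupid
GROUP BY tgt.taxagroupid,
         tgt.taxagroup
ORDER BY taxa DESC
LIMIT 10;
```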

15.8 variables

This table lists variables, which always consist of a taxon (linked by a taxonid) and units of measurement. Variables can also have variable elements, variable contexts, and variable modifications. Thus, the same taxon recorded with different measurement units (e.g. present/absent, NISP, MNI) constitutes different Variables.

  • variableid (primary key): An arbitrary Variable identification number.
  • taxonid (foreign key): Taxon identification number. Field links to the Taxa table.
  • variableelementid (foreign key): Variable Element identification number. Field links to the VariableElements lookup table.
  • variableunitsid (foreign key): Variable Units identification number. Field links to the VariableUnits lookup table.
  • variablecontextid (foreign key): Variable Context identification number. Field links to the VariableContexts lookup table.
  • variablemodificationid (foreign key): Variable Modification identification number. Field links to the VariableModifications lookup table.

15.8.1 SQL Example

This query lists the different variable expressions for Zea mays, with their elements and measurement units:

SELECT 
    tx.taxonname,
    ve.variableelement,
    vu.variableunits
FROM 
               ndb.taxa             AS tx
    INNER JOIN ndb.variables        AS var ON          var.taxonid = tx.taxonid
    INNER JOIN ndb.variableunits    AS vu  ON   vu.variableunitsid = var.variableunitsid
    INNER JOIN ndb.variableelements AS ve  ON ve.variableelementid = var.variableelementid
WHERE tx.taxonname = 'Zea mays'
GROUP BY tx.taxonname, 
         ve.variableelement,
         vu.variableunits;
Table 15.5: 5 records
taxonname  variableelement  variableunits
Zea mays   cob              NISP
Zea mays   glume            NISP
Zea mays   kernel           NISP
Zea mays   pollen           NISP
Zea mays   stalk fiber      present/absent

15.8.2 SQL Example

This query lists all sites with Zea mays pollen by designating the VariableElement as pollen. This helps disambiguate sites where we may find plant macrofossil samples, or perhaps environmental DNA. Similarly, we may search for charcoal with size constraints (variableelement='>100 µm').

SELECT DISTINCT tx.taxonname, 
       ve.variableelement, 
       st.sitename
FROM 
    ndb.taxa AS tx
    INNER JOIN ndb.variables        AS var ON          var.taxonid = tx.taxonid
    INNER JOIN ndb.variableelements AS ve  ON ve.variableelementid = var.variableelementid
    INNER JOIN ndb.data             AS dt  ON var.variableid = dt.variableid
    INNER JOIN ndb.samples          AS smp ON smp.sampleid = dt.sampleid
    INNER JOIN ndb.datasets         AS ds  ON ds.datasetid = smp.datasetid
    INNER JOIN ndb.collectionunits  AS cu  ON cu.collectionunitid = ds.collectionunitid
    INNER JOIN ndb.sites            AS st  ON st.siteid = cu.siteid
WHERE tx.taxonname = 'Zea mays'
  AND ve.variableelement = 'pollen'
LIMIT 10;
Table 15.6: Displaying records 1 - 10
taxonname  variableelement  sitename
Zea mays   pollen           Etang de la Gruère
Zea mays   pollen           Piano
Zea mays   pollen           Bouara
Zea mays   pollen           Laguna Cerritos
Zea mays   pollen           Tatli Gölü
Zea mays   pollen           Lago Lungo
Zea mays   pollen           Lake Caranã
Zea mays   pollen           Segna
Zea mays   pollen           Solum Lake
Zea mays   pollen           Ayauchi122

We can obtain a similar result if we search instead for all records from pollen datasets, as below. However, it may be the case that pollen is observed and recorded in datasets that are not, strictly speaking, pollen datasets.

SELECT DISTINCT tx.taxonname, 
       dst.datasettype, 
       st.sitename
FROM 
    ndb.taxa AS tx
    INNER JOIN ndb.variables        AS var ON          var.taxonid = tx.taxonid
    INNER JOIN ndb.data             AS dt  ON var.variableid = dt.variableid
    INNER JOIN ndb.samples          AS smp ON smp.sampleid = dt.sampleid
    INNER JOIN ndb.datasets         AS ds  ON ds.datasetid = smp.datasetid
    INNER JOIN ndb.datasettypes     AS dst ON dst.datasettypeid = ds.datasettypeid
    INNER JOIN ndb.collectionunits  AS cu  ON cu.collectionunitid = ds.collectionunitid
    INNER JOIN ndb.sites            AS st  ON st.siteid = cu.siteid
WHERE tx.taxonname = 'Zea mays'
  AND dst.datasettype = 'pollen'
LIMIT 10;
Table 15.7: Displaying records 1 - 10
taxonname  datasettype  sitename
Zea mays   pollen       Aalkistensee
Zea mays   pollen       Adriatic Sea 108MAY90
Zea mays   pollen       Älbi Flue
Zea mays   pollen       Aletschwald
Zea mays   pollen       Alter Rachelsee
Zea mays   pollen       Anteojos Valley
Zea mays   pollen       Bachalpsee
Zea mays   pollen       Baldeggersee
Zea mays   pollen       Balikh
Zea mays   pollen       Ballyduff

15.8.3 SQL Example

This example gives a list of all sites with samples that have been ascribed some form of taphonomic change.

SELECT DISTINCT tx.taxonname, 
  tt.taphonomictype,
  st.sitename
FROM         ndb.taphonomictypes AS tt
INNER JOIN ndb.specimentaphonomy AS spet ON spet.taphonomictypeid = tt.taphonomictypeid
INNER JOIN         ndb.specimens AS spe  ON       spe.specimenid = spet.specimenid
INNER JOIN              ndb.data AS dt   ON            dt.dataid = spe.dataid
INNER JOIN         ndb.variables AS var  ON       var.variableid = dt.variableid
INNER JOIN              ndb.taxa AS tx   ON          var.taxonid = tx.taxonid
INNER JOIN           ndb.samples AS smp  ON         smp.sampleid = dt.sampleid
INNER JOIN          ndb.datasets AS ds   ON         ds.datasetid = smp.datasetid
INNER JOIN   ndb.collectionunits AS cu   ON  cu.collectionunitid = ds.collectionunitid
INNER JOIN             ndb.sites AS st   ON            st.siteid = cu.siteid;
Table 15.8: Displaying records 1 - 10
taxonname             taphonomictype     sitename
Artiodactyla          fragment           Blaine site [39CU1144]
Bison                 fragment           Blaine site [39CU1144]
Bison bison           cancellous         Wold Bison Jump [48JO966]
Bison bison           decorated          Bison Alcove [42GR538]
Eremophila alpestris  frozen carcass     Belaya Gora Horned Lark
Leporidae             thermally altered  Atkinson site [DiMe-27]
Mammalia              burned             Blaine site [39CU1144]
Mammalia              fragment           Blaine site [39CU1144]
Mammalia (medium)     burned             Atkinson site [DiMe-27]
Mammalia (small)      thermally altered  Atkinson site [DiMe-27]

15.9 variablecontexts

Variable Contexts lookup table. Table is referenced by the variables table.

  • variablecontextid (primary key): An arbitrary Variable Context identification number.
  • variablecontext: Depositional context. Examples are:
    • anachronic: A specimen older than the primary deposit, e.g. a Paleozoic spore in a Holocene deposit. The specimen may be redeposited from the catchment, or may be derived from long distance, e.g. Tertiary pollen grains in Quaternary sediments with no local Tertiary source. A Pleistocene specimen in a Holocene archaeological deposit, possibly resulting from aboriginal fossil collecting, would also be anachronic.
    • intrusive: A specimen, generally younger than the primary deposit, e.g. a domestic pig in an otherwise Pleistocene deposit.
    • redeposited: A specimen older than the primary deposit and assumed to have been redeposited from a local source by natural causes.
    • articulated: An articulated skeleton.
    • clump: A clump, especially of pollen grains.
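
To see which depositional contexts are actually in use, one option is to count the variables assigned to each context. This is a minimal sketch, assuming PostgreSQL and the field names listed above; variables with no context assigned drop out of the result because of the inner join.

```sql
SELECT vc.variablecontext,
       COUNT(*) AS variables   -- number of variables carrying this context
FROM ndb.variables AS var
INNER JOIN ndb.variablecontexts AS vc ON vc.variablecontextid = var.variablecontextid
GROUP BY vc.variablecontext
ORDER BY variables DESC;
```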

15.10 variableelements

Lookup table of Variable Elements. Table is referenced by the variables table.

  • variableelementid (primary key): An arbitrary Variable Element identification number.
  • variableelement: The element, part, or organ of the taxon identified. For plants, these include pollen, spores, and various macrofossil organs, such as seed, twig, cone, and cone bract. Thus, Betula pollen and Betula seeds are two different Variables. For mammals, variableelements include the bone or tooth identified, e.g. tibia; tibia, distal, left; M2, lower, left. Some more unusual elements are Neotoma fecal pellets and Erethizon dorsatum quills. If no element is indicated for mammalian fauna, then the generic element bone/tooth is assigned. Elements were not assigned in FAUNMAP, so all Variables ingested from FAUNMAP were assigned the bone/tooth element. Physical Variables may also have elements. For example, the Loss-on-ignition Variables have Loss-on-ignition as a Taxon, and temperature of analysis as an element, e.g. 500°C, 900°C. Charcoal Variables have the size fragments as elements, e.g. 75-100 µm, 100-125 µm.

15.11 variablemodifications

Lookup table of Variable Modifications. Table is referenced by the variables table.

  • variablemodificationid (primary key): An arbitrary Variable Modification identification number.
  • variablemodification: Modification to a specimen. Examples of modifications to bones include carnivore gnawed, rodent gnawed, burned, human butchering. Modifications to pollen grains include various preservation states, e.g. 1/2 grains, degraded, corroded, broken. Most Variables do not have a modification assigned.
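
Since most Variables carry no modification (and may lack a context), a query that expands a variable into all of its parts is probably best written with LEFT JOINs on the optional lookup tables, so that variables without those attributes are still returned. The sketch below assumes PostgreSQL and the field names listed in this chapter; the choice of 'Bison bison' is only illustrative.

```sql
SELECT tx.taxonname,
       ve.variableelement,
       vu.variableunits,
       vc.variablecontext,
       vm.variablemodification
FROM ndb.variables AS var
INNER JOIN ndb.taxa                  AS tx ON tx.taxonid = var.taxonid
-- Optional attributes: LEFT JOIN keeps variables where these are not assigned.
LEFT JOIN  ndb.variableelements      AS ve ON ve.variableelementid = var.variableelementid
LEFT JOIN  ndb.variableunits         AS vu ON vu.variableunitsid = var.variableunitsid
LEFT JOIN  ndb.variablecontexts      AS vc ON vc.variablecontextid = var.variablecontextid
LEFT JOIN  ndb.variablemodifications AS vm ON vm.variablemodificationid = var.variablemodificationid
WHERE tx.taxonname = 'Bison bison'
LIMIT 10;
```

Rows in which variablecontext or variablemodification come back as NULL are variables with no context or modification assigned.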

15.12 variableunits

Lookup table of Variable Units. Table is referenced by the variables table.

  • variableunitsid (primary key): An arbitrary Variable Units identification number.
  • variableunits: The units of measurement. For fauna, these are present/absent, NISP (Number of Identified Specimens), and MNI (Minimum Number of Individuals). For pollen, these are NISP (pollen counts) and percent. Units for plant macrofossils include present/absent and NISP, as well as a number of quantitative concentration measurements and semi-quantitative abundance measurements such as 1-5 scale. Examples of charcoal measurement units are fragments/ml and µm^2/ml.

15.13 repositoryinstitutions

A lookup table of institutions that are repositories for fossil specimens. Table is referenced by the repositoryspecimens table.

  • repositoryid (primary key): An arbitrary Repository identification number. Repositories include museums, university departments, and various governmental agencies.
  • acronym: A unique acronym for the repository. Many repositories have well-established acronyms (e.g. AMNH = American Museum of Natural History); however, there is no official list. Various acronyms have been used for some institutions, and in some cases the same acronym has been used for different institutions. Consequently, the database acronym may differ from the acronym used in some publications. For example, CMNH has been used for the Carnegie Museum of Natural History, the Cleveland Museum of Natural History, and the Cincinnati Museum of Natural History. In Neotoma, two of these institutions were assigned different acronyms, ones that have been used for them in other publications: CM – Carnegie Museum of Natural History, CLM – Cleveland Museum of Natural History.
  • repository: The full name of the physical sample repository.
  • notes: Free form notes or comments about the physical sample repository, especially notes about name changes, closures, and specimen transfers. In some cases, it is known that the specimens were transferred, but their current disposition may be uncertain.
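
Because published acronyms can be ambiguous, it may be safer to look a repository up by name and read the acronym Neotoma assigned to it. A minimal sketch, assuming PostgreSQL (ILIKE is a PostgreSQL extension for case-insensitive matching); the search pattern is only an example.

```sql
SELECT ri.acronym,
       ri.repository,
       ri.notes
FROM ndb.repositoryinstitutions AS ri
-- Case-insensitive substring match on the full repository name.
WHERE ri.repository ILIKE '%Museum of Natural History%'
ORDER BY ri.acronym;
```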

15.14 repositoryspecimens

This table lists the repositories in which fossil specimens have been accessioned or reposited. The specimens in Neotoma are linked to the dataset, the collection of specimens of a single datasettype within a collectionunit. When specimens from a single dataset have been reposited at several institutions, there will be multiple records for that dataset in the repositoryspecimens table.

  • datasetid (primary key, foreign key): Dataset identification number. Field links to the datasets table.
  • repositoryid (primary key, foreign key): Repository identification number. Field links to the repositoryinstitutions lookup table.
  • notes: Free form notes or comments about the disposition of the specimens.

15.14.1 SQL Example

This query lists the physical sample repositories for all specimens from the Kimmswick site, and counts the records linked to each repository institution (the specimens column).

SELECT st.sitename, 
  cu.collunitname,
  COUNT(*) AS specimens,
  ri.repository
FROM ndb.sites AS st
INNER JOIN        ndb.collectionunits AS cu ON           cu.siteid = st.siteid
INNER JOIN               ndb.datasets AS ds ON ds.collectionunitid = cu.collectionunitid
INNER JOIN    ndb.repositoryspecimens AS rs ON        rs.datasetid = ds.datasetid
INNER JOIN ndb.repositoryinstitutions AS ri ON     ri.repositoryid = rs.repositoryid
WHERE st.sitename = 'Kimmswick'
GROUP BY st.sitename,
         cu.collunitname,
         ri.repository;
Table 15.9: 2 records
sitename   collunitname  specimens  repository
Kimmswick  Locality      1          Illinois State Museum
Kimmswick  Locality      1          Mastodon State Historic Site, Missouri

15.15 specimendates

This table enables queries for dated specimens of individual taxa. Although the materialdated field in the geochronology table may list the taxa dated, this protocol is not enforced, and the field is not linked to the taxa table.

  • specimendateid (primary key): An arbitrary specimen date ID.
  • geochronid (foreign key): Geochronologic identification number. Field links to the geochronology table.
  • taxonid (foreign key): Accepted name in Neotoma. Field links to taxa table.
  • variableelementid (foreign key): Variable Element identification number. Field links to the variableelements lookup table.
  • sampleid (primary key, foreign key): Sample ID number. Field links to the samples table.
  • notes: Free form notes or comments about the dated specimen.
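
The kind of query this table enables can be sketched as follows: dated specimens of one taxon are found through the taxa link, and the free-text materialdated field is shown alongside for comparison. This assumes PostgreSQL and only the fields named in this chapter; the taxon name is illustrative, and the element is left-joined on the assumption that it may not always be recorded.

```sql
SELECT tx.taxonname,
       ve.variableelement,
       sd.sampleid,
       gc.geochronid,
       gc.materialdated   -- free text, not linked to the taxa table
FROM ndb.specimendates AS sd
INNER JOIN ndb.taxa             AS tx ON tx.taxonid = sd.taxonid
INNER JOIN ndb.geochronology    AS gc ON gc.geochronid = sd.geochronid
LEFT JOIN  ndb.variableelements AS ve ON ve.variableelementid = sd.variableelementid
WHERE tx.taxonname = 'Bison bison'
LIMIT 10;
```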