SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS
Végh, E.I., Douka, K., 2024. SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS. Bioinformatics 40.
Abstract
Motivation
Zooarchaeology by Mass Spectrometry (ZooMS) is a palaeoproteomics method for the taxonomic determination of collagen, which traditionally involves challenging manual spectra analysis with limitations in quantitative results. As the ZooMS reference database expands, a faster and reproducible identification tool is necessary. Here we present SpecieScan, an open-access algorithm for automating taxa identification from raw MALDI-ToF mass spectrometry (MS) data.
Results
SpecieScan was developed using R (pre-processing) and Python (automation). The algorithm’s output includes identified peptide markers, closest matching taxonomic group (taxon, family, order), correlation scores with the reference databases, and contaminant peaks present in the spectra. Testing on original MS data from bones discovered at Palaeolithic archaeological sites, including Denisova Cave in Russia, as well as using publicly-available, externally produced data, we achieved >90% accuracy at the genus-level and ∼92% accuracy at the family-level for mammalian bone collagen previously analysed manually.
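The general idea of matching observed peaks against reference peptide markers can be illustrated with a minimal Python sketch. This is not the SpecieScan implementation: the function names, the m/z tolerance, the reference values, and the scoring rule (fraction of a taxon's markers found, used here in place of the correlation score the abstract mentions) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical reference markers (m/z values). Illustrative numbers only,
# not taken from the SpecieScan reference database.
REFERENCE: Dict[str, List[float]] = {
    "taxon_A": [1105.6, 1427.7, 2131.1, 2883.4],
    "taxon_B": [1105.6, 1453.7, 2145.1, 2899.4],
}


def match_markers(peaks: List[float], markers: List[float], tol: float = 0.3) -> List[float]:
    """Return the reference markers that have an observed peak within tol (Da)."""
    return [m for m in markers if any(abs(p - m) <= tol for p in peaks)]


def rank_taxa(
    peaks: List[float], reference: Dict[str, List[float]], tol: float = 0.3
) -> List[Tuple[str, float]]:
    """Score each taxon by the fraction of its markers found, best match first."""
    scores = []
    for taxon, markers in reference.items():
        found = match_markers(peaks, markers, tol)
        scores.append((taxon, len(found) / len(markers)))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Assumed peak list, e.g. as exported after pre-processing a spectrum.
observed = [1105.5, 1427.8, 2131.0, 2500.0, 2883.5]
ranking = rank_taxa(observed, REFERENCE)
print(ranking)
```

In this toy input every marker of the hypothetical `taxon_A` has a nearby peak, so it ranks first; the unmatched peak at 2500.0 is the kind of signal a real tool might compare against a contaminant list.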
Availability and implementation
The SpecieScan algorithm, along with the raw data used in testing, results, reference database, and common contaminants lists, is freely available on GitHub (https://github.com/mesve/SpecieScan).