Lab Tools
You can check out the Wang Bioinformatics Lab code on GitHub.
Mass Spectrometry Query Language¶
The Mass Spectrometry Query Language (MassQL) is a domain-specific language for expressing patterns in mass spectrometry data, empowering chemists and bioinformaticians to query raw mass spectrometry data. MassQL is designed to be simple, flexible, scalable, and shareable. It aims to make it simple to express a wide range of mass spec data patterns and to search across all public mass spectrometry data available, i.e., billions of compounds in hundreds of thousands of samples.
Check it out here.
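To make the idea of a "pattern in mass spectrometry data" concrete, here is a minimal Python sketch of one common pattern: finding MS2 scans that contain a product ion near a given m/z. This is not MassQL itself and does not use the MassQL engine; the `Spectrum` class, the `has_product_ion` helper, the tolerance, and the example peaks are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Spectrum:
    """Hypothetical in-memory spectrum; not a MassQL or GNPS data structure."""
    scan: int
    ms_level: int
    peaks: List[Tuple[float, float]]  # (m/z, intensity) pairs


def has_product_ion(spectrum: Spectrum, target_mz: float, tolerance_mz: float = 0.1) -> bool:
    """Return True if an MS2 spectrum has a peak within tolerance_mz of target_mz."""
    if spectrum.ms_level != 2:
        return False
    return any(abs(mz - target_mz) <= tolerance_mz for mz, _ in spectrum.peaks)


# Made-up example data: one MS1 scan and two MS2 scans.
spectra = [
    Spectrum(scan=1, ms_level=1, peaks=[(226.18, 1000.0)]),
    Spectrum(scan=2, ms_level=2, peaks=[(100.05, 50.0), (226.20, 300.0)]),
    Spectrum(scan=3, ms_level=2, peaks=[(150.10, 80.0), (300.00, 20.0)]),
]

# Keep only MS2 scans with a product ion near m/z 226.18.
matching_scans = [s.scan for s in spectra if has_product_ion(s, 226.18)]
print(matching_scans)
```

A query language lets this kind of filter be written declaratively and shared as a short string, rather than re-implemented in code for each analysis.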
GNPS¶
GNPS is an entire analysis ecosystem that comprises computational workflows, community-aggregated knowledge, public repository data, and data visualization tools. Summarized below is just a small set of the tools available in GNPS.
GNPS Spectral Libraries¶
Spectral libraries -- collections of reference tandem mass spectrometry data from known compounds -- are a principal unit of knowledge within the mass spectrometry community. Searching spectral libraries of reference compounds is the most common method to identify known compounds in untargeted experiments. However, spectral libraries were traditionally fragmented across the community, siloed in individual labs. GNPS spectral libraries created the infrastructure to crowdsource reference spectra and empower the community to deposit their spectral libraries in a centralized location. This has enabled the growth of spectral libraries from a few thousand in 2014 to over 500K in 2022.
Check it out here.
GNPS Classical Molecular Networking¶
Molecular Networking is a computational tool that groups similar MS/MS spectra based on their fragmentation patterns. The entire pipeline also features multivariate statistics and spectral library search.
Check it out here.
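The grouping step can be sketched as follows: score every pair of MS/MS spectra for similarity, then connect the pairs that score above a threshold. The sketch below uses a simple greedy cosine score and is only an illustration; it is not the GNPS implementation, and the function names, the 0.02 m/z tolerance, the 0.7 score threshold, and the example spectra are all assumptions.

```python
import math
from itertools import combinations


def normalize(peaks):
    """Scale intensities so the spectrum has unit Euclidean norm."""
    norm = math.sqrt(sum(intensity * intensity for _, intensity in peaks))
    if norm == 0:
        return []
    return [(mz, intensity / norm) for mz, intensity in peaks]


def cosine_score(peaks_a, peaks_b, tolerance_mz=0.02):
    """Greedy cosine score: pair peaks within tolerance_mz, using each peak once."""
    a, b = normalize(peaks_a), normalize(peaks_b)
    # All candidate peak pairs whose m/z values agree within the tolerance.
    candidates = [
        (int_a * int_b, i, j)
        for i, (mz_a, int_a) in enumerate(a)
        for j, (mz_b, int_b) in enumerate(b)
        if abs(mz_a - mz_b) <= tolerance_mz
    ]
    used_a, used_b, score = set(), set(), 0.0
    # Take the highest-contributing pairs first, never reusing a peak.
    for product, i, j in sorted(candidates, reverse=True):
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            score += product
    return score


def build_network(spectra, min_score=0.7):
    """Return edges (name_a, name_b, score) for pairs scoring at least min_score."""
    edges = []
    for (name_a, peaks_a), (name_b, peaks_b) in combinations(spectra.items(), 2):
        score = cosine_score(peaks_a, peaks_b)
        if score >= min_score:
            edges.append((name_a, name_b, score))
    return edges


# Made-up spectra as lists of (m/z, intensity): A and B fragment alike, C does not.
spectra = {
    "A": [(100.0, 10.0), (150.0, 50.0), (200.0, 100.0)],
    "B": [(100.01, 12.0), (150.0, 45.0), (200.01, 95.0)],
    "C": [(80.0, 100.0), (120.0, 30.0)],
}
edges = build_network(spectra)
for name_a, name_b, score in edges:
    print(f"{name_a} -- {name_b}: {score:.3f}")
```

In this toy data set only A and B are connected, since they share all three fragment peaks within tolerance while C shares none with either.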
GNPS Feature Based Molecular Networking¶
Feature Based Molecular Networking is a technique that integrates quantitative feature finding tools with molecular networking. This transforms the qualitative comparisons between conditions in Classical Molecular Networking into quantitative comparisons, where relative abundance can be used to prioritize identification efforts. This is a broad community effort that is compatible with the most popular open source software and many proprietary vendor software packages.
GNPS Dashboard¶
The GNPS Dashboard is the only fully web-based interactive visualization tool for mass spectrometry that enables Google Docs-like collaboration and sharing of results. It is deeply integrated into community resources, including all public proteomics, metabolomics, and glycomics data repositories as well as online analysis systems such as GNPS. GNPS Dashboard drastically lowers the barrier to entry for visualizing and interrogating mass spec data from nearly all instruments with a single click, without the need for proprietary software or any local software installation. This makes it well suited to classroom teaching and to data transparency during manuscript review and post-publication inspection.
Check it out here.
ReDU¶
ReDU is a crowdsourcing tool that has facilitated the annotation of over 50K public metabolomics analyses with sample information using controlled vocabularies. By aggregating all this information, ReDU empowers the community to seamlessly select subsets of public data for reanalysis and meta-analysis.
Check it out here.
MASST¶
The Mass Spectrometry Search Tool (MASST) is the first tool to enable data-driven searching of all public metabolomics data at repository scale. MASST is the mass spectrometry analog of BLAST, which in the sequencing world enabled searching NCBI and SRA with a nucleotide sequence. MASST enables searching for molecules by their tandem mass spectra across all public mass spectrometry data from all major repositories: MassIVE, MetaboLights, and Metabolomics Workbench.
Check it out here.