Welcome to the home of lipyd, a Python module for lipidomics LC-MS/MS data analysis. lipyd aims to cover the entire lipidomics data analysis workflow: preprocessing of raw MS data, lookup of masses in metabolite databases, identification based on MS2 spectra, and analysis of higher-level patterns across multiple scans, runs and experiments. lipyd is modular, which means you can use individual modules and integrate them into your own framework, e.g. only for database lookup. lipyd does not intend to be a one-click solution but a software library that anyone can use to build their own pipelines. In each module and class we try to set reasonable defaults, and we provide tutorials to show how a workflow is built up. As of December 2019 certain key elements are not yet complete, but the project is in active development: we are planning to publish a paper in the next few months and we want to provide a complete workflow at that point. See details below.
Overview of the lipyd Python module. Boxes roughly correspond to modules and classes. See PDF version here.
For reading raw mass spectrometry data from mzML files, peak picking and feature detection we rely on the OpenMS library. This ensures computationally efficient processing by well-established methods. In 2019 we completed our OpenMS metabolomics preprocessing pipeline. However, the parametrization of the processing algorithms still needs thorough benchmarking and quality control, hence we cannot yet recommend it for data analysis. Over the next half year we will test it on a wide variety of data and compare the results to free and commercial software. Once the testing period is completed this will be the recommended method for MS data preprocessing, and lipyd will be able to do everything from raw MS data to lipid identification. In addition, we are investigating the possibility of processing Thermo RAW files directly. In the meantime we provide a temporary solution to read already preprocessed features from CSV files exported by the PEAKS software. We are not comfortable with the idea of building on expensive proprietary software, and in the near future we will provide complete integration with OpenMS.
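To illustrate the idea of reading already preprocessed features from a CSV export, here is a minimal sketch using only the Python standard library. It is not lipyd's reader: the function `read_features`, the column names (`mz`, `rt_min`, `intensity_*`) and the example values are all assumptions made for this illustration, and a real PEAKS export will most likely use a different layout.

```python
import csv
import io

# Made-up example content; the column names are hypothetical and do not
# reflect the actual layout of a PEAKS export.
EXAMPLE_CSV = """mz,rt_min,intensity_sample1,intensity_sample2
500.0010,5.20,1200.0,300.0
760.5850,12.75,0.0,4500.0
"""


def read_features(handle, mz_col="mz", rt_col="rt_min",
                  intensity_prefix="intensity_"):
    """Read one feature per CSV row from an open text handle.

    Every column whose name starts with `intensity_prefix` is treated as
    the intensity of the feature in one sample.
    """
    features = []
    for row in csv.DictReader(handle):
        intensities = {
            name[len(intensity_prefix):]: float(value)
            for name, value in row.items()
            if name.startswith(intensity_prefix)
        }
        features.append({
            "mz": float(row[mz_col]),
            "rt": float(row[rt_col]),
            "intensities": intensities,
        })
    return features


features = read_features(io.StringIO(EXAMPLE_CSV))
for feature in features:
    print(feature["mz"], feature["rt"], feature["intensities"])
```

With a file on disk, the same function could be called as `read_features(open(path, newline=""))`, adjusting the column arguments to match the export.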
The lipyd.moldb module provides a unified interface to standard databases like SwissLipids and LipidMaps. In addition, it is able to generate custom metabolite masses. With the default settings the database consists of more than 100,000 lipid species. The lipyd.lipid module contains more than 150 predefined lipid classes and it is easy to define new ones. The Sample and SampleSet objects in lipyd.sample, which represent a series of features, support automatic lookup in the databases.
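The database lookup described above essentially matches a measured m/z against a sorted list of known masses within a tolerance. The following is a minimal sketch of that idea, not the lipyd.moldb API: the function names, the default tolerance of 10 ppm and the database records (`species_A` and so on, with made-up masses) are assumptions for illustration only.

```python
from bisect import bisect_left, bisect_right


def ppm_window(mz, ppm):
    """Return the (lower, upper) m/z bounds for a tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def lookup(mz, records, ppm=10.0):
    """Return all (mass, name) records within `ppm` of the measured m/z.

    `records` must be sorted by mass so that binary search can be used.
    """
    masses = [mass for mass, _ in records]
    lower, upper = ppm_window(mz, ppm)
    first = bisect_left(masses, lower)
    last = bisect_right(masses, upper)
    return records[first:last]


# Hypothetical database: the names and masses are invented for this sketch.
DATABASE = sorted([
    (500.0000, "species_A"),
    (500.0040, "species_B"),
    (500.0200, "species_C"),
    (760.5000, "species_D"),
])

hits = lookup(500.0010, DATABASE, ppm=10.0)
for mass, name in hits:
    print(name, mass)
```

For a database with more than 100,000 entries it would make sense to build the sorted list of masses once and reuse it, rather than rebuilding it on every call as this sketch does.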
The lipyd.ms2 module contains generic classes to support the analysis and identification of MS2 spectra. Based on around 50 standards run by our group and on a review of many spectra from publications and databases, we created built-in rules for the identification of more than 80 lipid classes. You can modify these methods or create new ones by writing Python methods. However, we are working on MFQL or LDA format integration to provide a more standard way of defining rules. We will also introduce similarity search against spectrum databases.
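As a rough illustration of what a rule for MS2 identification can look like, the sketch below checks whether a spectrum contains a set of required fragment ions. It does not use the lipyd.ms2 classes: `FragmentRule`, `has_peak`, `identify`, the thresholds and the example spectrum are hypothetical. The only chemistry assumed is the widely reported phosphocholine head group fragment near m/z 184.0733 in positive mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentRule:
    """A hypothetical rule: every required fragment m/z must be present."""
    label: str
    required_mz: tuple
    min_rel_intensity: float = 0.05  # fraction of the base peak intensity


def has_peak(spectrum, target_mz, tol_ppm, min_intensity):
    """True if a peak lies within `tol_ppm` of `target_mz` and is intense enough."""
    tolerance = target_mz * tol_ppm / 1e6
    return any(
        abs(mz - target_mz) <= tolerance and intensity >= min_intensity
        for mz, intensity in spectrum
    )


def identify(spectrum, rules, tol_ppm=20.0):
    """Return the labels of all rules satisfied by a list of (mz, intensity) peaks."""
    if not spectrum:
        return []
    base_peak = max(intensity for _, intensity in spectrum)
    return [
        rule.label
        for rule in rules
        if all(
            has_peak(spectrum, mz, tol_ppm, rule.min_rel_intensity * base_peak)
            for mz in rule.required_mz
        )
    ]


RULES = [
    # Phosphocholine head group fragment, positive mode.
    FragmentRule("PC_headgroup", (184.0733,)),
    # Invented rule with arbitrary fragment masses, for contrast only.
    FragmentRule("hypothetical_class_X", (100.0, 200.0)),
]

# Made-up spectrum as (m/z, intensity) pairs.
spectrum = [(184.0731, 1000.0), (300.2000, 50.0), (478.3300, 120.0)]
print(identify(spectrum, RULES))
```

Real rules typically combine several conditions, such as fatty acyl fragments, neutral losses and intensity ratios, so a single required fragment like this can only suggest a head group, not a full identification.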
The lipyd.sample and lipyd.feature modules provide classes for analyzing features, optionally in relation to other variables, and for filtering them according to custom rules. Analysis and filtering of the features can be done before or after lipid identification. Doing it before reduces the number of MS2 spectra to be analyzed, which saves time. In the future we will add more utilities to build arrays of features and also of MS2 fragments across an arbitrary number of experiments, to provide opportunities for higher-level analysis.
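The time saving from filtering before identification can be sketched as follows. This is not the lipyd.sample or lipyd.feature API: the `Feature` class, the `is_enriched` rule, the ratio threshold of 2 and all values are assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A hypothetical feature: m/z, intensity per sample, linked MS2 scan ids."""
    mz: float
    intensities: dict
    ms2_scans: list = field(default_factory=list)


def is_enriched(feature, sample, control, min_ratio=2.0):
    """Custom rule: keep features at least `min_ratio` times higher than control."""
    sample_intensity = feature.intensities.get(sample, 0.0)
    control_intensity = feature.intensities.get(control, 0.0)
    if control_intensity <= 0.0:
        # Absent from the control: keep the feature if it is detected at all.
        return sample_intensity > 0.0
    return sample_intensity / control_intensity >= min_ratio


# Made-up features and scan ids.
features = [
    Feature(500.0010, {"sample": 1200.0, "control": 300.0}, [11, 12]),
    Feature(612.3400, {"sample": 400.0, "control": 390.0}, [13, 14, 15]),
    Feature(760.5850, {"sample": 4500.0, "control": 0.0}, [16]),
]

selected = [f for f in features if is_enriched(f, "sample", "control")]

# Only the MS2 scans of the selected features need to be analyzed.
scans_before = sum(len(f.ms2_scans) for f in features)
scans_after = sum(len(f.ms2_scans) for f in selected)
print(scans_before, scans_after)
```

In this toy data set the filter removes one of the three features, so half of the MS2 scans never have to be inspected.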
Forks, pull requests, bug reports and feature requests are welcome!
See more details here.
Chemical calculator, molecule and fragment databases, MS2 identification
If you are experiencing problems or have questions, please file an issue on GitHub. We prefer this way, as public questions and answers might help others later. Of course you can also write me private emails: