--- title: "Discovery, Identification, and Screening of Lipids and Oxylipins in HPLC-MS Datasets Using LOBSTAHS" author: "James Collins" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Discovery, Identification, and Screening of Lipids and Oxylipins in HPLC-MS Datasets Using LOBSTAHS} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ### Introduction This document describes the purpose and use of the R-package **LOBSTAHS: Lipid and Oxylipin Biomarker Screening through Adduct Hierarchy Sequences**. The package is described in Collins et al. 2016.^[1](#note1) In the sections below, LOBSTAHS package functions are applied to a model dataset using example code. LOBSTAHS requires the additional packages [xcms](https://bioconductor.org/packages/release/bioc/html/xcms.html)^[2](#note2) and [CAMERA](https://bioconductor.org/packages/release/bioc/html/CAMERA.html)^[3](#note3); the model dataset, consisting of lipid data from cultures of the marine diatom *Phaeodactylum tricornutum*, is contained in the R data package [PtH2O2lipids](https://github.com/vanmooylipidomics/PtH2O2lipids). ### Purpose LOBSTAHS contains several functions to help scientists discover and identify lipid and oxidized lipid biomarkers in HPLC-MS data that have been pre-processed with the popular R packages [xcms](https://bioconductor.org/packages/release/bioc/html/xcms.html) and [CAMERA](https://bioconductor.org/packages/release/bioc/html/CAMERA.html). First, LOBSTAHS uses exact mass to make initial compound assignments from a set of customizable onboard databases. Then, a series of orthogonal screening criteria are applied to refine and winnow the list of assignments. A basic workflow based on xcms, CAMERA, and LOBSTAHS is illustrated in the schematic. Each step in the figure is described in detail in the following paragraphs.

### Installing the LOBSTAHS package ##### Install dependencies: ```{r, eval = FALSE} source("http://bioconductor.org/biocLite.R") biocLite("CAMERA") biocLite("xcms") ``` ##### Install RTools: For windows: Download and install RTools from [http://cran.r-project.org/bin/windows/Rtools/](http://cran.r-project.org/bin/windows/Rtools/) For Unix: Install the R-development-packages (r-devel or r-base-dev) ##### Install packages needed for installation from Github: ```{r, eval = FALSE} install.packages("devtools") ``` ##### Install LOBSTAHS: ```{r, eval = FALSE} library("devtools") install_github("vanmooylipidomics/LOBSTAHS") ``` ##### Install 'PtH2O2lipids,' containing example data & precursor xsAnnotate object: ```{r, eval = FALSE} ## install dataset 'PtH2O2lipids' ## see LOBSTAHS documentation for examples install_github("vanmooylipidomics/PtH2O2lipids") ``` ### Acquisition of HPLC-MS data suitable for LOBSTAHS Data for LOBSTAHS should be acquired using a mass spectrometer with sufficiently high mass accuracy and resolution. Suitable MS acquisition platforms with which LOBSTAHS has been tested include Fourier transform ion cyclotron resonance (FT-ICR) and Orbitrap instruments. A quadrupole time-of-flight (Q-TOF) instrument could also be used if it possessed sufficient mass accuracy (i.e., < 5-7 ppm). While the software will accept any HPLC-MS data as input, insufficient mass accuracy and resolution will produce results with large numbers of database matches for each feature that cannot be distinguished from each other. ### File conversion After acquisition, each data file should be converted from manufacturer to open source (.mzXML) format in centroid (not profile) mode. If data were acquired on the instrument using ion mode switching, the data from each ionization mode (i.e., polarity) must also be extracted into a separate file. The [msConvert tool](http://proteowizard.sourceforge.net/tools/msconvert.html) (part of the ProteoWizard toolbox) can be used to accomplish these tasks. msConvert commands can be executed with either the provided GUI or at the command line; however, conversion of manufacuter file formats can only be accomplished using the Windows installation. An R script ([Exactive_full_scan_process_ms1+.r](https://github.com/vanmooylipidomics/LipidomicsToolbox/blob/master/Exactive_full_scan_process_ms1%2B.r)) for batch conversion and extraction of data files acquired on a Thermo Exactive Orbitrap instrument is provided as part of the [Van Mooy Lab Lipidomics Toolbox](https://github.com/vanmooylipidomics/LipidomicsToolbox). This vignette is not intended as a comprehensive guide to mass spectrometer file conversion; users are encouraged to digest the [msConvert documentation](http://proteowizard.sourceforge.net/tools/msconvert.html). However, assuming a hypothetical data file "Exactive_data.raw" was acquired on an Thermo Orbitrap instrument with ion mode switching, the following code could be within R used to convert and then extract positive and negative mode scans to separate files. (Code presumes you've already installed msConvert.) ##### Initial file conversion; saves converted file to a directory "mzXML_ms1_two_mode": ```{r, eval = FALSE} system(paste("msconvert Exactive_data.raw --mzXML --filter \"peakPicking true 1-\" -o mzXML_ms1_two_mode -v")) ``` ##### Extract positive, negative mode scans, then save in separate directories: ```{r, eval = FALSE} system(paste("msconvert mzXML_ms1_two_mode/Exactive_data.mzXML --mzXML --filter \"polarity positive\" -o mzXML_ms1_pos_mode -v")) system(paste("msconvert mzXML_ms1_two_mode/Exactive_data.mzXML --mzXML --filter \"polarity negative\" -o mzXML_ms1_neg_mode -v")) ``` ### Pre-processing with xcms After all data files in a particular dataset have been converted and extracted into files of like polarity, the R-package [xcms](https://bioconductor.org/packages/release/bioc/html/xcms.html) can then be used to perform feature detection, retention time correction, and peak grouping. While the paragraphs below contain basic instructions for preparation of data, this vignette is not intended as ae manual or guide to the complex world of mass spectrometer data processing in xcms and CAMERA; users should acquaint themselves with the manuals ([here](https://bioconductor.org/packages/release/bioc/manuals/xcms/man/xcms.pdf) and [here](https://bioconductor.org/packages/release/bioc/manuals/CAMERA/man/CAMERA.pdf)) and very helpful vignettes ([here](https://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcmsPreprocess.pdf) and [here](https://bioconductor.org/packages/release/bioc/vignettes/CAMERA/inst/doc/CAMERA.pdf)) for the two packages. First, data files should be given intuitive file names (containing, e.g., a sample ID along with information on experimental timepoint and treatment) and placed into a directory structure according to an index variable; the xcms vignette [LC/MS Preprocessing and Analysis with xcms](https://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcmsPreprocess.pdf) contains a detailed explanation. Feature detection, retention time correction, and peak grouping should then be performed. Values of parameters for xcms functions can be obtained from several sources: * Package default values can be used (not recommended) * Values may be obtained from the literature^[4](#note4) * The R-package [IPO](https://github.com/glibiseller/IPO)^[5](#note5) can be used to optimize the values of many of the required parameters The R script [prepOrbidata.R](https://github.com/vanmooylipidomics/LipidomicsToolbox/blob/master/prepOrbidata.R) (part of the [Van Mooy Lab Lipidomics Toolbox](https://github.com/vanmooylipidomics/LipidomicsToolbox)) contains code for complete preparation in xcms of the PtH2O2lipids (or similar) dataset. The script allows the user to apply parameter values assembled from the literature or obtain them from IPO optimization of a subset of samples. The final parameter values used in Collins et al. 2016 for analysis of the PtH2O2lipids dataset -- obtained using HPLC-ESI-MS on an Exactive Orbitrap instrument -- are given in Table S5 of the electronic supplement. The settings and values listed in the table could be used as a starting point for analysis of similar data. The xcmsSet produced using these settings from the *P. tricornutum* data is stored in the [PtH2O2lipids](https://github.com/vanmooylipidomics/PtH2O2lipids) package: ```{r, warning = FALSE, message = FALSE} library(PtH2O2lipids) ptH2O2lipids$xsAnnotate@xcmsSet ``` Whatever settings are used in xcms, the processed data should contain high-quality features which have been aligned across samples; the quality of the processed data should be verified by individual inspection of a subset of features. At the conclusion of processing with xcms, the user should have a single `xcmsSet` object containing the dataset. ### Final pre-processing with CAMERA In the final pre-processing step, the R-package [CAMERA](https://bioconductor.org/packages/release/bioc/html/CAMERA.html) should be used to (1) aggregate peak groups in the `xcmsSet` object into pseudospectra and (2) identify features in the data representing likely isotope peaks. Use of CAMERA to aggregate the peak groups into pseudospectra is critical because LOBSTAHS applies its various orthogonal screening criteria to only those peak groups within each pseudospectrum. CAMERA should not be used to eliminate adduct ions since the adduct ion hierarchy screening function in LOBSTAHS presumes that all adduct ions for a given analyte will be present in the dataset. We can create an `xsAnnotate` object from the *P. tricornutum* data by applying the wrapper function `annotate()` to the `xcmsSet` we created in the previous step: ```{r, warning = FALSE, message = FALSE, eval = FALSE} library(xcms) library(CAMERA) library(LOBSTAHS) # first, a necessary workaround to avoid a import error; see # https://support.bioconductor.org/p/69414/ imports = parent.env(getNamespace("CAMERA")) unlockBinding("groups", imports) imports[["groups"]] = xcms::groups lockBinding("groups", imports) # create annotated xset using wrapper annotate(), allowing us to perform all # CAMERA tasks at once xsA = annotate(ptH2O2lipids$xsAnnotate@xcmsSet, quick=FALSE, sample=NA, # use all samples nSlaves=1, # set to number of available cores or processors if # > 1 # group FWHM settings (defaults) sigma=6, perfwhm=0.6, # groupCorr settings (defaults) cor_eic_th=0.75, graphMethod="hcs", pval=0.05, calcCiS=TRUE, calcIso=TRUE, calcCaS=FALSE, # findIsotopes settings maxcharge=4, maxiso=4, minfrac=0.5, # adduct annotation settings psg_list=NULL, rules=NULL, polarity="positive", # the PtH2O2lipids xcmsSet contains # positive-mode data multiplier=3, max_peaks=100, # common to multiple tasks intval="into", ppm=2.5, mzabs=0.0015 ) #> Start grouping after retention time. #> Created 113 pseudospectra. #> Generating peak matrix! #> Run isotope peak annotation #> % finished: 10 20 30 40 50 60 70 80 90 100 #> Found isotopes: 5692 #> Start grouping after correlation. #> Generating EIC's .. #> #> Calculating peak correlations in 113 Groups... #> % finished: 10 20 30 40 50 60 70 80 90 100 #> #> Calculating isotope assignments in 113 Groups... #> % finished: 10 20 30 40 50 60 70 80 90 100 #> Calculating graph cross linking in 113 Groups... #> % finished: 10 20 30 40 50 60 70 80 90 100 #> New number of ps-groups: 5080 #> xsAnnotate has now 5080 groups, instead of 113 #> Generating peak matrix for peak annotation! #> #> Calculating possible adducts in 5080 Groups... #> % finished: 10 20 30 40 50 60 70 80 90 100 ``` We now have an `xsAnnotate` object "xsA" to which we will next apply the screening and identification functions of LOBSTAHS. In the `annotate()` call, we set `quick = FALSE` because we want to run `groupCorr()`. This will also cause CAMERA to perform internal adduct annotation. While we will perform our own adduct annotation later with LOBSTAHS, allowing CAMERA to identify its own adducts doesn't hurt, particularly if it leads to the creation of better pseudospectra. ### The LOBSTAHS database: Use the default, or easily create your own Before screening a dataset with LOBSTAHS, you should first decide whewther to use one of two default databases, or generate your own from the templates provided. LOBSTAHS databases contain a mixture of *in silico* and empirical data for the different adduct ions of a wide range of intact polar diacylglycerols (IP-DAG), triacylglycerols (TAG), polyunsaturated aldehydes (PUAs), free fatty acids (FFA), and common photosynthetic pigments. In addition, the latest LOBSTAHS release includes support for lyso lipids under an "IP_MAG" species class. The default databases (as of August 26, 2016) include 14,068 and 11,408 unique compounds that can be identifed in positive and negative ionization mode data, respectively. The databases can be easily customized (see below) if the user wishes to identify additional lipids in new lipid classes. LOBSTAHS databases are contained in S4 `LOBdbase` objects that are generated or accessed by various package functions. Each `LOBdbase` is specific to a particular polarity (i.e., ion mode); if evaluating positive-mode data, you will use a positive mode database (and vice versa). We can access the default databases to examine their scope: ```{r, warning = FALSE, message = FALSE} library(LOBSTAHS) data(default.LOBdbase) default.LOBdbase$positive # default positive mode database default.LOBdbase$negative # default negative mode database ``` ### Input tables required to generate databases LOBSTAHS generates databases based on values of parameters defined in simple tables. The values in these tables define the range of molecular properties within each lipid class for which database entries will be created. Database generation is accomplished with the function `generateLOBdbase()`. Default values can be viewed by loading the default tables into the R workspace (directions below); the default input data are also given in Tables 1 and 2 of Collins et al. 2016. A first table, the `componentCompTable`, defines the base elemental compositions of each molecule or lipid class that will be considered when the databsse is created. If a user wishes to add additional molecules or classes of molecules to the default set of compounds, he or she will need to add additional entrires to this table. Two other user-editable tables define ranges of chemical properties for which databsase entires are created within each lipid class specified in the `componentCompTable`: * `acylRanges`: total acyl carbon chain length (i.e., the total number of carbon atoms in the fatty acids that make up TAG, PUA, IP-DAG, and IP-MAG) and degree of acyl carbon chain unsaturation (i.e., the number of possible carbon-carbon fatty acid double bonds) considered for each lipid class * `oxyRanges`: possible oxidation state (i.e., the number of additional oxygen atoms to be considered on each compound) A fourth table, `adductHierarchies`, contains empirical data on the relative abundances of various adduct ions formed by the compounds that belong to each lipid class. This table is also user-editable, but any additions or changes should first be confirmed by empirical analyusis. Users can load the default tables into the R workspace (and subsequently view them) at any time: ```{r, warning = FALSE, message = FALSE, eval = FALSE} data(default.acylRanges) data(default.oxyRanges) data(default.componentCompTable) data(default.adductHierarchies) ``` ### Customization of database inputs Users can easily customize the values of the input parameters used to create databases in LOBSTAHS. This can be accomplished by modidying one or more of the Microsoft Excel templates that are provided with the package. The templates are automatically installed with LOBSTAHS into the /LOBSTAHS/doc/ subdirectory of your R library path. You can use `.libPaths()` to find the location of your R library. Alternatively, the latest versions of the templates can dopwnloaded from the [LOBSTAHS GitHub repository](https://github.com/vanmooylipidomics/LOBSTAHS/tree/master/inst/doc/xlsx). What modifications must be made to which table(s) will depend on the user's goal. For example, if the user wishes only to expand (or constrain) the range of molecular properties for which database entries are to be created within an existing lipid class, only the `acylRanges` and/or `oxyRanges` tables need be modified. However, a user seeking to add a new molecules or lipid class to the database(s) will need to first define the new molecule/class in both the `componentCompTable` and `adductHierarchies` tables. If the new entry is a single molecule (e.g., a new pigment), the new data in these two tables will be sufficient. However, if a new class of acyl lipid is defined, corresponding data must also be added to the `acylRanges` and (if necessary) `oxyRanges` tables. Once any modificatons are made, the user should save the table(s) to text file(s) in comma-separated values (.csv) format; these can then be imported into R when the `generateLOBdbase()` function is called. ### Database generation using generateLOBdbase LOBSTAHS databases are generated using the `generateLOBdbase()` function. `generateLOBdbase` uses an *in silico* simulation to create database entries defined by parameters in the input tables. One entry is created for each possible adduct ion of a parent compound. `generateLOBdbase` can create databases for one or both ion modes. Paths to any .csv files containing user-customized input data should be specified during the call to `generateLOBdbase`; if NULL or no value is specified for any of the input tables, the defaults (see above) will be used. Finally, the user can specify whether a .csv file containing the new database should be created in addition to the `LOBdbase` object. To recreate the default databases and store them to an object called "LOBdb", the user would run the following: ```{r, eval = FALSE} LOBdb = generateLOBdbase(polarity = c("positive","negative"), gen.csv = FALSE, component.defs = NULL, AIH.defs = NULL, acyl.ranges = NULL, oxy.ranges = NULL) ``` ### Compound/biomarker identification and screening using doLOBscreen: The "meat" of LOBSTAHS Once the user has created his/her database (or has decided to use the appropriate onboard default), compound identification and screening can be performed using the function `doLOBscreen()`. At this point, the user should have in hand an `xsAnnotate` object containing data which have processed in xcms and CAMERA. Working sequentially within each CAMERA pseudospectrum, `doLOBscreen` accomplishes the following (see also the schematic): 1. First, if user elected `remove.iso = TRUE`, any secondary isotope peaks identified by CAMERA are removed from the dataset. 1. Using a matching tolerance (`match.ppm`) and database specified by the user, putative (initial) compound assignments are applied to features in the dataset. The `match.ppm` should reflect the accuracy of the mass spectrometer used to acquire the data. If no value is given for `database`, the default database of the appropriate polarity will be used. `polarity = positive` or `polarity = negative` should be specified. If no polarity is given, `doLOBscreen` will attempt to detect it from the dataset... but detection is not perfect. At this point in the screening process, the current version of LOBSTAHS discards any features for which a match was not found in the database.^[6](#note6) 1. If user elected `rt.restrict = TRUE`, the (corrected) retention time of each feature to which an assignment has just been made is compared against the retention time range expected for the assignment's parent lipid class. These expected retention time ranges, or "windows," are specified in a table. The default values (specific to the chromatography under which LOBSTAHS was developed at Woods Hole Oceanographic Institution^[7](#note7)) can be accessed in a manner similar to the database generation input defaults: ```{r, eval = FALSE} data(default.rt.windows) ``` The default retention time windows are also given in Table S2 of the electronic supplement to Collins et al. 2016. If the user wishes to use the retention time restriction feature with his/her own retention time data (highly recommended) a .csv table can be created from an Excel template available in the same subdirectory (/LOBSTAHS/doc/) of the R library path where the database generation templates reside. The template can also be downloaded from the [LOBSTAHS GitHub repository](https://github.com/vanmooylipidomics/LOBSTAHS/tree/master/inst/doc/xlsx). **Important note:** To account for shifts in retention time that occur during chromatographic alignment in xcms, LOBSTAHS automatically expands the retention time ranges given in the `rt.windows` table by 10% at each extreme. If xcms retention time correction results in large differences between raw and corrected retention times, the user should elect **not** to apply retention time restriction in LOBSTAHS (i.e., set `rt.restrict = FALSE`) since valid feartures could be lost. The extent of deviation between raw and corrected retention times can be diagnosed using the retention time correction profile plot, obtained with `plottype = "mdevden"` when calling `rector` in xcms. A future version of LOBSTAHS will allow the user to the set the factor by which the lipid class retention time windows should be expanded. 1. Next, assignments with an odd total number of acyl carbon atoms can be eliminated (achieved by setting `exclude.oddFA = TRUE`). Applies only to acyl lipids (i.e., IP-DAG, TAG, PUA, or free fatty acids). Useful if data are (or are believed to be) of exclusively eukaryotic origin, since synthesis of fatty acids with odd numbers of carbon atoms is not known in eukaryotes.^[8](#note8) 1. A series of adduct ion hierarchy rules are then applied to the remaining assignments. The theory and development of these rules is described in Collins et al 2016. and summarized in the [schematic above](#schematic). During the adduct ion screening process, a [series of codes](#schematic) are applied to each assignment to indicate the degree to which it satisfied the hierarchy rules Unlike the other screening features, application of the adduct ion hierarchy rules is not optional. 1. Once the list of compound assignments has been screened by pseudospectrum according to the user's spefifications, the assignments are then evaluated as a single group to identify possible isomers and isobars (compounds having distinct but very similar *m/z*). These isomers and isobars are annotated with [additional codes](#schematic) so the user can examine them in subsequent analysis. Once done, `doLOBscreen` returns a `LOBSet` object containing the fully screened dataset. To screen the PtH2O2lipids `xsAnnotate` object using the same settings that produced the results in Collins et al. 2016, the user would run: ```{r, eval = FALSE} myPtH2O2LOBSet = doLOBscreen(ptH2O2lipids$xsAnnotate, polarity = "positive", database = NULL, remove.iso = TRUE, rt.restrict = TRUE, rt.windows = NULL, exclude.oddFA = TRUE, match.ppm = 2.5) ``` In this example, the object "myPtH2O2LOBSet" would be identical to the screened `LOBSet` in the PtH2O2lipids package (`ptH2O2lipids$LOBSet`). Information about a `LOBSet` can be viewed by calling the object at the R prompt. For example: ```{r, warning = FALSE, message = FALSE, eval = FALSE} ptH2O2lipids$LOBSet #> A positive polarity "LOBSet" containing LC-MS peak data. Compound assignments and adduct ion hierarchy screening annotations applied to 16 samples using the "LOBSTAHS" package. #> #> Individual peaks: 21869 #> Peak groups: 1595 #> Compound assignments: 1969 #> m/z range: 551.425088845409-1269.09515435315 #> #> Peak groups having possible regisomers: 556 #> Peak groups having possible structural functional isomers: 375 #> Peak groups having isobars indistinguishable within ppm matching tolerance: 84 #> #> Restrictions applied prior to conducting adduct ion hierarchy screening: remove.iso, rt.restrict, exclude.oddFA #> #> Match tolerance used for database assignments: 2.5 ppm #> #> Memory usage: 1.26 MB ``` More detailed diagnostic information can also be obtained from the `LOBSet`. The effectiveness of the various screening criteria are recorded in a data frame `LOBscreen_diagnostics`: ```{r, warning = FALSE, message = FALSE, eval = FALSE} LOBscreen_diagnostics(ptH2O2lipids$LOBSet) #> peakgroups peaks assignments parent_compounds #> initial 18314 251545 NA NA #> post_remove_iso 12146 163938 NA NA #> initial_assignments 5077 67862 15929 14076 #> post_rt_restrict 4451 60070 13504 11779 #> post_exclude_oddFA 3871 52337 7458 6283 #> post_AIHscreen 1595 21869 2056 1969 ``` The numbers of isomers/isboars identified, and the number of assignments/compounds affected by these identifications, are recorded in `LOBisoID_diagnostics`: ```{r, warning = FALSE, message = FALSE, eval = FALSE} LOBisoID_diagnostics(ptH2O2lipids$LOBSet) #> peakgroups parent_compounds assignments features #> C3r_regio.iso 556 352 750 7591 #> C3f_funct.struct.iso 375 577 752 5057 #> C3c_isobars 84 162 195 1137 ``` ### Follow-on analysis of screened data With the set of screened compound assignments in hand, the user has several options. Users familiar with R can extract data for further analysis or screening directly from the `LOBSet` object. Alternatively, the function `getLOBpeaklist` can be used to extract a table of results from the `LOBSet`. Options in `getLOBpeaklist` allow the user to (1) include isomer and isobar cross-references (the default; recommended) and (2) simultaneously generate a .csv file with the results. The .csv file is exported with a unique timestamp to the R working directory. ### Package updates and improvements Users are encouraged to submit issues with LOBSTAHS via the [package GitHub site](https://github.com/vanmooylipidomics/LOBSTAHS/issues). Several improvements to the package are planned; the package will be updated as these are incorporated. Notice will be posted to the package [releases page](https://github.com/vanmooylipidomics/LOBSTAHS/releases) when updates are published. ### Copyright LOBSTAHS is copyright (c) 2015-2016, by the following members of the Van Mooy Laboratory group at Woods Hole Oceanographic Institution: James R. Collins, Bethanie R. Edwards, Helen F. Fredricks, and Benjamin A.S. Van Mooy. All accompanying written materials, including this vignette, are copyright (c) 2015-2016, James R. Collins. LOBSTAHS is provided under the [GNU Public License](https://github.com/vanmooylipidomics/LOBSTAHS/blob/master/LICENSE) and subject to terms of reuse as specified therein. ### References Benton, H. P., Want, E. J., and Ebbels, T. M. D. 2010. Correction of mass calibration gaps in liquid chromatography-mass spectrometry metabolomics data. *Bioinformatics* **26**: 2488-2489 Collins, J.R., Edwards, B.R., Fredricks, H.F., and Van Mooy, B.A.S. 2016. LOBSTAHS: An adduct-based lipidomics strategy for discovery and identification of oxidative stress biomarkers. *Anal. Chem.* **88**: 7154-7162; [doi:10.1021/acs.analchem.6b01260](http://dx.doi.org/10.1021/acs.analchem.6b01260) Hummel, J., Segu, S., Li, Y., Irgang, S., Jueppner, J., and Giavalisco, P. 2011. Ultra performance liquid chromatography and high resolution mass spectrometry for the analysis of plant lipids. *Front Plant Sci* **2** Kuhl, C., Tautenhahn, R., Bottcher, C., Larson, T. R., and Neumann, S. 2012. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. *Anal. Chem.* **84**: 283-289 Libiseller, G., Dvorzak, M., Kleb, U., Gander, E., Eisenberg, T., Madeo, F., Neumann, S., Trausinger, G., Sinner, F., Pieber, T., and Magnes, C. 2015. IPO: a tool for automated optimization of XCMS parameters. *BMC Bioinformatics* **16**: 118 Patti, G.J., Tautenhahn, R., and Siuzdak, G. 2012. Meta-analysis of untargeted metabolomic data from multiple profiling experiments. *Nat. Protocols* **7**: 508-516 Pearson, A. 2014. Lipidomics for geochemistry. In *Treatise on Geochemistry*, Holland, H. D., and Turekian, K. K. eds., 2nd Ed., pp. 291-336. Elsevier: Oxford. Smith, C. A., Want, E. J., O'Maille, G., Abagyan, R., and Siuzdak, G. 2006. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. *Anal. Chem.* **78**: 779–787 Tautenhahn, R., Boettcher, C., and Neumann, S. 2008. Highly sensitive feature detection for high resolution LC/MS. *BMC Bioinformatics* **9**: 504 ### Notes 1. Collins, J.R., Edwards, B.R., Fredricks, H.F., and Van Mooy, B.A.S., 2016, "LOBSTAHS: An adduct-based lipidomics strategy for discovery and identification of oxidative stress biomarkers," *Anal. Chem.* **88**: 7154-7162; [doi:10.1021/acs.analchem.6b01260](http://dx.doi.org/10.1021/acs.analchem.6b01260) 1. Smith et al., 2006, "XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification," *Anal. Chem.* **78**: 779–787; Tautenhahn et al., 2008, "Highly sensitive feature detection for high resolution LC/MS," *BMC Bioinformatics* **9**: 504; Benton et al., 2010, "Correction of mass calibration gaps in liquid chromatography-mass spectrometry metabolomics data," *Bioinformatics* **26**: 2488-2489 1. Kuhl et al., 2012, "CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets," *Anal. Chem.* **84**: 283-289 1. See, e.g., Patti et al., 2012, "Meta-analysis of untargeted metabolomic data from multiple profiling experiment," *Nature Protocols* **7**: 508-516 1. Libiseller et al., 2015, "IPO: a tool for automated optimization of XCMS parameters," *BMC Bioinformatics* **16**: 118 1. A planned improvement, currently under development, will allow the user to retain unidentified features in a separate data object. 1. We used a modified version of the chromatography presented in Hummel et al., 2011, "Ultra performance liquid chromatography and high resolution mass spectrometry for the analysis of plant lipids," *Front Plant Sci* **2**; see electronic supplement to Collins et al. 2016 for full specification. 1. See A. Pearson, 2014, "Lipidomics for geochemistry" in *Treatise on Geochemistry* (Holland, H. D., and Turekian, K. K. eds.), 2nd Ed., Elsevier, Oxford, pp. 291-336.