Title: | Binning and Visualizing NMR Spectra in Environmental Samples |
---|---|
Description: | A reproducible workflow for binning and visualizing NMR (nuclear magnetic resonance) spectra from environmental samples. The 'nmrrr' package is intended for post-processing of NMR data, including importing, merging, and cleaning data from multiple files, visualizing NMR spectra, performing binning/integrations for compound classes, and calculating relative abundances. The package can be inserted into existing analysis workflows to help with analyzing and interpreting NMR data. |
Authors: | Kaizad Patel [aut, cre] |
Maintainer: | Kaizad Patel <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2025-02-16 04:32:04 UTC |
Source: | https://github.com/kaizadp/nmrrr |
NMR grouping bins from Cade-Menun (2015), for 31P, using D2O as a solvent. (1) polyphosphate (-20 to -4); (2) diester (-1.5 to 2.0); (3) monoester (3.0 to 5.5); (4) orthophosphate (5.5 to 9.0); (5) phosphate (9.0 to 40.0)
bins_CadeMenun2015
A data frame with 5 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
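The left-open, right-closed convention can be illustrated with base R's cut(), which uses the same interval type when right = TRUE. This is only a sketch of the convention, not the package's own code; the break points are illustrative values and the variable names are made up for the example.

```r
# Chemical shifts (ppm) to classify; several sit exactly on a bin edge
ppm <- c(0.6, 1.0, 1.3, 2.9, 5.0)

# Illustrative limits of three adjacent bins
breaks <- c(0.6, 1.3, 2.9, 4.1)

# right = TRUE makes every interval open on the left and closed on the right,
# so 0.6 (a lower limit) is excluded while 1.3 and 2.9 (upper limits) are kept
bin <- cut(ppm, breaks = breaks, right = TRUE)

data.frame(ppm, bin)
```

Values that sit on a lower limit or outside every interval come back as NA.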
B. Cade-Menun. "Improved peak identification in 31P-NMR spectra of environmental samples with a standardized method and peak library". Geoderma. doi:10.1016/j.geoderma.2014.12.016
bins_Clemente2012
bins_Lynch2019
bins_Mitchell2018
NMR grouping bins from Clemente et al. (2012), using DMSO-D6 as solvent. (1) aliphatic polymethylene and methyl groups (0.6–1.3 ppm, “aliphatic1”); (2) aliphatic methyl and methylene near O and N (1.3–2.9 ppm, “aliphatic2”); (3) O-alkyl, mainly from carbohydrates and lignin (2.9–4.1 ppm); (4) alpha-proton of peptides (4.1–4.8 ppm); (5) aromatic and phenolic (6.2–7.8 ppm); and (6) amide, from proteins (7.8–8.4 ppm).
bins_Clemente2012
A data frame with 6 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
JS Clemente et al. 2012. “Comparison of Nuclear Magnetic Resonance Methods for the Analysis of Organic Matter Composition from Soil Density and Particle Fractions.” Environmental Chemistry doi:10.1071/EN11096
bins_Lynch2019
bins_Mitchell2018
bins_Hertkorn2013
NMR grouping bins from Hertkorn et al. (2013), using MeOD as solvent. (1) aliphatics, HCCC (0.0-1.9); (2) acetate analogs and CRAM (carboxyl-rich alicyclic materials), HCX (1.9-3.1); (3) carbohydrate-like and methoxy, HCO (3.1-4.9); (4) olefinic HC=C (5.3-7.0); (5) aromatic (7.0-10.0).
bins_Hertkorn2013
A data frame with 5 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
N. Hertkorn et al. 2013. "High-field NMR spectroscopy and FTICR mass spectrometry: powerful discovery tools for the molecular level characterization of marine dissolved organic matter" Biogeosciences doi:10.5194/bg-10-1583-2013
bins_Clemente2012
bins_Lynch2019
bins_Mitchell2018
NMR grouping bins from Lynch et al. (2019), using D2O as solvent. (1) methyl, methylene, and methane bearing protons (0.6–1.6 ppm); (2) unsaturated functional groups (1.6–3.2 ppm), including ketone, benzylic, and alicyclic-bearing protons; (3) unsaturated, heteroatomic compounds, including O-bearing carbohydrates, ethers, and alcohols (3.2–4.5 ppm); (4) conjugated, double bond functionalities, including aromatic, amide, and phenolic structures (6.5–8.5 ppm).
bins_Lynch2019
A data frame with 4 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
LM Lynch et al. 2019. “Dissolved Organic Matter Chemistry and Transport along an Arctic Tundra Hillslope.” Global Biogeochemical Cycles doi:10.1029/2018GB006030
bins_Clemente2012
bins_Mitchell2018
bins_Hertkorn2013
NMR grouping bins from Mitchell et al. (2018), using DMSO-D6 as solvent. (1) aliphatic polymethylene and methyl groups (0.6–1.3 ppm); (2) N- and O-substituted aliphatic (1.3–2.9 ppm); (3) O-alkyl (2.9–4.1 ppm); (4) alpha-proton of peptides (4.1–4.8 ppm); (5) anomeric proton of carbohydrates (4.8–5.2 ppm); (6) aromatic and phenolic (6.2–7.8 ppm); (7) amide (7.8–8.4 ppm).
bins_Mitchell2018
A data frame with 7 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
P Mitchell et al. 2018. “Nuclear Magnetic Resonance Analysis of Changes in Dissolved Organic Matter Composition with Successive Layering on Clay Mineral Surfaces.” Soil Systems doi:10.3390/soils2010008
bins_Clemente2012
bins_Lynch2019
bins_Hertkorn2013
NMR grouping bins from Baldock et al. (2004), for solid-state NMR. (1) alkyl C (0-45); (2) methoxyl C and N-alkyl C (45-60); (3) O-alkyl C (60-95); (4) di-O-alkyl C (95-110); (5) aromatic C (110-145); (6) phenolic C (145-165); (7) amide and carboxyl C (165-215)
bins_ss_Baldock2004
A data frame with 7 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
J. Baldock et al. "Cycling and composition of organic matter in terrestrial and marine ecosystems". Marine Chemistry. doi:10.1016/j.marchem.2004.06.016
bins_Clemente2012
bins_Lynch2019
bins_Mitchell2018
NMR grouping bins from Clemente et al. (2012), for solid-state NMR. (1) alkyl C (0-50); (2) O-alkyl C (60-93); (3) anomeric C (95-110); (4) aromatic C (110-160); (5) carboxyl-carbonyl C (160-200)
bins_ss_Clemente2012
A data frame with 5 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
JS Clemente et al. 2012. “Comparison of Nuclear Magnetic Resonance Methods for the Analysis of Organic Matter Composition from Soil Density and Particle Fractions.” Environmental Chemistry doi:10.1071/EN11096
bins_Clemente2012
bins_ss_Preston2009
bins_Mitchell2018
NMR grouping bins from Preston et al. (2009), for solid-state NMR. (1) alkyl C (0-50); (2) methoxyl C (50-60); (3) O-alkyl C (60-93); (4) di-O-alkyl C (93-112); (5) aromatic C (112-140); (6) phenolic C (140-165); (7) carboxyl C (165-190)
bins_ss_Preston2009
A data frame with 7 rows and 5 variables:
Bin number
Name of bin group
ppm shift range, lower limit
ppm shift range, upper limit
Description of the bin group
The NMR spectrum can be split into several bins, based on chemical shift (ppm). Binsets are specific to nuclei and solvents and by definition are open on the left and closed on the right; for example, a bin of (0, 1] includes 1 but not 0.
C. Preston et al. 2009. "Chemical Changes During 6 Years of Decomposition of 11 Litters in Some Canadian Forest Sites. Part 1. Elemental Composition, Tannins, Phenolics, and Proximate Fractions". Ecosystems. doi:10.1007/s10021-009-9266-0
bins_Clemente2012
bins_Lynch2019
bins_Mitchell2018
Assign group (bin name) to each row of the data based on the ppm column.
nmr_assign_bins(dat, binset)
dat |
Input data frame, either spectral data or peak-picked data. Must include a 'ppm' column for compound class assignment. |
binset |
A binset; e.g. |
The input data with a new group column whose entries are drawn from the binset. Entries will be NA if a ppm value does not fall into any group.
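As a rough illustration of this behaviour, the assignment can be sketched in base R. This is not the package's implementation: the binset below is a hypothetical three-row table, and its column names (group, start, stop) are assumptions made for the example.

```r
# Hypothetical binset; column names (group, start, stop) are assumed here
binset <- data.frame(
  group = c("aliphatic1", "aliphatic2", "aromatic"),
  start = c(0.6, 1.3, 6.2),
  stop  = c(1.3, 2.9, 7.8),
  stringsAsFactors = FALSE
)

# Toy input with the required 'ppm' column
dat <- data.frame(ppm = c(0.9, 1.3, 2.0, 5.0, 7.0))

# Start with NA everywhere, then fill rows that fall inside a bin.
# Each bin is open on the left (>) and closed on the right (<=).
dat$group <- NA_character_
for (i in seq_len(nrow(binset))) {
  in_bin <- dat$ppm > binset$start[i] & dat$ppm <= binset$stop[i]
  dat$group[in_bin] <- binset$group[i]
}

dat
```

The row at 5.0 ppm falls in the gap between bins and therefore keeps NA, while 1.3 ppm is assigned to the bin whose upper limit it equals.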
Kaizad Patel
sdir <- system.file("extdata", "kfp_hysteresis", "spectra_mnova", package = "nmrrr")
spec <- nmr_import_spectra(path = sdir, method = "mnova")
nmr_assign_bins(spec, bins_Clemente2012)
Process data of peaks picked with NMR software.
nmr_import_peaks(path, method, pattern = "*.csv$", quiet = FALSE)
path |
Directory where the peaks data are saved |
method |
Format of input data, depending on how the data were exported. "multiple columns": data are in split-column format, obtained by pasting "peaks table" in MNova. "single column": data are in single-column format, exported from MNova as "peaks script". |
pattern |
Filename pattern to search for (by default "*.csv$") |
quiet |
Print diagnostic messages? Logical |
A dataframe with columns describing sample ID, ppm, intensity, area, group name.
Kaizad Patel
sdir <- system.file("extdata", "kfp_hysteresis", "peaks_mnova_multiple", package = "nmrrr")
nmr_import_peaks(path = sdir, method = "multiple columns")
Imports multiple spectra files and then combines and cleans the data.
nmr_import_spectra(path, method, pattern = "*.csv$", quiet = FALSE)
path |
Directory where the spectra files are saved |
method |
Software used for initial processing of NMR spectra (before using this package). Available options include "mnova" and "topspin". |
pattern |
Filename pattern to search for (by default "*.csv$") |
quiet |
Print diagnostic messages? Logical |
A data.frame with data from all files found, concatenated and sorted.
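The general find, read, combine, and sort pattern described here can be sketched in base R. This is an assumption-laden illustration rather than the package's code: the file names, the sampleID column, and the read_one() helper are hypothetical, and the sketch uses the regular expression "\\.csv$" instead of the documented default pattern.

```r
# Write two tiny toy files to a temporary directory
tmp <- file.path(tempdir(), "nmr_sketch")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(ppm = c(2.5, 1.5), intensity = c(0.5, 0.1)),
          file.path(tmp, "sampleB.csv"), row.names = FALSE)
write.csv(data.frame(ppm = c(4.5, 3.5), intensity = c(0.9, 0.3)),
          file.path(tmp, "sampleA.csv"), row.names = FALSE)

# Find every file in the directory that matches the pattern
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file and tag rows with the file's base name
read_one <- function(f) {
  d <- read.csv(f)
  d$sampleID <- tools::file_path_sans_ext(basename(f))
  d
}

# Concatenate all files, then sort by sample and chemical shift
combined <- do.call(rbind, lapply(files, read_one))
combined <- combined[order(combined$sampleID, combined$ppm), ]
rownames(combined) <- NULL
combined
```

Tagging each row with its source file before binding keeps samples distinguishable once the data are in a single table.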
Kaizad Patel
sdir <- system.file("extdata", "kfp_hysteresis", "spectra_mnova", package = "nmrrr")
nmr_import_spectra(path = sdir, method = "mnova")
Plot NMR spectra, with line brackets denoting binned regions. Uses spectral data processed in MestReNova or TopSpin.
nmr_plot_spectra(
  dat,
  binset,
  label_position = 100,
  mapping = aes(x = ppm, y = intensity),
  stagger = 10
)
dat |
Processed spectral data, output from (a) |
binset |
A binset; e.g. |
label_position |
y-axis position for bin labels |
mapping |
An aesthetic mapping generated by |
stagger |
How much to stagger the labels, numeric;
same units as |
A ggplot object.
Kaizad Patel
sdir <- system.file("extdata", "kfp_hysteresis", "spectra_mnova", package = "nmrrr")
spec <- nmr_import_spectra(path = sdir, method = "mnova")
library(ggplot2)
p_aes <- aes(x = ppm, y = intensity)
p <- nmr_plot_spectra(spec, bins_Clemente2012, 5, p_aes, stagger = 0.5)
p + ylim(0, 6)
Compute relative abundance of compound classes for each sample.
nmr_relabund(dat, method)
dat |
Processed spectral data, output from (a) |
method |
The method for calculating relative abundance. Options include (a) "AUC", integrating the spectral region within each bin; (b) "peaks", adding areas of peaks if a peak-picked file is provided. |
A data.frame with columns describing relative contributions of compound classes. Compound classes are determined by selecting the desired binset.
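For the "peaks" method, the underlying arithmetic can be sketched as summing peak areas per sample and group and dividing by the sample total. The example below is a base-R illustration under stated assumptions, not the package's implementation: the toy data, the column names (sampleID, group, area), and the output column relabund are all hypothetical.

```r
# Toy peak table; column names (sampleID, group, area) are assumed here
peaks <- data.frame(
  sampleID = c("s1", "s1", "s1", "s2", "s2"),
  group    = c("aliphatic1", "aliphatic1", "aromatic", "aliphatic1", "aromatic"),
  area     = c(2, 1, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Sum peak areas within each sample-by-group combination
sums <- aggregate(area ~ sampleID + group, data = peaks, FUN = sum)

# Divide by the per-sample total and express as a percentage
totals <- tapply(sums$area, sums$sampleID, sum)
sums$relabund <- 100 * sums$area / as.numeric(totals[sums$sampleID])

sums[order(sums$sampleID, sums$group), ]
```

Within each sample the percentages sum to 100. An "AUC"-style calculation would follow the same normalization but would integrate intensity over each bin instead of summing picked peak areas.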
Kaizad Patel
sdir <- system.file("extdata", "kfp_hysteresis", "peaks_mnova_multiple", package = "nmrrr")
peaks <- nmr_import_peaks(path = sdir, method = "multiple columns")
peaks <- nmr_assign_bins(peaks, bins_Clemente2012)
nmr_relabund(peaks, "peaks")