This script performs metabolite clustering analysis and computes clusters of metabolites based on regulatory rules between Intracellular and culture media metabolomics (CoRe experiment).
Source:R/MetaboliteClusteringAnalysis.R
MCA_CoRe.Rd
This script performs metabolite clustering analysis and computes clusters of metabolites based on regulatory rules between Intracellular and culture media metabolomics (CoRe experiment).
Usage
MCA_CoRe(
InputData_Intra,
InputData_CoRe,
SettingsInfo_Intra = c(ValueCol = "Log2FC", StatCol = "p.adj", StatCutoff = 0.05,
ValueCutoff = 1),
SettingsInfo_CoRe = c(DirectionCol = "CoRe", ValueCol = "Log2(Distance)", StatCol =
"p.adj", StatCutoff = 0.05, ValueCutoff = 1),
FeatureID = "Metabolite",
SaveAs_Table = "csv",
BackgroundMethod = "Intra&CoRe",
FolderPath = NULL
)
Arguments
- InputData_Intra
DF for your data (results from e.g. DMA) containing metabolites in rows with corresponding Log2FC and stat (p-value, p.adjusted) value columns.
- InputData_CoRe
DF for your data (results from e.g. DMA) containing metabolites in rows with corresponding Log2FC and stat (p-value, p.adjusted) value columns. Here we additionally require
- SettingsInfo_Intra
Optional: Pass ColumnNames and Cutoffs for the intracellular metabolomics including the value column (e.g. Log2FC, Log2Diff, t.val, etc) and the stats column (e.g. p.adj, p.val). This must include: c(ValueCol=ColumnName_InputData_Intra,StatCol=ColumnName_InputData_Intra, StatCutoff= NumericValue, ValueCutoff=NumericValue) Default=c(ValueCol="Log2FC",StatCol="p.adj", StatCutoff= 0.05, ValueCutoff=1)
- SettingsInfo_CoRe
Optional: Pass ColumnNames and Cutoffs for the consumption-release metabolomics including the direction column, the value column (e.g. Log2Diff, t.val, etc) and the stats column (e.g. p.adj, p.val). This must include: c(DirectionCol= ColumnName_InputData_CoRe,ValueCol=ColumnName_InputData_CoRe,StatCol=ColumnName_InputData_CoRe, StatCutoff= NumericValue, ValueCutoff=NumericValue)Default=c(DirectionCol="CoRe", ValueCol="Log2(Distance)",StatCol="p.adj", StatCutoff= 0.05, ValueCutoff=1)
- FeatureID
Optional: Column name of Column including the Metabolite identifiers. This MUST BE THE SAME in each of your Input files. Default="Metabolite"
- SaveAs_Table
Optional: File types for the analysis results are: "csv", "xlsx", "txt" default: "csv"
- BackgroundMethod
Optional: Background method `Intra|CoRe, Intra&CoRe, CoRe, Intra or * Default="Intra&CoRe"
- FolderPath
Optional: Path to the folder the results should be saved at. default: NULL
Value
List of two DFs: 1. Summary of the cluster count and 2. the detailed information of each metabolites in the clusters.
Examples
Media <- MetaProViz::ToyData("CultureMedia_Raw")
ResM <- MetaProViz::PreProcessing(InputData = Media[-c(40:45) ,-c(1:3)],
SettingsFile_Sample = Media[-c(40:45) ,c(1:3)] ,
SettingsInfo = c(Conditions = "Conditions", Biological_Replicates = "Biological_Replicates", CoRe_norm_factor = "GrowthFactor", CoRe_media = "blank"),
CoRe=TRUE)
#> For Consumption Release experiment we are using the method from Jain M. REF: Jain et. al, (2012), Science 336(6084):1040-4, doi: 10.1126/science.1218595.
#> FeatureFiltering: Here we apply the modified 80%-filtering rule that takes the class information (Column `Conditions`) into account, which additionally reduces the effect of missing values (REF: Yang et. al., (2015), doi: 10.3389/fmolb.2015.00004). Filtering value selected: 0.8
#> 3 metabolites where removed: N-acetylaspartylglutamate, hypotaurine, S-(2-succinyl)cysteine
#> Missing Value Imputation: Missing value imputation is performed, as a complementary approach to address the missing value problem, where the missing values are imputing using the `half minimum value`. REF: Wei et. al., (2018), Reports, 8, 663, doi:https://doi.org/10.1038/s41598-017-19120-0
#> NA values were found in Control_media samples for metabolites. For metabolites including NAs MVI is performed unless all samples of a metabolite are NA.
#> Metabolites with high NA load (>20%) in Control_media samples are: dihydroorotate.
#> Metabolites with only NAs (=100%) in Control_media samples are: hydroxyphenylpyruvate. Those NAs are set zero as we consider them true zeros
#> Total Ion Count (TIC) normalization: Total Ion Count (TIC) normalization is used to reduce the variation from non-biological sources, while maintaining the biological variation. REF: Wulff et. al., (2018), Advances in Bioscience and Biotechnology, 9, 339-351, doi:https://doi.org/10.4236/abb.2018.98022
#> Error in ggplot2::autoplot(stats::prcomp(as.matrix(InputData), scale. = as.logical(Scaling)), data = InputPCA, colour = Param_Col, fill = Param_Col, shape = Param_Sha, size = 3, alpha = 0.8, label = T, label.size = 2.5, label.repel = TRUE, loadings = as.logical(ShowLoadings), loadings.label = as.logical(ShowLoadings), loadings.label.vjust = 1.2, loadings.label.size = 2.5, loadings.colour = "grey10", loadings.label.colour = "grey10"): Objects of class <prcomp> are not supported by autoplot.
#> ℹ Have you loaded the required package?
MediaDMA <- MetaProViz::DMA(InputData=ResM[["DF"]][["Preprocessing_output"]][ ,-c(1:4)],
SettingsFile_Sample=ResM[["DF"]][["Preprocessing_output"]][ , c(1:4)],
SettingsInfo = c(Conditions = "Conditions", Numerator = NULL, Denominator = "HK2"),
StatPval ="aov",
CoRe=TRUE)
#> Error: object 'ResM' not found
IntraDMA <- MetaProViz::ToyData(Data="IntraCells_DMA")
Res <- MetaProViz::MCA_CoRe(InputData_Intra = IntraDMA%>%tibble::rownames_to_column("Metabolite"),
InputData_CoRe = MediaDMA[["DMA"]][["786-M1A_vs_HK2"]])
#> Error: object 'MediaDMA' not found