funki.pipelines

funki.pipelines.differential_expression(data, design_factor, contrast_var, ref_var, logfc_thr=1.0, fdr_thr=0.05, n_cpus=8)

Runs a differential expression analysis on the provided data for a given design factor, comparing the contrast samples against the reference samples (e.g. treatment vs. control). Then generates the resulting volcano plot based on the specified thresholds.

Parameters:
  • data (funki.input.DataSet) – The data from which to compute the differential expression

  • design_factor (str) – Name of the column containing the variable values to which the contrasting samples are assigned. The column must be present in the data.obs table

  • contrast_var (any | list[any]) – The variable value(s) that define the samples to be contrasted against the reference (e.g. 'treatment'). The value(s) must be present in the specified design_factor column

  • ref_var (any | list[any]) – The variable value(s) that define the reference samples (e.g. 'control'). The value(s) must be present in the specified design_factor column

  • logfc_thr (float, optional) – Threshold for significance based on the log2(FC) value, defaults to 1.0

  • fdr_thr (float, optional) – Threshold for significance based on the FDR value, defaults to 0.05

  • n_cpus (int, optional) – Number of CPUs used for the calculation, defaults to 8

Returns:

The figure containing the resulting scatter plot

Return type:

plotly.graph_objs.Figure
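
As a rough illustration of how the two thresholds partition the points of a volcano plot, the sketch below labels each feature from its log2(FC) and FDR values. The helper classify_feature and the example values are hypothetical and are not part of funki. The sketch assumes that a feature counts as significant only when it passes both thresholds, which may differ from the exact rule the pipeline applies.

```python
def classify_feature(log2fc, fdr, logfc_thr=1.0, fdr_thr=0.05):
    """Label one feature for a volcano plot (hypothetical helper, not part of funki)."""
    # Assumption: a feature is significant only if it passes BOTH thresholds.
    if fdr > fdr_thr or abs(log2fc) < logfc_thr:
        return "not significant"
    # The sign of the log2 fold change gives the direction of the change.
    return "up" if log2fc > 0 else "down"


# Made-up (log2FC, FDR) pairs, for illustration only
results = {
    "geneA": (2.3, 0.001),
    "geneB": (-1.7, 0.01),
    "geneC": (0.4, 0.0005),
    "geneD": (3.1, 0.2),
}

labels = {gene: classify_feature(lfc, fdr) for gene, (lfc, fdr) in results.items()}
print(labels)
```

Here geneC fails the fold-change threshold and geneD fails the FDR threshold, so only geneA and geneB would be highlighted.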

funki.pipelines.enrichment_analysis(data, net, methods=None, source=None, target=None, weight=None, top=10, **kwargs)

Performs enrichment analysis using Decoupler based on a given network (e.g. gene set collection) and statistical method(s) and returns a figure with the consensus score across methods for the enrichment results.

Parameters:
  • data (funki.input.DataSet) – The data set from which to perform the enrichment

  • net (pandas.DataFrame) – The network linking the features of the data to the attributes (e.g. pathways, gene sets, transcription factors, etc.)

  • methods (NoneType | str | list[str]) – Statistical method(s) used to compute the enrichment, defaults to None. If None, 'mlm', 'ulm' and 'wsum' are used. The option 'all' runs all methods. To list the available methods, run decoupler.show_methods()

  • source (str) – Column name from the provided net containing the gene sets to enrich for.

  • target (str) – Column name from the provided net containing the gene set components (e.g. gene/protein names) that can be mapped back to the data set variable names.

  • weight (NoneType | str) – Defines the column in the network containing the weights to use in the enrichment, defaults to None.

  • top (int) – Number of top enriched gene sets to display based on their consensus score. If a negative number is provided, the bottom ones will be displayed instead.

  • **kwargs (optional) – Other keyword arguments that are passed to the decoupler.decouple() function

Returns:

None. Results are stored in place in the passed data object, which is a funki.input.DataSet instance. Estimates, p-values and consensus scores (in case of multiple methods) are stored in the obsm attribute of the object.

Return type:

NoneType
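
The behavior of the top parameter described above can be sketched in plain Python. The helper select_top and the scores are hypothetical and are not part of funki or decoupler. The sketch assumes gene sets are ranked by descending consensus score, so a positive top takes the head of the ranking and a negative top takes the tail.

```python
def select_top(scores, top=10):
    """Pick gene sets by consensus score (hypothetical helper, not part of funki)."""
    # Rank gene sets from highest to lowest consensus score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if top >= 0:
        return ranked[:top]   # the `top` highest-scoring gene sets
    return ranked[top:]       # negative `top`: the |top| lowest-scoring gene sets


# Made-up consensus scores, for illustration only
scores = {"Pathway A": 2.1, "Pathway B": -1.4, "Pathway C": 0.3, "Pathway D": -0.2}

highest = select_top(scores, top=2)
lowest = select_top(scores, top=-2)
print(highest)
print(lowest)
```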

funki.pipelines.sc_quality_control(data)

Computes QC metrics on a single-cell data set and generates several plots to visualize them. Generates a multipanel figure with the following plots:

  • Box plot with highest expression genes

  • Violin plot with number of genes per cell

  • Violin plot with total counts per gene

  • Violin plot with the percentage of mitochondrial genes per cell

  • Scatter plot of total counts vs. percentage of mitochondrial genes

  • Scatter plot of total counts vs. number of genes

Parameters:

data (funki.input.DataSet) – The data set from which to compute the QC metrics

Returns:

The figure containing the resulting plot with multiple panels for different metrics and comparisons

Return type:

plotly.graph_objs.Figure
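
To make the per-cell metrics behind these panels concrete, the sketch below computes the number of detected genes, the total counts and the percentage of mitochondrial counts for each cell of a small cells-by-genes count matrix. The helper qc_metrics and the data are hypothetical and are not part of funki. The sketch assumes mitochondrial genes can be recognized by an "MT-" name prefix (a common convention for human gene symbols), and the pipeline itself may compute these values differently.

```python
def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Per-cell QC metrics from a cells x genes count matrix (hypothetical helper)."""
    # Assumption: mitochondrial genes are identified by a name prefix.
    mito_idx = [i for i, g in enumerate(gene_names) if g.upper().startswith(mito_prefix)]
    metrics = []
    for row in counts:
        total = sum(row)
        mito = sum(row[i] for i in mito_idx)
        metrics.append({
            "n_genes": sum(1 for c in row if c > 0),  # genes with at least one count
            "total_counts": total,
            "pct_mito": 100.0 * mito / total if total else 0.0,
        })
    return metrics


# Made-up counts, for illustration only (rows are cells, columns are genes)
gene_names = ["MT-CO1", "ACTB", "GAPDH", "MT-ND1"]
counts = [
    [5, 10, 0, 5],
    [0, 30, 10, 0],
]

metrics = qc_metrics(counts, gene_names)
for cell in metrics:
    print(cell)
```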