funki.pipelines

funki.pipelines.differential_expression(data, design_factor, contrast_var, ref_var, logfc_thr=1.0, fdr_thr=0.05, method='pydeseq2', ax=None)

Runs a differential expression analysis on the provided data, contrasting the samples selected by contrast_var against the reference samples selected by ref_var (e.g. treatment vs. control) within the given design factor. It then draws the resulting volcano plot using the specified thresholds.

Parameters:
  • data (funki.input.DataSet) – The data from which to compute the differential expression

  • design_factor (str) – Name of the column that assigns each sample to a group, i.e. the column holding the values given in contrast_var and ref_var. The column must be present in the data.obs table

  • contrast_var (any | list[any]) – The variable value(s) that define the samples to be contrasted against the reference (e.g. 'treatment'). Each value must be present in the specified design_factor column

  • ref_var (any | list[any]) – The variable value(s) that define the reference samples (e.g. 'control'). Each value must be present in the specified design_factor column

  • logfc_thr (float, optional) – Threshold for significance based on the log2(FC) value, defaults to 1.0

  • fdr_thr (float, optional) – Threshold for significance based on the FDR value, defaults to 0.05

  • method (str) – Which method to use for computing the differential expression. Available methods are 'pydeseq2' or 'limma', defaults to 'pydeseq2'.

  • ax (matplotlib.axes.Axes) – Matplotlib Axes instance where to draw the plot. Defaults to None, meaning a new figure and axes will be generated.

Returns:

The figure containing the resulting scatter plot. If an Axes instance is passed through ax, nothing is returned.

Return type:

matplotlib.figure.Figure | None
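
As an illustration of how the two thresholds are commonly combined in a volcano plot, the sketch below labels each gene from its log2(FC) and FDR values. It is a standard-library sketch and not the funki implementation: the helper classify_gene, the gene names and the numbers are hypothetical, and whether funki treats the thresholds as strict or inclusive is an assumption here.

```python
import math


def classify_gene(log2fc, fdr, logfc_thr=1.0, fdr_thr=0.05):
    """Label one gene as a volcano plot would typically colour it.

    Assumption: a gene is significant when |log2(FC)| >= logfc_thr
    and FDR < fdr_thr. The exact comparison used by funki may differ.
    """
    if fdr >= fdr_thr or abs(log2fc) < logfc_thr:
        return "not significant"
    return "up" if log2fc > 0 else "down"


# Hypothetical results: gene -> (log2 fold change, FDR-adjusted p-value)
results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.7, 0.02),
    "GENE_C": (0.4, 0.0005),  # significant FDR, but small effect size
    "GENE_D": (3.1, 0.20),    # large effect size, but FDR too high
}

labels = {gene: classify_gene(lfc, fdr) for gene, (lfc, fdr) in results.items()}

# Volcano plot coordinates: x = log2(FC), y = -log10(FDR)
coords = {gene: (lfc, -math.log10(fdr)) for gene, (lfc, fdr) in results.items()}

for gene in results:
    print(gene, labels[gene], coords[gene])
```

Only genes passing both thresholds are labelled as up- or down-regulated, which is why raising logfc_thr or lowering fdr_thr shrinks the highlighted region of the plot.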

funki.pipelines.enrichment_analysis(data, net, contrast=None, method=None, source='source', target='target', weight=None, top=10, ax=None, **kwargs)

Performs enrichment analysis using decoupler based on a given network (e.g. a gene set collection) and statistical method(s), and plots the consensus score across methods for the enrichment results.

Parameters:
  • data (funki.input.DataSet) – The data set from which to perform the enrichment

  • net (pandas.DataFrame) – The network linking the features of the data to the attributes (e.g. pathways, gene sets, transcription factors, etc.)

  • contrast (str) – Which result of the differential expression to use for the enrichment. Must be present in data.varm_keys named with the format '{contrast_var}_vs_{ref_var}'. Defaults to None.

  • method (NoneType | str) – Which statistical method to use to compute the enrichment. Defaults to None, in which case 'ulm' is used. To see all the available methods, run the decoupler.mt.show() function.

  • source (str) – Column name from the provided net containing the gene sets to enrich for. Defaults to 'source'.

  • target (str) – Column name from the provided net containing the gene set components (e.g. gene/protein names) that can be mapped back to the data set variable names. Defaults to 'target'.

  • weight (NoneType | str) – Defines the column in the network containing the weights to use in the enrichment, defaults to None.

  • top (int) – Number of top enriched gene sets to display based on their consensus score. If a negative number is provided, the bottom ones will be displayed instead.

  • ax (matplotlib.axes.Axes) – Matplotlib Axes instance where to draw the plot. Defaults to None, meaning a new figure and axes will be generated.

  • **kwargs (optional) – Other keyword arguments that are passed to the specific method call from decoupler.mt methods

Returns:

None. Results are stored in place in the passed data object, which is a funki.input.DataSet instance. Estimates, p-values and consensus scores (in case of multiple methods) are stored in the obsm attribute of the object.

Return type:

NoneType
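
Two of the conventions described above, the '{contrast_var}_vs_{ref_var}' key format and the sign of top, can be sketched with the standard library alone. The helpers contrast_key and select_top and the scores are hypothetical and only illustrate the documented behaviour; they are not part of funki, and the handling of list-valued contrast_var or ref_var is not covered.

```python
def contrast_key(contrast_var, ref_var):
    """Build the key under which a contrast is expected to be stored."""
    return f"{contrast_var}_vs_{ref_var}"


def select_top(scores, top=10):
    """Pick the gene sets to display from a {name: consensus score} mapping.

    A positive `top` keeps the highest-scoring sets; a negative `top`
    keeps the lowest-scoring ones instead.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if top >= 0:
        return ranked[:top]
    return ranked[top:]


# Hypothetical consensus scores per gene set
scores = {"SET_A": 2.5, "SET_B": -1.2, "SET_C": 0.3, "SET_D": 4.1, "SET_E": -3.0}

print(contrast_key("treatment", "control"))
print(select_top(scores, top=2))
print(select_top(scores, top=-2))
```

The key produced by contrast_key is the kind of string that would be passed as contrast, provided the corresponding differential expression result is already stored in the data set.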

funki.pipelines.sc_quality_control(data, ax=None)

Computes QC metrics on a single-cell data set and generates a multi-panel figure to visualize them, with the following plots:

  • Box plot with highest expression genes

  • Violin plot with number of genes per cell

  • Violin plot with total counts per gene

  • Violin plot with the percentage of mitochondrial genes per cell

  • Scatter plot of total counts vs. percentage of mitochondrial genes

  • Scatter plot of total counts vs. number of genes

Parameters:
  • data (funki.input.DataSet) – The data set from which to compute the QC metrics

  • ax (matplotlib.axes.Axes) – Matplotlib Axes instance where to draw the plots. Defaults to None, meaning a new figure and axes will be generated. If passed, an axes with at least 2 columns and 3 rows is expected.

Returns:

The figure containing the resulting plot with multiple panels for different metrics and comparisons. If an Axes instance is passed through ax, nothing is returned.

Return type:

matplotlib.figure.Figure | None
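
To make the plotted quantities concrete, the sketch below computes the per-cell metrics behind the panels (total counts, number of detected genes and percentage of mitochondrial counts) for a tiny hand-written count matrix. It is a standard-library sketch, not the funki implementation: the function qc_metrics, the toy counts and the assumption that mitochondrial genes are recognised by an 'MT-' name prefix are all illustrative.

```python
def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Compute simple per-cell QC metrics from a cells x genes count matrix.

    Assumption: mitochondrial genes are those whose name starts with
    `mito_prefix`; other naming conventions would need a different rule.
    """
    metrics = []
    for row in counts:
        total = sum(row)
        # A gene counts as detected in a cell if it has at least one count
        n_genes = sum(1 for count in row if count > 0)
        mito = sum(
            count
            for count, gene in zip(row, gene_names)
            if gene.startswith(mito_prefix)
        )
        pct_mito = 100.0 * mito / total if total else 0.0
        metrics.append(
            {"total_counts": total, "n_genes": n_genes, "pct_mito": pct_mito}
        )
    return metrics


# Toy data: 2 cells (rows) x 4 genes (columns)
gene_names = ["MT-CO1", "ACTB", "GAPDH", "MT-ND1"]
counts = [
    [10, 50, 40, 0],
    [5, 0, 20, 25],
]

metrics = qc_metrics(counts, gene_names)
for cell_index, cell in enumerate(metrics):
    print(cell_index, cell)
```

Cells with a high percentage of mitochondrial counts or very few detected genes are the ones that typically stand out in the scatter panels listed above.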