
Downstream

Tools

mina.down.run_ulm_per_view(view_dict: dict[str, pd.DataFrame], net: pd.DataFrame, **kwargs) -> dict[str, dict[str, pd.DataFrame]]

Run ULM (univariate linear modeling) separately for each view.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| view_dict | dict[str, DataFrame] | Dictionary mapping view names to expression matrices (e.g., archetypes × genes). | required |
| net | DataFrame | Prior knowledge network in a decoupler-compatible format. | required |
| **kwargs | dict | Additional keyword arguments passed to decoupler.mt.ulm. | {} |

Returns:

| Type | Description |
| --- | --- |
| dict[str, dict[str, DataFrame]] | Dictionary mapping view names to result dictionaries containing pathway activities (pw_acts) and adjusted p-values (pw_padj). |
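
To make the per-view idea concrete, here is a minimal sketch that does not call decoupler at all. It assumes the usual description of ULM (regress each observation's expression on a source's network weights and report the t-value of the slope) and a long-format network with columns source, target, and weight. The helper names ulm_tvalues and ulm_per_view_sketch are hypothetical, and p-values (pw_padj) are left out to keep the example short.

```python
import numpy as np
import pandas as pd


def ulm_tvalues(mat: pd.DataFrame, net: pd.DataFrame) -> pd.DataFrame:
    """Slope t-values of expression ~ weights, per observation and source.

    mat: observations x genes. net: long table with source, target, weight
    (column names are an assumption about the decoupler-compatible format).
    """
    # Genes x sources weight matrix; genes without an edge get weight 0.
    W = net.pivot_table(index="target", columns="source",
                        values="weight", fill_value=0.0)
    W = W.reindex(mat.columns, fill_value=0.0)
    n_genes = mat.shape[1]

    X = mat.to_numpy(dtype=float)
    Wv = W.to_numpy(dtype=float)
    # Pearson r between every observation (row of X) and every source (column of W).
    Xc = X - X.mean(axis=1, keepdims=True)
    Wc = Wv - Wv.mean(axis=0, keepdims=True)
    denom = np.linalg.norm(Xc, axis=1)[:, None] * np.linalg.norm(Wc, axis=0)[None, :]
    r = (Xc @ Wc) / denom
    # In a one-predictor linear model the slope t-value is a function of r.
    t = r * np.sqrt((n_genes - 2) / (1.0 - r ** 2))
    return pd.DataFrame(t, index=mat.index, columns=W.columns)


def ulm_per_view_sketch(view_dict, net):
    """Apply the scorer to each view independently (activities only)."""
    return {view: {"pw_acts": ulm_tvalues(mat, net)}
            for view, mat in view_dict.items()}


genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
net = pd.DataFrame({
    "source": ["P1", "P1", "P1", "P2", "P2", "P2"],
    "target": genes,
    "weight": [1.0, 1.0, -1.0, 1.0, -1.0, 1.0],
})
view_dict = {
    "viewA": pd.DataFrame([[1.0, 0.8, -1.2, 0.1, -0.1, 0.0]],
                          index=["s1"], columns=genes),
}
results = ulm_per_view_sketch(view_dict, net)
print(results["viewA"]["pw_acts"])
```

The observation s1 follows the sign pattern of the P1 weights, so its P1 score is much larger than its P2 score.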

mina.down.get_associations(adata, test_variable, test_type=None, random_effect=None)

Test associations between model features and an observation-level covariate.

Using:

- For continuous covariates with no random_effect: Pearson correlation.
- For categorical covariates with no random_effect: one-way ANOVA (F-test).
- If random_effect is given: likelihood-ratio test on linear mixed models.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| adata | AnnData | Annotated data matrix with features in .X and covariates in .obs. | required |
| test_variable | str | Column in adata.obs to test for association. | required |
| test_type | "continuous", "categorical", or None | Type of the test variable. If None, inferred from the data type of the column. | None |
| random_effect | str or None | Column in adata.obs specifying grouping for a random intercept. | None |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with columns feature, p_value, and adj_p_value. |
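
The two fixed-effect branches can be sketched with SciPy on plain pandas objects. This is an illustration, not the library's implementation: associations_sketch and bh_adjust are hypothetical names, the mixed-model branch is omitted, and Benjamini-Hochberg is assumed as the adjustment method because the reference above does not say which correction is applied.

```python
import numpy as np
import pandas as pd
from scipy import stats


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed correction method)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adj = np.empty(n)
    adj[order] = np.clip(ranked, 0.0, 1.0)
    return adj


def associations_sketch(features: pd.DataFrame, covariate: pd.Series,
                        test_type=None) -> pd.DataFrame:
    """features: samples x factors; covariate: one value per sample."""
    if test_type is None:
        # Infer the test from the dtype, mirroring the documented behaviour.
        numeric = pd.api.types.is_numeric_dtype(covariate)
        test_type = "continuous" if numeric else "categorical"

    pvals = []
    for name in features.columns:
        y = features[name].to_numpy(dtype=float)
        if test_type == "continuous":
            _, p = stats.pearsonr(y, covariate.to_numpy(dtype=float))
        else:
            groups = [y[(covariate == level).to_numpy()]
                      for level in covariate.unique()]
            _, p = stats.f_oneway(*groups)
        pvals.append(p)

    out = pd.DataFrame({"feature": features.columns, "p_value": pvals})
    out["adj_p_value"] = bh_adjust(out["p_value"])
    return out


features = pd.DataFrame({
    "Factor1": [1.0, 2.1, 2.9, 4.2, 5.0, 6.1],
    "Factor2": [0.3, -0.2, 0.5, 0.1, -0.4, 0.2],
})
age = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
group = pd.Series(["a", "a", "a", "b", "b", "b"])

res_cont = associations_sketch(features, age)    # Pearson branch
res_cat = associations_sketch(features, group)   # ANOVA branch
print(res_cont)
```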

mina.down.calc_total_variance(adata, associations_df, pval_thrs=0.05)

Compute the total explained variance per view for statistically significant features.

This function aggregates the R² values stored in adata.var by summing them across features that pass a significance threshold in the associations table. Variance is computed separately for each view/group as defined by split_by_view.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| adata | AnnData | Model AnnData object containing explained variance (R²) values in adata.var. | required |
| associations_df | DataFrame | Output from get_associations containing feature-level p-values and adjusted p-values. Must include columns ['feature', 'adj_p_value']. | required |
| pval_thrs | float | Adjusted p-value threshold used to select significant features. Default is 0.05. | 0.05 |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Series] | Dictionary mapping each view/group name to a Series containing the summed explained variance per factor across significant features. |

mina.down.get_pval_matrix(adata, covars)

Compute adjusted p-value associations for multiple covariates in a model AnnData.

For each covariate, this function calls down.get_associations to test its association with model factors and collects the adjusted p-values into a DataFrame (p_df). Each column corresponds to a covariate and each row to a factor (adata.var index).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| adata | AnnData | AnnData object containing model factors in .var and covariates in .obs. | required |
| covars | list[str] | Covariate names in adata.obs to test. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Matrix of adjusted p-values with factors as rows and covariates as columns. |

mina.down.get_loading_gset(col, source_base: str, percentile: float = 0.85) -> pd.DataFrame

Extract a gene set from a vector of loadings using a percentile threshold.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| col | Series or DataFrame | Loadings for a single factor. Index corresponds to target/features. | required |
| source_base | str | Base name for the gene set (e.g., "Cardiomyocytes"). | required |
| percentile | float | Quantile in [0, 1] computed separately for positive and negative loadings. Default is 0.85. | 0.85 |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame containing the selected gene set. |
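
One way to read "quantile computed separately for positive and negative loadings" is sketched below. The output layout (source, target, weight), the _pos/_neg suffixes, and the choice to threshold negative loadings by magnitude are all assumptions made for illustration; the actual columns returned by get_loading_gset are not specified above.

```python
import pandas as pd


def loading_gset_sketch(col: pd.Series, source_base: str,
                        percentile: float = 0.85) -> pd.DataFrame:
    """Keep loadings at or beyond the given quantile, per sign."""
    parts = []

    pos = col[col > 0]
    if not pos.empty:
        top = pos[pos >= pos.quantile(percentile)]
        parts.append(pd.DataFrame({
            "source": f"{source_base}_pos",   # hypothetical naming scheme
            "target": top.index,
            "weight": top.to_numpy(),
        }))

    neg = col[col < 0]
    if not neg.empty:
        magnitude = neg.abs()
        bottom = neg[magnitude >= magnitude.quantile(percentile)]
        parts.append(pd.DataFrame({
            "source": f"{source_base}_neg",   # hypothetical naming scheme
            "target": bottom.index,
            "weight": bottom.to_numpy(),
        }))

    if not parts:
        return pd.DataFrame(columns=["source", "target", "weight"])
    return pd.concat(parts, ignore_index=True)


loadings = pd.Series([0.1, 0.2, 0.3, 0.9, -0.1, -0.8],
                     index=["a", "b", "c", "d", "e", "f"])
gset = loading_gset_sketch(loadings, "Cardiomyocytes", percentile=0.75)
print(gset)
```

With percentile=0.75, only the strongest positive gene (d) and the strongest negative gene (f) are retained in this toy vector.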

mina.down.build_info_networks(multicell_scores: pd.DataFrame, random_effect: pd.Series | pd.Index | pd.Categorical | np.ndarray | None = None, standardize: bool = False, drop_na: bool = True, verbose: bool = True) -> pd.DataFrame

Fit pairwise linear models among columns of an enriched-score matrix to infer directed information networks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| multicell_scores | DataFrame | Enriched scores with shape (samples × features). | required |
| random_effect | array-like or None | Optional grouping vector defining random intercepts. Length must match the number of rows. | None |
| standardize | bool | If True, z-score each feature before fitting. | False |
| drop_na | bool | If True, drop rows containing missing values. | True |
| verbose | bool | If True, emit warnings for skipped model fits. | True |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Table with columns: target, predictor, coef, R2, cor_estimate, n_samples, model_type. |
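
The pairwise fitting step can be illustrated without random effects. The sketch below fits an ordinary least-squares line for every ordered (target, predictor) pair and fills the documented output columns. The function name pairwise_ols_sketch and the "ols" label in model_type are hypothetical, and the mixed-model path used when random_effect is given is not reproduced here.

```python
import itertools

import numpy as np
import pandas as pd

COLUMNS = ["target", "predictor", "coef", "R2",
           "cor_estimate", "n_samples", "model_type"]


def pairwise_ols_sketch(scores: pd.DataFrame, standardize: bool = False,
                        drop_na: bool = True) -> pd.DataFrame:
    data = scores.dropna() if drop_na else scores
    if standardize:
        data = (data - data.mean()) / data.std(ddof=0)

    rows = []
    # Ordered pairs: (a, b) and (b, a) are fitted as separate models.
    for target, predictor in itertools.permutations(data.columns, 2):
        x = data[predictor].to_numpy(dtype=float)
        y = data[target].to_numpy(dtype=float)
        if np.isnan(x).any() or np.isnan(y).any():
            continue  # skipped fit: missing values
        if x.std() == 0 or y.std() == 0:
            continue  # skipped fit: constant column
        r = np.corrcoef(x, y)[0, 1]
        rows.append({
            "target": target,
            "predictor": predictor,
            "coef": r * y.std() / x.std(),  # OLS slope of target ~ predictor
            "R2": r ** 2,                   # equals r^2 for one predictor
            "cor_estimate": r,
            "n_samples": len(x),
            "model_type": "ols",            # hypothetical label
        })
    return pd.DataFrame(rows, columns=COLUMNS)


a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = pd.DataFrame({"a": a, "b": 2.0 * a + 1.0,
                       "c": [2.0, 1.0, 4.0, 3.0, 5.0]})
edges = pairwise_ols_sketch(scores)
print(edges)
```

Note that R2 and cor_estimate are symmetric within a pair, while coef depends on which column is treated as the target.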

mina.down.get_multicell_net(test_model: ad.AnnData, sel_factor: str, random_effect: pd.Series | pd.Index | pd.Categorical | np.ndarray | None = None, standardize: bool = False, drop_na: bool = True, verbose: bool = True, percentile: float = 0.85) -> dict[str, pd.DataFrame]

Given a factor of interest within a model, we reconstruct multicellular information networks by:

1) Extracting top genes associated with the factor in each view.
2) Enriching these gene sets in the pseudobulk data to get factor-associated scores.
3) Fitting pairwise linear models among the scores to infer directed networks.

The linear models can be controlled with random effects, standardization, and NA handling options. The final output is a dictionary containing separate inferred networks for positive and negative associations.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| test_model | AnnData | AnnData object containing factor scores and associated metadata. | required |
| sel_factor | str | Name of the factor to extract (e.g., "Factor1"). | required |
| random_effect | array-like or None | Optional grouping vector defining random intercepts. | None |
| standardize | bool | If True, z-score features prior to model fitting. | False |
| drop_na | bool | If True, drop rows containing missing values. | True |
| verbose | bool | If True, warn when models are skipped. | True |
| percentile | float | Percentile threshold in [0, 1] for selecting top genes per view. | 0.85 |

Returns:

| Type | Description |
| --- | --- |
| dict[str, DataFrame] | Dictionary mapping interaction direction to inferred network tables. |

mina.down.multiview_to_wide(views: dict[str, AnnData], sample_key: str | None = None, *, prefix_features: bool = True, return_dataframe: bool = True) -> tuple[pd.DataFrame | np.ndarray, pd.Index, list[str]]

Build a dense wide matrix (samples × features) from a dict of per-view AnnData. Uses the UNION of samples in first-seen order; rows missing in a view are zero-filled.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| views | dict[str, AnnData] | Dictionary mapping view names to AnnData objects containing the data. | required |
| sample_key | str or None | Optional column in .obs to use as sample IDs. If None, uses .obs_names. | None |
| prefix_features | bool | If True, prefix feature names with view name (e.g., "view1:geneA"). Default is True. | True |
| return_dataframe | bool | If True, return a pandas DataFrame with sample IDs and feature names. If False, return a NumPy array with separate index and column lists. | True |
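
The union-and-zero-fill behaviour can be shown with plain DataFrames standing in for the per-view AnnData objects (rows are samples, columns are features). This is a sketch under that assumption; views_to_wide_sketch is a hypothetical helper and it returns only the DataFrame form.

```python
import pandas as pd


def views_to_wide_sketch(views: dict, prefix_features: bool = True) -> pd.DataFrame:
    # Union of sample IDs, keeping the order in which they are first seen.
    samples, seen = [], set()
    for df in views.values():
        for sample in df.index:
            if sample not in seen:
                seen.add(sample)
                samples.append(sample)

    blocks = []
    for name, df in views.items():
        # Samples absent from this view become all-zero rows.
        block = df.reindex(samples, fill_value=0.0)
        if prefix_features:
            block = block.add_prefix(f"{name}:")
        blocks.append(block)
    return pd.concat(blocks, axis=1)


views = {
    "view1": pd.DataFrame({"geneA": [1.0, 2.0]}, index=["s1", "s2"]),
    "view2": pd.DataFrame({"geneA": [5.0, 6.0], "geneB": [7.0, 8.0]},
                          index=["s2", "s3"]),
}
wide = views_to_wide_sketch(views)
print(wide)
```

Zero-filling keeps the matrix dense, but a zero row is indistinguishable from a sample that truly has zero signal in that view, which is worth keeping in mind when interpreting downstream projections.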

mina.down.project_wide_to_factors(wide: pd.DataFrame | np.ndarray, W: np.ndarray, model_cols: Iterable[str], factor_names: Iterable[str] | None = None, rcond: float | None = None, center: bool = False, sample_annotations: pd.DataFrame | None = None) -> ad.AnnData

Project a samples × features matrix into latent factor space.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| wide | DataFrame or ndarray | Matrix with shape (n_samples × n_features_in_wide). | required |
| W | ndarray | Loadings matrix with shape (n_factors × n_features_total). | required |
| model_cols | Iterable[str] | Feature names defining the column order of W. | required |
| factor_names | Iterable[str] or None | Names for output factors. If None, default names are used. | None |
| rcond | float or None | Cutoff for small singular values passed to np.linalg.pinv. | None |
| center | bool | If True, center columns before projection. | False |
| sample_annotations | DataFrame or None | Optional sample-level metadata to add to .obs. | None |

Returns:

| Type | Description |
| --- | --- |
| AnnData | AnnData object with projected factor scores in .X. |
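
The projection itself amounts to a least-squares solve with the pseudoinverse of W. The sketch below returns a DataFrame rather than an AnnData and makes two assumptions that the reference does not confirm: features missing from wide are zero-filled when aligning to model_cols, and default factor names follow the pattern "Factor1", "Factor2", and so on. The name project_sketch is hypothetical.

```python
import numpy as np
import pandas as pd


def project_sketch(wide: pd.DataFrame, W: np.ndarray, model_cols,
                   center: bool = False, rcond=None) -> pd.DataFrame:
    model_cols = list(model_cols)
    # Align columns to the order used by W (zero-fill is an assumption).
    X = wide.reindex(columns=model_cols, fill_value=0.0).to_numpy(dtype=float)
    if center:
        X = X - X.mean(axis=0, keepdims=True)

    # pinv(W) has shape (n_features_total, n_factors).
    W_pinv = np.linalg.pinv(W) if rcond is None else np.linalg.pinv(W, rcond=rcond)
    scores = X @ W_pinv

    names = [f"Factor{i + 1}" for i in range(W.shape[0])]  # assumed defaults
    return pd.DataFrame(scores, index=wide.index, columns=names)


rng = np.random.default_rng(0)
model_cols = [f"view1:g{i}" for i in range(5)]
W = rng.normal(size=(2, 5))            # factors x features
Z_true = rng.normal(size=(4, 2))       # samples x factors
X = pd.DataFrame(Z_true @ W, columns=model_cols,
                 index=[f"s{i}" for i in range(4)])
X_shuffled = X[model_cols[::-1]]       # column order differs from W

Z_hat = project_sketch(X_shuffled, W, model_cols)
print(Z_hat.round(3))
```

Because W has full row rank here, the product of W and its pseudoinverse is the identity, so the projection recovers the factor scores that generated the data.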

Utils

mina.down.model_to_anndata(anndata_dict: dict[str, ad.AnnData], metadata: pd.DataFrame, model) -> ad.AnnData

Combine a factor model and multiple pseudobulk views into a single AnnData object.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| anndata_dict | dict[str, AnnData] | Dictionary of pseudobulk AnnData views. Keys are used to name entries in .obsm. | required |
| metadata | DataFrame | Sample-level metadata indexed by sample ID. | required |
| model | object | Trained factor model exposing get_factors, get_r2, and get_weights methods. | required |

Returns:

| Type | Description |
| --- | --- |
| AnnData | AnnData object containing factor scores, metadata, gene loadings, and aligned pseudobulk matrices. |

mina.down.split_by_view(arch_gex: pd.DataFrame) -> dict[str, pd.DataFrame]

Split a wide DataFrame into view-specific DataFrames.

Column names are expected to follow the format "view:feature".

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| arch_gex | DataFrame | DataFrame with columns encoded as view:feature. | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, DataFrame] | Dictionary mapping view names to DataFrames containing only features from that view. |
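
A minimal sketch of the column-splitting logic follows. It splits on the first ":" only, so feature names that themselves contain a colon stay intact. Whether the real function strips the view prefix from the returned column names is not stated above; this sketch strips it, and split_by_view_sketch is a hypothetical name.

```python
import pandas as pd


def split_by_view_sketch(wide: pd.DataFrame) -> dict:
    # "view:feature" -> view label for each column (split on the first colon).
    parts = wide.columns.str.split(":", n=1)
    view_labels = parts.str[0]
    feature_names = parts.str[1]

    out = {}
    for view in view_labels.unique():
        mask = view_labels == view
        block = wide.loc[:, mask].copy()
        block.columns = feature_names[mask]   # assumption: drop the prefix
        out[view] = block
    return out


arch_gex = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    index=["arch1", "arch2"],
    columns=["CM:geneA", "CM:geneB", "Fib:geneA"],
)
per_view = split_by_view_sketch(arch_gex)
print(per_view["CM"])
```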