Downstream

Tools

`mina.down.run_ulm_per_view(view_dict: dict[str, pd.DataFrame], net: pd.DataFrame, **kwargs) -> dict[str, dict[str, pd.DataFrame]]`

Run ULM (univariate linear modeling) separately for each view.

Parameters:

Name	Type	Description	Default
`view_dict`	`dict[str, DataFrame]`	Dictionary mapping view names to expression matrices (e.g., archetypes × genes).	required
`net`	`DataFrame`	Prior knowledge network in a decoupler-compatible format.	required
`**kwargs`	`dict`	Additional keyword arguments passed to `decoupler.mt.ulm`.	`{}`

Returns:

Type	Description
`dict[str, dict[str, DataFrame]]`	Dictionary mapping view names to result dictionaries containing pathway activities (`pw_acts`) and adjusted p-values (`pw_padj`).
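
To make the statistic concrete, here is a minimal sketch of what a univariate linear model computes for a single view: for each source in the network, the observed values are regressed on that source's target weights, and the t-statistic of the slope serves as the activity score. The helper `ulm_tstat` and the toy data are illustrative assumptions, not part of `mina` or `decoupler`. The `source`, `target`, and `weight` columns follow the usual decoupler network layout.

```python
import numpy as np
import pandas as pd


def ulm_tstat(expr: pd.Series, net: pd.DataFrame) -> pd.Series:
    """Slope t-statistic from regressing expression on each source's weights.

    Hypothetical helper for illustration; not the decoupler implementation.
    """
    n = len(expr)
    scores = {}
    for source, grp in net.groupby("source"):
        # Weight vector over all genes: 0 for genes the source does not target.
        w = pd.Series(0.0, index=expr.index)
        targets = grp.set_index("target")["weight"]
        shared = targets.index.intersection(expr.index)
        w.loc[shared] = targets.loc[shared]
        # For a single predictor, the slope t-statistic follows from Pearson's r.
        r = np.corrcoef(w.to_numpy(), expr.to_numpy())[0, 1]
        scores[source] = r * np.sqrt((n - 2) / (1 - r**2))
    return pd.Series(scores)


# Toy data: one row of a view (e.g., one archetype) and a small network.
expr = pd.Series(
    [3.0, 2.5, 2.8, 0.1, -0.2, 0.3],
    index=["g1", "g2", "g3", "g4", "g5", "g6"],
)
net = pd.DataFrame(
    {
        "source": ["P1", "P1", "P1", "P2", "P2"],
        "target": ["g1", "g2", "g3", "g4", "g5"],
        "weight": [1.0, 1.0, 1.0, 1.0, 1.0],
    }
)
t_stats = ulm_tstat(expr, net)
```

In this toy case `P1` targets the highly expressed genes and gets a positive score, while `P2` targets the low ones and gets a negative score.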

`mina.down.get_associations(adata, test_variable, test_type=None, random_effect=None)`

Test associations between model features and an observation-level covariate.

The test used depends on the inputs:

- For continuous covariates with no `random_effect`: Pearson correlation.
- For categorical covariates with no `random_effect`: one-way ANOVA (F-test).
- If `random_effect` is given: likelihood-ratio test on linear mixed models.

Parameters:

Name	Type	Description	Default
`adata`	`AnnData`	Annotated data matrix with features in `.X` and covariates in `.obs`.	required
`test_variable`	`str`	Column in `adata.obs` to test for association.	required
`test_type`	`"continuous"`, `"categorical"`, or `None`	Type of the test variable. If `None`, inferred from the data type.	`None`
`random_effect`	`str or None`	Column in `adata.obs` specifying grouping for a random intercept.	`None`

Returns:

Type	Description
`DataFrame`	DataFrame with columns `feature`, `p_value`, and `adj_p_value`.
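
The two tests used when no `random_effect` is given can be sketched with SciPy. This is an illustration of the test choice only: the helper `association_pvalue`, the dtype-based inference rule, and the simulated data are assumptions, and the mixed-model likelihood-ratio branch is not covered.

```python
import numpy as np
import pandas as pd
from scipy import stats


def association_pvalue(feature: pd.Series, covariate: pd.Series) -> float:
    """Pearson correlation for numeric covariates, one-way ANOVA otherwise.

    Hypothetical helper; assumes the test type is inferred from the dtype.
    """
    if pd.api.types.is_numeric_dtype(covariate):
        _, p_value = stats.pearsonr(feature, covariate)
        return float(p_value)
    # One group of feature values per category level.
    groups = [feature[covariate == level] for level in covariate.unique()]
    _, p_value = stats.f_oneway(*groups)
    return float(p_value)


# Simulated factor scores that depend on a continuous covariate.
rng = np.random.default_rng(0)
age = pd.Series(np.arange(20, dtype=float))
factor = 0.5 * age + rng.normal(scale=0.5, size=20)
condition = pd.Series(["ctrl"] * 10 + ["case"] * 10)

p_continuous = association_pvalue(factor, age)
p_categorical = association_pvalue(factor, condition)
```

Both p-values are raw. The `adj_p_value` column returned by `get_associations` additionally reflects a multiple-testing correction across features.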

`mina.down.calc_total_variance(adata, associations_df, pval_thrs=0.05)`

Compute the total explained variance per view for statistically significant features.

This function aggregates the R² values stored in `adata.var` by summing them across features that pass a significance threshold in the associations table. Variance is computed separately for each view/group as defined by `split_by_view`.

Parameters:

Name	Type	Description	Default
`adata`	`AnnData`	Model AnnData object containing explained variance (R²) values in `adata.var`.	required
`associations_df`	`DataFrame`	Output from `get_associations` containing feature-level p-values and adjusted p-values. Must include columns `['feature', 'adj_p_value']`.	required
`pval_thrs`	`float`	Adjusted p-value threshold used to select significant features.	`0.05`

Returns:

Type	Description
`dict[str, Series]`	Dictionary mapping each view/group name to a Series containing the summed explained variance per factor across significant features.
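
The aggregation can be pictured with plain pandas. The sketch below is one possible reading of the description, under several assumptions: the R² table has one row per model feature and columns named `view:group`, significance means `adj_p_value` strictly below the threshold, and the helper name `total_variance` is hypothetical. The actual layout of `adata.var` may differ.

```python
import pandas as pd


def total_variance(
    r2: pd.DataFrame, associations: pd.DataFrame, pval_thrs: float = 0.05
) -> dict[str, pd.Series]:
    """Sum R² over significant features, separately per view prefix.

    Hypothetical helper; assumes "view:group" column names.
    """
    significant = associations.loc[
        associations["adj_p_value"] < pval_thrs, "feature"
    ]
    kept = r2.loc[r2.index.isin(significant)]
    # The view is the part of each column name before the first colon.
    views = kept.columns.str.split(":", n=1).str[0]
    return {
        view: kept.loc[:, views == view].sum(axis=0) for view in views.unique()
    }


# Toy R² table and association results.
r2 = pd.DataFrame(
    {"CM:all": [10.0, 5.0, 1.0], "Fib:all": [2.0, 8.0, 4.0]},
    index=["Factor1", "Factor2", "Factor3"],
)
associations = pd.DataFrame(
    {
        "feature": ["Factor1", "Factor2", "Factor3"],
        "adj_p_value": [0.01, 0.20, 0.03],
    }
)
totals = total_variance(r2, associations)
```

Here `Factor2` is excluded because its adjusted p-value exceeds 0.05, so only `Factor1` and `Factor3` contribute to the sums.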

`mina.down.get_pval_matrix(adata, covars)`

Compute adjusted p-value associations for multiple covariates in a model AnnData.

For each covariate, this function calls `get_associations` to test its association with model factors and collects the adjusted p-values into a DataFrame (`p_df`). Each column corresponds to a covariate and each row to a factor (the `adata.var` index).

Parameters:

Name	Type	Description	Default
`adata`	`AnnData`	AnnData object containing model factors in `.var` and covariates in `.obs`.	required
`covars`	`list[str]`	Covariate names in `adata.obs` to test.	required

Returns:

Type	Description
`DataFrame`	Matrix of adjusted p-values with factors as rows and covariates as columns.

`mina.down.get_loading_gset(col, source_base: str, percentile: float = 0.85) -> pd.DataFrame`

Extract a gene set from a vector of loadings using a percentile threshold.

Parameters:

Name	Type	Description	Default
`col`	`Series or DataFrame`	Loadings for a single factor. Index corresponds to target/features.	required
`source_base`	`str`	Base name for the gene set (e.g., "Cardiomyocytes").	required
`percentile`	`float`	Quantile in [0, 1] computed separately for positive and negative loadings.	`0.85`

Returns:

Type	Description
`DataFrame`	DataFrame containing the selected gene set.
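
A percentile cut applied separately to each sign can be sketched as follows. The helper `loading_gene_set`, the `_pos`/`_neg` suffixes, and the `source`/`target`/`weight` output columns are assumptions made for illustration. The returned layout of `get_loading_gset` is not specified above.

```python
import pandas as pd


def loading_gene_set(
    col: pd.Series, source_base: str, percentile: float = 0.85
) -> pd.DataFrame:
    """Keep loadings at or above the percentile, separately per sign.

    Hypothetical helper; output columns and suffixes are assumed.
    """
    parts = []
    for suffix, subset in (("pos", col[col > 0]), ("neg", col[col < 0])):
        if subset.empty:
            continue
        # Threshold on magnitudes so "top" means strongest in either direction.
        magnitude = subset.abs()
        top = subset[magnitude >= magnitude.quantile(percentile)]
        parts.append(
            pd.DataFrame(
                {
                    "source": f"{source_base}_{suffix}",
                    "target": top.index,
                    "weight": top.to_numpy(),
                }
            )
        )
    if not parts:
        return pd.DataFrame(columns=["source", "target", "weight"])
    return pd.concat(parts, ignore_index=True)


# Ten positive and five negative loadings.
values = [float(v) for v in range(1, 11)] + [-float(v) for v in range(1, 6)]
loadings = pd.Series(values, index=[f"g{i}" for i in range(1, 16)])
gset = loading_gene_set(loadings, "Cardiomyocytes")
```

With the default of 0.85, the positive threshold is 8.65 and the negative-magnitude threshold is 4.4, so the sketch keeps `g9` and `g10` on the positive side and `g15` on the negative side.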

`mina.down.build_info_networks(multicell_scores: pd.DataFrame, random_effect: pd.Series | pd.Index | pd.Categorical | np.ndarray | None = None, standardize: bool = False, drop_na: bool = True, verbose: bool = True) -> pd.DataFrame`

Fit pairwise linear models among columns of an enriched-score matrix to infer directed information networks.

Parameters:

Name	Type	Description	Default
`multicell_scores`	`DataFrame`	Enriched scores with shape (samples × features).	required
`random_effect`	`array-like or None`	Optional grouping vector defining random intercepts. Length must match the number of rows.	`None`
`standardize`	`bool`	If True, z-score each feature before fitting.	`False`
`drop_na`	`bool`	If True, drop rows containing missing values.	`True`
`verbose`	`bool`	If True, emit warnings for skipped model fits.	`True`

Returns:

Type	Description
`DataFrame`	Table with columns: `target, predictor, coef, R2, cor_estimate, n_samples, model_type`.
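
The fixed-effects case can be sketched as ordinary least squares over every ordered pair of columns. This is a simplified illustration, not the library's implementation: the helper `pairwise_linear_models` and the `"ols"` label are assumptions, and random intercepts and standardization are left out.

```python
import itertools

import numpy as np
import pandas as pd


def pairwise_linear_models(scores: pd.DataFrame) -> pd.DataFrame:
    """Fit target ~ predictor by OLS for every ordered pair of columns.

    Hypothetical helper; ignores random effects.
    """
    scores = scores.dropna()  # drop rows containing missing values
    rows = []
    for target, predictor in itertools.permutations(scores.columns, 2):
        x = scores[predictor].to_numpy(dtype=float)
        y = scores[target].to_numpy(dtype=float)
        if len(x) < 3 or np.std(x) == 0 or np.std(y) == 0:
            continue  # skip fits that are not identifiable
        slope = np.polyfit(x, y, 1)[0]
        cor = np.corrcoef(x, y)[0, 1]
        rows.append(
            {
                "target": target,
                "predictor": predictor,
                "coef": slope,
                "R2": cor**2,  # equals R² for a single-predictor model
                "cor_estimate": cor,
                "n_samples": len(x),
                "model_type": "ols",
            }
        )
    return pd.DataFrame(rows)


scores = pd.DataFrame(
    {
        "A": [1.0, 2.0, 3.0, 4.0, 5.0],
        "B": [3.0, 5.0, 7.0, 9.0, 11.0],  # exactly 2 * A + 1
        "C": [2.0, 1.0, 4.0, 3.0, 6.0],
    }
)
edges = pairwise_linear_models(scores)
```

Each unordered pair appears twice, once per direction. The coefficient differs between directions while R² and the correlation do not.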

`mina.down.get_multicell_net(test_model: ad.AnnData, sel_factor: str, random_effect: pd.Series | pd.Index | pd.Categorical | np.ndarray | None = None, standardize: bool = False, drop_na: bool = True, verbose: bool = True, percentile: float = 0.85) -> dict[str, pd.DataFrame]`

Given a factor of interest within a model, reconstruct multicellular information networks by:

1. Extracting top genes associated with the factor in each view.
2. Enriching these gene sets in the pseudobulk data to get factor-associated scores.
3. Fitting pairwise linear models among the scores to infer directed networks.

The linear models can be controlled with random effects, standardization, and NA handling options. The final output is a dictionary containing separate inferred networks for positive and negative associations.

Parameters:

Name	Type	Description	Default
`test_model`	`AnnData`	AnnData object containing factor scores and associated metadata.	required
`sel_factor`	`str`	Name of the factor to extract (e.g., "Factor1").	required
`random_effect`	`array-like or None`	Optional grouping vector defining random intercepts.	`None`
`standardize`	`bool`	If True, z-score features prior to model fitting.	`False`
`drop_na`	`bool`	If True, drop rows containing missing values.	`True`
`verbose`	`bool`	If True, warn when models are skipped.	`True`
`percentile`	`float`	Percentile threshold in [0, 1] for selecting top genes per view.	`0.85`

Returns:

Type	Description
`dict[str, DataFrame]`	Dictionary mapping interaction direction to inferred network tables.

`mina.down.multiview_to_wide(views: dict[str, AnnData], sample_key: str | None = None, *, prefix_features: bool = True, return_dataframe: bool = True) -> tuple[pd.DataFrame | np.ndarray, pd.Index, list[str]]`

Build a dense wide matrix (samples × features) from a dict of per-view AnnData. Uses the UNION of samples in first-seen order; rows missing in a view are zero-filled.

Parameters:

Name	Type	Description	Default
`views`	`dict[str, AnnData]`	Dictionary mapping view names to AnnData objects containing the data.	required
`sample_key`	`str or None`	Optional column in `.obs` to use as sample IDs. If None, uses `.obs_names`.	`None`
`prefix_features`	`bool`	If True, prefix feature names with view name (e.g., "view1:geneA").	`True`
`return_dataframe`	`bool`	If True, return a pandas DataFrame with sample IDs and feature names. If False, return a NumPy array with separate index and column lists.	`True`
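
The union-and-zero-fill behaviour can be illustrated with plain DataFrames standing in for the per-view AnnData objects. The helper `frames_to_wide` is a hypothetical stand-in, not the library function. It assumes that rows are samples and columns are features in each view.

```python
import pandas as pd


def frames_to_wide(views: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Concatenate per-view matrices on the union of samples.

    Hypothetical helper; samples missing from a view become rows of zeros.
    """
    # Union of sample IDs, kept in the order they are first seen.
    samples: list[str] = []
    for frame in views.values():
        for sample in frame.index:
            if sample not in samples:
                samples.append(sample)

    blocks = []
    for name, frame in views.items():
        block = frame.reindex(samples, fill_value=0.0)
        block.columns = [f"{name}:{feature}" for feature in block.columns]
        blocks.append(block)
    return pd.concat(blocks, axis=1)


views = {
    "CM": pd.DataFrame(
        {"geneA": [1.0, 2.0], "geneB": [3.0, 4.0]}, index=["s1", "s2"]
    ),
    "Fib": pd.DataFrame({"geneA": [5.0, 6.0]}, index=["s2", "s3"]),
}
wide = frames_to_wide(views)
```

Sample `s3` is absent from the `CM` view and `s1` from the `Fib` view, so the corresponding entries in `wide` are zero.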

`mina.down.project_wide_to_factors(wide: pd.DataFrame | np.ndarray, W: np.ndarray, model_cols: Iterable[str], factor_names: Iterable[str] | None = None, rcond: float | None = None, center: bool = False, sample_annotations: pd.DataFrame | None = None) -> ad.AnnData`

Project a samples × features matrix into latent factor space.

Parameters:

Name	Type	Description	Default
`wide`	`DataFrame or ndarray`	Matrix with shape (n_samples × n_features_in_wide).	required
`W`	`ndarray`	Loadings matrix with shape (n_factors × n_features_total).	required
`model_cols`	`Iterable[str]`	Feature names defining the column order of `W`.	required
`factor_names`	`Iterable[str] or None`	Names for output factors. If None, default names are used.	`None`
`rcond`	`float or None`	Cutoff for small singular values passed to `np.linalg.pinv`.	`None`
`center`	`bool`	If True, center columns before projection.	`False`
`sample_annotations`	`DataFrame or None`	Optional sample-level metadata to add to `.obs`.	`None`

Returns:

Type	Description
`AnnData`	AnnData object with projected factor scores in `.X`.
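
Since `rcond` is passed to `np.linalg.pinv`, the projection can be understood as multiplying the aligned data matrix by the pseudo-inverse of `W`. The sketch below shows that idea on synthetic data. The helper `project_to_factors`, the zero-filling of features missing from `wide`, and the `Factor1`, `Factor2` default names are assumptions.

```python
import numpy as np
import pandas as pd


def project_to_factors(
    wide: pd.DataFrame, W: np.ndarray, model_cols: list[str]
) -> pd.DataFrame:
    """Project samples into factor space with the pseudo-inverse of W.

    Hypothetical helper; W has shape (n_factors, n_features).
    """
    # Align columns to the model's feature order; absent features become 0.
    X = wide.reindex(columns=model_cols, fill_value=0.0).to_numpy(dtype=float)
    Z = X @ np.linalg.pinv(W)  # (n_samples, n_factors)
    names = [f"Factor{i + 1}" for i in range(W.shape[0])]
    return pd.DataFrame(Z, index=wide.index, columns=names)


# Synthetic check: build data from known scores, then recover them.
rng = np.random.default_rng(0)
model_cols = [f"CM:gene{i}" for i in range(5)]
W = rng.normal(size=(2, 5))
Z_true = rng.normal(size=(4, 2))
wide = pd.DataFrame(
    Z_true @ W, index=["s1", "s2", "s3", "s4"], columns=model_cols
)
wide = wide[model_cols[::-1]]  # column order differs from the model's
scores = project_to_factors(wide, W, model_cols)
```

Because the data were generated exactly as `Z_true @ W` and `W` has full row rank, the recovered scores match `Z_true` up to floating-point error. With real data the projection is a least-squares fit.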

Utils

`mina.down.model_to_anndata(anndata_dict: dict[str, ad.AnnData], metadata: pd.DataFrame, model) -> ad.AnnData`

Combine a factor model and multiple pseudobulk views into a single AnnData object.

Parameters:

Name	Type	Description	Default
`anndata_dict`	`dict[str, AnnData]`	Dictionary of pseudobulk AnnData views. Keys are used to name entries in `.obsm`.	required
`metadata`	`DataFrame`	Sample-level metadata indexed by sample ID.	required
`model`	`object`	Trained factor model exposing `get_factors`, `get_r2`, and `get_weights` methods.	required

Returns:

Type	Description
`AnnData`	AnnData object containing factor scores, metadata, gene loadings, and aligned pseudobulk matrices.

`mina.down.split_by_view(arch_gex: pd.DataFrame) -> dict[str, pd.DataFrame]`

Split a wide DataFrame into view-specific DataFrames.

Column names are expected to follow the format "view:feature".

Parameters:

Name	Type	Description	Default
`arch_gex`	`DataFrame`	DataFrame with columns encoded as `view:feature`.	required

Returns:

Type	Description
`dict[str, DataFrame]`	Dictionary mapping view names to DataFrames containing only features from that view.
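
The column convention can be illustrated with a short pandas sketch. The helper `split_columns_by_view` is hypothetical. Whether the library strips the `view:` prefix from the returned column names is not stated above, so stripping it here is an assumption.

```python
import pandas as pd


def split_columns_by_view(wide: pd.DataFrame) -> dict[str, pd.DataFrame]:
    """Group columns named "view:feature" by their view prefix.

    Hypothetical helper; strips the prefix from the returned columns.
    """
    prefixes = wide.columns.str.split(":", n=1).str[0]
    views: dict[str, pd.DataFrame] = {}
    for view in prefixes.unique():
        block = wide.loc[:, prefixes == view].copy()
        block.columns = block.columns.str.split(":", n=1).str[1]
        views[view] = block
    return views


wide = pd.DataFrame(
    {
        "CM:geneA": [1.0, 2.0],
        "CM:geneB": [3.0, 4.0],
        "Fib:geneA": [5.0, 6.0],
    },
    index=["arch1", "arch2"],
)
parts = split_columns_by_view(wide)
```

Splitting on the first colon only keeps feature names that themselves contain a colon intact.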