API Reference

COSMOS PKN builders

Build the COSMOS prior-knowledge network from multiple sources.

Calls each requested resource processor, collects the yielded Interaction records, optionally translates IDs to canonical form (ChEBI for metabolites, UniProt for proteins), and returns a CosmosBundle.

When translate_ids is True (default), the bundle's metabolites, proteins, and reactions lists are populated with provenance records linking original source IDs to their canonical counterparts. The network list holds Interaction namedtuples with translated IDs.

The resources dict in the config controls both which resources are active and their parameters. The top-level organism is injected into each resource unless overridden.
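The injection rule can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `resolve_resources`, the shape of the config dict, and the convention that a value of `False` switches a resource off inside `resources` are all assumptions made for the example.

```python
def resolve_resources(config):
    """Return {resource_name: params} for the active resources.

    Hypothetical helper: the real config handling may differ.
    """
    organism = config.get('organism')
    resolved = {}

    for name, params in config.get('resources', {}).items():
        if params is False:
            # Assumed convention: False disables the resource.
            continue

        params = dict(params) if isinstance(params, dict) else {}
        # Inject the top-level organism unless the resource overrides it.
        params.setdefault('organism', organism)
        resolved[name] = params

    return resolved


config = {
    'organism': 9606,
    'resources': {
        'tcdb': {},
        'stitch': {'score_threshold': 400, 'organism': 10090},
        'slc': False,
    },
}
resolved = resolve_resources(config)
```

In this sketch `tcdb` inherits the top-level organism, `stitch` keeps its own override, and `slc` is left out entirely.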

Parameters:

  • *args (dict | Path | str; default: ()): Configuration overrides as dicts or YAML file paths.
  • row_filter (default: None): Optional callable (Interaction) → bool. When provided, it is applied before ID translation so that rows known to be unneeded are dropped early and never translated. For example, this avoids paying the cost of translating metabolic GEM edges when only transporter edges are requested.
  • **kwargs (default: {}): Top-level config keys. Resource names are accepted as shorthand, e.g.:

      build(stitch={'score_threshold': 400})
      build(tcdb=False, slc=False)
      build(organism=10090)

Returns:

  • CosmosBundle: A CosmosBundle with all interactions in network. When translate_ids is True, provenance records are populated in metabolites, proteins, and reactions.
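The effect of applying row_filter before translation can be illustrated with a small stand-in pipeline. Everything here is hypothetical: the reduced `Interaction` namedtuple, the `translate` stub, the `run_pipeline` function, and the `'metabolic'` interaction type are placeholders chosen for the sketch, not the library's actual objects or values.

```python
from collections import namedtuple

# Hypothetical stand-in with only the fields this sketch needs.
Interaction = namedtuple(
    'Interaction',
    ['source', 'target', 'interaction_type', 'resource'],
)

translated = []  # records which rows reached the translation step


def translate(row):
    """Stand-in for ID translation: logs the row and upper-cases the source."""
    translated.append(row.source)
    return row._replace(source=row.source.upper())


def run_pipeline(rows, row_filter=None):
    out = []
    for row in rows:
        if row_filter is not None and not row_filter(row):
            continue  # dropped early, so translate() is never called for it
        out.append(translate(row))
    return out


rows = [
    Interaction('atp', 'P00533', 'transport', 'TCDB'),
    # 'metabolic' is a placeholder interaction type for this example.
    Interaction('glc', 'P11166', 'metabolic', 'GEM:Human-GEM'),
]
kept = run_pipeline(rows, row_filter=lambda r: r.interaction_type == 'transport')
```

Only the transport row reaches `translate`, which is the cost saving described for row_filter.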

Build the transporter subset of the COSMOS PKN.

Convenience wrapper around build that enables only transporter-relevant resources (TCDB, SLC, GEM, Recon3D, and the MRCLinksDB transport records described under **kwargs) and filters to keep only transporter interactions.

STITCH is excluded because it is a general chemical-protein interaction database (last updated 2015) that does not annotate transport specifically. Classifying STITCH interactions as transport by checking whether the protein appears in TCDB is methodologically unsound: a protein can be a transporter for some substrates while acting as a kinase, receptor, or enzyme for others. Dedicated transporter databases (TCDB, SLC, GEM, Recon3D) provide direct, mechanistically-annotated transport interactions and are actively maintained. See ADR 0002 in saezverse.

The filter is applied before ID translation so that metabolic GEM edges (resource='GEM:<gem>') are discarded early and never translated, avoiding redundant computation.

Filter predicate
  • interaction_type in ('transport', 'transporter')
  • resource.startswith('GEM_transporter')
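A minimal sketch of this predicate follows. It assumes the two listed conditions are combined with a logical OR, which the text does not state explicitly. The reduced `Interaction` namedtuple and the `'unspecified'` interaction type are placeholders for the example.

```python
from collections import namedtuple

# Hypothetical stand-in carrying only the two fields the predicate reads.
Interaction = namedtuple('Interaction', ['interaction_type', 'resource'])


def is_transporter(row):
    # Assumption: a row is kept if EITHER condition holds.
    return (
        row.interaction_type in ('transport', 'transporter')
        or row.resource.startswith('GEM_transporter')
    )


rows = [
    Interaction('transport', 'TCDB'),
    Interaction('unspecified', 'GEM_transporter:Human-GEM'),
    Interaction('unspecified', 'GEM:Human-GEM'),
]
flags = [is_transporter(r) for r in rows]
```

A callable of this shape could be passed as row_filter to build, so metabolic GEM rows are rejected before translation.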

Parameters:

  • *args (default: ()): Passed through to build.
  • **kwargs (default: {}): Passed through to build. brenda and stitch are disabled unless explicitly re-enabled. mrclinksdb is enabled: its transport-classified records (interaction_type='transport') are included. mrclinksdb_transporter is also enabled: it is the dedicated transporter protein file, in which all records have interaction_type='transport'.

Returns:

  • CosmosBundle: A CosmosBundle containing only transporter interactions, with provenance filtered to the surviving edges.

Build the receptor subset of the COSMOS PKN.

Convenience wrapper around build that enables only receptor-relevant resources (MRCLinksDB, STITCH) and post-filters to keep only receptor/ligand interactions.

Unlike build_transporters, the cell_surface_only filter is applied after ID translation (via _filter_bundle) rather than as a pre-translation row_filter. This is necessary because STITCH proteins have no location data at yield time: their locations are assigned post-translation by _enrich_stitch_locations inside build. MRCLinksDB locations are set at yield time and would support pre-filtering, but a single post-translation pass keeps the implementation uniform.

Post-filter predicate
  • interaction_type == 'ligand_receptor'
  • If cell_surface_only=True: also 'e' in row.locations
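The post-filter can be sketched as a plain function. The `Row` namedtuple is a hypothetical stand-in, and representing locations as a tuple of compartment codes (with `'c'` as a second code) is an assumption for the example.

```python
from collections import namedtuple

# Hypothetical stand-in; locations is assumed to be a tuple of codes.
Row = namedtuple('Row', ['interaction_type', 'locations'])


def is_receptor(row, cell_surface_only=False):
    if row.interaction_type != 'ligand_receptor':
        return False
    if cell_surface_only:
        # 'e' marks plasma membrane / cell surface in the text above.
        return 'e' in row.locations
    return True


rows = [
    Row('ligand_receptor', ('e',)),
    Row('ligand_receptor', ('c',)),
    Row('transport', ('e',)),
]
all_receptors = [is_receptor(r) for r in rows]
surface_only = [is_receptor(r, cell_surface_only=True) for r in rows]
```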

Parameters:

  • *args (default: ()): Passed through to build.
  • cell_surface_only (bool; default: False): If True, retain only interactions where the receptor protein is annotated to the plasma membrane / cell surface ('e' in locations). Useful for cell-cell communication models (COSMOS intercellular layer, NicheNet, etc.) where only surface-exposed receptors are relevant. Locations are derived from UniProt subcellular location annotations mapped via the TCDB location table.
  • **kwargs (default: {}): Passed through to build. tcdb, slc, brenda, gem, and recon3d are disabled unless explicitly re-enabled.

Returns:

  • CosmosBundle: A CosmosBundle containing only receptor interactions, with provenance filtered to the surviving edges.

Build the allosteric-regulation subset of the COSMOS PKN.

Convenience wrapper around build that enables only allosteric-relevant resources (BRENDA, STITCH) and post-filters to keep only allosteric interactions.

Corresponds to the Metabolite-protein interaction category in the COSMOS PKN planning document: small molecules that activate or inhibit proteins through allosteric binding, distinct from stoichiometric enzymatic metabolism.

Post-filter predicate
  • interaction_type == 'allosteric_regulation' (BRENDA)
  • STITCH rows where interaction_type == 'other'
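As a rough sketch, the predicate can be written as below. It identifies STITCH rows by `resource == 'STITCH'`, following the formatter description of the same predicate on this page. The `Row` namedtuple is a hypothetical stand-in, and the OR between the two conditions is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in carrying only the fields the predicate reads.
Row = namedtuple('Row', ['interaction_type', 'resource'])


def is_allosteric(row):
    # Assumption: either condition is sufficient to keep the row.
    return (
        row.interaction_type == 'allosteric_regulation'
        or (row.resource == 'STITCH' and row.interaction_type == 'other')
    )


rows = [
    Row('allosteric_regulation', 'BRENDA'),
    Row('other', 'STITCH'),
    Row('other', 'TCDB'),
    Row('ligand_receptor', 'STITCH'),
]
flags = [is_allosteric(r) for r in rows]
```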

Parameters:

  • *args (default: ()): Passed through to build.
  • **kwargs (default: {}): Passed through to build. tcdb, slc, mrclinksdb, gem, and recon3d are disabled unless explicitly re-enabled.

Returns:

  • CosmosBundle: A CosmosBundle containing only allosteric regulation interactions, with provenance filtered to the surviving edges.

Build the enzyme-metabolite (metabolic) subset of the COSMOS PKN.

Convenience wrapper around build that enables only GEM resources and post-filters to keep only stoichiometric enzyme-metabolite interactions from genome-scale metabolic models.

Corresponds to the Enzyme-metabolite category in the COSMOS PKN planning document: direct metabolic reactions where enzymes act on substrates and products, as opposed to allosteric regulation.

Pre-translation filter predicate
  • resource.startswith('GEM:') (metabolic GEM edges only, distinct from 'GEM_transporter:' transport edges)

The filter is applied before ID translation so that transport GEM edges (resource='GEM_transporter:<gem>') are discarded early and never translated, which avoids spurious drop-rate warnings.

Note

'GEM_transporter:...' resources are excluded because 'GEM_transporter:...'.startswith('GEM:') is False. For allosteric regulation (BRENDA, STITCH), use build_allosteric.
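The prefix check is easy to verify directly. In this small sketch the helper name `is_metabolic_gem` is hypothetical and the resource strings are illustrative values in the 'GEM:<gem>' and 'GEM_transporter:<gem>' patterns described above.

```python
def is_metabolic_gem(resource):
    # The trailing colon matters: 'GEM_transporter:...' has '_' where
    # 'GEM:' has ':', so the prefix does not match.
    return resource.startswith('GEM:')


resources = ['GEM:Human-GEM', 'GEM_transporter:Human-GEM', 'TCDB']
kept = [r for r in resources if is_metabolic_gem(r)]
```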

Parameters:

  • *args (default: ()): Passed through to build.
  • **kwargs (default: {}): Passed through to build. tcdb, slc, brenda, mrclinksdb, recon3d, and stitch are disabled unless explicitly re-enabled.

Returns:

  • CosmosBundle: A CosmosBundle containing only stoichiometric enzyme-metabolite interactions from GEMs, with provenance filtered to the surviving edges.

COSMOS PKN formatters

Format the transporter category of a COSMOS PKN bundle.

Convenience wrapper around format_pkn that pre-filters source to transporter rows before formatting. When source already comes from build_transporters, the filter is a no-op.

Orphan transport reactions included in source (from a build step with include_orphans=True) are formatted as Rxn{N}__<reaction_id> nodes. To exclude them, pass include_orphans=False to the upstream build call before formatting.

Parameters:

  • source (required): CosmosBundle or translated PKN DataFrame.

Returns:

  • CosmosBundle: A CosmosBundle with COSMOS-formatted transporter edges.

Format the receptor category of a COSMOS PKN bundle.

Convenience wrapper around format_pkn that pre-filters source to receptor rows before formatting. When source already comes from build_receptors, the filter is a no-op.

Parameters:

  • source (required): CosmosBundle or translated PKN DataFrame.

Returns:

  • CosmosBundle: A CosmosBundle with COSMOS-formatted receptor edges.

Format the allosteric-regulation category of a COSMOS PKN bundle.

Convenience wrapper around format_pkn that pre-filters source to allosteric rows before formatting. When source already comes from build_allosteric, the filter is a no-op.

The filter predicate matches build_allosteric:

  • interaction_type == 'allosteric_regulation' (BRENDA)
  • resource == 'STITCH' and interaction_type == 'other' (STITCH other)

Parameters:

  • source (required): CosmosBundle or translated PKN DataFrame.

Returns:

  • CosmosBundle: A CosmosBundle with COSMOS-formatted allosteric edges.

Format the enzyme-metabolite (metabolic GEM) category of a COSMOS PKN bundle.

Convenience wrapper around format_pkn that pre-filters source to metabolic GEM rows before formatting. When source already comes from build_enzyme_metabolite, the filter is a no-op.

The filter predicate matches build_enzyme_metabolite:

  • resource.startswith('GEM:'): metabolic GEM edges only. 'GEM_transporter:...' resources are excluded because 'GEM_transporter:...'.startswith('GEM:') is False.

Parameters:

  • source (required): CosmosBundle or translated PKN DataFrame.

Returns:

  • CosmosBundle: A CosmosBundle with COSMOS-formatted enzyme-metabolite edges.

Data classes

Full output of the COSMOS PKN build and format pipeline.

Attributes:

  • network (list): Formatted PKN edges as CosmosEdge records. Source and target are COSMOS node ID strings (e.g. 'Metab__CHEBI:15422_c', 'Gene1__P00533').
  • metabolites (list[CosmosMetabolite]): One CosmosMetabolite per unique metabolite, mapping canonical ChEBI IDs back to original source identifiers.
  • proteins (list[CosmosProtein]): One CosmosProtein per unique protein, mapping canonical UniProt ACs back to original source identifiers.
  • reactions (list[CosmosReaction]): One CosmosReaction per unique GEM reaction that contributed edges to the network. Empty for non-GEM resources.

Functions

to_dataframes()

Convert all bundle components to pandas DataFrames.

Returns:

  • dict[str, DataFrame]: Dict with keys 'network', 'metabolites', 'proteins', and 'reactions', each holding the corresponding DataFrame. Columns match the fields of the respective namedtuple class.

Bases: NamedTuple

One edge in the final formatted COSMOS PKN network.

Yielded by omnipath_metabo.datasets.cosmos._format.format_pkn. Node IDs are COSMOS-formatted strings (e.g. 'Metab__CHEBI:15422_c', 'Gene1__P12345'). Convert a collection to a DataFrame with pd.DataFrame(edges).

Attributes

  • attrs: Arbitrary metadata inherited from the resource processor.
  • interaction_type: Interaction type (e.g. 'transport', 'ligand_receptor', 'connector').
  • locations: Subcellular compartments associated with the edge (may be empty).
  • mor: Mode of regulation: 1 (activation), -1 (inhibition), 0 (unknown).
  • resource: Originating resource name (e.g. 'TCDB', 'GEM:Human-GEM').
  • source: COSMOS-formatted source node ID.
  • source_type: Entity type of source: 'small_molecule', 'protein', or None for connectors.
  • target: COSMOS-formatted target node ID.
  • target_type: Entity type of target.
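For orientation, a record with these attributes might look like the sketch below. The class name `EdgeSketch`, the field order, and the type annotations are assumptions (this page lists attributes alphabetically and gives no types), and the example values are illustrative rather than real data.

```python
from typing import NamedTuple, Optional


class EdgeSketch(NamedTuple):
    """Hypothetical stand-in; field order and types are assumed."""

    source: str
    target: str
    source_type: Optional[str]  # None for connectors
    target_type: Optional[str]
    mor: int                    # 1 activation, -1 inhibition, 0 unknown
    interaction_type: str
    resource: str
    locations: tuple            # may be empty
    attrs: dict


# Illustrative values only, not taken from any resource.
edge = EdgeSketch(
    source='Metab__CHEBI:15422_c',
    target='Gene1__P00533',
    source_type='small_molecule',
    target_type='protein',
    mor=1,
    interaction_type='transport',
    resource='TCDB',
    locations=('c',),
    attrs={},
)
row = edge._asdict()  # field name -> value, one entry per attribute
```

Because the record is a NamedTuple, its field names are available as `EdgeSketch._fields`, which is what lets a collection of edges map cleanly onto DataFrame columns.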