OmniPath

OmniPath loading helpers from annnet.io.omnipath.

annnet.io.omnipath

Functions

from_omnipath
from_omnipath(
    df=None,
    *,
    dataset="omnipath",
    include=None,
    exclude=None,
    query=None,
    source_col=None,
    target_col=None,
    directed_col=None,
    weight_col=None,
    edge_id_col=None,
    slice_col=None,
    slice=None,
    default_directed=True,
    edge_attr_cols=None,
    dropna=True,
    annotations_backend=None,
    vertex_annotations_df=None,
    vertex_annotations_path=None,
    vertex_annotation_sources=None,
    load_vertex_annotations=True,
    **graph_kwargs
)

Build an AnnNet from OmniPath interaction data, with edge and vertex annotations.

Fetches a signaling interaction dataset from the OmniPath web service (or accepts a pre-loaded DataFrame), builds the graph structure via bulk operations, and optionally enriches vertices with annotations from the OmniPath annotation archive.

The annotation archive (~114MB) is downloaded once and cached at ~/.cache/annnet/omnipath_annotations.tsv.gz for fast subsequent loads.

Parameters:

Name Type Description Default
df DataFrame-like

If provided, skip the OmniPath network request and build from this table. Must contain at least source and target columns. Accepts Polars, pandas, or any Narwhals-compatible DataFrame.

None
dataset str

OmniPath interaction dataset to fetch. One of: "omnipath" (default, curated core), "all", "posttranslational", "pathwayextra", "kinaseextra", "ligrecextra", "dorothea", "tftarget", "transcriptional", "tfmirna", "mirna", "lncrnamrna", "collectri".

'omnipath'
include optional

Filter selecting which interaction datasets to include. Only used when dataset="all".

None
exclude optional

Filter selecting which interaction datasets to exclude. Only used when dataset="all" or dataset="posttranslational".

None
query dict

Extra query parameters forwarded to the OmniPath web service. Example: {"organism": "human", "genesymbols": True}. Use omnipath.interactions.<Dataset>.params() to inspect available keys.

None
source_col str

Column name for source node identifiers. Auto-detected from common OmniPath field names if omitted (source, source_genesymbol, etc.).

None
target_col str

Column name for target node identifiers. Auto-detected if omitted.

None
directed_col str

Column holding per-edge directedness flags (bool-like). If omitted, default_directed is used for all edges.

None
weight_col str

Column holding edge weights. If omitted, weight defaults to 1.0.

None
edge_id_col str

Column holding stable edge identifiers. If omitted, AnnNet assigns sequential IDs (edge_0, edge_1, ...).

None
slice_col str

Column holding per-edge slice identifiers.

None
slice str

Slice to place all edges into. Ignored if slice_col is provided.

None
default_directed bool

Fallback directedness when directed_col is missing or null. Defaults to True.

True
edge_attr_cols list[str]

Columns to store as edge attributes. If omitted, all non-structural columns are used. Pass [] to skip edge attribute loading entirely.

None
dropna bool

If True (default), silently drop rows with null source or target. If False, raise ValueError on first null endpoint.

True
annotations_backend str

Backend for AnnNet attribute tables. None uses AnnNet's configured dataframe backend default.

None
vertex_annotations_df DataFrame-like

Pre-loaded annotation table in OmniPath long format (genesymbol, source, label, value). Skips all file I/O. Fastest option when rebuilding the graph multiple times in one session — load the archive once and pass it here.

None
vertex_annotations_path str

Path to a local OmniPath annotation file (.tsv or .tsv.gz). Skips the cache-check and download.

None
vertex_annotation_sources list[str]

OmniPath annotation resource names to include as vertex attributes. If omitted, all resources in the annotation table are loaded (254 columns). Recommended subset for signaling graphs:

[
    'HGNC',
    'CancerGeneCensus',
    'SignaLink_function',
    'SignaLink_pathway',
    'UniProt_location',
    'HPA_subcellular',
    'PROGENy',
    'IntOGen',
    'Phosphatome',
    'kinase.com',
]
None
load_vertex_annotations bool

Whether to load vertex annotations at all. Set to False to skip annotation loading entirely and get a structure-only graph. Default True.

True
**graph_kwargs

Additional keyword arguments forwarded to the AnnNet constructor.

{}
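The row-level rules described above (dropna, default_directed, the 1.0 weight default, and sequential edge IDs) can be sketched in plain Python. This is only an illustration of the documented behavior, not annnet's implementation: normalize_rows is a hypothetical helper, rows are assumed to be plain dicts, and numbering the generated IDs over kept rows only is an assumption.

```python
from typing import Any, Dict, Iterable, List, Optional


def normalize_rows(
    rows: Iterable[Dict[str, Any]],
    *,
    source_col: str = "source",
    target_col: str = "target",
    directed_col: Optional[str] = None,
    weight_col: Optional[str] = None,
    edge_id_col: Optional[str] = None,
    default_directed: bool = True,
    dropna: bool = True,
) -> List[Dict[str, Any]]:
    """Hypothetical helper mirroring the documented parameter semantics."""
    edges: List[Dict[str, Any]] = []
    for row in rows:
        src, tgt = row.get(source_col), row.get(target_col)
        if src is None or tgt is None:
            if dropna:
                continue  # silently drop rows with a null endpoint
            raise ValueError(f"null endpoint in row: {row!r}")

        # Per-edge flag if a column is given; fall back when missing or null.
        directed = row.get(directed_col) if directed_col else None
        if directed is None:
            directed = default_directed

        # Weight defaults to 1.0 when no weight column is named.
        weight = float(row[weight_col]) if weight_col else 1.0

        # Assumption: generated IDs count kept rows only.
        edge_id = row[edge_id_col] if edge_id_col else f"edge_{len(edges)}"

        edges.append(
            {
                "edge_id": edge_id,
                "source": src,
                "target": tgt,
                "weight": weight,
                "directed": bool(directed),
            }
        )
    return edges


rows = [
    {"source": "EGFR", "target": "STAT3", "is_directed": None},
    {"source": None, "target": "MDM2", "is_directed": True},
    {"source": "TP53", "target": "MDM2", "is_directed": False},
]
edges = normalize_rows(rows, directed_col="is_directed")
```

With dropna=True the second row is skipped, the first row falls back to default_directed, and the third keeps its explicit False flag. Passing dropna=False on the same rows raises ValueError instead.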

Returns:

Type Description
AnnNet

Fully constructed graph with:
  • Vertices: one per unique gene symbol encountered as source or target.
  • Edges: one per interaction row, with incidence matrix weights encoding direction (+w source, −w target for directed; +w both for undirected).
  • edge_attributes: Polars DataFrame with one row per edge and one column per entry in edge_attr_cols.
  • vertex_attributes: Polars DataFrame with one row per vertex and one column per (source:label) annotation pair from the requested resources.
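The incidence encoding mentioned for edges can be illustrated with a small dense matrix. This is a sketch under assumptions, not AnnNet's internal representation: rows are taken to be vertices and columns edges, incidence_matrix is a hypothetical helper, and self-loops are not handled.

```python
from typing import List, Sequence, Tuple

# (source, target, weight, directed)
Edge = Tuple[str, str, float, bool]


def incidence_matrix(vertices: Sequence[str], edges: Sequence[Edge]) -> List[List[float]]:
    """Build a dense vertex-by-edge incidence matrix (hypothetical helper)."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0.0] * len(edges) for _ in vertices]
    for j, (src, tgt, w, directed) in enumerate(edges):
        matrix[index[src]][j] = w                        # +w at the source
        matrix[index[tgt]][j] = -w if directed else w    # -w (directed) or +w (undirected)
    return matrix


vertices = ["EGFR", "STAT3", "TP53", "MDM2"]
edges = [
    ("EGFR", "STAT3", 1.0, True),   # directed: +1 at EGFR, -1 at STAT3
    ("TP53", "MDM2", 2.0, False),   # undirected: +2 at both endpoints
]
B = incidence_matrix(vertices, edges)
# B == [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
```

Reading a column gives the edge back: a +w/−w pair marks a directed edge from the positive row to the negative row, while two +w entries mark an undirected edge.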

Notes
  • Edge and vertex attribute tables are populated via bulk operations — no per-row DataFrame allocations occur during loading.
  • History tracking is disabled during construction and re-enabled on return.
  • The annotation archive is ~114MB compressed. First call downloads and caches it; subsequent calls load from ~/.cache/annnet/omnipath_annotations.tsv.gz in ~2–3s.
  • source and target columns from the interaction table are redundant as edge attributes (the graph structure already encodes them). Exclude them via edge_attr_cols if not needed.
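
The download-once behavior noted above follows a common cache-on-first-use pattern, sketched here with the standard library. The names cached_archive and fake_fetch are hypothetical, the fetch step is injected as a callable so that no real URL or network call is assumed, and the bytes written are a placeholder header rather than a real gzip archive.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_archive(cache_path: Path, fetch: Callable[[], bytes]) -> Path:
    """Return cache_path, calling fetch() only when the file is missing."""
    if not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary sibling first so an interrupted download
        # never leaves a truncated file at the final cache location.
        tmp = cache_path.with_name(cache_path.name + ".part")
        tmp.write_bytes(fetch())
        tmp.replace(cache_path)
    return cache_path


calls = []


def fake_fetch() -> bytes:
    """Stand-in for the real download; records how often it is invoked."""
    calls.append(1)
    return b"genesymbol\tsource\tlabel\tvalue\n"


with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "annnet" / "omnipath_annotations.tsv.gz"
    cached_archive(path, fake_fetch)  # first call: fetches and writes the file
    cached_archive(path, fake_fetch)  # second call: file exists, no fetch
    n_calls = len(calls)
```

When the same session rebuilds the graph several times, passing a pre-loaded table via vertex_annotations_df avoids even the cached file read.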

See Also

AnnNet, AnnNet.add_edges_bulk, AnnNet.add_vertices_bulk

Examples:

Minimal load (structure only, no annotations):

G = from_omnipath(load_vertex_annotations=False)

Full load with curated vertex annotation sources:

G = from_omnipath(
    dataset='omnipath',
    query={'organism': 'human', 'genesymbols': True},
    source_col='source_genesymbol',
    target_col='target_genesymbol',
    edge_attr_cols=[
        'is_stimulation',
        'is_inhibition',
        'consensus_direction',
        'n_sources',
        'n_references',
        'sources',
        'references_stripped',
    ],
    vertex_annotation_sources=[
        'HGNC',
        'CancerGeneCensus',
        'SignaLink_function',
        'UniProt_location',
        'HPA_subcellular',
        'IntOGen',
    ],
)

Pass a pre-loaded annotation table to avoid repeated file reads:

import polars as pl

# Load the cached archive once; the separator is a single tab character.
ann = pl.read_csv('~/.cache/annnet/omnipath_annotations.tsv.gz', separator='\t')
G = from_omnipath(vertex_annotations_df=ann)

Build from a custom DataFrame instead of fetching from OmniPath:

import pandas as pd

df = pd.DataFrame(
    {
        'source': ['EGFR', 'TP53'],
        'target': ['STAT3', 'MDM2'],
        'is_directed': [True, True],
    }
)
G = from_omnipath(df=df, load_vertex_annotations=False)