OmniPath

OmniPath loading helpers from annnet.io.read_omnipath.

annnet.io.read_omnipath

Functions

read_omnipath(df=None, *, dataset='omnipath', include=None, exclude=None, query=None, source_col=None, target_col=None, directed_col=None, weight_col=None, edge_id_col=None, slice_col=None, slice=None, default_directed=True, edge_attr_cols=None, dropna=True, annotations_backend='polars', vertex_annotations_df=None, vertex_annotations_path=None, vertex_annotation_sources=None, load_vertex_annotations=True, **graph_kwargs)

Build an AnnNet from OmniPath interaction data, with edge and vertex annotations.

Fetches a signaling interaction dataset from the OmniPath web service (or accepts a pre-loaded DataFrame), builds the graph structure via bulk operations, and optionally enriches vertices with annotations from the OmniPath annotation archive.

The annotation archive (~114 MB compressed) is downloaded once and cached at ~/.cache/annnet/omnipath_annotations.tsv.gz, so subsequent calls load it from disk instead of downloading it again.
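To see whether the archive has already been cached, the path quoted above can be inspected directly. The following is a minimal sketch using only the standard library; `describe_cache` is a hypothetical helper written for this illustration and is not part of annnet.

```python
from pathlib import Path

# Cache location quoted in the text above; "~" is expanded to the home directory.
CACHE_PATH = Path("~/.cache/annnet/omnipath_annotations.tsv.gz").expanduser()


def describe_cache(path: Path = CACHE_PATH) -> str:
    """Return a short status string for the annotation cache file (hypothetical helper)."""
    if path.is_file():
        size_mb = path.stat().st_size / 1e6
        return f"cached ({size_mb:.0f} MB) at {path}"
    return f"not cached yet; expected at {path}"


print(describe_cache())
```

If the file is reported as not cached, the first call that loads vertex annotations is the one that pays the download cost.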

Parameters:

Name Type Description Default
df DataFrame-like

If provided, skip the OmniPath network request and build from this table. Must contain at least source and target columns. Accepts Polars, pandas, or any Narwhals-compatible DataFrame.

None
dataset str

OmniPath interaction dataset to fetch. One of: "omnipath" (default, curated core), "all", "posttranslational", "pathwayextra", "kinaseextra", "ligrecextra", "dorothea", "tftarget", "transcriptional", "tfmirna", "mirna", "lncrnamrna", "collectri".

'omnipath'
include optional

Datasets to include. Only used when dataset="all".

None
exclude optional

Datasets to exclude. Only used when dataset="all" or dataset="posttranslational".

None
query dict

Extra query parameters forwarded to the OmniPath web service. Example: {"organism": "human", "genesymbols": True}. Use omnipath.interactions.<Dataset>.params() to inspect available keys.

None
source_col str

Column name for source node identifiers. Auto-detected from common OmniPath field names if omitted (source, source_genesymbol, etc.).

None
target_col str

Column name for target node identifiers. Auto-detected if omitted.

None
directed_col str

Column holding per-edge directedness flags (bool-like). If omitted, default_directed is used for all edges.

None
weight_col str

Column holding edge weights. If omitted, weight defaults to 1.0.

None
edge_id_col str

Column holding stable edge identifiers. If omitted, AnnNet assigns sequential IDs (edge_0, edge_1, ...).

None
slice_col str

Column holding per-edge slice identifiers.

None
slice str

Slice to place all edges into. Ignored if slice_col is provided.

None
default_directed bool

Fallback directedness when directed_col is missing or null. Defaults to True.

True
edge_attr_cols list[str]

Columns to store as edge attributes. If omitted, all non-structural columns are used. Pass [] to skip edge attribute loading entirely.

None
dropna bool

If True (default), silently drop rows with null source or target. If False, raise ValueError on first null endpoint.

True
annotations_backend str

Backend for AnnNet attribute tables. Default "polars".

'polars'
vertex_annotations_df DataFrame-like

Pre-loaded annotation table in OmniPath long format (genesymbol, source, label, value). Skips all file I/O. This is the fastest option when rebuilding the graph several times in one session: load the archive once and pass it here.

None
vertex_annotations_path str

Path to a local OmniPath annotation file (.tsv or .tsv.gz). Skips the cache-check and download.

None
vertex_annotation_sources list[str]

OmniPath annotation resource names to include as vertex attributes. If omitted, all resources in the annotation table are loaded (254 columns). Recommended subset for signaling graphs:

["HGNC", "CancerGeneCensus", "SignaLink_function", "SignaLink_pathway",
 "UniProt_location", "HPA_subcellular", "PROGENy", "IntOGen",
 "Phosphatome", "kinase.com"]

None
load_vertex_annotations bool

Whether to load vertex annotations at all. Set to False to skip annotation loading entirely and get a structure-only graph. Default True.

True
**graph_kwargs

Additional keyword arguments forwarded to the AnnNet constructor.

{}
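The row-level rules described for dropna, default_directed and edge_id_col can be sketched in plain Python. This is an illustration of the documented semantics only, not annnet's implementation (which the notes describe as using bulk operations). The function name `normalize_rows`, the dict keys, and the choice to number IDs over the rows that are kept are all assumptions.

```python
def normalize_rows(rows, *, dropna=True, default_directed=True):
    """Illustrate the documented dropna / default_directed / edge-ID semantics.

    `rows` is a list of dicts with hypothetical keys "source", "target" and
    optionally "directed"; this is not annnet's internal representation.
    """
    edges = []
    for row in rows:
        src, tgt = row.get("source"), row.get("target")
        if src is None or tgt is None:
            if dropna:
                continue  # silently drop rows with a null endpoint
            raise ValueError(f"null endpoint in row: {row!r}")
        directed = row.get("directed")
        if directed is None:
            directed = default_directed  # fallback for missing or null flags
        # Sequential IDs, numbered over the rows that are kept (an assumption).
        edges.append((f"edge_{len(edges)}", src, tgt, bool(directed)))
    return edges


rows = [
    {"source": "EGFR", "target": "STAT3", "directed": None},
    {"source": "TP53", "target": None},
    {"source": "TP53", "target": "MDM2", "directed": False},
]
print(normalize_rows(rows))
```

With dropna=True the second row is skipped; with dropna=False the same input raises ValueError at that row.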

Returns:

Type Description
AnnNet

Fully constructed graph with:
  • Vertices: one per unique gene symbol encountered as source or target.
  • Edges: one per interaction row, with incidence matrix weights encoding direction (+w source, −w target for directed; +w both for undirected).
  • edge_attributes: Polars DataFrame with one row per edge and one column per entry in edge_attr_cols.
  • vertex_attributes: Polars DataFrame with one row per vertex and one column per (source:label) annotation pair from the requested resources.
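The incidence sign convention stated above (+w at the source and −w at the target for directed edges, +w at both endpoints for undirected edges) can be made concrete with a small sketch. `incidence_column` is a hypothetical helper that builds one edge's column as a plain list; how annnet actually stores the matrix is not specified here.

```python
def incidence_column(vertices, source, target, weight=1.0, directed=True):
    """Return one edge's column of a vertex-by-edge incidence matrix as a list.

    Assumes source != target; self-loops are outside the scope of this sketch.
    """
    column = [0.0] * len(vertices)
    column[vertices.index(source)] = weight
    column[vertices.index(target)] = -weight if directed else weight
    return column


vertices = ["EGFR", "STAT3", "TP53", "MDM2"]
print(incidence_column(vertices, "EGFR", "STAT3"))
# [1.0, -1.0, 0.0, 0.0]
print(incidence_column(vertices, "TP53", "MDM2", 2.0, directed=False))
# [0.0, 0.0, 2.0, 2.0]
```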

Notes
  • Edge and vertex attribute tables are populated via bulk operations; no per-row DataFrame allocations occur during loading.
  • History tracking is disabled during construction and re-enabled on return.
  • The annotation archive is ~114 MB compressed. The first call downloads and caches it; subsequent calls load from ~/.cache/annnet/omnipath_annotations.tsv.gz in ~2–3 s.
  • source and target columns from the interaction table are redundant as edge attributes (the graph structure already encodes them). Exclude them via edge_attr_cols if not needed.

See Also

AnnNet, AnnNet.add_edges_bulk, AnnNet.add_vertices_bulk

Examples:

Minimal load (structure only, no annotations):

G = read_omnipath(load_vertex_annotations=False)

Full load with curated vertex annotation sources:

G = read_omnipath(
    dataset="omnipath",
    query={"organism": "human", "genesymbols": True},
    source_col="source_genesymbol",
    target_col="target_genesymbol",
    edge_attr_cols=[
        "is_stimulation", "is_inhibition", "consensus_direction",
        "n_sources", "n_references", "sources", "references_stripped",
    ],
    vertex_annotation_sources=[
        "HGNC", "CancerGeneCensus", "SignaLink_function",
        "UniProt_location", "HPA_subcellular", "IntOGen",
    ],
)

Pass a pre-loaded annotation table to avoid repeated downloads:

import polars as pl
ann = pl.read_csv("~/.cache/annnet/omnipath_annotations.tsv.gz", separator="\t")
G = read_omnipath(vertex_annotations_df=ann)
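For orientation, the long format (genesymbol, source, label, value) and the wide per-vertex layout with one column per source:label pair can be related by a simple pivot. The sketch below uses plain dictionaries. `pivot_annotations`, the labels `label_a` and `label_b`, and the values `v1` to `v3` are placeholders invented for illustration, and the last-value-wins handling of repeated entries is a simplification rather than documented annnet behavior.

```python
def pivot_annotations(records, sources=None):
    """Pivot long-format (genesymbol, source, label, value) tuples to wide form.

    Returns {genesymbol: {"source:label": value}}. If a (gene, source, label)
    triple repeats, the last value wins (a simplification of this sketch).
    """
    wide = {}
    for genesymbol, source, label, value in records:
        if sources is not None and source not in sources:
            continue  # keep only the requested annotation resources
        wide.setdefault(genesymbol, {})[f"{source}:{label}"] = value
    return wide


# Placeholder rows: resource names come from the list above, the labels and
# values are made up.
records = [
    ("EGFR", "HGNC", "label_a", "v1"),
    ("EGFR", "IntOGen", "label_b", "v2"),
    ("TP53", "HGNC", "label_a", "v3"),
]
print(pivot_annotations(records, sources=["HGNC"]))
# {'EGFR': {'HGNC:label_a': 'v1'}, 'TP53': {'HGNC:label_a': 'v3'}}
```

Restricting `sources` here plays the same role as vertex_annotation_sources: fewer resources mean fewer columns in the resulting vertex table.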

Build from a custom DataFrame instead of fetching from OmniPath:

import pandas as pd
df = pd.DataFrame({
    "source": ["EGFR", "TP53"],
    "target": ["STAT3", "MDM2"],
    "is_directed": [True, True],
})
G = read_omnipath(df=df, load_vertex_annotations=False)