I/O and Interoperability

This notebook introduces the main data-export and backend-conversion paths.

AnnNet is the source of truth; external tables and backend graphs are projections derived from it.

In [1]:

import sys
from pathlib import Path

repo_root = Path.cwd()
if not (repo_root / 'annnet').exists():
    for parent in repo_root.parents:
        if (parent / 'annnet').exists():
            repo_root = parent
            break
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))

import annnet as an

In [2]:

G = an.AnnNet(directed=True)
G.add_vertices(
    [
        ('EGFR', {'kind': 'protein'}),
        ('GRB2', {'kind': 'protein'}),
        ('SOS1', {'kind': 'protein'}),
        ('RAS', {'kind': 'protein'}),
    ]
)
G.add_edges('EGFR', 'GRB2', edge_id='e1', confidence=0.99)
G.add_edges('GRB2', 'SOS1', edge_id='e2', confidence=0.95)
G.add_edges(src=['SOS1', 'RAS', 'EGFR'], edge_id='h1', directed=False, process='complex')

Out[2]:

'h1'

Export to explicit tables

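to_dataframes(...) is the most direct way to make the graph explicit: it returns a mapping of named tables (nodes, edges, hyperedges, slices, slice_weights), each of which can be inspected on its own.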

In [3]:

tables = an.to_dataframes(G)
print(sorted(tables))
print('nodes table:')
print(tables['nodes'])
print('edges table:')
print(tables['edges'])
print('hyperedges table:')
print(tables['hyperedges'])

['edges', 'hyperedges', 'nodes', 'slice_weights', 'slices']
nodes table:
shape: (4, 2)
┌───────────┬─────────┐
│ vertex_id ┆ kind    │
│ ---       ┆ ---     │
│ str       ┆ str     │
╞═══════════╪═════════╡
│ EGFR      ┆ protein │
│ GRB2      ┆ protein │
│ SOS1      ┆ protein │
│ RAS       ┆ protein │
└───────────┴─────────┘
edges table:
shape: (2, 8)
┌─────────┬────────┬────────┬────────┬──────────┬───────────┬────────────┬─────────┐
│ edge_id ┆ source ┆ target ┆ weight ┆ directed ┆ edge_type ┆ confidence ┆ process │
│ ---     ┆ ---    ┆ ---    ┆ ---    ┆ ---      ┆ ---       ┆ ---        ┆ ---     │
│ str     ┆ str    ┆ str    ┆ f64    ┆ bool     ┆ str       ┆ f64        ┆ null    │
╞═════════╪════════╪════════╪════════╪══════════╪═══════════╪════════════╪═════════╡
│ e1      ┆ EGFR   ┆ GRB2   ┆ 1.0    ┆ true     ┆ binary    ┆ 0.99       ┆ null    │
│ e2      ┆ GRB2   ┆ SOS1   ┆ 1.0    ┆ true     ┆ binary    ┆ 0.95       ┆ null    │
└─────────┴────────┴────────┴────────┴──────────┴───────────┴────────────┴─────────┘
hyperedges table:
shape: (1, 8)
┌─────────┬──────────┬────────┬──────┬──────┬─────────────────────────┬────────────┬─────────┐
│ edge_id ┆ directed ┆ weight ┆ head ┆ tail ┆ members                 ┆ confidence ┆ process │
│ ---     ┆ ---      ┆ ---    ┆ ---  ┆ ---  ┆ ---                     ┆ ---        ┆ ---     │
│ str     ┆ bool     ┆ f64    ┆ null ┆ null ┆ list[str]               ┆ null       ┆ str     │
╞═════════╪══════════╪════════╪══════╪══════╪═════════════════════════╪════════════╪═════════╡
│ h1      ┆ false    ┆ 1.0    ┆ null ┆ null ┆ ["SOS1", "EGFR", "RAS"] ┆ null       ┆ complex │
└─────────┴──────────┴────────┴──────┴──────┴─────────────────────────┴────────────┴─────────┘
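
The hyperedges table stores members as a list column, so incidence-style questions such as "which edges touch EGFR?" are easier to answer on a long table with one row per (edge, member) pair. The following is a minimal sketch on plain Python records typed in by hand from the printed row above; the helper name `explode_members` is illustrative and not part of AnnNet. If the exported tables are polars frames, as the printed output suggests, `to_dicts()` should yield rows of this shape.

```python
# Row copied by hand from the printed hyperedges table (null columns omitted).
hyperedge_rows = [
    {"edge_id": "h1", "directed": False, "weight": 1.0,
     "members": ["SOS1", "EGFR", "RAS"], "process": "complex"},
]


def explode_members(rows):
    """Return one (edge_id, member) record per hyperedge member."""
    return [
        {"edge_id": row["edge_id"], "member": member}
        for row in rows
        for member in row["members"]
    ]


incidence = explode_members(hyperedge_rows)
edges_touching_egfr = [r["edge_id"] for r in incidence if r["member"] == "EGFR"]

print(incidence)
print(edges_touching_egfr)
```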

Backend conversion and algorithm interoperability

The lazy backend interoperability accessors live on the graph object itself: G.nx, G.ig, and G.gt.

Use backend() when you want the concrete projected backend graph object. Use G.nx.<function>(G, ...) when you want AnnNet to convert G, replace the graph argument with the NetworkX projection, dispatch the NetworkX function, and return the result.
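
The convert, substitute, dispatch sequence described above can be pictured with a small proxy object. This is a sketch of the general pattern only, not AnnNet's implementation: `ToyGraph`, `LazyAccessor`, `project`, and `toy_backend` are hypothetical names, and a plain adjacency dictionary stands in for the NetworkX projection.

```python
from types import SimpleNamespace


def project(graph):
    """Hypothetical projection: build an adjacency dict from the edge list."""
    adjacency = {}
    for src, dst in graph.edges:
        adjacency.setdefault(src, []).append(dst)
        adjacency.setdefault(dst, [])
    return adjacency


# Stand-in for a backend library namespace such as networkx.
toy_backend = SimpleNamespace(out_degree=lambda adj, node: len(adj[node]))


class LazyAccessor:
    """Hypothetical accessor: projects lazily, then dispatches backend calls."""

    def __init__(self, owner, module):
        self._owner = owner
        self._module = module
        self._cache = None

    def backend(self):
        # Build the projection on first use and reuse it afterwards.
        if self._cache is None:
            self._cache = project(self._owner)
        return self._cache

    def __getattr__(self, name):
        # Only reached for names not defined on the accessor itself.
        func = getattr(self._module, name)

        def dispatch(*args, **kwargs):
            # Swap the owning graph for its projection wherever it is passed.
            args = tuple(self.backend() if a is self._owner else a for a in args)
            return func(*args, **kwargs)

        return dispatch


class ToyGraph:
    """Hypothetical stand-in for an AnnNet-like graph."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.nx = LazyAccessor(self, toy_backend)


g = ToyGraph([("EGFR", "GRB2"), ("GRB2", "SOS1")])
print(g.nx.out_degree(g, "EGFR"))
print(g.nx.backend() is g.nx.backend())
```

In this sketch the projection is cached until the accessor is rebuilt; a real accessor would also need to invalidate the cache when the graph changes.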

In [4]:

nx_graph = G.nx.backend()
print(type(nx_graph).__name__)
print('networkx nodes / edges:', nx_graph.number_of_nodes(), nx_graph.number_of_edges())

MultiDiGraph
networkx nodes / edges: 4 8

In [5]:

# Direct NetworkX interoperability: pass the AnnNet graph as the graph argument.
# The accessor converts G to a NetworkX graph, dispatches the function, and returns the result.
path_length = G.nx.shortest_path_length(G, source='EGFR', target='RAS')
print('EGFR -> RAS shortest path length:', path_length)

EGFR -> RAS shortest path length: 1
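
The two outputs above (8 NetworkX edges from 3 AnnNet edges, and an EGFR -> RAS path of length 1) may look surprising. One reading that reproduces both numbers is that the undirected hyperedge h1 is expanded into pairwise edges in both directions. That expansion rule is an assumption inferred from the outputs, not documented behavior; the sketch below checks the arithmetic in plain Python, with a small breadth-first search standing in for the NetworkX call.

```python
from collections import deque
from itertools import permutations

binary_edges = [("EGFR", "GRB2"), ("GRB2", "SOS1")]  # e1, e2
hyperedge_members = ["SOS1", "RAS", "EGFR"]          # h1, undirected

# Assumed projection rule: every ordered pair of distinct members becomes an edge.
expanded = list(permutations(hyperedge_members, 2))
projected_edges = binary_edges + expanded


def shortest_path_length(edges, source, target):
    """Breadth-first search over directed edges; returns hop count or None."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


print(len(projected_edges))
print(shortest_path_length(projected_edges, "EGFR", "RAS"))
print(shortest_path_length(binary_edges, "EGFR", "RAS"))
```

Under this assumption the direct EGFR -> RAS hop comes from h1; with only e1 and e2, RAS would not be reachable from EGFR at all.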

Native round-trip

The native .annnet format is the high-fidelity persistence format. Use it when AnnNet is the system of record.

In [6]:

from pathlib import Path

out = Path('tmp_tutorial_graph.annnet')
G.write(out, overwrite=True)
G2 = an.AnnNet.read(out)
print('round-trip shape:', G2.shape)
print('round-trip vertices:', G2.vertices())

round-trip shape: (4, 3)
round-trip vertices: ['EGFR', 'GRB2', 'SOS1', 'RAS']
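
Printing the restored shape and vertices is a visual check; the same comparison can be automated. The following is a generic sketch with JSON standing in for the native format so that it runs without AnnNet. The helper `roundtrip_equal` and the toy object are hypothetical, and a temporary directory is used so that no file is left in the working directory.

```python
import json
import tempfile
from pathlib import Path


def roundtrip_equal(obj, write, read, summarize, filename):
    """Write obj to a temporary file, read it back, and compare summaries."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / filename
        write(obj, path)
        restored = read(path)
    return summarize(obj) == summarize(restored)


# Stand-in object and format; not an AnnNet graph.
toy = {"vertices": ["EGFR", "GRB2", "SOS1", "RAS"], "edges": ["e1", "e2", "h1"]}

ok = roundtrip_equal(
    toy,
    write=lambda obj, path: path.write_text(json.dumps(obj)),
    read=lambda path: json.loads(path.read_text()),
    summarize=lambda obj: (len(obj["vertices"]), len(obj["edges"]), obj["vertices"]),
    filename="toy_graph.json",
)
print(ok)
```

With AnnNet, the callables could wrap the calls used in the cell above: G.write(path, overwrite=True) for writing, an.AnnNet.read(path) for reading, and a summary built from shape and vertices().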