Attributes, Views, and Indexing
This notebook focuses on the metadata and read-oriented parts of the API.
The structural graph lives in the incidence matrix plus canonical entity and edge records. Attributes are stored separately in dataframe-like tables. That separation is one of the core design choices in AnnNet.
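To make the separation concrete, the sketch below models it in plain Python, without AnnNet. Every name in it (`incidence`, `row_of`, `col_of`, `vertex_attrs`, `edge_attrs`) is hypothetical, and the sign convention (-1 at the source row, +1 at the target row) is an assumption chosen for illustration, not a statement about AnnNet's internals.

```python
# Toy model of the structure/attribute split. All names are hypothetical
# and do not reflect AnnNet internals.
vertices = ["EGFR", "GRB2", "SOS1", "RAS"]
edges = {"e1": ("EGFR", "GRB2"), "e2": ("GRB2", "SOS1"), "e3": ("SOS1", "RAS")}

# Structure: one row per vertex, one column per edge.
row_of = {v: i for i, v in enumerate(vertices)}
col_of = {e: j for j, e in enumerate(edges)}
incidence = [[0] * len(edges) for _ in vertices]
for edge_id, (source, target) in edges.items():
    # Assumed sign convention: -1 marks the source, +1 marks the target.
    incidence[row_of[source]][col_of[edge_id]] = -1
    incidence[row_of[target]][col_of[edge_id]] = 1

# Metadata: separate tables keyed by the same identifiers.
vertex_attrs = {"EGFR": {"family": "RTK", "score": 0.9}}
edge_attrs = {"e1": {"confidence": 0.99}}

# Editing metadata leaves the structure untouched.
snapshot = [row[:] for row in incidence]
vertex_attrs["EGFR"]["compartment"] = "membrane"
structure_unchanged = incidence == snapshot

for vertex, row in zip(vertices, incidence):
    print(f"{vertex:5s} {row}")
print("structure unchanged:", structure_unchanged)
```

In this toy model, adding a new attribute column never requires resizing or rewriting the matrix, which is the practical benefit of keeping the two apart.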
In [1]:
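import sys
from pathlib import Path

# Locate the repository root: the first directory, starting from the
# current one and walking upward, that contains the `annnet` package.
repo_root = Path.cwd()
if not (repo_root / 'annnet').exists():
    for parent in repo_root.parents:
        if (parent / 'annnet').exists():
            repo_root = parent
            break

# Make the local checkout importable.
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))

import annnet as an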
In [2]:
G = an.AnnNet(directed=True)
G.add_vertices_bulk(
    [
        ('EGFR', {'family': 'RTK', 'score': 0.9}),
        ('GRB2', {'family': 'adapter', 'score': 0.8}),
        ('SOS1', {'family': 'exchange_factor', 'score': 0.7}),
        ('RAS', {'family': 'small_gtpase', 'score': 0.95}),
    ]
)
G.add_edge('EGFR', 'GRB2', edge_id='e1', confidence=0.99)
G.add_edge('GRB2', 'SOS1', edge_id='e2', confidence=0.95)
G.add_edge('SOS1', 'RAS', edge_id='e3', confidence=0.98)
Out[2]:
'e3'
Graph, vertex, and edge attributes
Use the typed getters and bulk setters instead of mutating the dataframes directly.
In [3]:
G.set_graph_attribute('species', 'human')
G.set_vertex_attrs('EGFR', compartment='membrane', score=1.0)
G.set_vertex_attrs_bulk(
    {
        'GRB2': {'compartment': 'cytosol'},
        'SOS1': {'compartment': 'cytosol'},
    }
)
G.set_edge_attrs('e1', evidence='literature')
G.set_edge_attrs_bulk(
    {
        'e2': {'evidence': 'screen'},
        'e3': {'evidence': 'curated'},
    }
)
print('graph attrs:', G.get_graph_attributes())
print('EGFR attrs:', G.get_vertex_attrs('EGFR'))
print('e1 attrs:', G.get_edge_attrs('e1'))
graph attrs: {'species': 'human'}
EGFR attrs: {'vertex_id': 'EGFR', 'score': 1.0, 'family': 'RTK', 'compartment': 'membrane'}
e1 attrs: {'edge_id': 'e1', 'confidence': 0.99, 'evidence': 'literature'}
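One reason to go through setters is that a table has to stay rectangular: introducing a new attribute for one vertex implicitly creates a column for all of them. The sketch below is a toy model of that behavior in plain Python. `ToyAttrTable` and its methods are hypothetical names, and nothing here is taken from AnnNet's implementation.

```python
class ToyAttrTable:
    """Hypothetical attribute store keyed by an id column."""

    def __init__(self, key):
        self.key = key
        self.columns = [key]  # column order, id column first
        self.rows = {}        # item id -> {column: value}

    def set_attrs(self, item_id, **attrs):
        # Create the row on first use, then upsert each attribute.
        row = self.rows.setdefault(item_id, {self.key: item_id})
        for name, value in attrs.items():
            if name not in self.columns:
                self.columns.append(name)  # new column for every row
            row[name] = value

    def set_attrs_bulk(self, updates):
        for item_id, attrs in updates.items():
            self.set_attrs(item_id, **attrs)

    def get_attrs(self, item_id):
        # Columns never set for this row come back as None.
        row = self.rows[item_id]
        return {column: row.get(column) for column in self.columns}


table = ToyAttrTable("vertex_id")
table.set_attrs_bulk(
    {
        "EGFR": {"family": "RTK", "score": 0.9},
        "RAS": {"family": "small_gtpase", "score": 0.95},
    }
)
table.set_attrs("EGFR", compartment="membrane", score=1.0)

print(table.get_attrs("EGFR"))
print(table.get_attrs("RAS"))
```

In this toy, RAS reports `None` for `compartment` because the column exists but was never filled for that row.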
Dataframe-style views
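obs, var, and the explicit vertices_view() / edges_view() helpers provide tabular snapshots of current metadata. In the output of this cell, obs and vertices_view() return the same vertex table, var holds only the edge attributes, and edges_view() adds structural columns such as kind, directed, and effective_weight.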
In [4]:
print('obs:')
print(G.obs)
print('var:')
print(G.var)
print('vertices_view:')
print(G.vertices_view())
print('edges_view:')
print(G.edges_view())
obs:
shape: (4, 4)
┌───────────┬───────┬─────────────────┬─────────────┐
│ vertex_id ┆ score ┆ family          ┆ compartment │
│ ---       ┆ ---   ┆ ---             ┆ ---         │
│ str       ┆ f64   ┆ str             ┆ str         │
╞═══════════╪═══════╪═════════════════╪═════════════╡
│ EGFR      ┆ 1.0   ┆ RTK             ┆ membrane    │
│ GRB2      ┆ 0.8   ┆ adapter         ┆ cytosol     │
│ SOS1      ┆ 0.7   ┆ exchange_factor ┆ cytosol     │
│ RAS       ┆ 0.95  ┆ small_gtpase    ┆ null        │
└───────────┴───────┴─────────────────┴─────────────┘
var:
shape: (3, 3)
┌─────────┬────────────┬────────────┐
│ edge_id ┆ confidence ┆ evidence   │
│ ---     ┆ ---        ┆ ---        │
│ str     ┆ f64        ┆ str        │
╞═════════╪════════════╪════════════╡
│ e1      ┆ 0.99       ┆ literature │
│ e2      ┆ 0.95       ┆ screen     │
│ e3      ┆ 0.98       ┆ curated    │
└─────────┴────────────┴────────────┘
vertices_view:
shape: (4, 4)
┌───────────┬───────┬─────────────────┬─────────────┐
│ vertex_id ┆ score ┆ family          ┆ compartment │
│ ---       ┆ ---   ┆ ---             ┆ ---         │
│ str       ┆ f64   ┆ str             ┆ str         │
╞═══════════╪═══════╪═════════════════╪═════════════╡
│ EGFR      ┆ 1.0   ┆ RTK             ┆ membrane    │
│ GRB2      ┆ 0.8   ┆ adapter         ┆ cytosol     │
│ SOS1      ┆ 0.7   ┆ exchange_factor ┆ cytosol     │
│ RAS       ┆ 0.95  ┆ small_gtpase    ┆ null        │
└───────────┴───────┴─────────────────┴─────────────┘
edges_view:
shape: (3, 13)
┌─────────┬────────┬──────────┬─────────────┬───┬───────────┬────────────┬────────────┬────────────┐
│ edge_id ┆ kind   ┆ directed ┆ global_weig ┆ … ┆ members   ┆ confidence ┆ evidence   ┆ effective_ │
│ ---     ┆ ---    ┆ ---      ┆ ht          ┆   ┆ ---       ┆ ---        ┆ ---        ┆ weight     │
│ str     ┆ str    ┆ bool     ┆ ---         ┆   ┆ list[str] ┆ f64        ┆ str        ┆ ---        │
│         ┆        ┆          ┆ f64         ┆   ┆           ┆            ┆            ┆ f64        │
╞═════════╪════════╪══════════╪═════════════╪═══╪═══════════╪════════════╪════════════╪════════════╡
│ e1      ┆ binary ┆ true     ┆ 1.0         ┆ … ┆ null      ┆ 0.99       ┆ literature ┆ 1.0        │
│ e2      ┆ binary ┆ true     ┆ 1.0         ┆ … ┆ null      ┆ 0.95       ┆ screen     ┆ 1.0        │
│ e3      ┆ binary ┆ true     ┆ 1.0         ┆ … ┆ null      ┆ 0.98       ┆ curated    ┆ 1.0        │
└─────────┴────────┴──────────┴─────────────┴───┴───────────┴────────────┴────────────┴────────────┘
Indexing through G.idx
G.idx is the public namespace for translating between external identifiers and incidence coordinates.
In [5]:
print('entity_to_row(EGFR):', G.idx.entity_to_row('EGFR'))
print('row_to_entity(0):', G.idx.row_to_entity(0))
print('edge_to_col(e1):', G.idx.edge_to_col('e1'))
print('col_to_edge(0):', G.idx.col_to_edge(0))
print('stats:', G.idx.stats())
entity_to_row(EGFR): 0
row_to_entity(0): EGFR
edge_to_col(e1): 0
col_to_edge(0): e1
stats: {'n_entities': 4, 'n_vertices': 4, 'n_edge_entities': 0, 'n_edges': 3, 'max_row': 3, 'max_col': 2}
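The translation shown above amounts to a pair of two-way maps, one for rows and one for columns. As a rough illustration of the idea, here is a minimal standalone sketch. `ToyIndex` and its `add_entity` / `add_edge` methods are hypothetical; only the four lookup names mirror the calls used in the cell above, and the real `G.idx` may be implemented quite differently.

```python
class ToyIndex:
    """Hypothetical two-way map between identifiers and matrix coordinates."""

    def __init__(self):
        self._entity_to_row = {}
        self._row_to_entity = []
        self._edge_to_col = {}
        self._col_to_edge = []

    def add_entity(self, entity_id):
        # Assign the next free row unless the entity is already registered.
        if entity_id not in self._entity_to_row:
            self._entity_to_row[entity_id] = len(self._row_to_entity)
            self._row_to_entity.append(entity_id)
        return self._entity_to_row[entity_id]

    def add_edge(self, edge_id):
        # Assign the next free column unless the edge is already registered.
        if edge_id not in self._edge_to_col:
            self._edge_to_col[edge_id] = len(self._col_to_edge)
            self._col_to_edge.append(edge_id)
        return self._edge_to_col[edge_id]

    def entity_to_row(self, entity_id):
        return self._entity_to_row[entity_id]

    def row_to_entity(self, row):
        return self._row_to_entity[row]

    def edge_to_col(self, edge_id):
        return self._edge_to_col[edge_id]

    def col_to_edge(self, col):
        return self._col_to_edge[col]


toy_idx = ToyIndex()
for name in ["EGFR", "GRB2", "SOS1", "RAS"]:
    toy_idx.add_entity(name)
for edge_id in ["e1", "e2", "e3"]:
    toy_idx.add_edge(edge_id)

print(toy_idx.entity_to_row("SOS1"))  # 2
print(toy_idx.row_to_entity(0))       # EGFR
print(toy_idx.col_to_edge(1))         # e2
```

Keeping a dict for the forward direction and a list for the reverse direction makes both lookups constant time, at the cost of storing each identifier twice.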
Lazy graph views
A GraphView is a filtered lens over the same graph rather than a copied graph. Use it when you want lightweight inspection or temporary narrowing.
In [6]:
view = G.view(vertices=['EGFR', 'GRB2', 'SOS1'], edges=['e1', 'e2'])
print('view vertex ids:', view.vertex_ids)
print('view edge ids:', view.edge_ids)
print('view obs:')
print(view.obs)
print('view var:')
print(view.var)
view vertex ids: {'EGFR', 'SOS1', 'GRB2'}
view edge ids: {'e2', 'e1'}
view obs:
shape: (3, 4)
┌───────────┬───────┬─────────────────┬─────────────┐
│ vertex_id ┆ score ┆ family          ┆ compartment │
│ ---       ┆ ---   ┆ ---             ┆ ---         │
│ str       ┆ f64   ┆ str             ┆ str         │
╞═══════════╪═══════╪═════════════════╪═════════════╡
│ EGFR      ┆ 1.0   ┆ RTK             ┆ membrane    │
│ GRB2      ┆ 0.8   ┆ adapter         ┆ cytosol     │
│ SOS1      ┆ 0.7   ┆ exchange_factor ┆ cytosol     │
└───────────┴───────┴─────────────────┴─────────────┘
view var:
shape: (2, 3)
┌─────────┬────────────┬────────────┐
│ edge_id ┆ confidence ┆ evidence   │
│ ---     ┆ ---        ┆ ---        │
│ str     ┆ f64        ┆ str        │
╞═════════╪════════════╪════════════╡
│ e1      ┆ 0.99       ┆ literature │
│ e2      ┆ 0.95       ┆ screen     │
└─────────┴────────────┴────────────┘
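The "lens, not copy" idea can be sketched without AnnNet: a view object stores a reference to its parent plus the id filters, and computes its tables only when asked. `ToyGraph` and `ToyView` below are hypothetical stand-ins written for this illustration. In this toy, a change to the parent made after the view was created shows up through the view; whether AnnNet's GraphView behaves identically in every case is an assumption to verify against the library.

```python
class ToyGraph:
    """Hypothetical stand-in that holds only attribute tables."""

    def __init__(self):
        self.vertex_attrs = {}
        self.edge_attrs = {}

    def view(self, vertices, edges):
        return ToyView(self, vertices, edges)


class ToyView:
    """Keeps a reference to the parent and two id filters; copies nothing."""

    def __init__(self, graph, vertices, edges):
        self._graph = graph
        self.vertex_ids = set(vertices)
        self.edge_ids = set(edges)

    @property
    def obs(self):
        # Rows are built on access from the parent's current state.
        return [
            {"vertex_id": vertex_id, **attrs}
            for vertex_id, attrs in self._graph.vertex_attrs.items()
            if vertex_id in self.vertex_ids
        ]

    @property
    def var(self):
        return [
            {"edge_id": edge_id, **attrs}
            for edge_id, attrs in self._graph.edge_attrs.items()
            if edge_id in self.edge_ids
        ]


toy_graph = ToyGraph()
toy_graph.vertex_attrs = {
    "EGFR": {"score": 0.9},
    "GRB2": {"score": 0.8},
    "RAS": {"score": 0.95},
}
toy_graph.edge_attrs = {"e1": {"confidence": 0.99}, "e3": {"confidence": 0.98}}

toy_view = toy_graph.view(vertices=["EGFR", "GRB2"], edges=["e1"])

# Mutate the parent after the view exists; the view sees the new value.
toy_graph.vertex_attrs["EGFR"]["score"] = 1.0

print(toy_view.obs)
print(toy_view.var)
```

Because the view holds only two sets of ids, creating one is cheap regardless of graph size; the filtering cost is paid when obs or var is read.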