CSV
CSV helpers from annnet.io.csv_io.
annnet.io.csv_io
This module purposefully avoids importing stdlib csv and uses Polars for IO.
It ingests a CSV into AnnNet by auto-detecting common schemas:
- Edge list (including DOK/COO triples and variations)
- Hyperedge table (members column or head/tail sets)
- Incidence matrix (rows=entities, cols=edges, ±w orientation)
- Adjacency matrix (square matrix, weighted/unweighted)
- LIL-style neighbor lists (single column of neighbors)
If auto-detection fails or you want control, pass schema=... explicitly.
Dependencies: polars, numpy, scipy (only if you use sparse helpers), AnnNet
Design notes:
- We treat unknown columns as attributes ("pure" non-structural) and write them via
the corresponding set_*_attrs APIs when applicable.
- Slices: if a slice column exists, it can contain a single slice or several
(separated by |, ;, or ,). Per-slice weight overrides use columns of the
form weight:<slice_name>.
- Directedness: we honor an explicit directed column when present (truthy), else
we infer it for incidence (presence of negative values) and adjacency (symmetry check).
- We try not to guess too hard. If the heuristics get it wrong, supply
schema="edge_list" / "hyperedge" / "incidence" / "adjacency" / "lil".
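The slice conventions in the design notes can be illustrated with a small standard-library sketch. The helper names `split_slices` and `weight_for_slice` are hypothetical and are not part of annnet.io.csv_io. The precedence shown (per-slice override, then the plain weight column, then the default) is an assumption about how the overrides are meant to combine.

```python
import re

# Separators named in the design notes: "|", ";" or ",".
_SLICE_SEP = re.compile(r"[|;,]")


def split_slices(cell):
    """Split a slice cell into a list of slice names (hypothetical helper)."""
    if cell is None:
        return []
    return [part.strip() for part in _SLICE_SEP.split(str(cell)) if part.strip()]


def weight_for_slice(row, slice_name, default_weight=1.0):
    """Resolve a weight for one slice.

    Assumed precedence: 'weight:<slice_name>' override, then 'weight',
    then the default weight.
    """
    override = row.get(f"weight:{slice_name}")
    if override not in (None, ""):
        return float(override)
    base = row.get("weight")
    if base not in (None, ""):
        return float(base)
    return default_weight


# One edge-list row that belongs to three slices and overrides the weight in t1.
row = {
    "source": "a",
    "target": "b",
    "slice": "t0|t1; t2",
    "weight": "2.0",
    "weight:t1": "5",
}
slices = split_slices(row["slice"])
weights = {name: weight_for_slice(row, name) for name in slices}
```

Here the row is registered in t0, t1 and t2, and only t1 picks up the override value.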
Public entry points:
- load_csv_to_graph(path, graph=None, schema="auto", options) -> AnnNet
- from_dataframe(df, graph=None, schema="auto", options) -> AnnNet
Both will create and return an AnnNet (or mutate the provided one).
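The directedness rule from the design notes (negative values in an incidence matrix, a symmetry check on an adjacency matrix) can be sketched in plain Python. Both function names are hypothetical, and treating an asymmetric adjacency matrix as directed is an assumption about which way the symmetry check is read. The module's actual heuristics may differ.

```python
def incidence_is_directed(matrix):
    """Assume directed if any incidence entry is negative (signed orientation)."""
    return any(value < 0 for row in matrix for value in row)


def adjacency_is_directed(matrix, tol=1e-12):
    """Assume directed if the square matrix is not symmetric within tol."""
    n = len(matrix)
    return any(
        abs(matrix[i][j] - matrix[j][i]) > tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Incidence matrices: rows are entities, columns are edges.
unsigned_incidence = [[1, 0], [1, 1], [0, 1]]
signed_incidence = [[-1, 0], [1, -1], [0, 1]]

# Adjacency matrices: square, one row and column per entity.
symmetric_adjacency = [[0, 1], [1, 0]]
asymmetric_adjacency = [[0, 1], [0, 0]]

results = {
    "unsigned_incidence": incidence_is_directed(unsigned_incidence),
    "signed_incidence": incidence_is_directed(signed_incidence),
    "symmetric_adjacency": adjacency_is_directed(symmetric_adjacency),
    "asymmetric_adjacency": adjacency_is_directed(asymmetric_adjacency),
}
```

When a heuristic like this gives the wrong answer for a dataset, an explicit directed column or an explicit schema argument avoids the inference.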
Functions
load_csv_to_graph(path, *, graph=None, schema='auto', default_slice=None, default_directed=None, default_weight=1.0, infer_schema_length=10000, encoding=None, null_values=None, low_memory=True, **kwargs)
Load a CSV and construct/augment an AnnNet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the CSV file. | required |
| graph | AnnNet or None | If provided, mutate this graph; otherwise create a new AnnNet using | None |
| schema | ('auto', 'edge_list', 'hyperedge', 'incidence', 'adjacency', 'lil') | Parsing mode. 'auto' tries to infer the schema from columns and types. | 'auto' |
| default_slice | str or None | Slice in which to register vertices/edges when none is specified in the data. | None |
| default_directed | bool or None | Default directedness for binary edges when not implied by data. | None |
| default_weight | float | Default weight when not specified. | 1.0 |
| infer_schema_length | int | Number of rows Polars scans to infer column types. | 10000 |
| encoding | str or None | File encoding override. | None |
| null_values | list[str] or None | Additional strings to interpret as nulls. | None |
| low_memory | bool | Passed to Polars read_csv; reduces memory pressure at the expense of performance. | True |
| **kwargs | Any | Passed to AnnNet constructor if | {} |
Returns:
| Type | Description |
|---|---|
| AnnNet | The populated graph instance. |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If no AnnNet can be constructed or imported. |
| ValueError | If schema is unknown or parsing fails. |
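A minimal usage sketch for load_csv_to_graph, using the signature documented above. The sample file is written with the standard library only so that the example is self-contained. The column names follow the edge-list export format described elsewhere in this module (source, target, weight, directed, slice). The rows are illustrative, and the import is guarded because annnet may not be installed where this runs.

```python
import csv
import tempfile
from pathlib import Path

# Illustrative edge-list rows; the second edge belongs to two slices.
rows = [
    {"source": "a", "target": "b", "weight": 1.5, "directed": True, "slice": "t0"},
    {"source": "b", "target": "c", "weight": 2.0, "directed": False, "slice": "t0|t1"},
]

csv_path = Path(tempfile.mkdtemp()) / "edges.csv"
with csv_path.open("w", newline="", encoding="utf-8") as handle:
    writer = csv.DictWriter(
        handle, fieldnames=["source", "target", "weight", "directed", "slice"]
    )
    writer.writeheader()
    writer.writerows(rows)

try:
    from annnet.io.csv_io import load_csv_to_graph
except ImportError:
    load_csv_to_graph = None  # annnet not available; only the CSV is produced

if load_csv_to_graph is not None:
    # Pass schema explicitly instead of relying on auto-detection.
    graph = load_csv_to_graph(str(csv_path), schema="edge_list", default_weight=1.0)
```

Passing schema="edge_list" skips auto-detection, which is the documented escape hatch when the heuristics would otherwise guess wrong.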
from_dataframe(df, *, graph=None, schema='auto', default_slice=None, default_directed=None, default_weight=1.0, **kwargs)
Build/augment an AnnNet from a Polars DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input table parsed from CSV. | required |
| graph | AnnNet or None | If provided, mutate this graph; otherwise create a new AnnNet using | None |
| schema | ('auto', 'edge_list', 'hyperedge', 'incidence', 'adjacency', 'lil') | Parsing mode. 'auto' tries to infer the schema. | 'auto' |
| default_slice | str or None | Fallback slice if no slice is specified in the data. | None |
| default_directed | bool or None | Default directedness for binary edges when not implied by data. | None |
| default_weight | float | Weight to use when no explicit weight is present. | 1.0 |
Returns:
| Type | Description |
|---|---|
| AnnNet | The populated graph instance. |
export_edge_list_csv(G, path, slice=None)
Export the binary edge subgraph to a CSV [Comma-Separated Values] file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| G | AnnNet | AnnNet instance to export. Must support | required |
| path | str or Path | Output path for the CSV file. | required |
| slice | str | Restrict the export to a specific slice. If None, all slices are exported. | None |
Returns:
| Type | Description |
|---|---|
| None | |
Notes
- Only binary edges are exported. Hyperedges (edges connecting more than two entities) are ignored.
- Output columns include: 'source', 'target', 'weight', 'directed', and 'slice'.
- If a weight column does not exist, a default weight of 1.0 is written.
- If a directedness column is absent, it will be written as None.
- This format is compatible with load_csv_to_graph(schema="edge_list").
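The defaults in these notes can be made concrete with a short sketch of how one exported row might be shaped. `to_edge_row` is a hypothetical helper, not the library's implementation. Writing None as an empty cell is simply what the standard csv writer does here and is an assumption about the real output.

```python
import csv
import io

COLUMNS = ["source", "target", "weight", "directed", "slice"]


def to_edge_row(edge):
    """Fill in the documented defaults for one binary edge (hypothetical helper)."""
    return {
        "source": edge["source"],
        "target": edge["target"],
        "weight": edge.get("weight", 1.0),  # default weight of 1.0
        "directed": edge.get("directed"),   # None when directedness is absent
        "slice": edge.get("slice"),
    }


edges = [
    {"source": "a", "target": "b", "weight": 0.5, "directed": True, "slice": "t0"},
    {"source": "b", "target": "c"},  # no weight, directedness, or slice recorded
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
writer.writeheader()
writer.writerows(to_edge_row(edge) for edge in edges)
csv_text = buffer.getvalue()
```

The second edge ends up with weight 1.0 and empty directed and slice cells in this sketch.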
export_hyperedge_csv(G, path, slice=None, directed=None)
Export hyperedges from the graph to a CSV [Comma-Separated Values] file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| G | AnnNet | AnnNet instance to export. Must support | required |
| path | str or Path | Output path for the CSV file. | required |
| slice | str | Restrict the export to a specific slice. If None, all slices are exported. | None |
| directed | bool | Force treatment of hyperedges as directed or undirected. If None, the function attempts to infer directedness from the graph. | None |
Returns:
| Type | Description |
|---|---|
| None | |
Notes
- If the graph exposes a 'members' column, the output will contain one row per undirected hyperedge.
- If 'head' and 'tail' columns are present, the output will contain one row per directed hyperedge. If directed=False is passed, 'head' and 'tail' are merged into a 'members' column.
- A 'weight' column is included if available; otherwise, all weights default to 1.0.
- A 'slice' column is included if present or if slice is specified.
- This format is compatible with load_csv_to_graph(schema="hyperedge").
- If the graph does not expose hyperedge columns, a ValueError is raised.
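The row-shaping rules in these notes can be sketched as follows. `hyperedge_row` is a hypothetical helper operating on plain dictionaries, not the library's code. Joining node names with "|" inside a cell is an assumption, since the notes do not state how sets are serialized.

```python
def hyperedge_row(edge, directed=None):
    """Shape one export row from a hyperedge record (hypothetical helper)."""
    has_head_tail = "head" in edge and "tail" in edge
    if not has_head_tail and "members" not in edge:
        raise ValueError("hyperedge has neither 'members' nor 'head'/'tail'")

    weight = edge.get("weight", 1.0)  # weights default to 1.0

    if has_head_tail and directed is not False:
        # Directed hyperedge: keep head and tail as separate columns.
        return {
            "head": "|".join(edge["head"]),
            "tail": "|".join(edge["tail"]),
            "weight": weight,
        }

    if has_head_tail:
        # directed=False: merge head and tail into a single member set.
        members = sorted(set(edge["head"]) | set(edge["tail"]))
    else:
        members = sorted(edge["members"])
    return {"members": "|".join(members), "weight": weight}


directed_edge = {"head": ["a", "b"], "tail": ["c"], "weight": 2.0}
kept = hyperedge_row(directed_edge)
merged = hyperedge_row(directed_edge, directed=False)
undirected = hyperedge_row({"members": ["y", "x"]})
```

The sketch does not cover slice filtering or the case where directed=True is requested for a record that only has members.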