
CSV

CSV helpers from annnet.io.csv_io.

annnet.io.csv_io

This module purposefully avoids importing stdlib csv and uses Polars for IO.

It ingests a CSV into AnnNet by auto-detecting common schemas:
  • Edge list (including DOK/COO triples and variations)
  • Hyperedge table (members column or head/tail sets)
  • Incidence matrix (rows=entities, cols=edges, ±w orientation)
  • Adjacency matrix (square matrix, weighted/unweighted)
  • LIL-style neighbor lists (single column of neighbors)

If auto-detection fails or you want control, pass schema=... explicitly.

Dependencies: polars, numpy, scipy (only if you use sparse helpers), AnnNet

Design notes:
  • We treat unknown columns as attributes ("pure" non-structural) and write them via the corresponding set_*_attrs APIs when applicable.
  • Slices: if a slice column exists, it can contain a single slice or multiple slices (separated by |, ;, or ,). Per-slice weight overrides support columns of the form weight:<slice_name>.
  • Directedness: we honor an explicit directed column when present (truthy), else infer for incidence (presence of negative values) and adjacency (symmetry check).
  • We try not to guess too hard. If the heuristics get it wrong, supply schema="edge_list" / "hyperedge" / "incidence" / "adjacency" / "lil".
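The slice conventions above can be sketched in plain Python. This is an illustration only, not the module's implementation: the helper names split_slices and slice_weights are hypothetical, and the fallback order (per-slice override, then the row's weight, then default_weight) is an assumption.

```python
import re

# Separators named in the design notes: "|", ";" or ",".
_SLICE_SEPARATORS = re.compile(r"[|;,]")
WEIGHT_PREFIX = "weight:"


def split_slices(cell):
    """Split a slice cell such as "a|b; c" into ["a", "b", "c"]."""
    if cell is None:
        return []
    parts = (part.strip() for part in _SLICE_SEPARATORS.split(str(cell)))
    return [part for part in parts if part]


def slice_weights(row, default_weight=1.0):
    """Map each slice named in the row to its effective weight.

    Assumed precedence: "weight:<slice_name>" column, then "weight",
    then default_weight.
    """
    base = row.get("weight")
    if base is None:
        base = default_weight
    weights = {}
    for name in split_slices(row.get("slice")):
        override = row.get(WEIGHT_PREFIX + name)
        weights[name] = float(base if override is None else override)
    return weights


row = {
    "source": "a",
    "target": "b",
    "slice": "train|test",
    "weight": 2.0,
    "weight:test": 0.5,
}
print(slice_weights(row))  # {'train': 2.0, 'test': 0.5}
```

Here the edge belongs to two slices; only "test" has an override column, so "train" keeps the row weight.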

Public entry points:
  • load_csv_to_graph(path, graph=None, schema="auto", **options) -> AnnNet
  • from_dataframe(df, graph=None, schema="auto", **options) -> AnnNet

Both will create and return an AnnNet (or mutate the provided one).
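A minimal usage sketch of the two entry points, assuming annnet and polars are installed and that the import path matches the module name on this page. The column names source, target, and weight are taken from the export notes further down; the helper names write_sample and load_both_ways are hypothetical.

```python
from pathlib import Path

# A three-column edge list; column names follow the export notes.
SAMPLE_EDGE_LIST = "source,target,weight\na,b,1.0\nb,c,2.5\n"


def write_sample(directory):
    """Write a small edge-list CSV into directory and return its path."""
    path = Path(directory) / "edges.csv"
    path.write_text(SAMPLE_EDGE_LIST, encoding="utf-8")
    return path


def load_both_ways(path):
    """Load the same file through both documented entry points."""
    # Imported lazily so write_sample works without these packages.
    import polars as pl
    from annnet.io.csv_io import from_dataframe, load_csv_to_graph

    # Explicit schema: skips auto-detection entirely.
    g_from_path = load_csv_to_graph(str(path), schema="edge_list")
    # Same data, but read by the caller and handed over as a DataFrame.
    g_from_frame = from_dataframe(pl.read_csv(path), schema="edge_list")
    return g_from_path, g_from_frame
```

Calling load_both_ways(write_sample(some_directory)) should return two new graphs, since no graph argument is passed. Passing graph=existing would mutate that instance instead.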

Functions

load_csv_to_graph(path, *, graph=None, schema='auto', default_slice=None, default_directed=None, default_weight=1.0, infer_schema_length=10000, encoding=None, null_values=None, low_memory=True, **kwargs)

Load a CSV and construct/augment an AnnNet.

Parameters:

Name Type Description Default
path str

Path to the CSV file.

required
graph AnnNet or None

If provided, mutate this graph; otherwise create a new AnnNet using AnnNet(**kwargs).

None
schema ('auto', 'edge_list', 'hyperedge', 'incidence', 'adjacency', 'lil')

Parsing mode. 'auto' tries to infer the schema from columns and types.

'auto'
default_slice str or None

Slice in which to register vertices/edges when none is specified in the data.

None
default_directed bool or None

Default directedness for binary edges when not implied by data.

None
default_weight float

Default weight when not specified.

1.0
infer_schema_length int

Row count Polars uses to infer column types.

10000
encoding str or None

File encoding override.

None
null_values list[str] or None

Additional strings to interpret as nulls.

None
low_memory bool

Passed to Polars read_csv as low_memory, which reduces memory pressure at the expense of performance.

True
**kwargs Any

Passed to AnnNet constructor if graph is None.

{}

Returns:

Type Description
AnnNet

The populated graph instance.

Raises:

Type Description
RuntimeError

If no AnnNet can be constructed or imported.

ValueError

If schema is unknown or parsing fails.
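The default_directed fallback applies only when directedness is not implied by the data. The two inference rules from the design notes (negative entries in an incidence matrix, asymmetry in an adjacency matrix) can be sketched as follows. This is a sketch under stated assumptions: the function names are hypothetical, matrices are plain nested lists, and the real module may parse and compare values differently.

```python
def incidence_implies_directed(matrix):
    """True if any entry is negative, i.e. the matrix carries orientation."""
    return any(value < 0 for row in matrix for value in row)


def adjacency_implies_directed(matrix):
    """True if the square matrix is not symmetric."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("adjacency matrix must be square")
    return any(
        matrix[i][j] != matrix[j][i]
        for i in range(n)
        for j in range(i + 1, n)
    )


def resolve_directed(explicit, inferred, default_directed=None):
    """Explicit column value first, then inference, then the default.

    explicit and inferred are bool or None; None means "no information".
    Parsing of textual values such as "yes"/"no" is not modeled here.
    """
    if explicit is not None:
        return bool(explicit)
    if inferred is not None:
        return inferred
    return default_directed
```

With this ordering, default_directed only matters for rows where neither a directed column nor the matrix structure says anything.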

from_dataframe(df, *, graph=None, schema='auto', default_slice=None, default_directed=None, default_weight=1.0, **kwargs)

Build/augment an AnnNet from a Polars DataFrame.

Parameters:

Name Type Description Default
df DataFrame

Input table parsed from CSV.

required
graph AnnNet or None

If provided, mutate this graph; otherwise create a new AnnNet using AnnNet(**kwargs).

None
schema ('auto', 'edge_list', 'hyperedge', 'incidence', 'adjacency', 'lil')

Parsing mode. 'auto' tries to infer the schema.

'auto'
default_slice str or None

Fallback slice if no slice is specified in the data.

None
default_directed bool or None

Default directedness for binary edges when not implied by data.

None
default_weight float

Weight to use when no explicit weight is present.

1.0
**kwargs Any

Passed to AnnNet constructor if graph is None.

{}

Returns:

Type Description
AnnNet

The populated graph instance.

export_edge_list_csv(G, path, slice=None)

Export the binary edge subgraph to a CSV (comma-separated values) file.

Parameters:

Name Type Description Default
G AnnNet

AnnNet instance to export. Must support edges_view with columns compatible with binary endpoints (e.g., 'source', 'target').

required
path str or Path

Output path for the CSV file.

required
slice str or None

Restrict the export to a specific slice. If None, all slices are exported.

None

Returns:

Type Description
None

Notes
  • Only binary edges are exported. Hyperedges (edges connecting more than two entities) are ignored.
  • Output columns include: 'source', 'target', 'weight', 'directed', and 'slice'.
  • If a weight column does not exist, a default weight of 1.0 is written.
  • If a directedness column is absent, it will be written as None.
  • This format is compatible with load_csv_to_graph(schema="edge_list").
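
The row shape described in these notes can be illustrated with plain dictionaries. This is not the library's implementation: the function name edge_list_rows and the input shape (one dict per edge) are hypothetical, and treating "has both source and target" as the test for a binary edge is an assumption.

```python
EDGE_LIST_COLUMNS = ("source", "target", "weight", "directed", "slice")


def edge_list_rows(edges, slice=None):
    """Turn edge dicts into rows carrying the five documented columns."""
    rows = []
    for edge in edges:
        # Anything without both endpoints (e.g. a hyperedge) is ignored.
        if "source" not in edge or "target" not in edge:
            continue
        # Optional restriction to one slice.
        if slice is not None and edge.get("slice") != slice:
            continue
        rows.append(
            {
                "source": edge["source"],
                "target": edge["target"],
                "weight": edge.get("weight", 1.0),  # default weight
                "directed": edge.get("directed"),   # None when absent
                "slice": edge.get("slice"),
            }
        )
    return rows


edges = [
    {"source": "a", "target": "b"},
    {"source": "b", "target": "c", "weight": 2.5, "directed": True, "slice": "s1"},
    {"members": ["a", "b", "c"]},  # hyperedge: skipped
]
print(edge_list_rows(edges))
```

The first edge comes out with weight 1.0 and directed None, the second keeps its own values, and the hyperedge is dropped.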
export_hyperedge_csv(G, path, slice=None, directed=None)

Export hyperedges from the graph to a CSV (comma-separated values) file.

Parameters:

Name Type Description Default
G AnnNet

AnnNet instance to export. Must support edges_view exposing either 'members' (for undirected hyperedges) or 'head'/'tail' (for directed hyperedges).

required
path str or Path

Output path for the CSV file.

required
slice str or None

Restrict the export to a specific slice. If None, all slices are exported.

None
directed bool or None

Force treatment of hyperedges as directed or undirected. If None, the function attempts to infer directedness from the graph.

None

Returns:

Type Description
None

Notes
  • If the graph exposes a 'members' column, the output will contain one row per undirected hyperedge.
  • If 'head' and 'tail' columns are present, the output will contain one row per directed hyperedge. If directed=False is passed, 'head' and 'tail' are merged into a 'members' column.
  • A 'weight' column is included if available; otherwise, all weights default to 1.0.
  • A 'slice' column is included if present or if slice is specified.
  • This format is compatible with load_csv_to_graph(schema="hyperedge").
  • If the graph does not expose hyperedge columns, a ValueError is raised.
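
The head/tail merging rule in these notes can be sketched per hyperedge. As before, this is an illustration under assumptions rather than the module's code: hyperedge_row is a hypothetical name, the input is one dict per hyperedge, and sorting the members is added only to make the output deterministic.

```python
def hyperedge_row(edge, directed=None):
    """Build one export row from a hyperedge dict."""
    has_members = "members" in edge
    has_head_tail = "head" in edge and "tail" in edge
    if not (has_members or has_head_tail):
        raise ValueError("edge exposes neither 'members' nor 'head'/'tail'")

    row = {"weight": edge.get("weight", 1.0)}  # weights default to 1.0
    if has_head_tail and directed is not False:
        # Directed hyperedge: keep head and tail separate.
        row["head"] = sorted(edge["head"])
        row["tail"] = sorted(edge["tail"])
    elif has_head_tail:
        # directed=False: merge head and tail into a single member set.
        row["members"] = sorted(set(edge["head"]) | set(edge["tail"]))
    else:
        row["members"] = sorted(edge["members"])
    if "slice" in edge:
        row["slice"] = edge["slice"]
    return row


print(hyperedge_row({"head": ["a"], "tail": ["b", "c"]}, directed=False))
# {'weight': 1.0, 'members': ['a', 'b', 'c']}
```

How members are serialized into a single CSV cell is not stated on this page, so the sketch stops at Python lists.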