py_research.db.importing module#

Utilities for importing different data representations into relational format.

class TableMap(table, map, ext_maps=None, link_attr=None, link_type='n-m', join_table_name=None, join_table_map=None, id_type='hash', id_attr=None, hash_id_with_path=False, match_by_attr=False, conflict_policy='raise')[source]#

Bases: object

Configuration for how to map (nested) dictionary items to relational tables.

Parameters:
table: str#

Name of the table to map the attributes in map to.

map: Mapping[str, str | bool | Mapping[str, str | bool | _RelationalMap | TableMap | list[TableMap]] | TableMap | list[TableMap]] | set[str] | str | Callable[[dict | str], Mapping[str, str | bool | Mapping[str, str | bool | _RelationalMap | TableMap | list[TableMap]] | TableMap | list[TableMap]] | set[str] | str]#

Mapping of hierarchical attributes to table columns or other tables.

ext_maps: list[TableMap] | None = None#

Map attributes on the same level to different tables.

link_attr = None#

Override attribute to use when referencing this table from a parent table.

link_type = 'n-m'#

Type of reference between this table and the parent table. Only n-m will create a join table.

join_table_name: str | None = None#

Name of the join table to use for referencing this table to its parent, if any.

join_table_map: Mapping[str, str | bool | Mapping[str, str | bool | _AttrMap]] | None = None#

Mapping of hierarchical attributes to join table rows.

id_type: Literal['hash', 'attr', 'uuid'] = 'hash'#

Type of id to use for this table:

  • hash – sha256 hash of a subset of all mapped attributes

  • attr – use given attr as id directly, no hashing

  • uuid – generate random uuid for each row

id_attr: str | list[str] | None = None#

Name of unique, mapped attribute to use as id directly or list of mapped attributes to use for auto-generating row ids via sha256 hashing. If None, all mapped attributes will be used for hashing.

hash_id_with_path: bool = False#

If True, the id will be generated based on the data and the full tree path.

match_by_attr: bool | str = False#

Try to match this mapped data to target table (by given attr) before creating a new row.

conflict_policy: Literal['raise', 'ignore', 'override'] = 'raise'#

Which policy to use if import conflicts occur for this table.
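
The id options above (id_type, id_attr, hash_id_with_path) can be illustrated with a small standalone sketch. It does not use py_research: row_id is a hypothetical helper, and the serialization that gets hashed (JSON with sorted keys, and how the path is mixed in) is an assumption, so ids produced by the library will most likely differ.

```python
from __future__ import annotations

import hashlib
import json
import uuid
from typing import Any


def row_id(
    row: dict[str, Any],
    id_type: str = "hash",
    id_attr: str | list[str] | None = None,
    path: tuple[str, ...] = (),
    hash_id_with_path: bool = False,
) -> str:
    """Hypothetical helper mimicking the documented id_type options."""
    if id_type == "uuid":
        # Random id per row, independent of the row's content.
        return str(uuid.uuid4())
    if id_type == "attr":
        # Use the value of one mapped attribute directly, no hashing.
        if not isinstance(id_attr, str):
            raise ValueError("id_type='attr' needs a single attribute name")
        return str(row[id_attr])
    if id_type != "hash":
        raise ValueError(f"unknown id_type: {id_type!r}")

    # id_type == "hash": hash the listed attributes, or all of them if None.
    if id_attr is None:
        keys = sorted(row)
    elif isinstance(id_attr, str):
        keys = [id_attr]
    else:
        keys = sorted(id_attr)

    payload: dict[str, Any] = {"data": {k: row[k] for k in keys}}
    if hash_id_with_path:
        # Assumed behaviour: the tree path becomes part of the hashed input.
        payload["path"] = list(path)

    encoded = json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


first = {"name": "Ada", "year": 1815}
second = {"name": "Ada", "year": 1816}

# Hashing only "name" gives both rows the same id; hashing everything does not.
same_id = row_id(first, id_attr=["name"]) == row_id(second, id_attr=["name"])
different_id = row_id(first) != row_id(second)
```

Under these assumptions, restricting id_attr to a stable subset of attributes makes rows with the same key collapse to one id, while hash_id_with_path=True keeps identical data found at different positions in the tree apart.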

RelDB#

Full relational database representation.

alias of tuple[dict[str, dict[Hashable, dict[str, Any]]], dict[tuple[str, str], tuple[str, str]], set[str]]
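
To make the shape of this alias concrete, here is a hand-built value of the same type. The meaning given to the three elements (tables keyed by name and row id, references from one table column to another, and a set of table names, here the join tables) is an interpretation of the type alone and not confirmed by the documentation above. RelDBSketch and follow_reference are hypothetical names.

```python
from collections.abc import Hashable
from typing import Any

# Same structure as the documented alias, under a hypothetical local name.
RelDBSketch = tuple[
    dict[str, dict[Hashable, dict[str, Any]]],
    dict[tuple[str, str], tuple[str, str]],
    set[str],
]

# Assumed reading: table name -> row id -> column -> value.
tables: dict[str, dict[Hashable, dict[str, Any]]] = {
    "authors": {"a1": {"name": "Ada"}},
    "books": {"b1": {"title": "Notes"}},
    "authors_books": {"j1": {"authors_id": "a1", "books_id": "b1"}},
}

# Assumed reading: (table, column) -> (referenced table, referenced column).
relations: dict[tuple[str, str], tuple[str, str]] = {
    ("authors_books", "authors_id"): ("authors", "id"),
    ("authors_books", "books_id"): ("books", "id"),
}

# Assumed reading: names of tables that only exist to link other tables.
join_tables: set[str] = {"authors_books"}

db: RelDBSketch = (tables, relations, join_tables)


def follow_reference(
    db: RelDBSketch, table: str, row_id: Hashable, column: str
) -> dict[str, Any]:
    """Return the row that tables[table][row_id][column] points to."""
    all_tables, all_relations, _ = db
    target_table, _target_column = all_relations[(table, column)]
    target_id = all_tables[table][row_id][column]
    return all_tables[target_table][target_id]


linked_author = follow_reference(db, "authors_books", "j1", "authors_id")
```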

tree_to_db(data: dict | str, mapping: TableMap, collect_conflicts: Literal[True] = False) → tuple[DB, dict[tuple[str, str, str], tuple[Any, Any]]][source]#
tree_to_db(data: dict | str, mapping: TableMap, collect_conflicts: Literal[False] = False) → DB

Transform recursive dictionary data into relational format.

Parameters:
  • data – The data to be transformed

  • mapping – Configuration for how to perform the mapping.

  • collect_conflicts – Collect all conflicts and return them, rather than stopping right away.

Returns:

The relational database representation of the data. If collect_conflicts is True, a tuple of the database and the conflicts is returned.
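
The difference between stopping at the first conflict and collecting conflicts can be sketched without the library. Everything here is an assumption made for illustration: merge_row is a hypothetical helper, the conflict key is read as (table, row id, column) and the value as (existing, incoming), and the behaviour of each conflict_policy value is inferred from its name only.

```python
from __future__ import annotations

from typing import Any

# Same shape as the conflicts dictionary in the documented return type.
Conflicts = dict[tuple[str, str, str], tuple[Any, Any]]


def merge_row(
    rows: dict[str, dict[str, Any]],
    table: str,
    row_id: str,
    incoming: dict[str, Any],
    policy: str = "raise",
    conflicts: Conflicts | None = None,
) -> None:
    """Merge one incoming row into rows, applying an assumed conflict policy."""
    existing = rows.setdefault(row_id, {})
    for column, new_value in incoming.items():
        if column not in existing or existing[column] == new_value:
            # No conflict: the column is new or the values agree.
            existing[column] = new_value
        elif policy == "override":
            existing[column] = new_value
        elif policy == "ignore":
            continue  # keep the existing value
        elif policy == "raise":
            if conflicts is None:
                # Stop right away, as when conflicts are not collected.
                raise ValueError(f"conflict in {table}.{column} for id {row_id}")
            # Collect mode: record the conflict and keep the existing value.
            conflicts[(table, row_id, column)] = (existing[column], new_value)
        else:
            raise ValueError(f"unknown policy: {policy!r}")


rows: dict[str, dict[str, Any]] = {}
collected: Conflicts = {}
merge_row(rows, "authors", "a1", {"name": "Ada", "year": 1815})
merge_row(rows, "authors", "a1", {"name": "Ada", "year": 1816}, conflicts=collected)
```

After the second call, collected holds one entry for the year column and the stored row is unchanged; passing no conflicts dictionary would raise on the same input instead.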