vishalbelsare/nexus-forge

Installation | Getting Started | Contributing | Acknowledgements

Knowledge Graph Forge

A domain-agnostic, generic and extensible Python framework for consistently building and interacting with knowledge graphs in a data science context.

This framework builds a bridge between data engineers, knowledge engineers, and data scientists in the context of knowledge graphs by making it easier for:

  • data engineers to define, execute, and share data transformations in a traceable way,
  • knowledge engineers to define and share knowledge representations of heterogeneous data,
  • data scientists to query and register data during their analysis without having to worry about the semantic formats and technologies,

while guaranteeing the consistency of operations with a knowledge graph schema like Neuroshapes.

The architectural design choices are to:

  1. be generic where genericity brings flexibility for adapting to multiple ecosystems,
  2. be opinionated where an opinion reduces complexity,
  3. keep a strong separation of concerns, delegating to the lowest level for modularity.

Installation

Stable version

pip install kgforge

Upgrade to latest version

pip install --upgrade kgforge

Development version

pip install git+https://github.com/BlueBrain/kgforge

Getting Started

See the examples directory for examples of usage, configurations, and mappings.

User API

Forge

KnowledgeGraphForge(configuration: Union[str, Dict], **kwargs)

Resources

Resource(**properties)

Dataset(forge: KnowledgeGraphForge, type: str = "Dataset", **properties)
  add_parts(resources: List[Resource], versioned: bool = True) -> None
  add_distribution(path: str) -> None
  add_contribution(agent: str, **kwargs) -> None
  add_generation(**kwargs) -> None
  add_derivation(resource: Resource, versioned: bool = True, **kwargs) -> None
  add_invalidation(**kwargs) -> None
  add_files(path: str) -> None
  download(source: str, path: str) -> None
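
The `Resource(**properties)` signature suggests that keyword arguments become the properties of the resource. The following is a minimal sketch of that idea only, not the library's implementation: `SketchResource` and its `to_dict` helper are hypothetical names introduced for illustration.

```python
from typing import Any, Dict


class SketchResource:
    """Hypothetical stand-in for a Resource: keyword properties become attributes."""

    def __init__(self, **properties: Any) -> None:
        for name, value in properties.items():
            setattr(self, name, value)

    def to_dict(self) -> Dict[str, Any]:
        # In this sketch, only public attributes count as properties.
        return {k: v for k, v in vars(self).items() if not k.startswith("_")}


# Properties can be plain values or other resources (nesting).
jane = SketchResource(type="Person", name="Jane Doe")
association = SketchResource(type="Association", agent=jane)
```

Nested resources are reached with ordinary attribute access, for example `association.agent.name`.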

Modeling

prefixes(pretty: bool = True) -> Optional[Dict[str, str]]
types(pretty: bool = True) -> Optional[List[str]]
template(type: str, only_required: bool = False) -> None
validate(data: Union[Resource, List[Resource]]) -> None

Resolving

resolve(text: str, scope: Optional[str] = None, resolver: Optional[str] = None, target: Optional[str] = None, type: Optional[str] = None, strategy: ResolvingStrategy = ResolvingStrategy.BEST_MATCH) -> Optional[Union[Resource, List[Resource]]]
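
To illustrate how a resolving strategy can change what `resolve` returns, here is a small self-contained sketch. It is an assumption-laden toy, not the library's resolver: `SketchStrategy`, `sketch_resolve`, and the `ALL_MATCHES` member are hypothetical, and string similarity from the standard library stands in for whatever matching a real resolver performs.

```python
from difflib import SequenceMatcher
from enum import Enum, auto
from typing import List, Optional, Union


class SketchStrategy(Enum):
    # BEST_MATCH mirrors the default named in the signature above.
    # ALL_MATCHES is an assumed second member, added for illustration.
    BEST_MATCH = auto()
    ALL_MATCHES = auto()


def sketch_resolve(
    text: str,
    candidates: List[str],
    strategy: SketchStrategy = SketchStrategy.BEST_MATCH,
) -> Optional[Union[str, List[str]]]:
    """Rank candidates by similarity to `text` and apply the strategy."""

    def score(candidate: str) -> float:
        return SequenceMatcher(None, text.lower(), candidate.lower()).ratio()

    ranked = sorted(candidates, key=score, reverse=True)
    if not ranked:
        return None
    if strategy is SketchStrategy.BEST_MATCH:
        return ranked[0]
    return ranked
```

With `BEST_MATCH` the caller gets a single value, while the assumed `ALL_MATCHES` returns the whole ranked list, which matches the `Optional[Union[Resource, List[Resource]]]` shape of the return type.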

Formatting

format(what: str, *args) -> str

Mapping

sources(pretty: bool = True) -> Optional[List[str]]
mappings(source: str, pretty: bool = True) -> Optional[Dict[str, List[str]]]
mapping(entity: str, source: str, type: Callable = DictionaryMapping) -> Mapping
map(data: Any, mapping: Union[Mapping, List[Mapping]], mapper: Callable = DictionaryMapper, na: Union[Any, List[Any]] = None) -> Union[Resource, List[Resource]]
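
The `map` signature combines input data, a mapping, and an `na` marker for missing values. The sketch below shows one plausible reading of that combination on plain dictionaries. It is not the `DictionaryMapper` implementation: `sketch_map` is hypothetical, the rules are written as Python callables rather than the library's mapping format, and treating `na` as "drop properties with this value" is an assumption.

```python
from typing import Any, Callable, Dict, List, Union

Rule = Union[Any, Callable[[Dict[str, Any]], Any]]


def sketch_map(
    record: Dict[str, Any],
    rules: Dict[str, Rule],
    na: Union[Any, List[Any]] = None,
) -> Dict[str, Any]:
    """Apply each rule to the record and drop values marked as missing."""
    missing = na if isinstance(na, list) else [na]
    mapped: Dict[str, Any] = {}
    for prop, rule in rules.items():
        # A rule is either a constant or a function of the source record.
        value = rule(record) if callable(rule) else rule
        if value not in missing:
            mapped[prop] = value
    return mapped


record = {"first": "Jane", "last": "Doe", "age": "N/A"}
rules = {
    "type": "Person",
    "name": lambda x: f"{x['first']} {x['last']}",
    "age": lambda x: x["age"],
}
person = sketch_map(record, rules, na="N/A")
```

Keeping the rules separate from the function that applies them is what lets the same mapping be shared and reused across data sources.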

Reshaping

reshape(data: Union[Resource, List[Resource]], keep: List[str], versioned: bool = False) -> Union[Resource, List[Resource]]
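
As an illustration of what a `keep` list could select, the following sketch reshapes a nested dictionary. It assumes that entries of `keep` are dot-separated property paths and ignores the `versioned` flag. `sketch_reshape` is a hypothetical helper operating on dictionaries, not on `Resource` objects.

```python
from typing import Any, Dict, List


def sketch_reshape(data: Dict[str, Any], keep: List[str]) -> Dict[str, Any]:
    """Return a copy of `data` containing only the dotted paths in `keep`."""
    result: Dict[str, Any] = {}
    for path in keep:
        *parents, leaf = path.split(".")
        source, target = data, result
        for key in parents:
            # Walk down the source and mirror the structure in the result.
            source = source[key]
            target = target.setdefault(key, {})
        target[leaf] = source[leaf]
    return result


person = {
    "type": "Person",
    "name": "Jane Doe",
    "address": {"city": "Geneva", "postalCode": "1202"},
}
reshaped = sketch_reshape(person, keep=["name", "address.city"])
```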

Querying

retrieve(id: str, version: Optional[Union[int, str]] = None) -> Resource
paths(type: str) -> PathsWrapper
search(*filters, **params) -> List[Resource]
sparql(query: str) -> List[Resource]
download(data: Union[Resource, List[Resource]], follow: str, path: str) -> None

Storing

register(data: Union[Resource, List[Resource]]) -> None
update(data: Union[Resource, List[Resource]]) -> None
deprecate(data: Union[Resource, List[Resource]]) -> None

Versioning

tag(data: Union[Resource, List[Resource]], value: str) -> None
freeze(data: Union[Resource, List[Resource]]) -> None

Files handling

attach(path: str) -> LazyAction
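
The `LazyAction` return type suggests that attaching a file records an operation to run later instead of running it immediately. The sketch below shows that deferred-execution pattern in isolation. `SketchLazyAction`, its `execute` method, and `fake_upload` are hypothetical, and when the real action is triggered is not specified here.

```python
from typing import Any, Callable, List


class SketchLazyAction:
    """Hypothetical deferred call: stores an operation and its arguments."""

    def __init__(self, operation: Callable[..., Any], *args: Any) -> None:
        self.operation = operation
        self.args = args

    def execute(self) -> Any:
        # The operation runs only when execute() is called.
        return self.operation(*self.args)


uploaded: List[str] = []


def fake_upload(path: str) -> str:
    # Stand-in for a store upload; it only records the path.
    uploaded.append(path)
    return f"uploaded:{path}"


action = SketchLazyAction(fake_upload, "data.csv")
nothing_uploaded_yet = uploaded == []
outcome = action.execute()
```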

Converting

as_json(data: Union[Resource, List[Resource]], expanded: bool = False, store_metadata: bool = False) -> Union[Dict, List[Dict]]
as_jsonld(data: Union[Resource, List[Resource]], compacted: bool = True, store_metadata: bool = False) -> Union[Dict, List[Dict]]
as_triples(data: Union[Resource, List[Resource]], store_metadata: bool = False) -> List[Tuple[str, str, str]]
as_dataframe(data: List[Resource], na: Union[Any, List[Any]] = [None], nesting: str = ".", expanded: bool = False, store_metadata: bool = False) -> DataFrame
from_json(data: Union[Dict, List[Dict]], na: Union[Any, List[Any]] = None) -> Union[Resource, List[Resource]]
from_jsonld(data: Union[Dict, List[Dict]]) -> Union[Resource, List[Resource]]
from_triples(data: List[Tuple[str, str, str]]) -> Union[Resource, List[Resource]]
from_dataframe(data: DataFrame, na: Union[Any, List[Any]] = np.nan, nesting: str = ".") -> Union[Resource, List[Resource]]
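
The `nesting` parameter of `as_dataframe` and `from_dataframe` defaults to `"."`, which suggests that nested properties are flattened into column names joined by that separator. The sketch below shows such a round trip on plain dictionaries. `flatten` and `unflatten` are hypothetical helpers and the column naming is an assumption, not the library's conversion code.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], nesting: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Turn nested dictionaries into a single level with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{nesting}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, nesting, name))
        else:
            flat[name] = value
    return flat


def unflatten(flat: Dict[str, Any], nesting: str = ".") -> Dict[str, Any]:
    """Rebuild the nested structure from joined keys."""
    record: Dict[str, Any] = {}
    for name, value in flat.items():
        *parents, leaf = name.split(nesting)
        target = record
        for parent in parents:
            target = target.setdefault(parent, {})
        target[leaf] = value
    return record


person = {"name": "Jane Doe", "address": {"city": "Geneva"}}
row = flatten(person)
```

Each flattened dictionary corresponds to one table row, so a list of them could be passed to a `DataFrame` constructor. Note that the round trip is only lossless when property names do not themselves contain the separator.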

Internals

Archetypes

Mapper

Mapper(forge: Optional["KnowledgeGraphForge"] = None)
  map(data: Any, mapping: Union[Mapping, List[Mapping]], na: Union[Any, List[Any]]) -> Union[Resource, List[Resource]]

Mapping

Mapping(mapping: str)
  load(source: str) -> Mapping
  save(path: str) -> None

Model

Model(source: str, **source_config)
  prefixes(pretty: bool) -> Optional[Dict[str, str]]
  types(pretty: bool) -> Optional[List[str]]
  template(type: str, only_required: bool) -> str
  sources(pretty: bool) -> Optional[List[str]]
  mappings(source: str, pretty: bool) -> Optional[Dict[str, List[str]]]
  mapping(entity: str, source: str, type: Callable) -> Mapping
  validate(data: Union[Resource, List[Resource]]) -> None

Resolver

Resolver(source: str, targets: List[Dict[str, str]], result_resource_mapping: str, **source_config)
  resolve(text: str, target: Optional[str], type: Optional[str], strategy: ResolvingStrategy) -> Optional[Union[Resource, List[Resource]]]

Store

Store(endpoint: Optional[str] = None, bucket: Optional[str] = None, token: Optional[str] = None, versioned_id_template: Optional[str] = None, file_resource_mapping: Optional[str] = None)
  register(data: Union[Resource, List[Resource]]) -> None
  upload(path: str) -> Union[Resource, List[Resource]]
  retrieve(id: str, version: Optional[Union[int, str]]) -> Resource
  download(data: Union[Resource, List[Resource]], follow: str, path: str) -> None
  update(data: Union[Resource, List[Resource]]) -> None
  tag(data: Union[Resource, List[Resource]], value: str) -> None
  deprecate(data: Union[Resource, List[Resource]]) -> None
  search(resolvers: List[Resolver], *filters, **params) -> List[Resource]
  sparql(prefixes: Dict[str, str], query: str) -> List[Resource]
  freeze(data: Union[Resource, List[Resource]]) -> None

Archetype specializations

Mappers

DictionaryMapper
[TODO] R2RmlMapper
[TODO] ResourceMapper
[TODO] TableMapper

Mappings

DictionaryMapping

Models

DemoModel
[Work In Progress] Neuroshapes

Resolvers

DemoResolver

Stores

DemoStore
[TODO] RdfLibGraph
[Work In Progress] BlueBrainNexus
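
The split between archetypes and their specializations can be sketched with an abstract base class and one concrete subclass. This is a simplified illustration under stated assumptions: `SketchStore` covers only two of the `Store` methods listed above, works on dictionaries with an `id` key, and `InMemoryStore` is a hypothetical specialization that does not reproduce `DemoStore`.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class SketchStore(ABC):
    """Hypothetical archetype: declares the operations every store offers."""

    @abstractmethod
    def register(self, data: Dict[str, Any]) -> None:
        ...

    @abstractmethod
    def retrieve(self, id: str) -> Dict[str, Any]:
        ...


class InMemoryStore(SketchStore):
    """Hypothetical specialization that keeps records in a dictionary."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}

    def register(self, data: Dict[str, Any]) -> None:
        if data["id"] in self._records:
            raise ValueError(f"already registered: {data['id']}")
        # Store a copy so later changes by the caller do not leak in.
        self._records[data["id"]] = dict(data)

    def retrieve(self, id: str) -> Dict[str, Any]:
        return dict(self._records[id])


store = InMemoryStore()
store.register({"id": "person-1", "name": "Jane Doe"})
found = store.retrieve("person-1")
```

Code written against the archetype works with any specialization, which is how a backend can be swapped without touching the calling code.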

Contributing

Please add @pafonta as reviewer if your Pull Request modifies core.

Setup

git clone https://github.com/BlueBrain/kgforge
pip install --editable kgforge[dev]

Checks before committing

tox

Styling

PEP 8, PEP 257, and PEP 20 must be followed.

Releasing

# Setup
pip install --upgrade pip setuptools wheel twine

# Checkout
git checkout master
git pull upstream master

# Check
tox

# Tag
git tag -a v<x>.<y>.<z> HEAD
git push upstream v<x>.<y>.<z>

# Build
python setup.py sdist bdist_wheel

# Upload
twine upload dist/*

# Clean
rm -R build dist *.egg-info

Acknowledgements

This project has received funding from the EPFL Blue Brain Project (funded by the Swiss government’s ETH Board of the Swiss Federal Institutes of Technology) and from the European Union’s Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 785907 (Human Brain Project SGA2).
