Make the default classes immutable and mark mutable classes explicitly #205

Merged
merged 11 commits into from
Dec 3, 2021
9 changes: 7 additions & 2 deletions doc/datamodel_syntax.md
Original file line number Diff line number Diff line change
Expand Up @@ -88,15 +88,20 @@ There can be one-to-one-relations and one-to-many relations being stored in a pa
```

### Explicit definition of methods
In a few cases, it makes sense to add some more functionality to the created classes. Thus this library provides two ways of defining additional methods and code. Either by defining them inline or in external files. Extra code has to be provided separately for const and non-const additions.
In a few cases, it makes sense to add some more functionality to the created classes. Thus this library provides two ways of defining additional methods and code. Either by defining them inline or in external files. Extra code has to be provided separately for immutable and mutable additions.
Note that the immutable (`ExtraCode`) will also be placed into the mutable classes, so that there is no need for duplicating the code.
Only if some additions should only be available for the mutable classes it is necessary to fill the `MutableExtraCode` section.
The `includes` will be added to the header files of the generated classes.
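
The placement rule described here can be sketched in Python, the language of the generator. This is only an illustration under stated assumptions, not the generator's actual code: the helper `collect_declarations` and the sample `hit` definition are hypothetical, and only the keys `ExtraCode`, `MutableExtraCode` and `declaration` are taken from the syntax documented in this section.

```python
def collect_declarations(datatype, mutable):
    """Return the extra declarations for one generated class (hypothetical helper)."""
    # ExtraCode is shared by the immutable and the mutable class.
    parts = [datatype.get('ExtraCode', {}).get('declaration', '')]
    if mutable:
        # MutableExtraCode is only added to the mutable class.
        parts.append(datatype.get('MutableExtraCode', {}).get('declaration', ''))
    # Drop empty entries so that missing sections are simply ignored.
    return [p for p in parts if p]


# Hypothetical datatype definition, as it might look after reading the YAML.
hit = {
    'ExtraCode': {'declaration': 'double energySquared() const;'},
    'MutableExtraCode': {'declaration': 'void scaleEnergy(double factor);'},
}

immutable_decls = collect_declarations(hit, mutable=False)
mutable_decls = collect_declarations(hit, mutable=True)
```

In this sketch the immutable class receives only the `ExtraCode` declaration, while the mutable class receives both, so shared code does not have to be written twice.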

```yaml
ExtraCode:
includes: <newline separated list of strings (optional)>
declaration: <string>
implementation : <string>
declarationFile: <string> (to be implemented!)
implementationFile: <string> (to be implemented!)
ConstExtraCode:
MutableExtraCode:
includes: <newline separated list of strings (optional)>
declaration: <string>
implementation : <string>
declarationFile: <string> (to be implemented!)
Expand Down
24 changes: 16 additions & 8 deletions doc/design.md
Expand Up @@ -3,10 +3,10 @@
The driving considerations for the PODIO design are:

1. the concrete data are contained within plain-old-data structures (PODs)
1. user-exposed data types are concrete and do not use inheritance
1. The C++ and Python interface should look as close as possible
1. The user does not do any explicit memory management
1. Classes are generated using a higher-level abstraction and code generators
2. user-exposed data types are concrete and do not use inheritance
3. The C++ and Python interface should look as close as possible
4. The user does not do any explicit memory management
5. Classes are generated using a higher-level abstraction and code generators

The following sections give some more technical details and explanations for the design choices.
More concrete implementation details can be found in the doxygen documentation.
Expand All @@ -15,15 +15,20 @@ More concrete implementation details can be found in the doxygen documentation.
The data model is based on four different kinds of objects and layers, namely

1. user visible (physics) classes (e.g. `Hit`). These act as transparent references to the underlying data,
2. a transient object knowing about all data for a certain physics object, including inter-object references (e.g. `HitObject`),
2. a transient object knowing about all data for a certain physics object, including inter-object relations (e.g. `HitObject`),
3. a plain-old-data (POD) type holding the persistent object information (e.g. `HitData`), and
4. a user-visible collection containing the physics objects (e.g. `HitCollection`).

These layers are described in the following.

### The User Layer

The user visible objects (e.g. `Hit`) act as light-weight references to the underlying data, and provide the necessary user interface. For each of the data-members and one-to-one relations declared in the data model definition, corresponding setters and getters are created. For each of the one-to-many relations a vector-like interface is provided.
The user visible objects (e.g. `Hit`) act as light-weight references to the underlying data, and provide the necessary user interface. They come in two flavors, mutable and immutable, where the mutable classes are easily recognizable by their name (e.g. `MutableHit`).
Mutable classes are implicitly converted to immutable ones if necessary, so that interfaces that require only reading access to the data should always use the immutable classes. Only in cases where explicit mutability of the objects is required should mutable classes appear in interface definitions.


For each of the data-members and `OneToOneRelations` declared in the data model definition, corresponding getters (and setters for the mutable classes) are generated.
For each of the `OneToManyRelations` and `VectorMembers` a vector-like interface is provided.
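
The relationship between the two flavors can be modeled conceptually in Python. This is a simplified stand-in rather than the generated C++ code: the names `HitObj`, `energy`, `set_energy` and `to_immutable` are hypothetical, and the conversion is written as an explicit method because Python has no implicit conversions. The sketch also assumes that both handles refer to the same underlying object.

```python
class HitObj:
    """Stand-in for the shared internal object that holds the data."""

    def __init__(self, energy=0.0):
        self.energy = energy


class Hit:
    """Immutable handle: exposes getters only."""

    def __init__(self, obj):
        self._obj = obj

    def energy(self):
        return self._obj.energy


class MutableHit:
    """Mutable handle: exposes getters and setters."""

    def __init__(self, obj=None):
        self._obj = obj if obj is not None else HitObj()

    def energy(self):
        return self._obj.energy

    def set_energy(self, value):
        self._obj.energy = value

    def to_immutable(self):
        # Explicit here; the design text describes this step as implicit in C++.
        return Hit(self._obj)


def total_energy(hits):
    """A read-only interface only needs the immutable handles."""
    return sum(h.energy() for h in hits)


mutable_hit = MutableHit()
mutable_hit.set_energy(2.5)
hit = mutable_hit.to_immutable()
# Both handles reference the same data, so later changes remain visible.
mutable_hit.set_energy(4.0)
```

Functions such as `total_energy` accept the immutable flavor, which is why read-only interfaces do not need to mention the mutable classes at all.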

With the chosen interface, the code written in C++ and Python looks almost identical, if taking proper advantage of the `auto` keyword.

Expand All @@ -33,7 +38,7 @@ The internal objects give access to the object data, i.e. the POD, and the refer
These objects inherit from `podio::ObjBase`, which takes care of object identification (`podio::ObjectID`), and object-ownership. The `ObjectID` consists of the index of the object and an ID of the collection it belongs to. If the object does not belong to a collection yet, the data object owns the POD containing the real data, otherwise the POD is owned by the respective collection. For details about the inter-object references and their handling within the data objects please see below.

### The POD Layer
The plain-old-data (POD) contain just the data declared in the `Members` section of the datamodel definition. Ownership and lifetime of the PODs is managed by the other parts of the infrastructure, namely the data objects and the data collections.
The plain-old-data (POD) contain just the data declared in the `Members` section of the datamodel definition as well as some bookkeeping data for data types with `OneToManyRelations` or `VectorMembers`. Ownership and lifetime of the PODs is managed by the other parts of the infrastructure, namely the data objects and the data collections.

### The Collections

Expand All @@ -42,6 +47,9 @@ The collections created serve three purposes:
1. giving access to or creating the data items
2. preparing objects for writing into PODs or preparing them after reading
3. support for the so-called notebook pattern

When used via the so-called factory pattern (i.e. using the `create` function to create new objects) a collection will return mutable objects.
It is important to note that objects that are "owned" by a collection (i.e. they are either created via `create` or they are added to the collection via `push_back`) become invalid and can no longer be used once a collection is `clear`ed.

### Vectorization support / notebook pattern

Expand All @@ -54,4 +62,4 @@ While the base assumption of PODIO is that once-created collections are immutabl
it still allows for explicit `unfreezing` collections afterwards.
This feature has to be handled with care, as it heavily impacts thread-safety.


2 changes: 1 addition & 1 deletion doc/doc.md
Expand Up @@ -12,4 +12,4 @@ Many of the design choices are inspired by previous experience of the [LCIO pack

# Quick-start

An up-to-date installation and quick start guide for the impatient user can be found on the [PODIO github page](https://github.com/hegner/podio).
An up-to-date installation and quick start guide for the impatient user can be found on the [PODIO github page](https://github.com/AIDASoft/podio).
16 changes: 0 additions & 16 deletions python/generator_utils.py
Expand Up @@ -130,10 +130,6 @@ def __init__(self, name, **kwargs):

# For usage in constructor signatures
self.signature = self.full_type + ' ' + self.name
# If used in a relation context. NOTE: The generator might still adapt this
# depending on other criteria. Here it is just filled with a sane default
# that works if none of these criteria are met
self.relation_type = self.full_type

# Needed in case the PODs are exposed
self.sub_members = None
Expand All @@ -157,18 +153,6 @@ def __str__(self):
definition += r' ///< {}'.format(self.description)
return definition

def as_const(self):
"""Get the Const name for the type without any namespace"""
if self.is_array or self.is_builtin:
raise ValueError('Trying to get the Const version of a builtin or array member')
return 'Const{}'.format(self.bare_type)

def as_qualified_const(self):
"""string representation for the ConstType including namespace"""
if self.namespace:
return '::{nsp}::{cls}'.format(nsp=self.namespace, cls=self.as_const())
return self.as_const()

def getter_name(self, get_syntax):
"""Get the getter name of the variable"""
if not get_syntax:
Expand Down
45 changes: 21 additions & 24 deletions python/podio_class_generator.py
Expand Up @@ -161,15 +161,16 @@ def _get_filenames_templates(template_base, name):
# depending on which category is passed different naming conventions apply
# for the generated files. Additionally not all categories need source files.
# Listing the special cases here
fn_base = {
'Data': 'Data',
'Obj': 'Obj',
'ConstObject': 'Const',
'PrintInfo': 'PrintInfo',
'Object': '',
'Component': '',
'SIOBlock': 'SIOBlock',
}.get(template_base, template_base)
def get_fn_format(tmpl):
"""Get a format string for the filename"""
prefix = {'MutableObject': 'Mutable'}
postfix = {'Data': 'Data',
'Obj': 'Obj',
'SIOBlock': 'SIOBlock',
'Collection': 'Collection',
'CollectionData': 'CollectionData'}

return f'{prefix.get(tmpl, "")}{{name}}{postfix.get(tmpl, "")}.{{end}}'

endings = {
'Data': ('h',),
Expand All @@ -180,7 +181,7 @@ def _get_filenames_templates(template_base, name):
fn_templates = []
for ending in endings:
template_name = '{fn}.{end}.jinja2'.format(fn=template_base, end=ending)
filename = '{name}{fn}.{end}'.format(fn=fn_base, name=name, end=ending)
filename = get_fn_format(template_base).format(name=name, end=ending)
fn_templates.append((filename, template_name))

return fn_templates
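
The effect of the new `get_fn_format` helper can be checked in isolation. The snippet below copies the helper from this hunk as a standalone function; the datatype name `Hit` and the ending `h` are sample inputs chosen for illustration.

```python
def get_fn_format(tmpl):
    """Get a format string for the filename"""
    prefix = {'MutableObject': 'Mutable'}
    postfix = {'Data': 'Data',
               'Obj': 'Obj',
               'SIOBlock': 'SIOBlock',
               'Collection': 'Collection',
               'CollectionData': 'CollectionData'}

    # Doubled braces survive the f-string as literal {name} and {end}
    # placeholders, which are filled in by the later .format() call.
    return f'{prefix.get(tmpl, "")}{{name}}{postfix.get(tmpl, "")}.{{end}}'


# Sample inputs: 'Hit' as the datatype name and 'h' as the file ending.
filenames = {
    tmpl: get_fn_format(tmpl).format(name='Hit', end='h')
    for tmpl in ('Object', 'MutableObject', 'Obj', 'Data', 'Collection')
}
```

With these inputs the templates map to `Hit.h`, `MutableHit.h`, `HitObj.h`, `HitData.h` and `HitCollection.h`. Because the lookup now falls back to an empty string instead of the template name, `Collection` and `CollectionData` have to be listed explicitly in `postfix`.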
Expand Down Expand Up @@ -217,7 +218,7 @@ def _process_datatype(self, name, definition):
datatype = self._preprocess_datatype(name, definition)
self._fill_templates('Data', datatype)
self._fill_templates('Object', datatype)
self._fill_templates('ConstObject', datatype)
self._fill_templates('MutableObject', datatype)
self._fill_templates('Obj', datatype)
self._fill_templates('Collection', datatype)
self._fill_templates('CollectionData', datatype)
Expand All @@ -231,24 +232,17 @@ def _preprocess_for_obj(self, datatype):
includes, includes_cc = set(), set()

for relation in datatype['OneToOneRelations']:
if not relation.is_builtin:
relation.relation_type = relation.as_qualified_const()

if relation.full_type != datatype['class'].full_type:
if relation.namespace not in fwd_declarations:
fwd_declarations[relation.namespace] = []
fwd_declarations[relation.namespace].append('Const' + relation.bare_type)
includes_cc.add(self._build_include(relation.bare_type + 'Const'))
fwd_declarations[relation.namespace].append(relation.bare_type)
includes_cc.add(self._build_include(relation.bare_type))

if datatype['VectorMembers'] or datatype['OneToManyRelations']:
includes.add('#include <vector>')
includes.add('#include "podio/RelationRange.h"')

for relation in datatype['VectorMembers'] + datatype['OneToManyRelations']:
if not relation.is_builtin:
if relation.full_type not in self.reader.components:
relation.relation_type = relation.as_qualified_const()

if relation.full_type == datatype['class'].full_type:
includes_cc.add(self._build_include(datatype['class'].bare_type))
else:
Expand All @@ -261,7 +255,7 @@ def _preprocess_for_obj(self, datatype):
datatype['obj_needs_destructor'] = needs_destructor

def _preprocess_for_class(self, datatype):
"""Do the preprocessing that is necessary for the classes and Const classes"""
"""Do the preprocessing that is necessary for the classes and Mutable classes"""
includes = set(datatype['includes_data'])
fwd_declarations = {}
includes_cc = set()
Expand All @@ -275,11 +269,12 @@ def _preprocess_for_class(self, datatype):
if relation.namespace not in fwd_declarations:
fwd_declarations[relation.namespace] = []
fwd_declarations[relation.namespace].append(relation.bare_type)
fwd_declarations[relation.namespace].append('Const' + relation.bare_type)
fwd_declarations[relation.namespace].append('Mutable' + relation.bare_type)
includes_cc.add(self._build_include(relation.bare_type))

if datatype['VectorMembers'] or datatype['OneToManyRelations']:
includes.add('#include <vector>')
includes.add('#include "podio/RelationRange.h"')

for relation in datatype['OneToManyRelations']:
if self._needs_include(relation):
Expand All @@ -290,7 +285,8 @@ def _preprocess_for_class(self, datatype):
includes.add(self._build_include(vectormember.bare_type))

includes.update(datatype.get('ExtraCode', {}).get('includes', '').split('\n'))
includes.update(datatype.get('ConstExtraCode', {}).get('includes', '').split('\n'))
# TODO: in principle only the mutable classes would need these includes!
includes.update(datatype.get('MutableExtraCode', {}).get('includes', '').split('\n'))

# When we have a relation to the same type we have the header that we are
# just generating in the includes. This would lead to a circular include, so
Expand All @@ -308,7 +304,8 @@ def _preprocess_for_collection(self, datatype):
"""Do the necessary preprocessing for the collection"""
includes_cc = set()
for relation in datatype['OneToManyRelations'] + datatype['OneToOneRelations']:
includes_cc.add(self._build_include(relation.bare_type + 'Collection'))
if datatype['class'].bare_type != relation.bare_type:
includes_cc.add(self._build_include(relation.bare_type + 'Collection'))

if datatype['VectorMembers']:
includes_cc.add('#include <numeric>')
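
The self-relation check added in this hunk can be illustrated with a small standalone sketch. The function `collection_includes` and the include string format are hypothetical; the real generator uses `self._build_include`, whose output is not shown in this diff.

```python
def collection_includes(class_name, relation_types):
    """Collect collection headers for related types, skipping the class itself."""
    includes = set()
    for rel in relation_types:
        # A relation to the same type does not need an additional include.
        if rel != class_name:
            # Hypothetical include format, chosen only for illustration.
            includes.add(f'#include "{rel}Collection.h"')
    return includes


# Hypothetical datatype 'Hit' with a relation to itself and one to 'Cluster'.
hit_includes = collection_includes('Hit', ['Hit', 'Cluster'])
```

Here only the `Cluster` collection header ends up in the set, mirroring how the changed code skips relations whose `bare_type` matches the datatype being generated.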
Expand Down
2 changes: 1 addition & 1 deletion python/podio_config_reader.py
Expand Up @@ -117,7 +117,7 @@ class ClassDefinitionValidator(object):
)
valid_extra_datatype_keys = (
"ExtraCode",
"ConstExtraCode"
"MutableExtraCode"
)

# documented but not yet implemented
Expand Down
6 changes: 3 additions & 3 deletions python/templates/CMakeLists.txt
Expand Up @@ -4,13 +4,13 @@ set(PODIO_TEMPLATES
${CMAKE_CURRENT_LIST_DIR}/CollectionData.cc.jinja2
${CMAKE_CURRENT_LIST_DIR}/CollectionData.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/Component.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/ConstObject.cc.jinja2
${CMAKE_CURRENT_LIST_DIR}/ConstObject.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/Data.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/Obj.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/Obj.cc.jinja2
${CMAKE_CURRENT_LIST_DIR}/Object.cc.jinja2
${CMAKE_CURRENT_LIST_DIR}/Object.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/Obj.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/MutableObject.cc.jinja2
${CMAKE_CURRENT_LIST_DIR}/MutableObject.h.jinja2
${CMAKE_CURRENT_LIST_DIR}/selection.xml.jinja2
${CMAKE_CURRENT_LIST_DIR}/SIOBlock.cc.jinja2
${CMAKE_CURRENT_LIST_DIR}/SIOBlock.h.jinja2
Expand Down
24 changes: 12 additions & 12 deletions python/templates/Collection.cc.jinja2
Expand Up @@ -23,20 +23,20 @@
clear();
}

Const{{ class.bare_type }} {{ collection_type }}::operator[](unsigned int index) const {
return Const{{ class.bare_type }}(m_storage.entries[index]);
{{ class.bare_type }} {{ collection_type }}::operator[](unsigned int index) const {
return {{ class.bare_type }}(m_storage.entries[index]);
}

Const{{ class.bare_type }} {{ collection_type }}::at(unsigned int index) const {
return Const{{ class.bare_type }}(m_storage.entries.at(index));
{{ class.bare_type }} {{ collection_type }}::at(unsigned int index) const {
return {{ class.bare_type }}(m_storage.entries.at(index));
}

{{ class.bare_type }} {{ collection_type }}::operator[](unsigned int index) {
return {{ class.bare_type }}(m_storage.entries[index]);
Mutable{{ class.bare_type }} {{ collection_type }}::operator[](unsigned int index) {
return Mutable{{ class.bare_type }}(m_storage.entries[index]);
}

{{ class.bare_type }} {{ collection_type }}::at(unsigned int index) {
return {{ class.bare_type }}(m_storage.entries.at(index));
Mutable{{ class.bare_type }} {{ collection_type }}::at(unsigned int index) {
return Mutable{{ class.bare_type }}(m_storage.entries.at(index));
}

size_t {{ collection_type }}::size() const {
Expand All @@ -54,7 +54,7 @@ void {{ collection_type }}::setSubsetCollection(bool setSubset) {
m_isSubsetColl = setSubset;
}

{{ class.bare_type }} {{ collection_type }}::create() {
Mutable{{ class.bare_type }} {{ collection_type }}::create() {
if (m_isSubsetColl) {
throw std::logic_error("Cannot create new elements on a subset collection");
}
Expand All @@ -65,7 +65,7 @@ void {{ collection_type }}::setSubsetCollection(bool setSubset) {
{% endif %}

obj->id = {int(m_storage.entries.size() - 1), m_collectionID};
return {{ class.bare_type }}(obj);
return Mutable{{ class.bare_type }}(obj);
}

void {{ collection_type }}::clear() {
Expand Down Expand Up @@ -97,7 +97,7 @@ bool {{ collection_type }}::setReferences(const podio::ICollectionProvider* {% i
return true; //TODO: check success
}

void {{ collection_type }}::push_back(Const{{ class.bare_type }} object) {
void {{ collection_type }}::push_back({{ class.bare_type }} object) {
// We have to do different things here depending on whether this is a
// subset collection or not. A normal collection cannot collect objects
// that are already part of another collection, while a subset collection
Expand Down Expand Up @@ -133,7 +133,7 @@ podio::CollectionBuffers {{ collection_type }}::getBuffers() {

{{ iterator_definitions(class) }}

{{ iterator_definitions(class, prefix='Const' ) }}
{{ iterator_definitions(class, prefix='Mutable' ) }}

{{ macros.ostream_operator(class, Members, OneToOneRelations, OneToManyRelations, VectorMembers, use_get_syntax, ostream_collection_settings) }}

Expand Down