Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cannot write out event data previously read from file #103

Closed
gaede opened this issue Jun 23, 2020 · 1 comment · Fixed by #144
Closed

cannot write out event data previously read from file #103

gaede opened this issue Jun 23, 2020 · 1 comment · Fixed by #144
Labels

Comments

@gaede
Copy link
Contributor

gaede commented Jun 23, 2020

Currently we cannot write out event data previously read in from a file to another file.
This is a standard use case, e.g. in reconstruction or any other event data processing job.
See example ./tests/read_and_write.cpp (see #102 ).

  • the main reason is the different memory layout of relations and vector members in events that have been read from a file.
@tmadlener
Copy link
Collaborator

Fixing the main issue of the different memory layouts of collections that are newly created and the ones that have been read from file is actually fairly trivial by introducing a boolean flag to the collection that tracks this:

diff --git a/python/templates/Collection.cc.jinja2 b/python/templates/Collection.cc.jinja2
index 282d45a..01d3834 100644
--- a/python/templates/Collection.cc.jinja2
+++ b/python/templates/Collection.cc.jinja2
@@ -16,7 +16,7 @@
 
 {% with collection_type = class.bare_type + 'Collection' %}
 {{ collection_type }}::{{ collection_type }}() :
-  m_isValid(false), m_collectionID(0), m_entries()
+  m_isValid(false), m_isReadFromFile(false), m_collectionID(0), m_entries()
 {%- for relation in OneToManyRelations + OneToOneRelations %},
   m_rel_{{ relation.name }}(new std::vector<{{ relation.namespace }}::Const{{ relation.bare_type }}>())
 {%- endfor %},
@@ -109,6 +109,11 @@ void {{ collection_type }}::prepareForWrite() {
   const auto size = m_entries.size();
   m_data->reserve(size);
   for (auto& obj : m_entries) { m_data->push_back(obj->data); }
+
+  // if the collection has been read from a file the rest of the information is
+  // already in the correct format and we have to skip it, since the temporary
+  // buffers are invalid
+  if (m_isReadFromFile) return;
   for (auto& pointer : m_refCollections) { pointer->clear(); }
 
 {% for relation in OneToManyRelations %}
@@ -155,6 +160,7 @@ void {{ collection_type }}::prepareAfterRead() {
   // have a redundant (but now useless) copy of the data
   m_data->clear();
   m_isValid = true;
+  m_isReadFromFile = true;
 }
 
 bool {{ collection_type }}::setReferences(const podio::ICollectionProvider* collectionProvider) {
diff --git a/python/templates/Collection.h.jinja2 b/python/templates/Collection.h.jinja2
index 03749bb..8ea3641 100644
--- a/python/templates/Collection.h.jinja2
+++ b/python/templates/Collection.h.jinja2
@@ -135,6 +135,7 @@ public:
 
 private:
   bool m_isValid;
+  bool m_isReadFromFile{false};
   int m_collectionID;
   {{ class.bare_type }}ObjPointerContainer m_entries; 

The main idea is that all that prepareForWrite does is to bring all the temporary data into contiguous buffers that can then be written. However, since we have read the data into contiguous buffers we can skip this (almost completely). The one thing that we have to do is to copy all the Data objects into one buffer again. Before #129 not even this would have been necessary and prepareForWrite would have been a no-op.

Things get more tricky at the interplay between the EventStore and the ROOTReader and ROOTWriter. When we are writing events then we are currently re-using the same collection objects internally and simply clear them via EventStore::clearCollections before we start filling them again in the next event. However, when we read events we get entirely new collection objects for every event and we have to dispose them via EventStore::clear. The problem now is that the ROOTWriter will never realize that his initially valid collection objects have become invalid after we have processed the first event. To fix this we would have to re-register all the collections that should be written again for each event to make sure that the ROOTWriter has access to the correct (and valid) collection objects.

Also tagging #140 here to make sure that this is considered in the discussion there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants