In-situ creation and addition of containers to the EventStore #112

Closed
hegner opened this issue Jul 6, 2020 · 2 comments

Comments

@hegner
Collaborator

hegner commented Jul 6, 2020

Define a policy about enabling "on-demand" creation and addition of containers

(a follow up on #109)

@tmadlener
Collaborator

I have had a brief look at the possibility of supporting something like "per-event collections", i.e. not having to register collections in an initialization phase prior to the actual processing. For context: the initialization-phase approach is what Gaudi uses, e.g. to facilitate scheduling of the different algorithms, because their dependencies can be resolved during initialization. The "per-event collections" approach is similar to what LCIO does in ILCSoft, where the Event is the main entry point and collections can be added to it without having to be declared first.

From what I gathered, this will not be easy with the "naive" ROOT approach of storing the event data in a TTree, as is currently done. The problem is that a TTree stores data in columns (branches). It is easy to add branches at any point during data processing, but a newly added branch is then visible for all events stored in the tree, not only for the events filled after it was added.

For example

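// Create a tree with two branches that exist from the start
auto* tree = new TTree("test_tree", "test tree");

int index = 0;
double val = 0;
tree->Branch("index", &index);
tree->Branch("val", &val);

// Fill the first 11 entries
while (index++ < 11) {
  val = 0.5 * index;
  tree->Fill();
}

// Add a third branch after entries have already been filled,
// then fill a single entry
double lateVal = 0;
tree->Branch("lateVal", &lateVal);
lateVal = 2.5;
tree->Fill();

// Fill 8 more entries, for 20 in total
while (index++ < 20) {
  val = 0.75 * index;
  tree->Fill();
}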

This results in a TTree with 20 entries, in which lateVal is available in all 20 entries with a value of 2.5, even in the entries that were filled before the branch was added.

I am not yet sure how or if this can be easily solved.
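One conceivable direction, sketched here purely as an illustration, is to make the absence of a collection explicit in the columnar layout: when a column first appears, earlier entries are padded with an "absent" marker, and later entries that lack the column get the same marker. The sketch below is a stdlib-only toy model. `ToyColumnarStore`, `fill` and `get` are hypothetical names, and none of this is the ROOT TTree API or anything podio currently does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy model of a columnar store. This is NOT the ROOT TTree API;
// it only illustrates padding late columns with an explicit "absent" marker.
class ToyColumnarStore {
public:
  // One entry ("event") is a set of named values; the names may differ per entry.
  using Entry = std::map<std::string, double>;

  void fill(const Entry& entry) {
    // A column seen for the first time is padded with "absent" for every
    // entry that was filled before the column existed.
    for (const auto& kv : entry) {
      auto [it, inserted] = m_columns.try_emplace(kv.first);
      if (inserted) {
        it->second.assign(m_nEntries, std::nullopt);
      }
    }
    // Every known column receives exactly one new slot per entry, so all
    // columns always have the same length.
    for (auto& [name, column] : m_columns) {
      const auto found = entry.find(name);
      if (found != entry.end()) {
        column.emplace_back(found->second);
      } else {
        column.emplace_back(std::nullopt);
      }
    }
    ++m_nEntries;
  }

  // Returns std::nullopt if the column is unknown or absent in entry i.
  std::optional<double> get(const std::string& name, std::size_t i) const {
    assert(i < m_nEntries);
    const auto it = m_columns.find(name);
    if (it == m_columns.end()) {
      return std::nullopt;
    }
    return it->second[i];
  }

  std::size_t size() const { return m_nEntries; }

private:
  std::map<std::string, std::vector<std::optional<double>>> m_columns;
  std::size_t m_nEntries = 0;
};
```

With this layout a reader can tell "not present in this event" apart from a stored value, at the cost of one marker per column per entry. Whether something equivalent is practical on top of a TTree is exactly the open question above.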

Another thing that came to mind after the discussion on Friday: while I think podio should in principle be able to support both use cases, it might be beneficial not to have to support both simultaneously, i.e. the user decides at initialization (or maybe even at compile time) how the data should be handled. Or is there any "realistic" use case where a mixed style would be necessary?
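To make the "decide at initialization" idea concrete, here is a minimal sketch of a store that takes its policy as a constructor argument and never mixes the two styles. All names (`CollectionPolicy`, `ToyEventStore`, `registerCollection`, `beginEvent`, `put`, `has`) are hypothetical and chosen for illustration only; this is not podio's EventStore interface.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; none of this is podio's actual API.
enum class CollectionPolicy {
  RegisterUpFront, // collections must be declared before processing starts
  PerEvent         // collections may be added freely to each event
};

class ToyEventStore {
public:
  explicit ToyEventStore(CollectionPolicy policy) : m_policy(policy) {}

  // Initialization phase: declare a collection before any event is processed.
  void registerCollection(const std::string& name) {
    if (m_processing) {
      throw std::logic_error("registration is closed once processing has started");
    }
    m_registered.insert(name);
  }

  // Start a new event; collections of the previous event are dropped.
  void beginEvent() {
    m_processing = true;
    m_current.clear();
  }

  // Add a collection to the current event, subject to the chosen policy.
  void put(const std::string& name) {
    assert(m_processing && "call beginEvent() first");
    if (m_policy == CollectionPolicy::RegisterUpFront && m_registered.count(name) == 0) {
      throw std::invalid_argument("collection '" + name + "' was not registered");
    }
    m_current.insert(name);
  }

  bool has(const std::string& name) const { return m_current.count(name) > 0; }

private:
  CollectionPolicy m_policy;
  bool m_processing = false;
  std::set<std::string> m_registered;
  std::set<std::string> m_current;
};
```

A compile-time variant could pass the policy as a template parameter instead of a constructor argument. In both cases the store only ever has to implement one behaviour per instance.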

@tmadlener
Collaborator

This is a general ROOT "issue". It is documented for the FrameWriter:

/** Store the given frame with the given category. Store all available
 * collections from the Frame.
 *
 * NOTE: The contents of the first Frame that is written in this way
 * determines the contents that will be written for all subsequent Frames.
 */
void writeFrame(const podio::Frame& frame, const std::string& category);
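The rule in that NOTE can be modelled in a few lines. The following is a toy sketch only: `ToyFrame` and `ToyFrameWriter` are hypothetical stand-ins, not `podio::Frame` or the podio writer. The choices to silently ignore extra collections in later frames and to throw on missing ones are assumptions made for this example; the comment quoted above does not say how either case is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a frame: collection name -> collection data.
using ToyFrame = std::map<std::string, std::vector<double>>;

// Toy writer in which the first frame of a category fixes the layout.
class ToyFrameWriter {
public:
  void writeFrame(const ToyFrame& frame, const std::string& category) {
    assert(!category.empty());
    auto [it, first] = m_layout.try_emplace(category);
    if (first) {
      // First frame of this category: its collections define the layout.
      for (const auto& kv : frame) {
        it->second.push_back(kv.first);
      }
    }
    // Assumption of this sketch: a collection from the layout that is
    // missing in a later frame is treated as an error.
    for (const auto& name : it->second) {
      if (frame.find(name) == frame.end()) {
        throw std::runtime_error("frame lacks collection '" + name +
                                 "' required by category '" + category + "'");
      }
    }
    // Assumption of this sketch: collections that are not part of the
    // layout are ignored and never reach the output.
    ++m_written[category];
  }

  // Collection names fixed by the first frame of the category.
  const std::vector<std::string>& layout(const std::string& category) const {
    return m_layout.at(category);
  }

  std::size_t nWritten(const std::string& category) const {
    const auto it = m_written.find(category);
    return it == m_written.end() ? 0 : it->second;
  }

private:
  std::map<std::string, std::vector<std::string>> m_layout;
  std::map<std::string, std::size_t> m_written;
};
```

In this model a collection that first appears in a later frame never becomes part of the category's layout, which is the restriction that makes true "per-event collections" hard with a fixed set of branches.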
