-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test and fix for persisting a collection without its associated collections #235
Test and fix for persisting a collection without its associated collections #235
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the suspicion that there was a problem with this.
Thanks for confirming that there indeed is one and for fixing it.
tests/read_and_write_associated.cpp
Outdated
// // auto cluster = clusters[0]; | ||
// // for (auto j = cluster.Hits_begin(), end = cluster.Hits_end(); j!=end; ++i){ | ||
// // std::cout << " Referenced hit has an energy of " << j->energy() << std::endl; | ||
// // } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How far should this work. I am not entirely sure. Should we still be able to loop over all the hits, as long as we don't try to access any of them? something like
for (const auto hit : cluster.Hits()) {
if (hit.isAvailable()) throw std::runtime_error("Hit is available, although it has not been written");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the example with your point here.
One can access the clusters, and loop over them in the collection. But then you are also able to loop over the hits in each cluster, because some hits were added. The problem would come when checking if the hit isAvailable()
, this segfaults as the hit was never read.
This happens even if I do this:
auto collGetResult = collectionProvider->get(id.collectionID, coll);
if (collGetResult == false) {
m_rel_Hits->emplace_back(nullptr);
continue;
}
ExampleHitCollection* tmp_coll = static_cast<ExampleHitCollection*>(coll);
That looks like a more profound change to me.
What happens now if there are two Hit Collections and only one of them is written, but a cluster in principle has links to hits in both collections? |
Also, I wanted to follow the template by doing:
So this would be coherent with the check But |
Yeah, that is a left over from #197 in some sense, the |
Relations are resolved one by one, so that in that case the cluster will still have all hits that have been present before writing. However, only those that have actually been written will also be available. I.e. in principle you would have to query each hit, whether it is available or not, before trying to access it. |
yap. That's correct. No way around explicit checking whenever you want to access things referenced to. |
I updated the example so it fails when accessing the hits, as expected. See: #235 (comment) So this "fix" would leave it up to the user to not persist some referenced collections, and then the user needs to take care of not accessing the collections that were never persisted. |
@@ -53,7 +53,7 @@ const std::array<{{ member.full_type }}, arraysize> {{ class.bare_type }}Collect | |||
|
|||
{% macro get_collection(type) %} | |||
podio::CollectionBase* coll = nullptr; | |||
collectionProvider->get(id.collectionID, coll); | |||
if (!collectionProvider->get(id.collectionID, coll)) { continue; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this does not succeed, then you will still have to do an emplace an empty object into the m_rel_{{ relation.name }}
vector, because otherwise the indexing that is used under the hood breaks later. This is the segfault that we currently get. Basically an out of bounds access of a vector.
An attempt at an explanation. The basic layout of things that are involved look like this:
struct ClusterObj { // simplified to the bare minimum
// ...
vector<Hit>* m_rel_hits;
};
struct ClusterCollectionData {
// ...
vector<Hit> m_rel_hits; // this is where the related hits of ALL clusters live
};
struct ClusterData {
// ...
unsigned Hits_begin; // these are indices in
unsigned Hits_end; // the vector in the CollectionData!
};
The important part is that all Cluster
s from a collection point to the same vector<Hit>
that lives in the ClusterCollectionData
, and each individual cluster only uses the begin
and end
indices from its ClusterData
to identify the hits that belong to it. So it is important that we always create a Hit
and put that into the m_rel_hits
vector, even if that Hit
is entirely useless (but at least we can call isAvailable
on it).
I implemented the changes by not using the I reverted It will now |
The test now also creates a subset collection and adds hits that are not written to it. But, the behaviour of the subset and non-subset collections differs:
I would expect the behaviour to be the same in both cases. |
I would naively expect that everything should work the same, i.e. if there is a subset collection, then it should have the same size when writing and when reading. Also because it is possible that a subset collection has elements that live in different (original) collections and some might be written while others might not. Additionally, if you only want to get a number of elements (e.g. the number of muons in an event), I think it makes sense that the size stays unchanged, even if the underlying data cannot be accessed. |
I created an issue to continue and track the inconsistent use of collections and subset collections here: #244 So this PR can be merged. |
e81785c
to
e7941f5
Compare
Fixed the usage of Additionally, subset collections now behave the same as normal relations: I.e. if the referred to objects cannot be found they are simply not available, but the subset collection will still have the same size as before. There was another problem that I had to fix, which seemed to have slipped through our tests until now: the |
@hegner also this would be nice to have before the newer developments start. |
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
…tions; get_collections reverted to original form
Includes a fix to the setReferences behavior, which previously was only available for types with at least one relation (this slipped through until now for some reason).
e7941f5
to
bb27ad3
Compare
Thanks @fdplacido and @tmadlener ! |
BEGINRELEASENOTES
isAvailable
function to guard against such crashes if need be.setReferences
which was mistakenly a no-op for collections of a type without relations. Since this is the mechanism we use for restoring subset collections it obviously has to be present for all types.ENDRELEASENOTES
Original problem:
I created a test that writes and reads some cluster collection, which will fails with a segfault.
Only the clusters are registered for write: the hits are not written. If hits are registered for write the test works without issue.
I pushed a fix which checks the result of
collectionProvider->get()
, and if this fails will just continue to the next referenced collection.As an example, this would generated something like this: