Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure python parts, splitting code generation and python bindings into separate modules #511

Merged
merged 10 commits into from
Nov 8, 2023

Conversation

jmcarcell
Copy link
Member

@jmcarcell jmcarcell commented Nov 6, 2023

BEGINRELEASENOTES

  • Decouple generation tools and files from the rest of the podio python files by creating a new folder called podio_gen. This is a transparent change for users that only use the generator script.
    • This makes the configuration / generation times negligible again, because we don't load libraries unnecessarily any longer for the generation.
  • Simplify the python bindings (podio) __init__.py and remove the test_utils from it.
  • Move the tests for writing frames in python to the tests folder, where they belong and also test the SIO python writer.

ENDRELEASENOTES

Generation tools do not need to load or know anything about the rest of podio. I think the remaining question is if the tests have to be moved or not, some of them only test generation tools.

Copy link
Collaborator

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea. I see you already moved quite a few tests, but you mention more that would need to be moved? Which ones would that be?

@jmcarcell
Copy link
Member Author

I didn't move any tests - I mean moving tests from the current folder podio to the podio_gen folder; for example https://github.com/AIDASoft/podio/blob/master/python/podio/test_ClassDefinitionValidator.py only uses classes from podio_gen in this PR but it's in podio. It only matters from an organization point of view, and if any test is moved then there is some extra cmake code needed to also find those. On that I don't have an opinion to where this one goes, I don't think there are many more.

@tmadlener
Copy link
Collaborator

Ah, apologies. Misread the diff. I would move them, just to have them closer to the code they are testing.

@jmcarcell
Copy link
Member Author

Now they are moved and are in a separate test since it doesn't seem it's easy to get them working in the same since they are now different modules

#--- install templates ---------------------------------------------------------
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/templates
DESTINATION ${podio_PYTHON_INSTALLDIR})

IF (BUILD_TESTING)
add_test( NAME pyunittest COMMAND python3 -m unittest discover -s ${PROJECT_SOURCE_DIR}/python/podio)
add_test( NAME pyunittest COMMAND python3 -m unittest discover -s ${PROJECT_SOURCE_DIR}/python/podio -p "test_*.py")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_test( NAME pyunittest COMMAND python3 -m unittest discover -s ${PROJECT_SOURCE_DIR}/python/podio -p "test_*.py")
add_test( NAME pyunittest COMMAND python3 -m unittest discover -s ${PROJECT_SOURCE_DIR}/python -p "test_*.py")

I think otherwise we are missing the ones that are now in podio_gen, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, of course, that is what I tested before and it doesn't work. I'm checking...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I have no idea how to make it work with a single test. First, when trying to discover from the top directory python will call dir(podio) and will fail like in #512. Even finding a workaround some tests fail with errors related to some errors when importing like this

ImportError: Failed to import test module: podio.test_EventStore
Traceback (most recent call last):
  File "/usr/lib/python3.11/unittest/loader.py", line 407, in _find_test_path
    module = self._get_module_from_name(name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/unittest/loader.py", line 350, in _get_module_from_name
    __import__(name)
  File "/usr/lib/python3.11/site-packages/ROOT/_facade.py", line 154, in _importhook
    return _orig_ihook(name, *args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'podio.test_EventStore'; 'podio' is not a package

so I reverted back to the version that works except pre-commit...

@tmadlener
Copy link
Collaborator

pre-commit complains about a cyclic import, but it doesn't really make sense to me, as the script should now only import from podio_gen, right?

https://github.com/AIDASoft/podio/actions/runs/6776714508/job/18418800193?pr=511#step:4:427

@jmcarcell
Copy link
Member Author

I believe the diagnosis is correct; podio wants to import test_utils and test_utils wants to import podio. Actually the try - except block only seems to work during test time. I'm not sure why it appears now though

@tmadlener
Copy link
Collaborator

Ah then we can maybe switch to a

from .frame import Frame

in test_utils.py?

@jmcarcell
Copy link
Member Author

That breaks some tests due to relative imports. Another proposal: move test_utils.py to podio/tests so that the tests that use it can use it directly. Then the import can be removed from __init__.py (and it makes sense that you don't always import something that you only need for tests, it just happens that the import doesn't work now at runtime). It's a bit less organized but anyway there are some python tests already there

@tmadlener
Copy link
Collaborator

I tried that, but that breaks with the weird c++ error in the comment (or at least it did in the past). I agree, this should really live in the tests directory though. Maybe you can give it another try and see if things just magically work now?

@tmadlener
Copy link
Collaborator

tmadlener commented Nov 8, 2023

I think I found the issue. If I do

from podio.test_utils import Frame
print(Frame)

I get <class 'podio.frame.Frame'>. However, if I do

from podio import Frame
print(Frame)

I get <class cppyy.gbl.podio.Frame at 0x5569c6f3fc60>. So it seems like outside of the test_utils the importing of Frame does not pick up the wrapper, but rather the c++ class instead.

I am not yet sure why things are working inside test_utils as they are also simply doing a from podio import Frame. My best guess at the moment is that this is still happening before the sys.modules["podio"] = podio.

@tmadlener
Copy link
Collaborator

I have removed the sys.modules["podio"] = podio bit from the non-generator part __init__.py to make sure that we always get our wrapper classes in python rather than picking up c++ classes with the same name via the ROOT python bindings accidentally. This also makes it possible to move the test_utils more or less completely to the tests folder into write_frame.py and simply calling that for the different I/O backends.

I think in this case not having all of the c++ bits of podio directly available in python is OK, since for the main use cases we provide wrappers in any case and all the others are effectively mainly there to support the implementation of those, and I don't think we gain too much by exposing them to people in python.

Since this has now become a bit more than just "decoupling the generation files from the rest of the podio python files" I would actually also like to pick up #498 in this PR and make it another general overhaul of the python structure of podio. Mainly because #498 needs to be touched again in any case and we might as well resolve the merge conflict in this PR, rather than later.

@tmadlener tmadlener changed the title Decouple generation tools and files from the rest of the podio python files Restructure python parts, splitting code generation and python bindings into separate modules Nov 8, 2023
@@ -2,59 +2,5 @@
"""Utilities for python unittests"""

import os
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this file only has like 3 lines after all the removals, I'm not sure if it's needed...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can clean this up in the course of #485

@jmcarcell
Copy link
Member Author

Looks good but now if you want to have a fully working podio in python you can't unless you do from ROOT import podio. I see some possible use cases like putting UserDataCollections in Frames or GenericParameters but probably there are few people using them.

@tmadlener
Copy link
Collaborator

I think GenericParameters should be seen as an internal implementation detail of how the Frame supports parameters.

UserDataCollections on the other hand are a good point. Maybe we can try to address them in another PR since Andre is waiting on this for the LCG stacks.

@tmadlener tmadlener merged commit 0839921 into AIDASoft:master Nov 8, 2023
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants