Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python: add automatic dependency tracking for scripts #6102

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

elmarco
Copy link
Contributor

@elmarco elmarco commented Oct 24, 2019

I am looking for early feedback, over the overall approach and on on the syntax in particular.

This feature would be really neat for many projects I work with that have sometime complex python generators.

@elmarco elmarco marked this pull request as ready for review October 26, 2019 20:30
@jpakkane
Copy link
Member

jpakkane commented Nov 5, 2019

Could we instead make this work automatically? That is, when the user defines a build step that uses a Python script, do the dependency tracking transparently.

@elmarco
Copy link
Contributor Author

elmarco commented Nov 5, 2019

Hi thanks for the feedback,

How would you detect it is a python script? also some folks may use different ways to invoke the python script, as I don't see an official / recommended way.

I am not sure this wrapper will work for all existing scripts, so I would rather have a conservative approach here too.

@bonzini
Copy link
Contributor

bonzini commented Dec 10, 2019

The official/recommended way would be to use "#! /usr/bin/env python3", make the script not executable, and use find_program to go from the script name to a program object. (See also commit 0078d80, "find_program: use Meson's Python3 for non-executable Python scripts").

@elmarco
Copy link
Contributor Author

elmarco commented Dec 10, 2019

thanks @bonzini,

@jpakkane what should I do, make it default (and risk to break existing scripts)?

@elmarco
Copy link
Contributor Author

elmarco commented Dec 20, 2019

@jpakkane ping

@eli-schwartz
Copy link
Member

What would be an example use case for this? If it is just about generator() objects which use internal python scripts instead of self-built executables, wouldn't it make more sense to e.g. extend "depends" for generators to include source files and not just build targets? That would make the functionality work for generators written in any scripting language, not just python.

@elmarco
Copy link
Contributor Author

elmarco commented Jan 2, 2020

@eli-schwartz: the use case is the same as generator()/custom_target() but with automatic dependency tracking for python scripts. No need to maintain "depends" including all the source files, as this is often badly maintained (at least in qemu), because it is easy to miss and forget.

This is useful for the following patch, that wants to generate depfile
based on output file name for custom_target and generator.
@lgtm-com
Copy link

lgtm-com bot commented Jan 10, 2020

This pull request introduces 1 alert when merging af62713 into 71bbcc7 - view on LGTM.com

new alerts:

  • 1 for Use of the return value of a procedure

@elmarco
Copy link
Contributor Author

elmarco commented Jan 18, 2020

ping

@elmarco
Copy link
Contributor Author

elmarco commented Jan 25, 2020

@jpakkane hi Jussi, what can I do to make progress? thanks

@elmarco
Copy link
Contributor Author

elmarco commented Feb 3, 2020

@xclaesse @nirbheek what do you think? thanks

@xclaesse
Copy link
Member

xclaesse commented Feb 5, 2020

I'm not a big fan of having a new custom_target() method for this. One reason that come to mind is we also want this for configure_file() I guess... basically everywhere we can exec a command.

Just thinking out loud, but what about:

pymod = import('python')
python = pymod.find_installation(depfile: '@OUTPUT0@.d')
custom_target(..., command: [python, 'myscript.py'])

That means that find_installation() would return a special ExternalProgramHolder object that would have ['--internal', 'pydep', '--depfile', '@OUTPUT0@.d'] in its command, and it could be used anywhere with existing APIs. That can work, no?

@bonzini
Copy link
Contributor

bonzini commented Feb 5, 2020

Could that be used with generators for example since they have command and arguments in separate places? I think there's no need to add an extra argument if we can do it by default for anything that uses the magic #! /usr/bin/env python3 line.

@xclaesse
Copy link
Member

xclaesse commented Feb 5, 2020

Not sure it should be the default, I think it should be opt-in somehow by the user. My proposal above was to have a way for the user to explicitly use an executable that would generate a depfile, without requiring to modify every meson function that can run commands. It can work with generator too, I think it's just a matter of using the python executable returned by find_installation() as executable in the generator.

You can of course base it on the shebang, you would have to change the command line from _shebang_to_cmd().

@eli-schwartz
Copy link
Member

I still do not understand why a python-specific interpreter shim is needed, when it seems like the proper solution is to create a generic method for any generator to specify in-tree buildsystem files which it depends on. Even if your application is written in perl or ruby, and therefore your generator is too.

Because it seems this PR really is all about making sure generators get rerun when the in-tree python modules you use get modified to have changed behavior, and the generator needs to output different content. All this information is known at checkin time.

@xclaesse
Copy link
Member

xclaesse commented Feb 6, 2020

@eli-schwartz for the same reason we don't pass header files to executable(), the compiler knows the list of header files it included and can give us the list in a depfile. It's more convenient and less error prone than maintaining the list manually.

@elmarco
Copy link
Contributor Author

elmarco commented May 8, 2020

@jpakkane should I follow @xavierclaessens suggestion from #6102 (comment)?

@elmarco
Copy link
Contributor Author

elmarco commented Aug 18, 2020

ping @jpakkane, any preference or suggestion before I attempt to change this proposal? thanks

@nirbheek
Copy link
Member

This is a pretty cool feature. I think trying to make this a 'generic depfile API' is a mistake, though. I actually think we should do this automatically when custom_target(), generator(), or configure_file() generators call a python script. Either via command: [pymod.find_installation(), 'path/to/script.py']or command: [find_program('script.py')].

That way none of the existing code will need to be changed to take advantage of this, and we also won't need to bikeshed on how to expose this as an API, which is basically all we've been doing for a year now ;)

Currently, people have to manually add all the python files to depend_files:, such as here: https://gitlab.gnome.org/GNOME/glib/-/blob/4e485d76/gio/meson.build#L237. This would eliminate it completely.

@elmarco
Copy link
Contributor Author

elmarco commented Jan 26, 2021

Hi @nirbheek thanks for the suggestion. I agree it would be nice to do it automatically, but I am afraid this may have surprising result if it's enabled by default. At least, there should be a way to introduce this progressively. Is there a place where we can enable/disable such kind of feature? Something like proejct(.., autodeps: true/false(auto)) ?

@dcbaker
Copy link
Member

dcbaker commented Jan 26, 2021

I definitely have some cases where this would break my targets turned on by default, as they depend on mako templates that are stored in separate files, so the output looks like:

custom_target(
   'out.[ch]',
   command : [prog_python, 'foo.py'],
   depend_files : ['lib_foo.py', 'lib_bar.py', 'templates/foo.mako', 'templates/bar.mako'],
)

This wouldn't pick up those mako templates.

I agree that a universal API is probably the wrong solution at this point, if we suddenly have people implementing wrappers for perl and ruby we can always re-evaluate later.

I'm personally okay with this approach, since if we decide to have a universal API we can always make these methods effectively aliases of the base custom_target with some pre-populated values.

Also, this is very cool and I'd like it.

@dcbaker dcbaker added modules:python Issues specific to the python module enhancement labels Jan 26, 2021
@stefanha
Copy link
Contributor

Hi, I just wanted to express support for this feature being part of meson in some form. I have a Python code generator and had to manually add depend_files: with each of the source files that make up the code generator utility. It would be great if meson could determine the Python dependency files automatically...and probably eliminate missing dependency bugs in meson.build files out there :).

@xclaesse
Copy link
Member

xclaesse commented Feb 17, 2022

Been thinking about this again, and it's definitely a feature we want in my opinion. However:

  • With the current proposal we would have to duplicate every meson function that has a depfile kwarg into the python module.
  • Keep them all in sync when adding new kwargs to those functions.
  • Since it wraps the python script it has a performance impact and could cause unexpected side effects. This feature needs to be opt-in, I don't think we can just wrap all python to write a depfile automatically, can we?

A few ideas, in order of my least to most preferred solution:

  • Add a new kwarg, like python_depfile: true
  • Add a special value to current kwarg, like depfile: 'python-wrapper'
  • Add a special value to current kwarg, like depfile: pymod.depfile(), that new method in the python module would return a opac internal type that we can use as special value to existing depfile kwargs that would make the command being wrapped by our python depfile generator.
  • Add a property on python executor, like python = pymod.find_installation(depfile: true), when that executor use used like custom_target(..., command: [python, 'myscript.py']) then it will wrap the command automatically to generate the depfile. This has the potential advantage that a project that already sets the python executor in all their custom_target() can be converted with 1 line change when getting the executor. Other solutions requires the change every custom_target() calls of the project, even if that's a trivial change. As a transition period it's also easier for projects who wants to keep compat with older meson versions:
if meson.version().version_compare('>=0.62')
  python = pymod.find_installation(depfile: true)
else
  python = pymod.find_installation()
endif

# Now all calls can remain unchanged regardless of meson version
custom_target(..., command: [python, 'myscript.py'])

versus:

# Repeat the version check in every calls
if meson.version().version_compare('>=0.62')
  pymod.custom_target(..., command: [python, 'myscript.py'])
else
  custom_target(..., command: [python, 'myscript.py'])
endif

What do you think?

@elmarco
Copy link
Contributor Author

elmarco commented Feb 17, 2022

Nice to see this PR is not lost in the oblivion :)

Yes, I agree this must be opt-in (it is already somewhat with the current proposed version).

I also like the pymod.find_installation(depfile: true) solution the best.

That shouldn't be too hard to implement. Are you willing to give it a try? 😄

Some reply on your remarks:

With the current proposal we would have to duplicate every meson function that has a depfile kwarg into the python module.

Yeah, I added configure_file in the meantime :) commit 01569fe

Keep them all in sync when adding new kwargs to those functions.

The python functions are wrapping the base functions though, so that should reduce the burdden somewhat.

Since it wraps the python script it has a performance impact and could cause unexpected side effects. This feature needs to be opt-in, I don't think we can just wrap all python to write a depfile automatically, can we?

Performance impact should really be minor. And I am pretty sure it works with 99.9% of the scripts, since it is based on cpython trace.py code. But we never know, and I would rather stay safe than introducing an avoidable regression.

@jpakkane
Copy link
Member

Since it wraps the python script it has a performance impact and could cause unexpected side effects.

The obvious question to this is, are there measurements on how big the hit is?

@xclaesse
Copy link
Member

Currently meson --internal <script> imports most of meson modules which is fairly slow especially on Windows. But there are PRs to fast path that, not sure what's the current status.

@xclaesse
Copy link
Member

@elmarco One potential issue with your implementation: how does it deal with capture: true? In that case we already have to wrap the script with meson to write stdout into a file.

@bonzini
Copy link
Contributor

bonzini commented Feb 18, 2022

capture: true

I think it uses two level of meson --internal in that case? One is added by the methods in the python module, the other by the backend.

@nirbheek
Copy link
Member

I think it uses two level of meson --internal in that case? One is added by the methods in the python module, the other by the backend.

This needs to be handled to use one level of meson --internal I think. We don't want layering here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement modules:python Issues specific to the python module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants