Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW: Better package splitting for multi-output recipes with negative glob in outputs/files #5216

Merged
merged 44 commits into from
Jun 7, 2024
Merged
Show file tree
Hide file tree
Changes from 40 commits
Commits
Show all changes
44 commits
Select commit Hold shift + click to select a range
a545986
conda_build.build: simple implementation of files exclusion
duncanmmacleod Feb 19, 2021
14e3548
tests: add test for glob exclusion in files
duncanmmacleod Feb 19, 2021
7fbc142
add docs for negative file matches
duncanmmacleod Feb 19, 2021
67904ce
news: add negative file match news entry
duncanmmacleod Feb 19, 2021
e56c573
conda_build.build: support separate include/exclude file lists
duncanmmacleod Nov 9, 2021
be19766
Merge branch 'main' into files-exclude
carterbox Mar 5, 2024
b667724
REF: Only consider new files to the PREFIX when file picking
carterbox Mar 6, 2024
063ff89
Update docs/source/resources/define-metadata.rst
carterbox Mar 6, 2024
1ab7638
Merge branch 'main' into files-exclude
carterbox Mar 8, 2024
de7715d
Update define-metadata.rst
carterbox Mar 8, 2024
f56e201
Update files-exclude.rst
carterbox Mar 8, 2024
0537c19
Merge branch 'main' into files-exclude
carterbox Mar 20, 2024
30a5e1b
STY: Add missing __future__ import
carterbox Mar 20, 2024
a92964f
STY: Address RUFF lints
carterbox Mar 20, 2024
a2e3963
STY: Remove extra whitespace in news
carterbox Mar 20, 2024
bfac4e0
Merge branch 'main' into files-exclude
jaimergp Mar 21, 2024
3c0b090
BUG: Use default value to make API backwards-compatible
carterbox Mar 21, 2024
fd5e956
Update news/files-exclude.rst
carterbox Mar 21, 2024
4c04d49
REF: Prevent errors when outputs/files is None
carterbox Mar 21, 2024
e16c470
Merge remote-tracking branch 'origin/files-exclude' into files-exclude
carterbox Mar 21, 2024
e598d6a
TST: Robust testing of new filematching
carterbox Mar 21, 2024
8dd906d
TST: Use defaults packages instead of conda-forge
carterbox Mar 21, 2024
642906b
DOC: Use admonition to warn users about explicit file lists
carterbox Mar 28, 2024
d1b12a1
DOC: Update explicit file lists docs
carterbox Mar 28, 2024
9dbd17c
DOC: Fix RST syntax
carterbox Apr 2, 2024
c3a26dd
DOC: Fix RST syntax
carterbox Apr 2, 2024
84aad48
DOC: Fix RST syntax
carterbox Apr 2, 2024
249a065
DOC: Fix RST syntax
carterbox Apr 2, 2024
55248d4
Merge branch 'main' into files-exclude
carterbox Apr 3, 2024
e42b6f0
TST: Prevent missing glob match error in test
carterbox Apr 3, 2024
a62e4b9
DOC: Fix RST syntax
carterbox Apr 10, 2024
e28565f
Update news to new template style
carterbox Apr 10, 2024
ee89c04
DOC: Fixup syntax highlighting in adjacent section
carterbox Apr 10, 2024
b0645c2
Merge branch 'main' into files-exclude
carterbox Apr 15, 2024
822b477
TST: Mark subpackage tests as serial tests to avoid errors on osx
carterbox Apr 23, 2024
475c0ec
Merge branch 'main' into files-exclude
carterbox Apr 23, 2024
4f988f9
TST: Avoid package collisions in subpackage tests
carterbox Apr 25, 2024
b33f6f6
TST: Avoid more package collisions in tests
carterbox Apr 25, 2024
256017a
Fix tests
isuruf May 31, 2024
e5fdcbf
Merge branch 'main' into files-exclude
isuruf May 31, 2024
606ee08
Update docs/source/resources/define-metadata.rst
isuruf May 31, 2024
6d33054
Remove serial
isuruf May 31, 2024
5d8efe9
Update docs/source/resources/define-metadata.rst
carterbox Jun 5, 2024
c92f1a1
REF: Use walrus operator to reduce conditions
carterbox Jun 5, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 43 additions & 8 deletions conda_build/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -1692,7 +1692,14 @@ def post_process_files(m: MetaData, initial_prefix_files):
return new_files


def bundle_conda(output, metadata: MetaData, env, stats, **kw):
def bundle_conda(
output,
metadata: MetaData,
env,
stats,
new_prefix_files: set[str] = set(),
**kw,
):
log = utils.get_logger(__name__)
log.info("Packaging %s", metadata.dist())
get_all_replacements(metadata.config)
Expand Down Expand Up @@ -1802,10 +1809,26 @@ def bundle_conda(output, metadata: MetaData, env, stats, **kw):
if files:
# Files is specified by the output
# we exclude the list of files that we want to keep, so post-process picks them up as "new"
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(files, metadata.config.host_prefix)
}
if isinstance(files, dict):
# When file matching with include/exclude lists, only
# new_prefix_files are considered. Files in the PREFIX from other
# recipes (dependencies) are ignored
include = files.get("include") or []
exclude = files.get("exclude") or []
exclude_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(exclude, metadata.config.host_prefix)
}
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(include, metadata.config.host_prefix)
}
keep_files = new_prefix_files.intersection(keep_files) - exclude_files
jaimergp marked this conversation as resolved.
Show resolved Hide resolved
else:
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(files, metadata.config.host_prefix)
}
Comment on lines +1827 to +1831
Copy link
Contributor

@kenodegard kenodegard Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the change in behavior above with only allowing new files to be included should the old behavior of including any file in the prefix be deprecated?

Suggested change
else:
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(files, metadata.config.host_prefix)
}
else:
keep_files = {
os.path.normpath(pth)
for pth in utils.expand_globs(files, metadata.config.host_prefix)
}
if keep_files - new_prefix_files:
deprecated.topic(...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still useful for eg when building gcc_bootstrap packages. Deprecating the implicit behaviour and adding a new flag would be useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that the current behavior of including all files from the prefix for multi-output and only installed files from the prefix for single-output recipes is inconsistent and thus non-intuitive. rattler-build is adding a section called "always_include_files" for recipes that want to repackage artifacts from their host dependencies.

I'm favor of deprecating the old behavior, but I'm uncertain about any removal or replacement schedule.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rattler-build is adding a section called "always_include_files" for recipes that want to repackage artifacts from their host dependencies.

This exists in conda already, and is in use in conda-forge. Main use-case for that from my POV is to be explicit about overwriting some CMake metadata if an incremental build adds more targets to an already-existing CMake file (example).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can deprecate that behaviour in a separate PR too. This PR has been open for too long and I wouldn't want to test your patience further @carterbox <3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 LGTM! Just wanted to understand the thoughts/plans for the old behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll open an issue to capture this feedback, though I think with the documentation this PR added we are in a better place already.

pfx_files = set(utils.prefix_files(metadata.config.host_prefix))
initial_files = {
item
Expand Down Expand Up @@ -1980,7 +2003,13 @@ def bundle_conda(output, metadata: MetaData, env, stats, **kw):
return final_outputs


def bundle_wheel(output, metadata: MetaData, env, stats):
def bundle_wheel(
output,
metadata: MetaData,
env,
stats,
new_prefix_files: set[str] = set(),
):
ext = ".bat" if utils.on_win else ".sh"
with TemporaryDirectory() as tmpdir, utils.tmp_chdir(metadata.config.work_dir):
dest_file = os.path.join(metadata.config.work_dir, "wheel_output" + ext)
Expand Down Expand Up @@ -2706,7 +2735,11 @@ def build(
# This is wrong, files has not been expanded at this time and could contain
# wildcards. Also well, I just do not understand this, because when this
# does contain wildcards, the files in to_remove will slip back in.
if "files" in output_d:
if (
"files" in output_d
and output_d["files"] is not None
and not isinstance(output_d["files"], dict)
):
output_d["files"] = set(output_d["files"]) - to_remove
carterbox marked this conversation as resolved.
Show resolved Hide resolved

# copies the backed-up new prefix files into the newly created host env
Expand All @@ -2722,7 +2755,9 @@ def build(
with utils.path_prepended(m.config.build_prefix):
env = environ.get_dict(m=m)
pkg_type = "conda" if not hasattr(m, "type") else m.type
newly_built_packages = bundlers[pkg_type](output_d, m, env, stats)
newly_built_packages = bundlers[pkg_type](
output_d, m, env, stats, new_prefix_files
)
# warn about overlapping files.
if "checksums" in output_d:
for file, csum in output_d["checksums"].items():
Expand Down
31 changes: 28 additions & 3 deletions docs/source/resources/define-metadata.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1313,7 +1313,10 @@ build prefix. Explicit file lists support glob expressions.
Directory names are also supported, and they recursively include
contents.

.. code-block:: none
.. warning::
When defining `outputs/files` as a list, any file in the prefix (including those installed by host dependencies) matching one of the glob expressions is included in the output.
isuruf marked this conversation as resolved.
Show resolved Hide resolved

.. code-block:: yaml

outputs:
- name: subpackage-name
Expand All @@ -1323,6 +1326,29 @@ contents.
- *.some-extension
- somefolder/*.some-extension

Greater control over file matching may be
achieved by defining ``files`` as a dictionary separating files to
``include`` from those to ``exclude``.
When using include/exclude, only files installed by
the current recipe are considered. i.e. files in the prefix installed
by host dependencies are excluded. include/exclude may not be used
carterbox marked this conversation as resolved.
Show resolved Hide resolved
simultaneously with glob expressions listed directly in ``outputs/files``.
Files matching both include and exclude expressions will be excluded.

.. code-block:: yaml

outputs:
- name: subpackage-name
files:
include:
- a-file
- a-folder
- *.some-extension
- somefolder/*.some-extension
exclude:
- *.exclude-extension
- a-folder/**/*.some-extension

Scripts that create or move files into the build prefix can be
any kind of script. Known script types need only specify the
script name. Currently the list of recognized extensions is
Expand Down Expand Up @@ -1372,10 +1398,9 @@ A subpackage does not automatically inherit any dependencies from its top-level
recipe, so any build or run requirements needed by the subpackage must be
explicitly specified.

.. code-block:: none
.. code-block:: yaml

outputs:

- name: subpackage-name
requirements:
build:
Expand Down
19 changes: 19 additions & 0 deletions news/5216-files-exclude
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
### Enhancements

* Add new include/exclude sections for glob expressions in multi-output `outputs/files`. (#4196 via #5216)

### Bug fixes

* <news item>

### Deprecations

* <news item>

### Docs

* <news item>

### Other

* <news item>
12 changes: 11 additions & 1 deletion tests/test-recipes/split-packages/copying_files/bld.bat
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,14 @@ echo "weee" > %PREFIX%\subpackage_file1
mkdir %PREFIX%\somedir
echo "weee" > %PREFIX%\somedir\subpackage_file1
echo "weee" > %PREFIX%\subpackage_file1.ext
echo "weee" > %PREFIX%\subpackage_file2.ext
echo "weee" > %PREFIX%\subpackage_file2.ext
echo "weee" > %PREFIX%\subpackage_file3.ext

echo "weee" > %PREFIX%\subpackage_include_exclude1
mkdir %PREFIX%\anotherdir
echo "weee" > %PREFIX%\anotherdir\subpackage_include_exclude1
echo "weee" > %PREFIX%\subpackage_include_exclude1.wav
echo "weee" > %PREFIX%\subpackage_include_exclude2.wav
echo "weee" > %PREFIX%\subpackage_include_exclude3.wav
mkdir %PREFIX%\Library\bin
echo "weee" > %PREFIX%\Library\bin\dav1d.fake
15 changes: 15 additions & 0 deletions tests/test-recipes/split-packages/copying_files/build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,18 @@ echo "weee" > $PREFIX/somedir/subpackage_file1
# test glob patterns
echo "weee" > $PREFIX/subpackage_file1.ext
echo "weee" > $PREFIX/subpackage_file2.ext
echo "weee" > $PREFIX/subpackage_file3.ext

# The files used to test the two subpackages must be disjoint because they are
# coinstalled
# test copying filename
echo "weee" > $PREFIX/subpackage_include_exclude1
# test copying by folder name
mkdir $PREFIX/anotherdir
echo "weee" > $PREFIX/anotherdir/subpackage_include_exclude1
# test glob patterns
echo "weee" > $PREFIX/subpackage_include_exclude1.wav
echo "weee" > $PREFIX/subpackage_include_exclude2.wav
echo "weee" > $PREFIX/subpackage_include_exclude3.wav
mkdir $PREFIX/lib
echo "weee" > $PREFIX/lib/libdav1d.fake
34 changes: 32 additions & 2 deletions tests/test-recipes/split-packages/copying_files/meta.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,44 @@ package:

requirements:
run:
- my_script_subpackage
- my_script_subpackage_files
- my_script_subpackage_include_exclude

outputs:
- name: my_script_subpackage
- name: my_script_subpackage_files
build:
ignore_run_exports_from:
- libpng
requirements:
host:
- libpng=1.6.39
files:
- subpackage_file1
- somedir
- "*.ext"
# Libs should match because they are in the prefix
- "lib/libpng*" # [unix]
- "Library/bin/libpng*" # [win]
test:
script: subpackage_test.py
script_interpreter: python
- name: my_script_subpackage_include_exclude
build:
ignore_run_exports_from:
- dav1d
requirements:
host:
- dav1d=1.2.1
files:
include:
- subpackage_include_exclude1
- anotherdir
- "*.wav"
# Libs should not match because they come from a different package
- "lib/libdav1d*" # [unix]
- "Library/bin/dav1d*" # [win]
exclude:
- "*3.wav"
test:
script: subpackage_test.py
script_interpreter: python
Original file line number Diff line number Diff line change
@@ -1,33 +1,76 @@
import os
import sys

print(os.getenv('PREFIX'))
filename = os.path.join(os.environ['PREFIX'], 'subpackage_file1')
assert os.path.isfile(filename)
if os.getenv("PKG_NAME") == "my_script_subpackage_files":
file_basename = "subpackage_file"
dirname = "somedir"
extension = "ext"

if "darwin" in sys.platform:
external_host_file = "lib/libpng16.dylib"
elif "win32" in sys.platform:
external_host_file = "Library/bin/libpng16.dll"
else:
external_host_file = "lib/libpng16.so"

filename = os.path.join(os.environ["PREFIX"], f"{file_basename}3.{extension}")
print(filename)
assert os.path.isfile(filename), filename + " is missing"
print("glob files OK")

filename = os.path.join(os.environ["PREFIX"], external_host_file)
print(filename)
assert os.path.isfile(filename), filename + " is missing"
print("glob files prefix OK")

if os.getenv("PKG_NAME") == "my_script_subpackage_include_exclude":
file_basename = "subpackage_include_exclude"
dirname = "anotherdir"
extension = "wav"

if "darwin" in sys.platform:
external_host_file = "lib/libdav1d.6.dylib"
elif "win32" in sys.platform:
external_host_file = "Library/bin/dav1d.dll"
else:
external_host_file = "lib/libdav1d.so.6"

filename = os.path.join(os.environ["PREFIX"], f"{file_basename}3.{extension}")
assert not os.path.isfile(filename), filename + " is missing"
print("glob exclude OK")

filename = os.path.join(os.environ["PREFIX"], external_host_file)
assert not os.path.isfile(filename), filename + " is missing"
print("glob exclude prefix OK")

print(os.getenv("PREFIX"))
filename = os.path.join(os.environ["PREFIX"], f"{file_basename}1")
assert os.path.isfile(filename), filename + " is missing"
contents = open(filename).read().rstrip()
if hasattr(contents, 'decode'):
if hasattr(contents, "decode"):
contents = contents.decode()
assert "weee" in contents, 'incorrect file contents: %s' % contents
assert "weee" in contents, "incorrect file contents: %s" % contents
print("plain file OK")

filename = os.path.join(os.environ['PREFIX'], 'somedir', 'subpackage_file1')
filename = os.path.join(os.environ["PREFIX"], dirname, f"{file_basename}1")
assert os.path.isfile(filename), filename + " is missing"
contents = open(filename).read().rstrip()
if hasattr(contents, 'decode'):
if hasattr(contents, "decode"):
contents = contents.decode()
assert "weee" in contents, 'incorrect file contents: %s' % contents
assert "weee" in contents, "incorrect file contents: %s" % contents
print("subfolder file OK")

filename = os.path.join(os.environ['PREFIX'], 'subpackage_file1.ext')
assert os.path.isfile(filename)
filename = os.path.join(os.environ["PREFIX"], f"{file_basename}1.{extension}")
assert os.path.isfile(filename), filename + " is missing"
contents = open(filename).read().rstrip()
if hasattr(contents, 'decode'):
if hasattr(contents, "decode"):
contents = contents.decode()
assert "weee" in contents, 'incorrect file contents: %s' % contents
assert "weee" in contents, "incorrect file contents: %s" % contents

filename = os.path.join(os.environ['PREFIX'], 'subpackage_file2.ext')
assert os.path.isfile(filename)
filename = os.path.join(os.environ["PREFIX"], f"{file_basename}2.{extension}")
assert os.path.isfile(filename), filename + " is missing"
contents = open(filename).read().rstrip()
if hasattr(contents, 'decode'):
if hasattr(contents, "decode"):
contents = contents.decode()
assert "weee" in contents, 'incorrect file contents: %s' % contents
assert "weee" in contents, "incorrect file contents: %s" % contents
print("glob OK")
1 change: 1 addition & 0 deletions tests/test_subpackages.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@


@pytest.mark.slow
@pytest.mark.serial
isuruf marked this conversation as resolved.
Show resolved Hide resolved
@pytest.mark.parametrize(
"recipe",
[
Expand Down
Loading