Rebuilding for longer path length (conda-build 2.0.0) #171

Closed
jakirkham opened this issue Jun 7, 2016 · 31 comments
Comments

@jakirkham
Member

jakirkham commented Jun 7, 2016

The build prefix is going to get longer in conda-build 2.0.0. ( conda/conda-build#877 ) There has been some discussion about what this will affect and what needs to get rebuilt. I have moved this from a different thread so the discussion can see the light of day. 😄 An excerpt of it is below. The gist is that we need to rebuild a few packages: some we already know about, and some we may not have found yet. The known ones include curl, fftw, pkg-config, and tk. If swig ever gets added, we have to watch out for that too. I suspect libtool and git will also be affected.


@msarahan commented on Sun Jun 05 2016

@stuarteberg - 2.0.0beta tagged: https://github.com/conda/conda-build/releases/tag/2.0.0beta


@stuarteberg commented on Sun Jun 05 2016

OHHH yeahhhh....

.... wait, there's no emoji for the kool-aid man? W. T. F.

I'll try to test this out this week. Thanks for the heads-up.


@stuarteberg commented on Mon Jun 06 2016

2.0.0beta tagged

I'll try to test this out this week.

Whoa now. I clearly wasn't paying enough attention to conda/conda-build#877. :-)

Sounds like the right decision was made. But it will take time to test -- I need to rebuild my entire stack and it will be a few days before I can do that. When do you plan on turning the 'beta' release into a real release?


@msarahan commented on Mon Jun 06 2016

Sometime before mid-June. It can stew for a week or two.



@msarahan commented on Mon Jun 06 2016

Please keep me posted on your findings. I'm especially interested in how compatible new packages are with old ones. I think they should be interchangeable, but only the new ones will work on systems with long prefixes.


@stuarteberg commented on Mon Jun 06 2016

My very first attempt at building a package failed immediately (on OS X). The _build... prefix was really long, and apparently I have a dependency that contains binary files which include the prefix.

It looks like many of my recipes include detect_binary_files_with_prefix: true, which causes many (all?) of the dylibs to be listed in the package's info/has_prefix metadata. Was it a mistake to use that detect_binary_files_with_prefix setting in the first place?


@msarahan commented on Mon Jun 06 2016

I don't think it was a mistake to use that, but it does imply the long prefix. What were the error messages? Are you able to build something with no dependencies?


@stuarteberg commented on Mon Jun 06 2016

What were the error messages?

I'm building a package that depends on fftw. My fftw package is from my own channel (ilastik - for reference, recipe here: https://github.com/ilastik/ilastik-build-conda/blob/master/fftw/meta.yaml)

So while it was creating the _build environment, the linking step failed when it tried to "link" fftw:

ERROR: placeholder '/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho' too short in: ilastik::fftw-3.3.4-1

Are you able to build something with no dependencies?

Yes. For example, this recipe builds:
https://github.com/ilastik/ilastik-build-conda/blob/master/lz4/meta.yaml

I'll do some more digging (and thinking) about this later this week. If I have to rebuild my whole stack, that's no big deal. But it would be nice if we can come up with clear guidance for people who run into the same issue I'm seeing.


@msarahan commented on Mon Jun 06 2016

Agreed. Thanks for being my guinea pig.


@jakirkham commented on Mon Jun 06 2016

Well, the placeholder size changed in 2.0.0beta. Not sure if you caught that or not, @stuarteberg, but that could be causing you some pain.


@msarahan commented on Tue Jun 07 2016

@stuarteberg before you get too far, I'm going to tag a 1.21.0 release that has everything but the prefix length change. We are working on a new Anaconda release, and this change is simply too disruptive so close to a release.

Is it true to say that new builds can not use old builds, but old builds can use new builds?


@stuarteberg commented on Tue Jun 07 2016

I'm going to tag a 1.21.0 release

Sounds like a good idea.

Is it true to say that new builds can not use old builds, but old builds can use new builds?

I think that's right, if I understand the problem correctly.
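
To illustrate why a package built with a short placeholder cannot be linked into a longer prefix, here is a minimal sketch of binary prefix replacement. It assumes the usual approach of overwriting the placeholder in place and padding with null bytes so the file size and offsets stay the same; the function name and the sample paths are made up for illustration, and this is not conda's actual code.

```python
import re


def replace_binary_prefix(data: bytes, placeholder: bytes, new_prefix: bytes) -> bytes:
    """Swap placeholder for new_prefix without changing the length of data."""
    if len(new_prefix) > len(placeholder):
        # There is no room to write a longer path into the binary.
        raise ValueError("placeholder too short for prefix %r" % new_prefix)

    pad = len(placeholder) - len(new_prefix)
    # Match the placeholder plus the rest of its null-terminated C string.
    pattern = re.compile(re.escape(placeholder) + rb"([^\x00]*)\x00")

    def _repl(match):
        # Keep the path suffix, then pad with nulls up to the original length.
        return new_prefix + match.group(1) + b"\0" * (pad + 1)

    return pattern.sub(_repl, data)


# Hypothetical sample data: a fake binary blob with an embedded path.
old = b"/opt/short_placeholder"
blob = b"\x7fELF" + old + b"/lib/libfoo.so\0tail"
patched = replace_binary_prefix(blob, old, b"/home/me/env")
```

A prefix shorter than the placeholder always fits, which is why packages built with the long placeholder should still install into short prefixes, while the reverse raises an error much like the "too short" message above.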


@stuarteberg commented on Tue Jun 07 2016

Continuing to investigate the example from above (my problematic fftw package), here's what I see.

(Reminder: this package was built with detect_binary_files_with_prefix: true)

$ cat /miniconda/pkgs/fftw-3.3.4-1/info/has_prefix
/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho binary lib/libfftw3.3.dylib
/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho binary lib/libfftw3.dylib
/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho binary lib/libfftw3f.3.dylib
/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho binary lib/libfftw3f.dylib
/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho binary lib/libfftw3l.3.dylib
/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho binary lib/libfftw3l.dylib
/opt/anaconda1anaconda2anaconda3 text lib/libfftw3.la
/opt/anaconda1anaconda2anaconda3 text lib/libfftw3_threads.la
/opt/anaconda1anaconda2anaconda3 text lib/libfftw3f.la
/opt/anaconda1anaconda2anaconda3 text lib/libfftw3f_threads.la
/opt/anaconda1anaconda2anaconda3 text lib/libfftw3l.la
/opt/anaconda1anaconda2anaconda3 text lib/libfftw3l_threads.la
/opt/anaconda1anaconda2anaconda3 text lib/pkgconfig/fftw3.pc
/opt/anaconda1anaconda2anaconda3 text lib/pkgconfig/fftw3f.pc
/opt/anaconda1anaconda2anaconda3 text lib/pkgconfig/fftw3l.pc

OK, so for some reason all of the .dylib files have the prefix embedded in them. But, wait, they use relative RPATHs and whatnot. Why do they contain the prefix?

Here's an ugly command that identifies the files containing _build and prints out the guilty strings:

$ for f in $(find /miniconda/pkgs/fftw-3.3.4-1/lib/ -type f | xargs grep -l _build); do echo "$f:"; strings $f | grep _build; done
/miniconda/pkgs/fftw-3.3.4-1/lib//libfftw3.3.dylib:
gcc -arch x86_64 -I/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho/include
/miniconda/pkgs/fftw-3.3.4-1/lib//libfftw3.a:
gcc -arch x86_64 -I/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho/include
/miniconda/pkgs/fftw-3.3.4-1/lib//libfftw3f.3.dylib:
gcc -arch x86_64 -I/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho/include
/miniconda/pkgs/fftw-3.3.4-1/lib//libfftw3f.a:
gcc -arch x86_64 -I/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho/include
/miniconda/pkgs/fftw-3.3.4-1/lib//libfftw3l.3.dylib:
gcc -arch x86_64 -I/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho/include
/miniconda/pkgs/fftw-3.3.4-1/lib//libfftw3l.a:
gcc -arch x86_64 -I/miniconda/envs/_build_placehold_placehold_placehold_placehold_placehold_placeho/include

OK, so apparently the gcc command is stored within the binaries for some reason? BTW, I checked on Linux, and it's the same. Not sure why this is the case. Do you know?
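
For anyone who prefers not to chain find, grep, and strings, a rough Python equivalent of the search step could look like the sketch below. The helper name is hypothetical and the demo files are invented; it only reports which files contain the byte string, not the matching strings themselves.

```python
import os
import tempfile


def files_containing(root, needle):
    """Yield paths of regular files under root whose bytes contain needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                if needle in f.read():
                    yield path


# Demo on a throwaway directory with made-up file contents.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "libdemo.a"), "wb") as f:
        f.write(b"\0gcc -I/envs/_build_placehold/include\0")
    with open(os.path.join(root, "README"), "wb") as f:
        f.write(b"nothing to see here")
    hits = [os.path.basename(p) for p in files_containing(root, b"_build")]
```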


@msarahan commented on Tue Jun 07 2016

I don't know, but I guess I get to learn.


@msarahan commented on Tue Jun 07 2016

1.21.0beta tagged: https://github.com/conda/conda-build/releases


@stuarteberg commented on Tue Jun 07 2016

I don't know, but I guess I get to learn.

It might not be worth learning: Looking through my packages, there are several dylib files that DO include the _build_placeholder... prefix, but for different reasons (i.e. not the gcc commands as shown above). I don't know if those uses of it are important (I suspect most aren't), but it's probably not worth investigating each one.

Even so, I'm attempting to get some explanation for the embedded build commands, just for curiosity's sake:
http://stackoverflow.com/questions/37684320/what-causes-a-compiled-library-to-store-its-build-command-internally


@stuarteberg commented on Tue Jun 07 2016

OK, I did some more digging (with the help of a stackoverflow user), and it turns out that fftw is the only package on my machine that includes its own build command in the binary itself. (For the record, it's in a variable named FFTW_CC, which makes its way into the binary by way of api/version.c)

Anyway, whatever, it doesn't matter. I think there's no way around it: People whose recipes set detect_binary_files_with_prefix: true may have to rebuild some of their packages.

The good news is that -- as far as I can tell -- this applies to very few of the packages in the default anaconda distribution.

But there are other packages from the defaults channel that will need updating, such as:

  • curl (The latest version needs to be rebuilt, but the version pinned in anaconda doesn't for some reason.)
  • tk (ditto)
  • gcc
  • pkg-config
  • swig

... and probably more.


@msarahan commented on Tue Jun 07 2016

Good to know. Thanks! Should we be trying to clear that information? Any idea why people put it there? Posterity's sake?


@stuarteberg commented on Tue Jun 07 2016

Should we be trying to clear that information?

I don't see the harm in leaving it there. Does conda even provide fftw? I don't think it does. This was in one of my own packages. I guess now that you guys ship mklfft, there's no need for me to use fftw anyway...


@jakirkham commented on Tue Jun 07 2016

Does conda even provide fftw?

conda-forge does.


@stuarteberg commented on Tue Jun 07 2016

conda-forge does.

Then make sure you rebuild it when conda-build 2.0 comes out! ;-)


@jakirkham commented on Tue Jun 07 2016

Duly noted.

curl (The latest version needs to be rebuilt, but the version pinned in anaconda doesn't for some reason.)

So, curl hardcodes the path to where the certificates live. I'm guessing old versions of curl used these from the system. So they were not affected by the prefix length during build time. Newer versions of curl (from defaults) use certificates that are provided in the openssl package. So, they are affected by the prefix during build time. At conda-forge we have a separate certificates package that, for compatibility reasons, ends up living in the same location where the openssl one from defaults places them. So, it will probably be affected. Though we already knew about this as we had to fix it before. 😄

tk (ditto)

Not sure where this hardcodes things. Did you see this anywhere in it? Probably points to $PREFIX/lib/tkX.Y/ and $PREFIX/lib/tclX.Y/.

gcc

Not surprised. Any chance we could get a rebuild of gcc before that conda-build release, @msarahan? I think only a few packages are affected by this (ones using Fortran or OpenMP), but we should probably get the compiler fixed for them.

pkg-config

Have not inspected this, but I'm guessing it hardcodes the path to $PREFIX/lib/pkgconfig somewhere. Hence unsurprising it needs a rebuild.

Surprised you didn't mention libtool, which adds .la files to $PREFIX/lib/. Would figure it has the same problem.

swig

I don't really use SWIG. So, I trust your judgement. Probably hardcoding some path to some provided .i files.


@jakirkham commented on Tue Jun 07 2016

Did you try git? I think that might be affected too. At least, I suspect ours will be. Guessing defaults is similar.


@jakirkham commented on Tue Jun 07 2016

Alright, this conversation needs to see the light of day (not a closed unrelated PR), I'm going to try using ZenHub's move issue feature, but this could go horribly wrong. So, please fasten your seatbelts. 😁

@jakirkham changed the title from "MNT: Re-render the feedstock" to "Rebuilding for longer path length (conda-build 2.0.0)" on Jun 7, 2016
@jakirkham
Member Author

cc @conda-forge/core @stuarteberg ( just in case 😉 )

@stuarteberg
Contributor

For a given conda installation, this is the command I'm using to decide if the package will need to be rebuilt after the conda-build 2.0 release:

fgrep -l ' binary ' $(conda info --root)/pkgs/*/info/has_prefix

I'm not sure how to do that for all packages in conda-forge generally.
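
The same check can be sketched in Python, which also reports which files inside each package are flagged. This assumes has_prefix lines have the three-field form shown earlier in this thread (placeholder, mode, path); lines in any other form are skipped. The function names, the abbreviated placeholder, and the demo package are made up for illustration.

```python
import glob
import os
import shlex
import tempfile


def binary_prefix_files(has_prefix_text):
    """Return the file paths that a has_prefix listing marks as 'binary'."""
    paths = []
    for line in has_prefix_text.splitlines():
        parts = shlex.split(line)
        # Expected form: "<placeholder> <mode> <path>"; other lines are skipped.
        if len(parts) == 3 and parts[1] == "binary":
            paths.append(parts[2])
    return paths


def packages_needing_rebuild(pkgs_dir):
    """Map package directory name -> files with a binary prefix."""
    found = {}
    pattern = os.path.join(pkgs_dir, "*", "info", "has_prefix")
    for fn in sorted(glob.glob(pattern)):
        with open(fn) as f:
            paths = binary_prefix_files(f.read())
        if paths:
            pkg = os.path.basename(os.path.dirname(os.path.dirname(fn)))
            found[pkg] = paths
    return found


# Demo on a throwaway pkgs directory; the placeholder is abbreviated.
with tempfile.TemporaryDirectory() as pkgs_dir:
    samples = {
        "fftw-3.3.4-1": "/miniconda/envs/_build_placehold binary lib/libfftw3.dylib\n"
                        "/opt/anaconda1anaconda2anaconda3 text lib/pkgconfig/fftw3.pc\n",
        "demo-1.0-0": "/opt/anaconda1anaconda2anaconda3 text bin/demo\n",
    }
    for name, text in samples.items():
        info_dir = os.path.join(pkgs_dir, name, "info")
        os.makedirs(info_dir)
        with open(os.path.join(info_dir, "has_prefix"), "w") as f:
            f.write(text)
    result = packages_needing_rebuild(pkgs_dir)
```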

@jakirkham
Member Author

Also asked the Anaconda team for some way to check this easily. See issue ( https://github.com/Anaconda-Platform/support/issues/44 ).

@pelson
Member

pelson commented Jun 23, 2016

tl;dr - do we need to rebuild everything that has the PREFIX in it?

@msarahan
Member

Yep - or rather yep, for anything that is a library that something else links to.

@pelson
Member

pelson commented Jun 23, 2016

Yep - or rather yep, for anything that is a library that something else links to.

Is it worth extending conda so that it can handle both lengths of prefix placeholder?

@msarahan
Member

so, try to use long first, but if that fails, fall back to short? That's probably a good idea. It might make the transition less jarring.
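
As a purely hypothetical sketch of that fallback idea (not how conda or conda-build implements it): the build prefix can only be as long as the shortest binary placeholder among the dependencies, so one could compute the usable length up front. The function name and the preferred length of 255 are assumptions for illustration; the short placeholder is the one from the error message earlier in this thread.

```python
def usable_prefix_length(dependency_placeholders, preferred_length):
    """Longest build prefix length that every binary dependency can accept."""
    lengths = [len(p) for p in dependency_placeholders]
    # The shortest placeholder among the dependencies caps the prefix length.
    return min(lengths + [preferred_length])


# Placeholder copied from the fftw error message earlier in this thread.
short = ("/miniconda/envs/_build_placehold_placehold_placehold"
         "_placehold_placehold_placeho")

with_old_dep = usable_prefix_length([short], preferred_length=255)
without_deps = usable_prefix_length([], preferred_length=255)
```

With one old-style dependency in the environment the whole build falls back to the short length, which matches the observation that a single short-prefix package constrains everything built on top of it.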

@jakirkham
Member Author

jakirkham commented Aug 14, 2016

As more info for this planning, @bollwyvl made an addition that was released in anaconda-client version 1.5.1 a little while ago. This makes sure that the has_prefix info shows up for new packages that have been uploaded with that version. We updated to that version with PR ( conda-forge/anaconda-client-feedstock#1 ). So anything within the past 10 days should have this info now.

However, we still have the challenging issue of all the old packages that were not uploaded with this info. This is described in issue ( https://github.com/Anaconda-Platform/support/issues/57 ). It would be good if we can get that info added to those packages. Alternatively, if that is not possible, a list of packages with that info for the conda-forge channel might also be nice.

@jakirkham
Member Author

So we still don't know all the things we need to rebuild because of the path length increase. I really don't understand how we can proceed to conda-build version 2 without a list of these. While I understand that it may take us some time to do the rebuild, I'm not eager to learn what these are by trial and error. Sorry if this comes across as pushy, but we have been talking about this for 3mos. What do we need to do to get this list?

@jjhelmus
Contributor

After reading these comments, this is how I understand the issue. Please correct me if my understanding is wrong.

From my reading of this issue, we need to determine which packages in conda-forge need to be rebuilt with the longer prefix. Then these packages need to be rebuilt with conda-build 2.0+ in the correct order to support the longer prefix.

For packages which were uploaded with anaconda-client version 1.5.1 or newer this information should be available from the Anaconda.org API by looking at the has_prefix entry. For example pyfive 0.2.0 indicates that prefixes are not used, whereas distributed 1.13.2 does use a prefix and would need to be rebuilt.

For packages uploaded with older versions of anaconda-client this information is not available through the Anaconda.org API. Downloading the package and checking for a has_prefix file in the info directory is one way (the only way?) to determine if these packages use a prefix and need to be rebuilt.

@jakirkham
Member Author

Yep, that's all correct.

For packages uploaded with older versions of anaconda-client this information is not available through the Anaconda.org API. Downloading the package and checking for a has_prefix file in the info directory is one way (the only way?) to determine if these packages use a prefix and need to be rebuilt.

This is what I'm unsure about. There was a fair bit of discussion about this being done server-side (so presumably by Continuum). Though it is unclear to me whether that is still the case or not as there has been little response of late.

@jakirkham
Member Author

jakirkham commented Sep 15, 2016

Relevant info from gitter on a script to download and check by @jjhelmus from start to end with some results and comments.

The script: https://gist.github.com/jjhelmus/869d6827ac8e0275437e7643989974e4
The results: https://gist.github.com/jjhelmus/3d4a3122e14adfc1223a328b4b8cabd2

@jakirkham
Member Author

Ran the aforementioned script by @jjhelmus with modifications to download all files no matter what and then search for binary in the info/has_prefix file (if present). If the file was present and that text was found, it counted as a match. In some cases we were unable to download files; we will need to manually check these or update the script to handle them.

Have included the log, the list of files with a binary prefix, the list of packages downloaded with md5 checksums, the modified script, and the list of undetermined files in gist ( https://gist.github.com/jakirkham/621cd3a03098205f5eba83533df932fe ).

Found 30 packages that are known issues and 26 that are undetermined. At the time of this writing there are 1149 packages according to anaconda.org on conda-forge. We downloaded 1085 packages. Given the 26 packages we were unable to download at all due to a missing entry, this leaves us with an additional 38 that are unaccounted for.

@patricksnape

What if we use similar logic to conda-build in order to search the package binaries for the shorter prefix string?

@jakirkham
Member Author

jakirkham commented Sep 15, 2016

Maybe, but maybe that runs us into the same issues we have seen with conda inspect linkages. While it is a bit old fashioned and hacky to search the has_prefix file, this just works basically regardless how old the package is. The main questions I'm worried about are...

  1. Why are there 38 packages not listed in the index and what are they?
  2. Why can't we find download links for 26 packages (especially when they have packages)?

Edit: Added xrefs for upstream issues about these 2 problems.

@patricksnape

No we should definitely do what you are doing - just for the files that are missing has_prefix?

@jakirkham
Member Author

As you are the second person asking me this question, I must have done a terrible job explaining this. Sorry about that. 😞

AFAIK all packages have the has_prefix file. However, we seem to be unable to download some packages or even find them via the anaconda.org API. That's the issue we are wrestling with right now. Hope that is clearer. If not, please ask more questions.

@jjhelmus
Contributor

Why can't we find download links for 26 packages (especially when they have packages)?
xref: Anaconda-Platform/support#74

This is caused by a bug in the script used to find the latest versions of the packages, which filters out all non-latest versions. If the package uses a non-standard version scheme, post releases, or other oddities, all instances of the package can be filtered out. If you change the find_latest_versions function to the following, all the packages will be downloaded and examined.

def find_latest_versions(index, package_name):
    """ Return the latest version and packages from a conda channel index. """
    valid = [v for v in index.values() if v['name'] == package_name]
    versions = [parse_version(v['version']) for v in valid]
    latest_ver = str(max(versions))
    entries = [v for v in valid if v['version'] == latest_ver]
    if len(entries) == 0:
        # fall back to sorting versions by string if all entries were removed
        versions = [v['version'] for v in valid]
        latest_ver = sorted(versions)[-1]
        entries = [v for v in valid if v['version'] == latest_ver]
    return latest_ver, entries

I'm still only finding 1149 packages, whereas Anaconda.org shows 1173 (24 missing), which I am investigating.
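
One way the filtering can go wrong, sketched with a made-up version string: parsing a version and converting it back to a string normalizes it, so the result may no longer equal the string stored in the index. The "2016.01" value is hypothetical and chosen only because its leading zero is dropped on normalization; real packages may have hit other oddities.

```python
try:
    from packaging.version import parse as parse_version
except ImportError:
    from pip._vendor.packaging.version import parse as parse_version

# Hypothetical version strings as they might appear in a channel index.
raw_versions = ["2016.01", "2015.12"]

# The parsed maximum is normalized when converted back to a string.
latest = str(max(parse_version(v) for v in raw_versions))

# Comparing against the raw strings then matches nothing.
matches = [v for v in raw_versions if v == latest]
```

Here latest comes back as "2016.1", so the exact-match filter leaves no entries, and the string-sort fallback in the function above is what keeps the package from being dropped.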

@jjhelmus
Contributor

jjhelmus commented Sep 20, 2016

Why are there 38 packages not listed in the index and what are they?
xref: https://github.com/Anaconda-Platform/support/issues/73

This is also a bug in the script. The list of Anaconda.org packages is only pulled for the platform on which the script is running (linux-64, osx-64, etc.) and therefore fails to find some of the files. I will post a modified version of the script shortly which checks all packages on Anaconda.org.

@jakirkham
Member Author

Ah, ok, so these are presumably non-osx packages in my case. Thanks for clarifying. Will close that bug report.

Could you also please incorporate the changes from my script when you update?

@jjhelmus
Contributor

The following script can be used to find packages with a binary prefix in the conda-forge or another channel.

It is not perfect; it only searches what it thinks to be the latest version of the package, and then only a random (based on the package's MD5) build/platform of these packages. Nonetheless, it finds 29 packages which have a binary prefix and will need to be rebuilt with conda-build 2.0. It misses the curl package because the Windows package it downloads does not need a binary prefix:

#!/usr/bin/env python3
""" Find conda packages which use a binary prefix. """

import argparse
import bz2
import json
import os
import tarfile
import urllib.request

try:
    from packaging.version import parse as parse_version
except ImportError:
    from pip._vendor.packaging.version import parse as parse_version


def get_channel_index(channel):
    """ Return the channel index for all platforms. """

    # find all packages in the channel one platform at a time
    index = {}
    url_template = 'https://conda.anaconda.org/%s/%s/repodata.json.bz2'
    for platform in ['linux-64', 'osx-64', 'win-32', 'win-64', 'linux-32']:
        url = url_template % (channel, platform)
        response = urllib.request.urlopen(url)
        decomp = bz2.decompress(response.read())
        json_response = json.loads(decomp.decode('utf-8'))
        index.update(json_response['packages'])

    # add a download url to all packages in the index
    channel_url = 'https://conda.anaconda.org/%s' % channel
    for fn, info in index.items():
        subdir = info['subdir']
        info['url'] = channel_url + '/' + subdir + '/' + fn

    return index


def find_latest_versions(index, package_name):
    """ Return the latest version and packages from a conda channel index. """
    valid = [v for v in index.values() if v['name'] == package_name]
    versions = [parse_version(v['version']) for v in valid]
    latest_ver = str(max(versions))
    entries = [v for v in valid if v['version'] == latest_ver]
    if len(entries) == 0:
        # fall back to sorting versions by string if all entries were removed
        versions = [v['version'] for v in valid]
        latest_ver = sorted(versions)[-1]
        entries = [v for v in valid if v['version'] == latest_ver]
    return latest_ver, entries


def parse_arguments():
    """ Parse command line arguments. """
    parser = argparse.ArgumentParser(
        description="Find conda packages which use a prefix")
    parser.add_argument(
        'packages', nargs='*',
        help=('Name of packages to check, leave blank to check all packages '
              'on the channel'))
    parser.add_argument(
        '--skip', '-s', action='store', help=(
            'file containing list of packages to skip when checking for '
            'prefixes'))
    parser.add_argument(
        '--verb', '-v', action='store_true', help='verbose output')
    parser.add_argument(
        '--channel', '-c', action='store', default='conda-forge',
        help='Conda channel to check.  Default is conda-forge')
    parser.add_argument(
        '--json', action='store', help='Save outdated packages to json file.')
    parser.add_argument(
        '--directory', '-d', action='store',
        default=os.path.join(os.getcwd(), 'pkg_cache'),
        help='where to store packages')
    return parser.parse_args()


def find_prefix_packages(index, package_names, verbose, cache_dir):
    """ Return a list of packages which use a prefix. """

    pkgs_with_bin_prefix = []
    pkgs_with_no_bin_prefix = []

    for package_name in sorted(package_names):
        _, entries = find_latest_versions(index, package_name)

        if not entries:
            print(package_name + " : Missing any entries. Skipping...")
            continue

        # sort entries by md5 so we try the same package each time
        entries = sorted(entries, key=lambda k: k['md5'])
        url = entries[0]['url']
        filename = os.path.join(cache_dir, url.split('/')[-1])

        # Download if not in cache
        if not os.path.exists(filename):
            print("Downloading:", filename)
            response = urllib.request.urlopen(url)
            with open(filename, 'wb') as f:
                f.write(response.read())

        # determine if package uses a binary prefix; close the archive when done
        with tarfile.open(filename) as tf:
            try:
                uses_prefix = b' binary ' in tf.extractfile(
                    tf.getmember('info/has_prefix')).read()
            except KeyError:
                # no info/has_prefix file, so no prefix replacement is needed
                uses_prefix = False

        if uses_prefix:
            print(package_name, "uses a binary prefix")
            pkgs_with_bin_prefix.append(package_name)

        else:
            pkgs_with_no_bin_prefix.append(package_name)
            if verbose:
                print(package_name, "does NOT use a binary prefix")

    print("Uses a binary prefix:", len(pkgs_with_bin_prefix))
    print("Does NOT use a binary prefix:", len(pkgs_with_no_bin_prefix))
    print("Total:", len(pkgs_with_bin_prefix) + len(pkgs_with_no_bin_prefix))

    return pkgs_with_bin_prefix


def main():
    """ main function """
    args = parse_arguments()

    # create somewhere to store downloaded packages.
    if not os.path.exists(args.directory):
        os.makedirs(args.directory)

    # determine package names to check
    index = get_channel_index(args.channel)
    package_names = set(args.packages)
    if len(package_names) == 0:  # no package names given on command line
        package_names = {v['name'] for k, v in index.items()}

    # remove skipped packages
    if args.skip is not None:
        with open(args.skip) as f:
            pkgs_to_skip = [line.strip() for line in f]
        package_names = [p for p in package_names if p not in pkgs_to_skip]

    # find packages which use a binary prefix
    pkgs_with_bin_prefix = find_prefix_packages(
        index, package_names, args.verb, args.directory)

    # save pkgs_with_bin_prefix to a json formatted file if specified
    if args.json is not None:
        with open(args.json, 'w') as f:
            json.dump(pkgs_with_bin_prefix, f)


if __name__ == "__main__":
    main()

@jjhelmus
Contributor

Results of running the script:

$ python binary_prefix_check_file.py --json bin_prefix.json
bison uses a binary prefix
cycamore uses a binary prefix
cyclus uses a binary prefix
dbus uses a binary prefix
eccodes uses a binary prefix
ecmwf_grib uses a binary prefix
flex uses a binary prefix
glib uses a binary prefix
hdf5 uses a binary prefix
iverilog uses a binary prefix
lua uses a binary prefix
mpi4py uses a binary prefix
ncurses uses a binary prefix
openmpi uses a binary prefix
openssl uses a binary prefix
openturns uses a binary prefix
pango uses a binary prefix
pkg-config uses a binary prefix
python-eccodes uses a binary prefix
python-ecmwf_grib uses a binary prefix
python-spams uses a binary prefix
simbody uses a binary prefix
swig uses a binary prefix
texlive-core uses a binary prefix
tk uses a binary prefix
udunits uses a binary prefix
udunits2 uses a binary prefix
uwsgi uses a binary prefix
vigra uses a binary prefix
Uses a binary prefix: 29
Does NOT use a binary prefix: 1148
Total: 1177

I think that list along with curl is all that we should focus on rebuilding right away when we move to conda build 2.0. If other packages are found to have a short prefix we can rebuild those as needed.

@jakirkham
Member Author

Ok, that's reassuring that we are now seeing the right number of packages.

I know that the prefix length was increased on Mac/Linux.

However, I'm less clear on what (if any) change was done for Windows. @msarahan, could you please provide us some guidance on Windows?

...it only searches what it thinks to be the latest version of the package, and then only a random (based on the package's MD5) build/platform of these packages.

Should we be doing this randomly or should we try to get one from each platform (if possible)? Assuming of course we need to be rebuilding on Windows too and that Mac/Linux is any different on this front.

@jjhelmus
Contributor

Updated my original gist so that the script now downloads and checks a package from each platform (osx, linux, and win). This might still miss a few corner cases but I think it is good enough. The list of packages which use a binary prefix that this script finds are:

$ python binary_prefix_check_file.py
bison uses a binary prefix
curl uses a binary prefix
cycamore uses a binary prefix
cyclus uses a binary prefix
dbus uses a binary prefix
eccodes uses a binary prefix
ecmwf_grib uses a binary prefix
flex uses a binary prefix
fontconfig uses a binary prefix
git uses a binary prefix
glib uses a binary prefix
graphviz uses a binary prefix
harfbuzz uses a binary prefix
hdf5 uses a binary prefix
iverilog uses a binary prefix
lua uses a binary prefix
mpi4py uses a binary prefix
ncurses uses a binary prefix
obspy uses a binary prefix
openmpi uses a binary prefix
openssl uses a binary prefix
openturns uses a binary prefix
pango uses a binary prefix
pkg-config uses a binary prefix
python-eccodes uses a binary prefix
python-ecmwf_grib uses a binary prefix
python-spams uses a binary prefix
simbody uses a binary prefix
swig uses a binary prefix
texlive-core uses a binary prefix
tk uses a binary prefix
udunits uses a binary prefix
udunits2 uses a binary prefix
uwsgi uses a binary prefix
vigra uses a binary prefix
Uses a binary prefix: 35
Does NOT use a binary prefix: 1168
Total: 1203
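
For reference, picking one package per platform could be sketched as below. This is an illustration of the idea only, not the code in the updated gist; the helper name and the sample entries are made up, and it assumes each index entry carries 'subdir' and 'md5' keys as in the script above.

```python
def one_entry_per_platform(entries):
    """Pick one index entry per 'subdir', lowest md5 first for repeatability."""
    chosen = {}
    for entry in sorted(entries, key=lambda e: e["md5"]):
        # Keep only the first entry seen for each platform.
        chosen.setdefault(entry["subdir"], entry)
    return [chosen[subdir] for subdir in sorted(chosen)]


# Made-up index entries for illustration.
entries = [
    {"name": "curl", "subdir": "win-64", "md5": "c3"},
    {"name": "curl", "subdir": "linux-64", "md5": "b2"},
    {"name": "curl", "subdir": "linux-64", "md5": "a1"},
    {"name": "curl", "subdir": "osx-64", "md5": "d4"},
]
picked = [(e["subdir"], e["md5"]) for e in one_entry_per_platform(entries)]
```

Checking one build per platform is what lets a package like curl show up even when its Windows build has no binary prefix.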

@jakirkham
Member Author

Thanks @jjhelmus. Completely agree that this is good enough. 👍

@jakirkham
Member Author

At this point, we are waiting for a conda-smithy release. Please see issue ( conda-forge/conda-smithy#305 ).

@bgruening
Contributor

Thanks for working on this! Really appreciated!

@jakirkham
Member Author

So here is a case ( conda-forge/cmake-feedstock#14 ) where we are encountering a build failure in recipes using cmake with conda-build version 2. It seems like the short prefix length in ncurses forces us to use a shorter prefix, but it is not clear if this causes the failure or not ATM. The most common case thus far is that cmake cannot find libarchive, though there have been reports of it failing to find other libraries.

Now I'm not entirely sure whether rebuilding with conda-build version 2.x fixes this problem or not, but I'd like a little time to explore that first before we proceed. If other people would like to help, that would definitely be appreciated.

@pelson
Member

pelson commented Oct 8, 2016

So here is a case ( conda-forge/cmake-feedstock#14 ) where we are encountering a build failure in recipes using cmake with conda-build version 2.

I have been trying to find some evidence of this (thank you for all of the CircleCI logs), but still am not able to convince myself that the cmake problem was a conda-build 2 issue at all. I have been attempting to reproduce the problem locally, and no matter which conda-build I use, cmake is functional (cmake --help, rather than compiling anything) for both 3.5 and 3.6. I will try getting the repo exactly as was in other examples to see if I can reproduce. If not, I'm inclined to move the conda-build version pinning forward, and deal with any fallout from there.

@pelson
Member

pelson commented Oct 8, 2016

OK, there is definitely something in the conda-build 2 theory - able to reproduce. Will track any updates in conda-forge/cmake-feedstock#21 to avoid making further noise in this issue.

@jakirkham
Member Author

This is essentially a solved issue with the exception of one or two stragglers that appear unmaintained at this point. Closing this out.
