Include NumPy BLAS/LAPACK info in client.get_versions() #1827

Open
jakirkham opened this issue Mar 9, 2018 · 16 comments
Labels
good first issue Clearly described and easy to accomplish. Good for beginners to the project.

Comments

@jakirkham
Member

At the risk of overloading client.get_versions() with info, it would be handy to be able to check the NumPy BLAS/LAPACK linkage in here. This can be really helpful when debugging a slow computation or a very strange segfault that might be BLAS or LAPACK related. One way to get at this info is numpy.__config__.show(), but that might be too heavy for client.get_versions(). Open to other ways to include this info if there are suggestions.
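
For illustration, here is a minimal sketch of capturing that information as a string instead of printing it. It assumes numpy.show_config() writes to standard output, which is worth verifying for the NumPy release in use. The helper name numpy_config_text is made up for this example and is not part of NumPy or distributed.

```python
import contextlib
import io


def numpy_config_text():
    """Return the text printed by numpy.show_config(), or None without NumPy."""
    try:
        import numpy
    except ImportError:
        return None  # NumPy is not installed in this environment
    buffer = io.StringIO()
    # Assumption: show_config() prints to sys.stdout, so redirecting captures it.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


text = numpy_config_text()
print(text if text is not None else "NumPy not installed")
```

The size of the returned string gives a rough idea of how "heavy" this would be per worker.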

@mrocklin
Member

mrocklin commented Mar 9, 2018 via email

@rbubley
Contributor

rbubley commented Mar 10, 2018

Where the task is largely about gathering information from workers, I was wondering if the right approach might be to modify Client.run() to be able to return values (or futures). Deciding what information to harvest from the workers would then be in the control of the clients, not reliant on changes to distributed.

@mrocklin
Member

Yes, that's doable today from user-space and a fine solution.

One reason why get_versions doesn't take this approach (it used to) is that it also gathers information from the scheduler, where we try to avoid depending on pickle. I suspect that in the future, using pickle may be turned off by default in the scheduler.
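
A rough sketch of the user-space approach, relying on Client.run returning a dict keyed by worker address. The names numpy_probe and gather_numpy_info are hypothetical, and the probe assumes numpy.show_config() prints to stdout. It only covers workers, so it does not address the scheduler concern mentioned above.

```python
import contextlib
import io


def numpy_probe():
    """Runs on a worker; returns plain strings (or None) describing NumPy."""
    try:
        import numpy
    except ImportError:
        return {"numpy": None, "config": None}
    buffer = io.StringIO()
    # Assumption: numpy.show_config() prints to stdout, so it can be captured.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return {"numpy": numpy.__version__, "config": buffer.getvalue()}


def gather_numpy_info(client):
    """Return {worker_address: probe_result} using Client.run."""
    return client.run(numpy_probe)


# Usage against a real cluster (assumes `distributed` is installed):
#     from distributed import Client
#     client = Client()
#     per_worker = gather_numpy_info(client)
```

Comparing the "config" strings across workers is then a quick way to spot a worker linked against a different BLAS.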

@martindurant added the good first issue label Jul 14, 2018

@lalitparate

Hi, this is my first time contributing to open source. Can I work on it?

@jhamman
Member

jhamman commented Dec 7, 2018

@lalitparate - yes, dask is a community driven open-source project. As such, anyone is welcome to work on anything. Let us know if you need help.

@moshiba

moshiba commented Mar 26, 2020

Are we still aiming to show this worker linkage info in client.get_versions()?

I think it's reasonable to include more things. It's fairly cheap. We might also keep get_versions as it is, but make a larger get_info function that has a wider scope.

Or should I build get_info() by wrapping client.run()?

@quasiben
Member

As others have commented, adding to get_versions seems to be a supported idea. You might want to look at #3567 as it has some updates to get_versions as well as tests

@moshiba

moshiba commented Mar 30, 2020

> As others have commented, adding to get_versions seems to be a supported idea. You might want to look at #3567 as it has some updates to get_versions as well as tests

Sure, thanks.

May I ask what exactly we want to show in get_versions()?
Since there are lots of possible BLAS/LAPACK library linking options in NumPy (seven currently),
I'm not sure that showing every piece of build info presented by numpy.show_config() is the best idea.

@moshiba

moshiba commented Mar 30, 2020

Another question is how we should fit the various library linkage info into client.get_versions().
It seems to me that the current output layout is meant to show version info alone, not a list of sublists about a package,
so packing output like the following into get_versions() for every worker seems suboptimal:

blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
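
If structured data is preferred over raw text, output in the layout shown above could be parsed into a dict. This is only a sketch: parse_legacy_config is a hypothetical helper, and it assumes exactly the "section:" / "key = value" / "NOT AVAILABLE" format in this comment, which other NumPy releases may not use.

```python
import ast


def parse_legacy_config(text):
    """Parse 'section:' blocks into {section: dict of values, or None}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not raw[0].isspace() and line.endswith(":"):
            # An unindented line ending in ':' starts a new section.
            current = line[:-1]
            sections[current] = {}
        elif current is None:
            continue  # ignore anything before the first section header
        elif line == "NOT AVAILABLE":
            sections[current] = None
        elif " = " in line and sections[current] is not None:
            key, value = line.split(" = ", 1)
            try:
                parsed = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                parsed = value  # bare words such as `c` stay as strings
            sections[current][key] = parsed
    return sections


sample = """\
blas_mkl_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
"""
info = parse_legacy_config(sample)
print(info["blas_mkl_info"])
print(info["openblas_info"]["libraries"])
```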

@quasiben
Member

@HsuanTingLu do you have thoughts on a more optimal layout?

@moshiba

moshiba commented Mar 31, 2020

No, I don't have one, so I'll probably add it anywhere you guys see fit.

Back to the second question, how should the info be fitted into client.get_versions()?
I'm thinking about adding a numpy-config sublist under host, or maybe somewhere under package::numpy?

@quasiben
Member

I am +1 on package::numpy. I understand this to mean something like:

 'packages': {'numpy': {'blas_opt_info': {}}}

Is that right?

@moshiba

moshiba commented Apr 1, 2020

Yeah, something like this:
'packages': {'numpy': '1.18.2', 'blas_opt_info': {}, 'lapack_opt_info': {}}
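
To make the proposed layout concrete, here is a small sketch that builds such an entry from already-parsed config sections and keeps only the two *_opt_info sections rather than all seven. The function name packages_entry and its parameters are invented for illustration; nothing like it exists in distributed as far as this thread shows.

```python
def packages_entry(numpy_version, config_sections,
                   keep=("blas_opt_info", "lapack_opt_info")):
    """Build a flat 'packages' dict: NumPy version plus selected config sections."""
    entry = {"numpy": numpy_version}
    for name in keep:
        # Missing or unavailable sections are reported as an empty dict.
        entry[name] = config_sections.get(name) or {}
    return entry


sections = {
    "blas_mkl_info": None,
    "blas_opt_info": {
        "libraries": ["openblas", "openblas"],
        "library_dirs": ["/usr/local/lib"],
    },
}
print({"packages": packages_entry("1.18.2", sections)})
```

Sections that are absent or unavailable come back as {}, so every worker reports the same keys and the results stay easy to compare.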

@GenevieveBuckley
Contributor

@HsuanTingLu do you still want to work on this? Did the comments from Ben answer all your questions?

@moshiba

moshiba commented Oct 19, 2021

@GenevieveBuckley I have a few commits lying around; I think I'll need a few weeks to put them together.

@GenevieveBuckley
Contributor

Sounds great, thank you!

9 participants