
How do I make all the imports within my service available when working with dask #3566

Closed
Sargababu opened this issue Mar 11, 2020 · 12 comments

Comments

@Sargababu

I have a lot of imports within my service, and when I try to run it by calling the dask scheduler in my cluster I get a ModuleNotFoundError as below.
raise result
  File "/opt/conda/lib/python3.6/site-packages/distributed/worker.py", line 972, in upload_file
  File "/opt/conda/lib/python3.6/site-packages/distributed/utils.py", line 1055, in import_file
  File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
  File "/worker-i7k5722z/main.py", line 12, in <module>
ModuleNotFoundError: No module named 'src'
[2020-03-11 14:24:06 -0700] [98249] [INFO] Worker exiting (pid: 98249)
[2020-03-11 14:24:06 -0700] [98250] [ERROR] Exception in worker process
My directory structure is as follows:

Segmentation-withdask/
    resources/
        input_schema.json
    src/
        client/
            datalake_download.py
        utils/
            dice_calculator.py
    main.py

Within main.py I call my client and gather the results, and I import all the files within main.py. What is the best approach in my case to make all my imports available across my dask workers?

@jrbourbeau
Member

@Sargababu could you provide a minimal example (see https://blog.dask.org/2018/02/28/minimal-bug-reports)? It's not clear to me what code you're running that's resulting in the traceback you posted. It looks like you're using Client.upload_file, but I'm not sure what else.

@Sargababu
Author

@jrbourbeau Thanks for taking the time to look into this issue. Basically, I have my dask scheduler and workers deployed on AWS. I have a service that does operations like S3 download and image-data processing, and it is modularized as I described above. When I pull all of this functionality into a single SageMaker notebook file, the dask client works. But when I try to call the dask client from my main.py file, it throws the above error.

Within my main.py I have these imports (I am just adding the relevant parts from my script):

from dask.distributed import Client

from src.utils.exceptions import *
from src.utils.constants import *
from src.client.datalake_download import datalake_image_download

client = Client(dask_scheduler_ip)

# request_handler, request_data, and org_id are defined elsewhere in the script
def main_fn():
    # Submit one task per input image, then gather the results
    data_futureobj = [
        client.submit(request_handler, image_data, org_id)
        for image_data in request_data['inputResources']
    ]
    metrics_data = client.gather(data_futureobj)

The error occurs because the dask workers are not able to resolve my imports like src.utils.exceptions and src.utils.constants. I found suggestions online to use client.upload_file to specify the files explicitly, but I am not sure how to use it in my case, or whether there is another way to make all my file imports available to my workers.

@mrocklin
Member

In general Dask assumes that the software environment is the same between your client and your workers. Dask itself can move around individual files with upload_file, but for more complex environments you will have to manage the software environment yourself. Typically on the cloud people do this with Docker images and something like Kubernetes.
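
For what it's worth, a minimal sketch of the upload_file route for a package like src/ (the scheduler address below is a placeholder): zip the package and upload the archive, since workers put uploaded .zip files on their sys.path.

import shutil
from dask.distributed import Client

client = Client("tcp://scheduler-address:8786")  # placeholder address

# Bundle the local src/ package into src.zip
shutil.make_archive("src", "zip", root_dir=".", base_dir="src")

# Ship the archive to every worker; workers add uploaded .zip files
# to their sys.path, so `from src.utils.constants import *` resolves there too
client.upload_file("src.zip")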

@Sargababu
Author

@mrocklin Thanks for the clarification. I had one more question around this. When we ask a dask worker to do some job that requires a package, say requests or opencv, will it look in the client's library location or the worker's? Probably a dumb question.
I have a script in a SageMaker notebook that does some work by calling the dask scheduler in the cluster, and it works as expected. But when I run the same script in Docker, with the same package versions and everything, I get the error below:

File "/opt/conda/lib/python3.7/site-packages/distributed/utils.py", line 329, in f
result[0] = yield future
File "/opt/conda/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/opt/conda/lib/python3.7/site-packages/distributed/client.py", line 1741, in _gather
raise exception.with_traceback(traceback)
File "test_dask.py", line 36, in newunit
pred = downloadreq(org_id,contract_id,fileurl)
File "test_dask.py", line 32, in downloadreq
response = requests.get(url)
SystemError: unknown opcode

Any idea why this could be happening?

@mrocklin
Member

When we ask a dask worker to do some job that requires a package, say requests or opencv, will it look in the client's library location or the worker's? Probably a dumb question.

A Dask Worker is a Python process. If you ask it to use some library then that Python process will try to import it however Python imports things. The worker is unable to look at the client's process and see its software environment. Dask doesn't do any magic here. It is just a bunch of Python processes.
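
For example, you can ask each worker where its own Python process imports a package from (a sketch, assuming a connected client):

def where_is_requests():
    import requests
    return requests.__file__

# Runs the function in every worker process and returns a dict
# mapping worker address -> the path that worker imported requests from
client.run(where_is_requests)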

Regarding "unknown opcode" unfortunately no, I'm unfamiliar with that error message. Given your concern above about mismatched software environments my first recommendation would be to ensure that all of your Python processes have the same versions. You might try the following command:

client.get_versions(check=True)
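
With check=True this raises an error on the client when the client, scheduler, and workers report mismatched package versions, rather than just returning the version report.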

@Sargababu
Author

@mrocklin Yes, I did try that and have made all the versions the same across client and workers. Could this be caused by a mismatch in Python versions?

@mrocklin
Member

mrocklin commented Mar 17, 2020 via email

@Sargababu
Author

So if there is a Python version mismatch, will client.get_versions(check=True) detect that?

@jrbourbeau
Member

Not currently, but #3567 adds Python to the list of packages that are checked with client.get_versions(check=True).
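
In the meantime, you can compare Python versions manually; a sketch (the scheduler address is a placeholder):

import sys
from dask.distributed import Client

client = Client("tcp://scheduler-address:8786")  # placeholder address

print(sys.version)                      # Python version in the client process
print(client.run(lambda: sys.version))  # Python version in each worker process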

@Sargababu
Author

@jrbourbeau OK, thanks for confirming that. I guess it could be my Python version that is causing this issue. Let me get that confirmed. @mrocklin @jrbourbeau I really appreciate you taking the effort and time to help me here :)

@Sargababu
Author

@mrocklin @jrbourbeau Thanks a lot again. Changing the Python version to be the same across client and workers solved my issue.

@jrbourbeau
Member

Great! Glad to hear your issue has been resolved
