8 processes work OK. Output of free while running mpiexec.hydra -np 8 yank script --yaml=p-xylene-implicit.yaml:
bash-4.2$ free
              total        used        free      shared  buff/cache   available
Mem:      131934588     5161320   114950568     1014712    11822700   124444344
Swap:             0           0           0
20 processes give an error. Output of free while running mpiexec.hydra -np 20 yank script --yaml=p-xylene-implicit.yaml, just before the failure:
bash-4.2$ free
              total        used        free      shared  buff/cache   available
Mem:      131934588     6578724   113531156     1019564    11824708   123022088
Swap:             0           0           0
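As a rough check on the two free snapshots above, here is a minimal sketch that compares them. The numbers are copied from the output; the assumption that they are in KiB (the default unit for free) is mine. It suggests that physical memory was far from exhausted just before the failure, though it does not by itself explain the Errno 12 below.

```python
# Values copied from the two `free` snapshots, keyed by MPI process count.
# Assumption: units are KiB, the default for `free`.
KIB_PER_GIB = 1024 ** 2

snapshots = {
    8: {"total": 131934588, "used": 5161320, "available": 124444344},
    20: {"total": 131934588, "used": 6578724, "available": 123022088},
}

# Extra memory in use when going from 8 to 20 processes.
extra_used_kib = snapshots[20]["used"] - snapshots[8]["used"]
extra_procs = 20 - 8
per_proc_mib = extra_used_kib / extra_procs / 1024

for nproc, snap in snapshots.items():
    available_fraction = snap["available"] / snap["total"]
    print("np={}: used {:.2f} GiB, available {:.1%} of total".format(
        nproc, snap["used"] / KIB_PER_GIB, available_fraction))

print("extra used per added process: {:.1f} MiB".format(per_proc_mib))
```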
The first error message and surrounding text were:
<…snip…>
2022-05-20 13:23:05,043: WARNING - openmmtools.multistate.multistatesampler - Warning: The openmmtools.multistate API is experimental and may change in future releases
Traceback (most recent call last):
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/schema/validator.py", line 411, in call_constructor
obj = subcls(**constructor_kwargs)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/openmmtools/multistate/replicaexchange.py", line 217, in __init__
super(ReplicaExchangeSampler, self).__init__(**kwargs)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/openmmtools/multistate/multistatesampler.py", line 203, in __init__
self._display_cuda_devices()
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/openmmtools/multistate/multistatesampler.py", line 1772, in _display_cuda_devices
cuda_query_output = os.popen("nvidia-smi --query-gpu=index,gpu_name --format=csv,noheader").read().strip()
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/os.py", line 980, in popen
bufsize=buffering)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/subprocess.py", line 729, in __init__
restore_signals, start_new_session)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/subprocess.py", line 1295, in _execute_child
restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/bin/yank", line 10, in <module>
sys.exit(main())
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/cli.py", line 73, in main
dispatched = getattr(commands, command).dispatch(command_args)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/commands/script.py", line 155, in dispatch
yaml_builder.run_experiments(write_status=write_status)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/experiment.py", line 747, in run_experiments
group_size = self._get_experiment_mpi_group_size(all_experiments)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/experiment.py", line 2862, in _get_experiment_mpi_group_size
sampler_names = {self._create_experiment_sampler(exp[1], []).__class__.__name__ for exp in experiments}
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/experiment.py", line 2862, in <setcomp>
sampler_names = {self._create_experiment_sampler(exp[1], []).__class__.__name__ for exp in experiments}
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/experiment.py", line 2990, in _create_experiment_sampler
return schema.call_sampler_constructor(constructor_description)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/schema/validator.py", line 470, in call_sampler_constructor
special_conversions=special_conversions)
File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/schema/validator.py", line 413, in call_constructor
raise RuntimeError('Attempt to initialize failed with: {}'.format(str(e)))
RuntimeError: Attempt to initialize failed with: [Errno 12] Cannot allocate memory
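The traceback above shows the failure comes from spawning nvidia-smi through os.popen, not from the sampler itself. To check whether the subprocess launch fails on its own, the same query can be run outside YANK. This is a minimal sketch, not how openmmtools handles it: query_cuda_devices is a hypothetical helper, and only the nvidia-smi command line is taken from the traceback.

```python
import subprocess


def query_cuda_devices():
    """Run the nvidia-smi query from the traceback.

    Returns (output, error); exactly one of the two is None.
    """
    cmd = ["nvidia-smi", "--query-gpu=index,gpu_name", "--format=csv,noheader"]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,  # text mode; spelled this way for Python 3.6
        )
    except OSError as exc:
        # Covers a failed fork (e.g. Errno 12) as well as a missing nvidia-smi binary.
        return None, exc
    return result.stdout.strip(), None


output, error = query_cuda_devices()
if error is not None:
    print("could not query GPUs: [Errno {}] {}".format(error.errno, error.strerror))
else:
    print(output)
```

Running this under the same mpiexec.hydra -np 20 launch would show whether every rank can spawn the subprocess or only some of them fail.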
2022-05-20 13:23:05,054: CRITICAL - mpiplus.mpiplus - MPI node 1/20 raised an exception and called Abort()! The exception traceback follows
<…snip…>
For what it's worth, I get an entirely different error with -np 25 (so perhaps I am just running things incorrectly since I count 25 lambda values for the complex system):
<...snip...>
Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 6939 RUNNING AT ba173
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions