Some developers working in the Elasticsearch repository have reported intermittent crashes of the machine learning controller process when running a locally built Elasticsearch. The usual symptom is that Elasticsearch fails to start and the log contains this message:
[ERROR][o.e.b.Elasticsearch ] [runTask-0] fatal exception while booting Elasticsearch org.elasticsearch.ElasticsearchException: Failure running machine learning native code. This could be due to running on an unsupported OS or distribution, missing OS libraries, or a problem with the temp directory. To bypass this problem by running Elasticsearch without machine learning functionality set [xpack.ml.enabled: false].
A crash report can be found in the macOS Console app.
Path: /Users/USER/*/controller.app/Contents/MacOS/controller
Identifier: co.elastic.ml-cpp.controller
Version: 8.7.0
Code Type: ARM-64 (Native)
Exception Type: EXC_BAD_ACCESS (SIGKILL (Code Signature Invalid))
Exception Subtype: UNKNOWN_0x32 at 0x000000010249c000
Exception Codes: 0x0000000000000032, 0x000000010249c000
VM Region Info: 0x10249c000 is in 0x10249c000-0x1024b0000; bytes after start: 0 bytes before end: 81919
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
---> mapped file 10249c000-1024b0000 [ 80K] r-x/r-x SM=COW ...t_id=787f689f
mapped file 1024b0000-1024b4000 [ 16K] rw-/rw- SM=COW ...t_id=787f689f
Exception Note: EXC_CORPSE_NOTIFY
Termination Reason: CODESIGNING 2
The error has been observed on Apple silicon only (so far).
Reproducing
It has not been possible to reproduce the crash reliably, but once the problem occurs, running controller --help in the Elasticsearch repository generates a crash report.
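The reproduction step can be scripted so that it works for any locally built version. The following is a minimal sketch, assuming the install layout seen in the paths quoted in this issue; `controller_path` and `try_controller` are hypothetical helper names, not part of the Elasticsearch build.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; the directory layout is
# taken from the paths quoted in this issue.

# Print the controller binary path (relative to the repo root) for a version.
controller_path() {
  local version="$1"   # e.g. 8.7.0-SNAPSHOT
  printf '%s\n' "distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-${version}/modules/x-pack-ml/platform/darwin-aarch64/controller.app/Contents/MacOS/controller"
}

# Run the given binary with --help and report how it exited.
# An exit status above 128 means the process was killed by a signal;
# for example 137 is 128 + 9 (SIGKILL), the signal named in the crash report.
try_controller() {
  local bin="$1" status=0
  "$bin" --help >/dev/null 2>&1 || status=$?
  if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "exited with status $status"
  fi
  return 0
}

# Usage, from the repository root:
#   try_controller "$(controller_path 8.7.0-SNAPSHOT)"
```

Reporting the signal number rather than only the raw exit status makes it easier to tell a code signing kill apart from an ordinary startup failure.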
When security is enabled the spctl --assess function returns the same message as codesign --verify
cd <ES_REPO>
sudo spctl --assess -vv ./distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app/Contents/MacOS/controller
./distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app/Contents/MacOS/controller: code has no resources but signature indicates they must be present
echo $?
1
Code Signing
Maybe.
The crash report indicates code signing is involved (Termination Reason: CODESIGNING 2), and verifying the signature returns an error message:
cd <ES_REPO>
codesign -d --verify --verbose=4 ./distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app
./distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app/Contents/MacOS/controller: code has no resources but signature indicates they must be present
It is not clear whether that error is fatal, however.
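One way to make the verification result easier to interpret is to capture the exit status of codesign together with its message, as was done with echo $? for spctl. The following is a minimal sketch; `check_signature` is a hypothetical helper name, and it uses only the --verify and --verbose options already shown in this issue.

```shell
# Sketch: hypothetical helper, not part of the Elasticsearch build.
# Verifies a bundle's signature and prints a one-line verdict.
check_signature() {
  local app="$1" output status=0
  # codesign writes its diagnostics to stderr, so capture both streams.
  output="$(codesign --verify --verbose=4 "$app" 2>&1)" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "signature OK"
  else
    echo "signature check failed (exit $status): $output"
  fi
}

# Usage, from the repository root (replace the version with yours):
#   check_signature distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app
```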
Workarounds
In the commands below replace elasticsearch-8.7.0-SNAPSHOT with your version.
Deleting the bundled app from the local build and rebuilding is the most reliable fix:
cd <ES_REPO>
rm -rf distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app
./gradlew run
Re-signing the app with an ad-hoc signature works for some developers.
--sign - means use an ad-hoc identity. This is a workaround for a local development machine only.
From the codesign man page:
If identity is the single letter "-" (dash), ad-hoc signing is performed.
Ad-hoc signing does not use an identity at all, and identifies exactly
one instance of code. Significant restrictions apply to the use of ad-hoc
signed code; consult documentation before using this.
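The exact re-signing command is not preserved in this text. As a rough sketch, assuming the standard codesign options --force (replace an existing signature) and --sign - (ad-hoc identity, as described in the man page excerpt), it could look like the following; `resign_adhoc` is a hypothetical helper name.

```shell
# Sketch: hypothetical helper, intended for a local development machine only.
# Replaces the existing signature on the bundle with an ad-hoc one.
resign_adhoc() {
  local app="$1"
  # --force replaces an existing signature; "--sign -" selects ad-hoc signing.
  codesign --force --sign - "$app"
}

# Usage, from the repository root (replace the version with yours):
#   resign_adhoc distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.7.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app
```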
Almost exactly 1 year later this problem has returned.
The easiest way to resolve the problem is still to delete controller.app and rebuild:
cd <ES_REPO>
rm -rf distribution/archives/darwin-aarch64-tar/build/install/elasticsearch-8.13.0-SNAPSHOT/modules/x-pack-ml/platform/darwin-aarch64/controller.app
./gradlew run
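Because the version in the path changes between releases (8.7.0-SNAPSHOT earlier, 8.13.0-SNAPSHOT here), the delete step can be written without hard-coding it. This is a minimal sketch, assuming the install layout shown in the commands above; `find_controller_apps` and `remove_controller_apps` are hypothetical helper names.

```shell
# Sketch: hypothetical helpers, assuming the install layout shown above.

# Print every locally built controller.app under the repo, whatever the version.
find_controller_apps() {
  local repo="$1"
  # -prune stops find from descending into a bundle once it has matched.
  find "$repo/distribution/archives/darwin-aarch64-tar/build/install" \
    -type d -name controller.app -prune -print 2>/dev/null
}

# Delete each bundle found; the next ./gradlew run recreates it.
remove_controller_apps() {
  local repo="$1" app
  find_controller_apps "$repo" | while IFS= read -r app; do
    echo "removing $app"
    rm -rf "$app"
  done
}

# Usage, from the repository root:
#   remove_controller_apps . && ./gradlew run
```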
Running the app from a different location works ?!
Copying the app to a folder in the home directory and running the copy does not result in a crash.
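The copy test described above can be sketched as follows. The original commands are not preserved in this text, so the destination directory and the helper name `copy_app` are assumptions made for illustration.

```shell
# Sketch: hypothetical helper; the destination directory is an assumption.
# Copies the bundle and prints the path of the copied controller binary.
copy_app() {
  local app="$1" dest="$2"
  mkdir -p "$dest"
  # -R copies the bundle directory recursively.
  cp -R "$app" "$dest/"
  printf '%s\n' "$dest/$(basename "$app")/Contents/MacOS/controller"
}

# Usage (pass the bundle path without a trailing slash):
#   "$(copy_app path/to/controller.app "$HOME/controller-test")" --help
```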
Possible Causes
macOS Quarantine
No.
The downloaded controller.app does not have the quarantine attribute set.
find . -xattrname com.apple.quarantine
returns nothing.
Security Policy
No.
After disabling security with
sudo spctl --global-disable
the controller app still crashes.
When security is enabled the spctl --assess function returns the same message as codesign --verify