OpenSearch container exits on startup due to model deployment failure #674

Open
yuvalsp-pelles opened this issue Aug 13, 2024 · 1 comment

@yuvalsp-pelles

Description

The OpenSearch container is exiting with code 1 shortly after startup, preventing the system from fully initializing. The error occurs during the model setup phase, specifically when deploying the TEXT_SIMILARITY model.

Error Message

ERROR:root:Unrecognized error message: Cannot invoke "java.lang.Integer.intValue()" because "totalChunks" is null
ERROR: Failed to setup models

Steps to Reproduce

  1. Run docker compose up
  2. Observe OpenSearch container logs
  3. Container exits before fully initializing
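
To make step 2 easier when several services interleave their output, here is a minimal sketch that keeps only the WARNING and ERROR lines from one service. It assumes the "service | message" prefix seen in the log in this issue; the function name `failures_for` and the sample list are hypothetical, not part of the project.

```python
import re

# Matches the "<service> | <message>" prefix that docker compose puts on each line.
LINE_RE = re.compile(r"^(?P<service>[\w.-]+)\s*\|\s?(?P<message>.*)$")


def failures_for(service, lines):
    """Return the WARNING/ERROR messages emitted by one compose service."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group("service") != service:
            continue  # other service, or a line without the prefix
        message = match.group("message")
        if message.startswith(("WARNING", "ERROR")):
            hits.append(message)
    return hits


# Sample lines copied from the log in this issue.
sample = [
    "opensearch-1 | CLUSTER SETTINGS SET",
    "demo-ui-1 | No issues found.",
    'opensearch-1 | ERROR:root:Unrecognized error message: Cannot invoke '
    '"java.lang.Integer.intValue()" because "totalChunks" is null',
    "opensearch-1 | ERROR: Failed to setup models",
    "opensearch-1 exited with code 1",
]

for message in failures_for("opensearch-1", sample):
    print(message)
```

Lines without the prefix, such as the final "exited with code 1" line, are skipped rather than treated as errors.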

Environment

  • macOS: 14.4 (Apple M3 Pro)
  • OpenSearch Version: 2.12.0

Additional Details

  • The error occurs during the deployment of the TEXT_SIMILARITY model
  • The 'totalChunks' variable is unexpectedly null, causing a NullPointerException
  • The container successfully performs initial

Complete Log:
opensearch-1 | Waiting for opensearch to start... Sleeping. Try 7/300
demo-ui-1 | No issues found.
opensearch-1 | Waiting for opensearch to start... Sleeping. Try 8/300
opensearch-1 | Waiting for opensearch to start... Sleeping. Try 9/300
importer-1 | No changes at 2024-08-13 08:08:41.489388 sleeping
opensearch-1 | Waiting for opensearch to start... Success
opensearch-1 | **************************************************************************
opensearch-1 | ** This tool will be deprecated in the next major release of OpenSearch **
opensearch-1 | ** opensearch-project/security#1755 **
opensearch-1 | **************************************************************************
opensearch-1 | Security Admin v7
opensearch-1 | Will connect to localhost:9200 ... done
opensearch-1 | Connected as "CN=Admin,O=Aryn.ai,ST=California,C=US"
opensearch-1 | OpenSearch Version: 2.12.0
opensearch-1 | Contacting opensearch cluster 'opensearch' and wait for YELLOW clusterstate ...
opensearch-1 | Clustername: opensearch
opensearch-1 | Clusterstate: GREEN
opensearch-1 | Number of nodes: 1
opensearch-1 | Number of data nodes: 1
opensearch-1 | .opendistro_security index already exists, so we do not need to create one.
opensearch-1 | Populate config from /usr/share/opensearch/config/opensearch-security
opensearch-1 | Will update '/config' with config/opensearch-security/config.yml
opensearch-1 | SUCC: Configuration for 'config' created or updated
opensearch-1 | Will update '/roles' with config/opensearch-security/roles.yml
opensearch-1 | SUCC: Configuration for 'roles' created or updated
opensearch-1 | Will update '/rolesmapping' with config/opensearch-security/roles_mapping.yml
opensearch-1 | SUCC: Configuration for 'rolesmapping' created or updated
opensearch-1 | Will update '/internalusers' with config/opensearch-security/internal_users.yml
opensearch-1 | SUCC: Configuration for 'internalusers' created or updated
opensearch-1 | Will update '/actiongroups' with config/opensearch-security/action_groups.yml
opensearch-1 | SUCC: Configuration for 'actiongroups' created or updated
opensearch-1 | Will update '/tenants' with config/opensearch-security/tenants.yml
opensearch-1 | SUCC: Configuration for 'tenants' created or updated
opensearch-1 | Will update '/nodesdn' with config/opensearch-security/nodes_dn.yml
opensearch-1 | SUCC: Configuration for 'nodesdn' created or updated
opensearch-1 | Will update '/whitelist' with config/opensearch-security/whitelist.yml
opensearch-1 | SUCC: Configuration for 'whitelist' created or updated
opensearch-1 | Will update '/audit' with config/opensearch-security/audit.yml
opensearch-1 | SUCC: Configuration for 'audit' created or updated
opensearch-1 | Will update '/allowlist' with config/opensearch-security/allowlist.yml
opensearch-1 | SUCC: Configuration for 'allowlist' created or updated
opensearch-1 | SUCC: Expected 10 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","internalusers","actiongroups","config"],"updated_config_size":10,"message":null} is 10 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","internalusers","actiongroups","config"]) due to: null
opensearch-1 | Done with success
opensearch-1 | Waiting for opensearch on ssl... Success
opensearch-1 | CLUSTER SETTINGS SET
opensearch-1 | INFO:root:ARYN MODEL GROUP ID: QwLTQJEBS9XVaW7rBoNz
opensearch-1 | INFO:root:>EMBEDDING MODEL ID: RQLTQJEBS9XVaW7rB4Mh
opensearch-1 | WARNING:root:Error detected: {'model_id': 'SALTQJEBS9XVaW7raIMd', 'task_type': 'DEPLOY_MODEL', 'function_name': 'TEXT_SIMILARITY', 'state': 'FAILED', 'worker_node': ['hLMxDOX6T0KJd7L7jbfKtg'], 'create_time': 1723536523656, 'last_update_time': 1723536523705, 'error': '{"hLMxDOX6T0KJd7L7jbfKtg":"Cannot invoke \"java.lang.Integer.intValue()\" because \"totalChunks\" is null"}', 'is_async': True}
opensearch-1 | ERROR:root:Unrecognized error message: Cannot invoke "java.lang.Integer.intValue()" because "totalChunks" is null
opensearch-1 | ERROR: Failed to setup models
opensearch-1 exited with code 1
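
The WARNING line above carries the failed task as a dict whose 'error' field appears to be a JSON string keyed by worker node ID. A minimal sketch of pulling the per-node message out of such a dict follows. The helper name `node_errors` is hypothetical, the dict is rebuilt from the values in the log, and the 'error' field is re-encoded with `json.dumps` because the exact escaping of the original string is an assumption.

```python
import json


def node_errors(task):
    """Map each worker node ID to its error message for a failed task."""
    if task.get("state") != "FAILED":
        return {}
    raw = task.get("error") or "{}"
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # The field was not JSON after all; keep the raw text.
        return {"unknown": raw}
    if not isinstance(parsed, dict):
        return {"unknown": str(parsed)}
    return parsed


message = 'Cannot invoke "java.lang.Integer.intValue()" because "totalChunks" is null'

# Values taken from the WARNING line in the log; timestamps omitted.
task = {
    "model_id": "SALTQJEBS9XVaW7raIMd",
    "task_type": "DEPLOY_MODEL",
    "function_name": "TEXT_SIMILARITY",
    "state": "FAILED",
    "worker_node": ["hLMxDOX6T0KJd7L7jbfKtg"],
    "error": json.dumps({"hLMxDOX6T0KJd7L7jbfKtg": message}),
}

for node, error in node_errors(task).items():
    print(f"{node}: {error}")
```

With a single worker node, as here, this yields one entry, which matches the message repeated in the ERROR line that follows the warning.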

@HenryL27
Collaborator

We see this from time to time... can you do the following:

$ docker compose up reset
$ NOEXIT=1 DEBUG=1 docker compose up
# in another terminal or something
$ docker compose exec -it opensearch sh -c "cat /usr/share/opensearch/data/aryn_status/opensearch.log"

and tell me what you get?
