Resource Detectors produce blank Exception() after 5 seconds #3644

Open
jeremydvoss opened this issue Jan 22, 2024 · 4 comments · Fixed by #3645
Labels
bug Something isn't working

Comments

@jeremydvoss
Contributor

jeremydvoss commented Jan 22, 2024

Describe your environment
Resource detectors only get 5 seconds by default to run. However, instead of the process ending after 5 seconds, a blank "Exception()" is created, resulting in the following warning:
Exception in detector <opentelemetry.resource.detector.<RESOURCE DETECTOR CLASS> object at 0x000002B955D43D60>, ignoring
Note that the exception string is blank, which leaves two consecutive spaces in the message: Exception__in...
This blank exception is causing confusion among Azure customers. Perhaps more importantly, the process still hangs.
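The behavior described above can be reproduced with the standard library alone. The sketch below is not the SDK's code; it assumes the SDK waits on each detector with `future.result(timeout=...)` inside a `with ThreadPoolExecutor(...)` block and logs the caught exception with a format like the warning shown. `slow_detect` and `run_with_timeout` are hypothetical names used only for illustration.

```python
import concurrent.futures
import time


def slow_detect(duration):
    # Stand-in for a detector whose detect() blocks for `duration` seconds.
    time.sleep(duration)
    return {"detected": True}


def run_with_timeout(duration, timeout):
    """Return the warning text and the total time spent in the with block."""
    start = time.monotonic()
    message = None
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(slow_detect, duration)
        try:
            future.result(timeout=timeout)
        except Exception as ex:
            # A timeout raised by future.result() carries no message,
            # so formatting it yields an empty string.
            message = f"Exception {ex} in detector, ignoring"
    # Leaving the with block calls executor.shutdown(wait=True), which
    # blocks until slow_detect returns, even though the timeout already fired.
    elapsed = time.monotonic() - start
    return message, elapsed


if __name__ == "__main__":
    print(run_with_timeout(duration=2.0, timeout=0.5))
```

Under these assumptions the message contains two consecutive spaces, and the elapsed time tracks the sleep duration rather than the timeout, which matches both symptoms reported here.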

 
Steps to reproduce

  1. Create a resource detector that sleeps for more than 5 seconds (try 10-20):
from opentelemetry.sdk.resources import Resource, ResourceDetector

from time import sleep

class SleepingResourceDetector(ResourceDetector):
    # pylint: disable=no-self-use
    def detect(self) -> "Resource":
        sleep(20)
        attributes = {}
        return Resource(attributes)
  2. Set the resource detector entry point in pyproject.toml:
[project.entry-points.opentelemetry_resource_detector]
sleeping = "opentelemetry.resource.detector.sleeping:SleepingResourceDetector"
  3. Install the package to register the entry point.
  4. Point the SDK at the resource detector: export OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=sleeping
  5. Create a Resource: Resource.create(). This will eventually call get_aggregated_resources.

What is the expected behavior?
The "concurrent futures" setup should let the process exit gracefully and should not print a confusing blank warning.

What is the actual behavior?
The process hangs, and a confusing warning message with a blank error is logged.


@ocelotl
Contributor

ocelotl commented Jan 25, 2024

Reopening since #3645 did not fully fix this issue.

@aabmass
Member

aabmass commented Jan 25, 2024

This seems similar to #3309: when processors or exporters block, we have no way to cancel them.

@jeremydvoss
Contributor Author

jeremydvoss commented Jan 25, 2024

Some options for adding a timeout inside resource detectors:

  • Add a subtype such as ResourceDetectorWithTimeout
  • Add a unified resource detector timeout env var
  • Add a getTimeout() method to the base class that detectors can use but don't have to
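As a rough illustration of the second and third options combined, the sketch below gives a base class a `get_timeout()` method backed by one environment variable, and a detector that uses it to bound its own work. Everything here is hypothetical: `TimeoutAwareDetector`, `PollingDetector`, `fetch`, and the `EXAMPLE_RESOURCE_DETECTOR_TIMEOUT` variable are not part of the OpenTelemetry SDK, and plain dicts stand in for `Resource`.

```python
import os
import time


class TimeoutAwareDetector:
    """Hypothetical base class; not the SDK's ResourceDetector."""

    default_timeout = 5.0

    def get_timeout(self):
        # Hypothetical env var name, used here only for illustration.
        raw = os.environ.get("EXAMPLE_RESOURCE_DETECTOR_TIMEOUT")
        try:
            return float(raw) if raw is not None else self.default_timeout
        except ValueError:
            return self.default_timeout

    def detect(self):
        raise NotImplementedError


class PollingDetector(TimeoutAwareDetector):
    """Detector that polls a source and gives up at its own deadline."""

    def __init__(self, fetch):
        # `fetch` returns a value, or None if nothing is available yet.
        self._fetch = fetch

    def detect(self):
        deadline = time.monotonic() + self.get_timeout()
        while time.monotonic() < deadline:
            value = self._fetch()
            if value is not None:
                return {"example.attribute": value}
            time.sleep(0.01)
        # Deadline reached: return empty attributes instead of blocking.
        return {}
```

The point of this design is that the detector stops itself, so the caller never needs to cancel a running thread. It only helps for detectors whose blocking work can be split into steps or that pass the timeout to their own I/O calls.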

@jeremydvoss
Contributor Author

A couple more issues:

  • While detectors run in parallel, the timeout is applied sequentially, so the detectors waited on last can take the longest.
  • We wait for all detectors to finish before allowing the app to continue. Even if we let a timed-out detector finish, we should discard its result and let the app continue in parallel. @aabmass mentioned this may be possible by changing how the with block works. Will look through the docs.
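One possible shape for addressing both points, using only the standard library: wait on all futures against a single shared deadline with `concurrent.futures.wait`, then call `shutdown(wait=False)` instead of relying on the `with` block. This is a sketch, not the SDK's implementation; `detect_all` is a hypothetical helper and detectors are modeled as plain callables returning dicts.

```python
import concurrent.futures
import time


def detect_all(detect_fns, timeout):
    """Run detectors in parallel under one shared deadline.

    Returns the results that finished in time (in submission order) and
    the number of detectors still running when the deadline passed.
    """
    executor = concurrent.futures.ThreadPoolExecutor(
        max_workers=max(1, len(detect_fns))
    )
    futures = [executor.submit(fn) for fn in detect_fns]
    # A single wait() applies the timeout once to the whole group,
    # rather than once per detector in sequence.
    done, not_done = concurrent.futures.wait(futures, timeout=timeout)
    results = [
        future.result()
        for future in futures
        if future in done and future.exception() is None
    ]
    # Do not block the caller on stragglers; their results are discarded.
    executor.shutdown(wait=False)
    return results, len(not_done)


if __name__ == "__main__":
    def fast():
        return {"a": 1}

    def slow():
        time.sleep(2)
        return {"b": 2}

    print(detect_all([fast, slow], timeout=0.5))
```

Note that the straggler threads keep running in the background. As far as I know, `ThreadPoolExecutor` worker threads are still joined at interpreter exit, so this lets the app continue but does not by itself stop a blocked detector from delaying shutdown.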
