
Duplicate Span, sometimes #3275

Open
Liubey opened this issue Apr 21, 2023 · 14 comments
Labels
bug Something isn't working

Comments

@Liubey

Liubey commented Apr 21, 2023

Steps to reproduce
When I use OpenTelemetry in Python 3, I sometimes get duplicate spans.
This code is used to create the tracer:

        import os

        from opentelemetry import trace, version
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
        from opentelemetry.sdk.resources import Resource
        from opentelemetry.sdk.trace import TracerProvider
        from opentelemetry.sdk.trace.export import BatchSpanProcessor, SimpleSpanProcessor, ConsoleSpanExporter
        from opentelemetry.propagate import set_global_textmap
        from opentelemetry.propagators.b3 import B3Format
        set_global_textmap(B3Format())
        os.environ.setdefault(u'OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT', u'10240')
        resource = Resource(attributes={
            PROCESS_IP_TAG_KEY: local_ip(),
            PROCESS_HOSTNAME_TAG_KEY: host_name(),
            PROCESS_SERVICE_NAME_TAG_KEY: service_name,
            PROCESS_SDK_VERSION_TAG_KEY: u'Opentelemetry-Python-' + version.__version__
        })
        # https://opentelemetry.io/docs/instrumentation/python/exporters/
        trace.set_tracer_provider(TracerProvider(resource=resource))
        otlp_exporter = OTLPSpanExporter(endpoint=u'http://localhost:6831', insecure=True)
        span_processor = BatchSpanProcessor(otlp_exporter)
        trace.get_tracer_provider().add_span_processor(span_processor)
        if is_tracing_print_log():
            trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))

        tracer_implements = trace.get_tracer(__name__)

This code is used to create a span:

        from opentelemetry import trace, baggage
        exist_active_span = trace.get_current_span()
        span = self.tracer_implements.start_span(name, context=parent_span_context)
        span.set_attribute(TAG_PRODUCT_FLAG, product_flag)
        scope = trace.use_span(span, end_on_exit=True)
        setattr(scope, u'__span__', span)
        scope.__enter__()
        return scope

This code is used to end the span:

        current_scope.__exit__(None, None, None)
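
For comparison, the same start/end sequence written with a with-block, which lets use_span pair the enter and exit automatically (a minimal sketch reusing the names from the snippets above):

        from opentelemetry import trace

        # Sketch: use_span as a context manager, so the span is entered once and
        # ended exactly once when the block exits (end_on_exit=True).
        span = tracer_implements.start_span(name, context=parent_span_context)
        with trace.use_span(span, end_on_exit=True):
            span.set_attribute(TAG_PRODUCT_FLAG, product_flag)
            # ... traced work happens here ...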

Something that needs to be said: it is NOT 100% reproducible. When I restart my server the problem may disappear, but when I restart the server again it may come back.
I am sure the duplication happens at the reporting step, because the console exporter already prints the span twice.
In Jaeger it looks like this:
[image: Jaeger screenshot showing the duplicated spans]


@Liubey added the bug (Something isn't working) label Apr 21, 2023
@ocelotl
Contributor

ocelotl commented Jun 6, 2023

Is there a way for you to check if the span IDs are duplicated?
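
For example, a minimal sketch (the SpanIdLogger class name is just illustrative, not something provided by the SDK) that prints each ended span's IDs so duplicates are easy to spot:

    from opentelemetry import trace
    from opentelemetry.sdk.trace import SpanProcessor

    class SpanIdLogger(SpanProcessor):
        # Print every ended span's trace/span ID; a duplicated span_id will show up twice.
        def on_end(self, span):
            ctx = span.get_span_context()
            print("trace_id=%032x span_id=%016x name=%s" % (ctx.trace_id, ctx.span_id, span.name))

    trace.get_tracer_provider().add_span_processor(SpanIdLogger())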

@marctc

marctc commented Jul 18, 2023

I can confirm that span and trace IDs are duplicated. I'm testing this connector by sending traces from a Django application that uses the latest otel instrumentation library (0.39), and I receive this:

2023-07-18T12:18:01.543Z	debug	exceptionsconnector/connector_metrics.go:99	checking trace/span	{"kind": "exporter", "name": "exceptions", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics", "trace_id": "2bc89f693a428f16e44f9a1e63e3bc50", "span_id": "5a507db467f88147"}
2023-07-18T12:18:01.545Z	debug	exceptionsconnector/connector_metrics.go:99	checking trace/span	{"kind": "exporter", "name": "exceptions", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics", "trace_id": "2bc89f693a428f16e44f9a1e63e3bc50", "span_id": "5a507db467f88147"}

With previous versions (the last one I tried was 0.26) I didn't have this issue.

@marctc

marctc commented Jul 18, 2023

Doing some testing, I found that opentelemetry v1.12.0rc2 & v0.32b0 was the first release that included this bug.

@lzchen
Contributor

lzchen commented Sep 25, 2023

@Liubey @marctc
There have been a few releases since this issue was created. Can you confirm whether you are still getting duplicate spans?

@devdut1999

Hey. I'm also getting a similar issue. I'm seeing duplicate spans.

@aabmass
Member

aabmass commented Oct 26, 2023

This seems really hard to reproduce and the culprit could be in several different components. If someone could provide a full repro with a docker compose yaml setup, and pinned python dependencies, we could definitely investigate.

@marctc

marctc commented Oct 27, 2023

@Liubey @marctc There have been a few releases since this issue was created. Can you confirm whether you are still getting duplicate spans?

I can confirm the problem still persists 🤷

@marctc

marctc commented Oct 27, 2023

This seems really hard to reproduce and the culprit could be in several different components. If someone could provide a full repro with a docker compose yaml setup, and pinned python dependencies, we could definitely investigate.

this is my docker-compose:

version: "2"
services:
  api:
    depends_on:
      - otel-collector
    build: ./service/api/.
    command: opentelemetry-instrument uwsgi --http :8000 --module faulty.wsgi --master
    ports:
      - "8000:8000"
    environment:
      - DJANGO_SETTINGS_MODULE=api.settings
      - OTEL_RESOURCE_ATTRIBUTES=service.name=api
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
      - PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
      - OTEL_EXPORTER_OTLP_INSECURE=true

  otel-collector:
    build: ./otel-collector/.
    ports:
      - "6831:6831"
      - "14268:14268"
      - "4317:4317"
      - "4318:4318"
    volumes:
      - ./config/otel-collector.yaml:/config/otel-collector.yaml
    command: /bin/otelcol --config=/config/otel-collector.yaml

wsgi.py content:

import os

from django.core.wsgi import get_wsgi_application
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from uwsgidecorators import postfork

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'faulty.settings')

application = get_wsgi_application()


@postfork
def init_tracing():
    trace.set_tracer_provider(TracerProvider())
    trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(
        OTLPSpanExporter(endpoint=os.environ.get('OTEL_EXPORTER_OTLP_ENDPOINT'))
    ))
    trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(
        ConsoleSpanExporter()
    ))

requirements.txt content

opentelemetry-exporter-otlp==1.20.0
opentelemetry-instrumentation-django==v0.41b0
opentelemetry-distro==v0.41b0
Django==4.1
uWSGI==2.0.21

@aabmass
Member

aabmass commented Oct 31, 2023

@marctc this example is missing the Dockerfiles and the Django app? A GitHub gist or repo I can clone and just run docker compose up on would be easiest.

@marctc

marctc commented Nov 1, 2023

@marctc this example is missing the Dockerfiles and the Django app? A GitHub gist or repo I can clone and just run docker compose up on would be easiest.

There you go: https://github.com/marctc/django-otel-demo

@aabmass
Member

aabmass commented Nov 1, 2023

@marctc you are adding two OTLPSpanExporters to the TracerProvider, which is causing the duplicates. I added this to your wsgi.py:

diff --git a/service/faulty/faulty/wsgi.py b/service/faulty/faulty/wsgi.py
index 82eacb9..997d833 100644
--- a/service/faulty/faulty/wsgi.py
+++ b/service/faulty/faulty/wsgi.py
@@ -31,3 +31,7 @@ def init_tracing():
     trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(
         ConsoleSpanExporter()
     ))
+
+    print(
+        [sp.span_exporter for sp in trace.get_tracer_provider()._active_span_processor._span_processors]
+    )

and it prints out

[<opentelemetry.exporter.otlp.proto.grpc.trace_exporter.OTLPSpanExporter object at 0x7fba8dc19150>, <opentelemetry.exporter.otlp.proto.grpc.trace_exporter.OTLPSpanExporter object at 0x7fba74263e50>, <opentelemetry.sdk.trace.export.ConsoleSpanExporter object at 0x7fba744733d0>]

If you look at the Python startup logs, you'll see the warning "Overriding of current TracerProvider is not allowed". What's happening is that opentelemetry-instrument already sets up the global tracer provider for you with an OTLP exporter (the default). Then in wsgi.py, your trace.set_tracer_provider(TracerProvider()) call has no effect (because the global provider can't be overridden once set), and you end up just adding more exporters to the existing TracerProvider.

You should only use one of these two options for setting up your SDK:

  1. Use opentelemetry-instrument with appropriate envvars or command line flags
  2. Create a TracerProvider in your code and set it as global with opentelemetry.trace.set_tracer_provider()
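
If you go with option 2, here is a minimal sketch, assuming OTLP over gRPC and using an illustrative isinstance guard so the code does not re-configure a provider that opentelemetry-instrument (or other code) already installed:

    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

    # Only install an SDK TracerProvider if none is set yet; otherwise we would
    # just pile extra exporters onto the provider that is already configured.
    if not isinstance(trace.get_tracer_provider(), TracerProvider):
        provider = TracerProvider()
        provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
        trace.set_tracer_provider(provider)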

@Liubey
Author

Liubey commented Nov 9, 2023

@aabmass But I NEVER use opentelemetry-instrument.
And it's difficult to reproduce.
But I will watch for the 'Overriding of current TracerProvider is not allowed' log.

@aabmass
Member

aabmass commented Nov 9, 2023

@bazingarj

bazingarj commented Aug 7, 2024

We are observing the same issue; sometimes it is 20x, 8x, or 2x duplication. It's pretty random and is ruining the traces.
Collector version: 0.100.0
