Should messaging.kafka.message.offset be in the message namespace? #1156

Closed
lmolkova opened this issue Jun 13, 2024 · 5 comments · Fixed by #1245

Comments

@lmolkova
Contributor

lmolkova commented Jun 13, 2024

Kafka settlement is offset-based.

When reporting settlement, we'd want to include messaging.kafka.message.offset on the settlement span, as shown in #1155. However, that span does not have to be reported per message: it can be reported periodically in the background, for every 1000th message, and so on.
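To make the "not per message" point concrete, here is a minimal sketch in plain Python. It does not use the OpenTelemetry API: `SettlementReporter`, its `spans` list, and the span name `commit` are hypothetical stand-ins for a real tracer and exporter. It records one settlement span for every Nth message and attaches the offset of the message that triggered it.

```python
class SettlementReporter:
    """Hypothetical helper: records a settlement span for every Nth message."""

    def __init__(self, every: int = 1000) -> None:
        self.every = every
        self.seen = 0
        self.spans: list[dict] = []  # stand-in for spans handed to an exporter

    def on_message(self, offset: int) -> None:
        self.seen += 1
        # Only every Nth message produces a settlement span.
        if self.seen % self.every == 0:
            self.spans.append({
                "name": "commit",  # assumed span name, not taken from the issue
                "attributes": {"messaging.kafka.message.offset": offset},
            })


reporter = SettlementReporter(every=3)
for offset in range(10, 17):  # offsets 10..16, seven messages
    reporter.on_message(offset)

settled = [s["attributes"]["messaging.kafka.message.offset"] for s in reporter.spans]
print(settled)
```

In this sketch the offset on the span describes how far settlement has progressed, not a single message, which is the mismatch with the message namespace raised here.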

It looks odd to have the offset attribute in the message namespace, since the offset is not just a message property but also a concept in its own right. Someone may also want to create a metric for the offset (e.g. latest published offset vs. latest consumed offset would show the size of the queue). In that case it's rather a property of the topic/consumer group.
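As a rough illustration of the metric idea, the sketch below computes a per-partition queue size from two offset maps. The function name `queue_size` and the dict-based inputs are assumptions made for this example and do not come from any Kafka client API. A partition with no consumed offset is treated as if nothing has been consumed yet.

```python
def queue_size(latest_published: dict[int, int],
               latest_consumed: dict[int, int]) -> dict[int, int]:
    """Per-partition backlog: latest published offset minus latest consumed offset."""
    backlog = {}
    for partition, published in latest_published.items():
        # Assumption: -1 means "nothing consumed yet", so offset 0 counts as pending.
        consumed = latest_consumed.get(partition, -1)
        backlog[partition] = max(published - consumed, 0)
    return backlog


print(queue_size({0: 41, 1: 9}, {0: 30}))
```

Both inputs are keyed by partition of a topic (and, for the consumed side, by consumer group), not by message, which matches the observation that the offset here is a property of the topic/consumer group.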

Suggesting messaging.kafka.offset, or even the more generic messaging.offset, since the offset concept is common in the messaging world.

@pyohannes
Contributor

I'm in favor of making this a generic attribute (outside the kafka namespace).

I have no strong opinion on whether it should be messaging.offset or messaging.message.offset. I'd have a slight preference for messaging.offset because it's shorter.

@lmolkova
Contributor Author

Related: #1036

@lmolkova
Contributor Author

lmolkova commented Jun 21, 2024

Did some research on the terminology used by different messaging systems:

  • Kafka
    • ✅ uses it in documentation
    • ✅ exposes in public APIs
    • ✅ uses it as a settlement mechanism
  • Azure Event Hubs
    • ✅ uses it in documentation
    • ✅ exposes in public APIs
    • ✅ uses it as a settlement mechanism
  • Azure Service Bus
    • ❌ does not have a notion of offset.
    • has a notion of sequence number
    • uses lock tokens to settle messages
  • Apache RocketMQ
    • ✅ uses it in documentation
    • ❌ does not expose it in public APIs
    • ❌ does not use it as a settlement mechanism (uses receipt handles)
  • RabbitMQ (Streams, not queues)
    • ✅ uses it in documentation
    • 🟨 exposes it as a message property message.properties.headers["x-stream-offset"]
    • ❌ does not use it as a settlement mechanism (uses delivery tags)
  • AWS SQS
    • ❌ does not have a notion of offset or sequence number
    • uses receipt handle to settle messages
  • AWS SNS
    • ❌ does not have a notion of offset or sequence number
  • GCP Pub/Sub
    • ❌ does not have a notion of offset or sequence number
    • uses ack_id to settle
  • Apache Pulsar
  • Nats.io
    • ❌ does not have a notion of offset
    • has sequence number
    • settles with reply_to

@lmolkova
Contributor Author

lmolkova commented Jun 21, 2024

Based on this, I would prefer system-specific attributes: messaging.kafka.offset, messaging.eventhubs.offset (future), and messaging.rabbitmq.offset (if needed).
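A small sketch of what per-system attribute names could look like from an instrumentation's point of view. The lookup table, the `offset_attributes` helper, and the system keys used to index it are assumptions for illustration. Only the three attribute names come from this comment, and two of them are marked there as future or optional.

```python
# Attribute names as proposed in this comment; the dict keys are assumed system identifiers.
OFFSET_ATTRIBUTE_BY_SYSTEM = {
    "kafka": "messaging.kafka.offset",
    "eventhubs": "messaging.eventhubs.offset",  # proposed as "future"
    "rabbitmq": "messaging.rabbitmq.offset",    # proposed as "if needed"
}


def offset_attributes(system: str, offset: int) -> dict[str, int]:
    """Return the offset attribute for systems that have one, else an empty dict."""
    key = OFFSET_ATTRIBUTE_BY_SYSTEM.get(system)
    return {key: offset} if key is not None else {}


print(offset_attributes("kafka", 42))
print(offset_attributes("aws_sqs", 42))  # no notion of offset, so nothing is set
```

Systems without a notion of offset simply get no attribute, rather than a generic messaging.offset that would not apply to them.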

@pyohannes
Contributor

We discussed this in the workgroup meeting and will go with the proposal from @lmolkova in the previous comment.

Projects
Status: V1 - Stable Semantics

3 participants