[WIP] Use memory_limiter and better batch settings #488

Merged (5 commits) on Mar 11, 2020
38 changes: 34 additions & 4 deletions deploy/helm/sumologic/values.yaml
@@ -518,7 +518,7 @@ otelcol:
memBallastSizeMib: "683"
image:
name: "sumologic/opentelemetry-collector"
tag: "0.2.6.5"
tag: "0.2.6.6"
pullPolicy: IfNotPresent
config:
receivers:
@@ -533,8 +533,11 @@ otelcol:
zipkin:
endpoint: "0.0.0.0:9411"
processors:
# Tags spans with K8S metadata, based on the context IP
k8s_tagger:
# When true, only IP is assigned and passed (so it could be tagged on another collector)
passthrough: false
# Extracted fields and assigned names
extract:
metadata:
# extract the following well-known metadata fields
@@ -574,26 +577,53 @@ otelcol:
labels:
- tag_name: pod_label_%s
key: "*"

# The memory_limiter processor is used to prevent out of memory situations on the collector.
memory_limiter:
# check_interval is the time between measurements of memory usage for the
# purposes of avoiding going over the limits. Defaults to zero, so no
# checks will be performed. Values below 1 second are not recommended since
# they can result in unnecessary CPU consumption.
check_interval: 5s

# Maximum amount of memory, in MiB, targeted to be allocated by the process heap.
# Note that typically the total memory usage of the process will be about 50MiB
# higher than this value.
limit_mib: 1900
Contributor comment (on limit_mib):
We set 2048 (2Gi) as a limit so this should be OK.
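For context, a minimal sketch of how these numbers are expected to line up (the resources block below is illustrative, not part of this diff; memBallastSizeMib is the chart default shown above):

```yaml
otelcol:
  # ballast allocated up front to reduce GC churn (chart default shown above)
  memBallastSizeMib: "683"
  resources:
    limits:
      # container limit referenced in the comment: 2Gi = 2048 MiB, so
      # limit_mib (1900) plus the ~50 MiB of non-heap overhead stays below it
      memory: 2Gi
```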


# The queued_retry processor uses a bounded queue to relay batches from the receiver or previous
# processor to the next processor.
queued_retry:
# Number of workers that dequeue batches
num_workers: 16
# Maximum number of batches kept in memory before data is dropped
queue_size: 10000
# Whether to retry on failure or give up and drop
retry_on_failure: true

# The batch processor accepts spans and places them into batches grouped by node and resource
batch:
send_batch_size: 1024
# Number of spans after which a batch will be sent regardless of time
send_batch_size: 256
# Number of tickers that loop over batch buckets
num_tickers: 10
# Time duration after which a batch will be sent regardless of size
timeout: 5s
extensions:
health_check: {}
exporters:
zipkin:
url: "exporters.zipkin.url_replace"
# Following generates verbose logs with span content, useful to verify what
# metadata is being tagged. To enable, uncomment and add "logging" to exporters below
# metadata is being tagged. To enable, uncomment and add "logging" to exporters below.
# There are two levels that could be used: `debug` and `info` with the former
# being much more verbose and including (sampled) spans content
# logging:
# loglevel: debug
service:
extensions: [health_check]
pipelines:
traces:
receivers: [jaeger, zipkin, opencensus]
processors: [k8s_tagger, batch, queued_retry]
processors: [memory_limiter, k8s_tagger, batch, queued_retry]
exporters: [zipkin]
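Since memory_limiter is placed first in the pipeline, it can drop or refuse data before the other processors allocate more memory on top of it. If these defaults need tuning for a particular cluster, they can be overridden with a custom values file rather than by editing the chart; a minimal, illustrative sketch (the specific numbers and the optional logging exporter are examples, not values from this PR):

```yaml
# custom-values.yaml (illustrative override, merged over the chart defaults)
otelcol:
  config:
    processors:
      memory_limiter:
        # lower the heap target if the container memory limit is reduced
        limit_mib: 1500
      batch:
        # send smaller batches more frequently
        send_batch_size: 128
        timeout: 2s
    exporters:
      # optional verbose exporter described in the comments above
      logging:
        loglevel: debug
    service:
      pipelines:
        traces:
          exporters: [zipkin, logging]
```

Such a file would be passed with `-f custom-values.yaml` on `helm install` or `helm upgrade`.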