Add monitoring for FluentD plugins #108

Merged: 6 commits, Aug 1, 2019
2 changes: 1 addition & 1 deletion deploy/docker/Dockerfile
@@ -24,7 +24,7 @@ RUN gem install fluent-plugin-s3 -v 1.1.4 \
&& gem install fluent-plugin-sumologic_output -v 1.5.0 \
&& gem install fluent-plugin-concat -v 2.3.0 \
&& gem install fluent-plugin-rewrite-tag-filter -v 2.1.0 \
-&& gem install fluent-plugin-prometheus -v 1.1.0 \
+&& gem install fluent-plugin-prometheus -v 1.4.0 \
&& gem install fluent-plugin-kubernetes_sumologic -v 2.4.1

# FluentD plugins from this repository
33 changes: 32 additions & 1 deletion deploy/helm/prometheus-overrides.yaml
@@ -3,6 +3,31 @@ alertmanager:
grafana:
enabled: false
prometheus:
additionalServiceMonitors:
- name: fluentd
additionalLabels:
k8s-app: fluentd-sumologic
release: prometheus-operator
endpoints:
- port: metrics
namespaceSelector:
matchNames:
- sumologic
selector:
matchLabels:
k8s-app: fluentd-sumologic
- name: fluentd-events
additionalLabels:
k8s-app: fluentd-sumologic-events
release: prometheus-operator
endpoints:
- port: metrics
namespaceSelector:
matchNames:
- sumologic
selector:
matchLabels:
k8s-app: fluentd-sumologic-events
prometheusSpec:
externalLabels:
# Set this to a value to distinguish between different k8s clusters
@@ -264,7 +289,13 @@ prometheus:
- action: keep
regex: 'prometheus_remote_storage_.*'
sourceLabels: [__name__]
-# fluent-bit metrics
+# fluentd metrics
+- url: http://fluentd:9888/prometheus.metrics
+writeRelabelConfigs:
+- action: keep
+regex: 'fluentd_.*'
+sourceLabels: [__name__]
+# fluent-bit metrics
- url: http://fluentd:9888/prometheus.metrics
writeRelabelConfigs:
- action: keep
106 changes: 87 additions & 19 deletions deploy/kubernetes/fluentd-sumologic.yaml.tmpl
@@ -51,8 +51,18 @@ metadata:
name: fluentd-config
data:
fluent.conf: |-
@include common.conf
@include metrics.conf
@include logs.conf
common.conf: |-
<source>
@type prometheus
metrics_path /metrics
port 24231
</source>
<source>
@type prometheus_output_monitor
</source>
metrics.conf: |-
<source>
@type http
@@ -86,36 +96,43 @@ data:
</filter>
<match prometheus.datapoint.apiserver**>
@type sumologic
@id sumologic.endpoint.metrics.apiserver
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_APISERVER']}"
@include metrics.output.conf
</match>
<match prometheus.datapoint.kubelet**>
@type sumologic
@id sumologic.endpoint.metrics.kubelet
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBELET']}"
@include metrics.output.conf
</match>
<match prometheus.datapoint.kube-controller-manager**>
@type sumologic
@id sumologic.endpoint.metrics.kube.controller.manager
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBE_CONTROLLER_MANAGER']}"
@include metrics.output.conf
</match>
<match prometheus.datapoint.kube-scheduler**>
@type sumologic
@id sumologic.endpoint.metrics.kube.scheduler
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBE_SCHEDULER']}"
@include metrics.output.conf
</match>
<match prometheus.datapoint.kube-state**>
@type sumologic
@id sumologic.endpoint.metrics.kube.state
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBE_STATE']}"
@include metrics.output.conf
</match>
<match prometheus.datapoint.node-exporter**>
@type sumologic
@id sumologic.endpoint.metrics.node.exporter
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_NODE_EXPORTER']}"
@include metrics.output.conf
</match>
<match prometheus.datapoint**>
@type sumologic
@id sumologic.endpoint.metrics
endpoint "#{ENV['SUMO_ENDPOINT_METRICS']}"
@include metrics.output.conf
</match>
@@ -187,8 +204,11 @@ data:
exclude_container_regex "#{ENV['EXCLUDE_CONTAINER_REGEX']}"
exclude_host_regex "#{ENV['EXCLUDE_HOST_REGEX']}"
</filter>

-@include logs.output.conf
+<match **>
+@type sumologic
+@id sumologic.endpoint.logs
+@include logs.output.conf
+</match>
</label>

logs.source.systemd.conf: |-
@@ -207,7 +227,11 @@ data:
exclude_priority_regex "#{ENV['EXCLUDE_PRIORITY_REGEX']}"
exclude_unit_regex "#{ENV['EXCLUDE_UNIT_REGEX']}"
</filter>
-@include logs.output.conf
+<match **>
+@type sumologic
+@id sumologic.endpoint.logs.kubelet
+@include logs.output.conf
+</match>
</label>
<match host.**>
@type relabel
@@ -229,22 +253,23 @@ data:
_sumo_metadata ${record["_sumo_metadata"][:source] = tag_parts[1]; record["_sumo_metadata"]}
</record>
</filter>
-@include logs.output.conf
+<match **>
+@type sumologic
+@id sumologic.endpoint.logs.systemd
+@include logs.output.conf
+</match>
</label>

logs.output.conf: |-
-<match **>
-@type sumologic
-data_type logs
-log_key log
-endpoint "#{ENV['SUMO_ENDPOINT_LOGS']}"
-verify_ssl "#{ENV['VERIFY_SSL']}"
-log_format "#{ENV['LOG_FORMAT']}"
-add_timestamp "#{ENV['ADD_TIMESTAMP']}"
-timestamp_key "#{ENV['TIMESTAMP_KEY']}"
-proxy_uri "#{ENV['PROXY_URI']}"
-@include buffer.output.conf
-</match>
+data_type logs
+log_key log
+endpoint "#{ENV['SUMO_ENDPOINT_LOGS']}"
+verify_ssl "#{ENV['VERIFY_SSL']}"
+log_format "#{ENV['LOG_FORMAT']}"
+add_timestamp "#{ENV['ADD_TIMESTAMP']}"
+timestamp_key "#{ENV['TIMESTAMP_KEY']}"
+proxy_uri "#{ENV['PROXY_URI']}"
+@include buffer.output.conf

buffer.output.conf: |-
<buffer>
@@ -313,7 +338,7 @@ spec:
- "/bin/sh"
- "-c"
- "[[ $( pgrep ruby | wc -l) == 2 ]]"
-initialDelaySeconds: 45
+initialDelaySeconds: 30
periodSeconds: 5
volumeMounts:
- name: config-volume
@@ -414,19 +439,28 @@ data:
fluent.conf: |-
@include events.conf
events.conf: |-
<source>
@type prometheus
metrics_path /metrics
port 24231
</source>
<source>
@type prometheus_output_monitor
</source>
<source>
@type events
deploy_namespace $NAMESPACE
</source>
<match kubernetes.**>
@type sumologic
@id sumologic.endpoint.events
endpoint "#{ENV['SUMO_ENDPOINT_EVENTS']}"
data_type logs
disable_cookies true
verify_ssl "#{ENV['VERIFY_SSL']}"
proxy_uri "#{ENV['PROXY_URI']}"
<buffer>
-@type memory
+@type memory
compress gzip
flush_interval "#{ENV['FLUSH_INTERVAL']}"
flush_thread_count "#{ENV['NUM_THREADS']}"
@@ -475,6 +509,22 @@ spec:
mountPath: /fluentd/etc/
- name: pos-files
mountPath: /mnt/pos/
livenessProbe:
exec:
command:
- "/bin/sh"
- "-c"
- "[[ $( pgrep ruby | wc -l) == 2 ]]"
initialDelaySeconds: 300
periodSeconds: 20
readinessProbe:
exec:
command:
- "/bin/sh"
- "-c"
- "[[ $( pgrep ruby | wc -l) == 2 ]]"
initialDelaySeconds: 30
periodSeconds: 5
env:
- name: SUMO_ENDPOINT_EVENTS
valueFrom:
@@ -491,7 +541,6 @@ spec:
value: "100k"
- name: TOTAL_LIMIT_SIZE
value: "128m"

---
apiVersion: v1
kind: Service
@@ -511,4 +560,23 @@ spec:
port: 24321
targetPort: 24321
protocol: TCP
- name: metrics
port: 24231
targetPort: 24231
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
name: fluentd-events
labels:
k8s-app: fluentd-sumologic-events
spec:
selector:
k8s-app: fluentd-sumologic-events
ports:
- name: metrics
port: 24231
targetPort: 24231
Review comment (Contributor):

Not needed for now, but we should look at using named ports for the targetPort so we don't have to update both the Pod and the Service each time.

protocol: TCP
---
4 changes: 3 additions & 1 deletion fluent-plugin-protobuf/lib/fluent/plugin/parser_protobuf.rb
@@ -4,6 +4,8 @@
require_relative '../../types_pb'
require_relative '../../remote_pb'

WriteRequest = Prometheus::WriteRequest
Review comment (Contributor):

Do you really need this change?

Reply (Contributor Author):

Yes. Otherwise our module Prometheus conflicts with the fluent-plugin-prometheus plugin's module Prometheus, and our parsing breaks.
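
The name clash described in the reply above can be reproduced in isolation. What follows is a simplified sketch, not the plugin's actual code. It assumes the other plugin defines a module named Prometheus nested under Fluent::Plugin, so a bare `Prometheus` inside the parser class resolves lexically to that nested module instead of the top-level one from the generated protobuf code. `DemoParser`, its three methods, and the toy `decode` are hypothetical stand-ins.

```ruby
# Stand-in for the generated protobuf code: a top-level Prometheus module.
# (Hypothetical and simplified; the real class comes from remote_pb.)
module Prometheus
  class WriteRequest
    def self.decode(bytes)
      "decoded #{bytes.bytesize} bytes"
    end
  end
end

# Top-level alias, bound while `Prometheus` still means the module above.
WriteRequest = Prometheus::WriteRequest

module Fluent
  module Plugin
    # Assumption: another plugin contributes a module with the same name here.
    module Prometheus
    end

    # Hypothetical parser used only to show the three lookup styles.
    class DemoParser
      # Resolves lexically to Fluent::Plugin::Prometheus, which has no
      # WriteRequest constant, so this raises NameError.
      def shadowed(bytes)
        Prometheus::WriteRequest.decode(bytes)
      end

      # Uses the top-level alias, which is unaffected by the nested module.
      def aliased(bytes)
        WriteRequest.decode(bytes)
      end

      # Alternative: a leading :: forces lookup from the top level.
      def absolute(bytes)
        ::Prometheus::WriteRequest.decode(bytes)
      end
    end
  end
end
```

Under these assumptions, writing `::Prometheus::WriteRequest` inside the class would be an alternative to the top-level alias. Both pin the lookup to the top-level module.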


module Fluent
module Plugin
# fluentd parser plugin to parse Prometheus metrics into timeseries events.
@@ -14,7 +16,7 @@ class ProtobufParser < Fluent::Plugin::Parser

def parse(text)
inflated = Snappy.inflate(text)
-decoded = Prometheus::WriteRequest.decode(inflated)
+decoded = WriteRequest.decode(inflated)
log.trace "protobuf::parse - in: (#{text.bytesize}/#{inflated.bytesize}), out: #{decoded.timeseries.length}"
record = {}
record[KEY_TIMESERIES] = decoded.timeseries