
Fluentbit Kubernetes Merge_Log results in conflicting field types and rejection by elasticsearch #58

Closed
ricsanfre opened this issue Jul 28, 2022 · 2 comments · Fixed by #59
Labels
bug Something isn't working

Comments


ricsanfre commented Jul 28, 2022

Activating the Merge_Log capability in Fluent Bit's kubernetes filter causes some logs to be rejected by Elasticsearch. In my case, it is Elasticsearch's own log entries that are rejected.

Fluent-bit kubernetes filter configuration:

  [FILTER]
      Name kubernetes
      Match kube.*
      Kube_Tag_Prefix kube.var.log.containers.
      Merge_Log On
      Merge_Log_Trim Off
      Merge_Log_Key log_processed
      Keep_Log Off
      K8S-Logging.Parser On
      K8S-Logging.Exclude On
      Annotations Off
      Labels Off

Errors are printed by the fluentd aggregator (NOTE: fluentd is configured to trace errors received from the Elasticsearch API via the log_es_400_reason configuration parameter). The aggregator prints the following error: 400 - Rejected by Elasticsearch [error type]: mapper_parsing_exception [reason]: 'failed to parse field [log_processed.event] of type [text] in document with..

2022-07-28 10:06:38 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch [error type]: mapper_parsing_exception [reason]: 'failed to parse field [log_processed.event] of type [text] in document with id 'FlZFRIIBx-uCpR9Dw1hC'. Preview of field's value: '{dataset=elasticsearch.server}''" location=nil tag="kube.var.log.containers.efk-es-default-0_k3s-logging_elasticsearch-4718c3ca3bf9f531d5b6768c86ccb0ac8be225a10d808efcbf639271be548daf.log" time=2022-07-28 10:06:33.636543380 +0000 record={"time"=>"2022-07-28T12:06:33.63654338+02:00", "log_processed"=>{"@timestamp"=>"2022-07-28T10:06:33.635Z", "log.level"=>"INFO", "message"=>"[.kibana_8.1.2_001/TWOgIY07SMG_rsrY2lsnoQ] update_mapping [_doc]", "ecs.version"=>"1.2.0", "service.name"=>"ES_ECS", "event.dataset"=>"elasticsearch.server", "process.thread.name"=>"elasticsearch[efk-es-default-0][masterService#updateTask][T#1]", "log.logger"=>"org.elasticsearch.cluster.metadata.MetadataMappingService", "trace.id"=>"3160dd83044e7c2a51943a5081e29a15", "elasticsearch.cluster.uuid"=>"TWqBjJ-MRXetwJjQ2liV7Q", "elasticsearch.node.id"=>"qIMyAjJRQxKQM-kUEBmQtQ", "elasticsearch.node.name"=>"efk-es-default-0", "elasticsearch.cluster.name"=>"efk"}, "k8s.cluster.name"=>"picluster", "@timestamp"=>"2022-07-28T10:06:33.636543380+00:00", "tag"=>"kube.var.log.containers.efk-es-default-0_k3s-logging_elasticsearch-4718c3ca3bf9f531d5b6768c86ccb0ac8be225a10d808efcbf639271be548daf.log"}

The original Elasticsearch log entry is:

{"@timestamp": "2022-07-28T10:06:33.635Z", 
 "log.level": "INFO", 
"message":"[.kibana_8.1.2_001/TWOgIY07SMG_rsrY2lsnoQ] update_mapping [_doc]"
 "ecs.version": "1.2.0", 
"service.name":"ES_ECS", 
"event.dataset":"elasticsearch.server"}

In this case the merged log_processed.event field is a JSON object (it comes from the dotted event.dataset key) instead of the text field that Elasticsearch expects. The text mapping for this field was created earlier, when another log coming from MetalLB, which also contains an event field but as a plain string, was ingested first (see the illustration after the MetalLB log below).
MetalLB log:

{"caller":"service_controller.go:95",
 "controller":"ServiceReconciler",
 "event":"force service reload",
 "level":"info",
 "ts":"2022-07-28T09:45:27Z"
}
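
After Merge_Log parsing, both payloads end up nested under log_processed, and the two shapes of the event field clash (a simplified illustration based on the two logs above):

      # From the MetalLB log: log_processed.event is a plain string, so ES maps it as text
      "log_processed": {"event": "force service reload", "level": "info"}

      # From the Elasticsearch log: event.dataset forces log_processed.event to become an object
      "log_processed": {"event.dataset": "elasticsearch.server", "log.level": "INFO"}

Whichever record arrives first fixes the mapping of log_processed.event in the index; records with the other shape are then rejected with the mapper_parsing_exception shown above.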
ricsanfre added the bug label Jul 28, 2022
ricsanfre commented:

There is an open issue in the Fluent Bit repository addressing this very same problem: fluent/fluent-bit#4830. There, the use of a Lua-based filter is proposed to encode nested JSON maps (see the sketch below).
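
A minimal sketch of that approach, assuming the merged payload lives under log_processed as configured above. The script name, the cb_stringify_nested function and the hand-rolled encoder are my own illustration, not the code from the upstream issue:

    [FILTER]
        Name   lua
        Match  kube.*
        script stringify_nested.lua
        call   cb_stringify_nested

    -- stringify_nested.lua (sketch): turn nested tables under log_processed into
    -- plain strings so the field always has a single type in Elasticsearch.
    local function encode(value)
      if type(value) == "table" then
        local parts = {}
        for k, v in pairs(value) do
          parts[#parts + 1] = string.format("%q:%s", tostring(k), encode(v))
        end
        return "{" .. table.concat(parts, ",") .. "}"
      elseif type(value) == "string" then
        return string.format("%q", value)
      else
        return tostring(value)
      end
    end

    -- Fluent Bit Lua filter callback: return 1 to signal the record was modified.
    function cb_stringify_nested(tag, timestamp, record)
      local merged = record["log_processed"]
      if type(merged) ~= "table" then
        return 0, timestamp, record
      end
      for k, v in pairs(merged) do
        if type(v) == "table" then
          merged[k] = encode(v)
        end
      end
      return 1, timestamp, record
    end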

ricsanfre commented:

To solve this issue, a filter rule has been created in the aggregation layer (fluentd). This rule removes the log_processed field and moves its content to a new field, source_log.<container-name>, so field names are unique per container before ingestion into ES.

    <filter kube.**>
      @type record_transformer
      enable_ruby true
      remove_keys log_processed
      <record>
        source_log.${record["k8s.container.name"]} ${(record.has_key?('log_processed'))? record['log_processed'] : nil}
      </record>
    </filter>
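
Note that in record_transformer the <record> section is evaluated before remove_keys, so log_processed is still available when the new key is built. With this rule in place, the merged payload ends up under a per-container key, so logs from different containers can no longer collide on the same Elasticsearch field. For the Elasticsearch pod above (container name elasticsearch), the record would look roughly like this (a simplified illustration):

      "source_log.elasticsearch": {"event.dataset": "elasticsearch.server", "log.level": "INFO"}

MetalLB's logs land under their own source_log.<container> key, so their text-typed event field gets a separate mapping and no longer conflicts.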
