
Fluentbit Kubernetes Merge_Log results in conflicting field types and rejection by elasticsearch #58

Closed
ricsanfre opened this issue Jul 28, 2022 · 2 comments · Fixed by #59
Labels
bug Something isn't working

Comments


ricsanfre commented Jul 28, 2022

Activating the Merge_Log capability in Fluent Bit's kubernetes filter causes some logs to be rejected by Elasticsearch. In my case, it is Elasticsearch's own log entries that are rejected.

Fluent-bit kubernetes filter configuration:

  [FILTER]
      Name kubernetes
      Match kube.*
      Kube_Tag_Prefix kube.var.log.containers.
      Merge_Log On
      Merge_Log_Trim Off
      Merge_Log_Key log_processed
      Keep_Log Off
      K8S-Logging.Parser On
      K8S-Logging.Exclude On
      Annotations Off
      Labels Off

Errors are printed by the fluentd aggregator (NOTE: fluentd is configured to trace errors received from the Elasticsearch API via the log_es_400_reason configuration parameter). The aggregator prints the following error: 400 - Rejected by Elasticsearch [error type]: mapper_parsing_exception [reason]: 'failed to parse field [log_processed.event] of type [text] in document with..

2022-07-28 10:06:38 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch [error type]: mapper_parsing_exception [reason]: 'failed to parse field [log_processed.event] of type [text] in document with id 'FlZFRIIBx-uCpR9Dw1hC'. Preview of field's value: '{dataset=elasticsearch.server}''" location=nil tag="kube.var.log.containers.efk-es-default-0_k3s-logging_elasticsearch-4718c3ca3bf9f531d5b6768c86ccb0ac8be225a10d808efcbf639271be548daf.log" time=2022-07-28 10:06:33.636543380 +0000 record={"time"=>"2022-07-28T12:06:33.63654338+02:00", "log_processed"=>{"@timestamp"=>"2022-07-28T10:06:33.635Z", "log.level"=>"INFO", "message"=>"[.kibana_8.1.2_001/TWOgIY07SMG_rsrY2lsnoQ] update_mapping [_doc]", "ecs.version"=>"1.2.0", "service.name"=>"ES_ECS", "event.dataset"=>"elasticsearch.server", "process.thread.name"=>"elasticsearch[efk-es-default-0][masterService#updateTask][T#1]", "log.logger"=>"org.elasticsearch.cluster.metadata.MetadataMappingService", "trace.id"=>"3160dd83044e7c2a51943a5081e29a15", "elasticsearch.cluster.uuid"=>"TWqBjJ-MRXetwJjQ2liV7Q", "elasticsearch.node.id"=>"qIMyAjJRQxKQM-kUEBmQtQ", "elasticsearch.node.name"=>"efk-es-default-0", "elasticsearch.cluster.name"=>"efk"}, "k8s.cluster.name"=>"picluster", "@timestamp"=>"2022-07-28T10:06:33.636543380+00:00", "tag"=>"kube.var.log.containers.efk-es-default-0_k3s-logging_elasticsearch-4718c3ca3bf9f531d5b6768c86ccb0ac8be225a10d808efcbf639271be548daf.log"}

The original Elasticsearch log entry is:

{"@timestamp": "2022-07-28T10:06:33.635Z", 
 "log.level": "INFO", 
"message":"[.kibana_8.1.2_001/TWOgIY07SMG_rsrY2lsnoQ] update_mapping [_doc]"
 "ecs.version": "1.2.0", 
"service.name":"ES_ECS", 
"event.dataset":"elasticsearch.server"}

In this case the merged log_processed.event field is a JSON object (it comes from the dotted event.dataset key) instead of the text field that Elasticsearch expects. The text mapping for this field was created earlier, when another log coming from MetalLB, which also contains an event field but as a plain string, was ingested first (see the illustration after the MetalLB log below).
MetalLB log:

{"caller":"service_controller.go:95",
 "controller":"ServiceReconciler",
 "event":"force service reload",
 "level":"info",
 "ts":"2022-07-28T09:45:27Z"
}
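
After Merge_Log parsing, both payloads end up nested under log_processed, and the two shapes of the event field clash (a simplified illustration based on the two logs above):

      # From the MetalLB log: log_processed.event is a plain string, so ES maps it as text
      "log_processed": {"event": "force service reload", "level": "info"}

      # From the Elasticsearch log: event.dataset forces log_processed.event to become an object
      "log_processed": {"event.dataset": "elasticsearch.server", "log.level": "INFO"}

Whichever record arrives first fixes the mapping of log_processed.event in the index; records with the other shape are then rejected with the mapper_parsing_exception shown above.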
ricsanfre added the bug label Jul 28, 2022
ricsanfre commented:

There is an open issue in the Fluent Bit repository addressing this very same problem: fluent/fluent-bit#4830. There, the use of a Lua-based filter is proposed to encode nested JSON maps (see the sketch below).
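
A minimal sketch of that approach, assuming the merged payload lives under log_processed as configured above. The script name, the cb_stringify_nested function and the hand-rolled encoder are my own illustration, not the code from the upstream issue:

    [FILTER]
        Name   lua
        Match  kube.*
        script stringify_nested.lua
        call   cb_stringify_nested

    -- stringify_nested.lua (sketch): turn nested tables under log_processed into
    -- plain strings so the field always has a single type in Elasticsearch.
    local function encode(value)
      if type(value) == "table" then
        local parts = {}
        for k, v in pairs(value) do
          parts[#parts + 1] = string.format("%q:%s", tostring(k), encode(v))
        end
        return "{" .. table.concat(parts, ",") .. "}"
      elseif type(value) == "string" then
        return string.format("%q", value)
      else
        return tostring(value)
      end
    end

    -- Fluent Bit Lua filter callback: return 1 to signal the record was modified.
    function cb_stringify_nested(tag, timestamp, record)
      local merged = record["log_processed"]
      if type(merged) ~= "table" then
        return 0, timestamp, record
      end
      for k, v in pairs(merged) do
        if type(v) == "table" then
          merged[k] = encode(v)
        end
      end
      return 1, timestamp, record
    end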

ricsanfre commented:

To solve this issue, a filter rule has been created in the aggregation layer (fluentd). This rule removes the log_processed field and moves its content to a new field, source_log.<container-name>, so field names are unique per container before ingestion into ES.

    <filter kube.**>
      @type record_transformer
      enable_ruby true
      remove_keys log_processed
      <record>
        source_log.${record["k8s.container.name"]} ${(record.has_key?('log_processed'))? record['log_processed'] : nil}
      </record>
    </filter>
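
Note that in record_transformer the <record> section is evaluated before remove_keys, so log_processed is still available when the new key is built. With this rule in place, the merged payload ends up under a per-container key, so logs from different containers can no longer collide on the same Elasticsearch field. For the Elasticsearch pod above (container name elasticsearch), the record would look roughly like this (a simplified illustration):

      "source_log.elasticsearch": {"event.dataset": "elasticsearch.server", "log.level": "INFO"}

MetalLB's logs land under their own source_log.<container> key, so their text-typed event field gets a separate mapping and no longer conflicts.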
