filter_kubernetes: new option 'use_tag_for_meta' to use tag for metadata #4062

Merged: edsiper merged 1 commit into master from kube-meta-tag on Sep 9, 2021

@edsiper edsiper commented Sep 6, 2021

This patch adds a new option, 'use_tag_for_meta', which enriches records with
Kubernetes metadata using only the information carried in the record tag.
This feature is useful only if you don't want to talk to the API server or
the Kubelet for metadata.

The enrichment depends heavily on a correct Tail setup and on a regular
expression that extracts the proper components from the tag.

Usage example:

--- fluent-bit.conf ---

[INPUT]
    name       tail
    path       /var/log/containers/*.log
    tag        kube.<pod>.<namespace>.<container>
    tag_regex  ^/var/log/containers/(?:[^/]+/)?(?<pod>.+)_(?<namespace>.+)_(?<container>.+)\.log$

[FILTER]
    name             kubernetes
    match            kube.*
    kube_tag_prefix  kube.
    regex_parser     kube-name
    use_tag_for_meta on

In one of your parsers files, append the following entry:

--- parsers.conf ---

[PARSER]
    name    kube-name
    format  regex
    regex   (?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9]))\.(?<namespace_name>[^_]+)\.(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})$

Assuming one of your log files follows the common naming scheme, it will be parsed and enriched as follows:

  • file name
  /var/log/containers/traefik-97b44b794-f6sp6_kube-system_traefik-5ce550068d69ec7db2ba4cd9342bb04d79686da97cb802dd8e1eb19487ff727b.log
  • record output
  tag   : kube.traefik-97b44b794-f6sp6.kube-system.traefik-5ce550068d69ec7db2ba4cd9342bb04d79686da97cb802dd8e1eb19487ff727b:
  record: [1630903216.378076554, {"log"=>"...",
                                  "time"=>"2021-09-06T02:35:53Z",
                                  "kubernetes"=>{"pod_name"=>"traefik-97b44b794-f6sp6",
                                                 "namespace_name"=>"kube-system",
                                                 "container_name"=>"traefik",
                                                 "docker_id"=>"5ce550068d69ec7db2ba4cd9342bb04d79686da97cb802dd8e1eb19487ff727b"
                                                }
                                 }
          ]
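
The extraction described above can be approximated outside Fluent Bit to check a tag_regex and a regex_parser against a real file name. The following is only a sketch in Python, not the filter's implementation: the helper names build_tag and meta_from_tag are hypothetical, and the named groups are rewritten as (?P<name>...) because Python's re module does not accept the (?<name>...) spelling used in the configuration above. Stripping the prefix before matching is an assumption of this sketch about how kube_tag_prefix is applied.

```python
import re

# Same patterns as in the configuration above, with named groups translated
# to Python syntax. This is an illustration, not Fluent Bit's own code.
TAG_REGEX = re.compile(
    r"^/var/log/containers/(?:[^/]+/)?"
    r"(?P<pod>.+)_(?P<namespace>.+)_(?P<container>.+)\.log$"
)
KUBE_NAME = re.compile(
    r"(?P<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9]))\."
    r"(?P<namespace_name>[^_]+)\."
    r"(?P<container_name>.+)-(?P<docker_id>[a-z0-9]{64})$"
)


def build_tag(path, prefix="kube."):
    """Hypothetical helper: expand kube.<pod>.<namespace>.<container> from a path."""
    m = TAG_REGEX.match(path)
    if m is None:
        return None
    return prefix + ".".join(
        (m.group("pod"), m.group("namespace"), m.group("container"))
    )


def meta_from_tag(tag, prefix="kube."):
    """Hypothetical helper: strip the prefix and apply the kube-name pattern."""
    if not tag.startswith(prefix):
        return None
    m = KUBE_NAME.search(tag[len(prefix):])
    return m.groupdict() if m else None


if __name__ == "__main__":
    path = (
        "/var/log/containers/traefik-97b44b794-f6sp6_kube-system_"
        "traefik-5ce550068d69ec7db2ba4cd9342bb04d79686da97cb802dd8e1eb19487ff727b.log"
    )
    tag = build_tag(path)
    print(tag)
    print(meta_from_tag(tag))
```

If meta_from_tag returns None for one of your file names, the patterns do not fit that naming scheme, and the same mismatch would most likely leave the record without metadata when use_tag_for_meta is on.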

Signed-off-by: Eduardo Silva [email protected]


@edsiper edsiper merged commit 3379df9 into master Sep 9, 2021
edsiper added a commit that referenced this pull request Sep 9, 2021
…ata (#4062)

pwhelan pushed a commit to pwhelan/fluent-bit that referenced this pull request Sep 16, 2021
…ata (fluent#4062)

@edsiper edsiper deleted the kube-meta-tag branch November 16, 2021 23:55