Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graphite metrics should clean metrics names #637

Closed
JensRantil opened this issue Aug 31, 2014 · 17 comments
Closed

Graphite metrics should clean metrics names #637

JensRantil opened this issue Aug 31, 2014 · 17 comments
Milestone

Comments

@JensRantil
Copy link

Graphite does not allow metrics/sensors with names containing ', but Metrics happily submits them. This makes Graphite log a bunch of errors not being able to create the whisper files yada yada.

I see three different solutions to this:

  • Patch Metrics to not submit these metrics if they have the wrong naming, logging the metrics with the bad naming.
  • Automagically clean the sensor names (removing weird characters etc.) on submission to Graphite/carbon.
  • Keep things the way they are. Possibly adding a static helper method that can be used to clean sensor names from garbage. I am currently wrapping a bunch of sensor names in:
public class MetricsUtils {

    /**
     * Clean graphite metrics names.
     * <p>
     * This was created because I was seeing a lot of stacktraces in Carbon log due to broken metric names.
     * 
     * @param proposal
     *            the proposed metrics' name.
     * @return cleaned metric's name
     */
    public static String cleanMetricName(String proposal) {
        return proposal.replace("'", "");
    }

}

Input welcome.

@ryantenney ryantenney added this to the 3.1.1 milestone Oct 8, 2014
@ryantenney
Copy link
Contributor

Are there characters other than ' which need to be sanitized from metric names?

@jmason
Copy link

jmason commented Oct 10, 2014

Newlines for sure; they are used as a line separator in the text protocol: http://graphite.readthedocs.org/en/latest/feeding-carbon.html

I'd suggest sanitising anything that isn't healthy in a UNIX filename, so any control chars too.

@thattolleyguy
Copy link

Is there any status on the desired requirements of this issue? I'd love to help out, I'm just not sure what steps to take.

@felixbarny
Copy link

This is my static helper class. I got the regex somewhere from the graphite source code.
You can use and modify as you like. It's under Apache 2.0 license as its part of the stagemonitor project.

https://github.com/stagemonitor/stagemonitor/blob/master/stagemonitor-core/src/main/java/org/stagemonitor/core/util/GraphiteSanitizer.java

public final class GraphiteSanitizer {

    public static final Pattern DISALLOWED_CHARS = Pattern.compile("[^a-zA-Z0-9!#\\$%&\"'\\*\\+\\-:;<=>\\?@\\[\\\\\\]\\^_`\\|~]");

    private GraphiteSanitizer() {
    }

    /**
     * Graphite only supports alphanumeric characters + !#$%&"'*+-.:;<=>?@[\]^_`|~ so each metric name segment has to
     * be cleared from other chars.
     * <p/>
     * <pre>
     * metric path/segment delimiter
     *         _|_
     *        |   |
     * metrics.cpu.utilisation  <- metric name
     *    |     |      |
     *    -------------
     *          |
     *  metric name segments
     * </pre>
     *
     * @param metricNameSegment the metric name segment (see diagram above for a explanation of what a metric name segment is)
     * @return the metricNameSegment that contains only characters that graphite can handle
     */
    public static String sanitizeGraphiteMetricSegment(String metricNameSegment) {
        return DISALLOWED_CHARS.matcher(metricNameSegment.replace('.', ':').replace(' ', '-').replace('/', '|')).replaceAll("_");
    }
}

@thattolleyguy
Copy link

So do people think it would be better to remove the offending metrics or sanitize the names? Should we support both with an option to determine which? If so, what should the default be?

@JensRantil
Copy link
Author

I believe the default should be to sanitize. There are few worse things than missing a metric when debugging.

On Fri, Dec 19, 2014 at 5:46 AM, Tyler Tolley notifications@github.com
wrote:

So do people think it would be better to remove the offending metrics or sanitize the names? Should we support both with an option to determine which? If so, what should the default be?

Reply to this email directly or view it on GitHub:
#637 (comment)

@JensRantil
Copy link
Author

FYI, I just ran into a bug(?) triggered in Graphite when a metric had a parenthesis in it. Filename was /whisper_root/package/path(something)/1_mean.wsp. And I was plotting aliasByNode(package.*.1_mean, 1). We are now internally stripping all parenthesis.

@felixbarny
Copy link

Graphite only supports alphanumeric characters + !#$%&"'*+-.:;<=>?@[]^_`|~

@JensRantil
Copy link
Author

@felixbarny Thanks! Do you have a reference for this list? As stated in the initial description, ' is broke my Graphite.

@felixbarny
Copy link

I've found it somewhere in Graphite's source code. Can't remember where though...

@ryantenney
Copy link
Contributor

I've gone with the following regex: [^a-z0-9!#\$%&"'*+-.:;<=>\?@[]^_`|~]+

If there are no objections then I will be releasing 3.1.1 today.

@felixbarny
Copy link

I think the regex is missing the backslash character (\\)

@ryantenney
Copy link
Contributor

So it is. Thanks!

@ryantenney
Copy link
Contributor

Forward slash is legal then?

@felixbarny
Copy link

No, that's one of the most illegal chars ;)
But look at my sanitizeGraphiteMetricSegment method. I'm replacing it separately by | to that URLs don't look like 💩 . For example GET /index.html -> GET-|index:html as supposed to GETindexhtml or GET--index-html but thats a matter of taste I guess...

@nickbabcock
Copy link
Contributor

With #936 merged, should this be closed?

@ryantenney
Copy link
Contributor

Yep looks like it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants