Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow underscores in usernames #57

Merged
merged 3 commits into from
Mar 14, 2018

Conversation

ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Mar 5, 2018

Change the regex in :mention_username_pattern directly in this plugin as the only non-word character whitelisted in GitHub usernames seems to be the hyphen - based on currently used regex

Resolves #24
Resolves #45

@DirtyF DirtyF requested a review from a team March 5, 2018 09:59
@DirtyF DirtyF requested a review from pathawks March 6, 2018 21:51
HTML::Pipeline::MentionFilter::UsernamePattern.source,
Regexp::IGNORECASE
)
@mention_username_pattern ||= %r![\w][\w_-]*!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to change this here rather than in HTML::Pipeline?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does GitHub allow underscores in the username..? I couldn't find any documentation that said so..
Besides, its better to have a custom regex (that allows using valid twitter handles) here in the wrapper plugin instead of proposing a change in the main engine (HTML::Pipeline) which may not be open to support external usernames.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value can be overriden by providing the username_pattern variable in the context.

Should we go this route instead?

This RegEx has the potential to be very inefficient. If we are committing ourselves to bring this complexity into this plugin, we should see if we need to tighten it up at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we go this route instead?

That's how it is already.. This regex is passed to the context in html-profiler via :filter_with_mention method

This RegEx has the potential to be very inefficient.

Really? I wonder and would definitely like to know how different this regex is in comparison to existing regex..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? I wonder and would definitely like to know how different this regex is in comparison to existing regex..

Sorry, I didn't mean that as a criticism; I see that you've mostly copied the existing regex.

I'm just worried that this regex isn't limiting itself to word boundaries or anything. I'm assuming it's only trying to match against the characters between @ and the next space, but I haven't dug far enough into the code over there to figure it out.

I hope this isn't looping through each word in every text node and trying to match that word with a @ in front of it...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be worth it to add a negative lookbehind assertion, to make sure we are only trying to match words that follow @?

@mention_username_pattern ||= %r!(?<=@)[\w][\w_-]*!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this isn't looping through each word in every text node and trying to match

I think :gsub with a regex pattern does that anyways..

I wonder if it would be worth it to add a negative lookbehind assertion

This is not necessary as the logic in html-profiler already includes a regex with an atomic group..

      # Hash that contains all of the mention patterns used by the pipeline
      MentionPatterns = Hash.new do |hash, key|
        hash[key] = /
          (?:^|\W)                    # beginning of string or non-word char
          @((?>#{key}))  # @username
          (?!\/)                      # without a trailing slash
          (?=
            \.+[ \t\W]|               # dots followed by space or non-word character
            \.+$|                     # dots at end of line
            [^0-9a-zA-Z_.]|           # non-word character except dot
            $                         # end of line
          )
        /ix
      end

@DirtyF
Copy link
Member

DirtyF commented Mar 14, 2018

@jekyllbot: merge +minor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants