Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to configure characters and disable emphasis/strong #135

Merged
merged 6 commits into from
Oct 2, 2015

Conversation

0b10011
Copy link
Contributor

@0b10011 0b10011 commented Jul 7, 2015

  • Option to disable parsing of _
  • Option to disable parsing of *
  • Option to disable rendering of emphasis
  • Option to disable rendering of strong
  • Tests

Is this something you would be open to taking into core? If so, I'll write some tests for it. (Any suggestions for how the tests should be organized?)

@colinodell
Copy link
Member

What's the use case for this feature?

@0b10011
Copy link
Contributor Author

0b10011 commented Jul 8, 2015

What's the use case for this feature?

We have used Markdown in the past for all of the content in our database. Some of this content uses a subset of Markdown features to promote consistency with the content itself and how it is converted into HTML (we have content dating back to 1998), rather than escaping the content for the version of Markdown parser we were using at the time and hoping no new features are added that could mess up past content.

Our largest collection of content is in this subset of Markdown that only allows block quotes, code blocks, lists, paragraphs, inline code, and emphasis (only with asterisks). There's no guarantee that this content does not include underscores, and if it does, it is definitely not for emphasis/strong. Additionally, empty emphasis groups may have gotten into the database at one point or another. **foo** should be parsed as <em></em>foo<em></em>, not <strong>foo</strong>. So, disabling underscores and strong is a must for us.

This adds the same flexibility that extensions have for adding/removing features for emphasis and strong, without having to add a new parser for every combination of features. (Which will help with maintenance in the long run.)

@colinodell
Copy link
Member

Thanks for providing that explanation! Let me think on this for a bit and get back to you.

@colinodell colinodell added the feedback wanted We need your input! label Jul 8, 2015
use League\CommonMark\Util\RegexHelper;

class EmphasisParser extends AbstractInlineParser
{
protected $config;

public function __construct(array $newConfig = array())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit-pick - could we use the short array syntax here?

@colinodell
Copy link
Member

I've got three major concerns with implementing this functionality:

1. It explicitly allows users to disable core CommonMark features.

This somewhat contradicts our primary purpose of support the full CommonMark spec. I'm okay with consumers adding and modifying functionality themselves, and I'l also okay with providing extension points, but providing options to explicitly disable CommonMark in a CommonMark library feels like we're going too far.

2. The slippery slope argument - if we implement this, do we also need to implement similar functionality elsewhere?

Where do we draw the line? Is this truly a special case because EmphasisParser combines different characters and emphasis types in one class? Would other users want to see similar configurations for all parsers and renderers? Or is this something unique to your particular implementation which other users probably won't use?

3. Is this the best solution?

Because your custom Markdown is using a non-standard approach to emphasis, would it make more sense to implement your own version of EmphasisParser to fit your particular needs? Technically speaking that approach should work, so it becomes a question of whether that's something you want to take on. IMO you probably should since you need a non-standard implementation anyway, and any future changes we make might further deviates from your expectations.

For those reasons, I feel I inclined to say "no" to this PR. I'm certainly open to your thoughts and feedback though (and others' too)!

@0b10011
Copy link
Contributor Author

0b10011 commented Jul 10, 2015

  1. It explicitly allows users to disable core CommonMark features.

This somewhat contradicts our primary purpose of support the full CommonMark spec. I'm okay with consumers adding and modifying functionality themselves, and I'l also okay with providing extension points, but providing options to explicitly disable CommonMark in a CommonMark library feels like we're going too far.

The parsing options can only be passed in with a custom extension, so this shouldn't affect CommonMarkCoreExtension at all, correct? The only way I know of to set these options is with new EmphasisParser(['enable_strong' => false]).

Core CommonMark features can already be disabled by extensions by simply excluding the parser, processor, and/or renderer for the particular feature. Emphasis/strong is a weird case because it combines both into a single parser/processor/renderer group. Had these been implemented as separate pieces, they could be disabled accordingly. But they aren't (and I'm not sure they should be, as it avoids repeating quite a bit of code).

  1. The slippery slope argument - if we implement this, do we also need to implement similar functionality elsewhere?

Where do we draw the line? Is this truly a special case because EmphasisParser combines different characters and emphasis types in one class?

I believe it is indeed a special case. It is the only parser that looks for multiple characters. It also treats each character slightly differently (different rules for $canOpen and $canClose).

Would other users want to see similar configurations for all parsers and renderers? Or is this something unique to your particular implementation which other users probably won't use?

I'm not sure a similar configuration would be needed with the other parsers. The issue is the code for emphasis/strong is trying to do several things at once: parse _ one way and * another, and render emphasis and strong. The other parsers/renderers (AFAICT) only parse a single opening character and only render a single element. Adding the options to this parser/processor makes it possible to separate these four things into separate pieces without sacrificing the legibility, adding repetitive code, or increasing the chance for bugs for the default configuration.

  1. Is this the best solution?

Because your custom Markdown is using a non-standard approach to emphasis, would it make more sense to implement your own version of EmphasisParser to fit your particular needs? Technically speaking that approach should work, so it becomes a question of whether that's something you want to take on.

We would use (are using) this PR's code for our needs. I'd rather not write a brand new parser that's essentially a copy of EmphasisParser, so we would just fork this project and merge this PR to ensure we continue to get bug fixes from upstream. (This is what we're currently doing, actually.) I would much rather pull directly from upstream, though, to make sure we get the latest updates immediately.

We already have several extensions written for our needs, using only the parsers and renderers we need from CommonMark. We've also added a few of our own that are specific to our needs. This feature is the only thing holding us back from ditching our branch and returning to upstream.

IMO you probably should since you need a non-standard implementation anyway, and any future changes we make might further deviates from your expectations.

I'm not too worried about this. The features have remained consistent enough with Markdown for us to stick with CommonMark. And any change made in CommonMark is one we'd likely like to make the move to. But that's mostly because we're able to disable features with extensions. It seems strange that we're not able to do the same with _ and strong when we're able to do that with every other character/feature just by changing which parsers/renders are used.

@colinodell
Copy link
Member

Thanks for the response!

I've been completely swamped the past few days, and this week is looking pretty busy too, so it might be a few days before I can dedicate the time to thoroughly review this and provide a thoughtful response. (If you don't hear from me by Sunday then feel free to ping me.)

@GrahamCampbell
Copy link
Member

Any news on this?

@colinodell
Copy link
Member

Thanks for the ping @GrahamCampbell.

@bfrohs - thanks for your feedback, and sorry for sitting on this for so long. You've convinced me that this is indeed worthwhile, so I'm a 👍 for merging this in.

I think the 0.11 release may have broken compatibility (or at least caused some kind of merge conflict), but it shouldn't be too difficult to fix. If you could do that I'll get this merged ASAP.

Thanks!

@colinodell colinodell added enhancement New functionality or behavior and removed feedback wanted We need your input! labels Sep 19, 2015
@colinodell colinodell added this to the 0.11.1 milestone Sep 19, 2015
@GrahamCampbell
Copy link
Member

Since this is a new feature, this should probably go to 0.12.0 as per semver rather than 0.11.1.

@colinodell colinodell removed this from the 0.11.1 milestone Sep 23, 2015
@0b10011
Copy link
Contributor Author

0b10011 commented Sep 25, 2015

I will try to get to this over the weekend!

@colinodell
Copy link
Member

Awesome, thanks!

On Fri, Sep 25, 2015, 10:11 AM Brandon Frōhs notifications@github.com
wrote:

I will try to get to this over the weekend!


Reply to this email directly or view it on GitHub
#135 (comment)
.

Conflicts:
	src/Inline/Parser/EmphasisParser.php
	src/Inline/Processor/EmphasisProcessor.php
@0b10011
Copy link
Contributor Author

0b10011 commented Oct 2, 2015

@colinodell I've fixed the compatibility issues and added tests. If you need anything changed/moved around, just let me know!

@colinodell
Copy link
Member

This looks great to me! I'll go ahead and merge this in. If you wouldn't mind updating the docs as well (the configuation and command line pages) I'd greatly appreciate it. Otherwise I'll try to add them in the next few days.

Thanks again!

colinodell added a commit that referenced this pull request Oct 2, 2015
Add ability to configure characters and disable emphasis/strong
@colinodell colinodell merged commit 6432d36 into thephpleague:master Oct 2, 2015
colinodell added a commit that referenced this pull request Oct 2, 2015
@0b10011 0b10011 deleted the emphasisConfig branch October 2, 2015 19:44
0b10011 added a commit to 0b10011/commonmark that referenced this pull request Oct 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New functionality or behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants