Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove mbstring as a dependency #40

Merged
merged 7 commits into from
Dec 27, 2014
Merged

Remove mbstring as a dependency #40

merged 7 commits into from
Dec 27, 2014

Conversation

philsturgeon
Copy link
Member

Half done with #38, but 2 of the 3 uses are removed.

Found some weird hack for string length that involves regex. I'll take it.

Not sure what to do about mb_strtoupper() though.

https://github.com/thephpleague/commonmark/search?utf8=%E2%9C%93&q=mb_strtoupper&type=Code

@GrahamCampbell
Copy link
Member

php 5.3 and hhvm aren't happy.

@philsturgeon
Copy link
Member Author

I get emails about that. :)

Before 82421eb, the mb_decode_numericentity function wouldn't convert code points beyond plane 2 (per the flags).  This change essentially adds the same behavior.
@colinodell
Copy link
Member

Thanks for getting this started! I've resolved the current failures and am working on mb_strtoupper now - I'll let you know how far I get.

@@ -36,7 +36,8 @@ public static function detabLine($string)

foreach ($parts as $part) {
// Calculate number of spaces; insert them followed by the non-tab contents
$amount = 4 - mb_strlen($line, 'UTF-8') % 4;
$lineLength = strlen(utf8_decode($line));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure this works with non ascii characters? You should include some tests for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep it does. utf8_decode will turn the multi-byte characters into a single byte representation. It's okay if the conversion fails (and we get ? characters) since we just need a count here.

Per the spec, reference link matching is done by normalizing the label with a Unicode case fold, which mb_strtoupper provides.

But since not all systems have the relevant extension installed, we need to manually implement this logic by converting characters according to this table: http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
colinodell added a commit that referenced this pull request Dec 27, 2014
Remove mbstring as a dependency
@colinodell colinodell merged commit ec8bcb9 into master Dec 27, 2014
@colinodell colinodell deleted the rm-mbstring branch December 27, 2014 00:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants