Add Tickbox and Math Protection #39
This is definitely still a work in progress!
commonmark wraps text in <emph></emph> tags. If I add `asis` to them, they disappear... which probably means that I just need to add the `asis` tag to the text nodes :face_palm:
It's now gibberish!
This is why, when you run into odd things during development, you should restart and work with a clean environment: sometimes you just screw up.
This allows us to include the delimiters for the search
Dear lord, I hope this makes sense six months down the line
I had modified `find_between()` from `pegboard:::find_between_tags()` (https://github.com/carpentries/pegboard/blob/378c627b4e08869540aa049b678f9c4a560dff59/R/div.R#L125-L159). My initial modification to extract all of the descendants was made in the context of block math, where I knew that I just needed to add the 'asis' attribute to the nodes. In the context of inline math, where I needed to create new nodes, this was not feasible: extracting descendant-or-self would give me duplicate nodes, which would hinder the creation of new nodes and lead to stuttering in the output. Separating the search functionality out into `find_between()` really helped me fix those two inconsistencies.
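The duplicate-node problem described above can be reproduced with xml2 alone. This is a minimal sketch on a hand-written toy document (the element names are assumptions for illustration, not commonmark output or the code in this PR): selecting `descendant-or-self` returns a nested node alongside its parent, so the same text is collected twice.

``` r
library(xml2)

# Toy document: one nested text node inside <emph>, one sibling text node.
doc <- read_xml("<p><emph><text>a</text></emph><text>b</text></p>")

# Direct children only: every piece of text is reached exactly once.
kids <- xml_find_all(doc, "./*")

# descendant-or-self of each child: the nested <text> is returned
# in addition to the <emph> that already contains it.
all_nodes <- xml_find_all(doc, "./*/descendant-or-self::*")

paste(xml_text(kids), collapse = "")       # "ab"
paste(xml_text(all_nodes), collapse = "")  # "aab", the "stuttering"
```

Adding an attribute to every node in `all_nodes` is harmless, which is why the broader selection was fine for block math; building new nodes from that set repeats content.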
This is brilliant! 👏 Thank you!
Apart from my small comments/questions, would it make sense to add a paragraph about Math in the README?
Also, when would one not want to use `protect_math()`?
Co-authored-by: Maëlle Salmon <maelle.salmon@yahoo.se>
Add not math dollar in example
Yes, it would!
I can really only think of edge cases (e.g. writing a BASIC tutorial or describing the Burrows-Wheeler Transform with ^ and $ as the delimiters).
README.md
### LaTeX equations

While markdown parsers like pandoc know what LaTeX is, commonmark does
not, and that means LaTeX equations will end up with extra markup due to
commonmark’s desire to escape characters.

For users of the `yarn` object, if you have LaTeX equations that use
either `$` or `$$` to delimit them, you can protect them from formatting
changes with the `$protect_math()` method:

``` r
path <- system.file("extdata", "math-example.md", package = "tinkr")
math <- tinkr::yarn$new(path)
math$tail() # malformed
#|
#| $$
#| Q\_{N(norm)}=\\frac{C\_N +C\_{N-1}}2\\times
#| \\frac{\\sum *{i=N-n}^{N}Q\_i} {\\sum*{j=N-n}^{N}{(\\frac{C\_j+C\_{j-1}}2)}}
#| $$
math$protect_math()$tail() # success!
#|
#| $$
#| Q_{N(norm)}=\frac{C_N +C_{N-1}}2\times
#| \frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}
#| $$
```

Note, however, that there are a few caveats for this:

1.  The dollar notation for inline math must be adjacent to the text.
    E.g. `$\alpha$` is valid, but `$ \alpha$` and `$\alpha $` are not
    valid.

2.  We do not currently have support for bracket notation.

3.  If you use a postfix dollar sign in your prose (e.g. BASIC commands
    or a Burrows-Wheeler Transform demonstration), you must be
    sure to either use punctuation after the trailing dollar sign OR
    format the text as code (e.g. `` `INKEY$` `` is good, but `INKEY$`
    by itself is not good and will be interpreted as LaTeX code,
    throwing an error):

``` r
path <- system.file("extdata", "basic-math.md", package = "tinkr")
math <- tinkr::yarn$new(path)
math$head(15) # malformed
#| ---
#| title: basic math
#| ---
#|
#| BASIC programming can make things weird:
#|
#|  - Give you $2 to tell me what INKEY$ means.
#|  - Give you $2 to *show* me what INKEY$ means.
#|  - Give you $2 to *show* me what `INKEY$` means.
#|
#| Postfix dollars mixed with prefixed dollars can make things weird:
#|
#|  - We write $2 but say 2$ verbally.
#|  - We write $2 but *say* 2$ verbally.
math$protect_math() # error
#| Error: Inline math delimiters are not balanced.
#|
#| HINT: If you are writing BASIC code, make sure you wrap variable
#|       names and code in backtics like so: `INKEY$`.
#|
#| Below are the pairs that were found:
#|            start...end
#|            -----...---
#|  Give you $2 to ... me what INKEY$ means.
#|  Give you $2 to ... 2$ verbally.
#| We write $2 but ...
```

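Caveat 1 in the README text above can be illustrated with a small base-R check. This is only a sketch: `looks_like_inline_math()` is a hypothetical helper and its regular expression is an assumption made for illustration, not how `protect_math()` actually locates math.

``` r
# Hypothetical helper: TRUE when a string is a single `$...$` span whose
# dollar signs touch the content (no whitespace just inside either one).
looks_like_inline_math <- function(x) {
  grepl("^\\$[^[:space:]$]([^$]*[^[:space:]$])?\\$$", x)
}

candidates <- c("$\\alpha$", "$ \\alpha$", "$\\alpha $")
result <- looks_like_inline_math(candidates)
result  # TRUE FALSE FALSE
```

Only the first candidate passes, matching the README's statement that `$ \alpha$` and `$\alpha $` are not valid inline math.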
@maelle, let me know if this looks okay to you for the README
It does! I just wonder what folks who do not use the yarn object should do: probably update their code to use the yarn object?
I think the reason I am using the yarn object here is that we can use our namespace object in the internal parsing, but I can export that object so that it can be used as the default:
`structure(c(md = "http://commonmark.org/xml/1.0"), class = "xml_namespace")`
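To show why a named namespace object matters for XPath searches, here is a minimal sketch with xml2. The XML is a hand-written stand-in for commonmark's output (its shape is an assumption), and the variable name `md_ns` is used here only for illustration.

``` r
library(xml2)

# The namespace object quoted above, bound to the prefix "md".
md_ns <- structure(c(md = "http://commonmark.org/xml/1.0"), class = "xml_namespace")

# Stand-in document that declares the commonmark default namespace.
doc <- read_xml(
  '<document xmlns="http://commonmark.org/xml/1.0">
     <paragraph><emph><text>hi</text></emph></paragraph>
   </document>'
)

# With the prefix, the XPath query finds the node.
with_prefix <- xml_find_all(doc, ".//md:emph", ns = md_ns)

# Without a prefix, XPath looks for <emph> in no namespace and finds nothing.
without_prefix <- xml_find_all(doc, ".//emph")

length(with_prefix)     # 1
length(without_prefix)  # 0
```

A fixed prefix means the same XPath strings work on any parsed document, regardless of the prefix xml2 would assign automatically.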
Thanks for all the updates!! I only added a small comment.
Excellent! I realized I need to add a couple more touches to make it complete (e.g. clearing the namespace of the new nodes before copying to avoid the issue of duplicating namespaces). Thank you so much for reviewing this!
- add xml_ns_strip() before node insertion events to avoid duplicate namespaces from accumulating (see https://community.rstudio.com/t/adding-nodes-in-xml2-how-to-avoid-duplicate-default-namespaces/84870)
- export md_ns() and protect_math() for users of to_xml()
- protect_tickbox() is now default in to_xml()
- updated README
- updated asis tests
Description
This fixes TWO issues with one solution!
What is the solution?
- `@asis`: an attribute in the xml definition that allows text to be passed from xml to markdown without being escaped
- `find_between()`: a function that searches between patterns (used to find nodes between both `$$` and `$`).
- `md_ns()`: a named alias for the commonmark namespace
- `protect_math()`: a function for protecting math elements (not default)
- `add_node_siblings()`: an internal function that will take a nodelist and add them as siblings following a given node in the document, with the option to have the siblings replace the node in question.
Related Issue
This addresses #38 by allowing dollar-notation LaTeX math to be protected (though bracket notation is still needed). It also fixes #10 and supersedes #20.
Example
Created on 2021-05-07 by the reprex package (v2.0.0)
Created on 2021-05-06 by the reprex package (v2.0.0)