Namespace vignette #52

Open
wants to merge 7 commits into
base: main

zkamvar
Member

@zkamvar zkamvar commented May 27, 2021

This will address #48. Here is the rendered version via knitr::purl() and reprex::reprex()

Draft of vignette text (updated 2021-05-28)
library("tinkr")
library("magrittr")
library("commonmark")
library("xml2")
library("xslt")
library("purrr")
#> 
#> Attaching package: 'purrr'
#> The following object is masked from 'package:magrittr':
#> 
#>     set_names

Introduction

This document addresses common confusion about XML namespaces and
their implications for constructing XPath queries, adding new XML nodes, and
converting XML to markdown. It is written for the user who is
comfortable with XPath queries and wants to understand more about how to handle
and manipulate their XML representation of markdown.

Motivation

The underlying motivation for {tinkr} was to wrap the process of converting
markdown documents to XML and back again. This process uses {commonmark} and
{xml2} to translate and read in the markdown to an XML document.

xml <- commonmark::markdown_xml("## h1\n\ntext with `r 'code'`") %>% 
  xml2::read_xml()
xml
#> {xml_document}
#> <document xmlns="http://commonmark.org/xml/1.0">
#> [1] <heading level="2">\n  <text xml:space="preserve">h1</text>\n</heading>
#> [2] <paragraph>\n  <text xml:space="preserve">text with </text>\n  <code xml: ...

We use the {xslt} package to do the conversion from XML back to markdown.

xslt_style <- tinkr::stylesheet() %>% xml2::read_xml()
cat(xslt::xml_xslt(xml, xslt_style))
#> ## h1
#> 
#> text with `r 'code'`

One of the downsides of this conversion is that commonmark provides a default
namespace, which means that nodes in XPath queries must have a prefix that
defines the namespace. For example, an XPath query to select all paragraphs
that have executable R code looks like the following query:

xml2::xml_find_first(xml, "//d1:paragraph[d1:code[starts-with(text(), 'r ')]]") 
#> {xml_node}
#> <paragraph>
#> [1] <text xml:space="preserve">text with </text>
#> [2] <code xml:space="preserve">r 'code'</code>

We add d1 because that’s the prefix {xml2} assigns to the default
namespace.

xml2::xml_ns(xml)
#> d1 <-> http://commonmark.org/xml/1.0

The {tinkr} difference

The XML document that {tinkr} generates has no namespace by default because
operations on an XML document without a namespace are easier than if there
were a default or a prefixed namespace.

xml2::xml_ns_strip(xml)
xml2::xml_find_first(xml, "//paragraph[code[starts-with(text(), 'r ')]]")
#> {xml_node}
#> <paragraph>
#> [1] <text xml:space="preserve">text with </text>
#> [2] <code xml:space="preserve">r 'code'</code>

However, removing the namespace has implications for exporting XML objects
because the stylesheet relies on it. For example, this namespace-less
document can no longer be converted with our XSLT stylesheet, which expects a
commonmark namespace:

xslt_style <- tinkr::stylesheet() %>% xml2::read_xml()
cat(xslt::xml_xslt(xml, xslt_style))

To alleviate this, we add the namespace just before it’s converted in
tinkr::to_md().

xml2::xml_set_attr(xml, "xmlns", "http://commonmark.org/xml/1.0")
cat(xslt::xml_xslt(xml, xslt_style))
#> ## h1
#> 
#> text with `r 'code'`

Read on to find out more about XML namespaces and their implications on your
tinkering.

XML namespaces

XML namespaces are a lot like package namespaces in R: they allow you to avoid
name clashes. For example, table can represent data or furniture.
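
To make the analogy concrete, here is a minimal base R sketch of such a clash. The local `table()` function is a hypothetical stand-in for the "furniture" meaning; the `base::` prefix plays the role that a namespace prefix plays in XML.

```r
# Hypothetical local definition that shadows base::table()
table <- function(...) "a piece of furniture"

furniture <- table(c("a", "b", "a"))        # our local function
counts    <- base::table(c("a", "b", "a"))  # the contingency table from base R

furniture
counts
```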

By default, nodes in XML do not have namespaces unless you give them one, which
means that when you use XPath search, you can use the node names by default:

d <- xml2::read_xml("<document>
    <paragraph>
      <text>hello there</text>
      <text> ello  here</text>
    </paragraph>
  </document>")
xml2::xml_ns(d)
#>  <->
xml2::xml_find_all(d, "//document")
#> {xml_nodeset (1)}
#> [1] <document>\n  <paragraph>\n    <text>hello there</text>\n    <text> ello  ...
xml2::xml_find_all(d, "//text[contains(text(), 'hello')]")
#> {xml_nodeset (1)}
#> [1] <text>hello there</text>

However, if a default namespace is added to a node, all of its descendants
inherit it, which affects your XPath expressions.
Below we add the commonmark namespace to the paragraph node.

d <- xml2::read_xml("<document>
    <paragraph xmlns='http://commonmark.org/xml/1.0'>
      <text>hello there</text>
      <text> ello  here</text>
    </paragraph>
  </document>")
xml2::xml_ns(d)
#> d1 <-> http://commonmark.org/xml/1.0
xml2::xml_find_all(d, "//document")
#> {xml_nodeset (1)}
#> [1] <document>\n  <paragraph xmlns="http://commonmark.org/xml/1.0">\n    <tex ...

Using the same XPath query as before no longer works: our call to
xml2::xml_find_all() returns nothing.

xml2::xml_find_all(d, "//text[contains(text(), 'hello')]") # does not work
#> {xml_nodeset (0)}

When a namespace is specified with xmlns=<URI>, {xml2} assigns it a
default namespace prefix, which is d1. Therefore, editing our XPath query
like so will work:

xml2::xml_find_all(d, "//d1:text[contains(text(), 'hello')]")
#> {xml_nodeset (1)}
#> [1] <text>hello there</text>

But is it a good idea to use d1 as a namespace prefix? No, the {xml2}
documentation recommends renaming the namespace as soon as you read in a
document and using the namespace object to semantically prefix your XPath
expressions:

ns <- xml2::xml_ns(d)
ns <- xml2::xml_ns_rename(ns, d1 = "md") 
ns
#> md <-> http://commonmark.org/xml/1.0

Now we can modify our XPath query to use md as a prefix, but we also need to
supply the namespace as an argument to the command:

xml2::xml_find_all(d, "//md:text[contains(text(), 'hello')]", ns)
#> {xml_nodeset (1)}
#> [1] <text>hello there</text>
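
As a small aside, the prefix itself is arbitrary: what the query matches on is the URI that the prefix is bound to in the namespace object. The sketch below rebuilds the same example document and binds two different prefixes to the commonmark URI. The `cm` prefix is only an illustration and is not used elsewhere in this guide.

```r
library(xml2)

d <- read_xml("<document>
    <paragraph xmlns='http://commonmark.org/xml/1.0'>
      <text>hello there</text>
      <text> ello  here</text>
    </paragraph>
  </document>")

# Two namespace objects that bind different prefixes to the same URI
ns_md <- xml_ns_rename(xml_ns(d), d1 = "md")
ns_cm <- xml_ns_rename(xml_ns(d), d1 = "cm")

# Both queries select the same text nodes
hits_md <- xml_find_all(d, "//md:text", ns_md)
hits_cm <- xml_find_all(d, "//cm:text", ns_cm)
length(hits_md) == length(hits_cm)
```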

You might be wondering why it isn’t recommended to prefix the namespace from
the start to avoid needing to rename and specify the namespace. The reason is
that prefixed namespaces only apply to nodes with that prefix. Here’s
an example. Let’s take our previous example and modify the namespace attribute
to have an md prefix:

dc <- as.character(d)
cat(dc <- gsub("xmlns=", "xmlns:md=", dc))
#> <?xml version="1.0" encoding="UTF-8"?>
#> <document>
#>   <paragraph xmlns:md="http://commonmark.org/xml/1.0">
#>     <text>hello there</text>
#>     <text> ello  here</text>
#>   </paragraph>
#> </document>
dc <- xml2::read_xml(dc)
xml2::xml_ns(dc)
#> md <-> http://commonmark.org/xml/1.0
xml2::xml_find_all(dc, "//document")
#> {xml_nodeset (1)}
#> [1] <document>\n  <paragraph xmlns:md="http://commonmark.org/xml/1.0">\n    < ...

We can see that the XPath query without the prefix works.

xml2::xml_find_all(dc, "//text[contains(text(), 'hello')]") 
#> {xml_nodeset (1)}
#> [1] <text>hello there</text>

However, the XPath query with the prefix no longer works.

xml2::xml_find_all(dc, "//md:text") 
#> {xml_nodeset (0)}

You might be wondering why the prefixed XPath query worked earlier with a
default namespace, but no longer works now that the namespace explicitly
defines the prefix. Isn’t everything below the paragraph node in the
commonmark namespace?

You might notice that we can access the document node AND the text
node without a prefix, even though the text node sits below the namespace
declaration and the document node sits outside of it. It’s because neither of
these nodes actually has a namespace!

This is demonstrated when we add a new node with the md prefix:

pgp <- xml2::xml_find_first(dc, "//paragraph")
xml2::xml_add_child(pgp, "md:text", "hello from the md namespace")
dc
#> {xml_document}
#> <document>
#> [1] <paragraph xmlns:md="http://commonmark.org/xml/1.0">\n  <text>hello there ...

Now we can see that there are three text nodes, one of which has the md
namespace prefix. If we select the nodes with that prefix and without the
prefix, we will get one and two nodes, respectively.

xml2::xml_find_all(dc, "//md:text") # one node
#> {xml_nodeset (1)}
#> [1] <md:text>hello from the md namespace</md:text>
xml2::xml_find_all(dc, "//text")    # two nodes
#> {xml_nodeset (2)}
#> [1] <text>hello there</text>
#> [2] <text> ello  here</text>

If a namespace is defined in the document with a prefix, only nodes with that
prefix are considered to be inside the namespace. This becomes important when
we want to pass our XML document through a stylesheet that expects the incoming
nodes to have a specific namespace, which is exactly how we transform the XML
representation of markdown back to markdown.
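
One way to check which namespace each node actually belongs to is the XPath function namespace-uri(). The following is a minimal sketch on a small document written for illustration (not the dc object above), assuming only {xml2}:

```r
library(xml2)

# A prefixed namespace is declared on <paragraph>, but only <md:text> uses it
doc <- read_xml("<document>
  <paragraph xmlns:md='http://commonmark.org/xml/1.0'>
    <text>hello there</text>
    <md:text>hello from the md namespace</md:text>
  </paragraph>
</document>")

nodes <- xml_find_all(doc, "//*")

# Ask each node for its namespace URI; an empty string means "no namespace"
uris <- vapply(
  nodes,
  function(node) xml_find_chr(node, "namespace-uri(.)"),
  character(1)
)

data.frame(name = xml_name(nodes), uri = uris)
```

Only the node written with the md prefix should report the commonmark URI; the others report an empty string.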

Commonmark

The {tinkr} package streamlines the process of converting markdown to XML and back again.
We use commonmark::markdown_xml() as a starting point to generate valid XML:

cat(cmk <- commonmark::markdown_xml("this is a **test**"))
#> <?xml version="1.0" encoding="UTF-8"?>
#> <!DOCTYPE document SYSTEM "CommonMark.dtd">
#> <document xmlns="http://commonmark.org/xml/1.0">
#>   <paragraph>
#>     <text xml:space="preserve">this is a </text>
#>     <strong>
#>       <text xml:space="preserve">test</text>
#>     </strong>
#>   </paragraph>
#> </document>
xml <- xml2::read_xml(cmk)
xml
#> {xml_document}
#> <document xmlns="http://commonmark.org/xml/1.0">
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...

Commonmark uses a default namespace

You can see from the commonmark output that it has a default namespace that
resolves to http://commonmark.org/xml/1.0, which means that we need to use
the default namespace if we want to munge the data:

xml2::xml_find_all(xml, "//d1:text")
#> {xml_nodeset (2)}
#> [1] <text xml:space="preserve">this is a </text>
#> [2] <text xml:space="preserve">test</text>

Using a semantic prefix with the default namespace

To make things more semantic, we could rename the namespace to have the “md”
prefix and carry around that object. Note: an xml_namespace object is a named
character vector, so we can create it with structure() and use it to introduce
semantically sensible XPath queries:

ns <- structure(c(md = "http://commonmark.org/xml/1.0"), class = "xml_namespace")
xml2::xml_find_all(xml, "//md:text", ns)
#> {xml_nodeset (2)}
#> [1] <text xml:space="preserve">this is a </text>
#> [2] <text xml:space="preserve">test</text>

Of course, now if we want to make any semantic XPath query, we need to include
both a prefix and a namespace object.
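
If carrying the namespace object around becomes tedious, one option is to wrap it in a small helper. This is only a sketch: `find_md()` and `md_ns` are hypothetical names and not part of {tinkr} or {xml2}, and the document is written inline to keep the example self-contained.

```r
library(xml2)

# Namespace object built the same way as above
md_ns <- structure(
  c(md = "http://commonmark.org/xml/1.0"),
  class = "xml_namespace"
)

# Hypothetical helper: always query with the md prefix bound to commonmark
find_md <- function(x, xpath) {
  xml_find_all(x, xpath, ns = md_ns)
}

doc <- read_xml('<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text>this is a </text>
    <strong><text>test</text></strong>
  </paragraph>
</document>')

strong_text <- find_md(doc, "//md:strong/md:text")
xml_text(strong_text)
```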

Transforming XML to markdown with XSLT

The commonmark namespace allows us to transform our document to markdown using
an XSLT stylesheet, which is—that’s right—an XML document:

sty <- xml2::read_xml(tinkr::stylesheet())
sty
#> {xml_document}
#> <stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:md="http://commonmark.org/xml/1.0">
#>  [1] <xsl:import href="xml2md.xsl"/>
#>  [2] <xsl:template match="/">\n  <xsl:apply-imports/>\n</xsl:template>
#>  [3] <xsl:output method="text" encoding="utf-8"/>
#>  [4] <xsl:template match="md:emph[@asis='true']">\n  <!-- \n        Multiple  ...
#>  [5] <xsl:template match="md:text[@asis='true']">\n  <xsl:value-of select="st ...
#>  [6] <xsl:template match="md:link[@rel] | md:image[@rel]">\n  <xsl:if test="s ...
#>  [7] <xsl:template match="md:link[@anchor]">\n  <xsl:if test="self::md:image" ...
#>  [8] <xsl:template match="md:table">\n  <xsl:apply-templates select="." mode= ...
#>  [9] <xsl:variable name="minLength">3</xsl:variable>
#> [10] <xsl:variable name="maxLength">\n  <xsl:for-each select="//md:table_head ...
#> [11] <xsl:template name="n-times">\n  <xsl:param name="n"/>\n  <xsl:param nam ...
#> [12] <xsl:template match="md:table_header">\n  <xsl:text>| </xsl:text>\n  <xs ...
#> [13] <xsl:template match="md:table_cell">\n  <xsl:variable name="cell" select ...
#> [14] <xsl:template match="md:table_row">\n  <xsl:text>| </xsl:text>\n  <xsl:a ...
#> [15] <xsl:template match="md:table_row">\n  <xsl:text>| </xsl:text>\n  <xsl:a ...
#> [16] <xsl:template match="md:strikethrough">\n  <xsl:text>~~</xsl:text>\n  <x ...

Each xsl:template node in this stylesheet matches against a specific node in
the commonmark namespace (prefix: md) and emits text based on that node. This
allows us to write back to markdown:

cat(xslt::xml_xslt(xml, sty))
#> this is a **test**
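
To see why the namespace matters to the stylesheet, here is a toy sketch with a one-template stylesheet. It is an assumption-laden stand-in for illustration, not the stylesheet shipped with {tinkr}. The template only fires for strong nodes in the commonmark namespace, so the same content without a namespace falls through to the built-in XSLT rules, which copy the text unchanged.

```r
library(xml2)
library(xslt)

# Toy stylesheet: wrap md:strong content in ** **
toy_sty <- read_xml('<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:md="http://commonmark.org/xml/1.0">
  <xsl:output method="text"/>
  <xsl:template match="md:strong">**<xsl:apply-templates/>**</xsl:template>
</xsl:stylesheet>')

with_ns <- read_xml(
  '<document xmlns="http://commonmark.org/xml/1.0"><strong>test</strong></document>'
)
no_ns <- read_xml("<document><strong>test</strong></document>")

out_with_ns <- xml_xslt(with_ns, toy_sty)  # template matches
out_no_ns   <- xml_xslt(no_ns, toy_sty)    # template does not match

cat(out_with_ns, "\n")
cat(out_no_ns, "\n")
```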

In this way, we can programmatically transform the content of the markdown. In
this example, we change the **test** to be an inline R code chunk that
emits _test_.

xml <- commonmark::markdown_xml("this is a **test**") %>%
  xml2::read_xml()

xml2::xml_find_all(xml, "//md:strong", ns) %>%
  xml2::xml_set_name("code") %>%
  xml2::xml_set_text("r cat('_test_')")
#> {xml_nodeset (1)}
#> [1] <code>\n  <text xml:space="preserve">r cat('_test_')</text>\n</code>

sty <- xml2::read_xml(tinkr::stylesheet())
cat(xslt::xml_xslt(xml, sty))
#> this is a `r cat('_test_')`

Perils: adding nodes

A default namespace is all fun and games until you need to add new nodes. Take
for example the situation where we want to add a code block. In commonmark, it’s
a code_block node with an info attribute stating the language and the text
inside is the code.

xml <- commonmark::markdown_xml("this is a **test**") %>%
  xml2::read_xml() 
xml2::xml_add_child(xml, "code_block", info = "{r}", "1 + rnorm(1)\n")
xml
#> {xml_document}
#> <document xmlns="http://commonmark.org/xml/1.0">
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...
#> [2] <code_block info="{r}">1 + rnorm(1)\n</code_block>
xml2::xml_find_all(xml, "//md:code_block", ns)
#> {xml_nodeset (0)}
sty <- xml2::read_xml(tinkr::stylesheet())
cat(xslt::xml_xslt(xml, sty))
#> this is a **test**

By all appearances, the node was added correctly, but because we did not
specify a namespace, it is not recognized as part of the md namespace even
though we added it as a child of the document. The best way to handle this
situation is to reparse the document:

xml %>%
  as.character() %>%
  xml2::read_xml() %>%
  xslt::xml_xslt(sty) %>%
  cat()
#> this is a **test**
#> 
#> ```{r}
#> 1 + rnorm(1)
#> ```

We could also try adding the namespace to the node when we add it:

xml <- commonmark::markdown_xml("this is a **test**") %>%
  xml2::read_xml() 
xml2::xml_add_child(xml, "code_block", 
  xmlns = "http://commonmark.org/xml/1.0", info = "{r}", "1 + rnorm(1)\n")
xml
#> {xml_document}
#> <document xmlns="http://commonmark.org/xml/1.0">
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...
#> [2] <code_block xmlns="http://commonmark.org/xml/1.0" info="{r}">1 + rnorm(1) ...
xml2::xml_find_all(xml, "//md:code_block", ns)
#> {xml_nodeset (1)}
#> [1] <code_block xmlns="http://commonmark.org/xml/1.0" info="{r}">1 + rnorm(1) ...
cat(xslt::xml_xslt(xml, sty))
#> this is a **test**
#> 
#> ```{r}
#> 1 + rnorm(1)
#> ```

It works, but let’s take a look at our namespaces:

xml2::xml_ns(xml)
#> d1 <-> http://commonmark.org/xml/1.0
#> d2 <-> http://commonmark.org/xml/1.0

Every node we add with an unnamed namespace adds another default namespace and,
if we are doing a lot of substitution, we can end up with hundreds of namespaces.
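
A rough way to watch this accumulate is to count the entries in xml2::xml_ns() before and after adding several nodes. The sketch below uses an inline document and a loop of three additions, both chosen purely for illustration:

```r
library(xml2)

uri <- "http://commonmark.org/xml/1.0"
doc <- read_xml(sprintf(
  "<document xmlns='%s'><paragraph><text>this is a test</text></paragraph></document>",
  uri
))

n_before <- length(xml_ns(doc))

# Add three code blocks, each carrying its own xmlns attribute
for (i in 1:3) {
  xml_add_child(doc, "code_block", xmlns = uri, info = "{r}", "1 + rnorm(1)\n")
}

n_after <- length(xml_ns(doc))
c(before = n_before, after = n_after)
```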

No Namespace?

What if we just tried to use no namespace?

xml <- commonmark::markdown_xml("this is a **test**") %>%
  xml2::read_xml() %>%
  xml2::xml_ns_strip()
xml2::xml_add_child(xml, "code_block", info = "{r}", "1 + rnorm(1)\n")
xml
#> {xml_document}
#> <document>
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...
#> [2] <code_block info="{r}">1 + rnorm(1)\n</code_block>
xml2::xml_find_all(xml, "//code_block")
#> {xml_nodeset (1)}
#> [1] <code_block info="{r}">1 + rnorm(1)\n</code_block>
sty <- xml2::read_xml(tinkr::stylesheet())
cat(xslt::xml_xslt(xml, sty))

We can now add new nodes and use XPath without namespace prefixes or objects,
but we have lost the ability to use our stylesheet :(

But! Maybe we can do this by adding the namespace at the last minute!

xml2::xml_set_attr(xml, "xmlns", "http://commonmark.org/xml/1.0")
xml
#> {xml_document}
#> <document xmlns="http://commonmark.org/xml/1.0">
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...
#> [2] <code_block info="{r}">1 + rnorm(1)\n</code_block>
cat(xslt::xml_xslt(xml, sty))
#> this is a **test**
#> 
#> ```{r}
#> 1 + rnorm(1)
#> ```

Harnessing the power of namespaces

When you know that a prefixed namespace only applies to nodes with that
prefix and all other nodes have no namespace, then you can add nodes that
serve as anchors in your document or hide markdown elements. Let’s say we
wanted to hide all markdown elements except for code blocks. One way we could
do this is to set up a namespace and add a prefix to all non-code-block nodes:

xml <- commonmark::markdown_xml("this is a **test**") %>%
  xml2::read_xml() %>%
  xml2::xml_ns_strip()
xml2::xml_add_child(xml, "code_block", info = "{r}", "1 + rnorm(1)\n")
xml
#> {xml_document}
#> <document>
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...
#> [2] <code_block info="{r}">1 + rnorm(1)\n</code_block>
# Set the prefixed namespace in your document
xml2::xml_set_attr(xml, "xmlns:tnk", "https://docs.ropensci.org/tinkr")
# Find all nodes that are not code blocks
nocode <- xml2::xml_find_all(xml, ".//*[not(self::code_block)]")
nocode
#> {xml_nodeset (4)}
#> [1] <paragraph>\n  <text xml:space="preserve">this is a </text>\n  <strong>\n ...
#> [2] <text xml:space="preserve">this is a </text>
#> [3] <strong>\n  <text xml:space="preserve">test</text>\n</strong>
#> [4] <text xml:space="preserve">test</text>
# Change the namespace of these nodes
purrr::walk(nocode, xml2::xml_set_namespace, "tnk", "https://docs.ropensci.org/tinkr")
xml
#> {xml_document}
#> <document xmlns:tnk="https://docs.ropensci.org/tinkr">
#> [1] <tnk:paragraph>\n  <tnk:text xml:space="preserve">this is a </tnk:text>\n ...
#> [2] <code_block info="{r}">1 + rnorm(1)\n</code_block>
xml2::xml_set_attr(xml, "xmlns", "http://commonmark.org/xml/1.0")
sty <- xml2::read_xml(tinkr::stylesheet())
cat(xslt::xml_xslt(xml, sty))
#> ```{r}
#> 1 + rnorm(1)
#> ```

Conclusion

While developing {tinkr} we[1] struggled a lot with understanding namespaces.
This guide was our attempt at demystifying working with namespaces in {xml2}.
For the casual user of {tinkr} who is interested in extracting data from
markdown documents, this guide is not very useful, but we hope that it
proves useful for the user who wants to use {tinkr} for cleaning and
standardizing their markdown documents.

[1] Well, mostly just Zhian.

Created on 2021-05-28 by the reprex package (v2.0.0)

@zkamvar zkamvar marked this pull request as ready for review May 27, 2021 23:18
Member

@maelle maelle left a comment

Very interesting 🤓 👏

My most important comment would be that the vignette lacks an introduction (what's the problem here?) and conclusion.

vignettes/namespaces.Rmd (outdated)

```{r}
d <- xml2::read_xml("<document>
<paragraph xmlns='http://commonmark.org/xml/1.0'>
Member

invalid URL

Member Author

Oh yeah, that's going to be a fun one for CRAN. That's the namespace we have for the stylesheet:

xmlns:md="http://commonmark.org/xml/1.0">

Fun fact that I need to include in the vignette: namespaces must be a valid URI, but do not have to be a valid URL. The documentation on this is painfully obtuse: https://www.w3.org/TR/xml-names/#sec-namespaces

Member Author

We could probably obfuscate this by adding a function that creates the URI so that CRAN doesn't pick it up in its scans.

Member

Actually I now realize it won't be found cf https://github.com/wch/r-source/blob/277dc7c97155e7dcc3f0649bc1bc7731a9f26b74/src/library/tools/R/urltools.R#L78 (since the URL won't be a link in the HTML file).

Member

However, in the text we might want to explain it's a URI?

zkamvar and others added 2 commits May 28, 2021 07:25
Co-authored-by: Maëlle Salmon <maelle.salmon@yahoo.se>
@zkamvar
Member Author

zkamvar commented May 28, 2021

> Very interesting 🤓 👏
>
> My most important comment would be that the vignette lacks an introduction (what's the problem here?) and conclusion.

This is a really good point and really highlights my writing style, it's a bit like building a sandwich inside-out: I start with a salad and add the bread at the end.

Member

@maelle maelle left a comment

Awesome work, mostly nitpicky comments. 😺

vignettes/namespaces.Rmd (outdated)
cat(xslt::xml_xslt(xml, xslt_style))
```

Read on to find out more about XML namespaces and their implications on your
Member

"tinkering" 😁

cat(xslt::xml_xslt(xml, sty))
```

### Perils: adding nodes
Member

Is "perils" a common word? I understand it because it looks like the French word, but might it be a better idea to use "risks"?

Member Author

I think peril gives a better sense of "something that can not be avoided on this path" as opposed to risk, which has a random component.


## Harnessing the power of namespaces

When you know that namespaces with prefixes will only respond to nodes with that
Member

Add a practical use case (in words, not code necessarily) for this?

Member Author

Good question! I use the alternate namespace in {pegboard} to help me identify and label pandoc fenced-div sections by adding pairs of equivalently labeled tags that are not part of the markdown document so I can easily parse the content with find_between()

Otherwise, I can see the masking pattern useful if you wanted to create several versions of the same prose in a single document (e.g. if you were creating a quiz that you wanted randomized per student).

standardizing their markdown documents.

[reparse]: https://community.rstudio.com/t/adding-nodes-in-xml2-how-to-avoid-duplicate-default-namespaces/84870/2?u=zkamvar
[^1]: Well, mostly just Zhian.
Member

The truth is you took the time to understand them, I just had to read your vignette!


While developing {tinkr} we[^1] struggled a lot with understanding namespaces.
This guide was our attempt at demystifying working with namespaces in {xml2}.
For the casual user of {tinkr} who is interested in extracting data from
Member

You could still write the take-home message for them i.e. what they need to know for normal use?

zkamvar and others added 2 commits May 31, 2021 10:37