Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix escaping of argument lists in doc generator, expose JSON #10109

Merged
merged 10 commits into from
Jun 13, 2021

Conversation

oprypin
Copy link
Member

@oprypin oprypin commented Dec 20, 2020

(Click to view the original, obsolete description of this PR)

While working on a doc generator, I was checking out linkification and syntax of type restrictions, like this one

https://crystal-lang.org/api/0.35.1/Array.html#each_product(arrays:Array(Array),reuse=false,&)-class-method

def self.each_product(arrays : Array(Array), reuse = false, &)

Crystal's doc generator only puts this into JSON output:

"args_string": "(arrays : Array(Array), reuse = <span class=\"n\">false</span>, &)",

It's a really weird mix because I'm pretty sure that code is trying to avoid putting HTML into the output, but instead omits it only selectively.

I am changing it to put this:

"args_string": "(arrays : Array(Array), reuse = false, &)",
"args_html": "(arrays : <a href=\"Array.html\">Array</a>(<a href=\"Array.html\">Array</a>), reuse = <span class=\"n\">false</span>, &)",

The former is a fix (one that is likely to pop up in in some other places, not just the JSON output),
while the latter is an addition that I need for my doc generator, as I need to detect where Crystal decides to linkify the types and such.
One could argue that the generator should instead output some semantically defined structure to convey this information, not just HTML that it happens to have on hand, but really the HTML works just as well, and it will definitely unblock me.
Besides, the generator was already outputting HTML, just not in a useful way.

I am not developing any new code, just clearing up the mixup in conditionals between the "string" and "HTML" outputs that already exist (except for macro.cr, but that's also just an omission that I'm fixing).

I recommend to view the diffs commit-by-commit.


This pull request revolves around the args_to_html method. In Crystal doc generator it is used to convert a method's arg signature into HTML that is syntax-highlighted, and with cross-links inserted -- all escaped as appropriate.

E.g.

def pretty_inspect(width = 79, newline = "\n", indent = 0) : String
(width = <span class="n">79</span>, newline = <span class="s">&quot;\n&quot;</span>, indent = <span class="n">0</span>) : <a href="String.html">String</a>

Because docs also have "method summary" entries without links (but still with syntax highlighting), that method has an escape hatch links: false. The call with this arg is aliased as args_to_s.

But there are also several places where non-highlighted and non-escaped original text is wanted. Unfortunately, currently that is achieved by just deleting everything in angle brackets <...> in that args_to_s. That has some issues, but the biggest one is that this escaping is not being undone!

This args_to_s is then used to produce HTML heading IDs, and links to them, and currently they are malformed in many situations.

So, this pull request implements the missing third mode - to just not produce any HTML, and then args_to_s is changed to use that mode.

args_to_html

Tier Escaping Highlighting Links
None
Highlight + +
All + + +

On this table the top row is the new one. args_to_s used to be like Highlight, but is fixed now. And several workarounds that "post-processed" args_to_s are removed.


This args_to_s is also directly put into the JSON produced by the doc generator, presumably with the intention of being a plain string. But actually HTML ended up in it!

"args_string": "(width = <span class=\"n\">79</span>, newline = <span class=\"s\">&quot;\\n&quot;</span>, indent = <span class=\"n\">0</span>) : String",

So (actually as my original motivation for this PR) I am changing it to put this:

"args_string": "(width = 79, newline = \"\\n\", indent = 0) : String",
"args_html": "(width = <span class=\"n\">79</span>, newline = <span class=\"s\">&quot;\\n&quot;</span>, indent = <span class=\"n\">0</span>) : <a href=\"String.html\">String</a>",

In addition to a proper text mode, I am keeping a full HTML mode (even with links), because I actually need it for my own doc generator, to detect where Crystal decides to linkify the types and such.

@oprypin

This comment has been minimized.

@oprypin

This comment has been minimized.

@oprypin

This comment has been minimized.

@oprypin oprypin changed the title Fix consistency between arg_to_s and arg_to_html, expose args_html Fix escaping of argument lists in docs, better expose them to JSON Jan 2, 2021
@oprypin
Copy link
Member Author

oprypin commented Jan 2, 2021

Now all good. I had to pivot this PR to also fix some issues with wrong escaping of HTML ids.

Current doc diff (all changes are improvements)
One particularly interesting improvement
Another particularly interesting improvement

@oprypin
Copy link
Member Author

oprypin commented Jan 2, 2021

@jhass, may I request your review

@straight-shoota
Copy link
Member

I'm not convinced by the configuration enum in c359f8f.
I would propose to replace html with two independent bool args highlight and links. It's more explicit because I don't have to look up the enum definition to learn that :all actually means highlight and links.

@oprypin
Copy link
Member Author

oprypin commented Jan 2, 2021

There are actually 3 bool args then: escape, highlight, links.

Implementing 23 possibilities when actually there are only 3 is of dubious value. The implementation becomes more complicated (with many more edge cases) but those options are never used.

We could assume that no-highlight-no-links means no-escape (which is dirty enough already), but that still leaves the no-highlight-with-links branch which is kinda undefined but would need to be implemented.

I actually tried this first, btw.

As implemented now, we have tiers.

Tier Escaping Highlighting Links
None
Highlight
All

@oprypin
Copy link
Member Author

oprypin commented Jan 2, 2021

Additional diff for the latest added commit aabdf28 - basically ends up fixing IDs of <=> method

@oprypin
Copy link
Member Author

oprypin commented Jan 2, 2021

Reworked the description of the pull request to this:

This pull request revolves around the args_to_html method. In Crystal doc generator it is used to convert a method's arg signature into HTML that is syntax-highlighted, and with cross-links inserted -- all escaped as appropriate.

E.g.

def pretty_inspect(width = 79, newline = "\n", indent = 0) : String
(width = <span class="n">79</span>, newline = <span class="s">&quot;\n&quot;</span>, indent = <span class="n">0</span>) : <a href="String.html">String</a>

Because docs also have "method summary" entries without links (but still with syntax highlighting), that method has an escape hatch links: false. The call with this arg is aliased as args_to_s.

But there are also several places where non-highlighted and non-escaped original text is wanted. Unfortunately, currently that is achieved by just deleting everything in angle brackets <...> in that args_to_s. That has some issues, but the biggest one is that this escaping is not being undone!

This args_to_s is then used to produce HTML heading IDs, and links to them, and currently they are malformed in many situations.

So, this pull request implements the missing third mode - to just not produce any HTML, and then args_to_s is changed to use that mode.

args_to_html

Tier Escaping Highlighting Links
None
Highlight + +
All + + +

On this table the top row is the new one. args_to_s used to be like Highlight, but is fixed now. And several workarounds that "post-processed" args_to_s are removed.


This args_to_s is also directly put into the JSON produced by the doc generator, presumably with the intention of being a plain string. But actually HTML ended up in it!

"args_string": "(width = <span class=\"n\">79</span>, newline = <span class=\"s\">&quot;\\n&quot;</span>, indent = <span class=\"n\">0</span>) : String",

So (actually as my original motivation for this PR) I am changing it to put this:

"args_string": "(width = 79, newline = \"\\n\", indent = 0) : String",
"args_html": "(width = <span class=\"n\">79</span>, newline = <span class=\"s\">&quot;\\n&quot;</span>, indent = <span class=\"n\">0</span>) : <a href=\"String.html\">String</a>",

In addition to a proper text mode, I am keeping a full HTML mode (even with links), because I actually need it for my own doc generator, to detect where Crystal decides to linkify the types and such.

oprypin added a commit to mkdocstrings/crystal that referenced this pull request Jan 2, 2021
@oprypin
Copy link
Member Author

oprypin commented Jan 4, 2021

@straight-shoota Did my reply convince you or not?

@straight-shoota
Copy link
Member

Not totally yet. I'm a bit undecided. Maybe either way is okay. W

@oprypin

This comment has been minimized.

@oprypin
Copy link
Member Author

oprypin commented Apr 2, 2021

Ping

@straight-shoota straight-shoota added this to the 1.1.0 milestone Jun 10, 2021
@straight-shoota straight-shoota changed the title Fix escaping of argument lists in docs, better expose them to JSON Fix escaping of argument lists in doc generator, expose JSON Jun 13, 2021
@straight-shoota straight-shoota merged commit 16fd7df into crystal-lang:master Jun 13, 2021
@Sija
Copy link
Contributor

Sija commented Jun 14, 2021

master is broken by this PR.

In src/compiler/crystal/tools/doc/type.cr:511:46

 511 | node_to_html entry.value, io, links: links
                                            ^----
Error: undefined local variable or method 'links' for Crystal::Doc::Type

@Sija Sija mentioned this pull request Jun 14, 2021
straight-shoota added a commit to straight-shoota/crystal that referenced this pull request Jul 2, 2021
This fixes a regression from crystal-lang#10109 which removed the id property
straight-shoota added a commit that referenced this pull request Jul 3, 2021
This fixes a regression from #10109 which removed the id property
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants