-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parsing future opam files take 2 #44
Conversation
Completely reimplements the best-effort parsing mode. The previous attempt failed if a lexing or parsing error occured before the opam-version item had been fully parsed. This revised version begins by reading three tokens from the lexer directly to parse the opam-version, if it's present. It _then_ calls the ocamlyacc parser (feeding the three tokens back to it). If the opam-version parsed is greater than the library's internal version, then _any_ exception causes just the parsed opam-version header to be returned (which is sufficient for the client to display that the file is newer than itself). Finally, in order to permit, for example, opam 2.2 to have new fields but no new lexer or parser, exceptions cause both the opam-version variable to be returned and also a sentinel group with kind `#` which can be used by the client to determine that a parsing/lexing error occurred (and where).
Wow, quite complex but seems to do the job — I trust you have tested it in various scenarios ;) The parsing part is LGTM, but I think you should also have a look at the printer to enforce the invariants there too: may make some mistakes much easier to spot and fix. |
Oh, I hadn't checked Normalise. For the others (Preserved, etc.) I don't think anything wants doing - there's a certain of "garbage in garbage out" - so if you feed an invalid file to the printer (e.g. where you've put the |
For the correctness, I added some additional tests for it 🙂 |
opam-version with a value greater than 2.1 always comes first.
Extra commit pushed - I should concoct some tests for it, though. |
Does the |
🤦 |
We should probably refactor the printer code completely to convert the old format to the full pos one (with dummy locations) and then print that... |
That extra commit causes all of the I think this is ready to go now? |
This is complete reimplementation of #43 with the following changes:
The first attempt couldn't cope with, say,
opam-version: "42" <
because the parsing error occurs before theVariable
has been completely parsed. This revised version instead reads three tokens from the lexer and so "sniff" theopam-version
before parsing starts. With this approach, there's no need to do all of the global state mucking around - we already know if the opam file is a "future" one and so any exception can instead just return theopam-version
line we successfully parsed. In order to allow a future version of opam to intentionally stick with an old parser (e.g. because opam 2.2 adds no new lexing/parsing rules), as well as theopam-version
item, a sentinel section of kind#
is returned which opam can then use to determine that there was actually a parse error.Finally, I thought of a devious, yet curiously actually valid, way of raising
OpamLexer.Error
fromOpamBaseParser.main
which means thatopam-version
in the wrong place now includes a description of the problem rather than just "parse error".