Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal doc for codified VUs #2043

Closed
wants to merge 1 commit into from
Closed

Conversation

ShabbyX
Copy link
Contributor

@ShabbyX ShabbyX commented Jan 31, 2023

No description provided.

if (attachment.imageView != VK_NULL_HANDLE and
attachment.resolveMode != VK_RESOLVE_MODE_NONE):
resolveLayout = attachment.resolveImageLayout
require(resolveLayout != VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL and
Copy link
Contributor

@oddhack oddhack Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One readability thing we're likely to run into is that many Vulkan tokens are extremely long, and come near to running off the right side of the window as is, so will be much more common to see that when they are formatted as source code in a require() expression like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed that's partially why this example uses a variable to make the LHS of the comparison shorter. I would very much like to fix that, especially because line wraps and python scoping don't really mix well.

In fact, we already have this problem with our promoted enums (see wrapped lines at the bottom of these enums for example): https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkFormatFeatureFlagBits.html

IMO, the best way to deal with that is to have code blocks (not just these VUs, but also struct definitions, enums, etc) be under a div that can get horizontal scrolling. We could generally avoid that by structuring our sources to use shorter names for long expressions, but when not possible, it'll be good to have that automatically avoid overflow.

Another way to (mostly) address it by the way is to enhance the formatter to track length of expressions and automatically break expressions into multiple lines. We have these sort of code formatters in Chromium scripts, but I think a generic one like that is going to be too much complexity. Right now, what I have coded is unconditional line breaks between "and" and "or" expressions, and no line breaks otherwise. I'm hoping that would be generally sufficient for VUs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, currently in #2053 I've opted for disabling line wrapping for codified VUs.

@darksylinc
Copy link

Finally this is getting addressed :)

Yeah the current descriptions are cryptic. Using code is a valid proposal.

Sounds good, but I am bit skeptical that Python indentation could cause problems as these messages will be printed by validation layers, and stdout / text log formatting sometimes contains exotic formatting, e.g. on Android this is unfortunately common:

# What was meant to be printed:
if condition:
    if another_condition:
        some_function();
        another_function();

# What was actually printed:
if condition:
    if another_condition:
[APPLICATION NAME - LOG 9888 10:58:35]            some_function();
        another_function();

# With this, there's no confusion:
if condition {
    if another_condition {
[APPLICATION NAME - LOG 9888 10:58:35]            some_function();
        another_function();
    }
}

Thus I'd prefer if using braces to contain the bodies.

Actually now that I think about it the braces can be generated automatically, thus even if you don't want to use { the validation layer can be built to include them.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 22, 2023

Sounds good, but I am bit skeptical that Python indentation could cause problems as these messages will be printed by validation layers, and stdout / text log formatting sometimes contains exotic formatting, e.g. on Android this is unfortunately common:

Thank you. The reason I went with python indent vs braces is to avoid inventing a language here, which reduces the load on readers. Also without braces you get fewer lines which is a bonus given how big the spec already is :)

You're right that VVL can change to braces, but ultimately the application itself can reformat the message as it wants. That said, regarding Android, I think it's best if the application doesn't output multi-line messages to logcat in the first place.

In other words, I would split the message by lines, then output line by line. So basically you would get this:

[APPLICATION NAME - LOG 9888 10:58:35]  if condition:
[APPLICATION NAME - LOG 9888 10:58:35]    if another_condition:
[APPLICATION NAME - LOG 9888 10:58:35]      some_function();
[APPLICATION NAME - LOG 9888 10:58:35]      another_function();

which I hope you can agree looks fine!

Examples include:

- `require`: The equivalent of `assert`, describes the condition that Vulkan
requires.
Copy link
Contributor

@krOoze krOoze Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of having require() over the VU expression simply returning true or false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting turning require(cond) to if not cond: return false? require() is simpler IMO.

Copy link
Contributor

@krOoze krOoze Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WTF, of course not. Proper way to write if not cond: return false would be return cond (minus the return if we inplicitly assume every VU is boolean expression). Even so, a direct require() equivalent to your example would be if not cond: require(false), which helps nothing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except that doesn't allow you to support multiple require calls in the VU if needed.

See this very simple example: af37bc9#diff-09069e339671f4a6dd446abad39770bcb7b8907bcbb780ed636ec822e38da95d

if we inplicitly assume every VU is boolean expression

That is one idea, but it'll be too restrictive compared with allowing the VU to have any logic.

Copy link
Contributor

@krOoze krOoze Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that shouldn't be a thing. If more than one is really necessary, it points to some problem with it (e.g. the VU not being atomic enough, or the long predicate should be named for better reading pleasure).

PS: what's the diff over simply using && in the above linked example?

PPS: what would be interesting though is batching multiple VUs together if they share the same prolog like:

for( elem in pArray ):
    VUID-1_requires(elem.a == valid_a);
    VUID-2_requires(elem.b == valid_b);

which probably should be how the validation code should look like (although compilers should perform loop merging). And also might help the reader if the VUs are thematically batched together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re PS: That VVL could tell you which of those requires failed, which makes it easier for the reader to understand where exactly the problem is. If VVL says a && b fails, you can't immediately see if it's a or b that has failed.

Re PPS: Yes, indeed that's what I have in mind. It can make the VUs a bit less repetitive.

[source,python]
~~~~
* [[VUID-VkRenderingInfo-colorAttachmentCount-06097]]
for attachment in pColorAttachments:
Copy link
Contributor

@krOoze krOoze Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be something like for a in c_arr(pColorAttachments, colorAttachmentCount) to permit using the great foreach iteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

colorAttachmentCount is implicitly derived from the xml!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Not sure why, but something rubs me the wrong way about that.

It is somewhat brittle. What if noautovalidity gets in the way (e.g. VkWriteDescriptorSet)? If one adds noautovalidity later, he then needs to remember to doctor all the VUs using the length so they still work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for VkWriteDescriptorSet, that's an interesting one for VVL codegen I should deal with!

It seems to me like this is more a problem for VVL though than the spec. Like the spec VUs would say for info in pWriteInfo:, and semantically it's talking about all infos in pWriteInfo, however many that may be. The fact that VVL has to go through hoops to figure that out is unfortunate, but not a spec problem.

(in this particular case, what would the VU put in as the count anyway?)

Copy link
Contributor

@krOoze krOoze Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, exactly, what would be put as count?

Implicit behavior is cute and all, but it makes the exceptions all the more surprising. How would you know there is a problem, unless you are forced to write the count explicitly each time?

:sectnums:

In this proposal, a new procedural syntax for writing Valid Usage (VU)
statements in the spec is introduced.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I wonder how daring would be to do it the other way around, rather than inventing yet another language.

I.e. take the C++ layers and generate something reasonable into the spec. It would require some discipline though with the validation layer architecture and codestyle, which so far was little bit organic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite daring I'd say :)

So far, VVL doesn't seem very amenable to large-scale changes, and autogen for these is a smaller scale than that. It was suggested before though, that validation could live outside the spec (but probably still in this syntax).

It's quite hard to go from C++ to something simple to read I'd say. The other way around is much easier (even if it's still quite a big change)

Copy link
Contributor

@krOoze krOoze Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way feels like starting from scratch. Starting with C++ would though mean we ideally end up with one language and one source instead of two. (BTW, is the part of this to still attempt to back-translate to human-readable language, or given the audience we can afford to stick to codified VUs only?)

C++ is simple to read to me. Well, depends on the writer. The examples in this proposal are relatively C89-istic , and not quite leveraging the more beautiful parts of Python anyway.

As I was saying in the forums, I think it would be great if the codified VUs were more semantic than procedural. And ideally more one-liney. E.g. I think most of the array VUs should look something like all( filter( pArray, only_elements_with_this_property ), element_be_valid );, rather than a web of raw fors and ifs. Similarly so we don't keep repeating if parameter is not NULL: require( parameter == valid_value ), there should be some high-level semantic shorthand way of expressing that like optional(parameter) == valid_value.

Not sure if this is too esthetic concern at this point. Then again, might be hard to do if not done in this style from the start.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, is the part of this to still attempt to back-translate to human-readable language, or given the audience we can afford to stick to codified VUs only?

That's the assumption, yes, that we're all programmers so we can read simple code just fine.

As I was saying in the forums, I think it would be great if the codified VUs were more semantic than procedural. And ideally more one-liney. E.g. I think most of the VUs should look something like all( filter( pArray, only_elements_with_this_property ), element_be_valid );, rather than a web of raw fors and ifs.

There's a balance to be had here and I think we can adjust over time. Requiring more of python (say lambdas) makes this technically more difficult (bear in mind that I'm tackling VVL code generation as well), but that aside it also means Vulkan developers would have to have some non-trivial familiarity with Python.

As much as I'd love to have that, I'm hesitant to add anything that makes VUs unreadable to people who don't know Python. You are right that the syntax is basic, and that was on purpose. Anyone can read if x > 0:, even if they've never seen Python before.

That said, I've already noticed some cases that become too verbose with this language, like the one before last in this file: af37bc9#diff-6c1b4ac11212d29616a39eb88d49ade5da7def6269874081730c261ae2f61562
which is exactly where I'd like to be able to use any(...). I'm still considering how to best approach that, but definitely as a "next step", as this is already a big job.

Copy link
Contributor

@krOoze krOoze Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is pretty much modern C++ too (not necessarily Python), so it is not that unreadable. Actually it may be more readable to people outside programming, because it looks more English or more like standard math expression: ∀elem∈Arr: elem = valid_elem.

Problem with procedural style is it is too preoccupied to describe how things should be executed rather than to transfer information\meaning from one reader to another. Even if everyone\me understands how (raw) for and if works, it takes longer and is strenous to mentally parse if in large numbers (e.g. scrolling through that link seeing 420 fors, my brain immediately died like if I was looking at white noise). It is like seeing raw new; it is slightly archaic and annoying code style (your brain tries to simulate the iteration to interpret what it means, instead of just directly read what it conceptually means. Some property applying to all elements of an array is a concept\meaning. Iteration\for is an irrelevant implementation detail.)

Copy link
Contributor

@krOoze krOoze Oct 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncesario-lunarg It helps testing in the way that it makes it nearly irrelevant. The tests largely test whether we manually translated the human-readable VU to code correctly. That largely falls if VU is code, right? What would newly be needed is other quality controls; e.g. checking if performance characteristics of validation is sensible, and if the error messages are as useful and legible as can be (this may require some further smarts, and likely some manual interventions).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests largely test whether we manually translated the human-readable VU to code correctly.

Yes, but it also exercises the checks on a number of different drivers/platforms/etc. Even if the checks are 100% accurate and correct, I would say the checks are only correct in theory without actually running them on some HW/drivers. That said, if it were possible, generating the spec from code does seem more appealing the more I think about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the subject of testing, it feels to me like if mock ICD could somewhat accurately represent different devices (i.e. respond to complex queries the same way the real driver does), you could rely on it for a lot of the testing (other than for GPU-AV of course). Like you could reduce your costs a lot by testing on different emulations of devices, and then, like, run a nightly test on real hardware.

It'll probably be a lot of work to make that happen though.

That said, if it were possible, generating the spec from code does seem more appealing the more I think about it.

Glad to hear that 🎉

Did you see my presentation at the last F2F?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to hear that 🎉

Just to clarify, this is w.r.t. to the general idea of generating the spec from the C++ code that actually makes the runtime check. Which is somewhat ideal, but I'm not sure how one would achieve it.

I do think having VUs codified would yield some great benefits (like the ones you've highlighted at the F2F, which I did see ;)), but making that change this late in Vulkan 1.x's development still seems too disruptive IMHO.

Copy link
Contributor

@krOoze krOoze Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncesario-lunarg Not sure what VVL tests have to do with HW. VVL runs against API usage not against HW\driver. Conformace suite is for testing the HW\driver.

At most VVL touches the driver is querrying some limit value. The classes of errors caused by codified VUs would be different though, and exhaustive per-VU tests do not seem like the correct quality control strategy. There should be like only some whitebox tests that verify the actual conversion to "real" code does not include integer overflow or something silly like that. I.e. unit of the test should basically be each code generation function, and not have per-VU (much less per-HW) tests like it is now.

@ShabbyX Somewhat related. I think there should also be structured Valid Driver Behavior in the spec: #2004. And perhaps become somewhat codified too.

@darksylinc
Copy link

In other words, I would split the message by lines, then output line by line. So basically you would get this:

[APPLICATION NAME - LOG 9888 10:58:35]  if condition:
[APPLICATION NAME - LOG 9888 10:58:35]    if another_condition:
[APPLICATION NAME - LOG 9888 10:58:35]      some_function();
[APPLICATION NAME - LOG 9888 10:58:35]      another_function();

which I hope you can agree looks fine!

Oh I think you misunderstand.

THIS:

if condition:
  if another_condition:
[APPLICATION NAME - LOG 9888 10:58:35]      some_function();
    another_function();

is how it looks if you try to print it line by line. Android is a disaster.

Internally the raws in logcat print the [APPLICATION NAME - LOG 9888 10:58:35] on every line, but Android Studio will filter that out and sometimes and randomly it gets it wrong on some lines so we end up with the broken version.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 23, 2023

Internally the raws in logcat print the [APPLICATION NAME - LOG 9888 10:58:35] on every line, but Android Studio will filter that out and sometimes and randomly it gets it wrong on some lines so we end up with the broken version.

Yeesh! I don't have any experience with Android Studio, I've generally been doing adb logcat | grep $MYLOGPREFIX

As you said though, changing scope to braces could always be done as a postprocess step (even in the app) if the developer prefers it.

@Izhido
Copy link

Izhido commented Mar 25, 2023

I would like to propose something completely different.

If complex prose cannot be done right in one line of text (understandable), then instead of killing the prose, why not kill the "one line" part?

Instead of explaining the problem in one line, let's generate an URL to a site containing the multi-line (multi-paragraph? multi-page??) explanation. Since validation messages contain sometimes data from the app itself, included in the message, let's send these pieces if data as parameters in the URL - the site itself should be able to parse them and substitute them into the explanation template, thus creating quite a complete explanation, with links even to other sites for further help.

Having to learn a second language to understand validation errors, in my opinion, is going to be more frustrating that what we have right now.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 26, 2023

If complex prose cannot be done right in one line of text (understandable), then instead of killing the prose, why not kill the "one line" part?

Instead of explaining the problem in one line, let's generate an URL to a site containing the multi-line (multi-paragraph? multi-page??) explanation. Since validation messages contain sometimes data from the app itself, included in the message, let's send these pieces if data as parameters in the URL - the site itself should be able to parse them and substitute them into the explanation template, thus creating quite a complete explanation, with links even to other sites for further help.

I've been toying with that idea too, and I completely agree that'd be a useful thing. I think that's orthogonal though. The valid usage language (VUs) are basically assertions; they define what assumptions the driver can make about its Vulkan input (and obviously the application must adhere to). As such, they need to be as precise as can get. On the other hand, it's useful for application developers to understand why those assertions are there, and I agree a wiki of some sort that explains a VU in details would be nice to have.

It's not an easy task though. For one, the VUs keep getting refined and VUIDs change (in my observation, mostly because the VUs can be ambiguous and imprecise right now). I'm hopeful that codified VUs would improve that, but nevertheless this also requires someone to actually write helpful explanations for thousands of VUs, and unless any company is willing to pay for that, it's unlikely to materialize unfortunately.

If that does happen one day, for the record my idea is for the validation layers to output the VU and then a url like https://vulkan-vus-explained.com/VUID-command-param-1234 which would take the developer to the explanation page for that VUID. All this said, this would be a project completely outside the scope of the Vulkan spec.

Having to learn a second language to understand validation errors, in my opinion, is going to be more frustrating that what we have right now.

I have to admit this is the first time anyone has mentioned that reading VUs in the proposed language would be harder than prose. In truth, the language reads like prose (because Python!), but generally easier to understand for programmers. Take this VU for example

If the flags member of any element of pCreateInfos contains the VK_PIPELINE_CREATE_DERIVATIVE_BIT flag, and the basePipelineIndex member of that same element is not -1, basePipelineIndex must be less than the index into pCreateInfos that corresponds to that element

The very first apparent issue is the back-references, which can easily get confusing as the VU gets more complex. In the above, "same element" and "that element" are referencing "an element of pCreateInfo". In mathematics, you'd assign a symbol to "an element of pCreateInfo" so you can avoid ambiguous references like that (well, not really ambiguous in this simple VU, but not all VUs are this simple). In programming, we use a variable.

Also, if you pay close attention, the above VU is effetively a program itself, just written in an unusual way:

  • any element of pCreateInfos: this is a for loop
  • the flags member of X contains the VK_PIPELINE_CREATE_DERIVATIVE_BIT flag: This is a bit test of X.flags against VK_PIPELINE_CREATE_DERIVATIVE_BIT
  • the basePipelineIndex member of X is not -1: This is X.basePipelineIndex != -1
  • basePipelineIndex must be less than the index into pCreateInfos that corresponds to that element: This is assert(X.basePipelineIndex < i) (where i is the loop counter)

Notice how everything above is the usual programming constructs we see everyday. So, the above VU can be written as:

for info in pCreateInfos:
  if (info.flags.has_bit(VK_PIPELINE_CREATE_DERIVATIVE_BIT) and
      info.basePipelineIndex != -1):
    require(info.basePipelineIndex < loop_index(info))

Is that less readable in your opinion?


I agree that there'd be some learning involved. In particular, these functions added to the VUs (require, has_bit etc) need a definition (a la 79bfe41), and you'd have to click through to their definition to know what they do (though hopefully we can make sure the names are descriptive enough in most cases).

However, I'd argue that it's a price worth paying. My logic basically is that, in analogy, it's better to pay the cost of learning what ∫ is so you could talk about integrals succinctly, than to have to take a paragraph to describe integrals on every reference. In other words, just like in mathematics, this proposal is introducing symbols and conventions which need learning, but once done make life so much easier.

What do you think?

@krOoze
Copy link
Contributor

krOoze commented Mar 26, 2023

@Izhido I don't think there is any word limit now. But with English the issue is mo words mo problems. Not many recognize this unfortunately, but with some effort it is possible to write code that is self-explanatory. Therefore code might have higher limit of what can be achieved on this front. (Plus benefits that with code "what-you-see-is-what-is-validated" could be achieved)

As for language, since spec is C, I would kinda prefer C++ subset.

@ShabbyX I presume Python is chosen largely because of bracephobia (which personally I don't understand, and actually like structure). Python feels like it would lead to awkwardness with pointers and whatnot. E.g. say I want to express pointer must not be zero and what it points to must also not be zero? Plus, means it cannot be used in layers verbatim.

@polarina
Copy link
Contributor

Is that less readable in your opinion?

Has it been considered or explored the possibility of generating English VU statements automatically from the codified VUs?

@krOoze
Copy link
Contributor

krOoze commented Mar 26, 2023

@polarina I was asking that above, and the holy grail is that we ideally wouldn't need to, as programmers prefer code.

I assume low quality English should be relatively easy. High quality would be tricky.

Assuming the code is written decently(say something like all( inArray(pCreateInfos).thatAre(DerivativePipeline), HaveBasePipelineEarlierInTheArray)), then what do we even need English for?

Copy link

@jzulauf-lunarg jzulauf-lunarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading over this it's gel'd to me that computer parsable VU's statements are a good idea, and expanding the regime of code-generated validation is important, that formalizing a state/state-query model is a great idea. However, an imperative "the-spec-is-code" approach isn't going to get us to the goals of greater automation and coverage.

  1. You're changing where validation coding is done, not expanding code generation, that's really the opposite of the goal of greater automation and coverage.
    a. If you don't actually attempt to run these code snippets, you're generating masses of untested code. The first rule of testing is "Any code that isn't tested is broken."
    b. In order to test the code snippets you have to pull in the whole of the vvl build/test CI infrastructure, which is high impact to the spec generation process.
  2. There are rules/requirements description languages that are capable of representation correctness/validity statements. These languages (with a robust state/query model) would support code generation as well as natural language renderings. There are several that appear to have active academic and industrial development behind them. The approaches vary from declarative notations (including XML representation), to "constrained natural language" approaches -- which while directly readable is in fact in a non-ambiguous computer readable formalization. None of these use an imperative approach AFAICT from a very brief survey.
  3. The imperative/python is (when seen as as examples) not more readable than the current English language statements, and natural language rendering of the rules isn't possible (or easily possible)
  4. A "white-space significant" language isn't well suited to error message presentation.

Having said all that, continuing to move forward, especially in an incremental, minimally disruptive way to find a good, automatable, rule based approach, and a solid state/query model (perhaps even an extension implementable by layer) are important.

:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

In this proposal, a new procedural syntax for writing Valid Usage (VU)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Procedure (imperative) vs. declarative is something that seems worrisome in terms of complexity, comprehensibility. If we can't declare the requirement without defining the procedure to establish the validity, then perhaps the requirement is not well defined.

Semantic rule definition languages seem far better suited that defining the valid usage algorithmically. They are also far more susceptible to conversions to natural language (with the rule language as a final arbiter for meaning).


== Problem Statement

Originally, Vulkan VUs were written in prose (and continue to do so).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worse, some portion of the validity checks require information that only occurs within the general prose of the text.

Other issues that vex VUID's include redundancy. There are classes of VUID's that repeat the same the requirements across several calls which could be better represented as "resource enumerated within pThings, must comply with concept Thing"

Not clear that the prose statement of VU's is the most critical issue...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worse, some portion of the validity checks require information that only occurs within the general prose of the text.

Yes, part of the complexity of having codified VUs is that we actually have to fix the spec!

Other issues that vex VUID's include redundancy. There are classes of VUID's that repeat the same the requirements across several calls which could be better represented as "resource enumerated within pThings, must comply with concept Thing"

Not clear that the prose statement of VU's is the most critical issue...

That's a good point, and I'm sure the spec can be reworked to "refactor" things into concepts where applicable. I'd like to point out however that codified VUs unlock more than accurate assertions.

For example, I'd really like to ultimately be in a place where VVL is autogenerated from the spec, such that CTS writes can have VVL right away. That both makes CTS writing easier, and we'd avoid amassing validation errors in CTS.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 27, 2023

@ShabbyX I presume Python is chosen largely because of bracephobia (which personally I don't understand, and actually like structure). Python feels like it would lead to awkwardness with pointers and whatnot. E.g. say I want to express pointer must not be zero and what it points to must also not be zero? Plus, means it cannot be used in layers verbatim.

Well no, I chose Python because:

  • It reads like English
  • I didn't want to invent a new language (i.e. less learning burden on readers)
  • ast.parse() makes parsing a one-liner, and AST traversal is a breeze, this greatly reduced the complexity of this work

Speaking of layers, there's always a solution for corner cases (like deref(pointer) != NULL if we end up having to). And so far pointer-level being implicit hasn't given me grief in KhronosGroup/Vulkan-ValidationLayers#5438 🤷

And if anyone is interested, this is what I have coming out of VVL from that PR now:

image

(stuff in {{...}} are the auto-extracted values, and yes the highlighting is real)

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 27, 2023

Reading over this it's gel'd to me that computer parsable VU's statements are a good idea, and expanding the regime of code-generated validation is important, that formalizing a state/state-query model is a great idea. However, an imperative "the-spec-is-code" approach isn't going to get us to the goals of greater automation and coverage.

First off, the time for this feedback was three months ago: https://gitlab.khronos.org/vulkan/vulkan/-/issues/3321 I'm sure everyone has a different preference for the syntax (and there will never be one that satisfies everyone), but I certainly won't personally find the time to redo everything from scratch. I'd be more than happy to hand this off to validation experts feeling that I've proven this is not an impossible thing.

  1. You're changing where validation coding is done, not expanding code generation, that's really the opposite of the goal of greater automation and coverage.
    a. If you don't actually attempt to run these code snippets, you're generating masses of untested code. The first rule of testing is "Any code that isn't tested is broken."
    b. In order to test the code snippets you have to pull in the whole of the vvl build/test CI infrastructure, which is high impact to the spec generation process.

Aha, but what we have now is also code! It's code written in English; it's untested and sometimes ambiguous code. With codified VUs you get the new option to bring in VVL to CI; something you currently can't do.

My suggestion is to take one step at a time. We continue to produce untested code (in another syntax), and as VVL codegen matures pull it to CI as an improvement. In KhronosGroup/Vulkan-ValidationLayers#5438 I've gated codegen to an allowlist so the ecosystem is never blocked on codegen bugs, but I expect over time that would turn into a blocklist.

I can't stress this enough, what we have today in English is also code and all your arguments apply to it. Or are you concerned that code written in English is less error prone? FYI, in #2081 I've implemented a type checking pass run on the VUs at build time to try and catch at least every bug other than logical errors.

  1. There are rules/requirements description languages that are capable of representation correctness/validity statements. These languages (with a robust state/query model) would support code generation as well as natural language renderings. There are several that appear to have active academic and industrial development behind them. The approaches vary from declarative notations (including XML representation), to "constrained natural language" approaches -- which while directly readable is in fact in a non-ambiguous computer readable formalization. None of these use an imperative approach AFAICT from a very brief survey.

Ack, though like I said I'm not looking forwards to redoing everything. I wish you had brought this up earlier.

As a tangent, I'm not sure I get the appeal of generating natural language out of code. We read code 8+ hours a day, aren't we used to it?

  1. The imperative/python is (when seen as as examples) not more readable than the current English language statements, and natural language rendering of the rules isn't possible (or easily possible)

I wish we could have statistics on this, as otherwise it feels like whether this is more readable or less is just personal opinions of a few of us 😞 (and me and you, we live in the spec and are definitely both biased w.r.t the readability of the spec compared to the average Vulkan user for whom every VU they see is a new thing).

Like for me, something like this needs a few readings to understand:

If the pname:pNext chain includes a
slink:VkRenderPassInputAttachmentAspectCreateInfo structure, the
pname:inputAttachmentIndex member of each element of its
pname:pAspectReferences member must: be less than the value of
pname:inputAttachmentCount in the element of pname:pSubpasses identified
by its pname:subpass member

But this is much easier to follow (again, for me):

if has_pnext(VkRenderPassInputAttachmentAspectCreateInfo):
  aspectInfo = pnext(VkRenderPassInputAttachmentAspectCreateInfo)
  for aspect in aspectInfo.pAspectReferences:
    subpass = pSubpasses[aspect.subpass]
    require(aspect.inputAttachmentIndex < subpass.inputAttachmentCount)

As a spec writer, I also find writing the code easier than English (but granted, it's more important for this to be readable than writable)

  1. A "white-space significant" language isn't well suited to error message presentation.

Any specific reason why formatting cannot be retained in the log? If the formatting is to be garbled, I think the message would be too hard to read anyway even if syntactically correct.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 27, 2023

  1. You're changing where validation coding is done, not expanding code generation, that's really the opposite of the goal of greater automation and coverage.
    a. If you don't actually attempt to run these code snippets, you're generating masses of untested code. The first rule of testing is "Any code that isn't tested is broken."
    b. In order to test the code snippets you have to pull in the whole of the vvl build/test CI infrastructure, which is high impact to the spec generation process.

Aha, but what we have now is also code! It's code written in English; it's untested and sometimes ambiguous code. With codified VUs you get the new option to bring in VVL to CI; something you currently can't do.

My suggestion is to take one step at a time. We continue to produce untested code (in another syntax), and as VVL codegen matures pull it to CI as an improvement. In KhronosGroup/Vulkan-ValidationLayers#5438 I've gated codegen to an allowlist so the ecosystem is never blocked on codegen bugs, but I expect over time that would turn into a blocklist.

I can't stress this enough, what we have today in English is also code and all your arguments apply to it. Or are you concerned that code written in English is less error prone? FYI, in #2081 I've implemented a type checking pass run on the VUs at build time to try and catch at least every bug other than logical errors.

Oh, and as we discussed before with Karen, you'd probably want to have a "negative API" CTS written for the VUs. That's basically the tests that live in VVL today which are produced together with the implementation of the VUs. With codified VUs, you could have both the CTS and the negative CTS (i.e. tests for VVL) developed as part of the spec release (for example by contractors), with VVL coming out for free as autogenerated code.

(or at least that's the ideal, I suspect manual intervention would still be required for a long time, same with VUs that can't yet be expressed as code)

@jzulauf-lunarg
Copy link

jzulauf-lunarg commented Mar 27, 2023

@ShabbyX --

First off, the time for this feedback was three months

Not really. The point of this PR was to get a look at what this would look like, and the proposed usage (code-gen, not code-gen'd, etc.) has been evolving. Sometimes it's tough to really grok something until you see it. The difference isn't mere syntax. Declarative and imperative description are semantically different. We don't want to define steps to validate, we want to define validity.

I certainly won't personally find the time to redo everything from scratch

The proposal isn't the plan of record, and the time put into defining the state/query interface is time well spent, totally conserved, and likely only a fraction of the needed interface design and execution. The lost effort is the handful of python exemplars only.

Aha, but what we have now is also code! It's code written in English;

Except it isn't. No one expect the English to be executable, without human interpretation. Yes, both the English and the python code suffer from issues with possible issues of correctness in representing the requirement, but additionally, the code needs to be syntactically correct, and correctly implement the intended algorithm. It has a higher bar for correctness. Unless we actually run the snippets, they're just untested code.

With codified VUs, you could have both the CTS and the negative CTS (i.e. tests for VVL) developed as part of the spec release (for example by contractors), with VVL coming out for free as autogenerated code.

I think this is possible and very interesting, but I think the imperative form will make this far harder (if not impossible), as somehow you have to decode a requirement from an implementation of the validation algorithm, than from a declarative statement of the validity requirement. A machine readable rule would contain the information needed without attempting to reverse it out from the implementation.

Also note the correct comparison for end user readability is:

if has_pnext(VkRenderPassInputAttachmentAspectCreateInfo):
  aspectInfo = pnext(VkRenderPassInputAttachmentAspectCreateInfo)
  for aspect in aspectInfo.pAspectReferences:
    subpass = pSubpasses[aspect.subpass]
    require(aspect.inputAttachmentIndex < subpass.inputAttachmentCount)

vs.

If the pNext chain includes a VkRenderPassInputAttachmentAspectCreateInfo structure, the subpass member of each element of its pAspectReferences member must be less than subpassCount

not vs. the spec source.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 29, 2023

Declarative and imperative description are semantically different. We don't want to define steps to validate, we want to define validity.

Don't get me wrong, I would obviously be supportive of a syntax that reads even better. But this: "There are rules/requirements description languages that are capable of representation correctness/validity statements" or the above statement are not much to go on with.

Do you have a concrete proposal of what language Vulkan could use that would improve readabilty and "define validity" as you put it?

@krOoze
Copy link
Contributor

krOoze commented Oct 29, 2023

Also not sure about IDLs. Kinda too academic, and overkill here (or at least also like to see specific recommendations to some IDL framework). Maybe we should clearly specify requirements in order to see what best fits them. As I see it they are:

  1. Readability to users (i.e. they should uderstand the API contract from these VUs as best and effortlessly as possible)
    1. Some thought optionally should be given how to present Valid Usage to users in a better form than we found it previously. E.g. permit thematic sorting, invoke the smallest set of most fundamental concepts (so user seeing one VU understands the whole class of them), consolidate VUs (e.g. not iterate arrays multiple times).
  2. Implementability of VVL, with the goal of reduce manual labor and make them more complete.
  3. Present decent error messages when users do make an error anyway

Anything else?


Request for some procedual info from an outsider:

There seem to be relatively large amount of manual labor continuously bound to VVL. Meanwhile @ShabbyX seems to be alone in this R&D that is supposed to largely eliminate all this work.

What I mean to ask, what is the bar here in order to commit to this direction. Or, we have something like a proposal, so when and how it is decided this is a desired direction (and it seems theoretically doable), so we can proceed to specifics of how to do it, and then perhaps not lay it all on one guy to implement it on the side.


I am still kinda stuck up on the idea of doing this in moderm-style non-procedural subset of C++. The hope being I will have small set of concepts and algorithms and certain form of writing VUs, that would be amenable for convincing conversion to English for overwhelming majority of VUs. And that being C++, it can just be copy-pasted in VVL impl.

Would competing proposal be in order for that, or are we still at the stage that this would just be implementation detail of this proposal?

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Oct 29, 2023

@krOoze just FYI, this is being actively discussed internally in Khronos.

Regarding C++, it can always be generated from the current VU language (limited Python). I have that for English (or for that matter, any language potentially) in my fork of the repo. The spec itself will probably not show the VUs in C++, but if there's enough demand, VVL could be made to do that under a flag.

The following is implemented in this change:

- Codified VU parsing
- Codified VU reformatting in source (reflow.py)
- Codified VU formatting for build either in code or english
- Codified VU type checking
- Codified VU customization for build with/without extensions/versions
  * Removes the need for ifdef
- Unit tests

There is no mention of codified VUs in the spec text however at this
point.  That will be added as VUs are converted.
@ShabbyX
Copy link
Contributor Author

ShabbyX commented May 6, 2024

Folded in #2081

@ShabbyX ShabbyX closed this May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants