Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Autogen for Codified VU #5438

Closed
wants to merge 1 commit into from
Closed

Conversation

ShabbyX
Copy link
Contributor

@ShabbyX ShabbyX commented Mar 14, 2023

Per @danginsburg's request, this is work in progress for codegeneration for codified VUs. At this point, it produces actual working code for most codified VUs (though it's still missing some pieces, like calling validation for structs).

DO NOT MERGE

@ShabbyX ShabbyX requested a review from a team as a code owner March 14, 2023 15:44
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 37566.

@juan-lunarg juan-lunarg marked this pull request as draft March 14, 2023 16:06
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 11131 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 11131 failed.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 41082.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 11232 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 11232 failed.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 44100.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 24, 2023

Reworked this to use lvl_genvk and OutputGenerator similarly to other generators.

Currently, VVL has validation functions (virtual PreCallValidate*) for entry points, but not structs. The generator in this PR automatically handles calling validation for structs (which are local to the class, not virtual). It'll be a big change, but you might want to consider moving that to chassis, to organize the rest of VVL better; i.e. add validation for structs inside their own functions instead of adding ad hoc helpers and inline validation.

Early work though, so definitely corner cases to deal with.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 11335 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 11335 failed.

Copy link
Contributor

@ncesario-lunarg ncesario-lunarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal @ShabbyX!

Some general comments/questions/concerns:

  • Would it be possible to either split this commit up and/or provide an outline of specific parts here you would like comments on?
  • It would be helpful to have some links/comments illustrating what the spec looks like that generates the code seen here. Along these lines, seeing how one goes from editing the spec to running VVL code would be very helpful and probably useful for spec authors to see (since spec authors would effectively be involved in VVL development in this proposal).
  • This proposes a large shift from C++ "development" to python "code generation." I don't think anyone here is very familiar with the python ast module and associated APIs, and we would need a non-trivial amount of time (and more importantly "global buy-in," which does not exist yet to my knowledge) to review and understand the python code that is proposed here.
  • Automating VUID checks would be nice, but it generally costs significantly more time and effort to create "good tests" for these VUIDs than implementing their validation check. Knowing that tests still need to be written for all new VUIDs, it's not clear to me how much savings this approach adds considering the amount of upfront investment.
  • I believe LunarG provided some example VUIDs that could prove to be challenging for automation. Are there any examples of these?

@@ -1335,6 +1335,8 @@ bool CoreChecks::ValidateImageSubresourceRange(const uint32_t image_mip_count, c
const VkImageSubresourceRange &subresourceRange, const char *cmd_name,
const char *param_name, const char *image_layer_count_var_name, const VkImage image,
const SubresourceRangeErrorCodes &errorCodes) const {
return false;
#if 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this disabled to avoid duplicate checks, or other reasons? Same question for other parts of existing validation that is commented out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The autogen code validates this, so there's not need for this function (well, at least the VUs I was interested in testing against). The #if 0 here is just for demo purposes.

There's also a bit I commented out in param validation because I have an unofficial flag added to add syntax highlighting to autogenerated messages

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These additions look like they could be beneficial independently of this proposal, but I'm not sure I see how this is a pre-requisite for generating VUIDs from python.

Would you be able to provide an example message for us to see (text or screenshot)? Improving error message output has been on "todo list" for a long time, but is very much a separate effort from this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example I demoed to the WG:

image

(the stuff in {{..}} are expression values that are auto-extracted. The list of "interesting objects" is also auto-extracted)

I've been thinking these past few days by the way that if the WG proceeds with this proposal but LunarG is not interested in taking autogen for it, a good chunk of this work can still be taken by basically having the codegen generate helper functions per VU to output it formatted (like in this CL) and those could be called from manually written code.

In fact, that's probably a good thing to do even with autogen because some blocklisted / not-allowlisted VUs would still be written by hand.

return '', 'const auto', Type('const auto')

# TODO: complete this list
statefulObjectTypeMap = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good initial step, but is this a long term solution, or just a stopgap until state tracking is also generated from the spec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That really depends on LunarG. I think I've exhausted all my credit for doing non-Google work already :D so I don't believe I'd be tackling autogenerated state tracking; not in the near term at least.

So for now, this'll have to plug into whatever state tracking there is. Some changes would still be necessary though to make autogen easier (like I noticed there's no state for the pipeline cache?)

@@ -804,6 +807,10 @@ class ValidationObject {
local_object_dispatch.emplace_back(new CoreChecks);
}

if (!local_disables[explicit_validation]) {
local_object_dispatch.emplace_back(new ExplicitValidation);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Putting this in a separate VO is probably the best approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤝

@@ -320,7 +320,7 @@ class DescriptorSetLayout : public BASE_NODE {
* descriptor type, but all descriptors in a set can be accessed via the common Descriptor*.
*/

// Slightly broader than type, each c++ "class" will has a corresponding "DescriptorClass"
// Slightly broader than type, each c++ "class" will have a corresponding "DescriptorClass"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment: probably best to leave unnecessary changes out of this PR, as there is a lot of code and it makes it a bit difficult to know what is a request for comment and what is not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, I'll separate things out whenever this is seriously considered for merge.

self.newline()

# Load validusage.json and extract VUs out of it.
vu_json_filename = os.path.join(genOpts.valid_usage_path, 'validusage.json')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have missed it, but can you please provide a link (e.g., updating scripts/known_good.json to point to a Vulkan-Headers fork/branch) to the file used for code generated here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have that readily available, but you can take this PR in Vulkan-Docs: KhronosGroup/Vulkan-Docs#2064

Then in the docker image used to build the spec run make validusage. This gives the validusage.json file I copy into VVL's external/Vulkan-Headers/registry/ directory. When building VVL, I add the path to Vulkan-Docs (not Vulkan-Headers) to PYTHONPATH so that the VU-related scripts from Vulkan-Docs can be picked up.

Obviously after the work lands in Vulkan-Docs, those extra scripts need to be copied to Vulkan-Headers on updates.

Comment on lines +22 to +23
Once the tooling around autogenerated codified VUs is stabilized, this should turn into a blocklist
as a safeguard to remove problematic VUs, if any."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 29, 2023

  • Would it be possible to either split this commit up and/or provide an outline of specific parts here you would like comments on?

Maybe when the time comes when LunarG is convinced to accept the code, but first the WG needs to be convinced. I put this up per Dan's request so there's visibility into the code I have produced, but not with the intention to merge it for the time being.

  • It would be helpful to have some links/comments illustrating what the spec looks like that generates the code seen here. Along these lines, seeing how one goes from editing the spec to running VVL code would be very helpful and probably useful for spec authors to see (since spec authors would effectively be involved in VVL development in this proposal).

It's generated from KhronosGroup/Vulkan-Docs@af37bc9

I might have had some minor local modifications when generating things.

  • This proposes a large shift from C++ "development" to python "code generation." I don't think anyone here is very familiar with the python ast module and associated APIs, and we would need a non-trivial amount of time (and more importantly "global buy-in," which does not exist yet to my knowledge) to review and understand the python code that is proposed here.

I also had no familiarity with ast when I started. It turned out to be quite trivial however.

But yes, I understand your points. If this is to be taken, I'd be more than happy to go over this and explain things as many times as necessary.

  • Automating VUID checks would be nice, but it generally costs significantly more time and effort to create "good tests" for these VUIDs than implementing their validation check. Knowing that tests still need to be written for all new VUIDs, it's not clear to me how much savings this approach adds considering the amount of upfront investment.

That was brought up in Karen's doc too (and Hans-Kristian pointed out the same). One process change that might be interesting to see IMO (even regardless of codified VUs) is a separation of "negative CTS" (i.e. VVL tests) from the implementation. There are a couple of reasons for this:

  • If the same entity (human, script, etc) implements validation and writes tests for it, they may make the same mistake or miss the same thing in both places. It's more robust if entity A writing tests and entity B writing the code being tested are different
  • Currently, VVL tests are developed at the same time as VVL code. Anyone implementing VVL code needs to write tests too, and test writers need to be familiar with VVL. If the two are separated, then you could have people more familiar with VVL do the VVL work (or a script generating VVL), and people familiar with testing (like Mobica) do the tests.

In other words, developing the negative CTS could be a part of extension development just as developing the CTS is.

One more point. Karen pointed out that nearly half the validation engineers' time is spent on dealing with github issues. Does your statement that writing validation code doesn't take much time take into account the fact that they are followed by bug triage and fixes?

  • I believe LunarG provided some example VUIDs that could prove to be challenging for automation. Are there any examples of these?

Not yet. Those were very helpful, and greatly appreciated. They pointed out some of the harder things the VU language needs to support, and no the VU language is not yet developed enough to support them. I already responded in Karen's doc, but TL;DR is that I see "classes" of VUs which can be tackled one at a time. I started with validating things against createInfo as somewhat the easier class. I have ideas for the class of pipeline state checks, as well as query-based checks. The rest I'd rather defer until these are done.

I don't expect to solve any of those other classes before the WG has even signed off on this work.

@juan-lunarg
Copy link
Contributor

juan-lunarg commented Mar 29, 2023

Karen pointed out that nearly half the validation engineers' time is spent on dealing with github issues. Does your statement that writing validation code doesn't take much time take into account the fact that they are followed by bug triage and fixes?

Writing the validation code is generally not difficult. The difficult part is the testing/maintenance.

Out of 203 issues only 35 are labeled as bugs. Even further we still need to investigate if the issues are legit, or have already been fixed. EX: It's not unusual for users to report 'bugs' which are generally bugs in their code. I spent a decent chunk of time on a users app to only find their application was doing invalid pointer arithmetic.

I have not had to go back and fix my validation aside from the additional VUID I requested being added which I found and pointed out to the extension author.

Here is how most of my time is spent as a validation engineer (In no particular order):

  • Build system issues (2 Android build systems, BUILD.gn, and CMake)
    • The Android/BUILD.gn systems being particularly problematic
  • Dealing with multiple compilers, operating systems, linux distros, MinGW, niche edge cases, etc.
  • CI (GitHub actions)
  • GPU CI (LunarG has it's own solution just like AMD has it's own solution)
  • Package manager issues (Both system/language package managers)
  • Repo integration
  • SDK integration
  • Improving existing testing infrastructure
  • Code reviews (This is a big chunk of time I'm actively trying to reduce with developers tools like clang-tidy)
  • The VKlayer_utils issue <- This is a whole can of worms.
  • Trying to repro users issues which may or may not be valid.
  • Creating tests to repro user issues
  • Improving external repos (VMA, vkBasalt, headers, profiles, etc)
  • Improving documentation
  • Dealing with driver bugs (I wish drivers were required to pass CTS with ASAN/TSAN)
  • Extensions being rushed out the door by hardware manufacturers only caring about features/performance and not debug/test friendliness
  • The Vulkan documentation being downright unhelpful/ambiguous
  • Adding functionality to the mock driver / vulkan profiles to test new extensions
  • Working in conjunction with CTS
  • Reporting bugs to relevant repos/companies
  • Adding code sanitization (address/thread sanitization)
  • Adding positive/negative testing
  • Talking to the author of the extension for clarification due to ambiguity in the extension documentation

I do see your point with the codified VUIDs but it's not going to be saving me much time. If anything it looks like it's going to be eating more of my time.

@ShabbyX
Copy link
Contributor Author

ShabbyX commented Mar 30, 2023

  • Talking to the author of the extension for clarification due to ambiguity in the extension documentation

I do see your point with the codified VUIDs but it's not going to be saving me much time. If anything it looks like it's going to be eating more of my time.

Thank you, this is very helpful insight. It sounds like the positive impact codified VUs would have on you is removing the VU ambiguity from the spec then. I guess the negative is you'd be dragged into dealing with autogen?

@juan-lunarg
Copy link
Contributor

It sounds like the positive impact codified VUs would have on you is removing the VU ambiguity from the spec then.

MikeB was very responsive to questions/feedback so it was a very pleasant experience writing the validation for their extension. I didn't really have a problem with the process to be honest. In fact communicating with them made me understand the extension/intent much better since they also sent me plenty of material to work with.

@juan-lunarg
Copy link
Contributor

Closing old PR with multiple merge conflicts.

@juan-lunarg
Copy link
Contributor

Please re-open this MR on the GitLab. We try and keep old PRs to a minimum on GitHub.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants