-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add checkstyle rule for uppercase constant fields #10673
Conversation
@attilakreiner thanks for doing this and generally this looks good. Could you please extract the Spark + Flink changes into a separate PR (where checkstyle doesn't enforce this naming yet)? I'll review the Spark + Flink changes first (as those typically get quite large) and once those changes are in, we can continue with this PR (containing the changes for all other modules + enforcing the naming). |
@nastra, got it. Done.
|
@@ -284,6 +284,10 @@ | |||
<property name="format" value="^[a-z][a-zA-Z0-9]+$"/> | |||
<message key="name.invalidPattern" value="Member name ''{0}'' must match pattern ''{1}''."/> | |||
</module> | |||
<module name="ConstantName"> | |||
<property name="format" value="^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use possessive quantifiers. That way the matching might be simpler.
I didn't benchmark it, but checksyle can take considerable time overall, and small things like this add up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the regexp from
^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$
to
^[A-Z][A-Z0-9]+(_[A-Z0-9]+)*$
Pls mind the *
vs +
char at pos 15. Pls clarify whether or not this is what you were pointing out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: I reverted this change as it breaks for Z_COLUMN
, Z_SCHEMA
and Z_SORT_ORDER
.
Pls point out if you suggest a different regexp here. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to leverage possessive quantifiers replace
^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$
with
^[A-Z][A-Z0-9]*+(_[A-Z0-9]++)*+$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I see. I experimented a bit with the possessive quantifiers in the regexes as well as removing some custom messages that seem redundant.
I created a draft PR with these changes: #10681. Even though it seems like just a few lines of changes, it has an impact for the whole codebase and I don't want to hold up this one. (I'd prefer to keep the scope of this PR tight, really just renaming the constants to uppercase and enforcing that rule.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this PR we should just stick with the default regex pattern of checkstyle, which I believe is what you had initially. Improving the regex can be done in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nastra, yep, same thought here, hence the spin-off draft PR.
In this PR the definition I used is this (this is the current state right now):
<property name="format" value="^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$"/>
the default you linked is this:
<property name="format" value="^log(ger)?$|^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$"/>
So the only difference is I removed the first part that allows for the lowercase log(ger)
as this codebase doesn't use it, so it was redundant.
Also we have the custom message in place to be consistent with the rest of the definitions for now.
Pls LMK if this is all good as it is right now or we need some changes here.
@@ -284,6 +284,10 @@ | |||
<property name="format" value="^[a-z][a-zA-Z0-9]+$"/> | |||
<message key="name.invalidPattern" value="Member name ''{0}'' must match pattern ''{1}''."/> | |||
</module> | |||
<module name="ConstantName"> | |||
<property name="format" value="^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$"/> | |||
<message key="name.invalidPattern" value="Constant name ''{0}'' must match pattern ''{1}''."/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this property needed? What error message does checkstyle produce when we omit it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message without this property is:
Name 'factory' must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'. [ConstantName]
with this property it's:
Constant name 'factory' must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'. [ConstantName]
My preference is to keep it as both the definition and the output is more consistent with the existing rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both messages are clear and actionable. My preference would be not to add configuration that is not needed.
I noticed we do set this for other checks, and I believe we can remove those other configurations too, in places where they do not add much value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Pls see my previous comment about #10681.
Merged in the |
@@ -34,11 +34,11 @@ public enum Support { | |||
this.lowerCaseName = name().toLowerCase(Locale.ROOT); | |||
} | |||
|
|||
public static final Map<String, Support> nameToSupportMap = Maps.newHashMap(); | |||
public static final Map<String, Support> NAME_TO_SUPPORT_MAP = Maps.newHashMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes a publicly accessible field - a breaking ABI change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out. I reverted this change. I think it's fair to think about this one as an exceptional case, so I added @SuppressWarnings("checkstyle:ConstantName")
to overcome the issue. Pls LMK if this fix is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double checked the PR for similar cases and I did not find any more public static
fields that were affected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just adding some additional context that Iceberg only provides API/ABI stability for certain modules (iceberg-mr
module isn't included) which is enforced by revapi
, but great if we can avoid breaking the API/ABI for modules where we don't enforce it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WFM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @attilakreiner
No description provided.