Ideas to improve "yr fmt" #192

Open
4 of 10 tasks
wxsBSD opened this issue Sep 9, 2024 · 6 comments

Comments

@wxsBSD
Contributor

wxsBSD commented Sep 9, 2024

I haven't dug into it in detail yet, but as discussed elsewhere, we should add options that let people control the formatting (and maybe even other things, like variable names) in "yr fmt". I'm happy to do the work here, but I want to open this issue to get input on it.

For example, I'm thinking of things like:

  • Use tabs vs spaces
  • Controllable number of spaces for indentation
  • Curly brace on new line for rule declaration (eg: rule foo {\n vs rule foo\n{)
  • Newline after section headers
  • How loops should be formatted (eg: for any i in (0..1): (\n vs for any i in (0..1):\n()
  • Optional indentation of rule body
  • Optional alignment of metadata and patterns
  • Alphabetize metadata
  • Alphabetize strings
  • Remove trailing whitespace

I'm sure there are plenty more to be found here, especially as I dig into the code, but I'm hoping to collect your thoughts @plusvic (and others) on ways to improve "yr fmt" so I can go off and start implementing.

EDIT: I'm putting the things mentioned below into the list above so I can edit it to mark the ones I have done.
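
As a rough sketch of how the options in the list above might be grouped into one settings type: every name below is hypothetical (nothing here is taken from the yara-x-fmt crate), and the defaults are arbitrary placeholders.

```rust
// Hypothetical option types for illustration only; not the yara-x-fmt API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Indent {
    Tabs,
    Spaces(usize),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BraceStyle {
    SameLine, // rule foo {
    NewLine,  // rule foo\n{
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FormatOptions {
    pub indent: Indent,
    pub brace_style: BraceStyle,
    pub newline_after_section_header: bool,
    pub indent_rule_body: bool,
    pub align_metadata: bool,
    pub align_patterns: bool,
    pub sort_metadata: bool,
    pub sort_patterns: bool,
    pub trim_trailing_whitespace: bool,
}

impl Default for FormatOptions {
    // Arbitrary defaults chosen for the sketch.
    fn default() -> Self {
        FormatOptions {
            indent: Indent::Spaces(2),
            brace_style: BraceStyle::SameLine,
            newline_after_section_header: true,
            indent_rule_body: true,
            align_metadata: false,
            align_patterns: false,
            sort_metadata: false,
            sort_patterns: false,
            trim_trailing_whitespace: true,
        }
    }
}

impl FormatOptions {
    /// One level of indentation as a string.
    pub fn indent_unit(&self) -> String {
        match self.indent {
            Indent::Tabs => "\t".to_string(),
            Indent::Spaces(n) => " ".repeat(n),
        }
    }

    /// Opening of a rule declaration, honoring the brace style.
    pub fn rule_header(&self, name: &str) -> String {
        match self.brace_style {
            BraceStyle::SameLine => format!("rule {} {{", name),
            BraceStyle::NewLine => format!("rule {}\n{{", name),
        }
    }
}
```

Loop formatting and any later additions would presumably become further fields on the same struct.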

@plusvic
Member

plusvic commented Sep 9, 2024

I like those, and I'd add a few more:

  • Optional indentation of rule body:
rule foo {
strings:
  $a = ...
condition:
  $a
}

vs

rule foo {
  strings:
    $a = ...
  condition:
    $a
}
  • Optional alignment of metadata and patterns.
rule foo {
  meta:
    short = "..."
    very_long = "..."
  strings:
    $short = "..."
    $very_long = "..."
  condition:
    all of them
}

vs

rule foo {
  meta:
    short     = "..."
    very_long = "..."
  strings:
    $short     = "..."
    $very_long = "..."
  condition:
    all of them
}
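
The alignment variant above can be sketched as padding every identifier to the width of the longest one in its section. This is only an illustration under that assumption; `align_assignments` is a hypothetical helper, not something that exists in the formatter.

```rust
/// Pads identifiers so that every `=` in a section lands in the same column.
/// Hypothetical helper for illustration; not part of yara-x-fmt.
pub fn align_assignments(entries: &[(&str, &str)]) -> Vec<String> {
    // Width of the longest identifier, counted in characters.
    let width = entries
        .iter()
        .map(|(name, _)| name.chars().count())
        .max()
        .unwrap_or(0);

    entries
        .iter()
        .map(|(name, value)| format!("{:<width$} = {}", name, value, width = width))
        .collect()
}
```

Calling it once per section (metadata, then patterns) keeps the two blocks aligned independently, as in the example above.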

My plan was to have a TOML configuration file, not only for the code formatter but for the CLI as a whole, with code formatting as one section. This crate could be useful for finding the user's home directory: https://crates.io/crates/home
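
A minimal sketch of locating such a file, assuming it lives directly in the user's home directory. The file name and the `config_path` helper are assumptions made for this example. The home directory is passed in so the snippet depends only on the standard library.

```rust
use std::path::PathBuf;

/// Assumed config file name; purely illustrative.
const CONFIG_FILE_NAME: &str = ".yara-x.toml";

/// Builds the config file path from an optional home directory.
/// Returns `None` when the home directory could not be determined.
pub fn config_path(home_dir: Option<PathBuf>) -> Option<PathBuf> {
    home_dir.map(|dir| dir.join(CONFIG_FILE_NAME))
}
```

With the `home` crate as a dependency, the call would look like `config_path(home::home_dir())`, since `home::home_dir()` returns an `Option<PathBuf>`.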

@wxsBSD
Contributor Author

wxsBSD commented Sep 9, 2024

I haven't looked into it yet, but if possible I think all of this should be done in a library so it can be easily used by other tools. This way we get consistent formatting behavior across a variety of tools if they all use the library. I don't know what the equivalent of "libyara" is in YARA-X but it would make sense to expose it there, I think.

@plusvic
Member

plusvic commented Sep 9, 2024

> I haven't looked into it yet, but if possible I think all of this should be done in a library so it can be easily used by other tools. This way we get consistent formatting behavior across a variety of tools if they all use the library. I don't know what the equivalent of "libyara" is in YARA-X but it would make sense to expose it there, I think.

Yes, of course. The config file will be used only by the CLI, for setting the right configuration for the yara-x-fmt crate. The Formatter object in that crate could have methods for adjusting each setting.
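
To illustrate the "methods for adjusting each setting" idea, here is a small builder-style sketch. `FormatterSketch` and its methods are hypothetical stand-ins and should not be read as the actual API of the yara-x-fmt crate. It only models the body-indentation option from the earlier example.

```rust
/// Hypothetical stand-in for a formatter object; not the real yara-x-fmt API.
#[derive(Debug, Clone)]
pub struct FormatterSketch {
    indent_spaces: usize,
    indent_body: bool,
}

impl FormatterSketch {
    /// Starts from arbitrary defaults chosen for this sketch.
    pub fn new() -> Self {
        FormatterSketch { indent_spaces: 2, indent_body: true }
    }

    /// Sets the number of spaces per indentation level.
    pub fn indent_spaces(mut self, n: usize) -> Self {
        self.indent_spaces = n;
        self
    }

    /// Chooses whether section headers are indented inside the rule body.
    pub fn indent_body(mut self, yes: bool) -> Self {
        self.indent_body = yes;
        self
    }

    /// Renders a section header such as `strings:` at the configured depth.
    pub fn section_header(&self, name: &str) -> String {
        let depth = if self.indent_body { 1 } else { 0 };
        format!("{}{}:", " ".repeat(self.indent_spaces * depth), name)
    }

    /// Renders a line inside a section, one level deeper than its header.
    pub fn pattern_line(&self, text: &str) -> String {
        let depth = if self.indent_body { 2 } else { 1 };
        format!("{}{}", " ".repeat(self.indent_spaces * depth), text)
    }
}
```

A CLI could read its config file and translate each key into one of these setter calls, while other tools call the setters directly.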

@import-pandas-as-numpy

Control over 'segmentation' of hex identifiers.

It seems reasonable to me that someone might want control over newline behavior in hex identifiers, and a default of 16 seems like a sensible value to settle on for this behavior (if not already implemented).
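
A naive sketch of that segmentation, assuming "16" means 16 whitespace-separated tokens per line. `wrap_hex_tokens` is a hypothetical helper. It treats jumps and alternatives as ordinary tokens, which a real implementation would likely need to handle more carefully.

```rust
/// Splits the whitespace-separated tokens of a hex pattern body into lines
/// holding at most `per_line` tokens each.
/// Hypothetical helper for illustration; not part of yara-x-fmt.
pub fn wrap_hex_tokens(body: &str, per_line: usize) -> Vec<String> {
    let tokens: Vec<&str> = body.split_whitespace().collect();
    tokens
        .chunks(per_line.max(1)) // `chunks` panics on 0, so clamp to 1
        .map(|chunk| chunk.join(" "))
        .collect()
}
```

The caller would still be responsible for indenting each returned line and adding the surrounding braces.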

@plusvic
Member

plusvic commented Sep 10, 2024

Introducing line breaks at appropriate places, especially in rule conditions, is a pending improvement too. That's a bit more complicated; it's not low-hanging fruit like the rest. At this moment the formatter lets you control how the lines are broken.

@tlansec

tlansec commented Sep 16, 2024

I just spotted this and thought I'd add a few things we implement externally to YARA that might be worth considering in a tool like this:

  • Enforce certain metadata fields (e.g. they must exist and be of type x)
  • Line breaks between strings using different prefixes, e.g.
strings:
   $s1 = "abc"
   $s2 = "def"
   
   $t1 = "ghi"
   $t2 = "jkl"
...
  • Alphabetically sort metadata.
  • Remove trailing spaces after defined strings. $s = "abc" <-last space should get stripped here.

I'm sure there's more, but I can't think of any right now.
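
Two of the items above (blank lines between prefix groups and stripping trailing spaces) could be sketched roughly as follows. It assumes a prefix is the pattern identifier with its trailing digits removed, so `$s1` and `$s2` share the prefix `$s`. Both function names are hypothetical.

```rust
/// Returns the identifier prefix of a pattern line, e.g. `$s1 = "abc"` gives `$s`.
/// Assumes the prefix is the identifier minus its trailing digits.
fn pattern_prefix(line: &str) -> &str {
    let ident = line
        .trim_start()
        .split(|c: char| c.is_whitespace() || c == '=')
        .next()
        .unwrap_or("");
    ident.trim_end_matches(|c: char| c.is_ascii_digit())
}

/// Strips trailing whitespace from each pattern line and inserts one blank
/// line wherever the identifier prefix changes.
pub fn group_patterns(lines: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    let mut previous: Option<&str> = None;
    for line in lines {
        let trimmed = line.trim_end();
        if trimmed.is_empty() {
            continue; // existing blank lines are dropped and regenerated
        }
        let prefix = pattern_prefix(trimmed);
        if let Some(prev) = previous {
            if prev != prefix {
                out.push(String::new());
            }
        }
        out.push(trimmed.to_string());
        previous = Some(prefix);
    }
    out
}
```

Metadata enforcement and sorting would need access to the parsed rule rather than raw lines, so they are left out of this sketch.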
