Species Type: Clarifications on Molecule Syntax #261

Open
s9105947 opened this issue Jan 4, 2022 · 2 comments
Labels
EXT: SpeciesType physical particle species extension

s9105947 commented Jan 4, 2022

The current version of the SpeciesType extension specifies the following for molecule syntax:
"Use standard chemical notation, e.g.: H2O."

What is "standard chemical notation" specifically?

Notable cases:

  • Molecules with charge, e.g. OH-
  • Complexes, e.g. [NiCN]+
  • Organic compounds, e.g. C2H5OH, CH3COOH, maybe even CH3-COOH, CH3-CH2-CH3
  • To be pedantic: there are more complex formulae that are "standard" too but can't easily be translated to text, e.g. structural formulae
  • Are parentheses allowed? e.g. SO2(OH)2 (admittedly non-standard)
    • Note that parentheses need to be balanced, which increases parsing complexity: arbitrarily nested parentheses can't be matched by a plain regular expression.
    • Semantics: Are (H2)O and H2O considered equal?
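
To illustrate the parenthesis point, here is a minimal sketch of what a parser would need once grouping is allowed: a stack with one counter per open parenthesis. This is only an illustration, not part of the extension. The function name `count_atoms`, the token pattern, and the assumption that an element symbol is one uppercase letter plus an optional lowercase letter are all hypothetical; charges, square brackets and isotopes are not handled.

```python
import re
from collections import Counter

# Hypothetical tokenizer: element symbol, count, or a round parenthesis.
TOKEN = re.compile(r"([A-Z][a-z]?)|(\d+)|([()])")

def count_atoms(formula):
    """Return atom counts for a formula that may contain nested parentheses."""
    stack = [Counter()]  # one Counter per open parenthesis level
    last = None          # the atom or group a following count applies to
    pos = 0
    while pos < len(formula):
        m = TOKEN.match(formula, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {formula[pos]!r}")
        pos = m.end()
        element, number, paren = m.groups()
        if element:
            last = Counter({element: 1})
            stack[-1].update(last)
        elif number:
            if last is None:
                raise ValueError("count without a preceding atom or group")
            # The atom/group was already added once; add the remaining copies.
            for symbol, n in last.items():
                stack[-1][symbol] += n * (int(number) - 1)
            last = None
        elif paren == "(":
            stack.append(Counter())
            last = None
        else:  # ")"
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            last = stack.pop()
            stack[-1].update(last)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]
```

Under these assumptions, `count_atoms("(H2)O")` and `count_atoms("H2O")` return the same counts, so the two would be equal if only the sum formula matters, and `count_atoms("SO2(OH)2")` yields S: 1, O: 4, H: 2.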

This is a bottomless pit, so I'd like to suggest the following as a minimal version:

  • ("An implementation must support at least the following notation. Additionally, more specific notation may be supported:")
  • A molecule is given as a sum formula, i.e. semantically only the number of atoms per element is extracted.
  • Parentheses, brackets, and braces of any form are forbidden.
  • If an element occurs more than once, its quantities are added. ("An implementation may derive additional information from this, e.g. inorganic groups like OOH.")
  • Only the total charge of a molecule can be specified, by appending "^" *DIGIT ("+" / "-"), e.g. HCO^-, SO4^2-
    • If the digits are omitted, the magnitude is assumed to be one, i.e. HCO^1- and HCO^- are equivalent.
  • Note: A total quantity of a molecule cannot be given, i.e. 2H^+ is invalid.

Note: Depending on whether an isotope also counts as an atom, maybe use "atom or isotope" where applicable.
Note: Potentially replace the charge separator character.
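
The minimal version above is small enough to be handled by two regular expressions. The following is a rough sketch under stated assumptions, not a reference implementation: the name `parse_molecule`, the patterns `FORMULA` and `ATOM`, and the rule that an element symbol is one uppercase letter plus an optional lowercase letter are hypothetical, and isotopes are not covered.

```python
import re
from collections import Counter

# Assumed grammar (illustrative only, not part of the standard):
#   formula = 1*(element [count]) ["^" *DIGIT ("+" / "-")]
FORMULA = re.compile(r"((?:[A-Z][a-z]?\d*)+)(?:\^(\d*)([+-]))?")
ATOM = re.compile(r"([A-Z][a-z]?)(\d*)")

def parse_molecule(text):
    """Return (atom counts, total charge) for a sum formula."""
    m = FORMULA.fullmatch(text)
    if m is None:
        # Rejects parentheses, brackets, braces and leading quantities.
        raise ValueError(f"not a valid sum formula: {text!r}")
    atoms_part, digits, sign = m.groups()
    atoms = Counter()
    for symbol, count in ATOM.findall(atoms_part):
        # Repeated elements are summed; a missing count means one.
        atoms[symbol] += int(count) if count else 1
    charge = 0
    if sign:
        magnitude = int(digits) if digits else 1  # "^-" is treated as "^1-"
        charge = magnitude if sign == "+" else -magnitude
    return atoms, charge
```

With this sketch, `parse_molecule("SO4^2-")` gives S: 1, O: 4 with charge -2, `parse_molecule("HCO^-")` equals `parse_molecule("HCO^1-")`, `parse_molecule("CH3COOH")` sums to C: 2, H: 4, O: 2, and both `"2H^+"` and `"(H2)O"` raise `ValueError`.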

ax3l added the EXT: SpeciesType physical particle species extension label Jan 6, 2022

ax3l (Member) commented Jan 6, 2022

The answer for this is similar to #260 (comment):

We don't have a use case with a committed chemistry code yet and leave this open to the first one(s) that would like to adopt openPMD. I like your suggestion, but would not standardize it until we have a concrete use case/adopter.

Note for the future reader: @s9105947 works on PIConGPU, which is an electromagnetic particle-in-cell (PIC) code in laser-plasma physics (keV to MeV range physics, while molecules are eV physics and not covered by electromagnetic PIC).

s9105947 (Author) commented Jan 7, 2022

In this case I would suggest removing the paragraph about molecules altogether and stating instead:

"The base standard does not encode molecules. However, an implementation MAY support molecules with an implementation-defined syntax."

In my opinion the current "use standard chemical notation" is not sufficiently clear.
