WeeklyTelcon_20240611

Tommy Janjusic edited this page Jun 11, 2024 · 5 revisions

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Webex)

  • Tommy Janjusic (NVIDIA)
  • Jeff Squyres (Cisco)
  • Luke Robison (Amazon)
  • Edgar Gabriel (AMD)
  • Todd Kordenbrock
  • Manu Shantharam (AMD)
  • Howard Pritchard (LANL)
  • Wenduo Wang (Amazon)
  • George Bosilca (NVIDIA)
  • Joseph Schuchart (SBU)
  • Aurelien Bouteiller (UTK)

v4.1.x

v5.0.x

  • New issues since v5.0.3

  • Open PRs since v5.0.3

  • Closed/Merged PRs since v5.0.3

  • v5.0.x Open PRs

  • v5.0.x Issues; v5.0.x Questions

  • PRTE issue if PMIx upgrades from v4.x to v5.x without also rebuilding PRTE

    • PMIx changed some internal data structures between v4.x and v5.x (these structures are not part of the PMIx public API).
    • The PMIx public API did not change between v4.x and v5.x, so the libpmix .so number did not change to reflect an incompatibility.
    • PRTE uses the internal PMIx data structures, however, and is affected by the changes between PMIx v4.x and v5.x.
    • Meaning: if you build+install PRTE against PMIx v4.x and then upgrade to PMIx v5.x without also re-building PRTE, PRTE will almost certainly segv.
    • Since PRTE is the back-end of mpirun, this affects Open MPI as well.
    • Upstream PRTE is likely to put in a run-time check to print an error and abort (before segv'ing) when detecting this case.
    • See https://github.com/openpmix/prrte/pull/1982; we should pull this PR into our fork.
    • Runtime test suite from IBM: check with Josh H.
      • Check the requirements; there was an IBM CI running on PRRTE (Sept. 2023).
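
The kind of run-time check described above can be sketched as follows. This is only an illustration and not the code from the upstream PR: the function names `major_version` and `check_pmix_compat` are hypothetical, and it assumes the tool can obtain both the PMIx version it was built against and the version of the library actually loaded, as dotted version strings. It also assumes, per the notes above, that a difference in major version is what signals the incompatibility.

```python
import re


def major_version(version_string):
    """Return the major number of the first dotted version found, e.g. 5 for '5.0.2'."""
    match = re.search(r"(\d+)\.\d+", version_string)
    if match is None:
        raise ValueError(f"no version number found in {version_string!r}")
    return int(match.group(1))


def check_pmix_compat(built_against, loaded_at_runtime):
    """Compare major versions; return None if they match, else an error message.

    Hypothetical helper: both arguments are plain version strings that the
    caller is assumed to have obtained already.
    """
    built = major_version(built_against)
    loaded = major_version(loaded_at_runtime)
    if built != loaded:
        return (
            f"PMIx major version mismatch: built against v{built}.x "
            f"but running with v{loaded}.x; rebuild against the installed PMIx."
        )
    return None
```

A caller would run `check_pmix_compat(...)` once at startup, print the returned message, and abort if the result is not `None`, so that the user sees a clear error instead of a segv.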

Main branch

Additional Agenda Items

TODO

  • Need to discuss MPI 4.1
  • Need to discuss MPI 4.0
  • Discuss and set the v5.0.4 timeline
  • Discuss v6.0 feature list and timeline
  • Discuss 6.0: BigCount
  • Explore cherry-pick bots
  • Explore an MTT alternative

Notes
