Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Caliper integration and instrumentation #130

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

slabasan
Copy link
Collaborator

@slabasan slabasan commented Feb 15, 2024

Supersedes #118

TODO:

  • fix conflicts -- looks like commits related to building PFA in a container are included in this
  • call to find_package(caliper) should follow current pattern with other dependencies (this was initially done before we refactored the find_package calls after running into issues on turing)
  • add config variable to config/*.cmake files to indicate to external software that perfflowaspect was built (or not built) with caliper --> bigger lift as this will require testing integration of PFA+caliper with other software tools, maybe this becomes a second PR, but realistically should be a single PR since external tools will link against these cmake files
  • test external software linking against PFA built with caliper to confirm correctness of config files
  • debug unit test builds in github actions -- boilerplate and release builds are failing, but they are building without caliper support, so builds shouldn't be impacted

Copying over notes from #118:

Couple of things before we can merge:

  • LLVM function names that get passed to Caliper (eg _Z3fooRKs instead of just foo) look confusing and don't match the original function name. Let's see if we can do better.

  • Caliper should not be "required" library. Many users may not need low-level perf counter details on their workflow and may just want the timeline traces. So, we need to add in an option/flag that determines if PFA should be built with Caliper for users who want something fine grained. With multiple workflow components eg in MuMMI or AHA Moles, we don't need this detail all the time and simple timelines are sufficient.

  • We need to update the CI to match these build changes (we need to build caliper in the CI tests)

  • We need to add detailed docs:

    • Explain the need and use case for integration with Caliper,
    • Add information on how to build caliper first (point to repo etc),
    • Add an example on how to use this with an application (eg CALI_CONFIG=runtime-report ./smoketest or CALI_CONFIG=runtime-report,output=test.cali ./smoketest or something with cali-query) and show the corresponding generated output, and
    • Explain how we can conduct end-to-end analysis of a workflow with a combination of PFA files and .cali files. For example, showcasing that we can get a quick big picture with the 800 configs of AMS with PFA and then dive deeper into 2-3 of those in detail with Caliper perf counters is a good use case to explain. We need to be clear in explaining that we don't want to end up with a data overload and that with a workflow with many configurations and components, we don't always want a low-level analysis for all configs. That is too much data (multiple metrics, time, cpu/mem usage, several perf counters) that we can't reasonably analyze at a function level or visualize.
    • Explain that using Caliper is only a C/C++ feature and not applicable to python workflows or hybrid workflows.

@slabasan slabasan changed the title Caliper integration WIP Caliper integration Feb 15, 2024
@slabasan slabasan changed the title WIP Caliper integration WIP Caliper integration and instrumentation May 10, 2024
@slabasan slabasan force-pushed the caliper-integration branch 4 times, most recently from 5504b98 to efc3e39 Compare May 10, 2024 17:45
@tpatki
Copy link
Member

tpatki commented Sep 28, 2024

@slabasan
One more thing in the to-do list (or may be we open an issue on this and revisit it later): resolve errors on newer clang versions, I will take a look at this once I am done resolving the issues with #167.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants