Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Logging in libcudf #7861

Closed
devavret opened this issue Apr 5, 2021 · 8 comments
Closed

[FEA] Logging in libcudf #7861

devavret opened this issue Apr 5, 2021 · 8 comments
Assignees
Labels
1 - On Deck To be worked on next cuIO cuIO issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@devavret
Copy link
Contributor

devavret commented Apr 5, 2021

Certain libcudf functions need to send warnings to users.
Examples:

  1. [BUG] Fixed-point types with precision < 10 (for Spark) cannot be successfully read back from parquet #7152 : We deviate from typical parquet writer behaviour and as a result the written file is unreadable by certain other libraries. We'd prefer to warn the user about it but not throw an exception. [BUG] Fixed-point types with precision < 10 (for Spark) cannot be successfully read back from parquet #7152 (comment)
  2. We want to warn the user when despite specifying, GDS is not available to be used Use cuFile for Parquet IO when available #7444 (comment)

These warnings can be part of a log and in general libcudf can have logging similar to RMM.

@devavret devavret added feature request New feature or request Needs Triage Need team to review and classify labels Apr 5, 2021
@jlowe
Copy link
Member

jlowe commented Apr 5, 2021

Similar to rapidsai/rmm#564 I'd like to see the ability for the application to provide a logging sink to better control the handling of any log messages. Emitting uncontrollable messages to stdout or creating files in the current directory will be problematic for some use-cases. Providing a log sink adapter or some similar interface allows the libcudf log messages to be funneled to any existing logging mechanism the app may be using.

@kkraus14 kkraus14 added libcudf Affects libcudf (C++/CUDA) code. and removed Needs Triage Need team to review and classify labels Apr 6, 2021
@github-actions
Copy link

github-actions bot commented May 6, 2021

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions
Copy link

github-actions bot commented Feb 7, 2022

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@vuule vuule added the cuIO cuIO issue label Sep 28, 2022
@vuule
Copy link
Contributor

vuule commented Sep 28, 2022

Some new potential uses in cuIO:
warn on fallback to host JSON tree traversal
warn on compression failure in Parque and ORC writers
info about nvCOMP use (vs internal implementations)
info about KvikIO use (vs internal implementations)
info about cuFile use

@vuule
Copy link
Contributor

vuule commented Jan 11, 2023

Finally revisiting this :)

Initial design thoughts:

  • Use spdlog library (header-only).
  • Have a global logger that's initialized on first use.
  • Default logging level is WARNING (or maybe OFF initially).
  • Provide a way to customize logging level and sink. One or more of the following:
    • At build time - CMake flags to set level (maybe also sink, e.g. file vs stdout);
    • During intialization - environment variables, again can control level and basic sink options.
    • At runtime (any point) - libcudf API that allows more control, like providing custom sink(s).

@jlowe is a libcudf API what you had in mind in your request? It would need to be called before any other libcudf functions to avoid potentially logging with default settings.

@GregoryKimball GregoryKimball added 1 - On Deck To be worked on next and removed inactive-30d labels Jan 11, 2023
@vyasr
Copy link
Contributor

vyasr commented Jan 12, 2023

Just a note, spdlog can be used header-only or precompiled. Given that we use it header-only in rmm since rmm itself is header-only, it may not help compile times in libcudf to use its compiled mode here, but if we do go the spdlog route it's worth checking on the compile times before @davidwendt comes after you 😄

@GregoryKimball
Copy link
Contributor

Now that we have merged #12637, I'd like to close this issue and continue the discussion for additional logging items in new issue(s).

@vuule
Copy link
Contributor

vuule commented Apr 3, 2023

@GregoryKimball regarding the additional logging items: I opened a PR for cuIO warnings that are not related to integrations #13043. cuIO ntegrations should be covered as well in 23.06.

rapids-bot bot pushed a commit that referenced this issue Apr 24, 2023
Issue #7861

Added the following log messages:
- Info message about nvCOMP use for a given compression type (once per type).
- Error message when a file cannot be read/written with cuFile, and fallback is disabled (per file).
- Info messages about GDS use vs. bounce buffer (per file).

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Karthikeyan (https://github.com/karthikeyann)

URL: #13132
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 - On Deck To be worked on next cuIO cuIO issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

6 participants