Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfcs: proposal for a verbose logging mechanism #1827

Open
wants to merge 1 commit into
base: rfcs
Choose a base branch
from

Conversation

avmanerikar
Copy link
Contributor

Link to rendered document.

@avmanerikar avmanerikar added the RFC A design document label Mar 12, 2024
Copy link
Contributor

@mgouicem mgouicem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a few things are to be clarified here:

  • spdlog exposes its own environment variables: how would those interact with ONEDNN_VERBOSE setting?
    For example if a user does not set ONEDNN_VERBOSE, but sets SPDLOG_LEVEL=TRACE, would oneDNN print ? Or if a user sets both, do they combine/override each other?
  • if we expose new environment variables, matching APIs will need to be exposed to programatically control that
  • in few of the links you provide, spdlog v1 is used, but it seems there are newer releases. Is the recommendation to go with v1 or a specific version?
  • Which kind of logger would we use in oneDNN? IIUC, spdlog provides multiple loggers with different properties? In particular, flushing at every message can be necessary for debugging purpose, is there control for that?

rfcs/20240311-verbose-logging-support/README.md Outdated Show resolved Hide resolved
rfcs/20240311-verbose-logging-support/README.md Outdated Show resolved Hide resolved

### 2. Runtime Logging Controls
A basic requirement for implementing logging support will be to define the environmental control variables which the user can specify to manage oneDNN data logging.
For the simple case where the logging mechanism involves directly dumping the verbose outputs into a logfile, this can be accomplished with two control variables, one for enabling the data logging (`ENABLE_ONEDNN_LOGGING`) and the other for specifying the logfile path (`ONEDNN_LOGFILE=/path/to/file`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify why is ENABLE_ONEDNN_LOGGING needed?

IIUC, we are switching to spdlog internally for all prints. Based on that, I would expect that only ONEDNN_VERBOSE_FILE is needed, with stdout its default value.

Also, environment variable typically need a matching API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! The specification of a logfile path should be enough to enable logging in oneDNN. Updating RFC accordingly.

| 2 | SPDLOG_LEVEL_INFO | `check`,`profile_create`, `profile_exec`, `profile` |
| 3 | SPDLOG_LEVEL_WARNING | --- |
| 4 | SPDLOG_LEVEL_ERROR | `error` |
| 5 | SPDLOG_LEVEL_CRITICAL | - |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in which direction does the mapping go here?

  • Is it that oneDNN verbose error messages will be logged with spdlog using SPDLOG_LEVEL_ERROR
  • or is it that if a user sets an SPDLOG env variable, we will activate those ONEDNN_VERBOSE flags?

Copy link
Contributor Author

@avmanerikar avmanerikar Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The direction of logging will be from oneDNN to spdlog since no SPDLOG environmental variables are to be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the matching might be a little bit different:

error -> CRITICAL
check -> ERROR
dispatch -> INFO

@avmanerikar avmanerikar force-pushed the amanerik/rfcs/verbose-logging-support branch from 7528ec8 to 223e571 Compare March 20, 2024 18:40
@avmanerikar
Copy link
Contributor Author

I think a few things are to be clarified here:

  • spdlog exposes its own environment variables: how would those interact with ONEDNN_VERBOSE setting?
    For example if a user does not set ONEDNN_VERBOSE, but sets SPDLOG_LEVEL=TRACE, would oneDNN print ? Or if a user sets both, do they combine/override each other?
  • if we expose new environment variables, matching APIs will need to be exposed to programatically control that
  • in few of the links you provide, spdlog v1 is used, but it seems there are newer releases. Is the recommendation to go with v1 or a specific version?
  • Which kind of logger would we use in oneDNN? IIUC, spdlog provides multiple loggers with different properties? In particular, flushing at every message can be necessary for debugging purpose, is there control for that?
  • The expectation is to not use any of the SPDLOG environmental variables to control logging in oneDNN. With a header-based approach, one can build spdlog without the support for logging using argv /environment var. That way, logging is only controlled programmatically and by using the ONEDNN_VERBOSE settings.
  • The latest stable release for spdlog is v1.13 and that is the recommended version to use for logging support. While there is a master branch for spdlog, it is old and deprecated - v1.x is the actual master branch to be used. (See Question: What are the differences in version v.1, master and release gabime/spdlog#1346).
  • The proposal is to use a rotating logger for oneDNN as defined here for spdlog. The rotating logger type is useful for keeping the maximum file size in check as the data is logged from several sources.

@avmanerikar avmanerikar force-pushed the amanerik/rfcs/verbose-logging-support branch from 223e571 to 7d2c271 Compare June 27, 2024 17:17
@avmanerikar
Copy link
Contributor Author

The document has been updated to reflected suggested changes as well as additions to the proposed design.


Logging options in oneDNN will be enabled as an experimental feature by setting the variable EXPERIMENTAL_LOGGER=ON during build-time. This will also take an header-based apporach by building the fmtlib library and spdlog headers into the code to enable the logger mechanism.

### 2. Runtime Logging Controls - DNNL_LOGFILE, DNNL_LOGFILE_SIZE, DNNL_LOGFILE_AMT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Propose to use DNNL_VERBOSE_ as prefix be specific to what part of functionality these macros applied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch - updated accordingly.

- For the simple case where the logging mechanism involves directly dumping the verbose outputs into a logfile, this can be accomplished with one control variable for specifying the logfile path (`DNNL_LOGFILE=/path/to/file`).
Specifying `DNNL_LOGFILE` automatically enables logging of the verbose output to the user-specified file while in the default case, the data is directly printed to `stdout`.
In this scenario, the data recorded in the logfile mirrors the printed verbose mode information, hence, the logged data can be managed using oneDNN [runtime controls](https://oneapi-src.github.io/oneDNN/dev_guide_verbose.html?highlight=onednn_verbose#run-time-controls) for the verbose mode.
- By default, data will be recorded using a rotating lazy logger with a file size specified by `ONEDNN_LOGFILE_SIZE(=1024*1024)` and the number of rotatiing files specified by DNNL_LOGFILE_AMT(=5).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 MB for a log file is definitely not enough, I propose at least 25 MB, maybe even 50.

Suggest to rename DNNL_LOGFILE_AMT into DNNL_VERBOSE_NUM_LOGFILES. Not obvious shortcut from AMT to amount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks!

| 2 | SPDLOG_LEVEL_INFO | `check`,`profile_create`, `profile_exec`, `profile` |
| 3 | SPDLOG_LEVEL_WARNING | --- |
| 4 | SPDLOG_LEVEL_ERROR | `error` |
| 5 | SPDLOG_LEVEL_CRITICAL | - |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the matching might be a little bit different:

error -> CRITICAL
check -> ERROR
dispatch -> INFO

| 6 | SPDLOG_LEVEL_OFF | `none` |

With this alignment, the tracing information printed out for each verbose mode can be also logged at the aligned level. Obviously, the logging level here is determined from the value of the `ONEDNN_VERBOSE` variable.
To exercise better control over the logged data, it is recommended to introduce another variable `ONEDNN_LOG_VERBOSE` which specifies the verbose mode for the logged data and also additionally allows the user to use the [`filter`]((https://oneapi-src.github.io/oneDNN/dev_guide_verbose.html?highlight=onednn_verbose#run-time-controls)) option for the verbose mode to filter the information being logged for supported components and primitives.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ONEDNN_LOG_VERBOSE meant to match spdlog levels? If yes, what's the benefit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ONEDNN_LOG_VERBOSE is proposed so that the verbose levels can be separately specified for the logfiles and the console.
For example, setting ONEDNN_VERBOSE=error and ONEDNN_LOG_VERBOSE=all will print all data to the logfiles but only display errors on the console.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of solutions that would provide an information both to stdout and file, this might be misleading for an end-user to see stdout messages when logging into a file was requested.

Is there a specific request to support such mode?

Copy link
Contributor Author

@avmanerikar avmanerikar Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no specific ask for printing to both stdout and logfiles, but recommended solutions provide an option for the user to enable outputs to the console while logging.
The proposed design prints both to stdout and the logfiles by default.

```

When logging is enabled, the data will be recorded in the logfiles with the same format and field albeit with a few updates:
- ONEDNN_VERBOSE_TIMESTAMP will be enabled for the logger and will be displayed in the logfile in the format `[%Y-%M-%D %H:%M:%S]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely wouldn't be appropriate for users who requested timestamp feature. The whole point of it was having a straightforward and linear order of values in ms that is easy to work with in csv. The new format is definitely not helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be two alternative formats that can keep the timestamps:

  • print the verbose outputs as they are with the spdlog fields. This will allow one to process the logfiles in csv without losing any data.
[2024-07-03 16:18:23.632] [onednn] [info] onednn_verbose,1693533460193.346924,primitive,create:cache_miss,cpu,convolution,jit:avx512_core,forward_training,src_f32:a:blocked:aBcd16b::f0 wei_f32:a:blocked:ABcd16b16a::f0 bia_f32:a:blocked:a::f0 dst_f32:a:blocked:aBcd16b::f0,,alg:convolution_direct,mb2_ic16oc16_ih7oh7kh5sh1dh0ph2_iw7ow7kw5sw1dw0pw2,0.709961
  • print the verbose outputs while removing all spdlog fields. The printed data will then be identical to what is displayed on the console.
onednn_verbose,1693533460193.346924,primitive,create:cache_miss,cpu,convolution,jit:avx512_core,forward_training,src_f32:a:blocked:aBcd16b::f0 wei_f32:a:blocked:ABcd16b16a::f0 bia_f32:a:blocked:a::f0 dst_f32:a:blocked:aBcd16b::f0,,alg:convolution_direct,mb2_ic16oc16_ih7oh7kh5sh1dh0ph2_iw7ow7kw5sw1dw0pw2,0.709961

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First one should be fine.

@avmanerikar avmanerikar force-pushed the amanerik/rfcs/verbose-logging-support branch from 7d2c271 to 8d18991 Compare July 3, 2024 22:40
@avmanerikar avmanerikar force-pushed the amanerik/rfcs/verbose-logging-support branch from 8d18991 to f201740 Compare July 3, 2024 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFC A design document
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants