Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #2095, Automatic suppression of flooding events #2109

Closed
wants to merge 11 commits into from
Closed

Fix #2095, Automatic suppression of flooding events #2109

wants to merge 11 commits into from

Conversation

jhnphm
Copy link

@jhnphm jhnphm commented May 25, 2022

Checklist (Please check before submitting)

Describe the contribution

Testing performed

  1. Built with SIMULATION=i686-native ENABLE_UNIT_TESTS=true
  2. Ran unit tests, all passing except time due to local time configuration changes
  3. Built for VxWorks
  4. Ran cFS on vxWorks and observed events being squelched and the squelch event message being emitted

Expected behavior changes
A clear and concise description of how this contribution will change behavior and level of impact.

  • API Change: Squelch error code gets returned by event send functions if a squelch happens
  • Behavior Change: Events get squelched after 16 events within 1 second per app (default in cpu1_platform_cfg); tokens are returned at a rate of 8/sec (default in cpu1_platform_cfg) to allow a sustained event rate of 8/sec/app. Event is emitted upon squelching, and counters incremented. Counters added to event data stored to file and realtime telemetry.

System(s) tested on

  • Hardware: SP0-s (Simics)
  • OS: VxWorks 6.9
  • Versions: cFE 7.0.0+dev127, OSAL v6.0.0-rc4+dev66, OSAL v6.0.0-rc4+dev66 w/ JSC changes

Additional context
N/A

Third party code
N/A

Contributor Info - All information REQUIRED for consideration of pull request
John N Pham (@jhnphm), Northrop Grumman, Space Systems

@jhnphm
Copy link
Author

jhnphm commented May 25, 2022

CLA incoming; will have to submit to my management to fill out

@skliper
Copy link
Contributor

skliper commented May 26, 2022

@jhnphm We just got a new consolidated form that includes all the apps, I'll email it.

@jhnphm
Copy link
Author

jhnphm commented May 26, 2022

Updated unit test to improve coverage, and fixed doc errors. Updated commit message to add more detail.

Copy link
Contributor

@jphickey jphickey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overall there are some concerns that need to be resolved here. Simplify the logic, mainly.
I didn't look too closely at the unit tests but it seems to require modification to every test case. This is definitely not ideal, I'd try to avoid that if at all possible. I would think that it could be worked around by giving each app a huge amount of credit during the common setup function (e.g. INT32_MAX if you also use my suggestion of going with a signed credit value and a decrement)

modules/evs/fsw/src/cfe_evs_utils.c Show resolved Hide resolved
modules/evs/fsw/src/cfe_evs_task.h Outdated Show resolved Hide resolved
modules/evs/fsw/src/cfe_evs_task.h Outdated Show resolved Hide resolved
modules/evs/fsw/src/cfe_evs_utils.c Show resolved Hide resolved
@jhnphm
Copy link
Author

jhnphm commented May 27, 2022

I think overall there are some concerns that need to be resolved here. Simplify the logic, mainly. I didn't look too closely at the unit tests but it seems to require modification to every test case. This is definitely not ideal, I'd try to avoid that if at all possible. I would think that it could be worked around by giving each app a huge amount of credit during the common setup function (e.g. INT32_MAX if you also use my suggestion of going with a signed credit value and a decrement)

I simplified it a bit, but still requires resetting the token count to INT32_MAX after each successful call to CFE_EVS_Register, since that resets the token counter to CFE_PLATFORM_EVS_MAX_APP_EVENT_BURST, and I had to put a case in to ignore adding credits if the existing token count was > CFE_PLATFORM_EVS_MAX_APP_EVENT_BURST, which seems messy if it's just to satisfy tests. I guess I could add a flag to CFE_EVS_Global or one of it's substructs to properly disable squelching globally at runtime (and at that point might as well make it a command), but that would add more complexity.

This adds automatic event suppression, which works by decrementing a
token counter every time an event message is sent. When it becomes negative
an event message is emitted and further event messages are squelched
until the tokens increase above 0.

Tokens are returned (incremented) at the rate of
CFE_PLATFORM_EVS_APP_EVENTS_PER_SEC, upon the next event message call
for the app, based on time since last token return. This avoids a
dependency on HK wake messages, which are not guaranteed to be at 1Hz,
or received at all.

There are limits in place to prevent tokens from being above
CFE_PLATFORM_EVS_MAX_APP_EVENT_BURST and below
-CFE_PLATFORM_EVS_MAX_APP_EVENT_BURST. The initial value upon
registration is CFE_PLATFORM_EVS_MAX_APP_EVENT_BURST.
@jhnphm jhnphm requested a review from jphickey May 27, 2022 00:20
modules/evs/fsw/src/cfe_evs.c Outdated Show resolved Hide resolved
 * Use EVS global lock
 * Use increments of 1000 for tokens to avoid integer division
 * Use global variable instead of using platform_cfg file defines
   directly to allow unit tests to override value
 * Fix EVS event message ID
…kspace/phamj/CSIC9/hfi/cfe into auto-event-suppression
@jhnphm
Copy link
Author

jhnphm commented Jun 7, 2022

@skliper @jphickey Added the changes requested/bugs pointed out during today's meeting. I'm using the global EVS mutex if that's okay. Also updated docs and took a first cut at requirements in the CSV.

Not sure if the diagram included is detailed enough; it's a notional state diagram, but there really isn't a state machine in the code. I have this activity diagram but it might be too "busy":
evs_squelch

@jhnphm jhnphm requested a review from skliper June 7, 2022 20:57
@skliper
Copy link
Contributor

skliper commented Jun 10, 2022

Closed in favor of #2117

@jusvit
Copy link

jusvit commented Jun 13, 2022

Closed in favor of #2117

Damn skliper you really did jhnphm dirty...

@skliper
Copy link
Contributor

skliper commented Jun 13, 2022

@justinvvitale - I should have been more verbose in my closure statement. Transfer was coordinated w/ all parties involved to straighten out contribution ownership based on the various organizations and contracts involved with this work. There's significant process involved behind the scenes in some external contribution situations, and I realize the optics don't look good w/o context. :)

@nasa nasa locked as off-topic and limited conversation to collaborators Jun 13, 2022
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic suppression of flooding events
4 participants