Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of operations on OS_DeleteAllObjects #471

Closed
jphickey opened this issue May 19, 2020 · 3 comments · Fixed by #711 or #750
Closed

Order of operations on OS_DeleteAllObjects #471

jphickey opened this issue May 19, 2020 · 3 comments · Fixed by #711 or #750
Assignees
Milestone

Comments

@jphickey
Copy link
Contributor

Is your feature request related to a problem? Please describe.
OS_DeleteAllObjects() is used when shutting down the system, such as after an exception, a commanded processor restart, or CTRL+C, etc.

This simply deletes resources based on their numeric OS_OBJECT_TYPE value, meaning tasks (1) are first, followed by queues, bin/count, semaphores, mutexes, etc. and eventually timers (9).

A recent issue described in nasa/cFE#701 observed a potential problem with this. As the task and semaphore are deleted, a timer could still be running. If that timer executes during shutdown, it may interrupt the deletion, and attempt to use semaphore objects.

Normally this shouldn't be an issue because OSAL will return an error and reject the call. However due to an underlying issue in Binary Semaphores (#470) after task cancellation, this caused deadlock.

Describe the solution you'd like
It would be preferable to delete timers first, then tasks, then semaphores, files, and other resources. This should be a safer ordering in general, as it will reduce the potential for resources to be used as they are being deleted.

Describe alternatives you've considered
Leave as-is.

Additional context
The real fix for #470 prevents deadlock, this is just more future-proofing.

Requester Info
Joseph Hickey, Vantage Systems, Inc.

@CDKnightNASA
Copy link
Contributor

It wouldn't hurt to also delete objects in reverse order...For short-lived instances they will likely be in reverse order (eldest first). Of course this is no guarantee, but easy to implement in the code.

@jphickey jphickey self-assigned this Jun 4, 2020
@skliper
Copy link
Contributor

skliper commented Dec 9, 2020

@jphickey want to resolve this for Caelum?

@jphickey
Copy link
Contributor Author

jphickey commented Dec 9, 2020

Sure, I think I already have a patch for this somewhere that never got put into a final PR. I can find it and submit.

@skliper skliper added this to the 6.0.0 milestone Dec 9, 2020
jphickey added a commit to jphickey/osal that referenced this issue Dec 28, 2020
When cleaning up for shutdown, delete resources that have
a task/thread first, followed by other resource types.

This helps avoid possible dependencies as running threads
might be using the other resources.
astrogeco added a commit that referenced this issue Jan 12, 2021
Fix #471, order of operations for delete all
jphickey added a commit to jphickey/osal that referenced this issue Aug 10, 2022
Add test cases to exercise all functions, lines, and branches to
the extent reasonably possible.  Improves the coverage stats
significantly:

  functions 98.9% -> 100%
  lines 96.4% -> 99.8%
  branches 87.1% -> 94.9%

Remaining uncovered lines/branches are not possible to be reached
due to the way the code is structured, or because it would require
an alternate implementation of SBR (note that SB+SBR are currently
tested as a single unit, even though they are technically separate
modules now).  For example, the "direct" SBR implementation cannot
have collisions, hence the collision handling in SB cannot be
reached.  Making stubs for SBR may allow this to be tested.
jphickey pushed a commit to jphickey/osal that referenced this issue Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants