Add Out_channel STM tests #431

jmid · 2024-01-06T00:11:34Z

The Lin Out_channel test is a sore point of the test suite:

it required several revisions to get to the current state: Lin Out_channel shrink cleanup #387, In/Out_channel Lin test revision #370, Out_channel frequency adjustment #319
it takes too long to run on macOS and has therefore been disabled: Disable the negative Out_channel tests on macOS #326
it doesn't always trigger on FreeBSD: Lin Out_channel test fails to trigger on FreeBSD #401
it occasionally takes very long to run or shrink, which can cause timeouts: Long Out_channel shrinking runs cause timeout #378

Out_channels buffer internally, meaning length's result can vary.
This is a bad fit for Lin's sequential consistency test, because we may end up finding counterexamples of different buffering (between a parallel and any interleaved sequential run) - or just spend a long time searching for one.
As such, a model-based STM test is a better fit - and this PR offers one.
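
To make the buffering point concrete, here is a minimal sketch (not part of the PR's test code). The helper name `lengths_around_flush` is made up for illustration. It only uses `Out_channel` functions from the standard library (OCaml 4.14 or later) and records `length` before and after a `flush`.

```ocaml
(* Observe how Out_channel.length relates to buffering.
   Returns (length before flush, length after flush).
   Hypothetical helper, for illustration only. *)
let lengths_around_flush (s : string) : int64 * int64 =
  let path = Filename.temp_file "out_channel_demo" ".txt" in
  (* Binary mode avoids any newline translation on Windows. *)
  let oc = Out_channel.open_bin path in
  Fun.protect
    ~finally:(fun () -> Out_channel.close_noerr oc; Sys.remove path)
    (fun () ->
      Out_channel.output_string oc s;
      (* Bytes still sitting in the channel buffer may not be counted yet. *)
      let before = Out_channel.length oc in
      Out_channel.flush oc;
      (* After a flush, all bytes have reached the file. *)
      let after = Out_channel.length oc in
      (before, after))

let () =
  let before, after = lengths_around_flush "hello" in
  Printf.printf "before flush: %Ld, after flush: %Ld\n" before after
```

The value observed before the flush depends on the buffer state, so only "at most the flushed length" can be relied upon.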

The new model-based test

tests sequential and parallel usage
runs across all CI platforms as a positive test
covers as much of the Out_channel module as the existing Lin test
runs much more predictably
enables us to test Out_channel on macOS again (without any issues)

Many things thus speak for switching to it.

On the other hand, despite its downsides the old Lin test has also managed to stress the runtime into triggering defects #412

Technically, there are a few nuggets in the test code:

we model both closed and open Out_channels with suitable preconditions. As such the test can generate open-close-open-close cmd sequences.
the uncertainty of buffering affecting length's output is handled by checking that the observed length is less than or equal to the model's length
binary-mode (only available on Windows) is modeled in the state
- when enabled on MinGW/Cygwin it means length increases by 2 for every '\n' output (since '\n' maps to "\r\n")
- position does not reflect this translation though
- the tests finally found a counterexample illustrating how buffering may interfere with set_binary_mode: it may not be the mode enabled at output-call-time that takes effect (example below). This is solved by always calling flush before set_binary_mode on MinGW/Cygwin where this matters:
```
Open_text : Ok (())
Set_binary_mode true : ()
Output_char '\n' : Ok (())
Set_binary_mode false : ()
Flush : ()
Length : 2
```
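
The modeling ideas above can be sketched roughly as follows. This is a simplified, hypothetical model and not the code in src/io/stm_tests.ml: the type and function names are made up, the `~crlf` flag stands in for "running on MinGW/Cygwin", and it assumes the "\r\n" translation applies while the channel is in text mode.

```ocaml
(* Hypothetical, simplified model of an output channel. *)
type mode = Text | Binary

type state =
  | Closed
  | Open of { mode : mode; length : int } (* bytes expected in the file after a flush *)

type cmd =
  | Open_text
  | Close
  | Set_binary_mode of bool
  | Output_string of string

(* With [crlf] set, each '\n' written in text mode counts as the two bytes "\r\n". *)
let bytes_written ~crlf mode s =
  let newlines =
    String.fold_left (fun n c -> if c = '\n' then n + 1 else n) 0 s in
  match mode with
  | Text when crlf -> String.length s + newlines
  | Text | Binary -> String.length s

let next_state ~crlf cmd state =
  match cmd, state with
  | Open_text, Closed -> Open { mode = Text; length = 0 }
  | Open_text, Open _ -> state (* excluded by the precondition *)
  | Close, _ -> Closed
  | Set_binary_mode b, Open o -> Open { o with mode = (if b then Binary else Text) }
  | Output_string s, Open o ->
    Open { o with length = o.length + bytes_written ~crlf o.mode s }
  | (Set_binary_mode _ | Output_string _), Closed -> Closed

(* Buffered bytes may not have reached the file yet, so the observed
   length only has to be at most the model's length. *)
let length_ok state observed =
  match state with
  | Open { length; _ } -> observed <= length
  | Closed -> true (* closed channels are not checked in this sketch *)

let run ~crlf cmds =
  List.fold_left (fun s c -> next_state ~crlf c s) Closed cmds

let () =
  let cmds =
    [ Open_text; Output_string "a\n"; Set_binary_mode true; Output_string "b\n" ] in
  match run ~crlf:true cmds with
  | Open { length; _ } -> Printf.printf "model length with translation: %d\n" length
  | Closed -> print_endline "closed"
```

Note that this sketch attributes each output to the mode active at output-call time, which is exactly the assumption the counterexample above breaks unless flush is called before set_binary_mode.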

Closes #378 and #401

jmid · 2024-01-09T14:17:36Z

CI summary for 2150652:

All 6 Cygwin/MinGW trunk workflows failed with a version mismatch, since trunk is now 5.3
linux-arm64-5.2 aborted domain_spawntree with Fatal error: Failed to create domain [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428

Out of 59 workflows 7 failed with 1 genuine failure and 6 CI setup issues

jmid · 2024-01-09T14:18:15Z

I've rebased the PR on main after the merge of #429

jmid · 2024-01-09T21:53:24Z

CI summary for 024dd85

4 workflows aborted domain_spawntree with Fatal error: Failed to create domain [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
- Cygwin 5.2
- Cygwin trunk
- FP trunk
- MinGW 5.2
MinGW bytecode 5.2 timed out during threadomain [ocaml5-issue] Windows failures on threadomain #203

Out of 44 workflows 5 failed with all 5 being genuine failures.

shym

A real test for Out_channels, where we explicitly take care of the buffering, this is very nice! Congratulations!

src/io/stm_tests.ml

shym · 2024-01-11T10:30:59Z

> On the other hand, despite its downsides the old Lin test has also managed to stress the runtime into triggering defects #412

Do you think the STM tests are less likely to trigger such defects?

jmid · 2024-01-11T16:57:47Z

Thanks for the review!

Discussing this PR with you prompted me to look up previous counterexamples reported by the Lin Out_channel test.
Indeed, the last merge to main a571972 triggered, e.g., the following for the Linux 5.2 debug workflow:

Messages for test Lin Out_channel test with Domain:

  Results incompatible with sequential execution

                                           |                  
                         Out_channel.output_byte t 2 : Ok (())
                                           |                  
                       .--------------------------------------.
                       |                                      |                  
         Out_channel.length t : Ok (1)         Out_channel.close_noerr t : ()

This should have a possible sequentialization (below) which we seem to miss:

Out_channel.output_byte t 2;;
Out_channel.length t;;
Out_channel.close_noerr t;;

Looking at previous counterexamples also made me realize that

some of them are triggered by using close/close_noerr in parallel. However we won't generate such inputs in this STM test version ATM unfortunately, because they don't have a well-balanced open-close serialization. I want to experiment with lifting this limitation.

other Lin Out_channel counterexamples are triggered by weird, unlinearizable output behaviours arising from running other cmds on already closed channels. This is a gray-zone specification-wise, which does not offer an immediate benefit AFAICS. As such, I'm more reluctant to invest in lifting it.

                                                                                                                 |                                                       
                                                                                                   Out_channel.flush t : Ok (())                                         
                                                               Out_channel.output_bytes t "7)\016)\170\245A{(2\255\000mr\185\243\206\2475-\192i\135\2216}w" : Ok (())    
                                                                                           Out_channel.set_binary_mode t false : Ok (())                                 
                                                                                                                 |                                                       
                                                         .---------------------------------------------------------------------------------------------------------------.
                                                         |                                                                                                               |                                                       
                                           Out_channel.length t : Ok (27)                                                                                  Out_channel.close_noerr t : ()                                        
                                       Out_channel.is_buffered t : Ok (true)                                                                               Out_channel.close t : Ok (())                                         
                                       Out_channel.output_byte t 6 : Ok (())                                                                               Out_channel.pos t : Ok (65536)                                        
                                                                                                                                                       Out_channel.is_buffered t : Ok (true)                                     
                                                                                                                                   Out_channel.output_bytes t "K\226" : Error (Sys_error("Bad file descriptor"))                 
                                                                                                                                                       Out_channel.is_buffered t : Ok (true)                                     
                                                                                                                   Out_channel.output_substring t "\197\180\2484g\180\230\193" 4 7 : Error (Invalid_argument("output_substring"))
                                                                                                                                                           Out_channel.flush t : Ok (())

Our discussion also made me check my recollection regarding locking:
https://github.com/ocaml/ocaml/blob/cedff5854ac91dccf5847b527192223ef506b1e2/runtime/io.c#L60

All operations on channels first take the channel lock.

As such, the Out_channel operations should act atomically when used in parallel and be sequentially consistent.
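
A small sanity check of that expectation could look like the sketch below. It assumes OCaml 5 (for `Domain`), and the helper name `parallel_output_length` is made up. Two domains write single characters to the same channel, and afterwards the flushed length should equal the total number of writes.

```ocaml
(* Requires OCaml 5: Domain is part of the standard library there.
   Hypothetical helper, for illustration only. *)
let parallel_output_length ~per_domain =
  let path = Filename.temp_file "out_channel_par" ".bin" in
  let oc = Out_channel.open_bin path in
  (* Each writer emits [per_domain] copies of its character. *)
  let writer c () =
    for _ = 1 to per_domain do
      Out_channel.output_char oc c
    done in
  let d1 = Domain.spawn (writer 'a') in
  let d2 = Domain.spawn (writer 'b') in
  Domain.join d1;
  Domain.join d2;
  (* Flush so that length reflects every byte written by both domains. *)
  Out_channel.flush oc;
  let len = Out_channel.length oc in
  Out_channel.close oc;
  Sys.remove path;
  len

let () =
  Printf.printf "length after parallel writes: %Ld\n"
    (parallel_output_length ~per_domain:1000)
```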

> On the other hand, despite its downsides the old Lin test has also managed to stress the runtime into triggering defects #412

> Do you think the STM tests are less likely to trigger such defects?

Possibly, but I'm unsure.

A single Lin input will call the Out_channel interface (up to) 50 times * 1 parallel run * #interleavings times.
In comparison a single STM input will call the interface (up to) 25 times * 1 parallel run.

Since #interleavings can be quite high, a corner-case bug is more likely to trigger under Lin simply because the interface is exercised more often.
On the other hand, when considering this behaviour across a count of 1000, the negative Lin test will finish sooner (and report a "counterexample") whereas the current STM test runs to completion (1000 inputs * 25 repetitions).
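
To give a feeling for how quickly #interleavings grows: two cmd sequences of lengths m and n can be interleaved in C(m+n, m) ways. The sketch below computes that count. The function name and the sample lengths are only illustrative and are not the sizes Lin actually generates.

```ocaml
(* Number of ways to interleave two command sequences of lengths m and n
   while preserving the order within each sequence, i.e. C(m + n, m). *)
let interleavings m n =
  (* table.(i).(j) = number of interleavings of prefixes of lengths i and j.
     Row 0 and column 0 stay at 1: one empty side leaves a single ordering. *)
  let table = Array.make_matrix (m + 1) (n + 1) 1 in
  for i = 1 to m do
    for j = 1 to n do
      (* The last command comes either from the first or the second sequence. *)
      table.(i).(j) <- table.(i - 1).(j) + table.(i).(j - 1)
    done
  done;
  table.(m).(n)

let () =
  List.iter
    (fun (m, n) ->
      Printf.printf "%d + %d cmds: %d interleavings\n" m n (interleavings m n))
    [ (2, 2); (5, 5); (10, 10) ]
```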

jmid · 2024-01-11T17:50:50Z

In 8236876 I extend the STM tests to be able to generate close and close_noerr in state Closed too.

shym · 2024-01-12T10:42:08Z

> This should have a possible sequentialization (below) which we seem to miss:

In the sequentialized version, the length will most probably be 0, as the buffer is not flushed; the output we see is due to the close_noerr being half-run, having flushed but not closed yet (as those two steps are explicitly not run atomically), isn’t it?

> This is a gray-zone specification-wise, which does not offer an immediate benefit AFAICS.

To make sure that I understand correctly what this is doing, because I think I missed it in my first review: due to the postcond function, we’ll never test close or close_noerr in one of the parallel branches except when the other branch is only doing open and close calls, or am I mistaken in how postcond is used in all_interleavings_ok? This would mean those tests would never run into issues such as ocaml/ocaml#11878 I suppose?

jmid · 2024-01-12T14:03:00Z

> This should have a possible sequentialization (below) which we seem to miss:

> In the sequentialized version, the length will most probably be 0, as the buffer is not flushed; the output we see is due to the close_noerr being half-run, having flushed but not closed yet (as those two steps are explicitly not run atomically), isn’t it?

This is a probable explanation indeed! 👍
However it seems a bit unsatisfying that this stops testing and is reported as a counterexample of parallelism-safety.

> This is a gray-zone specification-wise, which does not offer an immediate benefit AFAICS.

> To make sure that I understand correctly what this is doing, because I think I missed it in my first review: due to the postcond function, we’ll never test close or close_noerr in one of the parallel branches except when the other branch is only doing open and close calls, or am I mistaken in how postcond is used in all_interleavings_ok? This would mean those tests would never run into issues such as ocaml/ocaml#11878 I suppose?

Point well taken - and yes, you are right 👍
Following the Erlang version described in the ICFP'09 paper, STM will generate candidate "Y"-triples and only keep those which satisfy all preconditions in all interleavings (which is what all_interleavings_ok should express). This indeed leaves out two parallel close* cmds.
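
As a rough, self-contained sketch of that filtering (not the actual STM implementation): the two-state model, the restrictive `precond`, and the name `interleavings_ok` below are simplified stand-ins chosen for illustration.

```ocaml
(* Simplified stand-in model: only tracks whether the channel is open. *)
type state = Open | Closed
type cmd = Open_text | Close | Output_char of char

(* Restrictive preconditions: nothing but Open_text is allowed on a closed channel. *)
let precond cmd state =
  match cmd, state with
  | Open_text, Closed -> true
  | Open_text, Open -> false
  | (Close | Output_char _), Open -> true
  | (Close | Output_char _), Closed -> false

let next_state cmd state =
  match cmd with
  | Open_text -> Open
  | Close -> Closed
  | Output_char _ -> state

(* Check that every cmd's precondition holds in every interleaving of
   [left] and [right], after first running the sequential [prefix]. *)
let rec interleavings_ok state prefix left right =
  match prefix with
  | c :: rest ->
    precond c state && interleavings_ok (next_state c state) rest left right
  | [] ->
    let step c rest_left rest_right =
      precond c state
      && interleavings_ok (next_state c state) [] rest_left rest_right in
    match left, right with
    | [], [] -> true
    | l :: ls, [] -> step l ls []
    | [], r :: rs -> step r [] rs
    | l :: ls, r :: rs -> step l ls right && step r left rs

let () =
  let show name ok =
    Printf.printf "%s: %s\n" name (if ok then "kept" else "discarded") in
  show "two parallel outputs"
    (interleavings_ok Closed [ Open_text ] [ Output_char 'a' ] [ Output_char 'b' ]);
  show "two parallel closes"
    (interleavings_ok Closed [ Open_text ] [ Close ] [ Close ])
```

With preconditions this strict, whichever Close runs second sees state Closed, so the triple with two parallel closes is thrown away before it is ever executed.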

I've therefore taken a pass over the STM test and extended the cmds so that all of them can also occur in state Closed. The only generator and precond limitation now is that Open_text is not allowed in state Open (this was also a limitation of the Lin test).

This revealed a divergence from the spec as I shared offline:

val close : t -> unit
(** Close the given channel, flushing all buffered write operations.  Output
    functions raise a [Sys_error] exception when they are applied to a closed
    output channel, except {!close} and {!flush}, which do nothing when applied
    to an already closed channel.  Note that {!close} may raise [Sys_error] if
    the operating system signals an error when flushing or closing. *)

Yet output_string, output_bytes, output, and output_substring of length 0 on a Closed channel do not fail, e.g.:

--- Failure --------------------------------------------------------------------

Test STM Out_channel test sequential failed (22 shrink steps):

   Close
   Output_string ""


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test STM Out_channel test sequential:

  Results incompatible with model

   Close : Ok (())
   Output_string "" : Ok (())

For now, I've adjusted postcond to accept these corner cases.
We should consider submitting a PR to adjust the specification though...
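
For reference, the corner case can be reproduced outside the test suite with a few lines. This is a minimal sketch with made-up helper names. It only checks the zero-length case reported above and makes no claim about non-empty outputs.

```ocaml
(* Run [f] and report whether it raised Sys_error. *)
let raises_sys_error f =
  match f () with
  | () -> false
  | exception Sys_error _ -> true

(* Close a channel, then output the empty string to it.
   Hypothetical helper, for illustration only. *)
let empty_output_on_closed_raises () =
  let path = Filename.temp_file "out_channel_closed" ".txt" in
  let oc = Out_channel.open_text path in
  Out_channel.close oc;
  let raised =
    raises_sys_error (fun () -> Out_channel.output_string oc "") in
  Sys.remove path;
  raised

let () =
  Printf.printf "output_string \"\" on a closed channel raised Sys_error: %b\n"
    (empty_output_on_closed_raises ())
```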

jmid · 2024-01-15T08:31:32Z

> This would mean those tests would never run into issues such as ocaml/ocaml#11878 I suppose?

We can now experimentally confirm that the latest changes can indeed find issues like that.
All 5.2 and 5.3/trunk workflows show a regression when outputting to a closed Out_channel, where only the first cmd now triggers an exception: 😮

+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test STM Out_channel test sequential:

  Results incompatible with model

   Close_noerr : Ok (())
   Output_string "}\169)gN\255\017" : Error (Sys_error("Bad file descriptor"))
   Output_byte 48 : Ok (())

jmid · 2024-01-16T14:37:24Z

Some overdue CI summaries...

For 9c63209:

Cygwin trunk aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
MinGW bytecode 5.1 crashed during threadomain [ocaml5-issue] Windows failures on threadomain #203
MinGW bytecode 5.2 aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
MinGW trunk aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
macOS trunk failed to build dune [dune-issue] dune fails build under trunk/5.3 on macOS #433
Cygwin 5.2 was cancelled after 138m during Dynlink [ocaml5-issue] Deadlock in Dynlink test on Cygwin+MinGW+MSVC #307

Out of 44 workflows 5 failed and 1 was cancelled, all 6 due to genuine issues

For 8236876:

Cygwin 5.2 aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
Cygwin trunk aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
MinGW 5.2 aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
MinGW bytecode trunk aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
macOS trunk failed to build dune [dune-issue] dune fails build under trunk/5.3 on macOS #433

Out of 44 workflows 5 failed with all 5 being genuine issues

For 0c8fdcb:

MinGW bytecode 5.1 timed out in threadomain [ocaml5-issue] Windows failures on threadomain #203
4 workflows aborted during domain_spawntree [ocaml5-issue] Fatal error: Failed to create domain on s390x, MinGW, Cygwin, ... #428
- Cygwin 5.2
- Cygwin trunk
- MinGW 5.2
- macOS 5.2
macOS trunk failed to install dune [dune-issue] dune fails build under trunk/5.3 on macOS #433
MinGW 5.1 failed the new STM Out_channel test with flush raising Sys_error("Bad file descriptor") when used in parallel
all 21 5.2 and trunk/5.3 workflows failed the new STM Out_channel test due to the exception regression [ocaml5-issue] Output to closed Out_channel fails to raise Sys_error #432
- 32bit 5.2
- 32bit trunk
- Bytecode 5.2
- Bytecode trunk
- Cygwin 5.2
- Cygwin trunk
- FP 5.2
- FP trunk
- Linux 5.2
- Linux 5.2 debug
- Linux trunk
- Linux trunk debug
- MinGW 5.2
- MinGW bytecode 5.2
- MinGW bytecode trunk
- MinGW trunk
- macOS 5.2
- linux-arm64-5.2
- linux-s390x-5.2
- linux-ppc64le-5.2
- macos-arm64-5.2

Out of 44 workflows 24 failed with a total of 28 failures (4 workflows failed two tests) with all of them being genuine

jmid · 2024-01-16T14:40:28Z

The PR's latest test revision is so effective at triggering the #432 regression that we should consider temporarily relaxing the test to accept the current behaviour, to avoid having to check ~20 failing workflows on each CI run... 😬

shym

Looks very good, thank you for the updates!


jmid · 2024-01-23T22:13:57Z

I rebased on main to avoid more false alarms and added 86f2ab2 to temporarily disable the #432 regression reports.

jmid · 2024-01-23T22:16:47Z

CI summary:

Cygwin 5.2 timed out during the Lin Dynlink test [ocaml5-issue] Deadlock in Dynlink test on Cygwin+MinGW+MSVC #307
macOS trunk failed to build dune [dune-issue] dune fails build under trunk/5.3 on macOS #433

Out of 44 workflows 2 failed, with both being genuine issues

jmid · 2024-01-24T12:15:50Z

CI summary for merge to main:

macOS trunk failed to build dune [dune-issue] dune fails build under trunk/5.3 on macOS #433

Out of 45 workflows 1 failed with a genuine issue

jmid linked an issue Jan 6, 2024 that may be closed by this pull request

Lin Out_channel test fails to trigger on FreeBSD #401

Closed

jmid force-pushed the io-stm-tests branch from 2150652 to 024dd85 Compare January 9, 2024 14:09

shym approved these changes Jan 10, 2024

jmid mentioned this pull request Jan 16, 2024

[ocaml5-issue] Output to closed Out_channel fails to raise Sys_error #432

Closed

shym approved these changes Jan 22, 2024

jmid added 14 commits January 23, 2024 14:48

Initial STM tests of Out_channel

71083dd

More Out_channel STM hacking

5e3f2e8

Update positions accordingly

72a5458

Length before flushing may not be up-to-date

fc97895

Whitespace/comment polish

46774d6

Add Seek cmd

894c643

Add close_noerr cmd

c3677ad

Add output_byte cmd

04a0f95

Add output_bytes cmd

c69ec6d

Add output cmd

f4aede1

Add output_substring cmd

f89ee56

Adjust weights

a90cd18

Clean up next_state

bfa89df

Remove dead code

1bbef88

jmid added 21 commits January 23, 2024 14:48

Allow Pos in Closed state

c6e6d04

Allow Flush in Closed state

066025a

Allow Output_char in Closed state

a047e3d

Allow Output_byte in Closed state

41be61b

Allow Output_string in Closed state

192a3d0

Bend the spec for Output_string in Closed state to match reality

f4bfced

Allow Output_bytes in Closed state

474c25d

Bend the spec for Output_bytes in Closed state to match reality

378725b

Allow Output in Closed state

805a634

Bend the spec for Output in Closed state to match reality

12a3e80

Allow Output_substring in Closed state

41fdf08

Allow Set_binary_mode in Closed state

8e1ec3b

Allow Set_buffered in Closed state

1adfd60

Allow Is_buffered in Closed state

9500753

Adjust frequencies in Closed state

a11d754

Simplify precond

544d564

Adjust Open_text postcond

15d6225

Fix indentation

1b76830

Sharpen the relaxed spec to only accept length 0 output w/o erroring on state Closed

bacb33e

Small indentation fix

2abefbd

Temporarily accept Ok on a closed Out_channel

86f2ab2

jmid force-pushed the io-stm-tests branch from 0c8fdcb to 86f2ab2 Compare January 23, 2024 13:49

jmid merged commit b53236d into main Jan 23, 2024
32 of 34 checks passed

jmid deleted the io-stm-tests branch January 23, 2024 22:17

This was referenced Jan 23, 2024

[ocaml5-issue] Out_channel Lin test takes very long on macOS #321

Closed

Disable Lin Out_channel test under FreeBSD #430

Closed

jmid mentioned this pull request Mar 8, 2024

Revert temporary acceptance in STM Out_channel test #441

Merged
