Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SslStreamNetworkStreamTest failures on Mac #53597

Closed
antonfirsov opened this issue Jun 2, 2021 · 15 comments · Fixed by #57381
Closed

SslStreamNetworkStreamTest failures on Mac #53597

antonfirsov opened this issue Jun 2, 2021 · 15 comments · Fixed by #57381
Labels
area-System.Security os-mac-os-x macOS aka OSX test-bug Problem in test source code (most likely) test-run-core Test failures in .NET Core test runs
Milestone

Comments

@antonfirsov
Copy link
Member

antonfirsov commented Jun 2, 2021

There are severeal SslStream test failures on Mac which seem to be similar to the Windows failures tracked by #29683 (same tests failing with an exception coming from the PAL)

Tests:

System.Net.Security.Tests.SslStreamNetworkStreamTest

  • SslStream_TargetHostName_Succeeds
  • SslStream_NestedAuth_Throws
  • SslStream_RandomSizeWrites_OK

Failures 6/13-8/13 (incl. PRs):

Day Run OS
6/17 Official run OSX.1013.Amd64.Open
6/18 Official run OSX.1013.Amd64.Open - Mono
6/23 PR #54621 OSX.1015.Amd64.Open
7/2 PR #54063 OSX.1013.Amd64.Open
7/12 PR #55220 OSX.1015.Amd64.Open - Mono
7/16 PR #55785 OSX.1015.Amd64.Open
8/4 PR #54640 OSX.1014.Amd64.Open
8/5 PR #56354 OSX.1014.Amd64.Open
8/6 PR #56983 OSX.1014.Amd64.Open
8/10 PR #57159 OSX.1015.Amd64.Open - Mono

Failure:

System.AggregateException : One or more errors occurred. (The specified item could not be found in the keychain.) (The following constructor parameters did not have matching fixture data: CertificateSetup setup)
---- Interop+AppleCrypto+AppleCommonCryptoCryptographicException : The specified item could not be found in the keychain.
---- The following constructor parameters did not have matching fixture data: CertificateSetup setup

----- Inner Stack Trace #1 (Interop+AppleCrypto+AppleCommonCryptoCryptographicException) -----
   at Interop.AppleCrypto.X509CopyWithPrivateKey(SafeSecCertificateHandle certHandle, SafeSecKeyRefHandle privateKeyHandle, SafeKeychainHandle targetKeychain) in /_/src/libraries/Common/src/Interop/OSX/System.Security.Cryptography.Native.Apple/Interop.X509.cs:line 422
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(SafeSecKeyRefHandle privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.cs:line 649
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.cs:line 566
   at System.Security.Cryptography.X509Certificates.RSACertificateExtensions.CopyWithPrivateKey(X509Certificate2 certificate, RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/System/Security/Cryptography/X509Certificates/RSACertificateExtensions.cs:line 56
   at System.Net.Security.Tests.TestHelper.GenerateCertificates(String targetName, String testName, Boolean longChain, Boolean serverCertificate) in /_/src/libraries/System.Net.Security/tests/FunctionalTests/TestHelper.cs:line 176
   at System.Net.Security.Tests.CertificateSetup..ctor() in /_/src/libraries/System.Net.Security/tests/FunctionalTests/SslStreamNetworkStreamTest.cs:line 29
----- Inner Stack Trace #2 (Xunit.Sdk.TestClassException) -----

Sample Kusto query

TestResults 
| join kind=inner WorkItems on WorkItemId 
| join kind=inner Jobs on JobId
| where Finished >= datetime(2021-05-01) and Method == "SslStream_TargetHostName_Succeeds" and Result == 'Fail'
| order by Finished desc 
| project Finished, Branch, Type, Method, Arguments, QueueName, MachineName, Message, StackTrace

/cc @wfurt @aik-jahoda

@antonfirsov antonfirsov added area-System.Net.Security test-run-core Test failures in .NET Core test runs labels Jun 2, 2021
@ghost
Copy link

ghost commented Jun 2, 2021

Tagging subscribers to this area: @dotnet/ncl, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

There are severeal SslStream test failures on Mac which seem to be similar to the Windows failures tracked by #29683 (same tests failing with an exception coming from the PAL)

Test class:

System.Net.Security.Tests.SslStreamNetworkStreamTest

Test cases:

SslStream_TargetHostName_Succeeds, SslStream_NestedAuth_Throws, SslStream_RandomSizeWrites_OK

Message:

System.AggregateException : One or more errors occurred. (The specified item could not be found in the keychain.) (The following constructor parameters did not have matching fixture data: CertificateSetup setup)
---- Interop+AppleCrypto+AppleCommonCryptoCryptographicException : The specified item could not be found in the keychain.
---- The following constructor parameters did not have matching fixture data: CertificateSetup setup

Stack trace:

----- Inner Stack Trace #1 (Interop+AppleCrypto+AppleCommonCryptoCryptographicException) -----
   at Interop.AppleCrypto.X509CopyWithPrivateKey(SafeSecCertificateHandle certHandle, SafeSecKeyRefHandle privateKeyHandle, SafeKeychainHandle targetKeychain) in /_/src/libraries/Common/src/Interop/OSX/System.Security.Cryptography.Native.Apple/Interop.X509.cs:line 422
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(SafeSecKeyRefHandle privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.cs:line 649
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.cs:line 566
   at System.Security.Cryptography.X509Certificates.RSACertificateExtensions.CopyWithPrivateKey(X509Certificate2 certificate, RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/System/Security/Cryptography/X509Certificates/RSACertificateExtensions.cs:line 56
   at System.Net.Security.Tests.TestHelper.GenerateCertificates(String targetName, String testName, Boolean longChain, Boolean serverCertificate) in /_/src/libraries/System.Net.Security/tests/FunctionalTests/TestHelper.cs:line 176
   at System.Net.Security.Tests.CertificateSetup..ctor() in /_/src/libraries/System.Net.Security/tests/FunctionalTests/SslStreamNetworkStreamTest.cs:line 29
----- Inner Stack Trace #2 (Xunit.Sdk.TestClassException) -----

Sample Kusto query

TestResults 
| join kind=inner WorkItems on WorkItemId 
| join kind=inner Jobs on JobId
| where Finished >= datetime(2021-05-01) and Method == "SslStream_TargetHostName_Succeeds" and Result == 'Fail'
| order by Finished desc 
| project Finished, Branch, Type, Method, Arguments, QueueName, MachineName, Message, StackTrace

/cc @wfurt @aik-jahoda

Author: antonfirsov
Assignees: -
Labels:

area-System.Net.Security, test-run-core

Milestone: -

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jun 2, 2021
@ghost
Copy link

ghost commented Jun 2, 2021

Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq, @GrabYourPitchforks
See info in area-owners.md if you want to be subscribed.

Issue Details

There are severeal SslStream test failures on Mac which seem to be similar to the Windows failures tracked by #29683 (same tests failing with an exception coming from the PAL)

Test class:

System.Net.Security.Tests.SslStreamNetworkStreamTest

Test cases:

SslStream_TargetHostName_Succeeds, SslStream_NestedAuth_Throws, SslStream_RandomSizeWrites_OK

Message:

System.AggregateException : One or more errors occurred. (The specified item could not be found in the keychain.) (The following constructor parameters did not have matching fixture data: CertificateSetup setup)
---- Interop+AppleCrypto+AppleCommonCryptoCryptographicException : The specified item could not be found in the keychain.
---- The following constructor parameters did not have matching fixture data: CertificateSetup setup

Stack trace:

----- Inner Stack Trace #1 (Interop+AppleCrypto+AppleCommonCryptoCryptographicException) -----
   at Interop.AppleCrypto.X509CopyWithPrivateKey(SafeSecCertificateHandle certHandle, SafeSecKeyRefHandle privateKeyHandle, SafeKeychainHandle targetKeychain) in /_/src/libraries/Common/src/Interop/OSX/System.Security.Cryptography.Native.Apple/Interop.X509.cs:line 422
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(SafeSecKeyRefHandle privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.cs:line 649
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.cs:line 566
   at System.Security.Cryptography.X509Certificates.RSACertificateExtensions.CopyWithPrivateKey(X509Certificate2 certificate, RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/System/Security/Cryptography/X509Certificates/RSACertificateExtensions.cs:line 56
   at System.Net.Security.Tests.TestHelper.GenerateCertificates(String targetName, String testName, Boolean longChain, Boolean serverCertificate) in /_/src/libraries/System.Net.Security/tests/FunctionalTests/TestHelper.cs:line 176
   at System.Net.Security.Tests.CertificateSetup..ctor() in /_/src/libraries/System.Net.Security/tests/FunctionalTests/SslStreamNetworkStreamTest.cs:line 29
----- Inner Stack Trace #2 (Xunit.Sdk.TestClassException) -----

Sample Kusto query

TestResults 
| join kind=inner WorkItems on WorkItemId 
| join kind=inner Jobs on JobId
| where Finished >= datetime(2021-05-01) and Method == "SslStream_TargetHostName_Succeeds" and Result == 'Fail'
| order by Finished desc 
| project Finished, Branch, Type, Method, Arguments, QueueName, MachineName, Message, StackTrace

/cc @wfurt @aik-jahoda

Author: antonfirsov
Assignees: -
Labels:

area-System.Security, os-mac-os-x, test-run-core, untriaged

Milestone: -

@wfurt
Copy link
Member

wfurt commented Jun 2, 2021

This fails to copy certificate with key. may be related to #46837.
Let me know @bartonjs if you see anything wrong with the code generating the certificates.

@vcsjones
Copy link
Member

vcsjones commented Jun 2, 2021

I have some suspicion that #51976 may be related, I will try to reproduce and bisect.

@bartonjs
Copy link
Member

bartonjs commented Jul 2, 2021

Looks like this is still happening:

System.AggregateException : One or more errors occurred. (UNIX[Undefined error: 0]) (The following constructor parameters did not have matching fixture data: CertificateSetup setup)
---- Interop+AppleCrypto+AppleCommonCryptoCryptographicException : UNIX[Undefined error: 0]
---- The following constructor parameters did not have matching fixture data: CertificateSetup setup

The latter exception is caused by the first, so it's ignorable. The first exception callstack:

   at Interop.AppleCrypto.X509CopyWithPrivateKey(SafeSecCertificateHandle certHandle, SafeSecKeyRefHandle privateKeyHandle, SafeKeychainHandle targetKeychain) in /_/src/libraries/Common/src/Interop/OSX/System.Security.Cryptography.Native.Apple/Interop.X509.macOS.cs:line 277
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(SafeSecKeyRefHandle privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.Keys.cs:line 167
   at Internal.Cryptography.Pal.AppleCertificatePal.CopyWithPrivateKey(RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/Internal/Cryptography/Pal.OSX/AppleCertificatePal.Keys.cs:line 99
   at System.Security.Cryptography.X509Certificates.RSACertificateExtensions.CopyWithPrivateKey(X509Certificate2 certificate, RSA privateKey) in /_/src/libraries/System.Security.Cryptography.X509Certificates/src/System/Security/Cryptography/X509Certificates/RSACertificateExtensions.cs:line 56
   at System.Net.Security.Tests.TestHelper.GenerateCertificates(String targetName, String testName, Boolean longChain, Boolean serverCertificate) in /_/src/libraries/System.Net.Security/tests/FunctionalTests/TestHelper.cs:line 177
   at System.Net.Security.Tests.CertificateSetup..ctor() in /_/src/libraries/System.Net.Security/tests/FunctionalTests/SslStreamNetworkStreamTest.cs:line 30

That same callstack (including throwing from Interop.X509.macOS.cs#277) also manifests with a different message:

System.AggregateException : One or more errors occurred. (The specified item could not be found in the keychain.) (The following constructor parameters did not have matching fixture data: CertificateSetup setup)
---- Interop+AppleCrypto+AppleCommonCryptoCryptographicException : The specified item could not be found in the keychain.
---- The following constructor parameters did not have matching fixture data: CertificateSetup setup

For SslStream_TargetHostName_Succeeds this fires about once a week (not consistently enough to be "aha! it's a Sunday afternoon maintenance bug!", or anything like that).

@bartonjs bartonjs added this to the 6.0.0 milestone Jul 2, 2021
@bartonjs bartonjs removed the untriaged New issue has not been triaged by the area owner label Jul 2, 2021
@vcsjones
Copy link
Member

I spent some time trying to reproduce it with no luck.

UNIX[Undefined error: 0]

This feels very reminiscent of #39603. Not immediately suggesting it’s related or the same cause, just connecting two dots that may have nothing to do with each other.

Do we capture macOS system logs during CI at all? I seem to remember the keychain giving nice logs for that particular issue, but I was looking at the logs locally since I could reproduce it on my machine.

@danmoseley
Copy link
Member

I am not aware that we capture system logs on Mac. But probably here is the place that we would capture them
https://github.com/dotnet/runtime/blob/main/eng/testing/RunnerTemplate.sh
then either write them to stdout (as I did in there for the last lines of dmesg on Linux) or if larger, put them in a file at $HELIX_WORKITEM_UPLOAD_ROOT which I believe will cause them to appear in the Azure DevOps UI.

@danmoseley
Copy link
Member

It would be nice if there was an xunit mechanism to do this but there isn't.

@bartonjs
Copy link
Member

Assuming I'm interpreting the data correctly, literally every test failure involving a CopyWithPrivateKey operation comes from the networking tests CertificateSetup fixture. So either it's doing something wrong, or it's uncovering a weird GC hole, or something.

@MihaZupan
Copy link
Member

These 3 tests are back to failing daily #54640 #56354 #56983

@karelz
Copy link
Member

karelz commented Aug 13, 2021

This is still happening (see updated table in top post) ... do we know how to address it, or should we disable the tests in 6.0 branch?
@bartonjs @wfurt

@bartonjs
Copy link
Member

I should have time to look at this today.

@bartonjs
Copy link
Member

bartonjs commented Aug 13, 2021

OK, I was able to coerce failure. 3 of them, all at the same time.

  • Every failure in Kusto that involves CopyWithPrivateKey comes from this test fixture
  • This test fixture calls a System.Net.Security.Tests helper with a "longChain: true" option.
  • When longChain is true the helper does something more than CertificateAuthority.BuildPrivatePki
  • CertificateAuthority.BuildPrivatePki is used oodles of times in the X.509 tests. So the oddness is probably in the "if (longChain) path".
  • In that path two new intermediates are generated, and the EE cert is replaced with this new, longer, cert path.
  • The line that fails is always the line that binds the new EE cert's private key.
  • I'm now like 99% sure that what's happening is
                    RSA eeKey = endEntity.GetRSAPrivateKey();
                    endEntity = intermediateAuthority3.CreateEndEntity(
                        $"CN=\"A SSL Test\", O=\"testName\"",
                        eeKey,
                        extensions);

// You lose the lottery.  the previous endEntity just got finalized.
/// Since cert+key always lives in a keychain, and we just destroyed the keychain, eeKey is now reference to a destroyed key.

                    endEntity = endEntity.CopyWithPrivateKey(eeKey);

Sure enough, if I save the reference to the previous endEntity and dispose it where that comment goes, this moves to deterministic failure.

Shortest workaround, save a reference to endEntity and keep it alive past the rebind with GC.KeepAlive.

@wfurt is there a reason that you did all the rework for the long chain, instead of

            CertificateAuthority.BuildPrivatePki(
                PkiOptions.IssuerRevocationViaCrl,
                out RevocationResponder responder,
                out CertificateAuthority root,
-               out CertificateAuthority intermediate,
+               out CertificateAuthority[] intermediateAuthorities,
                out X509Certificate2 endEntity,
+               intermediateAuthorityCount: longChain ? 2 : 1,
                subjectName: targetName,
                testName: testName,
                keySize: keySize,
                extensions: extensions);

@wfurt
Copy link
Member

wfurt commented Aug 13, 2021

There was no specific reason besides getting longer chain (if I remember well, this was added in work to make sure the chain is sent correctly on Ssl) If the intermediateAuthorityCount existed back than I miss it @bartonjs .

@bartonjs
Copy link
Member

OK, if it's just for the longer chain I'll put up a PR to convert it to just pass the longer path count.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 13, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 14, 2021
@bartonjs bartonjs added the test-bug Problem in test source code (most likely) label Aug 15, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Security os-mac-os-x macOS aka OSX test-bug Problem in test source code (most likely) test-run-core Test failures in .NET Core test runs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants