
Prevent SqlConnection from being held past an async boundary #22658

Merged
sharwell merged 4 commits into dotnet:dev15.4.x on Oct 13, 2017

Conversation

sharwell
Member

@sharwell sharwell commented Oct 11, 2017

Fixes #22650

Rather than placing a hard cap on the number of connections, I updated the code to ensure database connections are only held while they are actually in use.

CC: @dotnet/roslyn-ide @heejaechang @CyrusNajmabadi @onyxmaster

Customer scenario

A customer opens a large solution with 15.3 or 15.4. The IDE is at high risk for encountering OOM errors.

Bugs this fixes:

#22650
DevDiv 507560

Workarounds, if any

None

Risk

Relatively low.

Performance impact

For scenarios where database writes to disk are slow, performance is improved by allowing database connection sharing.

Is this a regression from a previous update?

Yes.

Root cause analysis:

As a best guess:

  1. Our drives are faster, so we didn't observe the stalls locally (a local reproducer would mean we could sample the native allocations at runtime to get call stacks, as opposed to looking at a heap dump post-mortem)
  2. The memory allocation impacting the process was a native memory allocation, which we were aware of in statistics but did not have insight into the cause of (we now know to treat the managed SqlConnection and sqlite3 objects as heavier to account for this, similar to what we already did for MemoryMappedView; a sketch of that accounting idea follows this list)
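
For illustration, a minimal sketch of the "treat these objects as heavier" accounting mentioned in item 2, using GC.AddMemoryPressure; the wrapper type and the per-connection estimate are assumptions for this sketch, not the actual Roslyn code.

using System;

// Hypothetical wrapper: a managed object that owns a large native allocation
// (here the ~425,600-byte SQLite allocation seen in the dumps) reports that
// size to the GC so collections are scheduled as if the object were that big.
internal sealed class HeavyNativeResource : IDisposable
{
    private const long EstimatedNativeBytes = 425_600;
    private bool _disposed;

    public HeavyNativeResource()
    {
        GC.AddMemoryPressure(EstimatedNativeBytes);
    }

    public void Dispose()
    {
        if (!_disposed)
        {
            _disposed = true;
            GC.RemoveMemoryPressure(EstimatedNativeBytes);
        }
    }
}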

How was the bug found?

The 425600-byte allocation pattern was discovered by Lee Culver on the .NET Fundamentals team. Text contents in some of these blocks made me suspicious of SQLite, leading me to search for "425600 SQLite". The value appeared twice in interesting results:

https://forums.plex.tv/discussion/256008/pms-crashes-randomly-under-load

ERROR - SQLITE3:1B3E069A, 7, failed to allocate 425600 bytes of memory

http://iamoracleadmin.blogspot.com/p/sqlite-cheat-sheet.html

Largest Allocation: 425600 bytes

I then used PerfView to review the number of connections present in the managed heap, and found it exactly matched the number of native allocations.

Test documentation updated?

N/A

@heejaechang
Contributor

heejaechang commented Oct 11, 2017

I think the pool must still be upper bounded. There is no reason for it to be unbounded, caching 100 connections when most of the time 80 of them are just sitting there doing nothing.

@heejaechang
Contributor

Oops, button clicked by mistake, sorry.

@heejaechang
Contributor

By the way, how many connections were there? I'm surprised that even with connections held across async boundaries, that many concurrent connections were needed.

@sharwell
Member Author

sharwell commented Oct 11, 2017

I think the pool must still be upper bounded

The pool has a soft upper bound of the number of managed threads. I had originally planned to add a second commit adding a hard limit on the pool itself, but realized there's no longer a code path where that would do anything interesting.

The specific limit is the number of threads attempting to access the database concurrently. Given the threading mode, this is a good limit. We could further reduce resources over time with an MRU eviction policy, but I don't believe it would get us much further than this change already has. Plus, there seem to be other areas of the SQLite cache that could be improved first.
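
To make the "soft upper bound" concrete, here is a minimal sketch of the pool shape being described; the type and member names are hypothetical, not the actual Roslyn implementation. Because a connection is rented only for the duration of a database operation and returned immediately, the pool can never grow past the number of threads that are inside an operation at the same instant.

using System;
using System.Collections.Concurrent;

// Hypothetical pool: growth is bounded by concurrency, not by request count,
// because a returned connection is always reused before a new one is created.
internal sealed class SqlConnectionPool<TConnection>
    where TConnection : class
{
    private readonly ConcurrentStack<TConnection> _available = new ConcurrentStack<TConnection>();
    private readonly Func<TConnection> _factory;

    public SqlConnectionPool(Func<TConnection> factory)
    {
        _factory = factory;
    }

    public TConnection Rent()
    {
        return _available.TryPop(out var connection) ? connection : _factory();
    }

    public void Return(TConnection connection)
    {
        _available.Push(connection);
    }
}

With this shape, a hard cap only matters if callers hold connections across awaits, which is exactly what this change removes.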

@sharwell
Member Author

By the way, how many connections were there?

The issue was found in a heap that had 2740. The screenshot for the issue was another heap with 3429. I've seen OOM heap dumps with ~900 all the way up to a winner of ~7000.

@heejaechang
Contributor

I still don't see why an upper limit is bad. If you really don't want one, put a crash there so that if the limit is higher than we expect, we explicitly crash VS.

@heejaechang
Contributor

Hmmm... I still can't understand how there can be 7000 pooled connections. Are we having a thread starvation issue again?

@sharwell
Member Author

sharwell commented Oct 11, 2017

@heejaechang There is no need to crash VS, since the problem typically manifests as UI delays, etc. The telemetry team has new red flags in place and will alert us if they see low-memory conditions with a large number of these instances. Also, the use of the shared cache flag (which we should talk about separately) should prevent the cache from growing even if the number of connections does.

@sharwell
Member Author

Hmmm... I still can't understand how there can be 7000 pooled connections. Are we having a thread starvation issue again?

We don't allow reads while writes are pending. Previously, read operations would claim a connection before waiting (asynchronously) for pending operations to complete. Now, read operations return the connection to the pool before waiting.
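
As a sketch of that reordering (the method and parameter names are hypothetical, and the pool type is the sketch above rather than Roslyn's actual storage code):

using System;
using System.Threading.Tasks;

internal static class ReadOrderingSketch
{
    // Before: the connection is rented and then held across the await, so every
    // reader queued behind a slow flush pins one pooled connection while it waits.
    public static async Task<TResult> ReadOldAsync<TConnection, TResult>(
        SqlConnectionPool<TConnection> pool, Task pendingWritesFlushed,
        Func<TConnection, TResult> read)
        where TConnection : class
    {
        var connection = pool.Rent();
        try
        {
            await pendingWritesFlushed.ConfigureAwait(false);
            return read(connection);
        }
        finally
        {
            pool.Return(connection);
        }
    }

    // After: await the flush first, then rent a connection only for the
    // synchronous read itself, so waiting readers hold no connection at all.
    public static async Task<TResult> ReadNewAsync<TConnection, TResult>(
        SqlConnectionPool<TConnection> pool, Task pendingWritesFlushed,
        Func<TConnection, TResult> read)
        where TConnection : class
    {
        await pendingWritesFlushed.ConfigureAwait(false);

        var connection = pool.Rent();
        try
        {
            return read(connection);
        }
        finally
        {
            pool.Return(connection);
        }
    }
}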

@Pilchie
Member

Pilchie commented Oct 11, 2017

In other words, when FAR starts a per-document search, if the SyntaxTreeIndex is being written, then there will be a connection per document?

@heejaechang
Contributor

I don't understand. If we are sure connections won't be pooled to 7000, then we should put that assumption in the code; if we are not sure, we should guard against it. Otherwise, we are just hiding the problem.

@@ -53,7 +53,7 @@ public static SqlConnection Create(IPersistentStorageFaultInjector faultInjector

// Enable shared cache so that multiple connections inside of same process share cache
// see https://sqlite.org/threadsafe.html for more detail
- var flags = OpenFlags.SQLITE_OPEN_CREATE | OpenFlags.SQLITE_OPEN_READWRITE | OpenFlags.SQLITE_OPEN_SHAREDCACHE;
+ var flags = OpenFlags.SQLITE_OPEN_CREATE | OpenFlags.SQLITE_OPEN_READWRITE | OpenFlags.SQLITE_OPEN_NOMUTEX | OpenFlags.SQLITE_OPEN_SHAREDCACHE;
Member

Why this flag?

Member Author

This flag says we only use connection instances on one thread at a time. We were already doing this, but since we didn't specify the flag, the database fell back to SQLITE_OPEN_FULLMUTEX, which added a second layer of locks.
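
For reference, a sketch of how the flag choice maps onto SQLite's threading modes; the values below are the standard sqlite3.h open-flag constants, and the enum and helper are illustrative stand-ins rather than Roslyn's actual Create code.

using System;

// Standard sqlite3.h open-flag values (subset).
[Flags]
internal enum OpenFlags
{
    SQLITE_OPEN_READWRITE   = 0x00000002,
    SQLITE_OPEN_CREATE      = 0x00000004,
    // "Multi-thread" mode: no per-connection mutex; safe as long as each
    // connection object is used by only one thread at a time.
    SQLITE_OPEN_NOMUTEX     = 0x00008000,
    // "Serialized" mode: every call takes the connection mutex.
    SQLITE_OPEN_FULLMUTEX   = 0x00010000,
    SQLITE_OPEN_SHAREDCACHE = 0x00020000,
}

internal static class ConnectionFlagsSketch
{
    // The pool already guarantees one-thread-at-a-time use of each connection,
    // so NOMUTEX drops the redundant second layer of locking described above.
    public static OpenFlags PooledConnectionFlags =>
        OpenFlags.SQLITE_OPEN_CREATE
        | OpenFlags.SQLITE_OPEN_READWRITE
        | OpenFlags.SQLITE_OPEN_NOMUTEX
        | OpenFlags.SQLITE_OPEN_SHAREDCACHE;
}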

@heejaechang
Contributor

heejaechang commented Oct 11, 2017

With the current implementation, if we hit 100 at one point and never hit more than 20 again, the pool still holds onto 100 connections, wasting 80. When we improve the pool to do eviction and so on, we can remove the upper limit then. There is no reason not to add one now just because we might do eviction in the future, especially when putting an upper limit in place is so simple.

...

Also, I still can't understand how there can be 7000 pooled connections. Are we sure we are not leaking something?

@sharwell
Member Author

sharwell commented Oct 11, 2017

@Pilchie

In other words, when FAR starts a per-document search, if the SyntaxTreeIndex is being written, then there will be a connection per document?

I don't know enough about the reasons why we read from the database to answer this directly. However, I do know two things:

  1. I'd like to improve the overall cache behavior to make reads not dependent on flushing writes to disk
  2. SyntaxTreeIndex has shown up as a red flag recently (SyntaxTreeIndex holding a large working set #22605)

@heejaechang

I don't understand. If we are sure connections won't be pooled to 7000, then we should put that assumption in the code; if we are not sure, we should guard against it. Otherwise, we are just hiding the problem.

There are some edge cases which make it very difficult to predict the hard upper bound. I'm willing to implement telemetry and perhaps a time-based pool eviction policy so we are aware of regressions, but to me it's not worth causing a user to crash (coded assertions) or severe performance degradation (SQL connections getting created due to exceeding pool limits).

@heejaechang
Contributor

heejaechang commented Oct 11, 2017

severe performance degradation (SQL connections getting created due to exceeding pool limits).

Are you sure this causes severe performance degradation? Currently we have evidence that pooling causes a severe problem (OOM), but we don't have evidence for the other direction.

Especially with the change you made here, I think an upper limit won't affect perf much.

@heejaechang
Contributor

By the way, I think any pool or cache that doesn't have an upper limit is a code smell.

@sharwell
Member Author

By the way, I think any pool or cache that doesn't have an upper limit is a code smell.

I agree. But one step at a time is good, especially on a rushed change like this.

@onyxmaster
Contributor

The nomutex flag should be beneficial in terms of performance and allow for cross-thread connection migration when a lock is held on the connection, but is it needed for the shared cache scenario?

As for the original case: when developing #22340 I had file logging enabled, and I saw a large number of connections running amok (I had a log file for each pid+thread_id combo on connection open, so I noticed lots of files quickly appearing), but it appears that having a fast CPU/disk prevents process death. I didn't report it in #22340 because I believed it was the "normal" mode of operation, and I saw it even without the proposed locking changes.

As for the connection pool limit, I'd put a cap on the connection pool. Crashing is of course not an option, but logging (not sure what policy Roslyn has for these kinds of errors) plus a wait queue or reverting to a non-pooled approach looks fine to me (yes, it will slow down, and it would be another case to solve, but it's still an improvement over crashing).

@sharwell
Member Author

The nomutex flag should be beneficial in terms of performance and allow for cross-thread connection migration when a lock is held on the connection, but is it needed for the shared cache scenario?

Given that we use more than one connection object, it appears nomutex is the behavior we always coded for. I'm not sure why shared cache has been enabled - it appears to be more relevant for embedded applications. However, its primary impact involves cases with multiple writers, which appear to be "less common" (though not impossible) in our code.

As for the connection pool limit, I'd put a cap on the connection pool.

There is a soft cap in place equivalent to the maximum number of active threads concurrently accessing the database. Given that the threads are generally within the .NET thread pool, this is loosely related to the number of logical processors on the machine, but can vary a bit from that. I would not expect this to ever exceed the number of logical processors times two, and plan to verify this through telemetry.

@heejaechang
Contributor

Shared cache is beneficial to us since it keeps each connection from being so expensive.
https://sqlite.org/sharedcache.html

@sharwell
Member Author

@heejaechang

Shared cache is beneficial to us since it keeps each connection from being so expensive.

Shared cache also causes concurrent write attempts on separate connection objects to block on each other. I'm hoping this is rare, but we should watch to see whether performance reports deviate from expectations. For now it acts as a secondary safeguard against runaway memory problems due to connection pool growth (and for the 15.5 Preview 1 release, it will be the only remediation in place).

@heejaechang
Contributor

By the way, with exclusive mode and shared cache, we might no longer need to wait for the writer, since only one process uses the db and it doesn't require the file system or a mutex to communicate the db's state to another process or connection.

@heejaechang
Contributor

I'm not sure why writes can't be serialized. I think WAL basically does that for us. Before, this was an issue since multiple processes could write to the db, but I believe that is no longer possible?

@heejaechang
Contributor

Anyway, okay, if other people are okay with not having an upper limit.

@CyrusNajmabadi
Member

It puts changes in a memory cache, then in journals, and then in the db.
I guess SQLite doesn't have that kind of mode? I think it might have one.

The problem was simply that it was effectively hundreds of thousands of transactions. Each write by some component into the persistence layer needed the full cost of being handled by SQLite properly. A single transaction that inserts 10k rows is way better than 10k transactions that insert one row each.

Even with WAL that remains true. If there's a way to configure SQLite to make it the same, then by all means switch :) But I could find nothing that would give the same perf.

--

To repeat: I would love to remove the buffering layer :) It's just added complexity for a perf win. If we can get the perf win without it, be my guest :D
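
To illustrate the transaction-batching point, a minimal sketch using the Microsoft.Data.Sqlite ADO.NET provider; Roslyn's persistence layer uses its own lower-level SQLite interop, and the table and column names here are made up for the example.

using Microsoft.Data.Sqlite;

internal static class BatchedWriteSketch
{
    // One transaction around all the inserts means a single commit (one
    // journal/WAL sync) instead of one commit per row, which is where the cost
    // of "10k transactions that insert one row each" comes from.
    public static void WriteAll(string databasePath, (long Id, byte[] Data)[] rows)
    {
        using var connection = new SqliteConnection($"Data Source={databasePath}");
        connection.Open();

        using var transaction = connection.BeginTransaction();
        using var command = connection.CreateCommand();
        command.Transaction = transaction;
        command.CommandText = "INSERT INTO Blobs (Id, Data) VALUES ($id, $data)";

        var id = command.Parameters.Add("$id", SqliteType.Integer);
        var data = command.Parameters.Add("$data", SqliteType.Blob);

        foreach (var row in rows)
        {
            id.Value = row.Id;
            data.Value = row.Data;
            command.ExecuteNonQuery();
        }

        transaction.Commit();
    }
}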

@heejaechang
Contributor

I searched a bit and it looks like what you did is the recommended approach. :)

@sharwell sharwell changed the base branch from master to dev15.4.x October 12, 2017 00:14
@sharwell
Member Author

@heejaechang Rebased this on dev15.4.x, and included #21532.

@jasonmalinowski This should be the first PR we were after.

@heejaechang
Contributor

I think we should bring this in too (#22340)

@sharwell
Member Author

@heejaechang After this is merged, we will rebase #22340 and bring it in as a separate change.

@onyxmaster
Contributor

onyxmaster commented Oct 12, 2017

A word of caution: it appears that the problems I encountered with #22340 are related not to exclusive locking but to shared cache.

With shared cache enabled, VS sometimes crashes (stack trace below), with SQLite returning SQLITE_LOCKED, which is the shared-cache counterpart of SQLITE_BUSY. While the BUSY case is handled in many places in the code, the LOCKED one is not.

Application: devenv.exe
Framework Version: v4.0.30319
Description: The application requested process termination through System.Environment.FailFast(string message).
Message: Microsoft.CodeAnalysis.SQLite.Interop.SqlException: database table is locked: StringInfo1
database table is locked
   at Microsoft.CodeAnalysis.SQLite.Interop.SqlConnection.Throw(sqlite3 handle, Result result)
   at Microsoft.CodeAnalysis.SQLite.SQLitePersistentStorage.TryFetchStringTable(SqlConnection connection)
   at Microsoft.CodeAnalysis.SQLite.SQLitePersistentStorage.BulkPopulateProjectIds(SqlConnection connection, Project project, Boolean fetchStringTable)
   at Microsoft.CodeAnalysis.SQLite.SQLitePersistentStorage.TryGetDocumentDataId(SqlConnection connection, Document document, String name, Int64& dataId)
   at Microsoft.CodeAnalysis.SQLite.SQLitePersistentStorage.DocumentAccessor.TryGetDatabaseId(SqlConnection connection, ValueTuple`2 key, Int64& dataId)
   at Microsoft.CodeAnalysis.SQLite.SQLitePersistentStorage.Accessor`3.<ReadStreamAsync>d__9.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.CodeAnalysis.FindSymbols.SyntaxTreeIndex.<PrecalculatedAsync>d__55.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.CodeAnalysis.FindSymbols.SyntaxTreeIndex.<PrecalculateAsync>d__8.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService.WorkCoordinator.IncrementalAnalyzerProcessor.<>c__DisplayClass32_1`1.<<RunAnalyzersAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService.WorkCoordinator.IncrementalAnalyzerProcessor.<GetOrDefaultAsync>d__34`2.MoveNext()
Stack:
   at System.Environment.FailFast(System.String, System.Exception)
   at Microsoft.CodeAnalysis.FailFast.OnFatalException(System.Exception)
   at Microsoft.CodeAnalysis.ErrorReporting.FatalError.Report(System.Exception, System.Action`1<System.Exception>)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor+<GetOrDefaultAsync>d__34`2[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor+<GetOrDefaultAsync>d__34`2[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Start[[Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor+<GetOrDefaultAsync>d__34`2[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]], Microsoft.CodeAnalysis.Features, Version=42.42.42.42, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<GetOrDefaultAsync>d__34`2<System.__Canon,System.__Canon> ByRef)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor.GetOrDefaultAsync[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]](System.__Canon, System.Func`3<System.__Canon,System.Threading.CancellationToken,System.Threading.Tasks.Task`1<System.__Canon>>, System.Threading.CancellationToken)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor+<RunAnalyzersAsync>d__32`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.__Canon)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.__Canon)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor+<GetOrDefaultAsync>d__34`2[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.__Canon)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.__Canon)
   at Microsoft.CodeAnalysis.SolutionCrawler.SolutionCrawlerRegistrationService+WorkCoordinator+IncrementalAnalyzerProcessor+<>c__DisplayClass32_1`1+<<RunAnalyzersAsync>b__0>d[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.Threading.Tasks.VoidTaskResult, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.Threading.Tasks.VoidTaskResult)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Threading.Tasks.VoidTaskResult, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.Threading.Tasks.VoidTaskResult)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.SetResult()
   at Microsoft.CodeAnalysis.Editor.Implementation.TodoComments.TodoCommentIncrementalAnalyzer+<AnalyzeSyntaxAsync>d__7.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.Collections.Immutable.ImmutableArray`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]], System.Collections.Immutable, Version=1.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]].TrySetResult(System.Collections.Immutable.ImmutableArray`1<System.__Canon>)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Collections.Immutable.ImmutableArray`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]], System.Collections.Immutable, Version=1.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]].SetResult(System.Collections.Immutable.ImmutableArray`1<System.__Canon>)
   at Microsoft.CodeAnalysis.Editor.Implementation.TodoComments.TodoCommentIncrementalAnalyzer+<CreateItemsAsync>d__9.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.__Canon)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.__Canon)
   at Microsoft.CodeAnalysis.TextDocumentState+<GetTextAsync>d__30.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.__Canon)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.__Canon)
   at Microsoft.CodeAnalysis.TextDocumentState+<GetTextAndVersionAsync>d__37.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.__Canon)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.__Canon)
   at Microsoft.CodeAnalysis.RecoverableTextAndVersion+<GetValueAsync>d__14.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.FinishContinuations()
   at System.Threading.Tasks.Task.FinishStageThree()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].TrySetResult(System.__Canon)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SetResult(System.__Canon)
   at Microsoft.CodeAnalysis.Host.RecoverableWeakValueSource`1+<GetValueAsync>d__16[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.InvokeMoveNext(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run()
   at System.Threading.Tasks.AwaitTaskContinuation.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()

@sharwell
Member Author

@onyxmaster Sounds like I need to bring #21800 back as well, and then update it to treat BUSY and LOCKED the same way.

@sharwell
Member Author

This is now updated to include #21800. I then modified the two locations which handle SQLITE_BUSY to also handle SQLITE_LOCKED as a second return value with the same impact on our code.


return true;
}
catch (SqlException e) when (e.Result == Result.BUSY || e.Result == Result.LOCKED)
Member

I'm a little nervous about this. Do we need to make the change that causes these 'locked' errors to start happening? What gain are we getting to warrant it?

My specific concern is that this now means many more SQL calls may have to be protected from LOCKED errors, where we didn't have to worry about that before. There may be more places we're missing that could end up leading to actual application errors in the wild.

Member Author

My understanding is that BUSY and LOCKED are similar, and which one gets reported depends on the current connection configuration. The original configuration produced BUSY errors (which we saw in Watson), but the change to use shared cache and exclusive locking (required for follow-up work) changed a subset of those cases to return LOCKED instead.

Contributor

@onyxmaster onyxmaster Oct 12, 2017

@CyrusNajmabadi I'd like to note that BUSY states are almost the same as LOCKED, so the argument about unknown places in the code that may fail applies to BUSY as well.

Update: I didn't see Sam's comment before posting; I agree with it.

Member

@CyrusNajmabadi CyrusNajmabadi Oct 12, 2017

If they are effectively 1:1, then I'm OK with things. My concern was that 'locked' might show up in different places where 'busy' would not have. If you're convinced that's not an issue, then I'm definitely OK with this :) Thanks!
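
A minimal sketch of the pattern being agreed on in this thread: treat the shared-cache LOCKED result exactly like BUSY and report "no persisted data" rather than failing. The Result and SqlException types below are simplified stand-ins for Roslyn's interop types, using the standard SQLite result-code values.

using System;

// Simplified stand-ins for the interop types in the excerpt above.
internal enum Result
{
    OK = 0,
    BUSY = 5,   // SQLITE_BUSY: another connection holds a conflicting file-level lock
    LOCKED = 6, // SQLITE_LOCKED: a shared-cache sibling connection holds a table lock
}

internal sealed class SqlException : Exception
{
    public SqlException(Result result) => Result = result;
    public Result Result { get; }
}

internal static class ContentionSketch
{
    // Treat momentary contention as "no persisted data" instead of a failure,
    // whether it surfaces as BUSY or as the shared-cache LOCKED variant.
    public static bool TryRead<T>(Func<T> read, out T value)
    {
        try
        {
            value = read();
            return true;
        }
        catch (SqlException e) when (e.Result == Result.BUSY || e.Result == Result.LOCKED)
        {
            value = default;
            return false;
        }
    }
}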

@onyxmaster
Contributor

onyxmaster commented Oct 13, 2017

Minor note: #22340 no longer proposes the use of exclusive locking (and the dependence on shared cache), so at least part of the discussion might need reconsideration.

@natidea
Contributor

natidea commented Oct 13, 2017

Approved to merge via 507562

@sharwell sharwell changed the title Prevent SqlConnection from being held past an async boundary WIP Prevent SqlConnection from being held past an async boundary Oct 13, 2017
@sharwell sharwell changed the title WIP Prevent SqlConnection from being held past an async boundary Prevent SqlConnection from being held past an async boundary Oct 13, 2017
@sharwell sharwell merged commit a7c58d2 into dotnet:dev15.4.x Oct 13, 2017
@sharwell sharwell deleted the connection-leak branch October 13, 2017 20:35
@sharwell sharwell added this to the 15.4 milestone Oct 14, 2017