Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal] SqlClientOptions #2870

Open
edwardneal opened this issue Sep 18, 2024 · 9 comments
Open

[API Proposal] SqlClientOptions #2870

edwardneal opened this issue Sep 18, 2024 · 9 comments
Labels
💡 Enhancement New feature request 🆕 Public API Use this label for new API additions to the driver.

Comments

@edwardneal
Copy link
Contributor

edwardneal commented Sep 18, 2024

SqlClientOptions is a proposed mechanism to provide strongly-typed runtime configuration of the SqlClient library prior to use. It doesn't add any new functionality, but I think it's needed before proceeding with some broader modernisation efforts. It also provides a coherent way to address situations where we want to be able to control library-wide defaults, but then to override them on some connections.

Three of the scenarios requiring this are:

  1. SqlClient isn't trim-safe, and the current configuration APIs currently block any progress towards this: the authentication providers and the SqlAuthenticationInitializer are both instantiated using Type.GetType from a string in a configuration file. A strongly-typed equivalent would remove that obstacle.
  2. We currently always load configuration from an app.config file. This isn't a problem for .NET Framework, but newer .NET code uses a different approach - usually DI, although those details aren't particularly relevant. By loosening the coupling between the runtime configuration and its startup source, we open a route for future libraries to integrate with SqlClient, loading their configuration from appsettings.json, Azure App Configuration or similar.
  3. There have been comments in the past about having a DbDataSource implementation, and SqlClientX has SqlDataSource. This is currently built from SqlConnectionStringBuilder, but this implies that any pay-for-play feature enablement such as Azure authentication currently needs to be visible from the connection string. I don't think that needs to be the case - Npgsql has an NpgsqlDataSourceBuilder class with a UseUserCertificateValidationCallback method, which can't cleanly be expressed as part of a connection string. It also has a NpgsqlSlimDataSourceBuilder class which includes methods like EnableIntegratedSecurity and EnableExtraConversions, for use in NativeAOT. SqlClient will likely need a strongly-typed "system options" class of some kind for SqlDataSource, and SqlClientOptions provides a base for that.

The same API surface would also be useful for future development, such as making Azure.Identity optional, custom AlwaysEncrypted enclave providers, custom lists of transient error codes, and possibly cross-platform support for SQL aliases. It effectively provides a platform for providing library-specific configuration where we might want to set global defaults, but have individual overrides of those settings on SqlConnection (and eventually, on SqlDataSource.)

I've based the initial API shape on the three types which load their configuration using System.Configuration:

  • SqlAuthenticationProviderManager. This loads the authentication providers and the SqlAuthenticationInitializer, as well as miscellaneous configuration (the application's OAuth client ID.)
  • LocalDBAPI in .NET Framework. If .NET Framework tries to connect to a LocalDB instance which doesn't exist, it'll create the instance based on this.
  • SqlConfigurableRetryLogicManager, which also uses reflection to locate a SqlRetryLogicBaseProvider derivative.

The API proposal also includes an AppContext switch. Any future trimming support would tell the trimmer to treat the switch as a feature switch and stub away all of the now-centralised references to System.Configuration.

API proposal

namespace Microsoft.Data.SqlClient.Configuration;

public sealed class SqlClientOptions
{
    // Upon first access, if this property hasn't already been set and the relevant AppContext switch is enabled,
    // this will populate an instance from the existing configuration file.
    public static SqlClientOptions Instance { get; set; }

    // Replaces relevant components of LocalDBAPI.
    // Keyed by LocalDB instance name, case-insensitive comparer.
    public Dictionary<string, LocalDBInstanceOptions> LocalDBInstanceOptions { get; set; }

    // Integrates with SqlAuthenticationProviderManager.
    // Maps authentication methods to authentication providers.
    public Dictionary<SqlAuthenticationMethod, SqlAuthenticationProvider> AuthenticationProviders { get; set; }

    // SqlAuthenticationInitializer doesn't have a direct parallel - it's invoked by the constructor.

    // Replaces SqlConfigurableRetryLogicManager
    public ResilienceOptions CommandResilienceOptions { get; set; }

    public ResilienceOptions ConnectionResilienceOptions { get; set; }

    public SqlClientOptions() { }

    // Implements support for SqlAuthenticationInitializer replacement.
    public SqlClientOptions(Action onAuthenticationProvidersInitializing) : this() { }

    public SqlClientOptions Clone();
}

public sealed class ActiveDirectoryAuthenticationProvider : SqlAuthenticationProvider
{
    // OAuth client ID - default to ActiveDirectoryAuthentication.AdoClientId.
    public string ApplicationClientId { get; set; }
}

public sealed class LocalDBInstanceOptions
{
    public string Name { get; }
	
    public Version Version { get; }
}

// Deliberately not sealed to allow a place for third-party resiliency libraries such as Polly to integrate
// in the future, if required.
public class ResilienceOptions
{
    public SqlRetryLogicOption RetryLogicOptions { get; set; }

    public SqlRetryLogicBaseProvider RetryLogicProvider { get; set; }
	
    public virtual ResilienceOptions Clone() { }
}

public class SqlConnection
{
    // Existing constructor
    public SqlConnection()
        : this(SqlClientOptions.Instance) { }
    public SqlConnection(string connectionString)
        : this(connectionString, SqlClientOptions.Instance) { }
    public SqlConnection(string connectionString, SqlCredential credential)
        : this(connectionString, credential, SqlClientOptions.Instance) { }

    // Performs a defensive clone of the SqlClientOptions parameter. This is to avoid the behavioural
    // configuration changing underneath an active SqlConnection.
    public SqlConnection(SqlClientOptions libraryConfiguration) { }
    public SqlConnection(string connectionString, SqlClientOptions libraryConfiguration) { }
    public SqlConnection(string connectionString, SqlCredential credential,
        SqlClientOptions libraryConfiguration) { }

    // Returns another defensive clone of the SqlClientOptions, for the same reason as the constructor.
    public SqlClientOptions Configuration { get; }
}

internal static class LocalAppContextSwitches
{
    // Controlled by the AppContext switch "Switch.Microsoft.Data.SqlClient.LoadSqlClientOptionsFromFile".
    // If this is false, the first access to SqlClientOptions.Instance will not use System.Configuration APIs.
    public static bool LoadSqlClientOptionsFromFile { get; } = true;
}
@JRahnama JRahnama added 🆕 Public API Use this label for new API additions to the driver. untriaged 💡 Enhancement New feature request labels Sep 18, 2024
@JRahnama
Copy link
Member

cc @roji

@roji
Copy link
Member

roji commented Sep 18, 2024

Great to see this @edwardneal.

I'm a bit curious about the need to add support for this to SqlConnection... The new overload would require code changes from users, including passing around the proposed SqlClientOptions instance. At that point, isn't it the same effort for users to just pass around the SqlDataSource instead, and to get the connection from it?

Similarly, once SqlDataSourceBuild supports your config options, I wonder whether it's necessary to introduce an additional "global" mechanism as well. Unlike the SqlConnection overload above, this one makes more sense to me as it allows existing users to just add some lines at program startup and not any other code; but it may make sense to introduce a global mechanism based on user demand rather than up-front, just to see if it's really necessary (see my comments below as well).

Related to the previous point, one thing that isn't clear is whether SqlClientOptions would contain a connection string (above there's an overload of SqlConnection that accepts only a SqlClientOptions, and another that accepts both SqlClientOptions and a connection string). It's important to note that connection options are generally specific to a database, and therefore are strongly tied to a particular connection string - so I think having a SqlClientOptions without a connection isn't likely to be very successful. All this is important especially if you have a global SqlClientOptions - would it close over a connection string (and effectively represent a global, default data source), or just be a bundle of extra config that's applied by default to all connections that get opened regardless of their connection strings (where again, I can see trouble and confusion down the road).

Overall, in Npgsql we've generally gone in the direction of "data source is the right new way to get connections", and have been deprecating/phasing out other ways. For example, there are various configuration options which are only available via NpgsqlDataSourceBuilder; people have been transitioning as they need those features. I get that SqlClient has a different set of users/needs though, so that path may not be appropriate.

@edwardneal
Copy link
Contributor Author

edwardneal commented Sep 18, 2024

Thanks @JRahnama and @roji.

Using the new overload would naturally need code changes, but I expect that the original overloads would remain untouched, behaving is though libraryConfiguration is SqlClientOptions.Instance. I've aimed to preserve the existing behaviour by default, so being able to set per-connection connection resilience options or authentication providers is pay-for-play.

I'm not strongly attached to having the extra constructors on SqlConnection - just having a strongly-typed set of client options which can be reconfigured at runtime unblocks most of the scenarios I'm thinking of: it removes a roadblock for splitting the Azure dependency into a separate package, and removes one of the larger dependencies on System.Configuration (for NativeAOT/trimming support). Those extra constructors are going to be necessary at some stage anyway though: when implemented, SqlDataSource will need to be able to pass its SqlClientOptions instance down to the connections it builds. Making those constructors public rather than internal is just a simple way to let people manually override those settings for a single connection.

To look at your broader point around the data which should be held in SqlClientOptions, the very rough rule of thumb I've used is essentially: this is the existing set of library-wide configuration, collected into one place and with extra functionality to allow per-connection overrides. If SqlClientOptions were ever to contain a connection string, something's gone very badly wrong during the API design.

I don't know where the transition points between connection string key, default value of a SqlConnection property and library-wide default configuration lie though. I actually went around that loop a few times during the initial design - I was originally thinking about including an extra piece of API surface. There's been quite a bit of organic development which blurs these transitions, so I've left the question alone and ported the existing set of configuration as-is.

One particularly strict demarcation might be that the connection string should have all of the information needed to connect securely to a database server under ideal circumstances, and that SqlClientOptions should be considered part of the environment - it should know where the circumstances aren't ideal, and implement a resilience pipeline. That perspective falls apart though: a connection string can contain "Connect Retry Count" and "Connect Retry Interval" keys. I'm not sure what happens if we configure both the SqlClientOptions and the connection string.

Another approach might be that a connection string should result in roughly identical connection processes wherever it's applied. This doesn't quite work either though: Entra authentication relies upon an OAuth client ID which has a sensible default, but can be overridden via an app.config file. Looking more widely, Windows authentication and impersonation can also lead to this failing in ways that a connection string wouldn't necessarily expect.

SqlConnection is also in a similar vein: it has an AccessToken property, which isn't reflected in the connection string. We might think this is because it's got security implications, but that same connection string can contain a SQL login's password (although Persist Security Info does change this slightly.)

If I approached it as a greenfield topic, I'd expect SqlClientOptions to refer to capability enablement and environmental configuration - so it'd be fine for it to contain predefined sets of retry logic, SQL alias configuration and available authentication providers; it wouldn't contain anything which is connection-specific. A connection string would refer to the authentication provider and set of retry logic by name. Any future community libraries would be able to add custom authentication providers or resilience pipelines, potentially requiring new, library-specific, connection string keys.

@roji
Copy link
Member

roji commented Sep 20, 2024

I'm not strongly attached to having the extra constructors on SqlConnection - just having a strongly-typed set of client options which can be reconfigured at runtime unblocks most of the scenarios I'm thinking of: it removes a roadblock for splitting the Azure dependency into a separate package, and removes one of the larger dependencies on System.Configuration (for NativeAOT/trimming support).

OK. Just to tease things apart a bit... If/when you split out the Azure dependency into a separate package, users will have to add some gesture to add it back in. IMHO the recommended way there is for users to switch to SqlDataSourceBuilder, where there will presumably be some extension API allowing the Azure plugin to be added.

However, requiring a switch to SqlDataSource in order to use Azure may be a bit much, since it requires a bit of an invasive code change. But here's the thing: if that's indeed true, then the change to use a new SqlConnection overload that accepts an SqlClientOptions is just as invasive; users would have to pass around SqlClientOptions to any place where an SqlConnection is created.

So if the goal is to allow users to e.g. use Azure after it's split away, but without invasive code changes, some sort of static API would be needed, allowing users to drop a line at the program startup that wires things together, and done. This is why I'm not sure about the point of adding a SqlConnection overload accepting SqlClientOptions: I'm not sure which user scenario it actually helps, and has the disadvantages of SqlDataSource (invasive code changes required) but without the advantages.

I'll also point out that the intersection of a SqlConnection constructor accepting SqlClientOptions and pooling is likely to be difficult... When you instantiate a SqlConnection and call Open(), the driver needs to look up the correct pool internally. I'm assuming that at least some of the options on SqlClientOptions would require a different pool to be used; if that's the case, the internal lookup code would have to include the SqlClientOptions (or a subset thereof) in the cache key for the pools. This will make Open() even slower, rather than just using the string. All this goes away with the data source design - no lookups are necessary since the user directly creates the pool (as a data source) and manages it as they want.

Those extra constructors are going to be necessary at some stage anyway though: when implemented, SqlDataSource will need to be able to pass its SqlClientOptions instance down to the connections it builds.

Npgsql indeed has an NpgsqlDataSourceConfiguration, which is probably loosely similar to SqlClientOptions; but it's an internal implementation detail (and one could imagine simply using NpgsqlDataSourceBuilder itself as the container type for the config).


Regarding the demarcation between the connection string and something like SqlClientOptions - I don't think one exists that would make sense and wouldn't be arbitrary. Both are simply mechanisms for specifying options/configuration for a database connection, and the reason we're discussing something beyond connection strings (SqlClientOptions, data source...) is that connection strings are simply insufficient, because they're simple strings.

If SqlClientOptions were ever to contain a connection string, something's gone very badly wrong during the API design.

I think that a SqlClientOptions that's "detached" from a connection string (i.e. doesn't contain it) would end up not working in the general case. The SqlClientOptions needs to be contain config that's specific to a host/database; for example, different databases may require different authentication providers, as well as various other varying configuration - but the host/database is part of the connection string. So IMHO the idea of e.g. a general "config" that's detached from any specific database, and is used with several connection strings, won't end up working/being useful.

I do fully agree that there needs to be one, single type that captures the entire options/configuration for a connection - host, port, logging, authentication, encryption, type mapping tweaks/plugins, everything. This was indeed one of the major ideas behind the DbDataSource concept, which also couples this concept to an actual factory for connections (as well as a connection pool).

One last thing... Taking an even wider view, I think connection strings can basically be considered a legacy concept from the past, and if I were designing .NET's data access story today, I probably wouldn't have them at all (thanks to @NinoFloris for suggesting this direction). In today's world there are good ways to represent configuration/options that are typed, hierarchical and arbitrary; there's something a bit ridiculous about having an appsettings.json with rich configuration, and then sticking a structured old connection string in it as a single JSON property - stuff like the host, database, port, etc. should simply be decomposed into the JSON just like any other configuration. In fact, with the .NET Options pattern, that configuration could even be changed and reloaded at runtime, with the application switching over to another database. I've summarized this direction in npgsql/npgsql#5067.

Of course, I'm not suggesting anyone do away with connection strings :) But I'm definitely interested in thinking about a future where people simply configure everything on NpgsqlDataSourceBuilder (no connection string!), and that can be loaded directly from JSON config.

@edwardneal
Copy link
Contributor Author

If/when you split out the Azure dependency into a separate package, users will have to add some gesture to add it back in. IMHO the recommended way there is for users to switch to SqlDataSourceBuilder, where there will presumably be some extension API allowing the Azure plugin to be added.

However, requiring a switch to SqlDataSource in order to use Azure may be a bit much, since it requires a bit of an invasive code change. But here's the thing: if that's indeed true, then the change to use a new SqlConnection overload that accepts an SqlClientOptions is just as invasive; users would have to pass around SqlClientOptions to any place where an SqlConnection is created.

So if the goal is to allow users to e.g. use Azure after it's split away, but without invasive code changes, some sort of static API would be needed, allowing users to drop a line at the program startup that wires things together, and done. This is why I'm not sure about the point of adding a SqlConnection overload accepting SqlClientOptions: I'm not sure which user scenario it actually helps, and has the disadvantages of SqlDataSource (invasive code changes required) but without the advantages.

That's a fair point. SqlDataSourceBuilder will need to be able to pass these options down to the SqlConnection, and my goal was essentially to expose that constructor so that developers can vary from the default library-wide options, even in the absence of the SqlDataSourceBuilder API surface. That'd be a new capability though, and it might make more sense to link it exclusively to SqlDataSourceBuilder.

Just for information around the Azure enablement: once it's split apart, the current plan is to try to use reflection to find the Azure authentication provider, so in a non-trimmed environment the only gesture a developer needs to make to re-enable Azure authentication is to add a reference the additional Azure authentication package. While I wouldn't normally want to introduce cross-package reflection with a greenfield API contract such as SqlDataSourceBuilder or a trimmed environment, I'm more supportive of it here; I think that minimising the breakage of that part of the API contract is more important, given how core Azure authentication is to the use of Azure SQL and how long it's been part of SqlClient.

I think that a SqlClientOptions that's "detached" from a connection string (i.e. doesn't contain it) would end up not working in the general case.

The functionality of SqlClientOptions already exists, it's just scattered and only sometimes accessible via config file. The plumbing of authentication methods to the authentication providers is already available via SqlAuthenticationProvider.SetProvider; authentication providers, LocalDB instances and the retry logic providers are already configured from an app.config file. SqlClientOptions simply brings these together, allowing them to be configured at runtime and decoupling the configuration from its source.

I agree that a lot of these configuration options should be exposed via SqlDataSource; they should override the library-level environmental defaults though. I don't think we should tether them solely to a specific connection string, because the net effect would be that we'd only be able to configure retry logic/etc. in .NET Core by using SqlDataSource or by specifying a .NET Framework-style app.config file. One of the goals of SqlClientOptions is to allow the newer appsettings.json (/etc.) file to be used, if required.

One last thing... Taking an even wider view, I think connection strings can basically be considered a legacy concept from the past, and if I were designing .NET's data access story today, I probably wouldn't have them at all (thanks to NinoFloris for suggesting this direction).

Yes - completely agree here. There are already gaps in the current string-based object model when we start to think about retry logic and server certificate validation, and using SqlDataSource as a strongly-typed model for data source configuration makes sense to me as a step in that direction.

It feels to me that most of the moving DI parts we'd need for that exist. IConfigureNamedOptions could obtain the correct setting values, and then the resultant DbDataSources could be passed through as keyed services.

@roji
Copy link
Member

roji commented Sep 22, 2024

Just for information around the Azure enablement: once it's split apart, the #1108 (comment) is to try to use reflection to find the Azure authentication provider, so in a non-trimmed environment the only gesture a developer needs to make to re-enable Azure authentication is to add a reference the additional Azure authentication package.

I see... I can certainly see the logic here, with the breaking change that would affect lots of people otherwise. I'm unsure on the details of using reflection in this way on a referenced assembly that otherwise isn't used in static code - hopefully it'll work :)

I agree that a lot of these configuration options should be exposed via SqlDataSource; they should override the library-level environmental defaults though. I don't think we should tether them solely to a specific connection string, because the net effect would be that we'd only be able to configure retry logic/etc. in .NET Core by using SqlDataSource or by specifying a .NET Framework-style app.config file.

I don't think there's anything preventing introducing SqlDataSource in .NET Framework - it simply wouldn't extend DbDataSource from System.Data (since it's absent there). This is the current state of affairs in Npgsql 8.0 - there's a DbDataSource.cs shim in the project, which is only compiled in on older .NET. This allows the main NpgsqlDataSource implementation to be always available no matter which .NET version you're using.

Since SqlClient seems to plan to support .NET Framework for the foreseeable future, and in addition is working towards a convergence of the .NET Framework/Core versions as much as possible, I'd recommend going in this direction. At that point I'm not sure why any sort of global usage of SqlClientOptions is needed - can you provide a bit more context on that?

One of the goals of SqlClientOptions is to allow the newer appsettings.json (/etc.) file to be used, if required.

Can you provide a bit more info on what you have in mind here? The application is free to pick up settings from appsettings.json and set them on SqlDataSourceBuilder as it wants; are you thinking of something more automatic where a SqlClientOptions would be able to read directly from appsettings.json, and then used to initialize a data source or connection? If so, that's indeed a scenario we've been thinking about in Npgsql (see #5067 again), but I'm not sure if a separate SqlClientOptions type here is needed as opposed to just initializing SqlDataSourceBuilder itself directly. Basically SqlDataSourceBuilder itself already needs to represent everything you could possibly configure on a connection, so it seems natural to load/initialize that etc.

@roji
Copy link
Member

roji commented Sep 22, 2024

(BTW feel free to ping me if you'd like to have a chat about this offline!)

@edwardneal
Copy link
Contributor Author

Slightly delayed reply here, sorry. Thanks for the offer of an offline chat though - I think we're on nearly the same wavelength, there's just some background context unique to this part of SqlClient.

Since SqlClient seems to plan to support .NET Framework for the foreseeable future, and in addition is working towards a convergence of the .NET Framework/Core versions as much as possible, I'd recommend going in this direction.

I agree, that's definitely the right way to go. There's already a degree of prior art even within SqlClient, in the form of SqlBatchCommand.

At that point I'm not sure why any sort of global usage of SqlClientOptions is needed - can you provide a bit more context on that?

Sure. At present, there are already global options:

  • Authentication providers are loaded from app.config and programmatically into a singleton SqlAuthenticationProviderManager instance. When a SqlConnection needs to authenticate, it looks up the authentication method to retrieve an authentication provider.
  • The default retry logic provider for SqlConnections and SqlCommands is lazy-loaded from app.config into SqlConfigurableRetryLogicManager via its static constructor.
  • Under .NET Framework, SqlClient will create a LocalDB instance if it doesn't exist. To do this, it needs to know the version of the LocalDB instance to create, and this name/version mapping is lazy-loaded from app.config into the internal LocalDBAPI instance.

I'd tentatively also include the registry keys used by SqlClient and ODBC to map from an alias to a SQL Server server name and port, although I've not yet tracked the alias processing end-to-end.

The design for SqlClientOptions isn't to introduce global options, but to gather the existing global options. It maintains compatibility with current app.config files by loading these options from them by default, but also exposes itself publicly via SqlClientOptions.Instance - so we can use the options pattern to set its properties from appsettings.json or another configuration source programmatically.

Separately, we build on this to enable trimming support. Because it's possible to programmatically modify SqlClientOptions.Instance, developers no longer rely exclusively upon app.config files - so we can safely trim away System.Configuration, knowing that we're not eliminating the only way to set the default retry logic provider for a SqlConnection or a SqlCommand, etc.

Building further on enabling programmatic modification of the default retry logic provider: I think that then opens up the potential for a third-party library to build a shim between SqlClient's SqlRetryLogicBaseProvider class and a Polly resilience pipeline.

When we come to implement SqlDataSource, my initial idea is to allow users to pass a SqlClientOptions instance to it to enable specific features, but if we want to use something else (or take a different approach) for SqlDataSource then I think there's still value in the original point (ability to eliminate System.Configuration references via trimming, and to support programmatic modification of the options which are currently only modifiable via app.config.)

@cheenamalhotra
Copy link
Member

SqlDataSourceBuilder itself already needs to represent everything you could possibly configure on a connection, so it seems natural to load/initialize that etc.

I agree with what @roji commented above too. We would be introducing SqlDataSourceBuilder in future that should meet goals for global configuration of the SqlClient library. Adding more options only opens doors to more confusions for customers and leaves us with multiple entry points to deal with.

The design for SqlClientOptions isn't to introduce global options, but to gather the existing global options. It maintains compatibility with current app.config files by loading these options from them by default, but also exposes itself publicly via SqlClientOptions.Instance - so we can use the options pattern to set its properties from appsettings.json or another configuration source programmatically.

The challenge comes down to who takes precedence over the other, when there are multiple entry points to control global options and singletons. One can still modify the instance options to their preference, in a running application. I know we can define precedence flows, document it, but it also opens a new channel of confusions for customers. And if things don't work as users expect or prefer, we cannot take back designs.

Separately, we build on this to enable trimming support. Because it's possible to programmatically modify SqlClientOptions.Instance, developers no longer rely exclusively upon app.config files - so we can safely trim away System.Configuration, knowing that we're not eliminating the only way to set the default retry logic provider for a SqlConnection or a SqlCommand, etc.

AFAIK there are application builders out there who want app.config for various reasons, mainly to abstract configurations from source code. Just want to clarify, for us, app.config is a supported scenario for a .NET application developer to configure various settings and there are no plans to make them obsolete or trim them down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💡 Enhancement New feature request 🆕 Public API Use this label for new API additions to the driver.
Projects
Status: Under Investigation
Development

No branches or pull requests

5 participants