Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drive health #269

Merged
merged 7 commits into from
Jan 16, 2024
Merged

Drive health #269

merged 7 commits into from
Jan 16, 2024

Conversation

johnml1135
Copy link
Collaborator

@johnml1135 johnml1135 commented Jan 11, 2024

Resolves sillsdev/machine#141 and #243.


This change is Reviewable

* Update to dotnet 8 (and needed updates)
* Add grpc endpoint for health checks with rich data
* Add brief public health endpoint
Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 35 of 35 files at r1, all commit messages.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @johnml1135)


samples/EchoTranslationEngine/EchoTranslationEngine.csproj line 16 at r1 (raw file):

  <ItemGroup>
    <ProjectReference Include="..\..\src\Serval.Grpc\Serval.Grpc.csproj" />
		<ProjectReference Include="..\..\src\Serval.Shared\Serval.Shared.csproj" />

I don't think this is needed.


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 3 at r1 (raw file):

namespace EchoTranslationEngine;

public class TranslationEngineServiceV1(BackgroundTaskQueue taskQueue, HealthCheckService healthCheckService) : TranslationEngineApi.TranslationEngineApiBase

You should update csharpier so that this get formatted correctly.


src/Serval.ApiServer/Startup.cs line 45 at r1 (raw file):

        services.AddHealthChecks().AddIdentityServer(new Uri(authority), name: "Auth0");

        string DataFilesDir = Configuration!.GetSection(DataFileOptions.Key)!.GetValue<string>("FilesDirectory")!;

I don't think that the null-forgiving operators are needed here.


src/Serval.ApiServer/Startup.cs line 47 at r1 (raw file):

        string DataFilesDir = Configuration!.GetSection(DataFileOptions.Key)!.GetValue<string>("FilesDirectory")!;
        // find drive letter for DataFilesDir
        string driveLetter = Path.GetPathRoot(DataFilesDir)![..1];

Please avoid using the null-forgiving operator.


src/Serval.ApiServer/Startup.cs line 216 at r1 (raw file):

            x.MapHangfireDashboard();
            x.MapHealthChecks("/health", new HealthCheckOptions { ResponseWriter = WriteHealthCheckResponse.Generate });
            ;

There is an extra semicolon here.


src/Serval.ApiServer/Controllers/StatusController.cs line 50 at r1 (raw file):

    /// <remarks>Provides an indication about the health of the API</remarks>
    /// <response code="200">The API health status</response>
    [HttpGet("health-public")]

I would call this something else. Maybe ping.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 108 at r1 (raw file):

message HealthCheckResponse {
    HealthCheckStatus Status = 1;

This should be snake case.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 109 at r1 (raw file):

message HealthCheckResponse {
    HealthCheckStatus Status = 1;
    string Duration = 2;

Do we really need this?


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 110 at r1 (raw file):

    HealthCheckStatus Status = 1;
    string Duration = 2;
    map<string, string> Data = 3;

What is this used for?


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 111 at r1 (raw file):

    string Duration = 2;
    map<string, string> Data = 3;
    optional string Exception = 4;

I would rename this to error or more generally description. Remember this is defining an API for engines. An engine could be implemented on any tech stack. We don't need to, nor should we match the C# health check models exactly. All we really need is status and description.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 163 at r1 (raw file):

enum HealthCheckStatus {
    Unhealthy = 0;

These should be all uppercase in order to be consistent with the other enums.


src/Serval.Shared/Configuration/IServalBuilderExtensions.cs line 46 at r1 (raw file):

    )
    {
        builder.Services.AddMongoDataAccess(builder.Configuration!.GetConnectionString("Mongo")!, "Serval", configure);

Please avoid using the null-forgiving operator. Throw an exception if necessary.


src/Serval.Shared/Controllers/ServiceUnavailableExceptionFilter.cs line 21 at r1 (raw file):

            _logger.Log(
                LogLevel.Error,
                "A user tried to access an unavailable service. Exception: \n{Exception}",

We already pass the exception to the Log method. This might be redundant.


src/Serval.Shared/Utils/WriteHealthCheckResponse.cs line 5 at r1 (raw file):

public class WriteHealthCheckResponse
{
    public static Task Generate(HttpContext context, HealthReport healthReport)

This is only used in Serval.ApiServer, so you can move it there.


src/Serval.Shared/Contracts/HealthReportDto.cs line 0 at r1 (raw file):
Why have these DTOs been moved?

@johnml1135
Copy link
Collaborator Author

samples/EchoTranslationEngine/EchoTranslationEngine.csproj line 16 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I don't think this is needed.

Correct. Removed.

@johnml1135
Copy link
Collaborator Author

samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 3 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

You should update csharpier so that this get formatted correctly.

It's the new net8.0 primary constructor.

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Startup.cs line 45 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I don't think that the null-forgiving operators are needed here.

net8 was complaining about it - is there a better way? Should I use a ? or a if statement? If it is null, I would want to crash out of the code.

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Startup.cs line 45 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

net8 was complaining about it - is there a better way? Should I use a ? or a if statement? If it is null, I would want to crash out of the code.

I see what I was doing - I was confusing the null forgiving operator in CSharp (!) with the not-null assertion operator in Kotlin (!!). I will change them to the ?? throw syntax.

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Startup.cs line 47 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Please avoid using the null-forgiving operator.

Done

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Startup.cs line 216 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

There is an extra semicolon here.

fixed.

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Controllers/StatusController.cs line 50 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would call this something else. Maybe ping.

ping is ok.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 108 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This should be snake case.

It should be ok - we recast as integers, not as strings.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 109 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Do we really need this?

It is in the standard health check - and there is no particular reason not to include it.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 110 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

What is this used for?

This is used for each of the sub-categories of checks:

    "SmtTransfer": {
      "status": "Healthy",
      "duration": "00:00:01.2095377",
      "description": "SmtTransfer",
      "exception": "System.Exception",
      "data": {
        "S3 Bucket": "Healthy: The S3 bucket is available",
        "Mongo": "Healthy: ",
        "Hangfire": "Healthy: ",
        "SMT Engine Storage Capacity": "Healthy: ",
        "ClearML Health Check": "Healthy: ClearML is available"
      }
    },

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did all of the .json, .ldml, .py files change?

Reviewed 23 of 23 files at r2, all commit messages.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @johnml1135)


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 3 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

It's the new net8.0 primary constructor.

I know. The line is going over the 120 character limit without wrapping. If you reformat this file with CSharpier, it should fix it.


src/Serval.ApiServer/Startup.cs line 45 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I see what I was doing - I was confusing the null forgiving operator in CSharp (!) with the not-null assertion operator in Kotlin (!!). I will change them to the ?? throw syntax.

I would do this instead:

var dataFileOptions = new DataFileOptions();
Configuration.GetSection(DataFileOptions.Key).Bind(dataFileOptions);

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 111 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would rename this to error or more generally description. Remember this is defining an API for engines. An engine could be implemented on any tech stack. We don't need to, nor should we match the C# health check models exactly. All we really need is status and description.

The exception literally captures exceptions - which sometimes can be used for debugging. They would be on the logs - but again, if it is already included, why not?

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 163 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

These should be all uppercase in order to be consistent with the other enums.

Done

@johnml1135
Copy link
Collaborator Author

src/Serval.Shared/Configuration/IServalBuilderExtensions.cs line 46 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Please avoid using the null-forgiving operator. Throw an exception if necessary.

Done

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @johnml1135)


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 108 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

It should be ok - we recast as integers, not as strings.

Yes, but it is inconsistent with the rest of the API definition.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 109 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

It is in the standard health check - and there is no particular reason not to include it.

It is in the standard health check for .NET, but this API could be implemented by some other tech stack. I would like to make this as easy as possible to be implemented.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 111 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

The exception literally captures exceptions - which sometimes can be used for debugging. They would be on the logs - but again, if it is already included, why not?

By calling it exception, we are assuming a particular language construct. One that some languages might not have. This API is intended to be language-agnostic.

@johnml1135
Copy link
Collaborator Author

src/Serval.Shared/Controllers/ServiceUnavailableExceptionFilter.cs line 21 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We already pass the exception to the Log method. This might be redundant.

I think it was needed. I changed it because net8.0 complained about passing the exception. I needed to switch the order of exception and message to make the function happy - done.

@johnml1135
Copy link
Collaborator Author

src/Serval.Shared/Utils/WriteHealthCheckResponse.cs line 5 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This is only used in Serval.ApiServer, so you can move it there.

I used it more places before - you are right.

@johnml1135
Copy link
Collaborator Author

src/Serval.Shared/Contracts/HealthReportDto.cs line at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Why have these DTOs been moved?

I thought we needed them more places - for sharing with machine - when I was doing the rest endpoints. I'll move them back.

@johnml1135
Copy link
Collaborator Author

I ran formatting - I'll check the files again. They changed more than I was expecting and I will revert at least the ldml.

@johnml1135
Copy link
Collaborator Author

samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 3 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I know. The line is going over the 120 character limit without wrapping. If you reformat this file with CSharpier, it should fix it.

csharpier doesn't handle primary constructors in C#12 yet - it can only handle C#11. I am reverting all the new primary constructors to preserve csharpier working.

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Startup.cs line 45 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would do this instead:

var dataFileOptions = new DataFileOptions();
Configuration.GetSection(DataFileOptions.Key).Bind(dataFileOptions);

What about for connection strings? Shouldn't we fail if one is not actively defined?

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 108 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Yes, but it is inconsistent with the rest of the API definition.

updated - it works magically, making the C# classes names as camel case already.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 109 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It is in the standard health check for .NET, but this API could be implemented by some other tech stack. I would like to make this as easy as possible to be implemented.

Is making it optional good enough?

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 109 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Is making it optional good enough?

It isn't needed at all - the Health check function creates it's own duration independent of any data passed. I'll remove it.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 111 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

By calling it exception, we are assuming a particular language construct. One that some languages might not have. This API is intended to be language-agnostic.

Error is fine enough. We can map it to exception at the C# Health Check level.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 14 of 14 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135)


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 3 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

csharpier doesn't handle primary constructors in C#12 yet - it can only handle C#11. I am reverting all the new primary constructors to preserve csharpier working.

It seems to work fine with primary constructors once I updated to the latest version.


src/Serval.ApiServer/Startup.cs line 45 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

What about for connection strings? Shouldn't we fail if one is not actively defined?

Yes.


src/Serval.ApiServer/Startup.cs line 50 at r3 (raw file):

        Configuration.GetSection(DataFileOptions.Key).Bind(dataFileOptions);
        // find drive letter for DataFilesDir
        string driveLetter = Path.GetPathRoot(dataFileOptions.FilesDirectory)![..1];

You missed removing the null-forgiving operator here.


src/Serval.Shared/Configuration/IServalBuilderExtensions.cs line 48 at r3 (raw file):

        var mongoConnectionString =
            builder.Configuration?.GetConnectionString("Mongo")
            ?? throw new Exception("Mongo connection string not configured");

I generally try to avoid throwing Exception. InvalidOperationException would be fine in this case. Also, please avoid the ?? throw syntax. I don't think it is as clear as a normal if statement.

@johnml1135
Copy link
Collaborator Author

samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 3 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It seems to work fine with primary constructors once I updated to the latest version.

It's working for me now as well - it may be a new release. I'll update.

@johnml1135
Copy link
Collaborator Author

src/Serval.ApiServer/Startup.cs line 50 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

You missed removing the null-forgiving operator here.

Fixed

* Caching still doesn't work !?!?!
…n make auth'd work, but we don't need to, or really want to.)
Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 23 of 23 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @johnml1135)


samples/EchoTranslationEngine/EchoTranslationEngine.csproj line 16 at r4 (raw file):

  <ItemGroup>
    <ProjectReference Include="..\..\src\Serval.Grpc\Serval.Grpc.csproj" />
		<ProjectReference Include="..\..\src\Serval.Shared\Serval.Shared.csproj" />

Why was this added back?

@johnml1135
Copy link
Collaborator Author

samples/EchoTranslationEngine/EchoTranslationEngine.csproj line 16 at r4 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Why was this added back?

By accident - I removed the commit that did all the mis-formatting of files and must have removed that change from it.

@johnml1135
Copy link
Collaborator Author

src/Serval.Shared/Configuration/IServalBuilderExtensions.cs line 48 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I generally try to avoid throwing Exception. InvalidOperationException would be fine in this case. Also, please avoid the ?? throw syntax. I don't think it is as clear as a normal if statement.

Done.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @johnml1135)

@johnml1135 johnml1135 merged commit cb56fe7 into main Jan 16, 2024
1 of 2 checks passed
@ddaspit ddaspit deleted the drive_health branch January 23, 2024 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add health check for SMT engine disk storage
2 participants