Skip to content

Commit

Permalink
Add GRPC KeepAlive and SSL sections to server docs (triton-inference-…
Browse files Browse the repository at this point in the history
…server#3149)

* Add initial GRPC KeepAlive and SSL sections to server docs

* Point to client docs from server docs

* Add documentation on grpc compression and authetication options

* Fixing the links

* Add code blocks, add grpc keepalive param list, point to client documentation

* fix links

Co-authored-by: Tanmay Verma <tanmay2592@gmail.com>
  • Loading branch information
rmccorm4 and tanmayv25 committed Jul 28, 2021
1 parent d1d86f4 commit caf02e5
Showing 1 changed file with 44 additions and 1 deletion.
45 changes: 44 additions & 1 deletion docs/inference_protocols.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<!--
# Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
Expand Down Expand Up @@ -47,6 +47,49 @@ model health, metadata and statistics. Additional endpoints allow
model loading and unloading, and inferencing. See the KFServing and
extension documentation for details.

### GRPC Options
Triton exposes various GRPC parameters for configuring the server-client network transactions. For usage of these options, refer to the output from `tritonserver --help`.

#### SSL/TLS

These options can be used to configure a secured channel for communication. The server-side options include:

* `--grpc-use-ssl`
* `--grpc-use-ssl-mutual`
* `--grpc-server-cert`
* `--grpc-server-key`
* `--grpc-root-cert`

For client-side documentation, see [Client-Side GRPC SSL/TLS](https://github.com/triton-inference-server/client/tree/main#ssltls)

For more details on overview of authentication in gRPC, refer [here](https://grpc.io/docs/guides/auth/).

#### Compression

Triton allows the on-wire compression of request/response messages by exposing following option on server-side:

* `--grpc-infer-response-compression-level`

For client-side documentation, see [Client-Side GRPC Compression](https://github.com/triton-inference-server/client/tree/main#compression)

Compression can be used to reduce the amount of bandwidth used in server-client communication. For more details, see [gRPC Compression](https://grpc.github.io/grpc/core/md_doc_compression.html).

#### GRPC KeepAlive

Triton exposes GRPC KeepAlive parameters with the default values for both
client and server described [here](https://github.com/grpc/grpc/blob/master/doc/keepalive.md).

These options can be used to configure the KeepAlive settings:

* `--grpc-keepalive-time`
* `--grpc-keepalive-timeout`
* `--grpc-keepalive-permit-without-calls`
* `--grpc-http2-max-pings-without-data`
* `--grpc-http2-min-recv-ping-interval-without-data`
* `--grpc-http2-max-ping-strikes`

For client-side documentation, see [Client-Side GRPC KeepAlive](https://github.com/triton-inference-server/client/blob/main/README.md#grpc-keepalive).

## C API

The Triton Inference Server provides a backwards-compatible C API that
Expand Down

0 comments on commit caf02e5

Please sign in to comment.