Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Python Deployment of Triton Inference Server #7501

Merged
merged 98 commits into from
Aug 30, 2024

Conversation

KrishnanPrash
Copy link
Contributor

@KrishnanPrash KrishnanPrash commented Aug 5, 2024

What does the PR do?

This PR is for the tritonfrontend python package, containing bindings to the HTTPAPIServer and grpc::Server classes in the frontend. This allows Triton users to start their respective frontends from within python.

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type
box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

Where should the reviewer start?

server/src/python/tritonfrontend/_c/tritonfrontend_pybind.cc

Test plan:

Working on adding tests, which can be run through pytest, and added as a single command to the existing L0_python_api bash script.

  • CI Pipeline ID:

Caveats:

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@KrishnanPrash KrishnanPrash added the PR: feat A new feature label Aug 5, 2024
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@KrishnanPrash KrishnanPrash marked this pull request as draft August 5, 2024 19:47
src/CMakeLists.txt Outdated Show resolved Hide resolved
qa/L0_python_api/test.sh Outdated Show resolved Hide resolved
Copy link
Collaborator

@rmccorm4 rmccorm4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With follow-up tickets in mind, this LGTM other than a few minor comments remaining.

docs/tritonfrontend/README.md Outdated Show resolved Hide resolved
docs/tritonfrontend/README.md Outdated Show resolved Hide resolved
qa/L0_python_api/testing_utils.py Outdated Show resolved Hide resolved
qa/L0_python_api/test_kserve.py Outdated Show resolved Hide resolved
# delayed_identity will still be an active model
# Hence, server.stop() causes InternalError: Timeout.
with pytest.raises(tritonserver.InternalError):
TestingUtils.teardown_server(server)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true? Why shutting down frontend causes model still marked as "in used"? CC @Tabrizian @kthui

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the delay more than shutdown timeout seconds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delay was set to 2 seconds. Shutdown timeout is set to 30 seconds by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it reproducible outside In-Process Python API?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true? Why shutting down frontend causes model still marked as "in used"? CC @Tabrizian @kthui

Maybe some handshake like calling InferenceRequestDelete or a callback (request release, response complete, etc) done in the frontend would be missed if frontend is shutdown before responses are complete, and cause core to think the request is still active (and model in use, held by request)?

src/grpc/grpc_server.cc Outdated Show resolved Hide resolved
src/grpc/infer_handler.cc Outdated Show resolved Hide resolved
src/grpc/infer_handler.cc Outdated Show resolved Hide resolved
src/python/README.md Outdated Show resolved Hide resolved
src/common.h Outdated Show resolved Hide resolved
KrishnanPrash and others added 7 commits August 28, 2024 15:23
Co-authored-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
Co-authored-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
Co-authored-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
setup_service,
teardown_client,
teardown_server,
teardown_service,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI that I am not in favor of this pattern to import name out of their namespace, it's hard to distinguish the origin of the functions and can cause shadowing.

@KrishnanPrash KrishnanPrash merged commit f6021f7 into main Aug 30, 2024
3 checks passed
@KrishnanPrash KrishnanPrash deleted the kprashanth-python-deployment branch August 30, 2024 23:23
@nnshah1
Copy link
Contributor

nnshah1 commented Aug 30, 2024 via email

@KrishnanPrash KrishnanPrash mentioned this pull request Sep 3, 2024
20 tasks
@dongs0104
Copy link

hello @KrishnanPrash i want to use this feature but i don't know How to set up TP, PP which is MPI world size

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: feat A new feature
Development

Successfully merging this pull request may close these issues.

8 participants