Use separate services for storing job queues and cache #7245

Merged
merged 1 commit into from
Dec 19, 2023

SpecLad commented Dec 8, 2023

Motivation and context

These two types of data have different characteristics, and we have different expectations of them:

  • job queues are small and we'd rather not lose them (although losing them is not fatal);

  • cached chunks are large and we don't care if we lose them.

We currently store both in KeyDB, which has not proven especially reliable. A few times we've had to clear the KeyDB store due to data corruption, which destroyed the queues as well. While we'll probably end up replacing KeyDB with something else, it would still be useful to be able to clear just the cache volume without taking out the job queues in the process.

As a solution to this, add a Redis service to be used only for the queues (and potentially for other small data items). Using the original Redis instead of KeyDB should also help with reliability (at least as far as the job queues are concerned).
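
A minimal sketch of what this split could look like in Django settings, assuming the queues are configured through django-rq's `RQ_QUEUES` and the cache through Django's `CACHES`. The host names, ports, and queue names below are hypothetical placeholders, not values taken from this pull request.

```python
def build_settings(queue_host, queue_port, cache_host, cache_port):
    """Return settings fragments that keep queues and cache on separate services.

    All arguments are illustrative; the real service names and ports
    are defined by the deployment (Compose file or Helm chart).
    """
    # Hypothetical queue names, used only for illustration.
    queue_names = ("import", "export")

    # Job queues: small data we would rather not lose, so they go to Redis.
    rq_queues = {
        name: {"HOST": queue_host, "PORT": queue_port, "DB": 0}
        for name in queue_names
    }

    # Cached chunks: large, disposable data, so they stay on the cache service.
    caches = {
        "default": {
            "BACKEND": "django.core.cache.backends.redis.RedisCache",
            "LOCATION": f"redis://{cache_host}:{cache_port}",
        }
    }
    return rq_queues, caches


# Placeholder hosts: one service for queues, another for the cache.
RQ_QUEUES, CACHES = build_settings("queue-redis", 6379, "cache-keydb", 6666)
```

With the two stores on different hosts, clearing the cache volume would leave the queue data untouched.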

How has this been tested?

I checked that CVAT still starts when using the development environment instructions, the Compose file, and the Helm chart.

Checklist

  • I submit my changes into the develop branch
  • I have created a changelog fragment
  • [ ] I have updated the documentation accordingly
  • [ ] I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary
    (cvat-canvas, cvat-core, cvat-data and cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

SpecLad commented Dec 8, 2023

/check

github-actions bot commented Dec 8, 2023

❌ Some checks failed
📄 See logs here

SpecLad commented Dec 11, 2023

/check

github-actions bot commented Dec 11, 2023

❌ Some checks failed
📄 See logs here

SpecLad commented Dec 11, 2023

/check

github-actions bot commented Dec 11, 2023

❌ Some checks failed
📄 See logs here

azhavoro pushed a commit that referenced this pull request Dec 15, 2023
### Motivation and context
The main reason for this is to update the Helm chart, because the
current one conflicts with the Redis chart I want to add (in #7245).

But also the version we have is quite old and out of support, so it's
due for an update anyway.

The change in `objects.py` is due to the fact that with the new version
of Clickhouse, the value of `date` somehow became a timezone-aware
datetime object, so the output of `isoparse` now includes a UTC offset,
and appending a "Z" now creates a malformed date. I chose to replace
this with a custom format string, which is consistent with other date
formatting code in the `analytics_report` app.
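
A small sketch of the failure mode described above, using only the standard library. The format string here is an assumption chosen for illustration; the exact one used in `objects.py` is not shown in this description.

```python
from datetime import datetime, timezone


def format_utc(value: datetime) -> str:
    """Format an aware datetime as UTC with a literal "Z" suffix.

    The format string is an illustrative assumption, not the one from the PR.
    """
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A timezone-aware value, like the one the newer Clickhouse version returns.
aware = datetime(2023, 12, 8, 12, 30, 0, tzinfo=timezone.utc)

# isoformat() already includes the UTC offset, so appending "Z" is malformed.
malformed = aware.isoformat() + "Z"  # '2023-12-08T12:30:00+00:00Z'

# An explicit format string yields a single, well-formed UTC designator.
fixed = format_utc(aware)  # '2023-12-08T12:30:00Z'
```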

### How has this been tested?
I manually tried both the Helm chart and the Compose file, making sure
that the analytics are still recorded and displayed.

I also checked that a database created with the previous Clickhouse
version can be loaded with the new version (it can).

### Checklist
- [x] I submit my changes into the `develop` branch
- [ ] I have created a changelog fragment
- ~~[ ] I have updated the documentation accordingly~~
- ~~[ ] I have added tests to cover my changes~~
- [x] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- ~~[ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))~~

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project.
  Feel free to contact the maintainers if that's a concern.
azhavoro pushed a commit that referenced this pull request Dec 15, 2023
…les (#7254)

The advantages of this are as follows:

* It's much easier for a developer to use one `docker compose up`
command to bring everything up than to run a custom command for each
service.

* We eliminate possible divergence of configuration (e.g. versions,
command-line parameters) between what we actually use and what's listed
in the documentation.

* It makes it easier to update the developer guide if new dependencies
are introduced.

* And speaking of new dependencies, we have KeyDB now, which hasn't been
added to the dev guide.

The disadvantage is that we have to run an extra copy of the CVAT
server, because otherwise OPA can't fetch its rules. I don't think it's
a significant issue, since it doesn't prevent you from debugging
anything.

### Motivation and context
Working on #7245, I realized that I don't want to add another custom
command for running Redis in the development environment to the dev
guide. So I wanted to remove the custom commands entirely.

### How has this been tested?
By manually following the updated instructions.

### Checklist
- [x] I submit my changes into the `develop` branch
- ~~[ ] I have created a changelog fragment~~
- [x] I have updated the documentation accordingly
- ~~[ ] I have added tests to cover my changes~~
- [x] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- ~~[ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))~~

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project.
  Feel free to contact the maintainers if that's a concern.

codecov bot commented Dec 15, 2023

Codecov Report

Merging #7245 (dd13388) into develop (9fb582d) will increase coverage by 0.04%.
The diff coverage is n/a.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7245      +/-   ##
===========================================
+ Coverage    81.74%   81.79%   +0.04%     
===========================================
  Files          367      367              
  Lines        39375    39375              
  Branches      3644     3644              
===========================================
+ Hits         32188    32206      +18     
+ Misses        7187     7169      -18     

| Components | Coverage Δ |
|---|---|
| cvat-ui | 75.99% <ø> (+0.10%) ⬆️ |
| cvat-server | 87.06% <ø> (-0.01%) ⬇️ |

@azhavoro azhavoro merged commit 48ab12b into cvat-ai:develop Dec 19, 2023
42 of 43 checks passed
@cvat-bot cvat-bot bot mentioned this pull request Jan 10, 2024