-
-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Multi-User Part 1]: Enable storage of settings for plaintext files based on user account #498
[Multi-User Part 1]: Enable storage of settings for plaintext files based on user account #498
Conversation
…nfig-with-multi-user
- Write processed text data to the DB using the embeddings service - Read data from the DB for search and chat - Update all text to jsonl processesors to use the embeddings service that writes to the DB
|
GitGuardian id | Secret | Commit | Filename | |
---|---|---|---|---|
8175313 | Django Secret Key | 869c37f | src/app/settings.py | View secret |
8175313 | Django Secret Key | 6fa925e | src/app/settings.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Our GitHub checks need improvements? Share your feedbacks!
…nfig-with-multi-user
… relevant data settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the work on these big set of changes! I did a review pass and left some comments but it is not comprehensive
env: | ||
DEBIAN_FRONTEND: noninteractive | ||
run : | | ||
apt install -y postgresql postgresql-client && apt install -y postgresql-server-dev-14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apt install -y postgresql postgresql-client && apt install -y postgresql-server-dev-14 | |
apt install -y postgresql postgresql-client postgresql-server-dev-14 |
run: | | ||
sudo apt update && sudo apt install -y libegl1 | ||
apt update && apt install -y libegl1 sqlite3 libsqlite3-dev libsqlite3-0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is sqlite required here, given we're using postgres?
@@ -43,17 +53,37 @@ jobs: | |||
with: | |||
python-version: ${{ matrix.python_version }} | |||
|
|||
- name: Install Git | |||
run: | | |||
apt update && apt install -y git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is git required here? Is it because ubuntu-jammy doesn't have git pre-installed and pip needs it to figure out the git (tag) version of the codebase?
|
||
- name: 🌡️ Validate Application | ||
run: pre-commit run --hook-stage manual --all | ||
run: sed -i 's/dynamic = \["version"\]/version = "0.0.0"/' pyproject.toml && pip install --upgrade .[dev] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the version need to be reset here? I know pip install
uses git tag version for figuring out the version for khoj. But given that this was working earlier, what's changed now? Moving to ubuntu-jammy? This maybe related to why git is required comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, it's because it's running in a containerized environment now, which requires this additional snippet for including the version.
if not config: | ||
return None | ||
return config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the below suggested code be equivalent (and simpler) or can config be also be False
or some such?
if not config: | |
return None | |
return config | |
return config |
if len(date_filters) > 0: | ||
min_date, max_date = date_filters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this unpacking operation of date_filters
into min_date
, max_date
fail if len(date_filters) == 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, but the only possible responses are None, empty list, and list of two.
|
||
# Dynamically generate search type enum by merging core search types with configured plugin search types | ||
return Enum("SearchType", merge_dicts(core_search_types, plugin_search_types)) | ||
return Enum("SearchType", merge_dicts(core_search_types, {})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The merge_dicts
can be removed, given the plugin search types are removed?
return Enum("SearchType", merge_dicts(core_search_types, {})) | |
return Enum("SearchType", core_search_types) |
- 0865416: Add better parsing for XML files - f3acfac: Add a try/catch around the dateparser in order to avoid internal server errors in app - 7d43cd6: Chunk embeddings generation in order to avoid large memory load - e02d751: Addresses comments from PR #498 - a3f393e: Addresses comments from PR #503 - 66eb078: Addresses comments from PR #511 - Address various items in #527
Incoming
pgvector
extensionCloses #466, Closes #345
Closes #195. On my local analysis, memory consumption in my machine on the prior setup with my local org notes was around 10-12 GB of RAM. With the pg_vector integration, it's around 2-3 GB.