forked from khoj-ai/khoj
Create test markdown files. Use them in sample config, docker-compose
Showing 4 changed files with 229 additions and 0 deletions.
@@ -0,0 +1,69 @@
# Emacs Khoj

*An Emacs interface for [Khoj](https://github.com/debanjum/khoj)*

## Requirements

- Install and run [Khoj](https://github.com/debanjum/khoj)

## Installation

- Direct Install
  - Put `khoj.el` in your Emacs load path, e.g. `~/.emacs.d/lisp`

  - Load via `use-package` in your `~/.emacs.d/init.el` or `.emacs`
    file by adding the snippet below

    ``` elisp
    ;; Khoj Package
    ;; :load-path takes the directory that contains khoj.el, not the file itself
    (use-package khoj
      :load-path "~/.emacs.d/lisp/"
      :bind ("C-c s" . 'khoj))
    ```
- With [straight.el](https://github.com/raxod502/straight.el)
  - Add the snippet below to your `~/.emacs.d/init.el` or `.emacs` config
    file and evaluate it.

    ``` elisp
    ;; Khoj Package for Semantic Search
    (use-package khoj
      :after org
      :straight (khoj :type git :host github :repo "debanjum/khoj" :files (:defaults "src/interface/emacs/khoj.el"))
      :bind ("C-c s" . 'khoj))
    ```
- With [Quelpa](https://github.com/quelpa/quelpa#installation)
  - Ensure [Quelpa](https://github.com/quelpa/quelpa#installation) and
    [quelpa-use-package](https://github.com/quelpa/quelpa-use-package#installation)
    are installed

  - Add the snippet below to your `~/.emacs.d/init.el` or `.emacs` config
    file and evaluate it.

    ``` elisp
    ;; Khoj Package
    (use-package khoj
      :after org
      :quelpa (khoj :fetcher url :url "https://raw.githubusercontent.com/debanjum/khoj/master/src/interface/emacs/khoj.el")
      :bind ("C-c s" . 'khoj))
    ```

## Usage

1. Open Query Interface on Client

   - In Emacs: Call `khoj` using the keybinding `C-c s` or `M-x khoj`
   - On Web: Open <http://localhost:8000/>

2. Query in Natural Language

   e.g. "What is the meaning of life?" or "What are my life goals?"

   **Note: A query takes about 4s on an M1 Mac with a corpus of more
   than 100K lines of notes**

3. (Optional) Narrow down results further

   Include or exclude specific words, or restrict results to a date
   range, by appending filters to the query in the format below

   e.g. `What is the meaning of life? -god +none dt:"last week"`
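
The filter format above can also be assembled programmatically. Below is a minimal Python sketch, assuming only the syntax visible in the example (`-word` to exclude, `+word` to include, `dt:"..."` for a date range). The `build_query` helper and its parameter names are hypothetical and not part of Khoj.

```python
def build_query(text, include=(), exclude=(), date=None):
    """Compose a query string with optional word and date filters.

    Hypothetical helper: it mirrors the filter syntax shown in the
    example query and does not call any Khoj code.
    """
    parts = [text.strip()]
    # Words to exclude from results are prefixed with "-"
    parts += [f"-{word}" for word in exclude]
    # Words that must appear in results are prefixed with "+"
    parts += [f"+{word}" for word in include]
    # A date range is passed as a quoted dt: filter
    if date:
        parts.append(f'dt:"{date}"')
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(
        "What is the meaning of life?",
        include=["none"],
        exclude=["god"],
        date="last week",
    ))
```

Run as a script, this prints the same query string as the example above, ready to paste into the `khoj` prompt or the web interface.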
@@ -0,0 +1,153 @@
![](https://github.com/debanjum/khoj/actions/workflows/test.yml/badge.svg)
![](https://github.com/debanjum/khoj/actions/workflows/build.yml/badge.svg)

# Khoj

*Natural language search over user content like notes, images and
transactions, using transformer ML models*

Users can interact with Khoj via the [Web](./src/interface/web/index.html),
[Emacs](./src/interface/emacs/khoj.el) or the API. All search is done
locally[\*](https://github.com/debanjum/khoj#miscellaneous)

## Demo

<https://user-images.githubusercontent.com/6413477/168417719-8a8bc4e5-8404-42b2-89a7-4493e3d2582c.mp4>

## Setup

### 1. Clone

``` shell
git clone https://github.com/debanjum/khoj && cd khoj
```

### 2. Configure

- \[Required\] Update [docker-compose.yml](./docker-compose.yml) to
  mount your images, (org-mode or markdown) notes and beancount
  directories
- \[Optional\] Edit application configuration in
  [sample_config.yml](./config/sample_config.yml)

### 3. Run

``` shell
docker-compose up -d
```

*Note: The first run will take time. Let it run; it is most likely not
hung, just generating embeddings*

## Use

- **Khoj via API**
  - See [Khoj API Docs](http://localhost:8000/docs)
  - [Query](http://localhost:8000/search?q=%22what%20is%20the%20meaning%20of%20life%22)
  - [Regenerate Embeddings](http://localhost:8000/regenerate?t=ledger)
  - [Configure Application](http://localhost:8000/ui)
- **Khoj via Emacs**
  - [Install](https://github.com/debanjum/khoj/tree/master/src/interface/emacs#installation)
    [khoj.el](./src/interface/emacs/khoj.el)
  - Run `M-x khoj <user-query>`

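The API links above can also be built and called from a script. Below is a minimal Python sketch using only the standard library. It assumes the server is reachable at `http://localhost:8000` and uses only the `q` and `t` query parameters that appear in the links. The helper names are hypothetical, and the response is returned as raw text because its format is not described here.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: Khoj listens on the address used in the links above
BASE_URL = "http://localhost:8000"


def search_url(query, base_url=BASE_URL):
    """Build the /search URL; spaces are percent-encoded as in the Query link."""
    return f"{base_url}/search?{urlencode({'q': query}, quote_via=quote)}"


def regenerate_url(content_type, base_url=BASE_URL):
    """Build the /regenerate URL for one content type, e.g. "ledger"."""
    return f"{base_url}/regenerate?{urlencode({'t': content_type}, quote_via=quote)}"


def search(query, base_url=BASE_URL, timeout=30):
    """Send the query to a running server and return the raw response text."""
    with urlopen(search_url(query, base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URLs; call search(...) once the server is up
    print(search_url('"what is the meaning of life"'))
    print(regenerate_url("ledger"))
```

Building the URL with `urlencode` avoids hand-escaping quotes and spaces in the query.
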
## Run Unit tests

``` shell
pytest
```

## Upgrade

``` shell
docker-compose build --pull
```

## Troubleshooting

- Symptom: Errors out with "Killed" in error message
  - Fix: Increase RAM available to Docker Containers in Docker
    Settings
  - Refer: [StackOverflow Solution](https://stackoverflow.com/a/50770267),
    [Configure Resources on Docker for Mac](https://docs.docker.com/desktop/mac/#resources)
- Symptom: Errors out complaining about Tensors mismatch, null etc
  - Mitigation: Delete the content-type > image section from
    docker_sample_config.yml

## Miscellaneous

- The experimental [chat](http://localhost:8000/chat) API endpoint uses the
  [OpenAI API](https://openai.com/api/)
  - It is disabled by default
  - To use it, add your `openai-api-key` to config.yml

## Development Setup

### Setup on Local Machine

1. Install Dependencies

   1. Install Python3 \[Required\]

   2. [Install Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html)
      \[Required\]

   3. Install Exiftool \[Optional\]

      ``` shell
      sudo apt-get -y install libimage-exiftool-perl
      ```

2. Install Khoj

   ``` shell
   git clone https://github.com/debanjum/khoj && cd khoj
   conda env create -f config/environment.yml
   conda activate khoj
   ```

3. Configure

   - Configure files/directories to search in the `content-type` section
     of `sample_config.yml`
   - To run the application on test data, update file paths containing
     `/data/` to `tests/data/` in `sample_config.yml`
     - Example: replace `/data/notes/*.org` with
       `tests/data/notes/*.org`

4. Run

   Load ML model, generate embeddings and expose API to query notes,
   images, transactions etc specified in config YAML

   ``` shell
   python3 -m src.main -c=config/sample_config.yml -vv
   ```

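The path substitution from the Configure step can be scripted instead of edited by hand. Below is a minimal Python sketch that rewrites `/data/` to `tests/data/` in the text of a config file. The `use_test_data` helper and the `input-filter` key in the sample line are illustrative assumptions, not taken from `sample_config.yml`.

```python
import re

# Match "/data/" only where it starts a path. The lookbehind skips
# "tests/data/", so running the rewrite twice changes nothing further.
_DATA_PREFIX = re.compile(r"(?<![\w.])/data/")


def use_test_data(config_text):
    """Return config_text with paths under /data/ pointed at tests/data/."""
    return _DATA_PREFIX.sub("tests/data/", config_text)


if __name__ == "__main__":
    # Hypothetical config line, used only to demonstrate the rewrite
    sample = "input-filter: /data/notes/*.org"
    print(use_test_data(sample))
```

To apply it to a real file, read the file, pass its text through `use_test_data`, and write the result to a copy to pass to `-c`. That leaves `sample_config.yml` itself unchanged.
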
### Upgrade On Local Machine

``` shell
cd khoj
git pull origin master
# conda deactivate takes no environment name
conda deactivate
conda env update -f config/environment.yml
conda activate khoj
```

## Acknowledgments

- [Multi-QA MiniLM Model](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1)
  for Asymmetric Text Search. See [SBert Documentation](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)
- [OpenAI CLIP Model](https://github.com/openai/CLIP) for Image
  Search. See [SBert Documentation](https://www.sbert.net/examples/applications/image-search/README.html)
- Charles Cave for [OrgNode Parser](http://members.optusnet.com.au/~charles57/GTD/orgnode.html)
- Sven Marnach for
  [PyExifTool](https://github.com/smarnach/pyexiftool/blob/master/exiftool.py)