In version 1.6.2, we added a new feature: custom embedding models. You can now use your preferred text embedding model in place of the default built-in model, which is optimized for English.
NOTE: Only ONNX format models are supported.
Below is an example using the German-English bilingual model:
Open https://huggingface.co/jinaai/jina-embeddings-v2-base-de in your browser. jina-embeddings-v2-base-de is a German/English bilingual text embedding model.
Select the “Files and versions” tab and download “tokenizer.json”.
Open the “onnx” folder and download “model_quantized.onnx”. We recommend using a model smaller than 200 MB. A larger model generally gives better recall but uses more memory and CPU.
Go back to the configuration page, enable the switch, then upload the two files.
Click the “Save” button. If the upload succeeds, the page shows a confirmation like this:
Open the “Manage apps” page and restart the plugin (click “Disable” and then “Enable”).
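Before uploading, it can be useful to sanity-check the two downloaded files locally. The following is a minimal sketch, not part of the plugin: the function name check_embedding_files and the MAX_MODEL_BYTES constant are hypothetical, and the 200 MB recommendation is read here as 200 MiB. It only checks that the tokenizer file parses as JSON and that the model file has an .onnx extension and is within the size limit. It does not verify that the model actually loads.

```python
import json
from pathlib import Path

# Assumption: the "smaller than 200 MB" recommendation is treated as 200 MiB.
MAX_MODEL_BYTES = 200 * 1024 * 1024


def check_embedding_files(tokenizer_path, model_path, max_bytes=MAX_MODEL_BYTES):
    """Return a list of problems found; an empty list means both files look usable."""
    problems = []
    tokenizer = Path(tokenizer_path)
    model = Path(model_path)

    # The tokenizer must exist and parse as JSON.
    if not tokenizer.is_file():
        problems.append(f"missing tokenizer file: {tokenizer}")
    else:
        try:
            json.loads(tokenizer.read_text(encoding="utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            problems.append(f"not valid JSON: {tokenizer}")

    # The model must exist, use the .onnx extension, and stay under the size limit.
    if not model.is_file():
        problems.append(f"missing model file: {model}")
    else:
        if model.suffix.lower() != ".onnx":
            problems.append(f"not an .onnx file: {model}")
        size = model.stat().st_size
        if size > max_bytes:
            problems.append(f"model is {size} bytes, above the limit of {max_bytes}")

    return problems


if __name__ == "__main__":
    # File names as downloaded in the steps above, assumed to be in the current folder.
    for problem in check_embedding_files("tokenizer.json", "model_quantized.onnx"):
        print("WARNING:", problem)
```

Run the script from the folder that contains the two downloads. If it prints nothing, the files passed these basic checks.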
Model Resources:
https://huggingface.co/jinaai/jina-embeddings-v2-base-es (Spanish/English bilingual text embedding model)
https://huggingface.co/jinaai/jina-embeddings-v2-base-zh (Chinese/English bilingual text embedding model)