Custom Embedding Model
In version 1.6.4, we added a new feature: custom embedding models. This means you can replace the default built-in model, which is optimized for English, with your favorite text embedding model.
NOTE: Only ONNX-format models are supported.
Below is an example using the German-English bilingual model:
Open jinaai/jina-embeddings-v2-base-de · Hugging Face in your browser. jina-embeddings-v2-base-de is a German/English bilingual text embedding model.
Select the “Files and versions” tab and download “tokenizer.json”.
Open the “onnx” folder and download “model_quantized.onnx”. We recommend using a model smaller than 200 MB. A larger model may give better recall, but it will consume more memory and CPU.
Go back to the configuration page, enable the switch, then upload the two files.
Click the “Save” button. If the upload succeeds, the page will look like this:
Open the “Manage apps” page and restart the plugin (click “Disable” and then “Enable”).
The embedding model has now been switched to your custom model. Rebuilding the embedding store will take some time, depending on how much content your Confluence instance holds.
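Before uploading, you may want to sanity-check the two downloaded files locally. The sketch below is not part of the plugin; it assumes the `onnxruntime`, `tokenizers`, and `numpy` packages are installed and that “tokenizer.json” and “model_quantized.onnx” sit in the current directory. The exact ONNX input names and output layout can vary between exports, so inspect `session.get_inputs()` if the names used here do not match your model.

```python
import os
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # Average token vectors into one sentence vector, ignoring padding positions.
    mask = attention_mask[..., None].astype(np.float32)
    return (token_embeddings * mask).sum(axis=1) / np.clip(mask.sum(axis=1), 1e-9, None)

# Pooling works on any (batch, seq_len, hidden) array; here is a dummy run.
dummy = mean_pool(np.ones((1, 4, 8), dtype=np.float32),
                  np.array([[1, 1, 1, 0]], dtype=np.int64))
print(dummy.shape)  # (1, 8)

# Real check, only if the two downloaded files are present.
if os.path.exists("tokenizer.json") and os.path.exists("model_quantized.onnx"):
    import onnxruntime as ort
    from tokenizers import Tokenizer

    tokenizer = Tokenizer.from_file("tokenizer.json")
    session = ort.InferenceSession("model_quantized.onnx")

    encoding = tokenizer.encode("Guten Tag, wie geht es Ihnen?")
    feeds = {
        "input_ids": np.array([encoding.ids], dtype=np.int64),
        "attention_mask": np.array([encoding.attention_mask], dtype=np.int64),
    }
    outputs = session.run(None, feeds)
    # Assumes the first output is the per-token embeddings.
    sentence_embedding = mean_pool(outputs[0],
                                   np.array([encoding.attention_mask], dtype=np.int64))
    print(sentence_embedding.shape)
```

If this script prints a sentence-embedding shape without errors, the tokenizer and model files are consistent and should load in the plugin as well.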
Model Resources:
jinaai/jina-embeddings-v2-base-es · Hugging Face (Spanish/English bilingual text embedding model)
jinaai/jina-embeddings-v2-base-zh · Hugging Face (Chinese/English bilingual text embedding model)