Version 1.6.4 adds a new feature: custom embedding models. You can use your preferred text embedding model instead of the default built-in model, which is optimized for English.

NOTE: Only models in ONNX format are supported.

image-20241224-070913.png


Below is an example using the German-English bilingual model:

  1. Open https://huggingface.co/jinaai/jina-embeddings-v2-base-de in your browser. jina-embeddings-v2-base-de is a German/English bilingual text embedding model.

  2. Select the “Files and versions” tab and download “tokenizer.json”.

    image-20241224-072337.png
  3. Open the “onnx” folder and download “model_quantized.onnx”. We recommend using a model whose size is less than 200M. A larger model generally gives better recall but uses more memory and CPU.

    image-20241224-073514.png
  4. Go back to the configuration page, enable the switch, then upload the two files.

    image-20241224-074604.png
  5. Click the “Save” button. If the upload succeeds, the page looks like this:

    image-20241227-062302.png
  6. Open the “Manage apps” page and restart the plugin (click “Disable” and then “Enable”).

    image-20241224-075442.png

The embedding model has now switched to your custom model. Rebuilding the embedding store takes some time, depending on how much content your Confluence instance holds.
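If you prefer to fetch the two files from steps 2 and 3 with a script instead of the browser, the following is a minimal sketch. It assumes that Hugging Face serves repository files under the `resolve/main` URL pattern, and it reads the recommended 200M limit as megabytes. The function names `file_url`, `download`, and `is_within_recommended_size` are hypothetical helpers, not part of the plugin.

```python
import os
import shutil
import urllib.request

# Assumption: Hugging Face serves raw repository files at
# https://huggingface.co/<repo>/resolve/<revision>/<path>
HF_BASE = "https://huggingface.co"

# Assumption: the "200M" recommendation from step 3 is read as megabytes.
RECOMMENDED_MAX_BYTES = 200 * 1024 * 1024


def file_url(repo_id: str, path: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a model repository."""
    return f"{HF_BASE}/{repo_id}/resolve/{revision}/{path}"


def download(repo_id: str, path: str, dest_dir: str = ".") -> str:
    """Download one repository file into dest_dir and return the local path."""
    os.makedirs(dest_dir, exist_ok=True)
    local_path = os.path.join(dest_dir, os.path.basename(path))
    with urllib.request.urlopen(file_url(repo_id, path)) as response:
        with open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return local_path


def is_within_recommended_size(size_bytes: int) -> bool:
    """Return True if a file of this size is below the recommended limit."""
    return size_bytes < RECOMMENDED_MAX_BYTES
```

For the example model, calling `download("jinaai/jina-embeddings-v2-base-de", "tokenizer.json")` and `download("jinaai/jina-embeddings-v2-base-de", "onnx/model_quantized.onnx")` would save both files to the current directory. You can then pass `os.path.getsize(...)` of the model file to `is_within_recommended_size` before uploading the files as described in step 4.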


Model Resources:

| Model | Language Support | Model File | Tokenizer File | Comment |
|---|---|---|---|---|
| jinaai/jina-embeddings-v2-base-de | German, English | download | download | https://huggingface.co/jinaai/jina-embeddings-v2-base-de/ |
| jinaai/jina-embeddings-v2-base-es | Spanish, English | download | download | https://huggingface.co/jinaai/jina-embeddings-v2-base-es |
| jinaai/jina-embeddings-v2-base-zh | Chinese, English | download | download | https://huggingface.co/jinaai/jina-embeddings-v2-base-zh |
| Lajavaness/bilingual-embedding-base | French, English | download | download | Quantize to Int8. https://huggingface.co/Lajavaness/bilingual-embedding-small |
| BM-K/KoSimCSE-bert | Korean | download | download | Quantize to Int8. https://huggingface.co/BM-K/KoSimCSE-bert |
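
Two of the entries above carry the comment “Quantize to Int8”. This page does not say how those files were produced. If you need to create an Int8 model yourself from an existing ONNX export, one possible route is ONNX Runtime's dynamic quantization, sketched below. It assumes the `onnxruntime` package is installed and that the model is already in ONNX format; `quantized_path` and `quantize_to_int8` are hypothetical helper names.

```python
from pathlib import Path


def quantized_path(model_path: str) -> str:
    """Return an output path next to the input, with "_quantized" added to the file name."""
    p = Path(model_path)
    return str(p.with_name(f"{p.stem}_quantized{p.suffix}"))


def quantize_to_int8(model_path: str) -> str:
    """Write an Int8 dynamically quantized copy of an ONNX model and return its path."""
    # Imported lazily so quantized_path() works even without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    output_path = quantized_path(model_path)
    # Dynamic quantization stores the weights as signed 8-bit integers.
    quantize_dynamic(model_path, output_path, weight_type=QuantType.QInt8)
    return output_path
```

Calling `quantize_to_int8("model.onnx")` would write `model_quantized.onnx` to the same folder. Quantization usually shrinks the file considerably, which helps stay under the recommended size, but it can reduce embedding quality slightly, so check search results after switching.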
