Custom Embedding Model

Version 1.6.4 adds support for custom embedding models: you can replace the default built-in model, which is optimized for English, with a text embedding model of your choice.

NOTE: Only ONNX-format models are supported.


Below is an example using the German-English bilingual model:

  1. Open https://huggingface.co/jinaai/jina-embeddings-v2-base-de in your browser. jina-embeddings-v2-base-de is a German/English bilingual text embedding model.

  2. Select the “Files and versions” tab and download “tokenizer.json”.

  3. Open the “onnx” folder and download “model_quantized.onnx”. We recommend a model smaller than 200 MB. A larger model tends to give better recall but uses more memory and CPU. NOTE: Only ONNX-format models are supported.

  4. Go back to the configuration page, enable the switch, and upload the two files.

  5. Click the “Save” button. If the save succeeds, the page shows a confirmation.

  6. Open the “Manage apps” page and restart the plugin (click “Disable”, then “Enable”).

The plugin now uses your custom embedding model. Rebuilding the embedding store takes some time, depending on how much content your Confluence instance holds.
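The manual download in steps 1 to 3 can also be scripted. The following is a minimal sketch, not part of the plugin: it assumes Hugging Face serves raw files under `/<repo>/resolve/<revision>/<path>` with `main` as the revision, and it reads the recommended limit as 200 MB. The helper names `file_url`, `fetch`, and `check_files` are hypothetical, and the checks are only a rough plausibility test before uploading, not a guarantee that the plugin will accept the files.

```python
import json
import urllib.request
from pathlib import Path

# Assumption: raw files are served at <base>/<repo>/resolve/<revision>/<path>.
HF_BASE = "https://huggingface.co"
# Assumption: the recommended upper bound from step 3, read as 200 MB.
MAX_MODEL_BYTES = 200 * 1024 * 1024


def file_url(repo_id: str, path: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a Hugging Face repo."""
    return f"{HF_BASE}/{repo_id}/resolve/{revision}/{path}"


def fetch(repo_id: str, path: str, dest_dir: Path) -> Path:
    """Download one file into dest_dir and return the local path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / Path(path).name
    urllib.request.urlretrieve(file_url(repo_id, path), target)
    return target


def check_files(model: Path, tokenizer: Path) -> list[str]:
    """Return a list of warnings; an empty list means the files look plausible."""
    warnings = []
    if model.suffix.lower() != ".onnx":
        warnings.append(f"{model.name}: not an .onnx file")
    if model.stat().st_size > MAX_MODEL_BYTES:
        warnings.append(f"{model.name}: larger than the recommended 200 MB")
    try:
        data = json.loads(tokenizer.read_text(encoding="utf-8"))
        # A tokenizer.json written by the Hugging Face tokenizers library
        # is a JSON object with a top-level "model" section.
        if not isinstance(data, dict) or "model" not in data:
            warnings.append(f"{tokenizer.name}: no 'model' section")
    except (ValueError, OSError) as exc:
        warnings.append(f"{tokenizer.name}: unreadable ({exc})")
    return warnings
```

For the example model, calling `fetch("jinaai/jina-embeddings-v2-base-de", "tokenizer.json", Path("embedding-model"))` and `fetch("jinaai/jina-embeddings-v2-base-de", "onnx/model_quantized.onnx", Path("embedding-model"))` would download the two files, and passing the returned paths to `check_files` lists anything worth a second look before the upload in step 4.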


Model Resources:

 

| Model | Language Support | Model File | Tokenizer File | Comment |
|---|---|---|---|---|
| jinaai/jina-embeddings-v2-base-de | German, English | download | download | https://huggingface.co/jinaai/jina-embeddings-v2-base-de/ |
| jinaai/jina-embeddings-v2-base-es | Spanish, English | download | download | https://huggingface.co/jinaai/jina-embeddings-v2-base-es |
| jinaai/jina-embeddings-v2-base-zh | Chinese, English | download | download | https://huggingface.co/jinaai/jina-embeddings-v2-base-zh |
| Lajavaness/bilingual-embedding-base | French, English | download | download | Quantize to Int8; https://huggingface.co/Lajavaness/bilingual-embedding-small |
| BM-K/KoSimCSE-bert | Korean | download | download | Quantize to Int8; https://huggingface.co/BM-K/KoSimCSE-bert |
| cointegrated/rubert-tiny2 | Russian | download | download | https://huggingface.co/cointegrated/rubert-tiny2 |
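
Two of the entries above carry the comment “Quantize to Int8”. The page does not say which tool was used for that conversion. As one possible approach, the sketch below uses ONNX Runtime's dynamic quantization to shrink an existing ONNX model to Int8 weights. It assumes the `onnxruntime` and `onnx` Python packages are installed and that you already have the model in ONNX format; the helper names `quantized_path` and `quantize_to_int8` are hypothetical.

```python
from pathlib import Path


def quantized_path(model_path: Path) -> Path:
    """Map model.onnx to model_quantized.onnx in the same folder."""
    return model_path.with_name(f"{model_path.stem}_quantized{model_path.suffix}")


def quantize_to_int8(model_path: Path) -> Path:
    """Write an Int8 dynamically quantized copy of an ONNX model and return its path."""
    # Imported lazily so the path helper works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    output = quantized_path(model_path)
    # Dynamic quantization stores the weights as Int8; no calibration data is needed.
    quantize_dynamic(str(model_path), str(output), weight_type=QuantType.QInt8)
    return output
```

Quantization trades some embedding accuracy for a smaller file and lower memory use, so it is worth comparing search results before and after switching to the quantized file.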
