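Alibaba recently released the Qwen3-Embedding models, which come in several parameter sizes. I went with the largest one, the 8B version. Both SiliconFlow and the official Ollama library now carry the model.
What is hard to find in most places, though, is how to specify the output dimension. By default the Qwen3-Embedding 8B model outputs 4096-dimensional vectors. I used to work mostly with BAAI's bge-m3 embeddings, which output 1024 dimensions. 4096 is a lot higher, and with vector database index compatibility in mind, I decided to test retrieval at 1024 dimensions.
After some digging I got it working, so I am sharing it here in the hope that it helps someone. SiliconFlow is the simpler case: the official API documentation already describes it.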

SiliconFlow example

https://api.siliconflow.cn
Example:

# The number of dimensions the resulting output embeddings should have. Only supported in `Qwen/Qwen3` series. 
# - Qwen/Qwen3-Embedding-8B: [64,128,256,512,768,1024,1536,2048,2560,4096] 
# - Qwen/Qwen3-Embedding-4B:[64,128,256,512,768,1024,1536,2048,2560] 
# - Qwen/Qwen3-Embedding-0.6B: [64,128,256,512,768,1024]

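# Note: dimensions sets the output dimension and must be one of the listed values. The text above is from the official documentation. To get the default output dimension, simply omit the "dimensions": 64, parameter.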

# Note: replace ${API-KEY} with your own SiliconFlow API key.

$ curl --request POST   --url https://api.siliconflow.cn/v1/embeddings   --header 'Authorization: Bearer ${API-KEY}'   --header 'Content-Type: application/json'   --data '{ "model":"Qwen/Qwen3-Embedding-8B", "dimensions": 64, "input": "地球是圆的么?还是不规则椭球体?"}'

{"object":"list","data":[{"embedding":[0.10134493559598923,-0.1429036557674408,0.002346791559830308,-0.2653925120830536,0.1071777418255806,-0.19685706496238708,-0.031715862452983856,0.07108727842569351,-0.20560628175735474,0.20852267742156982,-0.1684221625328064,0.06853542476892471,0.12394704669713974,-0.030257660895586014,0.204148069024086,0.10644863545894623,-0.09077298641204834,-0.06416082382202148,0.17717136442661285,0.34850993752479553,0.08858568221330643,0.1071777418255806,0.16113115847110748,-0.17206765711307526,0.013579492457211018,0.24789409339427948,-0.03463226184248924,-0.1450909525156021,-0.008156809024512768,0.25081050395965576,-0.05942167341709137,-0.1655057668685913,0.0459333173930645,0.15165285766124725,-0.043746016919612885,-0.022693246603012085,-0.09004388004541397,0.06488992273807526,-0.022693246603012085,-0.08020102977752686,0.05942167341709137,0.09624123573303223,0.0004898642655462027,-0.0006835315143689513,0.03900686278939247,-0.23039568960666656,-0.1275925487279892,0.011027641594409943,0.12613435089588165,0.05504706874489784,0.06488992273807526,-0.11082324385643005,-0.09004388004541397,0.00893147848546505,-0.0382777638733387,0.007837828248739243,-0.020870495587587357,0.0013271903153508902,-0.028981735929846764,0.05504706874489784,-0.13634175062179565,-0.009888422675430775,0.17425496876239777,-0.08639837801456451],"index":0,"object":"embedding"}],"model":"Qwen/Qwen3-Embedding-8B","usage":{"prompt_tokens":15,"completion_tokens":0,"total_tokens":15}}
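The same request can be sketched in Python. This is a minimal sketch using only the standard library, not an official client: the helper names `build_payload` and `embed` are made up for illustration, and the table of allowed sizes is simply copied from the documentation comment quoted above. It checks `dimensions` against that table before sending, and leaves the field out entirely when no size is given so the model falls back to its default.

```python
import json
import urllib.request

# Supported output sizes per model, copied from the SiliconFlow notes quoted above.
SUPPORTED_DIMENSIONS = {
    "Qwen/Qwen3-Embedding-8B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560, 4096],
    "Qwen/Qwen3-Embedding-4B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560],
    "Qwen/Qwen3-Embedding-0.6B": [64, 128, 256, 512, 768, 1024],
}


def build_payload(model, text, dimensions=None):
    """Build the JSON body (hypothetical helper); omit `dimensions` for the default size."""
    payload = {"model": model, "input": text}
    if dimensions is not None:
        allowed = SUPPORTED_DIMENSIONS.get(model)
        if allowed is not None and dimensions not in allowed:
            raise ValueError(f"{dimensions} is not a listed size for {model}: {allowed}")
        payload["dimensions"] = dimensions
    return payload


def embed(api_key, model, text, dimensions=None):
    """POST to the embeddings endpoint and return the vector as a list of floats."""
    request = urllib.request.Request(
        "https://api.siliconflow.cn/v1/embeddings",
        data=json.dumps(build_payload(model, text, dimensions)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["data"][0]["embedding"]
```

For the 1024-dimension test mentioned at the start, the call would look like `embed(my_key, "Qwen/Qwen3-Embedding-8B", "some text", dimensions=1024)`, where `my_key` is your own API key.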

Ollama local deployment example

Deployment is very simple:

ollama pull qwen3-embedding:8b

Once the download finishes, the model is ready to call.

Example:

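# Request a 32-dimensional embedding. Note the dimensions parameter: here it can be set freely, to any value within the range published by the Qwen3 team.
# To get the default 4096-dimensional output, simply omit the "dimensions": 32, parameter.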

$ curl --request POST   --url http://127.0.0.1:11434/v1/embeddings   --header 'Content-Type: application/json'   --data '{ "model":"qwen3-embedding:8b", "dimensions": 32, "input": "地球是圆的么?还是不规则椭球体?"}'
{"object":"list","data":[{"object":"embedding","embedding":[0.06346299,-0.11780241,-0.008666516,-0.29239595,0.20539273,-0.16160041,-0.112265706,-0.022290846,-0.23823205,0.17676502,-0.12920417,0.078956306,0.10493126,0.009448055,0.2863263,0.1379949,-0.082198665,-0.0664435,0.17087917,0.46806002,0.13782167,0.1962539,0.033117626,-0.21283478,-0.02670334,0.27668136,-0.04899813,-0.20388159,0.029554997,0.30507603,-0.052901987,-0.103610314],"index":0}],"model":"qwen3-embedding:8b","usage":{"prompt_tokens":16,"total_tokens":16}}
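A Python version of the local call might look like the sketch below. It assumes the Ollama server is listening on the same address as in the curl command; `extract_embedding` and `embed_local` are illustrative names, not part of any Ollama client. The length check is there because a quick way to confirm that `dimensions` was honoured is to compare the returned vector's size with the size you asked for.

```python
import json
import urllib.request

# Same OpenAI-compatible endpoint as the curl command above.
OLLAMA_URL = "http://127.0.0.1:11434/v1/embeddings"


def extract_embedding(body, expected_dimensions=None):
    """Take the first vector from an embeddings response and check its length."""
    vector = body["data"][0]["embedding"]
    if expected_dimensions is not None and len(vector) != expected_dimensions:
        raise ValueError(
            f"expected {expected_dimensions} dimensions, got {len(vector)}"
        )
    return vector


def embed_local(text, model="qwen3-embedding:8b", dimensions=None):
    """Call the local server; leave `dimensions` as None for the model default."""
    payload = {"model": model, "input": text}
    if dimensions is not None:
        payload["dimensions"] = dimensions
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_embedding(json.load(response), dimensions)
```

Calling `embed_local("some text", dimensions=1024)` would then either return a 1024-element list or raise if the server sent back a different size.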

That's all there is to it.
