Version 0.10

def start_llamafile():
--- Main program ---
1. Start the service
if not start_llamafile():
2. Initialize the OpenAI client
client = OpenAI(
print("--- ✅ Service is ready, preparing to send the image request ---")
3. Prepare the image
image_path = r"D:\python\Ai\jpg\0347.jpg" # your image path
4. Convert the image to Base64
with open(image_path, "rb") as image_file:
try:
except Exception as e:
print("\n--- Done ---")

res:
Replies: 1 comment
Hi @yang-521!

If I got it right (please correct me if I am wrong), you are trying to run `Qwen3.5-0.8B-Q8_0.llamafile` on Windows by calling the llamafile server binary and passing the `.llamafile` as the input model.

The llamafile binary is able to extract GGUF model weights from a bundled `.llamafile`, but given the way llama.cpp works you still have to provide both the `-m` and `--mmproj` parameters to run a multimodal model. As a test, if you run `llamafile.exe -m Qwen3.5-0.8B-Q8_0.llamafile --cli --image image_name.png` from your terminal, you should definitely see an error.

There are two ways you can run llamafile to serve a multimodal model:

Hope this helps, lmk how it goes!
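
As a rough illustration of the `-m` plus `--mmproj` requirement mentioned above, here is a minimal Python sketch that only assembles the command line and refuses to build one without a projector file. The helper name `build_llamafile_command` and the file names `model.gguf` and `mmproj.gguf` are hypothetical placeholders, not files shipped with llamafile, so swap in your own paths.

```python
def build_llamafile_command(binary, model, mmproj, extra_args=()):
    """Assemble an argv list that passes both -m and --mmproj.

    `binary`, `model` and `mmproj` are paths chosen by the caller; the
    values used below are placeholders (assumptions), not real files.
    """
    if not mmproj:
        # A multimodal model needs the projector weights in addition to -m.
        raise ValueError("a multimodal model needs --mmproj as well as -m")
    return [str(binary), "-m", str(model), "--mmproj", str(mmproj), *extra_args]


# Hypothetical file names, for illustration only.
cmd = build_llamafile_command("llamafile.exe", "model.gguf", "mmproj.gguf")
print(cmd)
```

The resulting list can be handed to `subprocess.Popen` from your script. Any further arguments you already use, such as `--cli --image image_name.png`, can go in `extra_args`.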