Why not support image input? #944

yang-521 · 2026-04-13T04:13:32Z


Version 0.10
Code: `import subprocess
import os
import time
import sys
import base64
from openai import OpenAI

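def start_llamafile():
    # 1. Define the working directory
    work_dir = r"D:\AI\llamafile-0.10.0\bin"

    # 2. Build the full absolute path to the executable
    # os.path.join handles path separators automatically, which avoids mistakes
    exe_path = os.path.join(work_dir, "llamafile.exe")

    # 3. Define the command as a list
    # Note: flags and their values are separate items here (a single command string would also work)
    command = [
        exe_path,  # absolute path to the executable
        "-m", "Qwen3.5-0.8B-Q8_0.llamafile",  # cwd is set, so a relative model path works; an absolute path is fine too
        "--gpu", "auto",
        "--port", "8080"
    ]

    print("🚀 Preparing to start the multimodal service...")
    try:
        # shell=False is more robust and passes the arguments through unchanged
        subprocess.Popen(
            command,
            cwd=work_dir,
            shell=False,
            creationflags=subprocess.CREATE_NEW_CONSOLE
        )
        print("⏳ Service starting, loading the model and vision components (about 5 seconds)...")
        time.sleep(5)  # vision models take time to load; consider waiting longer
        return True
    except Exception as e:
        print(f"Failed to start: {e}")
        return False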

# --- Main program ---

# 1. Start the service
if not start_llamafile():
    sys.exit(1)

# 2. Initialize the OpenAI client
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required"
)

print("--- ✅ Service is ready, preparing to send the image request ---")

# 3. Prepare the image
image_path = r"D:\python\Ai\jpg\0347.jpg"  # path to your image
if not os.path.exists(image_path):
    print(f"❌ Error: image not found: {image_path}")
    sys.exit(1)

# 4. Encode the image as Base64
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

try:
    # 5. Send the request
    print("--- 🗣️ The AI is looking at the image... ---")
    completion = client.chat.completions.create(
        model="LLaMA_CPP",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is in this image? Please describe it in detail."},
                    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
                ]
            }
        ],
        stream=True
    )

    for chunk in completion:
        # Some streamed chunks may carry no choices, so check before indexing
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

except Exception as e:
    print(f"\n❌ Error: {e}")
    print("💡 If you see 'image input is not supported', check the console window log to confirm that mmproj was loaded.")

print("\n--- Done ---")`

Result:
❌ Error: Error code: 500 - {'error': {'code': 500, 'message': 'image input is not supported - hint: if this is unexpected, you may need to provide the mmproj', 'type': 'server_error'}}
💡 If you see 'image input is not supported', check the console window log to confirm that mmproj was loaded.


aittalam (Maintainer) · 2026-04-17T16:35:43Z

Hi @yang-521!

If I got it right (please correct me if I am wrong), you are trying to run `Qwen3.5-0.8B-Q8_0.llamafile` on Windows by calling the llamafile server binary and passing the `.llamafile` as the input model.

The llamafile binary is able to extract GGUF model weights from a bundled `.llamafile`, but given the way llama.cpp works you still have to provide both the `-m` and `--mmproj` parameters to run a multimodal model. As a test, if you run `llamafile.exe -m Qwen3.5-0.8B-Q8_0.llamafile --cli --image image_name.png` from your terminal, you should definitely see an error.
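
To make the requirement concrete, here is a minimal Python sketch of a launch command that passes both flags. It is an illustration, not the project's reference usage: `build_server_command` is a hypothetical helper, the two `.gguf` file names are placeholders, and the port value is simply the one used in the question's script.

```python
from pathlib import Path


def build_server_command(exe, model, mmproj, port=8080):
    """Return an argv list that passes both -m and --mmproj (hypothetical helper)."""
    return [
        str(exe),
        "-m", str(model),          # language model weights
        "--mmproj", str(mmproj),   # multimodal projector weights
        "--port", str(port),
    ]


command = build_server_command(
    Path("llamafile.exe"),
    Path("model_weights.gguf"),    # placeholder file name
    Path("mmproj_weights.gguf"),   # placeholder file name
)
print(command)
```

The resulting list can be handed to `subprocess.Popen` in place of a command that only contains `-m`.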

There are two ways you can run llamafile to serve a multimodal model:

- If you have `llamafile.exe`, you can get both the LLM and projector weights (see e.g. here: you'll find both quants and mmproj GGUFs) and then run it as `llamafile.exe -m model_weights.gguf --mmproj mmproj_weights.gguf`.
- If you have downloaded a llamafile that already contains both model and mmproj weights (this should be the case for our 0.10.0 pre-built llamafiles), you can simply rename it (e.g. `Qwen3.5-0.8B-Q8_0.llamafile` -> `Qwen3.5-0.8B-Q8_0.llamafile.exe`) and then run it directly. You can first try it manually and test how your Python code works with it, then start it automatically from your program.
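
When the server is started automatically from a program, a fixed `time.sleep` can be too short or needlessly long. One possible alternative, sketched below with the standard library only, is to poll the port until it accepts a TCP connection. `wait_for_port` is a hypothetical helper name, and the timeout values are arbitrary. Note the limitation: an open port only shows that the server is listening, not necessarily that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

After `subprocess.Popen(...)`, a call such as `wait_for_port("localhost", 8080)` could then decide whether to continue or exit with an error.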

Hope this helps, lmk how it goes!
