stream_analyzer Fails with Ollama Provider #296

Open
siddharthdevilz opened this issue Apr 1, 2025 · 1 comment
Labels
bug Something isn't working

Comments

siddharthdevilz commented Apr 1, 2025

Checklist

  • I'm running the newest version of LLM Vision https://github.com/valentinfrlch/ha-llmvision/releases/latest
  • I have enabled debug logging for the integration.
  • I have filled out the issue template to the best of my ability.
  • This issue only contains 1 issue (if you have multiple issues, open one issue for each issue).
  • This is a bug and not a feature request.
  • I have searched open issues for my problem.

Describe the issue

The stream_analyzer action in LLM Vision fails when using Ollama as the provider. The image_analyzer action works correctly with Ollama, but stream_analyzer does not: the vision models served by Ollama only accept a single image per message, so the request fails when stream_analyzer sends multiple frames in one message.
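
For reference, a minimal sketch (not the integration's actual code) of the two payload shapes involved: the debug log below shows stream_analyzer packing every frame into one user message as separate image_url parts, while Ollama's vision models accept only one image per message, so splitting the frames into one message each would be the obvious workaround. The frame data URLs here are placeholders.

def single_message_payload(model: str, prompt: str, frames: list[str]) -> dict:
    """What the log shows: all frames in one message -> rejected by Ollama."""
    content = []
    for i, frame in enumerate(frames):
        content.append({"type": "text", "text": f"frame {i}:"})
        content.append({"type": "image_url", "image_url": {"url": frame}})
    content.append({"type": "text", "text": prompt})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def per_frame_payload(model: str, prompt: str, frames: list[str]) -> dict:
    """Possible workaround: one image per message, with the prompt sent last."""
    messages = [
        {"role": "user", "content": [
            {"type": "text", "text": f"frame {i}:"},
            {"type": "image_url", "image_url": {"url": frame}},
        ]}
        for i, frame in enumerate(frames)
    ]
    messages.append({"role": "user", "content": [{"type": "text", "text": prompt}]})
    return {"model": model, "messages": messages}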

Reproduction steps

  1. Set up LLMVision with Ollama as the provider, following the documentation.
  2. Configure the stream_analyzer to analyze frames from a camera entity.
  3. Trigger the stream_analyzer action (a REST API sketch of this call follows below).
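
For step 3, a hedged example of triggering the action from outside Home Assistant via the REST API. The service data keys below are assumptions pieced together from the debug log and may not match the integration's exact schema (check the LLM Vision docs); the host and long-lived access token are placeholders.

import requests

HA_URL = "http://homeassistant.local:8123"   # placeholder Home Assistant host
TOKEN = "<long-lived-access-token>"          # placeholder token

resp = requests.post(
    f"{HA_URL}/api/services/llmvision/stream_analyzer",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        # Assumed field names, not a verified schema:
        "provider": "<ollama-config-entry>",
        "entity_id": "camera.porch_fluent",
        "message": "The attached images are frames from a live camera feed. "
                   "Describe what you see",
        "max_frames": 2,
        "max_tokens": 100,
        "temperature": 0.2,
    },
    timeout=120,
)
print(resp.status_code, resp.text)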

Tried with multiple models

  1. gemma3:12b
  2. llama3.2-vision:11b
  3. llava-phi3

Debug logs

Home Assistant

2025-04-01 12:02:32.110 DEBUG (MainThread) [custom_components.llmvision.memory] Memory([], [], 0)
2025-04-01 12:02:32.111 INFO (MainThread) [custom_components.llmvision.providers] Request data: {'model': 'llama3.2-vision:11b', 'messages': [{'role': 'user', 'content': [{'type': 'text', 'text': 'porch_fluent frame 1:'}, {'type': 'image_url', 'image_url': {'url': '<long_string>'}}, {'type': 'text', 'text': 'porch_fluent frame 0:'}, {'type': 'image_url', 'image_url': {'url': '<long_string>'}}, {'type': 'text', 'text': 'The attached images are frames from a live camera feed. Describe what you see'}]}], 'max_tokens': 100, 'temperature': 0.2}
2025-04-01 12:02:32.111 INFO (MainThread) [custom_components.llmvision.providers] Posting to http://192.168.x.x:3000/api/chat/completions
2025-04-01 12:02:32.241 INFO (MainThread) [custom_components.llmvision.providers] [INFO] Full Response: {"detail":"500: Ollama: 500, message='Internal Server Error', url='http://localhost:11434/api/chat'"}
2025-04-01 12:02:32.242 ERROR (MainThread) [homeassistant.helpers.script.websocket_api_script] websocket_api script: Error executing script. Error for call_service at pos 1: Unknown error
2025-04-01 12:02:32.245 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [546287706048] Unknown error


Ollama OpenWebUI Docker

time=2025-04-01T01:25:52.536Z level=INFO source=server.go:624 msg="llama runner started in 21.82 seconds"
[GIN] 2025/04/01 - 01:25:52 | 500 |  24.39373541s |       127.0.0.1 | POST     "/api/chat"
time=2025-04-01T01:25:52.536Z level=ERROR source=routes.go:1516 msg="chat prompt error" error="vision model only supports a single image per message"
2025-04-01 01:25:52.537 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 192.168.x.x:46240 - "POST /api/chat/completions HTTP/1.1" 400 - {}
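
To reproduce this outside Home Assistant, here is a standalone sketch that posts the same payload shape as the logged request to the Open WebUI endpoint. The host, API key, and image files are placeholders for this setup; with two image_url parts in one message it should return the same 500, and with a single image it should succeed.

import base64
import requests

OPENWEBUI_URL = "http://192.168.x.x:3000/api/chat/completions"  # endpoint from the log
API_KEY = "sk-..."  # placeholder; Open WebUI normally expects a bearer token


def data_url(path: str) -> str:
    """Encode a local JPEG as a data URL, standing in for a camera frame."""
    with open(path, "rb") as f:
        return "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()


payload = {
    "model": "llama3.2-vision:11b",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "porch_fluent frame 1:"},
            {"type": "image_url", "image_url": {"url": data_url("frame1.jpg")}},
            {"type": "text", "text": "porch_fluent frame 0:"},
            {"type": "image_url", "image_url": {"url": data_url("frame0.jpg")}},
            {"type": "text", "text": "The attached images are frames from a live "
                                     "camera feed. Describe what you see"},
        ],
    }],
    "max_tokens": 100,
    "temperature": 0.2,
}

resp = requests.post(
    OPENWEBUI_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
print(resp.status_code, resp.text)
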
siddharthdevilz added the bug label Apr 1, 2025
valentinfrlch (Owner) commented

To me it looks like you're actually using Open WebUI as the provider, which then uses Ollama as the backend. Your logs also say "vision model only supports a single image per message", which could be a limitation introduced by Open WebUI. Can you try using Ollama directly?
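
A quick way to test that is to post directly to Ollama's native /api/chat endpoint on port 11434, bypassing Open WebUI. The sketch below is only illustrative; the host, model, and image paths are placeholders. If Ollama itself returns the same "single image per message" error, the limit sits in Ollama/the model; if it answers, the failure is introduced somewhere in the Open WebUI layer.

import base64
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's native chat endpoint


def b64(path: str) -> str:
    """Base64-encode an image file; Ollama's native API takes raw base64 strings."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()


payload = {
    "model": "llama3.2-vision:11b",
    "messages": [{
        "role": "user",
        "content": "These are frames from a live camera feed. Describe what you see.",
        "images": [b64("frame0.jpg"), b64("frame1.jpg")],
    }],
    "stream": False,
}

resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
print(resp.status_code, resp.text)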
