Support for RTX4060 with 8GB VRAM? #5

Closed · nigelp opened this issue Apr 8, 2025 · 10 comments

@nigelp

nigelp commented Apr 8, 2025

Any chance of support for smaller local models?

@mandalsouvik3333
Collaborator

You can try running the Qwen/Qwen2.5-VL-3B-Instruct-AWQ model. In my experience its accuracy wasn't great, especially with complex tabular or handwritten documents, but if you're working with printed electronic documents the 3B model might still be worth a try.
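
The launch command would presumably mirror the ones later in this thread, something like the line below (the hosted_vllm/ model prefix is an assumption based on the litellm convention; check the README for the exact form):

```
python -m docext.app.app --model_name hosted_vllm/Qwen/Qwen2.5-VL-3B-Instruct-AWQ --max_img_size 1024
```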

@mandalsouvik3333
Collaborator

mandalsouvik3333 commented Apr 10, 2025

@nigelp I have integrated Ollama, so you should be able to use this with Ollama. You can check here: https://github.com/NanoNets/docext?tab=readme-ov-file#models-with-ollama-linux-and-macos

Let me know if this works. I have tested it on a Quadro M4000 (8 GB), so it should work fine on your 4060.
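
Roughly, the steps would be (assuming the llama3.2-vision tag; other Ollama vision models should work the same way):

```
ollama pull llama3.2-vision
python -m docext.app.app --model_name ollama/llama3.2-vision --max_img_size 1024
```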

@nigelp
Author

nigelp commented Apr 10, 2025

Thanks very much. Just tried it. Feedback:

One problem: there is no requirements.txt file, so I had to work out the dependencies manually using my AI, and the app still didn't work once installed. Gradio comes up fine, but when I try to scan a file it gives me the console errors below.

```
css.ts:32 Unable to preload CSS for https://gradio.s3-us-west-2.amazonaws.com/assets/index-Bu6H1l3u.css
(anonymous) @ css.ts:32
ui-sans-serif-Regular.woff2:1 Failed to load resource: the server responded with a status of 404 (Not Found)
system-ui-Regular.woff2:1 Failed to load resource: the server responded with a status of 404 (Not Found)
Index.svelte:362 Failed to execute 'postMessage' on 'DOMWindow': The target origin provided ('https://huggingface.co') does not match the recipient window's origin ('http://localhost:7860').
(anonymous) @ Index.svelte:362
manifest.json:1 Failed to load resource: the server responded with a status of 404 (Not Found)
localhost/:1 Manifest fetch from http://localhost:7860/manifest.json failed, code 404
ui-sans-serif-Bold.woff2:1 Failed to load resource: the server responded with a status of 404 (Not Found)
system-ui-Bold.woff2:1 Failed to load resource: the server responded with a status of 404 (Not Found)
api_info.ts:423 Too many arguments provided for the endpoint.
rg @ api_info.ts:423
stream.ts:185 Method not implemented.
close @ stream.ts:185
api_info.ts:423 Too many arguments provided for the endpoint.
rg @ api_info.ts:423
manifest.json:1 Failed to load resource: the server responded with a status of 404 (Not Found)
```

@mandalsouvik3333
Collaborator

mandalsouvik3333 commented Apr 10, 2025

@nigelp Can you share the installed package version and the command you used to start the app?

@nigelp
Author

nigelp commented Apr 10, 2025

The latest version, I assume. The command:

```
python -m docext.app.app --model_name ollama/llama3.2-vision --max_img_size 1024
```

Another error:
```
2025-04-10 10:48:02.674 | INFO | docext.core.extract:extract_fields_from_documents:37 - Sending request to ollama/llama3.2-vision
2025-04-10 10:48:02.699 | INFO | docext.core.extract:extract_tables_from_documents:91 - Sending request to ollama/llama3.2-vision

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True`.

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True`.

Traceback (most recent call last):
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connection.py", line 196, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
    raise err
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [WinError 10049] The requested address is not valid in its context

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connectionpool.py", line 495, in _make_request
    conn.request(
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connection.py", line 398, in request
    self.endheaders()
  File "C:\ProgramData\miniconda3\Lib\http\client.py", line 1331, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\ProgramData\miniconda3\Lib\http\client.py", line 1091, in _send_output
    self.send(msg)
  File "C:\ProgramData\miniconda3\Lib\http\client.py", line 1035, in send
    self.connect()
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connection.py", line 236, in connect
    self.sock = self._new_conn()
                ^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connection.py", line 211, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x000002867872B170>: Failed to establish a new connection: [WinError 10049] The requested address is not valid in its context

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\User\AppData\Roaming\Python\Python312\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\Lib\site-packages\urllib3\util\retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='0.0.0.0', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000002867872B170>: Failed to establish a new connection: [WinError 10049] The requested address is not valid in its context'))
```

@nigelp
Author

nigelp commented Apr 10, 2025

Fixed. According to my AI:

> The problem was fixed by modifying the code so that if Ollama is detected running on localhost:11434, the app explicitly overrides the host to localhost instead of keeping the default 0.0.0.0.
>
> Before, the app would only override the port but not the host, resulting in invalid API calls to 0.0.0.0:11434.
>
> Now, it correctly uses localhost:11434 for Ollama requests, allowing successful communication.
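
In code terms, the kind of override it applied would look something like this (function and constant names are hypothetical, not docext's actual code):

```python
# Sketch only. 0.0.0.0 is a "listen on all interfaces" bind address; on
# Windows it is rejected as a *connect* target (WinError 10049), so requests
# to a local Ollama server must be sent to localhost instead.
OLLAMA_PORT = 11434

def resolve_vlm_server_host(host: str, port: int) -> str:
    """Return a host that is safe to connect to for local Ollama requests."""
    if port == OLLAMA_PORT and host == "0.0.0.0":
        return "localhost"
    return host
```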

@mandalsouvik3333
Collaborator

Yeah, nice catch. The error is in this line; the default needs to be changed to localhost: https://github.com/NanoNets/docext/blob/b3093f4a71dc2a895645b85d847dcc85ddf7f87a/docext/app/args.py#L17C10-L17C27

For now you can do this:

```
python -m docext.app.app --model_name ollama/llama3.2-vision --max_img_size 1024 --vlm_server_host localhost
```
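
The permanent fix would presumably just change the argparse default in docext/app/args.py, roughly (a sketch; docext's real parser has more options):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--vlm_server_host",
    type=str,
    default="localhost",  # previously "0.0.0.0", which Windows rejects as a connect address
    help="Host where the VLM server (e.g. Ollama) is listening",
)
```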

@mandalsouvik3333
Collaborator

@nigelp I have fixed this with the default settings as well. Let me know if you are still facing any issues.

@mandalsouvik3333
Collaborator

@nigelp Closing this. Feel free to reopen if you face any issues.
