Best model for air-gapped usage? #4081
Unanswered
MyndSpaceAI asked this question in Q&A
Replies: 0
Hi everyone,
First of all, thanks for the wonderful repo, great work!
I’m looking for advice on the best model to use with LMDeploy for information extraction (RAG) from a wide variety of multilingual documents in an air-gapped setup (no external network access).
My environment is limited to a single NVIDIA RTX 5090 GPU, so VRAM and compute efficiency also matter.
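For context, this is roughly how I plan to run whichever model gets recommended, fully offline with the pipeline API. The model path, context length, and cache fraction below are just placeholders I'm assuming for now, not a final choice:

```python
import os

# Air-gapped: make sure nothing tries to reach the Hugging Face Hub
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig

# "/models/candidate-llm" is a placeholder for a checkpoint copied onto the
# air-gapped machine; the config values are rough guesses for a single GPU
engine_cfg = TurbomindEngineConfig(
    tp=1,                        # single GPU, no tensor parallelism
    session_len=32768,           # long enough for RAG context chunks
    cache_max_entry_count=0.8,   # fraction of free VRAM reserved for KV cache
)

pipe = pipeline("/models/candidate-llm", backend_config=engine_cfg)

resp = pipe(
    ["Extract the invoice number and total amount from the following document:\n..."],
    gen_config=GenerationConfig(max_new_tokens=512, temperature=0.1),
)
print(resp[0].text)
```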
Does anyone have experience or recommendations for models that perform well under these conditions, e.g. the Qwen3, GLM-4.x, or InternVL 3.5 series?
Thanks in advance for any insights or benchmarks you can share!
• Peter