1 vote
1 answer
115 views

I’m trying to install a BLAS-enabled version of llama-cpp-python on WSL so that the GGML library uses OpenBLAS. I attempted two different pip install invocations with CMAKE_ARGS, but the module ...
Kunal Gupta
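For reference, a minimal sketch of how an OpenBLAS build is usually requested and then verified from Python; the CMake flag names vary between llama-cpp-python releases, and the model path is a placeholder:

# Hypothetical install command (flag names differ across versions, e.g. GGML_BLAS vs LLAMA_BLAS):
#   CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python --force-reinstall --no-cache-dir
from llama_cpp import Llama

# With verbose=True llama.cpp prints its system-info line at load time;
# a BLAS-enabled build typically reports "BLAS = 1" there.
llm = Llama(model_path="model.gguf", verbose=True)  # placeholder path
print(llm("Q: What is 2+2? A:", max_tokens=8)["choices"][0]["text"])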
0 votes
1 answer
710 views

I have been trying to install llama-cpp-python on Windows 11 with GPU support for a while, and it just doesn't work no matter what I try. I installed the necessary Visual Studio toolkit packages, ...
MiszS • 11
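As a rough smoke test for a GPU build (the install command and model path below are illustrative, and the CMake flag name depends on the llama-cpp-python version):

# Hypothetical install command from a Windows shell:
#   set CMAKE_ARGS=-DGGML_CUDA=on          (older releases used -DLLAMA_CUBLAS=on)
#   pip install llama-cpp-python --force-reinstall --no-cache-dir --verbose
from llama_cpp import Llama

# n_gpu_layers=-1 asks llama.cpp to offload every layer it can; with verbose=True
# the load log shows how many layers actually landed on the GPU.
llm = Llama(model_path="model.gguf", n_gpu_layers=-1, verbose=True)  # placeholder path
print(llm("Hello", max_tokens=4)["choices"][0]["text"])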
1 vote
0 answers
394 views

I tried to install llama-cpp-python via pip, but I get an error during the installation. The command that I wrote: CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip install ...
ZZISST • 21
1 vote
0 answers
222 views

I am trying to set up local, high-speed NLP but am failing to install the arm64 version of llama-cpp-python. Even when I run CMAKE_ARGS="-DLLAMA_METAL=on -DLLAMA_METAL_EMBED_LIBRARY=on" \ ...
Dennis Losett
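One thing worth ruling out before fighting the CMake flags: pip builds the wheel for the architecture of the running interpreter, so a Python running under Rosetta (x86_64) can never produce an arm64/Metal build. A quick check:

import platform
import sys

# On Apple Silicon this should print "arm64"; "x86_64" means the interpreter is
# running under Rosetta and pip will compile an x86_64 wheel without Metal support.
print(platform.machine())
print(sys.version)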
4 votes
0 answers
251 views

I am new to this. I have been trying but could not make the model answer about images. from llama_cpp import Llama import torch from PIL import Image import base64 llm = Llama( model_path='Holo1-...
Abhash Rai
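For context, llama-cpp-python feeds images to vision models through a chat handler plus a separate CLIP/projector GGUF rather than through PIL or torch tensors. A sketch using the LLaVA-1.5 handler as an example; the handler class, both file paths, and whether this particular model is supported are assumptions:

import base64
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler  # handler choice depends on the model

# Both paths are placeholders; vision models ship a separate mmproj/CLIP GGUF.
chat_handler = Llava15ChatHandler(clip_model_path="mmproj.gguf")
llm = Llama(model_path="model.gguf", chat_handler=chat_handler, n_ctx=4096)

with open("image.png", "rb") as f:
    data_uri = "data:image/png;base64," + base64.b64encode(f.read()).decode()

resp = llm.create_chat_completion(messages=[
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": data_uri}},
        {"type": "text", "text": "Describe this image."},
    ]},
])
print(resp["choices"][0]["message"]["content"])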
0 votes
0 answers
128 views

I am attempting to bundle a RAG agent into a .exe. However, when running the .exe I keep hitting the same two problems. The first problem is with locating llama-cpp, which I have fixed. The ...
Arnab Mandal
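PyInstaller's static analysis does not find llama-cpp-python's native shared library on its own, so the usual fix is to collect the whole package, either with --collect-all llama_cpp on the command line or in the .spec file. A sketch of the relevant part of a spec (the entry-point name is a placeholder):

# Spec files are plain Python, executed by PyInstaller itself.
from PyInstaller.utils.hooks import collect_all

# Gather llama_cpp's shared library (llama.dll / libllama.so) and package data.
datas, binaries, hiddenimports = collect_all("llama_cpp")

a = Analysis(
    ["rag_agent.py"],  # placeholder entry point
    binaries=binaries,
    datas=datas,
    hiddenimports=hiddenimports,
)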
-1 votes
2 answers
631 views

Creating directory "llava_shared.dir\Release". Structured output is enabled. The formatting of compiler diagnostics will reflect the error hierarchy. See https://aka.ms/cpp/structured-output ...
sandeep • 161
0 votes
0 answers
102 views

I want a dataset of common n-grams and their log likelihoods. Normally I would download the Google Books Ngram Exports, but I wonder if I can generate a better dataset using a large language model. ...
evashort
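If the plan is to score text with a local model, llama-cpp-python exposes OpenAI-style logprobs on completions, which could then be aggregated into n-gram statistics. A rough sketch, assuming a placeholder GGUF path and the legacy OpenAI response layout:

from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder path

# logprobs=N attaches, for each generated token, the log probabilities of the
# top-N candidates, mirroring the legacy OpenAI completions format.
out = llm("The quick brown", max_tokens=8, logprobs=5, temperature=0.0)
lp = out["choices"][0]["logprobs"]
for token, logprob in zip(lp["tokens"], lp["token_logprobs"]):
    print(repr(token), logprob)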
1 vote
0 answers
150 views

I’m working on a project that requires fully deterministic outputs across different machines using Ollama. I’ve ensured the following parameters are identical: Model quantization (e.g., llama2:7b-q4_0)...
user29255210
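Matching the model file is necessary but not sufficient; the request-level sampling options have to be pinned as well. A sketch of a fully pinned request against Ollama's REST API (the endpoint and option names are the documented ones, the model tag is taken from the question):

import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2:7b-q4_0",
        "prompt": "List three prime numbers.",
        "stream": False,
        # Pin every sampling knob: a fixed seed plus greedy-style settings.
        "options": {"seed": 42, "temperature": 0, "top_k": 1, "top_p": 1.0, "num_predict": 64},
    },
    timeout=120,
)
print(resp.json()["response"])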
2 votes
1 answer
1k views

I want my LLM chatbot to remember previous conversations even after restarting the program. It is built with llama-cpp-python and LangChain; it has conversation memory for the current chat, but obviously ...
QUARKS • 29
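One way to get memory that survives a restart is simply to serialize the message list to disk and reload it on startup; nothing model-specific is involved. A minimal sketch with a JSON file and llama-cpp-python's chat API, where the paths and system prompt are placeholders:

import json
import os
from llama_cpp import Llama

HISTORY_FILE = "chat_history.json"  # placeholder location

def load_history():
    if os.path.exists(HISTORY_FILE):
        with open(HISTORY_FILE) as f:
            return json.load(f)
    return [{"role": "system", "content": "You are a helpful assistant."}]

def save_history(messages):
    with open(HISTORY_FILE, "w") as f:
        json.dump(messages, f)

llm = Llama(model_path="model.gguf", n_ctx=4096)  # placeholder path
messages = load_history()                          # restored from the previous run
messages.append({"role": "user", "content": "What did we talk about last time?"})
reply = llm.create_chat_completion(messages=messages)["choices"][0]["message"]
messages.append(reply)
save_history(messages)
print(reply["content"])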
1 vote
0 answers
77 views

I was implementing RAG on a document using the Llama 2 model, but my model keeps asking questions of itself and answering them. llm = LlamaCpp(model_path=model_path, temperature=0, ...
Knox • 21
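This behaviour usually means the completion keeps generating past the answer because nothing tells it where to stop; adding stop sequences that match the prompt template is the common fix. A sketch with LangChain's LlamaCpp wrapper, where the stop strings must mirror whatever labels the prompt actually uses:

from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="llama-2-7b-chat.Q4_K_M.gguf",  # placeholder path
    temperature=0,
    # Cut generation as soon as the model starts writing a new "Question:" turn.
    stop=["Question:", "\nQ:"],
)
print(llm.invoke("Question: What is the capital of France?\nAnswer:"))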
0 votes
0 answers
192 views

I start the llama-cpp-python server with the command: python -m llama_cpp.server --model D:\Mistral-7B-Instruct-v0.3.Q4_K_M.gguf --n_ctx 8192 --chat_format functionary Then I run my Python script, which ...
Jengi829
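The server exposes an OpenAI-compatible endpoint (port 8000 by default), so the functionary chat format is exercised through ordinary tool-calling requests. A sketch of a matching client; the tool schema is illustrative:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="functionary",  # mostly informational for a single-model server
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(resp.choices[0].message.tool_calls)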
2 votes
1 answer
1k views

I want to choose the next token myself, instead of letting llama-cpp-python automatically choose one for me. This requires me to see a list of candidate next tokens, along with their probabilities, ...
caveman • 464
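One way to do this without dropping to the low-level API is to request one token at a time with logprobs, pick among the returned candidates yourself, append the choice, and repeat. A sketch that assumes the OpenAI-style logprobs layout llama-cpp-python returns:

from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder path
prompt = "The capital of France is"

for _ in range(5):
    out = llm(prompt, max_tokens=1, logprobs=10, temperature=0.0)
    # top_logprobs[0] maps each candidate token string to its log probability.
    candidates = out["choices"][0]["logprobs"]["top_logprobs"][0]
    for tok, lp in sorted(candidates.items(), key=lambda kv: -kv[1]):
        print(f"{tok!r}: {lp:.3f}")
    chosen = max(candidates, key=candidates.get)  # your own selection rule goes here
    prompt += chosen
print(prompt)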
0 votes
2 answers
880 views

code: from langchain_community.vectorstores import FAISS from langchain_community.embeddings import HuggingFaceEmbeddings from langchain import PromptTemplate from langchain_community.llms import ...
Ashish Sawant
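The excerpt only shows the imports, but the usual shape of that stack is: embed the documents into FAISS, wrap the index as a retriever, and hand it to a RetrievalQA chain over the local model. A sketch with placeholder texts, model names, and paths:

from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import LlamaCpp
from langchain.chains import RetrievalQA

# Build a tiny FAISS index over placeholder documents.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
db = FAISS.from_texts(
    ["Paris is the capital of France.", "Berlin is the capital of Germany."],
    embeddings,
)

llm = LlamaCpp(model_path="model.gguf", n_ctx=2048)  # placeholder path
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
print(qa.invoke({"query": "What is the capital of France?"}))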
1 vote
0 answers
173 views

I am trying to build a chatbot using LangChain. This chatbot can use different backends: Ollama, Hugging Face, llama.cpp, OpenAI. In a YAML file, I can configure the backend (aka provider) and the ...
Salvatore D'angelo
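A common way to structure this is a small factory that reads the YAML once and maps the provider string to the matching LangChain class; the keys, class choices, and file name below are illustrative:

import yaml
from langchain_community.chat_models import ChatOllama
from langchain_community.llms import LlamaCpp
from langchain_openai import ChatOpenAI

def build_llm(config_path="config.yaml"):
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    provider = cfg["provider"]      # e.g. "ollama", "llamacpp", "openai"
    params = cfg.get("params", {})  # provider-specific kwargs taken from the YAML
    if provider == "ollama":
        return ChatOllama(**params)   # e.g. model: "llama2"
    if provider == "llamacpp":
        return LlamaCpp(**params)     # e.g. model_path: "model.gguf"
    if provider == "openai":
        return ChatOpenAI(**params)   # e.g. model: "gpt-4o-mini"
    raise ValueError(f"Unknown provider: {provider}")

llm = build_llm()
print(llm.invoke("Hello"))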
