This project implements a web search agent powered by a local Large Language Model (LLM) and a SearXNG search engine. It automates the process of searching the web for information and summarizing the results using an LLM.
The llm-websearch.bash script orchestrates the search process. It first formulates a search query using the LLM based on the user's input. Then, it queries a SearXNG instance and iterates through the search results, using the LLM to determine if a page is relevant. If a page is deemed relevant, its content is extracted, and the LLM summarizes the key information. Finally, the script presents a consolidated summary of the findings, along with the source URLs.
- SearXNG Instance: A running SearXNG instance is required for web searching. See the SearXNG Docker Compose for an easy setup.
- Local LLM: A local LLM endpoint is needed for query formulation, relevance assessment, and summarization. This project was tested with Gemma 2 2B Q8.
- Python Dependencies: The Python scripts (`llm-python-chat.py`, `llm-python-file.py`) require the `openai` library.
- Utilities: The `llm-websearch.bash` script depends on `curl`, `htmlq`, `html2text`, and `pdf2txt`.
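Before running a search, it can help to confirm the command-line utilities are actually on your PATH. A small illustrative helper (not part of the project; `missing_tools` is a hypothetical name):

```python
import shutil

# Command-line tools llm-websearch.bash relies on.
REQUIRED_TOOLS = ["curl", "htmlq", "html2text", "pdf2txt"]

def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```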
- Install Dependencies:

  ```bash
  pip install openai
  sudo apt-get install curl htmlq html2text poppler-utils  # For Ubuntu
  ```

- Configure SearXNG Address: Modify the `llm-websearch.bash` script to point to your local SearXNG instance. The default is `http://searx.lan`.
- Configure LLM Endpoint: Modify the Python scripts (`llm-python-chat.py`, `llm-python-file.py`) to point to your local LLM endpoint. The default is `http://localhost:9090/v1`.
- Place Scripts in PATH: Ensure that the three scripts (`llm-python-chat.py`, `llm-python-file.py`, `llm-websearch.bash`) are in your system's PATH (e.g., `/usr/local/bin`).
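Since the endpoint is OpenAI-compatible, the request the Python scripts send presumably looks like a standard `/chat/completions` call. A minimal stdlib-only sketch of building such a request (the endpoint URL matches the default above; the model name is a placeholder assumption):

```python
import json
import urllib.request

LLM_ENDPOINT = "http://localhost:9090/v1"  # default from the setup step above

def build_chat_request(prompt, model="local-model"):
    """Build an OpenAI-compatible /chat/completions request (not sent here)."""
    payload = {
        "model": model,  # placeholder; many local servers ignore or map this
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{LLM_ENDPOINT}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it requires a running endpoint:
# with urllib.request.urlopen(build_chat_request("Hello")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```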
Run the `llm-websearch.bash` script with a search query as an argument:

```bash
llm-websearch.bash "Best Robot Vacuum of 2025"
```

The script will output a list of relevant URLs, descriptions, and summaries, followed by a final summary generated by the LLM.
- Query Formulation: The script uses `llm-python-chat.py` to ask the LLM to refine the user's search term into a search-engine-friendly phrase.
- SearXNG Query: The script queries the SearXNG instance with the formulated search phrase.
- Relevance Assessment: For each search result, the script uses `llm-python-chat.py` to determine whether the result's description suggests it is relevant to the original search term.
- Content Extraction and Summarization: If a result is deemed relevant, the script extracts its content using `curl` and `html2text` (or `pdf2txt` if it is a PDF), then uses `llm-python-file.py` to summarize the content.
- Final Summary: After processing all relevant results, the script uses `llm-python-file.py` to generate a final summary of all the extracted information.
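The control flow above can be sketched in Python with a stand-in `ask_llm` callable (the real scripts call a local LLM; here it is a parameter so the loop itself is visible; function and key names are illustrative, not taken from the project):

```python
def search_and_summarize(search_term, results, ask_llm):
    """Mirror llm-websearch.bash's flow: filter results for relevance,
    summarize each relevant page, then produce a final combined summary.

    `results` is a list of dicts with "url", "description", and "content"
    keys; `ask_llm` is any callable taking a prompt and returning text.
    """
    summaries = []
    for r in results:
        verdict = ask_llm(
            f"Is this relevant to '{search_term}'? {r['description']} Answer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            summary = ask_llm(f"Summarize: {r['content']}")
            summaries.append({"url": r["url"], "summary": summary})
    final = ask_llm(
        "Combine these summaries: " + " ".join(s["summary"] for s in summaries)
    )
    return summaries, final
```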
- Ensure SearXNG is Running: Verify that your SearXNG instance is running and accessible at the configured address.
- Check LLM Endpoint: Make sure your local LLM endpoint is running and accessible.
- Inspect Temporary Files: The script uses `/dev/shm/llm-websearch.txt` as a temporary file. Check this file for intermediate results and error messages.
- Examine Script Output: Pay close attention to the script's output for any error messages or unexpected behavior.
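To check the first two points quickly, a small reachability probe can be used (illustrative only; `endpoint_reachable` is not part of the project):

```python
import urllib.error
import urllib.request

def endpoint_reachable(url, timeout=3):
    """Return True if an HTTP request to `url` gets any response at all.

    An HTTP error status (404, 405, ...) still means the server is up,
    so it counts as reachable; only connection failures return False.
    """
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # server responded, just not with 2xx
    except (urllib.error.URLError, OSError):
        return False  # connection refused, DNS failure, timeout, ...

# Example: endpoint_reachable("http://searx.lan") or
#          endpoint_reachable("http://localhost:9090/v1")
```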
THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
