
Commit 99a65b7 (1 parent 5adf678): Update README.md

1 file changed (+50, -40 lines)

README.md

Lines changed: 50 additions & 40 deletions
@@ -1,13 +1,13 @@
 # llmware
-![Static Badge](https://img.shields.io/badge/python-3.9_%7C_3.10%7C_3.11-blue?color=blue)
+![Static Badge](https://img.shields.io/badge/python-3.9_%7C_3.10%7C_3.11%7C_3.12-blue?color=blue)
 ![PyPI - Version](https://img.shields.io/pypi/v/llmware?color=blue)
 [![discord](https://img.shields.io/badge/Chat%20on-Discord-blue?logo=discord&logoColor=white)](https://discord.gg/MhZn5Nc39h)

 ## 🧰🛠️🔩The Ultimate Toolkit for Building LLM Apps

 From quickly building POCs to scalable LLM Apps for the enterprise, LLMWare is packed with all the tools you need.

-`llmware` is an integrated framework with over 50+ models in Hugging Face for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
+`llmware` is an integrated framework with 50+ models for quickly developing LLM-based applications, including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.

 This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications.

@@ -47,7 +47,7 @@ from llmware.prompts import Prompt
 models = ModelCatalog().list_all_models()

 # to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter
-my_model = ModelCatalog().load_model("llmware/bling-tiny-llama-v0")
+my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")
 output = my_model.inference("what is the future of AI?", add_context="Here is the article to read")

 # to integrate model into a Prompt
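For orientation, here is a minimal, self-contained sketch of the call pattern this hunk switches to. The context passage and question are illustrative placeholders, and the note about the response format is an assumption rather than something stated in this diff.

```python
from llmware.models import ModelCatalog

# load the GGUF-packaged BLING model that the README example now points to
model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")

# BLING models are RAG-tuned, so pass the evidence passage through add_context
context = ("The executive's base salary will be $250,000 per year, "
           "subject to annual review by the board.")
response = model.inference("What is the executive's base salary?", add_context=context)

# response is expected to contain the generated answer (e.g., under an 'llm_response' key)
print(response)
```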
@@ -64,7 +64,7 @@ response = prompter.prompt_main("what is the future of AI?", context="Insert Sou

 from llmware.library import Library

-# to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json)
+# to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)

 # step 1 - create a library, which is the 'knowledge-base container' construct
 # - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name})
@@ -80,8 +80,8 @@ lib.add_files("/folder/path/to/my/files")
 # to install an embedding on a library - pick an embedding model and vector_db
 lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500)

-# to add a second embedding to the same library (mix-and-match models + vector db)
-lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="faiss", batch_size=100)
+# to add a second embedding to the same library (mix-and-match models + vector db)
+lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)

 # easy to create multiple libraries for different projects and groups
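Putting the two library hunks together, a rough sketch of the end-to-end flow looks like the following. The library name and folder path are placeholders, and `create_new_library` is assumed to be the standard llmware constructor; it is not quoted from this diff.

```python
from llmware.library import Library

# create the 'knowledge-base container' and parse/chunk a folder of documents into it
lib = Library().create_new_library("agreements_demo")    # placeholder library name
lib.add_files("/folder/path/to/my/files")                # placeholder folder path

# install a first embedding, then a second one on the same library (mix-and-match)
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb", batch_size=500)
lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)
```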

@@ -176,7 +176,8 @@ prompt_history = prompter.get_current_history()
 <summary><b>RAG-Optimized Models</b> - 1-7B parameter models designed for RAG workflow integration and running locally. </summary>

 ```
-""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context """
+""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both
+PyTorch and GGUF versions. """

 import time
 from llmware.prompts import Prompt
@@ -387,22 +388,25 @@ if __name__ == "__main__":

 # list of 'rag-instruct' laptop-ready small bling models on HuggingFace

-model_list = ["llmware/bling-1b-0.1",                  # fastest + most popular
-              "llmware/bling-tiny-llama-v0",           # *** newest ***
-              "llmware/bling-1.4b-0.1",
-              "llmware/bling-falcon-1b-0.1",
-              "llmware/bling-cerebras-1.3b-0.1",
-              "llmware/bling-sheared-llama-1.3b-0.1",
-              "llmware/bling-sheared-llama-2.7b-0.1",
-              "llmware/bling-red-pajamas-3b-0.1",
-              "llmware/bling-stable-lm-3b-4e1t-v0"     # most accurate
-              ]
+pytorch_models = ["llmware/bling-1b-0.1",                  # most popular
+                  "llmware/bling-tiny-llama-v0",           # fastest
+                  "llmware/bling-1.4b-0.1",
+                  "llmware/bling-falcon-1b-0.1",
+                  "llmware/bling-cerebras-1.3b-0.1",
+                  "llmware/bling-sheared-llama-1.3b-0.1",
+                  "llmware/bling-sheared-llama-2.7b-0.1",
+                  "llmware/bling-red-pajamas-3b-0.1",
+                  "llmware/bling-stable-lm-3b-4e1t-v0",
+                  "llmware/bling-phi-3"                    # most accurate (and newest)
+                  ]

-# dragon models are 6-7B and designed for GPU use - but the GGUF versions run nicely on a laptop with at least 16 GB of RAM
-gguf_models = ["llmware/dragon-yi-6b-gguf", "llmware/dragon-llama-7b-gguf", "llmware/dragon-mistral-7b-gguf"]
+# Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM
+gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"]

-# try the newest bling model - 'tiny-llama' or load a gguf model
-bling_meets_llmware_hello_world(model_list[1])
+# try model from either pytorch or gguf model list
+# the newest (and most accurate) is 'bling-phi-3-gguf'
+
+bling_meets_llmware_hello_world(gguf_models[0])

 # check out the model card on Huggingface for RAG benchmark test performance results and other useful information
 ```
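To make the comparison concrete, the sketch below loops over a couple of entries from the new GGUF list and times a single inference call. The evidence passage, question, and timing wrapper are illustrative additions, not part of the README example.

```python
import time
from llmware.models import ModelCatalog

# illustrative evidence passage and question (not from the README example)
context = "The annual meeting of shareholders will be held on June 15, 2024 in Boston."
question = "When will the annual meeting be held?"

for model_name in ["bling-phi-3-gguf", "bling-stablelm-3b-tool"]:
    model = ModelCatalog().load_model(model_name)
    t0 = time.time()
    response = model.inference(question, add_context=context)
    print(f"{model_name} -> {response} ({time.time() - t0:.1f} seconds)")
```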
@@ -425,7 +429,7 @@ LLMWareConfig().set_vector_db("milvus")

 # for fast start - no installations required
 LLMWareConfig().set_active_db("sqlite")
-LLMWareConfig().set_vector_db("faiss")     # try also chromadb and lancedb
+LLMWareConfig().set_vector_db("chromadb")  # try also faiss and lancedb

 # for single postgres deployment
 LLMWareConfig().set_active_db("postgres")
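A short sketch of how the 'fast start' combination above is typically used before building a library. The `llmware.configs` import path, the `create_new_library` call, and the library name are assumptions for illustration rather than lines from this diff.

```python
from llmware.configs import LLMWareConfig
from llmware.library import Library

# select the no-install stores before creating any libraries
LLMWareConfig().set_active_db("sqlite")       # text collection database
LLMWareConfig().set_vector_db("chromadb")     # vector database (faiss and lancedb are alternatives)

# libraries created after this point will use the configured stores
lib = Library().create_new_library("fast_start_demo")    # placeholder name
```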
@@ -528,13 +532,13 @@ def contract_analysis_on_laptop (model_name):

     query_list = {"executive employment agreement": "What are the name of the two parties?",
                   "base salary": "What is the executive's base salary?",
-                  "governing law": "What is the governing law?"}
+                  "vacation": "How many vacation days will the executive receive?"}

     # Load the selected model by name that was passed into the function

     print (f"\n > Loading model {model_name}...")

-    prompter = Prompt().load_model(model_name)
+    prompter = Prompt().load_model(model_name, temperature=0.0, sample=False)

     # Main loop

@@ -556,7 +560,7 @@ def contract_analysis_on_laptop (model_name):

     # step 4 above - calling the LLM with 'source' information already packaged into the prompt

-    responses = prompter.prompt_with_source(value, prompt_name="just_the_facts", temperature=0.3)
+    responses = prompter.prompt_with_source(value, prompt_name="default_with_context")

     # step 5 above - print out to screen

@@ -579,8 +583,8 @@ def contract_analysis_on_laptop (model_name):

 if __name__ == "__main__":

-    # use local cpu model - smallest, fastest (use larger BLING models for higher accuracy)
-    model = "llmware/bling-tiny-llama-v0"
+    # use local cpu model - try the newest - RAG finetune of Phi-3 quantized and packaged in GGUF
+    model = "bling-phi-3-gguf"

     contract_analysis_on_laptop(model)
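As a companion to the contract-analysis changes above, here is a rough sketch of the deterministic load plus prompt-with-source pattern the commit moves to. The file name, folder path, and the `add_source_document` call and its arguments are assumptions drawn from llmware's prompt-with-sources examples, not lines from this diff.

```python
from llmware.prompts import Prompt

# load the model with deterministic settings, as in the updated example
prompter = Prompt().load_model("bling-phi-3-gguf", temperature=0.0, sample=False)

# attach a document as the evidence source (method and arguments assumed for illustration)
prompter.add_source_document("/folder/path/to/my/files", "executive_employment_agreement.pdf",
                             query="vacation")

# ask the question against the packaged source, using the 'default_with_context' prompt
responses = prompter.prompt_with_source("How many vacation days will the executive receive?",
                                        prompt_name="default_with_context")

for response in responses:
    # each response is expected to carry the generated answer (e.g., an 'llm_response' field)
    print(response)
```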

@@ -590,15 +594,15 @@ if __name__ == "__main__":

 ## 🔥 What's New? 🔥

--**Web Services with Agent Calls for Financial Research** - end-to-end scenario - [video](https://youtu.be/l0jzsg1_Ik0?si=hmLhpT1iv_rxpkHo) and [example](examples/SLIM-Agents/web_services_slim_fx.py)
+-**Web Services with Agent Calls for Financial Research** - end-to-end scenario - [video](https://youtu.be/l0jzsg1_Ik0?si=hmLhpT1iv_rxpkHo) and [example](examples/Use_Cases/web_services_slim_fx.py)

--**Voice Transcription with WhisperCPP** - fast, accurate local transcription of voice files - [example](examples/Models/using-whisper-cpp-getting-started.py)
+-**Voice Transcription with WhisperCPP** - [getting_started](examples/Models/using-whisper-cpp-getting-started.py), [using_sample_files](examples/Models/using-whisper-cpp-sample-files.py), and [analysis_use_case](examples/Use_Cases/parsing_great_speeches.py)

 -**Small, specialized, function-calling Extract Model** - introducing slim-extract - [video](https://youtu.be/d6HFfyDk4YE?si=VB8JTsN3X7hsB_I) and [example](examples/SLIM-Agents/using_slim_extract_model.py)

 -**LLM to Answer Yes/No questions** - introducing slim-boolean model - [video](https://youtu.be/jZQZMMqAJXs?si=7HpkLqG39ohgNecx) and [example](examples/SLIM-Agents/using_slim_boolean_model.py)

--**Natural Language Query to CSV End to End example** - using slim-sql model - [video](https://youtu.be/z48z5XOXJJg?si=V-CX1w-7KRioI4Bi) and [example](examples/SLIM-Agents/text2sql-end-to-end-2.py) and now using Custom Tables on Postgres [example](https://github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/agent_with_custom_tables.py)
+-**Natural Language Query to CSV End to End example** - using slim-sql model - [video](https://youtu.be/z48z5XOXJJg?si=V-CX1w-7KRioI4Bi) and [example](examples/SLIM-Agents/text2sql-end-to-end-2.py) and now using Custom Tables on Postgres [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py)

 -**Multi-Model Agents with SLIM models** - multi-step Agents with SLIMs on CPU - [video](https://www.youtube.com/watch?v=cQfdaTcmBpY) - [example](examples/SLIM-Agents)

@@ -621,9 +625,11 @@ if __name__ == "__main__":

 ## 🔥 Top New Examples 🔥

-End-to-End Scenario - [**Function Calls with SLIM Extract and Web Services for Financial Research**](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/web_services_slim_fx.py)
+End-to-End Scenario - [**Function Calls with SLIM Extract and Web Services for Financial Research**](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/web_services_slim_fx.py)
+Analyzing Voice Files - [**Great Speeches with LLM Query and Extract**](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py)
 New to LLMWare - [**Fast Start tutorial series**](https://github.com/llmware-ai/llmware/tree/main/fast_start)
-SLIM Examples - [**SLIM Models**](examples/SLIM-Agents/)
+Getting Setup - [**Getting Started**](https://github.com/llmware-ai/llmware/tree/main/examples/Getting_Started)
+SLIM Examples - [**SLIM Models**](examples/SLIM-Agents/)

 | Example | Detail |
 |-------------|--------------|
@@ -632,9 +638,9 @@ SLIM Examples - [**SLIM Models**](examples/SLIM-Agents/)
 | 3. Hybrid Retrieval - Semantic + Text ([code](examples/Retrieval/dual_pass_with_custom_filter.py)) | Using 'dual pass' retrieval to combine best of semantic and text search |
 | 4. Multiple Embeddings with PG Vector ([code](examples/Embedding/using_multiple_embeddings.py) / [video](https://www.youtube.com/watch?v=Bncvggy6m5Q)) | Comparing Multiple Embedding Models using Postgres / PG Vector |
 | 5. DRAGON GGUF Models ([code](examples/Models/dragon_gguf_fast_start.py) / [video](https://www.youtube.com/watch?v=BI1RlaIJcsc&t=130s)) | State-of-the-Art 7B RAG GGUF Models. |
-| 6. RAG with BLING ([code](examples/RAG/contract_analysis_on_laptop_with_bling_models.py) / [video](https://www.youtube.com/watch?v=8aV5p3tErP0)) | Using contract analysis as an example, experiment with RAG for complex document analysis and text extraction using `llmware`'s BLING ~1B parameter GPT model running on your laptop. |
-| 7. Master Service Agreement Analysis with DRAGON ([code](examples/RAG/msa_processing.py) / [video](https://www.youtube.com/watch?v=Cf-07GBZT68&t=2s)) | Analyzing MSAs using DRAGON YI 6B Model. |
-| 8. Streamlit Example ([code](examples/Getting_Started/ui_without_a_database.py)) | Upload pdfs, and run inference on llmware BLING models. |
+| 6. RAG with BLING ([code](examples/Use_Cases/contract_analysis_on_laptop_with_bling_models.py) / [video](https://www.youtube.com/watch?v=8aV5p3tErP0)) | Using contract analysis as an example, experiment with RAG for complex document analysis and text extraction using `llmware`'s BLING ~1B parameter GPT model running on your laptop. |
+| 7. Master Service Agreement Analysis with DRAGON ([code](examples/Use_Cases/msa_processing.py) / [video](https://www.youtube.com/watch?v=Cf-07GBZT68&t=2s)) | Analyzing MSAs using DRAGON YI 6B Model. |
+| 8. Streamlit Example ([code](examples/UI/simple_rag_ui_with_streamlit.py)) | Ask questions of invoices through a simple Streamlit UI running local inference. |
 | 9. Integrating LM Studio ([code](examples/Models/using-open-chat-models.py) / [video](https://www.youtube.com/watch?v=h2FDjUyvsKE&t=101s)) | Integrating LM Studio Models with LLMWare |
 | 10. Prompts With Sources ([code](examples/Prompts/prompt_with_sources.py)) | Attach wide range of knowledge sources directly into Prompts. |
 | 11. Fact Checking ([code](examples/Prompts/fact_checking.py)) | Explore the full set of evidence methods in this example script that analyzes a set of contracts. |
@@ -757,7 +763,7 @@ git clone git@github.com:llmware-ai/llmware.git

 - 💡 Making it easy to deploy fine-tuned open source models to build state-of-the-art RAG workflows
 - 💡 Private cloud - keeping documents, data pipelines, data stores, and models safe and secure
-- 💡 Model quantization, especially GGUF, and democratizing the game-changing use of 7B CPU-based LLMs
+- 💡 Model quantization, especially GGUF, and democratizing the game-changing use of 1-7B CPU-based LLMs
 - 💡 Developing small specialized RAG optimized LLMs between 1B-7B parameters
 - 💡 Industry-specific LLMs, embedding models and processes to support core knowledge-based use cases
 - 💡 Enterprise scalability - containerization, worker deployments and Kubernetes
@@ -775,11 +781,15 @@ Questions and discussions are welcome in our [github discussions](https://github

 ## 📣 Release notes and Change Log

-**Wednesday, May 1 - v0.2.12-WIP Update**
-- Working on support for Python 3.12 -> will deprecate faiss and replace with 'no-install' chromadb in Fast Start examples
+**Sunday, May 5 - v0.2.12-WIP Update**
+- Launched ["bling-phi-3"](https://huggingface.co/llmware/bling-phi-3) and ["bling-phi-3-gguf"](https://huggingface.co/llmware/bling-phi-3-gguf) in ModelCatalog - newest and most accurate BLING/DRAGON model
+- New long document summarization method using slim-summary-tool [example](https://github.com/llmware-ai/llmware/tree/main/examples/Prompts/document_summarizer.py)
+- New Office (Powerpoint, Word, Excel) sample files [example](https://github.com/llmware-ai/llmware/tree/main/examples/Parsing/parsing_microsoft_ir_docs.py)
+- Added support for Python 3.12
+- Deprecated faiss and replaced with 'no-install' chromadb in Fast Start examples
 - Refactored Datasets, Graph and Web Services classes
 - Updated Voice parsing with WhisperCPP into Library
-- Changes merged into main branch in repo - will be released as pypi 0.2.12 version targeted by Friday, May 3 EOD
+- Changes merged into main branch in repo - will be released as pypi 0.2.12 version targeted by Monday, May 6

 **Monday, April 29 - v0.2.11 Update**
 - Updates to gguf libs for Phi-3 and Llama-3
@@ -789,7 +799,7 @@ Questions and discussions are welcome in our [github discussions](https://github
 - Improved CUDA detection on Windows and safety checks for older Mac OS versions

 **Monday, April 22 - v0.2.10 Update**
-- Updates to Agent class to support Natural Language queries of Custom Tables on Postgres [example](https://github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/agent_with_custom_tables.py)
+- Updates to Agent class to support Natural Language queries of Custom Tables on Postgres [example](https://github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py)
 - New Agent API endpoint implemented with LLMWare Inference Server and new Agent capabilities [example](https://github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents/agent_api_endpoint.py)

 **Tuesday, April 16 - v0.2.9 Update**
