Conversation

@AshAnand34
Contributor

This pull request introduces a new MultiModalModel class in llmware/models.py, designed to support multi-modal models that handle multiple data types, such as text and images. It includes methods for preprocessing, postprocessing, and running inference across different backend model types: PyTorch, ONNX, OpenVINO, TensorFlow, and GGUF.

Related Issue: #1025

New MultiModalModel class implementation:

  • Class overview: Added the MultiModalModel class to manage multi-modal models, with attributes for model name, type, and optional preprocessors and postprocessors.
  • Preprocessing and postprocessing: Introduced methods add_preprocessor, add_postprocessor, preprocess, and postprocess to handle data transformations for specific data types.
  • Inference logic: Implemented the inference method to preprocess inputs, run the model, and postprocess outputs. The _run_model method supports multiple model types, including PyTorch, ONNX, OpenVINO, TensorFlow, and GGUF.
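To make the design above concrete, here is a minimal illustrative sketch of how such a class could be structured. This is not the actual llmware implementation: the method names follow the PR description, but the constructor signature, the registry dictionaries, and the backend dispatch in `_run_model` are assumptions for illustration only.

```python
# Illustrative sketch only -- not the llmware implementation.
# Wraps a backend model and routes inputs/outputs through
# per-data-type preprocessors and postprocessors.

class MultiModalModel:

    def __init__(self, model_name, model_type, model=None):
        self.model_name = model_name
        self.model_type = model_type      # e.g. "pytorch", "onnx", "gguf"
        self.model = model                # underlying backend object (assumed)
        self.preprocessors = {}           # data_type -> callable
        self.postprocessors = {}          # data_type -> callable

    def add_preprocessor(self, data_type, fn):
        self.preprocessors[data_type] = fn

    def add_postprocessor(self, data_type, fn):
        self.postprocessors[data_type] = fn

    def preprocess(self, data_type, data):
        fn = self.preprocessors.get(data_type)
        return fn(data) if fn else data

    def postprocess(self, data_type, data):
        fn = self.postprocessors.get(data_type)
        return fn(data) if fn else data

    def _run_model(self, inputs):
        # The real class would dispatch on self.model_type
        # (pytorch / onnx / openvino / tensorflow / gguf);
        # here we just invoke the model if it is callable.
        if callable(self.model):
            return self.model(inputs)
        raise ValueError(f"unsupported model type: {self.model_type}")

    def inference(self, inputs):
        # inputs: dict mapping data_type -> raw data, e.g. {"text": "..."}
        processed = {dt: self.preprocess(dt, d) for dt, d in inputs.items()}
        raw_out = self._run_model(processed)
        return {dt: self.postprocess(dt, d) for dt, d in raw_out.items()}


# Usage sketch with a stand-in "model" that uppercases text:
m = MultiModalModel("demo", "pytorch",
                    model=lambda x: {"text": x["text"].upper()})
m.add_preprocessor("text", str.strip)
m.add_postprocessor("text", lambda s: s + "!")
out = m.inference({"text": "  hello "})
# out["text"] -> "HELLO!"
```

A dict-based registry keyed on data type keeps the class open for extension: adding support for a new modality means registering two callables, not subclassing.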

This addition makes the codebase more flexible and extensible for handling multi-modal machine learning models.

@doberst
Contributor

doberst commented Jun 3, 2025

@AshAnand34 - love it .... great stuff - has been on our to-do list for a while - so really appreciate this contribution. Please give me a couple of days to go through some testing - and may add some conforming elements with other model classes. Well done!

@doberst doberst merged commit eab5000 into llmware-ai:main Jun 18, 2025
@doberst
Contributor

doberst commented Jun 18, 2025

@AshAnand34 - nice work .... we will build on this multi modal class further - it is a good contribution. 👍

