Skip to content

Conversation

@GTimothee
Copy link
Contributor

@GTimothee GTimothee commented Apr 7, 2025

Description

The goal is to support push_to_hub from Huggingface, so that users can easily save, share, version and reuse their datasets.

It is still a work in progress but I would like to:

  1. get your feedback and recommendations about what I have done so far (Implementation mostly taken from https://github.com/MinishLab/vicinity/pull/58/files)
  2. I wanted to ask how I can get the llm that is being used (See snippet below) ? So that I can add it to the config (I've set a fake llm name for tests)
import giskard 
giskard.llm.set_llm_model("gpt-4o")

You can see how it looks like here: https://huggingface.co/datasets/GTimothee/qatestset_demo

Both push_to_hub and load_from_hub are working. You can try it yourself with:

from giskard.rag.testset import QATestset
dset = QATestset.load_from_hub("GTimothee/qatestset_demo")
dset.to_pandas().head()

Related Issue

None yet (to the best of my knowledge)

Type of Change

  • 📚 Examples / docs / tutorials / dependencies update
  • 🔧 Bug fix (non-breaking change which fixes an issue)
  • 🥂 Improvement (non-breaking change which improves an existing feature)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 🔐 Security fix

Checklist

  • I've read the CODE_OF_CONDUCT.md document.
  • I've read the CONTRIBUTING.md guide.
  • I've written tests for all new methods and classes that I created.
  • I've written the docstring in Google format for all the methods and classes that I used.
  • I've updated the pdm.lock running pdm update-lock (only applicable when pyproject.toml has been
    modified)
@davidberenstein1957 davidberenstein1957 self-requested a review April 8, 2025 07:28
Copy link
Member

@davidberenstein1957 davidberenstein1957 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, looking great. Left some small comment :)

@GTimothee
Copy link
Contributor Author

Hopefully I implemented all your comments successfully. I've contacted you about the llm name defined in the config

@GTimothee
Copy link
Contributor Author

I am now fetching the llm config dynamically to add it in the model card :)

@GTimothee
Copy link
Contributor Author

GTimothee commented Apr 8, 2025

Ok so I found the issue. I did not remember that instantiating a child without implementing the parent class' abstract methods would result in a failure at instanciation time. I thought it was only when calling the method that it would fail, hence the misunderstanding. So I implemented the get_config for each llmclient and it works. :)

@GTimothee
Copy link
Contributor Author

I've written the tests and documentation, I just need to update the lock file because I added pytest-mock to the dependencies

@davidberenstein1957 davidberenstein1957 added safe for build Lockfile Temporary label to update pdm.lock labels Apr 9, 2025
Copy link
Member

@davidberenstein1957 davidberenstein1957 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, looking good. Perhaps we should also check if everything has been added to API references.

@henchaves
Copy link
Member

Hello @GTimothee, thanks for this PR.

Could you rename push_to_hub and load_from_hub methods to push_to_hf_hub and load_from_hf_hub? Because we also have a product in Giskard called LLM Hub, so these methods might be misinterpreted.

@GTimothee
Copy link
Contributor Author

I updated the names in the code, docs and tests 👍

@GTimothee
Copy link
Contributor Author

So I was able to reproduce the issue with pydantic v1, just downgrading litellm to an appropriate version makes the tests pass again. So I suggested a change in the workflow file.

@davidberenstein1957 davidberenstein1957 self-requested a review June 11, 2025 07:14
@davidberenstein1957 davidberenstein1957 requested review from henchaves and removed request for henchaves June 11, 2025 07:15
@davidberenstein1957 davidberenstein1957 added Lockfile Temporary label to update pdm.lock safe for build and removed safe for build labels Jun 11, 2025
@henchaves henchaves added safe for build Lockfile Temporary label to update pdm.lock and removed safe for build Lockfile Temporary label to update pdm.lock labels Jun 11, 2025
@henchaves henchaves removed the Lockfile Temporary label to update pdm.lock label Jun 11, 2025
@davidberenstein1957 davidberenstein1957 self-requested a review June 11, 2025 11:24
@henchaves henchaves merged commit ca94fea into Giskard-AI:main Jun 11, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants