QATestset - push to hub integration #2151

GTimothee · 2025-04-07T17:17:09Z

Description

The goal is to support Hugging Face's push_to_hub so that users can easily save, share, version, and reuse their datasets.

It is still a work in progress, but I would like to:

get your feedback and recommendations on what I have done so far (the implementation is mostly taken from https://github.com/MinishLab/vicinity/pull/58/files)
ask how I can get the LLM that is being used (see the snippet below), so that I can add it to the config (I've set a fake LLM name for the tests)

import giskard 
giskard.llm.set_llm_model("gpt-4o")

You can see what it looks like here: https://huggingface.co/datasets/GTimothee/qatestset_demo

Both push_to_hub and load_from_hub are working. You can try it yourself with:

from giskard.rag.testset import QATestset
dset = QATestset.load_from_hub("GTimothee/qatestset_demo")
dset.to_pandas().head()

Related Issue

None yet (to the best of my knowledge)

Type of Change

📚 Examples / docs / tutorials / dependencies update
🔧 Bug fix (non-breaking change which fixes an issue)
🥂 Improvement (non-breaking change which improves an existing feature)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
🔐 Security fix

Checklist

I've read the CODE_OF_CONDUCT.md document.
I've read the CONTRIBUTING.md guide.
I've written tests for all new methods and classes that I created.
I've written the docstring in Google format for all the methods and classes that I used.
I've updated the pdm.lock running pdm update-lock (only applicable when pyproject.toml has been modified)

davidberenstein1957

Hi, looking great. Left some small comments :)

giskard/rag/dataset_card_template.md

giskard/rag/testset.py

GTimothee · 2025-04-08T10:27:51Z

Hopefully I have addressed all your comments. I've contacted you about the LLM name defined in the config.

GTimothee · 2025-04-08T12:52:19Z

I am now fetching the LLM config dynamically to add it to the model card :)
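As a rough illustration of the idea above, here is a minimal sketch of filling a card template from an LLM config dictionary. The template text, the field names, and the render_card helper are all hypothetical; the actual giskard/rag/dataset_card_template.md and the code in testset.py may look quite different.

```python
from string import Template

# Hypothetical card template; the real dataset_card_template.md may differ.
CARD_TEMPLATE = Template(
    "---\n"
    "tags:\n"
    "- giskard\n"
    "---\n"
    "# Dataset card for $repo_id\n"
    "\n"
    "Generated with model: $model\n"
    "Number of questions: $num_questions\n"
)


def render_card(repo_id: str, llm_config: dict, num_questions: int) -> str:
    """Fill the template, falling back to 'unknown' when no model is configured."""
    return CARD_TEMPLATE.substitute(
        repo_id=repo_id,
        model=llm_config.get("model", "unknown"),
        num_questions=num_questions,
    )


# Illustrative values only; the question count is made up for the example.
card = render_card("user/qatestset_demo", {"model": "gpt-4o"}, 10)
print(card)
```

Reading the model name from a config dictionary at render time, rather than hard-coding it, is what lets the card reflect whichever LLM was set when the testset was generated.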

giskard/rag/testset.py

GTimothee · 2025-04-08T16:12:31Z

OK, so I found the issue. I had forgotten that instantiating a child class without implementing the parent class's abstract methods fails at instantiation time. I thought it would only fail when the method was called, hence the misunderstanding. So I implemented get_config for each LLM client and it works. :)
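The behaviour described above can be reproduced with Python's abc module. This is a minimal sketch: the class names below are hypothetical stand-ins, not giskard's actual client classes, and only the get_config method name comes from the discussion.

```python
from abc import ABC, abstractmethod


class LLMClient(ABC):
    """Hypothetical stand-in for a base client; not giskard's actual class."""

    @abstractmethod
    def get_config(self) -> dict:
        """Return the configuration to record alongside a dataset."""


class IncompleteClient(LLMClient):
    """Subclass that forgets to implement get_config."""


class CompleteClient(LLMClient):
    """Subclass that implements the abstract method."""

    def __init__(self, model: str) -> None:
        self.model = model

    def get_config(self) -> dict:
        return {"client_type": type(self).__name__, "model": self.model}


def try_instantiate(cls, *args):
    """Return the new instance, or the TypeError raised while creating it."""
    try:
        return cls(*args)
    except TypeError as exc:
        return exc


# Fails immediately, before any method is called.
failure = try_instantiate(IncompleteClient)
# Succeeds because get_config is implemented.
success = try_instantiate(CompleteClient, "gpt-4o")

print(type(failure).__name__)
print(success.get_config())
```

The failure happens in the constructor call itself, which is why every concrete client needs its own get_config once the method is declared abstract on the base class.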

GTimothee · 2025-04-08T21:38:15Z

I've written the tests and documentation. I just need to update the lock file because I added pytest-mock to the dependencies.
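For context on why a mocking library helps here: a push to the Hub needs network access and credentials, so unit tests usually replace the upload call with a mock. The sketch below uses the standard library's unittest.mock, which pytest-mock's mocker fixture wraps. HubUploader and TinyTestset are hypothetical, simplified classes and do not reflect the tests actually written in this PR.

```python
from unittest.mock import MagicMock


class HubUploader:
    """Hypothetical stand-in for the code that talks to the Hugging Face Hub."""

    def upload(self, repo_id: str, rows: list) -> str:
        raise RuntimeError("network access is not available in unit tests")


class TinyTestset:
    """Hypothetical, simplified testset; not giskard's QATestset."""

    def __init__(self, rows: list, uploader: HubUploader) -> None:
        self.rows = rows
        self.uploader = uploader

    def push_to_hub(self, repo_id: str) -> str:
        # Delegate the upload so that tests can swap in a mock.
        return self.uploader.upload(repo_id, self.rows)


def run_mocked_push() -> MagicMock:
    """Push through a mocked uploader and return the mock for inspection."""
    uploader = MagicMock(spec=HubUploader)
    uploader.upload.return_value = "https://huggingface.co/datasets/user/demo"
    testset = TinyTestset([{"question": "What is Giskard?"}], uploader)
    url = testset.push_to_hub("user/demo")
    assert url == uploader.upload.return_value
    return uploader


mocked_uploader = run_mocked_push()
# Verify the upload was requested exactly once with the expected arguments.
mocked_uploader.upload.assert_called_once_with(
    "user/demo", [{"question": "What is Giskard?"}]
)
```

With pytest-mock, the same replacement would typically be done through the mocker fixture inside a test function, with the patch undone automatically when the test ends.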

davidberenstein1957

Hi, looking good. Perhaps we should also check if everything has been added to API references.

docs/open_source/testset_generation/testset_generation/index.md

henchaves · 2025-04-10T11:53:52Z

Hello @GTimothee, thanks for this PR.

Could you rename the push_to_hub and load_from_hub methods to push_to_hf_hub and load_from_hf_hub? We also have a Giskard product called LLM Hub, so the current names might be misinterpreted.

GTimothee · 2025-04-11T04:41:15Z

I updated the names in the code, docs and tests 👍

GTimothee · 2025-04-28T17:26:48Z

I was able to reproduce the issue with pydantic v1. Simply downgrading litellm to an appropriate version makes the tests pass again, so I suggested a change in the workflow file.

giskard/rag/testset.py

GTimothee added 3 commits April 7, 2025 18:57

first draft

8580e8e

tmp change for test

75db905

tmp change for test

9f5e454

davidberenstein1957 self-requested a review April 8, 2025 07:28

davidberenstein1957 reviewed Apr 8, 2025

giskard/rag/dataset_card_template.md (resolved)

giskard/rag/testset.py (outdated, resolved)

giskard/rag/testset.py (outdated, resolved)

giskard/rag/testset.py (resolved)

update tags in template, litlle updates in the testset class

1ebc028

add get_config usage in llmclient

1f12a0c

davidberenstein1957 reviewed Apr 8, 2025

giskard/rag/testset.py (outdated, resolved)

fix little inconsistency with get_config

f054c3e

GTimothee added 9 commits April 8, 2025 18:53

add unit tests

bb656bd

update tests

be6e661

update tests

1d6cc7f

update tests

41acbce

test get_config for litellm

e007e91

fix test get_config for litellm

1e5a580

fix test get_config for other llm

632b939

fix test get_config for mistralllm

7b9d58d

add documentation

23fe214

davidberenstein1957 added the safe for build and Lockfile (temporary label to update pdm.lock) labels Apr 9, 2025

davidberenstein1957 reviewed Apr 9, 2025

docs/open_source/testset_generation/testset_generation/index.md (outdated, resolved)

GTimothee added 2 commits April 9, 2025 09:08

pdm lock updated

1c0edea

add API reference

2da300e

update method names + add logo to card template

b80d2b4

fix workflow for testing pydantic v1

4457516

davidberenstein1957 self-requested a review June 11, 2025 07:14

davidberenstein1957 approved these changes Jun 11, 2025

davidberenstein1957 requested review from henchaves and removed request for henchaves June 11, 2025 07:15

davidberenstein1957 added 3 commits June 11, 2025 09:16

Update QATestset.md

7763077

Update build-python.yml

2f62912

Delete pdm.lock

460fd27

davidberenstein1957 added the Lockfile and safe for build labels and removed the safe for build label Jun 11, 2025

henchaves added the safe for build and Lockfile labels and removed the safe for build and Lockfile labels Jun 11, 2025

update pdm.lock

eea7afd

henchaves removed the Lockfile label Jun 11, 2025

Merge branch 'main' into qatest_push_to_hub

085abe6

henchaves added safe for build and removed safe for build labels Jun 11, 2025

davidberenstein1957 added 3 commits June 11, 2025 12:53

Update testset.py

e7e4aaa

Update testset.py

86ebea5

Update testset.py

dc0596c

davidberenstein1957 reviewed Jun 11, 2025

giskard/rag/testset.py (outdated, resolved)

Update giskard/rag/testset.py

cb74007

davidberenstein1957 self-requested a review June 11, 2025 11:24

davidberenstein1957 approved these changes Jun 11, 2025

henchaves approved these changes Jun 11, 2025

henchaves merged commit ca94fea into Giskard-AI:main Jun 11, 2025
17 checks passed

3 participants
