@MukulLambat

Description

This PR updates Giskard’s RAG evaluation to be compatible with the async metrics API introduced in ragas>=0.3.x, while fixing the original RagasLLMWrapper instantiation issue.

ragas 0.3.x introduced two breaking changes that affected Giskard:

  1. BaseRagasLLM added a new abstract method is_finished(), which caused:

    TypeError: Can't instantiate abstract class RagasLLMWrapper with abstract method is_finished
    
  2. Metrics such as AnswerRelevancy, Faithfulness etc. moved to an async, sample-based API (SingleTurnSample + single_turn_ascore / multi_turn_ascore) instead of the old synchronous .score(dict) / .single_turn_score(...) API.
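
The first breaking change is ordinary Python abstract-class behaviour, and it can be reproduced without ragas installed. The sketch below is only an illustration: `BaseLLM`, `OldWrapper`, `FixedWrapper` and `can_instantiate` are hypothetical stand-ins, not the real `BaseRagasLLM` or `RagasLLMWrapper`.

```python
from abc import ABC, abstractmethod


class BaseLLM(ABC):
    """Hypothetical stand-in for an LLM base class that gained a new abstract method."""

    @abstractmethod
    def generate_text(self, prompt: str) -> str: ...

    @abstractmethod
    def is_finished(self, response: str) -> bool: ...


class OldWrapper(BaseLLM):
    """Written against the older interface, so is_finished() is missing."""

    def generate_text(self, prompt: str) -> str:
        return "ok"


class FixedWrapper(OldWrapper):
    """Adds the missing method, which makes the class concrete again."""

    def is_finished(self, response: str) -> bool:
        return True


def can_instantiate(cls) -> bool:
    """Return False if Python refuses to instantiate cls with a TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

The exact wording of the `TypeError` message differs between Python versions, so the helper checks only the exception type.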


What this PR does

  1. Fix LLM wrapper instantiation

    • Implements is_finished() in RagasLLMWrapper to conform to the new BaseRagasLLM interface.
    • The implementation inspects generation metadata / finish reasons so that RAGAS can correctly decide when a completion is done.
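
For illustration, a finish-reason check of the kind described above might look like the sketch below. It is not the code from this PR: `Generation` and `LLMResult` are simplified stand-ins for LangChain-style result objects, and the `DONE_REASONS` set and the "missing metadata counts as finished" rule are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed labels for "completed normally"; providers use differing values.
DONE_REASONS = {"stop", "STOP", "end_turn"}


@dataclass
class Generation:
    """Simplified stand-in for one generated completion."""
    text: str
    generation_info: Optional[dict] = None


@dataclass
class LLMResult:
    """Simplified stand-in: one list of generations per prompt."""
    generations: list[list[Generation]] = field(default_factory=list)


def is_finished(response: LLMResult) -> bool:
    """Return True only if no generation reports an abnormal finish reason."""
    for prompt_generations in response.generations:
        for gen in prompt_generations:
            info = gen.generation_info or {}
            reason = info.get("finish_reason")
            # Assumption: a backend that reports no reason is treated as finished.
            if reason is not None and reason not in DONE_REASONS:
                return False
    return True
```

Treating a missing reason as finished keeps backends that expose no metadata usable; a stricter wrapper could return False in that case instead.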
  2. Adopt the async metrics API with SingleTurnSample

    In RagasMetric.__call__:

    • We continue to build the existing ragas_sample dict via prepare_ragas_sample(...).

    • We then construct a SingleTurnSample:

      sample = SingleTurnSample(
          user_input=ragas_sample.get("user_input") or ragas_sample.get("question"),
          response=ragas_sample.get("response") or ragas_sample.get("answer"),
          retrieved_contexts=ragas_sample.get("retrieved_contexts") or ragas_sample.get("contexts"),
          reference=ragas_sample.get("reference") or ragas_sample.get("ground_truth"),
      )
    • Metric evaluation is now done via the async RAGAS API, using an internal coroutine:

      async def _compute_score():
          if hasattr(self.metric, "single_turn_ascore"):
              return await self.metric.single_turn_ascore(sample)
          elif hasattr(self.metric, "multi_turn_ascore"):
              return await self.metric.multi_turn_ascore(sample)
          else:
              raise AttributeError(
                  f"{self.metric} has neither single_turn_ascore nor multi_turn_ascore "
                  "— check ragas version or metric type."
              )
      
      loop = asyncio.get_event_loop()
      val = loop.run_until_complete(_compute_score())
      return {self.name: val}
    • This aligns Giskard with the current RAGAS metric interface, where metrics are evaluated asynchronously over SingleTurnSample (or multi-turn equivalents).

  3. Update tests to use AsyncMock

    • tests/rag/test_ragas_metrics.py is updated so that metric mocks reflect the new async API:

      • AsyncMock is used to mock single_turn_ascore in:

        • test_ragas_metric_computation_with_context
        • test_ragas_metric_computation
      • Tests still assert the same observable behaviour: RagasMetric.__call__ returns a dict like {"test": 0.5}.

    • Existing behaviour for the “missing context” case is preserved; the corresponding test continues to check that we return 0.0 and log a warning.
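
As a rough sketch of the mocking pattern in item 3, the example below uses `unittest.mock.AsyncMock` so that the mocked `single_turn_ascore` can be awaited. `compute_metric` is a hypothetical stand-in for `RagasMetric.__call__`, and the sample is a plain dict rather than a real `SingleTurnSample`.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


def compute_metric(metric, sample, name="test"):
    """Hypothetical stand-in for RagasMetric.__call__: awaits the async score."""

    async def _compute_score():
        return await metric.single_turn_ascore(sample)

    return {name: asyncio.run(_compute_score())}


# A plain Mock attribute would return a non-awaitable value;
# AsyncMock returns a coroutine that resolves to return_value.
metric = Mock()
metric.single_turn_ascore = AsyncMock(return_value=0.5)

sample = {"user_input": "q", "response": "a"}  # placeholder sample
result = compute_metric(metric, sample)

# AsyncMock records awaits, so the test can verify the call was awaited.
metric.single_turn_ascore.assert_awaited_once_with(sample)
```

Asserting on awaits rather than calls catches the case where the coroutine is created but never awaited.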


Behavioural impact

  • RAG evaluation now works again with ragas>=0.3.x, using the async metric API and SingleTurnSample.

  • The original error:

    Can't instantiate abstract class RagasLLMWrapper with abstract method is_finished
    

    is fixed by implementing is_finished() in the wrapper.

  • Metric computation is driven by single_turn_ascore / multi_turn_ascore, matching the current RAGAS documentation for legacy metrics.


Related Issue

This PR fixes #2218.


Type of Change

  • 🔧 Bug fix (non-breaking change which fixes an issue)
  • 🥂 Improvement (non-breaking change which improves an existing feature)

Checklist

  • I’ve read the CODE_OF_CONDUCT.md document.
  • I’ve read the CONTRIBUTING.md guide.
  • I’ve written tests (or updated existing ones) for the methods and classes I touched.
  • I’ve updated docstrings where necessary.
  • I’ve updated pdm.lock by running pdm update-lock (only applicable if pyproject.toml was modified).
@SurajBhar

@kevinmessiaen Please check the new PR with the updated changes! We have reverted the changes made to build-python.yml, so the file is back in its original state. Please cross-check it and let us know your feedback on this PR!

@MukulLambat MukulLambat mentioned this pull request Nov 20, 2025
11 tasks