Contact GitHub support about this user’s behavior. Learn more about reporting abuse.
大语言模型评估平台,支持多种评估基准、自定义数据集和性能测试。支持基于自定义数据集的RAG评估。
Python 73 12
人机对话轮次检测模型,有效解决声学VAD等待过长的问题
Python 11 3
There was an error while loading. Please reload this page.