I am downloading the GAIA 2023 Level 1 validation set with snapshot_download and load_dataset, and as far as I can tell this part works correctly. This submission is for Unit 4 of the Agent Course.
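from huggingface_hub import snapshot_download
from datasets import load_dataset

# Download the dataset repository to the local cache
data_dir = snapshot_download(
    repo_id="gaia-benchmark/GAIA",
    repo_type="dataset",
)
dataset = load_dataset(data_dir, "2023_level1", split="validation")
# Take the first 20 rows of the validation split
subset = dataset.select(range(20))
for item in subset:
    task_id = item.get("task_id")
    question_text = item.get("Question")
    file_name = item.get("file_name")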
The failures start when I run the first 20 questions: only 5 of the 20 task IDs are accepted as valid. When I isolated a single task, '935e2cff-ae78-4218-b3f5-115589b19dae', by filtering the dataset for that ID, the evaluation system reported:
'Submission received, but no valid/matching task IDs were found in the 1 answers provided.' This happened even though I was confident the answer was correct.
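
The error message is about task IDs rather than answer correctness, so one way to narrow this down is to compare the IDs taken from the first 20 dataset rows against whatever set of IDs the evaluator actually accepts. The sketch below is only an illustration: split_by_validity is a hypothetical helper, and accepted_ids holds placeholder values, since the set of IDs the evaluator expects does not appear in this post and would have to be obtained from the evaluation system itself.

```python
def split_by_validity(local_ids, accepted_ids):
    """Partition local task IDs into those the evaluator accepts and those it does not.

    Both arguments are iterables of task ID strings. Order of local_ids is preserved.
    """
    accepted = set(accepted_ids)
    matching = [tid for tid in local_ids if tid in accepted]
    unmatched = [tid for tid in local_ids if tid not in accepted]
    return matching, unmatched


# Placeholder data for illustration only. In practice local_ids would come from
# [item["task_id"] for item in subset], and accepted_ids from the evaluator.
local_ids = ["935e2cff-ae78-4218-b3f5-115589b19dae", "hypothetical-id-a"]
accepted_ids = ["hypothetical-id-a", "hypothetical-id-b"]

matching, unmatched = split_by_validity(local_ids, accepted_ids)
print(f"{len(matching)} of {len(local_ids)} local task IDs are accepted")
print("Not accepted:", unmatched)
```

If the unmatched list is large, that would suggest the evaluator scores a different selection of questions than the first 20 rows of the validation split, which would fit the 5-out-of-20 result. This is an assumption to check, not something confirmed here.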