Message: Submission received, but no valid/matching task IDs were found in the 1 answers provided. Score did not improve previous record, leaderboard not updated.

I am downloading the GAIA 2023 Level 1 validation dataset using snapshot_download and load_dataset, which I believe is the correct approach. This submission is for Unit 4 of the Agent Course.

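from datasets import load_dataset
from huggingface_hub import snapshot_download

# Download the GAIA dataset repository to the local cache
data_dir = snapshot_download(
    repo_id="gaia-benchmark/GAIA",
    repo_type="dataset",
)

# Load the Level 1 validation split and keep the first 20 questions
dataset = load_dataset(data_dir, "2023_level1", split="validation")
subset = dataset.select(range(20))
for item in subset:
    task_id = item.get("task_id")
    question_text = item.get("Question")
    file_name = item.get("file_name")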

When I run the first 20 questions, only 5 of the task IDs are accepted as valid. When I isolated task ID '935e2cff-ae78-4218-b3f5-115589b19dae' using the filtering method and submitted it on its own, the evaluation system reported:

'Submission received, but no valid/matching task IDs were found in the 1 answers provided.' This occurred even though I was confident the answer was correct.
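One way to narrow this down is to compare the task IDs in a submission against the set of IDs the evaluator is known to accept before submitting. The sketch below is only an illustration: `partition_answers` is a hypothetical helper, the `task_id` / `submitted_answer` payload shape is an assumption, and the IDs are placeholders, not real GAIA task IDs. How the accepted IDs are obtained (for example, from the evaluator's own question list) is not shown here.

```python
from typing import Iterable


def partition_answers(
    answers: list[dict], accepted_ids: Iterable[str]
) -> tuple[list[dict], list[dict]]:
    """Split answers into those whose task_id is accepted and those that are not.

    Hypothetical helper: the payload keys are an assumption about the
    submission format, not a confirmed API.
    """
    accepted = set(accepted_ids)
    matched = [a for a in answers if a.get("task_id") in accepted]
    unmatched = [a for a in answers if a.get("task_id") not in accepted]
    return matched, unmatched


# Placeholder data for illustration only
accepted_ids = ["id-a", "id-b"]
answers = [
    {"task_id": "id-a", "submitted_answer": "42"},
    {"task_id": "id-z", "submitted_answer": "Paris"},
]

matched, unmatched = partition_answers(answers, accepted_ids)
print("matched:", [a["task_id"] for a in matched])
print("unmatched:", [a["task_id"] for a in unmatched])
```

If every answer lands in `unmatched`, the submission would contain no matching task IDs regardless of whether the answers themselves are correct, which would be consistent with the message above.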

Message: Submission received, but no valid/matching task IDs were found in the 1 answers provided. Score did not improve previous record, leaderboard not updated. #623
