Which model did you use to get the score? #2

Open

WenchaoWangLLMLearn opened on Jul 23, 2025

I tested the AME dataset with GPT-4o, and the performance was poor. Which model did you test with, and is the code in this repository a simplified version of what you used?
