Explore the GDPVal dataset.
See the occupations, tasks, and requirements as they appear in the benchmark, including the supporting files and context.
GDPVal measures model performance on economically valuable, real-world tasks across 44 occupations in the 9 sectors that contribute most to U.S. GDP. Tasks are authored by experienced professionals and grounded in actual work products. A gold subset of 220 tasks is publicly available.
Read the blog post, review the paper, explore evals.openai.com, or browse the dataset on Hugging Face.