
Add BGPT REFUTE (scientific critique & epistemic calibration benchmark) #224

Open
connerlambden wants to merge 1 commit into THUDM:main from connerlambden:add-bgpt-refute

Conversation

@connerlambden

Adds REFUTE, an Apache-2.0-licensed benchmark hosted on Hugging Face for scientific critique and epistemic calibration on summaries of recent scientific papers. Includes Inspect AI and lm-eval integrations and a public leaderboard.

@connerlambden
Author

For reviewers: the integrations are at https://huggingface.co/datasets/BGPT-OFFICIAL/refute/tree/main/integrations, and the technical report is on the dataset page.
