
Hyun-Ryu/Arguinas

 
 


Argument Reconstruction as Supervision for Critical Thinking in LLMs


Implementation of GAAR (Generalized Automatic Argument Reconstruction) and the Arguinas (Argument reconstruction) dataset, as presented in our paper:
Argument Reconstruction as Supervision for Critical Thinking in LLMs
by Hyun Ryu*1,2, Gyouk Chu*2, Gregor Betz3, Eunho Yang2, Carolyn Rosé†1, and Sean Welleck†1
1 Language Technologies Institute, Carnegie Mellon University; 2 Graduate School of AI, Korea Advanced Institute of Science & Technology; 3 Department of Philosophy, Karlsruhe Institute of Technology. *Equal Contribution, †Equal Advising

[Figure: Argument Reconstruction]

[Figure: GAAR]


🔔 Updates

  • [✔] (26.05.21) The models fine-tuned from Qwen3-4B-Base/Instruct and Qwen3-8B-Base have been released.
  • [✔] (26.04.21) The code implementation of GAAR and the Arguinas dataset are out.
  • [✔] (26.03.18) The paper is out on arXiv (arXiv:2603.17432).

🏆 Model & Data

All released models and datasets are gathered in our HuggingFace collection: ChuGyouk/Arguinas.

🔧 Environment Setup

Follow the steps below to set up your environment:

  1. Create a Python virtual environment, e.g. with Conda:
conda create -n arguinas python=3.12 && conda activate arguinas
  2. Install dependencies:
pip install -r requirements.txt
  3. Configure API keys:

Copy .env.example to .env and fill in your keys:

cp .env.example .env

Then edit .env:

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-proj-...

You only need to set the key(s) for the model family you intend to run (Anthropic for Claude models, OpenAI for GPT models).
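
As a quick pre-flight check, the sketch below reports which key a given model name would need and whether it is set. The prefix-to-variable mapping and the helper names `required_key` and `key_is_set` are assumptions for illustration, not part of this repository. It only inspects `os.environ`, so it assumes the values from .env have already been loaded into the environment.

```python
import os

# Assumed mapping from model-name prefix to the environment variable it needs.
# The prefixes are illustrative; the repository may resolve providers differently.
KEY_BY_PREFIX = {
    "claude": "ANTHROPIC_API_KEY",
    "gpt": "OPENAI_API_KEY",
}


def required_key(model_name: str) -> str:
    """Return the environment variable assumed to be needed for model_name."""
    for prefix, env_var in KEY_BY_PREFIX.items():
        if model_name.lower().startswith(prefix):
            return env_var
    raise ValueError(f"No known API key for model: {model_name!r}")


def key_is_set(model_name: str, env=None) -> bool:
    """Check whether the required key is present and non-empty in env."""
    env = os.environ if env is None else env
    return bool(env.get(required_key(model_name), "").strip())
```

Calling `key_is_set("claude-sonnet-4-5-20250929")` before launching the pipeline gives an early failure signal instead of an authentication error partway through a run.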

🚀 Usage

Run the pipeline with default arguments:

python run_GAAR.py

This is equivalent to:

python run_GAAR.py \
  --data_path ./data/Sample \
  --data_filename sample.json \
  --use_general_reconstruction True \
  --use_specific_reconstruction False \
  --save_path ./output \
  --prompt_path ./prompts/GAAR \
  --subset sample \
  --model_name claude-sonnet-4-5-20250929 \
  --max_num_recon 10 \
  --max_num_debug 5 \
  --max_attempts 5

Outputs are written to ./output/reconstruction_<subset>_<model_name>.json.
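
To pick up the results programmatically, the output path can be rebuilt from the same three arguments. This is a minimal sketch: `output_file` and `load_output` are hypothetical helpers, and nothing is assumed about the structure of the JSON beyond it being valid JSON.

```python
import json
from pathlib import Path


def output_file(save_path: str, subset: str, model_name: str) -> Path:
    """Build the output path following the naming pattern stated above."""
    return Path(save_path) / f"reconstruction_{subset}_{model_name}.json"


def load_output(save_path: str, subset: str, model_name: str):
    """Load the pipeline output; the JSON structure itself is not assumed here."""
    with output_file(save_path, subset, model_name).open(encoding="utf-8") as f:
        return json.load(f)
```

With the default arguments, `output_file("./output", "sample", "claude-sonnet-4-5-20250929")` points at the sample output file shipped in the repository.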

🏋️ Data

Our train and test Arguinas datasets live in data/. See data/README.md for the full data format (top-level columns, fallacy_info, sections, etc.).

Expected input format for run_GAAR.py

run_GAAR.py only reads three fields from each entry in the input JSON:

| Field | Type | Description |
|---|---|---|
| title | string | The debate topic. |
| background | string | Background context ("None" if absent). |
| argument | string | The raw argument text to reconstruct. |

See data/Sample/sample.json for a minimal working example, and output/reconstruction_sample_claude-sonnet-4-5-20250929.json for a corresponding sample output produced by the pipeline.

To run on your own data, place a JSON file with the same schema under any directory and point --data_path / --data_filename to it.
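
A custom input file could be prepared along these lines. The sketch assumes the top-level JSON value is a list of entries (check data/Sample/sample.json to confirm); the helper names and the example entry are made up for illustration.

```python
import json
from pathlib import Path

# The three fields that run_GAAR.py reads from each entry.
REQUIRED_FIELDS = ("title", "background", "argument")


def validate_entries(entries) -> None:
    """Raise ValueError if any entry lacks a required string field."""
    for i, entry in enumerate(entries):
        for field in REQUIRED_FIELDS:
            if not isinstance(entry.get(field), str):
                raise ValueError(f"entry {i}: field {field!r} must be a string")


def write_input(entries, data_path: str, data_filename: str) -> Path:
    """Validate entries and write them as a JSON list under data_path."""
    validate_entries(entries)
    target_dir = Path(data_path)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / data_filename
    target.write_text(
        json.dumps(entries, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return target


# Made-up example entry; "None" marks an absent background, as described above.
entries = [
    {
        "title": "Should cities ban cars from downtown areas?",
        "background": "None",
        "argument": "Cars cause most downtown air pollution, "
                    "so banning them would improve public health.",
    }
]
```

After `write_input(entries, "./data/Custom", "custom.json")`, the pipeline can be pointed at the file with `--data_path ./data/Custom --data_filename custom.json`.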

📊 Prompts

All prompt templates used by each stage of the pipeline (fallacy detection, reconstruction, validity checking, streamlining, faithfulness checking, program debugging) live under prompts/GAAR/. Refer to these files to see or modify the instructions given to the LLM at each step.

Two reconstruction variants are provided:

  • General (reconstruction_general_*.txt) — classifies reasoning into 4 broad types (deductive / inductive / analogical / abductive).
  • Specific (reconstruction_60_types_*.txt) — classifies reasoning into 60 fine-grained Walton-style argumentation schemes.

Toggle between them with the --use_general_reconstruction / --use_specific_reconstruction flags.
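
The sketch below lists the template files for one variant and builds a matching command line. The glob patterns come from the list above. The assumption that exactly one flag should be `True` at a time is inferred from the default invocation and has not been verified against run_GAAR.py; `prompt_files` and `build_command` are hypothetical helpers.

```python
from pathlib import Path

# File-name patterns for each reconstruction variant, as listed above.
PATTERNS = {
    "general": "reconstruction_general_*.txt",
    "specific": "reconstruction_60_types_*.txt",
}


def prompt_files(variant: str, prompt_path: str = "./prompts/GAAR"):
    """Return the sorted template files for the given variant."""
    return sorted(Path(prompt_path).glob(PATTERNS[variant]))


def build_command(variant: str):
    """Build an argv list; assumes one flag is True and the other False."""
    if variant not in PATTERNS:
        raise ValueError(f"unknown variant: {variant!r}")
    return [
        "python", "run_GAAR.py",
        "--use_general_reconstruction", str(variant == "general"),
        "--use_specific_reconstruction", str(variant == "specific"),
    ]
```

The list returned by `build_command("specific")` can be passed to `subprocess.run` as is; all other arguments then fall back to their defaults.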

📚 BibTeX

If you find this repo useful for your research, please consider citing us:

@article{ryu2026argument,
  title={Argument Reconstruction as Supervision for Critical Thinking in LLMs},
  author={Ryu, Hyun and Chu, Gyouk and Betz, Gregor and Yang, Eunho and Ros{\'e}, Carolyn and Welleck, Sean},
  journal={arXiv preprint arXiv:2603.17432},
  year={2026}
}

✉️ Contact

If you have any questions or feedback, feel free to reach out:
