harryden/bart-headline-generation


BART Headline Generation with LoRA

Fine-tuning experiment for generating news headlines from short article descriptions using facebook/bart-base and LoRA adapters.

This was built as a DAT410 Design of AI Systems course project at Chalmers University in Spring 2025 with Elvina Fahlgren. The project received 100/100, and the original course report is available at docs/report.pdf.

What This Project Shows

The project explores whether parameter-efficient fine-tuning can adapt a pretrained sequence-to-sequence model to headline generation under limited compute.

Key technical pieces:

  • Fine-tuning facebook/bart-base for description-to-headline generation
  • Parameter-efficient training with LoRA instead of updating all model weights
  • Hugging Face transformers, datasets, evaluate, and peft
  • Exploratory analysis of the HuffPost News Category Dataset
  • Training and evaluation in Google Colab on an A100 GPU
  • Scriptable training, evaluation, and inference entry points for reproducibility

Project Snapshot

| Area | Details |
|---|---|
| Task | Generate a headline from a short news description |
| Dataset | HuffPost News Category Dataset, about 209k articles |
| Base model | facebook/bart-base |
| Adaptation method | LoRA on BART attention projection layers |
| Trainable parameters | 442,368 of 139,862,784, about 0.32% |
| Training environment | Google Colab Pro, A100 GPU, fp16 |
| Batch size | 16 |
| Saved notebook output | Evaluation after epoch 5 |
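
The trainable-parameter figure can be sanity-checked with plain arithmetic. The sketch below assumes the standard bart-base shape (hidden size 768, six encoder layers, six decoder layers) and, as an assumption this README does not state, rank-8 adapters on two projections per attention block, for example query and value. That combination reproduces the 442,368 reported, but it is not the only one that does (rank 4 on four projections gives the same total), so the notebook remains the authority on the real settings.

```python
# Assumed bart-base shape; not read from the repository's config.
D_MODEL = 768
ENCODER_LAYERS = 6
DECODER_LAYERS = 6


def lora_param_count(rank: int, targets_per_attention: int) -> int:
    """Count LoRA parameters for adapters on square d_model x d_model projections."""
    # Each encoder layer has one attention block (self-attention);
    # each decoder layer has two (self-attention and cross-attention).
    attention_blocks = ENCODER_LAYERS + 2 * DECODER_LAYERS
    # One adapter adds two low-rank matrices: A (rank x d_model) and B (d_model x rank).
    params_per_adapter = 2 * rank * D_MODEL
    return attention_blocks * targets_per_attention * params_per_adapter


# Hypothetical setting: rank 8 on two projections per attention block.
trainable = lora_param_count(rank=8, targets_per_attention=2)
total = 139_862_784  # total reported in the Project Snapshot, adapters included

print(trainable)                           # 442368
print(f"{100 * trainable / total:.2f}%")   # 0.32%
```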

Results and Metric Note

The committed notebook reports:

eval_loss: 4.6797
eval_bleu: 0.5041
epoch: 5.0

Important: the notebook's compute_metrics function multiplies Hugging Face's raw BLEU value by 100 before returning it. That means the reported eval_bleu: 0.5041 corresponds to raw corpus BLEU of about 0.005.

This should not be read as 0.50 raw BLEU or 50% BLEU. The result is best understood as a working fine-tuning pipeline built under course constraints, not as a model optimized for production headline quality.
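
To make the scale explicit, the conversion can be written out as below. This is a minimal sketch: the function name is hypothetical and the factor of 100 comes from the description of compute_metrics above, not from reading the notebook code.

```python
def raw_bleu_from_reported(reported: float, scale: float = 100.0) -> float:
    """Undo the x100 scaling applied before the metric was logged."""
    return reported / scale


reported_eval_bleu = 0.5041  # value logged by the notebook, on a 0-100 scale
raw = raw_bleu_from_reported(reported_eval_bleu)

print(f"BLEU on the 0-100 scale: {reported_eval_bleu:.4f}")
print(f"BLEU on the 0-1 scale:   {raw:.6f}")
```

In other words, the model scores about half a BLEU point out of 100.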

For headline generation, BLEU is also a limited metric: many valid headlines can share little exact n-gram overlap with the reference. A stronger follow-up would add ROUGE, BERTScore, qualitative examples, and a reproducible evaluation script.
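
The n-gram point can be illustrated with the clipped n-gram precision that BLEU is built from. The two headlines below are invented for illustration and are not taken from the dataset or the model output. Tokenization is a plain whitespace split, which is simpler than what a real BLEU implementation does.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate: str, reference: str, n: int) -> float:
    """Clipped n-gram precision of candidate against a single reference."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


# Invented example: both headlines describe the same event.
reference = "Senate passes sweeping climate bill after marathon vote"
candidate = "Lawmakers approve major climate legislation following all-night session"

print(modified_precision(candidate, reference, 1))  # 0.125
print(modified_precision(candidate, reference, 2))  # 0.0
```

Only one of eight words matches and no bigram does. Because BLEU takes a geometric mean over n-gram orders, a zero at any order drives the unsmoothed score to zero even though the candidate is a reasonable headline.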

Repository Layout

.
|-- configs/
|   `-- default.json
|-- docs/
|   `-- report.pdf
|-- notebooks/
|   |-- 01_final_model.ipynb
|   `-- 02_load_and_explore.ipynb
|-- src/
|   |-- data.py
|   |-- evaluate.py
|   |-- infer.py
|   |-- metrics.py
|   |-- modeling.py
|   `-- train.py
|-- .gitignore
|-- requirements.txt
`-- README.md

Notebooks

Run these in the order listed (the numeric prefixes in the file names do not match the run order):

  1. notebooks/02_load_and_explore.ipynb explores the dataset, category distribution, word counts, and missing values.
  2. notebooks/01_final_model.ipynb loads BART, applies LoRA, preprocesses the dataset, trains, evaluates, and prints sample predictions.

Dataset

The dataset is not included in this repository.

Use the Kaggle "News Category Dataset" by Rishabh Misra. The notebooks expect the JSON file at:

/content/drive/MyDrive/News_Category_Dataset_v3.json

The training notebook also writes checkpoints to:

/content/drive/MyDrive/checkpoints
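
As a rough sketch of how the dataset file can be turned into training pairs: the code below assumes the Kaggle file is in JSON Lines format (one JSON object per line) with `headline` and `short_description` fields. The function name is hypothetical and this is not the repository's src/data.py.

```python
import json
from typing import Iterator


def iter_pairs(path: str) -> Iterator[tuple[str, str]]:
    """Yield (description, headline) pairs, skipping rows where either is empty."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Field names are assumptions about the Kaggle file layout.
            description = record.get("short_description", "").strip()
            headline = record.get("headline", "").strip()
            if description and headline:
                yield description, headline
```

Dropping rows with an empty description matters here because the description is the model input; a row without one gives the model nothing to condition on.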

Running the Project

The original run was done in Google Colab. The repository now also includes script entry points so the workflow can be rerun outside the notebooks.

Install dependencies:

pip install -r requirements.txt

The default config expects the Kaggle dataset at:

/content/drive/MyDrive/News_Category_Dataset_v3.json

To use a different location, edit data.dataset_path in configs/default.json.
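
The same edit can be scripted. This sketch assumes the dotted name data.dataset_path maps to a nested `"data": {"dataset_path": ...}` object in the JSON file; the helper name is hypothetical and the real layout of configs/default.json should be checked first.

```python
import json
from pathlib import Path


def set_dataset_path(config_file: str, dataset_path: str) -> dict:
    """Rewrite the dataset location in a JSON config and return the updated config."""
    path = Path(config_file)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumes data.dataset_path corresponds to nested JSON keys.
    config.setdefault("data", {})["dataset_path"] = dataset_path
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `set_dataset_path("configs/default.json", "data/News_Category_Dataset_v3.json")` would point the scripts at a local copy, where the second argument is whatever location the file actually lives at.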

Training

python -m src.train --config configs/default.json

This trains LoRA adapters with a fixed train/test split seed and writes checkpoints to:

checkpoints/bart-lora-headline
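
Fixing the split seed is what makes two runs evaluate on the same held-out rows. The sketch below shows the idea with the standard library only. It is a stand-in, not the code in src/train.py, which may instead rely on `Dataset.train_test_split(seed=...)` from datasets. The seed and test fraction are hypothetical values.

```python
import random


def split_indices(n_rows: int, test_fraction: float, seed: int) -> tuple[list[int], list[int]]:
    """Return (train_indices, test_indices) from a seeded shuffle."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the split independent of global RNG state.
    random.Random(seed).shuffle(indices)
    n_test = int(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


# Hypothetical seed and fraction, chosen only for the demonstration.
train_a, test_a = split_indices(1000, 0.1, seed=42)
train_b, test_b = split_indices(1000, 0.1, seed=42)

print(test_a == test_b)        # True: same seed, same held-out rows
print(len(train_a), len(test_a))  # 900 100
```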

Evaluation

Evaluate a trained adapter:

python -m src.evaluate \
  --config configs/default.json \
  --adapter-path checkpoints/bart-lora-headline/final

Omit --adapter-path to evaluate the base facebook/bart-base model as a baseline.

For a quick smoke test on a smaller subset:

python -m src.evaluate \
  --config configs/default.json \
  --adapter-path checkpoints/bart-lora-headline/final \
  --limit 100

The script writes aggregate metrics and sample predictions under results/.

Inference

Generate a headline from one description:

python -m src.infer \
  --config configs/default.json \
  --adapter-path checkpoints/bart-lora-headline/final \
  --text "A short news article description goes here."

Omit --adapter-path to generate with the base model.

Notebook Path

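  1. Download News_Category_Dataset_v3.json from Kaggle.
  2. Upload it to Google Drive so that it is available at /content/drive/MyDrive/News_Category_Dataset_v3.json once Drive is mounted in Colab.
  3. Run the exploration notebook, notebooks/02_load_and_explore.ipynb.
  4. Run the final model notebook, notebooks/01_final_model.ipynb, on a GPU runtime.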

Current Limitations

This repository currently preserves the course-project version of the work. Before treating it as a fully reproducible ML project, these items should be addressed:

  • Add markdown explanations inside the notebooks.
  • Rerun the new script pipeline and commit a clean metric table plus sample predictions.
  • Add BERTScore and a stronger baseline comparison.
  • Add an inference-only demo path.
  • Add a license and clearer dataset usage notes.

Authorship

Group project by Harry Denell and Elvina Fahlgren for DAT410, Chalmers University.
