Chapter-Llama

Efficient Chaptering in Hour-Long Videos with LLMs

GitHub | Project Page | Paper

CVPR 2025

This demo currently uses only the audio data (ASR transcripts), without frame information. We plan to add audio+captions support in the near future, which will improve chapter generation by incorporating visual content.

Citation

@InProceedings{ventura25chapter,
  title     = {{Chapter-Llama}: Efficient Chaptering in Hour-Long Videos with {LLM}s},
  author    = {Lucas Ventura and Antoine Yang and Cordelia Schmid and G{\"u}l Varol},
  booktitle = {CVPR},
  year      = {2025}
}

Note: If you encounter any errors with this demo, you can run the code locally using the following commands:

# Clone the repository
git clone https://github.com/lucas-ventura/chapter-llama.git
cd chapter-llama
# Install demo dependencies
python -m pip install -e ".[demo]"
# Launch the demo
python demo.py

If you find any issues, please report them on our GitHub repository.