This demo currently uses only the audio data (ASR) and no frame information. We will add audio+captions functionality in the near future, which should improve chapter generation by incorporating visual content.
Citation
@InProceedings{ventura25chapter,
title = {{Chapter-Llama}: Efficient Chaptering in Hour-Long Videos with {LLM}s},
author = {Lucas Ventura and Antoine Yang and Cordelia Schmid and G{\"u}l Varol},
booktitle = {CVPR},
year = {2025}
}
Note: If you encounter any errors with this demo, you can run the code locally using the following commands:
# Clone the repository
git clone https://github.com/lucas-ventura/chapter-llama.git
cd chapter-llama
# Install demo dependencies
python -m pip install -e ".[demo]"
# Launch the demo
python demo.py
If you find any issues, please report them on our GitHub repository.