☂️ BoCoEL
Bayesian Optimization as a Coverage Tool for Evaluating Large Language Models
🤔 Why BoCoEL?
Evaluating large language models is expensive and slow, and modern datasets are gigantic. If only there were a way to select a meaningful subset of the corpus and still obtain a highly accurate evaluation...
Wait, sounds like Bayesian Optimization!
🚀 Features
- 🎯 Accurately evaluate large language models with just tens of samples from your selected corpus.
- 💂‍♂️ Uses the power of Bayesian optimization to select an optimal set of samples for the language model to evaluate.
- 💯 Evaluates the corpus on the model in addition to evaluating the model on the corpus.
- 🤗 Integration with Hugging Face transformers and datasets.
- 🧩 Modular design.
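To make the sample-selection idea above concrete, here is a minimal, standard-library-only sketch. It is not BoCoEL's API: `rbf`, `select_samples`, and the toy `embeddings` are hypothetical names made up for illustration, and the strategy shown (greedy maximum-variance selection under a Gaussian process prior) is a pure-exploration stand-in for a full Bayesian optimization loop, assuming each corpus entry has already been embedded as a vector.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two embedding vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def select_samples(embeddings, budget, noise=1e-6):
    """Greedily pick up to `budget` indices with the highest GP posterior variance.

    Hypothetical helper for illustration only; not part of BoCoEL.
    """
    n = len(embeddings)
    # Prior covariance between every pair of corpus entries.
    cov = [[rbf(embeddings[i], embeddings[j]) for j in range(n)] for i in range(n)]
    chosen = []
    for _ in range(min(budget, n)):
        # Pick the entry the GP is currently most uncertain about.
        pick = max(
            (i for i in range(n) if i not in chosen),
            key=lambda i: cov[i][i],
        )
        chosen.append(pick)
        # Condition the GP on observing `pick` (rank-one covariance update).
        col = [cov[i][pick] for i in range(n)]
        denom = col[pick] + noise
        for i in range(n):
            for j in range(n):
                cov[i][j] -= col[i] * col[j] / denom
    return chosen


# Toy corpus: two tight clusters of 2-D "embeddings" (made-up data).
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
]
chosen = select_samples(embeddings, budget=2)
print(chosen)  # one index from each cluster
```

With a budget of two, the sketch picks one entry per cluster: once an entry is observed, its neighbours' posterior variance collapses, so the next pick lands in the unexplored region. Only the chosen entries would then be sent to the language model for scoring.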
🚧 TODO: work in progress
- 📊 Visualization module of the evaluation.
- 🎲 Integration of alternative methods (random, k-medoids...) with the Gaussian process.
Bayesian Optimization
⬇️ Installation
I don't want optional dependencies:
pip install bocoel
Give me the full experience (all optional dependencies):
pip install "bocoel[all]"