Predicting LMSYS Chatbot Arena Votes With the SCBN and RQTL Benchmarks
Below is the notebook I submitted (late) to the LMSYS – Chatbot Arena Human Preference Predictions competition on Kaggle. This notebook applies NLP techniques for classifying text with popular Python libraries such as scikit-learn and TextBlob, and my own fine-tuned versions of Distilbert. The notebook introduces the first standardized version of SCBN (Specificity, Coherency, Brevity, Novelty) quantitative scores for evaluating chatbot response performance. Additionally, I introduced a new benchmark for classifying prompts named RQTL (Request vs Question, Test vs Learn), which aims to refine human choice predictions and provide context for the SCBN scores based on inferred user intent. You can check all the code, annotations, and charts in the Kaggle widget below.
Explore and run the notebook on Kaggle
Introduction to the SCBN benchmark
SCBN chatbot battles on Talking to Chatbots