data science | Talking to Chatbots

Guess Age and Risk Profile From Net Worth Chart – SCBN Battle

Tagged with AI, chatbots, crypto, data science, finance, Generative AI, investing, multimodal AI, scbn, Vision Models, wealth. Last updated on January 30, 2025.

Today’s post covers an easy, practical use case for multimodal chatbots that I’ve been wanting to test and show here for quite some time. It also shows that we are still far from getting reliable ‘reasoning’ out of LLM tools once even fairly rudimentary visual elements, such as time series charts, are involved. I challenged ChatGPT 4o (as discussed in an earlier post, o1 in the web app still does not support image input), Gemini, and Claude to analyze a stacked area chart showing the evolution of an individual’s net worth. After several attempts and blatant hallucinations by all models, I share the best outputs which, as in most of the SCBN battles published on Talking to Chatbots, came from ChatGPT.

Predicting LMSYS Chatbot Arena Votes With the SCBN and RQTL Benchmarks

Tagged with AI, chatbots, coding, data science, python, scbn. Last updated on December 3, 2024.

Below is the notebook I submitted (late) to the LMSYS – Chatbot Arena Human Preference Predictions competition on Kaggle. It applies NLP techniques for text classification using popular Python libraries such as scikit-learn and TextBlob, along with my own fine-tuned versions of DistilBERT. The notebook introduces the first standardized version of the SCBN (Specificity, Coherency, Brevity, Novelty) quantitative scores for evaluating chatbot response performance. It also introduces a new benchmark for classifying prompts, named RQTL (Request vs Question, Test vs Learn), which aims to refine human choice predictions and provide context for the SCBN scores based on inferred user intent. You can check all the code, annotations, and charts in the Kaggle widget below. Explore and run the notebook …

Hiring Chatbots: Cloudflare’s Lava Lamps Interview Question

Tagged with business, chatbots, data science, technology, work. Last updated on October 7, 2024.

Introducing a new section in Talking to Chatbots: Hiring Chatbots. In this series of chats, we will conduct job interviews with chatbots (either a standard version of one of the most popular LLMs or a customized GPT or Character) and make them compete for the job. The chatbot with the best answer to our question wins and gets published. The remaining answers will be listed, ranked, and shared in a signature SCBN chatbot battle chart. In the SCBN scores, I value adherence to my prompt instructions: role-playing is important in this case, and it is an instruction most chatbots fail to follow. Are you a hiring manager or HR employee who has ever wondered if AI would replace …
