Searching for information is the quintessential misconception about LLMs being helpful or improving other existing technology. In my opinion, web search is a more effective way to find information, simply because search engines give the user what they want faster, and in a format that fits the purpose much better than a chatting tool: multiple sources to scroll through in a purpose-built user interface, including filter options, configuration settings, listed elements, excerpts, tabulated results, whatever you get in that particular web search tool… not the “I’m here to assist you, let’s delve into the intricacies of the ever-evolving landscape of X…” followed by a long perfectly composed paragraph based on a probabilistic model you would typically get when you send a prompt to ChatGPT asking for factual information about a topic named ‘X’.

Oil painting style digital artwork generated with Stable Diffusion depicting a figure resembling Joan of Arc at the stake. The figure, dressed in silver, wears over-ear headphones and uses a laptop with a blank screen. She extends her right hand towards vibrant orange flames, evoking the historic scene of martyrdom with a modern twist, inspired by lyrics from The Smiths' "Bigmouth Strikes Again" referencing Joan of Arc and a melting Walkman. [Alt text by ALT Text Artist GPT]

Disco music was famous for introducing technological advances in music production, such as synthesizers and electric pianos. I guess those were among the reasons why people saw it as lacking the “authenticity” of earlier musical genres. It’s hard to grasp the human concept of “authenticity” most people have in their psyche. I don’t believe there is any rationality in it, especially when it comes to discerning things that are deemed “authentic” versus things that are not. For me, this stems from a mysterious, probably instinctive, resistance in most people’s psyches to adopting new technologies or accepting new scientific discoveries. It comes down to the internal confrontations between beliefs and reality:

Photo of the band Built to Spill performing live at Primavera Sound 2010, with the festival's logo in the background.

For over 20 years, I’ve been logging the bands I’ve seen live in a spreadsheet. Now I have fun playing with stats. This playlist features one song from each artist, sorted by Artist popularity and number of followers. As you scroll down the list, it will be more and more unlikely that you’ve ever heard the song, so maybe it’s a fun discovery adventure for you as it was a fun coding experiment for me. As of March 2024, there are 461 songs in the playlist. The artist popularity metric is obtained from the Spotify API. Sorting all 461 songs …

Bands and Artists I’ve Seen Live, Sorted by Spotify Popularity Read more »

Introducing Erudite Chatbot, a distant relative of The Meme Erudite GPT who benefits from the fairly “uncensored” Mistral models available on Hugging Face’s new Assistants feature. “Erudite Chatbot, the pinnacle of artificial intelligence. Supremely intelligent, effortlessly discerning, unmatched in wisdom. Patronizingly schooling humanity.”

Black and white line drawing generated by the ControlNet Canny model, depicting a woman in a wetsuit holding a surfboard on the beach. A text from CLIP Interrogator, describing an image, is superimposed over the drawing.

The intense competition in the chatbot space is reflected in the ever-increasing number of contenders in the LMSYS Chatbot Arena leaderboard, or in my modest contribution with the SCBN Chatbot Battles I’ve introduced in this blog and complete as time allows. Today we’re exploring WildVision Arena, a new project in Hugging Face Spaces that brings vision-language models to contend. The mechanics of WildVision Arena are similar to those of LMSYS Chatbot Arena. It is a crowd-sourced ranking based on people’s votes, where you can enter any image (plus an optional text prompt), and you will be presented with two responses from two different models, keeping the names of the models hidden until you vote by choosing the answer that looks better to you. I’m sharing a few examples of what I’m testing so far, and we’ll end this post with a traditional ‘SCBN’ battle where I will evaluate the vision-language models based on my use cases.

Microsoft has just announced the launch of its own ‘GPT Builder’ for customizing chatbots, similar to OpenAI’s ‘GPT Store’. This was part of a broader announcement of Copilot Pro, a premium AI-powered service for Microsoft 365 users to enhance productivity, code, and text writing. According to Satya Nadella’s announcement today on Threads, Microsoft and OpenAI appear to be competing entities, yet they are working on the same technology (GPT), augmented by Microsoft’s investment in OpenAI. It certainly seems like a strange business strategy for Microsoft. Please provide some insight into the move’s rationale and strategic motivations.

My name is David and I’m from Madrid, Spain. I think talking more about myself on my own website would be pretentious, so I asked a generative AI chatbot to say a few words about me. You can see the chat screen captures and the text below. If you want to know more or contact me, follow my social updates, visit the Content section, and read and comment on the Blog. 😎 What can you infer about me based on the questions I’ve made so far? 🤖 That’s an interesting question. Based on the questions you’ve made so far, I …

About the Author (by Bing Chat, June 2023) Read more »

Introducing a new section in Talking to Chatbots: Hiring Chatbots. In this series of chats, we will be conducting job interviews with chatbots – either a standard version of the most popular LLMs or a customized GPT or Character – and we’ll make them compete for the job. The chatbot with the best answer to our question wins and gets published. The rest of the answers will be listed, ranked, and shared in a signature SCBN chatbot battle chart. With the SCBN scores, I value adherence to my prompt instructions (role-playing is important in this case, which most chatbots fail …

Hiring Chatbots: Cloudflare’s Lava Lamps Interview Question Read more »

This website is written in English and Spanish, powered by TranslatePress WordPress plugin and Google Cloud Translation API: For automatic translations into other languages, you can use the Google Translate widget below: I am sharing the HTML code below for website owners who might find it helpful. PROTIP: ask a chatbot for help if you want to customize the languages or the main functionality of the widget: There are other professional and affordable alternatives for AI-powered website translations, such as Weglot, which I have also used on my websites in the past. I will update this page with more details …

Notes on Website Translation Read more »

AI will only be a threat to humanity the day it grasps self-deprecating humor. I said that in a previous blog post, and I maintain it. AI will never be a threat to humanity, and I firmly believe only humanity can be a threat to humanity, as I also discussed with some of my favorite language models in an older blog post. However, as someone who could call himself a technologist, I’m sensitive to the fact that all forms of technology are a subject of fear, uncertainty, and doubt for individuals and societies. That’s not a new topic, and I’ve already talked a lot about AI ethics and cognitive biases on this blog, so today’s story is not particularly novel in terms of sharing and discussing my own ideas, but a good excuse to share more chats, as well as the latest GPT that I’ve built with OpenAI’s fun and promising chatbot customization tool: The Meme Erudite.

💻 Themes: AI 🤖 Chatbots: ChatGPT ⚙️ Prompt engineering: Coding, Image Analysis [Related – Is Philosophy a Science? Introducing SCBN Chatbot Battles ] [GitHub repository] Tabulating and Formatting Text in HTML 😎 Suggest a simple HTML table format where I can include this information in a tabulated way, eliminating the hashtags of the social media post, and pulling the links in text (so the URLs are not shown). The table should be titled “Chatbot Battle: Is Philosophy a Science?”. I want to keep the emojis for the scoring system (i.e. 🤖🕹️🕹️), but the ranking represented by the emoji medals can …

HTML and Python Coding With ChatGPT Vision (Chatbot Battle GUI) Read more »

I just listened to Acquired’s latest episode, the interview with Charlie Munger, Berkshire Hathaway’s Vice Chairman and Warren Buffett’s longtime partner. I found the interview so insightful, inspiring, and, quite simply, fun to listen to, that I spent the day on it. I started simply by recording some of the sections that were more difficult to grasp for my non-native English listening skills and making the ChatGPT voice feature transcribe them. However, I ended up analyzing most of the interview highlights with OpenAI’s large language model, so I thought it would be worth sharing

💼🏦 Themes: Business, Finance 🤖 Chatbots: [ChatGPT] [Bard] ⚙️ Prompt engineering: Public Figures, Controversial Topics [Blog post] [Acquired Podcast interview]

Chatbot Battle: Charlie Munger, on Chuanfu and Musk

Chatbot | Rank (SCBN) | Specificity | Coherency | Brevity | Novelty | Link
Bard | 🥇 Winner | 🤖🤖🕹️ | 🤖🤖🕹️ | 🤖🤖🤖 | 🤖🕹️🕹️ | View Chat
ChatGPT | 🥈 Runner-up | 🤖🕹️🕹️ | 🤖🤖🕹️ | 🤖🕹️🕹️ | 🤖🤖🕹️ | View Chat

✍️ SCBN Chatbot battle Scores are only reflective of the question about BYD and Tesla’s CEOs. The chat with ChatGPT was longer and included many other prompts and high-quality answers used in the original blog post.

😎 Here is another interview fragment (transcribed by ChatGPT) in which …

Charlie Munger, on BYD and Tesla, Chuanfu and Musk, at Acquired Podcast Read more »

The SCBN (Specificity, Coherency, Brevity, Novelty) benchmark is a method to evaluate the output quality of language models and chatbots. SCBN provides a clear and systematic way to compare and assess chatbot responses based on four main metrics.

– Specificity (S): evaluates if a chatbot’s response is directly related to the user’s request. It checks how accurately the response addresses the prompt without deviating from the topic.
– Coherency (C): measures the logical structure of the response. It ensures that the information in the response is presented in a clear and organized manner, making it easy for the user to understand.