The detailed results for each AI Chatbot test are on the individual test pages; here is an overview.

Results Overview

In 1st Place… Perplexity AI – Overall Score = 9.5 

Very good all round; the only negative points were a couple of errors in the answer text for Q6 and its follow-up question. It answered general questions and more specific ones, and handled nonsense questions and garbage data.

A much fuller experience, with answers containing more information, images, sources, and links than the others. A couple of errors in the answer text prevented a 10/10 score!

In 2nd Place… Microsoft Copilot – Overall Score = 7.5

Good responses, although sometimes a little slow to deliver them; it provides sources and good text and descriptions where relevant. Some accuracy errors and some poor UX/UI issues knock the score down a fair bit.

Better responses than some of the others, but accuracy errors and UX/UI issues keep the score at 7.5.

In 3rd Place… OpenAI ChatGPT – Overall Score = 6.5

Quick responses but often sparse answers. It provides text-only answers, with no sources, media, links, or ways to share. It also got three of the questions wrong, which is a bit worrying for accuracy.

A fairly average performance: sparse answers, with no sources or accompanying media or links. Getting three of the questions wrong is also poor, so it scraped into 3rd place with a 6.5 rating.

And, last but not least, in 4th Place… Google Gemini – Overall Score = 6

Where it could provide a response, some were good, although others were a bit sparse. The worst thing was that it couldn't answer some questions the other chatbots could. There were accuracy issues too, and some answers were a bit slow to arrive.

An average performance: some questions it couldn't answer, some it got wrong, and answers were sometimes slow to arrive, so it takes 4th place of 4.

Conclusion 

As the tests have shown, none of the AI Chatbots got it right every time, whereas a lot of the media would have you believe AI Chatbots are infallible and will take everyone's jobs in the not-too-distant future. I assume (and hope) the accuracy and reliability of these AI Chatbots will improve, as in many cases people may act on what they're told without any way of checking the answers (or, often, any interest in doing so).

Of the AI Chatbots I tested, Perplexity AI was easily the winner, so I'll mainly be using that in the future. I wouldn't mind if Apple bought Perplexity AI and subsumed it into their ecosystem, hopefully replacing Siri, but that's another story. (Recent news stories suggest Apple are actually in talks with Google about Gemini, and may also have talked with OpenAI.)


All the posts 

I have created a post for each AI Chatbot test to fully explore the results, a post showing the questions asked, and a Results post where I crown the winning AI Chatbot. You can jump straight to the results or read each test post via the links below.

Testing the AI Chatbots

Testing the AI Chatbots: Questions

Testing the AI Chatbots: Perplexity AI

Testing the AI Chatbots: OpenAI ChatGPT

Testing the AI Chatbots: Microsoft Copilot

Testing the AI Chatbots: Google Gemini

Testing the AI Chatbots: Results


Note: The heading images used in these posts were created via Bing Image Creator.

About my testing services: iOS App Testing / Android App Testing / Website Testing / AI Chatbot Testing