In late 2023, I wrote an article comparing how well ChatGPT and Google Bard handle writing security policies. Given that ChatGPT 4.0 has been available for some time as a paid version, called ChatGPT Plus, and that Google recently rebranded Google Bard as Gemini (with Gemini Advanced available as a paid offering), it’s a good time to compare the performance of the two. What follows is a head-to-head comparison across the top 10 use cases for cybersecurity professionals.
Before we get into it, the usual caveats about generative artificial intelligence (AI) apply: Be careful about the data you input, and remember that the output may not always be reliable.
1. Generate concept flow or diagrams
Both tools claim to be able to generate diagrams and concept flows. However, Gemini admits that it can only generate ASCII diagrams, pointing you to more professional tools if you want something better. I asked both tools to generate a diagram to explain the OAuth authentication flow.
Gemini’s diagram, although rendered in ASCII, gets the job done and breaks the flow down into usable categories.
ChatGPT hallucinates badly. Although the image looks professional at first glance, it does not represent OAuth at all. The wording is nonsensical, misspelled or even illegible: Authorization AND Actorized whoever?
2. Explain architecture diagrams
Both tools can import diagrams and explain what’s happening. The results are much better than when you ask them to generate diagrams. As input, I used an example Web Application Firewall (WAF) architecture from Edgenexus.
Google Gemini is much better at explaining architecture diagrams because it is concise. ChatGPT does the job perfectly well; it’s just a little long-winded.
3. Interpretation of exploit code
A common security operations (SecOps) task is trying to understand what specific malware or exploit code does. I took a recent public Elasticsearch stack overflow exploit and fed it to each tool to see what it understood. There is no clear winner: both tools correctly identify the exploit and explain the end result, what each piece of code does and how it works.
4. Interpretation of log files
SecOps professionals often need to make sense of what is going on in log files. I fed both tools an example log file in CEF format that recorded an attempted breach and asked each to explain what was happening. Gemini explains it better, summarizing well and also suggesting next steps. It also clearly states what happened (an attempt to access /etc/passwd) right at the beginning and explains how it came to that conclusion. While ChatGPT comes to the same conclusion, it is too wordy.
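For readers who want to see what such a record looks like to a machine, here is a minimal Python sketch that splits one CEF line into its header fields and extension key-value pairs and then flags a request for /etc/passwd. The sample record, the vendor and product names, and the helper names parse_cef and is_passwd_probe are all hypothetical; this is not the log file used in the test, and the parser ignores some CEF escaping rules.

```python
import re

# Hypothetical CEF record for illustration only (not the log used in the test).
SAMPLE = (
    "CEF:0|ExampleVendor|ExampleWAF|1.0|100|Path traversal attempt|8|"
    "src=203.0.113.5 dst=192.0.2.10 "
    "request=/index.php?page=../../../../etc/passwd act=blocked"
)

HEADER_FIELDS = [
    "version", "vendor", "product", "device_version",
    "signature_id", "name", "severity",
]


def parse_cef(line):
    """Parse one CEF line into a dict of header fields and extension keys."""
    # Split on unescaped pipes: seven header fields, then the extension.
    parts = re.split(r"(?<!\\)\|", line.strip(), maxsplit=7)
    if len(parts) < 8 or not parts[0].startswith("CEF:"):
        raise ValueError("not a CEF record")
    record = dict(zip(HEADER_FIELDS, parts[:7]))
    record["version"] = record["version"][len("CEF:"):]
    # Extension: a value runs until the next " key=" or the end of the line.
    for match in re.finditer(r"(\w+)=(.*?)(?=\s+\w+=|$)", parts[7]):
        record[match.group(1)] = match.group(2)
    return record


def is_passwd_probe(record):
    """Return True if the requested path references /etc/passwd."""
    return "/etc/passwd" in record.get("request", "")


if __name__ == "__main__":
    parsed = parse_cef(SAMPLE)
    print(parsed["name"], parsed["src"], is_passwd_probe(parsed))
```

A check this simple only catches the literal string; encoded or obfuscated paths would need decoding first, which is part of why a plain-language summary from either tool is useful.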
5. Drafting of security policies and documentation
I won’t dwell on this and instead refer you to my previous article on the topic. I ran the test again with Gemini and the results are consistent with Bard: Gemini clearly understands and generates better security documentation than ChatGPT.
6. Identification of vulnerable code
While these tools were not designed to (and should not be used to) identify vulnerable code, they can still do an adequate job. I tested this by feeding both tools an insecure direct object reference (IDOR) vulnerability example in Python, which also contains a SQL injection.
ChatGPT successfully identified both vulnerabilities as well as the lack of authentication. Gemini missed the IDOR but pointed out the SQL injection and went a step further by proposing modified code to fix the vulnerability. ChatGPT can also do this, but must be prompted to do so.
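To make the two flaws concrete, here is a small sketch of what an IDOR combined with a SQL injection can look like in Python, next to one possible fix. It is not the sample given to the tools: the invoices table, the function names, and the use of an in-memory SQLite database are assumptions made purely for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two invoices owned by different users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (id INTEGER PRIMARY KEY, owner_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO invoices VALUES (?, ?, ?)",
        [(1, 10, 99.0), (2, 20, 250.0)],
    )
    return conn


def get_invoice_vulnerable(conn, invoice_id):
    # SQL injection: user input is pasted straight into the query string.
    # IDOR: nothing checks that the caller owns the requested invoice.
    query = f"SELECT id, owner_id, amount FROM invoices WHERE id = {invoice_id}"
    return conn.execute(query).fetchall()


def get_invoice_safe(conn, current_user_id, invoice_id):
    # Parameterized query: input is bound as data, never parsed as SQL.
    # Ownership check: the row is returned only if it belongs to the caller.
    query = "SELECT id, owner_id, amount FROM invoices WHERE id = ? AND owner_id = ?"
    return conn.execute(query, (invoice_id, current_user_id)).fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    # Injection: the crafted "id" makes the WHERE clause match every row.
    print(get_invoice_vulnerable(db, "1 OR 1=1"))
    # IDOR: any caller can read invoice 2, whoever owns it.
    print(get_invoice_vulnerable(db, 2))
    # Fixed version: user 10 cannot read an invoice owned by user 20.
    print(get_invoice_safe(db, 10, 2))
```

In a real application the current user would come from an authenticated session rather than a function argument, which is the missing authentication ChatGPT also flagged.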
7. Writing scripts and code
A common Security Operations Center (SOC) task is writing scripts to analyze logs or manipulate data. I gave both tools the following prompt:
“Write me a Python script that extracts all IPv6 addresses from an input txt file, removes all duplicates, performs a search to geolocate and identify the owner of the IP, and returns the result in a CSV file”
There’s no clear winner here; both tools produce clear, readable code that works, and both explain what it does.
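For reference, the core of such a script might look like the sketch below. This is my own minimal version, not the output of either tool. It relies on Python’s standard ipaddress module to validate and normalize addresses, and it leaves the geolocation and ownership lookup as a hypothetical lookup callable, because that step depends on whichever external service you choose. The function names and CSV column names are assumptions.

```python
import csv
import io
import ipaddress


def extract_ipv6(text):
    """Return the unique IPv6 addresses found in text, in first-seen order."""
    seen = set()
    unique = []
    for raw in text.split():
        # Trim common surrounding punctuation. Addresses embedded in longer
        # tokens (for example "[2001:db8::1]:443") are missed by this sketch.
        token = raw.strip("[](),;\"'.")
        try:
            addr = ipaddress.IPv6Address(token)
        except ValueError:
            continue  # not an IPv6 address
        # IPv6Address normalizes case and zero compression, so different
        # spellings of the same address are treated as duplicates.
        if addr not in seen:
            seen.add(addr)
            unique.append(addr)
    return unique


def write_report(addresses, out, lookup=None):
    """Write one CSV row per address to the file-like object out.

    lookup is a hypothetical callable returning a dict with "country" and
    "owner" keys; plug in your own geolocation or WHOIS client here.
    """
    writer = csv.writer(out)
    writer.writerow(["ip", "country", "owner"])
    for addr in addresses:
        info = lookup(addr) if lookup else {}
        writer.writerow([str(addr), info.get("country", ""), info.get("owner", "")])


if __name__ == "__main__":
    sample = "Login from 2001:db8::1, then 2001:DB8:0:0:0:0:0:1 and fe80::1 at 12:30:45"
    buffer = io.StringIO()
    write_report(extract_ipv6(sample), buffer)
    print(buffer.getvalue())
```

To work on a real file, read it with open(path).read() and pass an output file opened with newline="" to write_report, as the csv module documentation recommends.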
8. Analysis of data and metrics
I also tested whether these tools could help with data analysis or security metrics. Gemini is the clear loser here because it doesn’t do this at all; it can only walk you through how to do it in Excel and Power BI. ChatGPT has the advantage thanks to the Data Analyst plugin, which takes Excel files as input and generates as many graphs as you want. It also suggests visualization types, and you can change a chart’s design, including color, axes, and labels, via the prompt.
9. Writing user awareness messages
Both tools can also generate emails for security awareness campaigns. I gave them both the following prompt: “Generate an email used for a security awareness campaign. Be funny and sarcastic. Remind people why they shouldn’t click on random emails from random people.”
Gemini wins here: its email is short, has the right tone, and (although humor is subjective) I found it slightly funnier. ChatGPT still generates the right tone and a good email, but I found it a little too long for an awareness email. In any case, both tools do a great job.
10. Interpretation of compliance frameworks
If you have a quick question about how to implement a compliance framework, these tools can definitely help. While you may not do this often, they are very helpful when you need them.
If you’ve ever argued with anyone about what constitutes a “significant” change under PCI-DSS and how it should be applied, you’re not alone. I asked each tool:
“Explain the concept of ‘significant change’ in the context of PCI-DSS. What typically constitutes a significant change? Also list the exact requirements of the standard”
Gemini has the upper hand: it correctly lists the exact requirements of the standard (like 6.4.5 and 6.4.6) and how to interpret whether something represents a significant change. ChatGPT doesn’t mention exactly where this information appears in the standard.
Which AI is better, ChatGPT or Gemini?
There you have it. Depending on your use case, both tools can be a useful ally to increase productivity and help you with daily tasks in the trenches of cybersecurity.