Find vulnerabilities in your code with CodeQL

In our new article, Marina Mulyukina delves into how GitHub's CodeQL leverages AI to find and fix code vulnerabilities. Learn how this tool enhances security, saves time, and improves code quality.

As of early 2024, GitHub has become the largest platform for code hosting, boasting over 100 million developers and more than 372 million repositories. This significant growth underscores GitHub's pivotal role in the global development community, enabling extensive collaboration and project management for developers worldwide.

Application security remains a top priority for developers. The 'shift left' strategy, which involves integrating security measures early in the development process, is crucial for maintaining both security and developer efficiency. GitHub supports this approach through tools like CodeQL and its marketplace, facilitating early detection and resolution of potential security issues.

What is GitHub code scanning?

Running security analysis on source code as soon as it is committed is a natural fit for a source code hosting platform. In 2019, GitHub acquired Semmle with the aim of incorporating its static code analysis technology into the platform. A year later, the technology was fully integrated into GitHub, made compatible with GitHub Actions (GitHub's continuous integration / continuous delivery mechanism) and GitHub Marketplace, and launched as GitHub code scanning.

Code scanning is free for public repositories and is a GitHub Advanced Security feature for GitHub Enterprise.

To enable it, go to the Security tab of your code repository:

[Image 1]

Set the CodeQL policy:

[Image 2]

Configure the settings your project needs:

[Image 3]

Select the alert severity for code scanning:

CodeQL and AI

When GitHub's code scanning detects a potential vulnerability or error in your code, it generates an alert within the repository. Once you correct the issue that triggered the alert, GitHub automatically closes the alert. 
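Alerts can also be read programmatically. The sketch below assumes GitHub's REST endpoint for listing a repository's code scanning alerts and only builds the request with Python's standard library. The helper names, the example owner and repository, and the token value are illustrative placeholders, not something taken from this article.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def build_alerts_request(owner, repo, token, state="open", severity=None):
    """Build (but do not send) a request that lists code scanning alerts.

    `owner`, `repo` and `token` are caller-supplied placeholders.
    """
    params = {"state": state}
    if severity is not None:
        params["severity"] = severity
    url = f"{API_ROOT}/repos/{owner}/{repo}/code-scanning/alerts?{urlencode(params)}"
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    }
    return Request(url, headers=headers)


def fetch_alerts(request):
    """Send the request and decode the JSON list of alerts (needs network access)."""
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical repository and token, shown only to illustrate the call shape.
    req = build_alerts_request("octo-org", "demo", "YOUR_TOKEN", severity="high")
    print(req.full_url)
```

Calling fetch_alerts(req) with a real token that is allowed to read security events would return the open alerts. Once the underlying issue is fixed and a new scan runs, the alert no longer appears in the open state.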

CodeQL uses a query language known as QL, which is an object-oriented logic programming language. This query language allows CodeQL to analyze and understand the structure and behavior of your code.

How CodeQL works:

  1. Database generation: CodeQL creates a database representation of your code.
  2. Query execution: Queries are run against this database to identify potential issues.
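The two steps above can be imitated in a few lines of Python. This is only a toy analogy, not how CodeQL is implemented: it uses the standard ast module to turn a small source snippet into rows of facts (the "database") and then runs one simple query over those rows. The function names and the sample snippet are made up for illustration.

```python
import ast

# Sample code to analyze (hypothetical, for illustration only).
SOURCE = '''
import sqlite3

def find_user(conn, name):
    query = "SELECT * FROM users WHERE name = '" + name + "'"
    return conn.execute(query)

def find_user_safe(conn, name):
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,))
'''


def build_database(source):
    """Step 1: turn source code into rows of facts about method calls."""
    tree = ast.parse(source)
    rows = []
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
                first_arg = node.args[0] if node.args else None
                rows.append({
                    "function": func.name,
                    "method": node.func.attr,
                    "line": node.lineno,
                    "first_arg_is_literal": isinstance(first_arg, ast.Constant),
                })
    return rows


def query_non_literal_execute(database):
    """Step 2: query the facts for execute() calls whose SQL is not a literal."""
    return [
        row for row in database
        if row["method"] == "execute" and not row["first_arg_is_literal"]
    ]


if __name__ == "__main__":
    for hit in query_non_literal_execute(build_database(SOURCE)):
        print(f"{hit['function']}: execute() called with a non-literal query")
```

The real tool builds a far richer relational model of the code and expresses queries in QL, but the split is the same: extract facts once, then ask many questions of them.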

[Image 5]

What can CodeQL detect?

CodeQL primarily identifies security vulnerabilities. It does not fix them itself, but it pinpoints where the issues lie, which makes it easier for developers to address the weaknesses.

Issues CodeQL can identify:

  • Injection flaws: Such as SQL injection and cross-site scripting (XSS) that allow attackers to inject malicious code.
    • SQL Injection: An attacker can manipulate an SQL query to execute unintended commands, potentially accessing or modifying database information. For example, a simple login form might be exploited to reveal user credentials.
    • Cross-Site Scripting (XSS): This vulnerability allows attackers to inject client-side scripts into web pages viewed by other users. This can lead to unauthorized actions, such as capturing user input or redirecting users to malicious sites. For instance, a comment section on a blog could be exploited to execute scripts in other users' browsers.
  • Insecure direct object references: Code that permits unauthorized access by directly referencing objects.
  • Other security vulnerabilities: Various common security issues depending on the programming language.
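To make the two injection flaws above concrete, here is a minimal sketch using only Python's standard library. The table layout, function names, and payloads are invented for this example. It shows the general pattern a scanner looks for (untrusted input concatenated into a query or into HTML) next to the usual fix.

```python
import html
import sqlite3


def make_db():
    """Create an in-memory database with two demo users (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def login_vulnerable(conn, name, password):
    # Vulnerable: user input is concatenated into the SQL text,
    # so the input can change the structure of the query.
    query = (
        "SELECT name FROM users WHERE name = '" + name
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchall()


def login_safe(conn, name, password):
    # Safer: placeholders keep the input as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ? AND password = ?",
        (name, password),
    ).fetchall()


def render_comment_vulnerable(comment):
    # Vulnerable: the comment is placed into the page unescaped.
    return "<p>" + comment + "</p>"


def render_comment_safe(comment):
    # Safer: HTML special characters are escaped before rendering.
    return "<p>" + html.escape(comment) + "</p>"


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print(login_vulnerable(conn, "alice", payload))  # matches rows without the password
    print(login_safe(conn, "alice", payload))        # matches nothing
    print(render_comment_safe("<script>alert(1)</script>"))
```

In the vulnerable login, the payload turns the WHERE clause into a condition that is true for every row, so the query returns users without a valid password. The parameterized version treats the same payload as an ordinary (wrong) password.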

Initially, CodeQL would flag potential issues for developers to investigate and fix. However, with the introduction of the Autofix feature, the process is enhanced by AI. This functionality allows the tool not only to detect problems but also to suggest appropriate code modifications to address them directly.

The Autofix feature combines CodeQL with GitHub Copilot, GitHub's AI-powered coding assistant. CodeQL scans the codebase for vulnerabilities, while Copilot uses machine learning to generate potential fixes. These suggestions are then presented to the developer for review and approval.

Vulnerability alerts in a GitHub repository are ranked by severity, from Low to Critical:

[Image 6]

You can fix the issue manually or use Autofix:

[Image 7]

Benefits of using GitHub code scanning

This innovative tool offers many benefits for developers and development teams:

  • Saves time: Automating routine fixes frees up valuable development time, allowing developers to focus on more complex coding challenges and innovative features.
  • Code quality: By automatically addressing errors and vulnerabilities, the tool helps maintain a higher overall code quality.
  • Stronger security: Proactive identification and rectification of security weaknesses significantly reduces the attack surface of applications, making them more secure.
  • Cost reduction: Streamlined development processes and improved code quality can lead to significant cost reductions in the long run.

Difference between CodeQL and other code analysis tools

| Criteria | CodeQL | SonarQube | Checkmarx |
| --- | --- | --- | --- |
| Analysis approach | Semantic code analysis using a query language to treat code as data | Static code analysis with continuous inspection | Static and dynamic code analysis |
| Language support | Supports multiple languages including C/C++, Java/Kotlin, JavaScript/TypeScript, and more | Supports 30+ languages including Java, C#, JavaScript, TypeScript, Python, and PHP | Supports 30+ languages including Java, C#, JavaScript, TypeScript, Python, and PHP |
| Vulnerability detection | High precision with detailed security queries | Broad coverage with an emphasis on code quality and security vulnerabilities | Comprehensive security vulnerability detection including OWASP Top 10 and SANS 25 |
| Customizability | Highly customizable with the ability to write custom queries | Customizable rules and profiles, extensible with plugins | Extensive customization options, allows custom queries and rules |
| Autofix capabilities | Offers AI-powered autofix for certain vulnerabilities | Provides suggestions but limited autofix capabilities | Limited autofix capabilities, focuses more on detection and reporting |
| Learning curve | Moderate to high; requires understanding of its query language (QL) | Moderate; user-friendly interface with extensive documentation | Moderate; extensive documentation and training materials available |
| Cost | Free for public repositories on GitHub; paid plans for private repositories and enterprise features | Open-source with paid options for enterprise features | Commercial product with different pricing tiers based on the size and needs of the organization |
| Link | CodeQL | SonarQube | Checkmarx |

Conclusion

I started using CodeQL a few months ago and quickly realized how useful and user-friendly this tool is. As someone who prioritizes code protection and aims to keep my code free of vulnerabilities, CodeQL has been invaluable. Its ability to analyze code and identify potential security issues helps ensure that my projects are clean and secure.

One feature I particularly appreciate is Autofix, which simplifies the developer's job by automatically suggesting fixes for detected vulnerabilities. However, if you prefer to handle fixes manually, CodeQL accommodates that as well, giving you the flexibility to address issues in your own way.

Overall, I would highly recommend CodeQL to anyone who is serious about cybersecurity and wants a reliable tool for maintaining robust code security.