CI/CD Security

Cards (80)

  • CI/CD Pipelines:
  • Source Code Storage
    We need to consider several things when deciding where to store our code:
    • How can we perform access control for our source code?
    • How can we make sure that changes made are tracked?
    • Can we integrate our source code storage system with our development tools?
    • Can we store and actively use multiple different versions of our source code?
    • Should we host our source code internally, or can we use an external third party to host our code?
  • Version Control
    We need version control for two main reasons:
    • We are often integrating new features in our software. Modern development approaches, such as Agile, mean we are constantly updating our code. To keep all of these updates in check, we need version control.
    • An entire development team is working on the code, not just one developer. To ensure that we can integrate the changes from multiple developers, version control is required.
  • Source Code & Version Control Tools:
    • The two most common source code storage and version control systems are Git and Subversion (SVN). Git is a distributed source control tool, meaning that each contributor has their own copy of the source code. On the other hand, SVN is a centralised source control tool, meaning the repository is managed centrally.
    • For SVN, the two most popular tools are TortoiseSVN and Apache SVN.
  • Source Code Security Considerations
    • We want to make sure it is not exposed. This is why authentication and access control for our source code is so important. We also want to make sure that changes and updates are adequately tracked, allowing us to always go back to a previous version if something happens. 
    • Source code cannot be fully secret since developers need access to it. As such, we should be careful not to confuse source code storage with secret management. We need to make sure not to store secrets, such as database connection strings and credentials, in our source code.  
  • Git Never Forgets
    • "Git never forgets." Code is "committed" to a Git repo. When this happens, Git determines the changes made to the files and creates a new version based on these changes. Any user with access to the repo can look at historical commits and the changes that were made.
    • What can often happen is a developer accidentally commits secrets such as credentials or database connection strings to a Git repo. Realising their mistake, they delete the secrets and create another commit. However, the repo will now have both commits.
  • Git Never Forgets
    • If an attacker got access to the repo, they could use a tool such as GittyLeaks, which would scan through the commits for sensitive information. Even if this information no longer exists in the current version, these tools can scan through all previous versions and uncover these secrets.
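    The idea behind tools like GittyLeaks can be sketched in a few lines. This is a minimal, hypothetical scanner, not the real tool: it applies a couple of illustrative regex rules to the full `git log -p` history output, where both the commit that added a secret and the commit that deleted it still appear.

```python
import re

# Patterns that commonly indicate committed secrets (illustrative, not exhaustive)
SECRET_PATTERNS = [
    re.compile(r"(?i)password\s*=\s*\S+"),
    re.compile(r"(?i)api[_-]?key\s*=\s*\S+"),
]

def scan_history(diff_text: str) -> list[str]:
    """Scan the output of `git log -p` for lines that look like secrets.

    Lines that were later deleted still appear in the history as `+` or `-`
    diff lines, which is why removing a secret in a new commit is not enough.
    """
    findings = []
    for line in diff_text.splitlines():
        for pattern in SECRET_PATTERNS:
            if pattern.search(line):
                findings.append(line.strip())
                break
    return findings

# Simulated `git log -p` output: commit 1 adds the secret, commit 2 removes it
history = """
commit aaa111
+DB_PASSWORD = hunter2

commit bbb222
-DB_PASSWORD = hunter2
"""

print(scan_history(history))  # both the added and the "deleted" line are found
```

Both diff lines are flagged, showing that the "fix" commit did not remove the secret from history; that requires rewriting history (and rotating the credential).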
  • Dependencies:
    • Although we might think that we are writing a large amount of code when we develop, the truth is that it is only the tip of the iceberg. Unless you are coding in binary, chances are you are actually only writing a fraction of the actual code. This is because a lot of the code has already been written for us in the form of libraries and software development kits (SDKs). Even variables like String in an application have an entire library behind them! The management of these dependencies is a vital part of the pipeline.
  • External Dependencies:
    • External dependencies are publicly available libraries and SDKs. These are hosted on external dependency managers such as PyPi for Python, NuGet for .NET, and Gems for Ruby libraries. Internal dependencies are libraries and SDKs that an organisation develops and maintains internally.
    • For example, an organisation might develop an authentication library. This library could then be used for all applications developed by the organisation. 
  • External vs Internal Dependencies
    Internal:
    • Internal libraries can often become legacy software when they no longer receive updates, for example because the original developer has left the company.
    • The security of the package manager is our responsibility for internal libraries.
    • A vulnerability in an internal library could affect several of our applications since it is used in all of them.
  • External vs Internal Dependencies
    External:
    • Since we do not have full control over the dependency, we must perform due diligence to ensure that the library is secure.
    • If a package manager or content distribution network (CDN) is compromised, it could lead to a supply chain attack.
    • External libraries can be researched by attackers to discover 0day vulnerabilities. If such a vulnerability is found, it could lead to the compromise of several organisations at the same time.
  • Common Tools
    • A dependency manager, also called a package manager, is required to manage libraries and SDKs. As mentioned before, tools such as PyPi, NuGet, and Gems are used for external dependencies.
    • The management of internal dependencies is a bit more tricky. For these, we can use tools such as JFrog Artifactory or Azure Artifacts to manage these dependencies.
  • Package Manager:
    • A package manager, or package-management system, is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs for a computer in a consistent manner.
  • Log4Shell
    • A 0day vulnerability called Log4Shell was discovered in the Log4j dependency in 2021. Log4j is a Java-based logging utility that is part of the Apache Logging Services, a project of the Apache Software Foundation. The vulnerability could allow an unauthenticated attacker to gain remote code execution on a system that makes use of the logger. The true issue? This small dependency was used almost everywhere.
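    A pipeline can guard against known-vulnerable dependencies by checking the build manifest against an advisory feed. The sketch below hard-codes a tiny, hypothetical advisory table (a real pipeline would pull from a feed such as OSV or the GitHub Advisory Database); the package and version strings are illustrative only.

```python
# Hypothetical advisory data: package -> versions known to be vulnerable.
ADVISORIES = {
    "log4j-core": {"2.14.1", "2.14.0", "2.13.3"},  # illustrative subset only
}

def check_dependencies(pinned: dict[str, str]) -> list[str]:
    """Return 'package==version' strings that match a known advisory."""
    vulnerable = []
    for package, version in pinned.items():
        if version in ADVISORIES.get(package, set()):
            vulnerable.append(f"{package}=={version}")
    return vulnerable

# Example manifest: one vulnerable and one clean dependency
manifest = {"log4j-core": "2.14.1", "commons-text": "1.10.0"}
findings = check_dependencies(manifest)
print(findings)  # a non-empty list here should fail the build
```

In practice this check runs as a security gate: any finding stops the build from progressing, which is exactly how many organisations caught Log4Shell-affected builds.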
  • Unit Testing
    • A unit test is a test case for a small part of the application or service. The idea is to test the application in smaller parts to ensure that all the functionality works as it should. 
    • In modern pipelines, unit testing can be used as quality gates. Test cases can be integrated into the Continuous Integration and Continuous Deployment (CI/CD) part of the pipeline, where the build will be stopped from progressing if these test cases fail. However, unit testing is usually focused on functionality and not security.
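    A minimal example of a unit test acting as a quality gate, using Python's standard `unittest` module. The `apply_discount` function and its rules are made up for illustration; the point is that the pipeline only proceeds when the test run is successful.

```python
import unittest

def apply_discount(price: float, percent: float) -> float:
    """Business logic under test: apply a percentage discount to a price."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class DiscountTests(unittest.TestCase):
    def test_basic_discount(self):
        self.assertEqual(apply_discount(100.0, 10), 90.0)

    def test_invalid_percent_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)

# Run the tests; in a CI/CD pipeline a failed run stops the build (the gate).
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(DiscountTests)
)
print("quality gate passed" if result.wasSuccessful() else "quality gate failed")
```

Note that these tests check functionality, not security, which is the gap the SAST/DAST gates below the unit-testing stage are meant to fill.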
  • Integration Testing
    • Where unit tests focus on small parts of the application, integration testing focuses on how these small parts work together. Similar to unit tests, testing will be performed for each of the integrations and can also be integrated into the CI/CD part of the pipeline.
    • A subset of integration testing is regression testing, which aims to ensure that new features do not adversely impact existing features and functionality. However, similar to unit testing, integration testing, including regression testing, is not usually performed for security purposes. 
  • SAST
    • Static Application Security Testing (SAST) works by reviewing the source code of the application or service to identify sources of vulnerabilities. SAST tools can be used to scan the source code for vulnerabilities. This can be integrated into the development process to already highlight potential issues to developers as they are writing code.
    • We can also integrate this into the CI/CD process. Not as quality gates, but as security gates, preventing the pipeline from continuing if the SAST tool still detects vulnerabilities that have not been flagged as false positives.
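    At its simplest, a SAST rule is a pattern matched against source code. The sketch below is a toy, regex-based scanner for illustration only; real SAST tools parse the code and track data flow rather than grepping lines, and the rule names here are made up.

```python
import re

# Illustrative rules only; real SAST tools use proper parsing and
# data-flow analysis rather than plain regexes.
RULES = {
    "use of eval() on dynamic input": re.compile(r"\beval\s*\("),
    "possible SQL built by string formatting": re.compile(r"(?i)(SELECT|INSERT|UPDATE|DELETE).*%s"),
}

def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) findings for one source file."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((number, rule_name))
    return findings

code = 'query = "SELECT * FROM users WHERE name = \'%s\'" % name\n'
findings = scan_source(code)
print(findings)
# As a security gate: any finding not flagged as a false positive fails the build.
```

The false-positive triage step matters: without it, developers learn to ignore the gate, which defeats its purpose.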
  • DAST
    • Dynamic Application Security Testing (DAST) is similar to SAST but performs dynamic testing by executing the code. This allows DAST tools to detect additional vulnerabilities that would not be possible with just a source code review.
  • DAST
    • One method that DAST tools use to find vulnerabilities, such as Cross Site Scripting (XSS), is by creating sources and sinks. When a DAST tool provides input to a field in the application, it marks it as a source. When data is returned by the application, it looks for this specific parameter again and, if it finds it, will mark it as a sink.
    • It can then send potentially malicious data to the source and, depending on what is displayed at the sink, determine if there is a vulnerability such as XSS. Similar to SAST, DAST tools can be integrated into the CI/CD pipeline as security gates.
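    The source/sink idea can be sketched with a simulated endpoint. Everything here is hypothetical: `vulnerable_search` stands in for an application that reflects input without encoding, and `probe_for_xss` mimics the two DAST steps (mark a source with a unique value, find the sink, then test a payload).

```python
import html
import uuid

def vulnerable_search(query: str) -> str:
    """A simulated endpoint that reflects input without encoding (the flaw)."""
    return f"<h1>Results for {query}</h1>"

def safe_search(query: str) -> str:
    """The same endpoint with output encoding applied."""
    return f"<h1>Results for {html.escape(query)}</h1>"

def probe_for_xss(handler) -> bool:
    """DAST-style check: mark a source, find the sink, then test a payload."""
    # Step 1: send a unique marker as input (the source).
    marker = uuid.uuid4().hex
    if marker not in handler(marker):
        return False  # input never reaches the output; no sink found

    # Step 2: the marker was reflected (a sink); try an XSS payload.
    payload = f"<script>alert('{marker}')</script>"
    return payload in handler(payload)  # unescaped reflection => likely XSS

print(probe_for_xss(vulnerable_search))  # True
print(probe_for_xss(safe_search))        # False
```

Because the probe only looks at inputs and outputs, it needs no access to the source code, which is the defining difference between DAST and SAST.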
  • Penetration Testing
    • SAST and DAST tools cannot fully replace manual testing, such as penetration tests. 
    • However, the main issue remains that these automated tools do not perform well against contextual vulnerabilities. Take the process flow of a payment, for example. A common vulnerability is when part of the process, such as the credit card validation step, can be bypassed. But since it requires context, even DAST tooling will find it hard to discover the bypass.
  • SAST/DAST Common Tools
    • There are several common tools that can be used for automated testing. Both GitHub and GitLab have built-in SAST tooling. Tools such as Snyk and SonarQube are also popular for SAST and DAST.
  • SAST/DAST:
    A common issue with SAST and DAST tooling is that the tool is simply deployed into the pipeline, even simply for a Proof-of-Concept (PoC). However, you need to take several things into consideration:
    • Performance cost
    • Integration points
    • Calibration of results
    • Quality and security gate implementation
    If you introduce a new security gate, even just for a PoC, that scans each merge request for vulnerabilities before approval, this can have a drastic performance cost on your infrastructure and the speed at which developers can perform merge requests.
  • SAST/DAST:
    • When introducing new automated testing tooling, careful consideration should be given to how a PoC should be performed to ensure that no disruptions are caused but also to ensure that the PoC is representative of how the tooling will interact when it is finally integrated. A fine balance to try and achieve!
  • CI/CD:
    We can create what is called a CI/CD pipeline. These pipelines usually have the following distinct elements:
    • Starting Trigger - The action that kicks off the pipeline process. For example, a push is made to a specific branch.
    • Building Actions - Actions taken to build both the project and the new feature.
    • Testing Actions - Actions that will test the project to ensure that the new feature does not interfere with any of the current features of the application.
  • CI/CD:
    • Deployment Actions - Should a pipeline succeed, the deployment actions detail what should happen with the build. For example, it should then be pushed to the Testing Environment.
    • Delivery Actions - As CI/CD processes have evolved, the focus is now no longer just on the deployment itself, but all aspects of the delivery of the solution. This includes actions such as monitoring the deployed solution.
  • CI/CD:
    • CI/CD pipelines require build infrastructure to execute the actions of these elements. We usually refer to this infrastructure as build orchestrators and agents.
    • A build orchestrator directs the various agents to perform the actions of the CI/CD pipelines as required.
    • These CI/CD pipelines are usually where the largest portion of automation can be found. As such, this is usually the largest attack surface and the biggest chance for misconfigurations to creep in.
  • CI/CD Build Agents:
    • One common misconfiguration with CI/CD pipelines is using the same build agents for both Development (DEV) and Production (PROD) builds. This creates an interesting problem since most developers will have access to the starting trigger for a DEV build but not a PROD build. 
  • CI/CD Build Agents: Same Build Agents for DEV & PROD:
    • If one of these developers were compromised, an attacker could leverage their access to cause a malicious DEV build that would compromise the build agent. This would not be a big issue if the build agent was just used for DEV builds.
    • However, since this agent is also used for PROD builds, an attacker could just persist on this build agent until a PROD build is actioned to inject their malicious code into the build, which would allow them to compromise the production build of the application.
  • Build Orchestrators and Agents:
    • What do we call the build infrastructure element that controls all builds: Build Orchestrator
    • What do we call the build infrastructure element that performs the build: Build Agent
  • DEV Environment:
    • The DEV environment is the playground for developers. This environment is the most unstable as developers are continuously pushing new code and testing it. From a security standpoint, this environment has the weakest security. Access control is usually laxer, and developers often have direct access to the infrastructure itself.
    • The likelihood of the development environment being compromised is high, but if there is adequate segregation, the impact of such a compromise should be low.
    • No Customer Data Should be Here
  • UAT - User Acceptance Testing
    • The UAT environment is used to test the application or select features before they are pushed to production. These include unit tests that ensure the developed feature behaves as expected. This can (and should) include security tests as well.
    • Although this environment is more stable than DEV, it can often still be fairly unstable. Similarly, certain security hardening controls would have been introduced for UAT, but it is still not as hardened as PreProd or PROD.
    • No Customer Data Should Be Here
  • PreProd - Pre-Production
    • The PreProd environment is used to mimic production without actual customer/user data. This environment is kept stable and used to perform the final tests before the new feature is pushed to production. From a security standpoint, PreProd's security should technically mirror PROD. Although, this is not always the case.
  • PROD - Production
    • The PROD environment is the most sensitive. This is the current active environment that serves users or customers. To ensure that our users have the best experience, this environment must be kept stable. No updates should be performed here without proper change management.
    • To enforce this, the security of this environment is the strongest. Only a select few employees or services will have the ability to make changes here. Furthermore, since we may have "malicious" users, the security has to be hardened to prevent outsider threats as well.
  • DR/HA - Disaster Recovery or High Availability
    • Depending on the criticality of the system, there may be a DR or HA environment. If the switchover is instantaneous, it is usually called an HA environment.
    • This is often used for critical applications such as Online Banking, where the bank has to pay large penalties if the website goes down.
    • In the event where some (but still small) downtime is allowed, the environment is called a DR environment, meant to be used to recover from a disaster in production. DR and HA environments should be exact mirrors of PROD in both stability and security.
  • Green and Blue Environments
    • Green and Blue environments are used for a Blue/Green deployment strategy when pushing an update to PROD. Instead of having a single PROD instance, there are two. The Blue environment is running the current application version, and the Green environment is running the newer version. Using a proxy or a router, all traffic can then be switched to the Green environment when the team is ready.
  • Green and Blue Environments
    • However, the Blue environment is kept for some time, meaning that if there are any unforeseen issues with the new version, traffic can just be routed to the Blue environment again.
    • We can think of this as High-Availability backups of PROD during a new deployment to use for a roll-back if something goes wrong, which is faster than having to perform a roll-back of the actual PROD environment.
  • Canary Environments
    • Similar to Green and Blue environments, the goal of Canary environments is to smooth the PROD deployment process. Again, two environments are created, and users are gradually moved to the new environment.
    • For example, at the start, 10% of users can be migrated. If the new environment remains stable, another 10% can be migrated until 100% of the users are in the new environment. Again, these are usually classified under PROD environments but are used to reduce the risk associated with a PROD upgrade to limit potential issues and downtime.
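    The gradual migration can be implemented with stable hash-based routing, so the same user always lands in the same environment as the percentage grows. This is a generic sketch under that assumption; real routers usually do this at the proxy/load-balancer layer rather than in application code.

```python
import hashlib

def route(user_id: str, canary_percent: int) -> str:
    """Deterministically route a user to the canary (new) or stable (old)
    environment: a stable hash keeps each user in the same bucket as the
    rollout percentage is raised (10% -> 20% -> ... -> 100%)."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"

users = [f"user-{n}" for n in range(1000)]
canary_share = sum(route(u, 10) == "canary" for u in users) / len(users)
print(f"~{canary_share:.0%} of users on the canary")  # roughly 10%
```

If the canary misbehaves, dropping the percentage back to 0 deterministically returns every user to the stable environment, which is what limits the blast radius of a bad release.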
  • Environment Security Considerations
    The underlying infrastructure of an application also forms part of the attack surface of the actual application. Any vulnerabilities in this infrastructure could allow an attacker to take control of the host and the application. As such, the infrastructure must be hardened against attacks. This hardening process usually requires things like the following:
    • Removing unnecessary services
    • Updating the host and applications
    • Using a firewall to block unused ports
  • Developer Bypasses in PROD:
    One of the common issues that can happen with different environments is that things that should stay in DEV often don't. Developer bypasses are common in DEV environments for features like the following:
    • Multi-factor authentication
    • CAPTCHAs
    • Password resets
    • Login portals
    Developer bypasses allow developers to quickly test different application features by bypassing time-consuming features such as MFA prompts. A common example is having a specific One-Time Pin (OTP) code that is always accepted, regardless of the OTP code that is sent by the application.
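    The magic-OTP bypass described above can be sketched as follows. Both functions, the magic value, and the `APP_ENV` variable are hypothetical; the point is the difference between a bypass that works everywhere and one explicitly gated to DEV (and even the gated version should be caught by a security gate before PROD).

```python
import hmac
import os

# Hypothetical magic OTP that is always accepted. If this survives into
# PROD, MFA is effectively broken for every user account.
MAGIC_DEV_OTP = "000000"

def verify_otp_unsafe(submitted: str, expected: str) -> bool:
    if submitted == MAGIC_DEV_OTP:  # developer bypass -- must never ship
        return True
    return hmac.compare_digest(submitted, expected)

def verify_otp(submitted: str, expected: str) -> bool:
    """Safer variant: the bypass only works when explicitly running in DEV."""
    if os.environ.get("APP_ENV") == "dev" and submitted == MAGIC_DEV_OTP:
        return True
    return hmac.compare_digest(submitted, expected)

print(verify_otp_unsafe("000000", "483920"))  # True -- bypass accepted anywhere
print(verify_otp("000000", "483920"))         # False unless APP_ENV is "dev"
```

`hmac.compare_digest` is used for the real comparison to avoid timing side channels; the environment check is a mitigation, not a substitute for stripping bypasses before promotion.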
  • Developer Bypasses in PROD:
    • However, if there is inadequate sanitisation of these bypasses before the application is moved to the next environment, it could lead to a developer bypass making its way all the way into PROD. That OTP bypass? It could now be leveraged by an attacker to bypass MFA and compromise user accounts.
    • This is why environments must be segregated, and similar to quality gates, security gates must be implemented to ensure a clean application is moved to the next environment.