Application security has come a long way over the past couple of decades. In the early 2000s, SQL injection and cross-site scripting (XSS) attacks were a nightmare for cybersecurity teams as attackers easily bypassed network firewalls by attacking the application layer. Since traditional network firewalls at the time were not application-aware, these attacks proved a blind spot, allowing attackers to compromise web applications with ease.
The industry quickly bounced back, however, and web application firewalls (WAF) and source code security reviews became a standard part of most cybersecurity checks. Now we have DevSecOps teams, which automate these checks within CI/CD pipelines and allow security at speed, with dynamic application security testing (DAST) and static application security testing (SAST) solutions becoming commonplace.
However, a new trend is growing that has the potential to be another blind spot like the SQL injections of the previous decades unless controls are put in place.
These are attacks targeting AI and machine learning systems.
AI and machine learning systems
AI and machine learning are easily among the most disruptive technologies of recent years and are being adopted across the globe by companies and governments alike. Even cybersecurity products now boast the “powered by AI” label as they adopt machine learning algorithms to boost their capabilities and stop cyberattacks in real time without human input.
These models are trained on data to build up their decision-making abilities, similar to how a human being learns from trial and error. Basically, the more data a machine learning model is trained on, the more accurate it becomes. Once deemed fit for production, these models are placed behind applications that typically expose public APIs, which can be queried for results.
However, the adoption of these applications in sensitive industries like hospitality, medicine, and finance, along with their access to sensitive training data, makes them prime targets for attackers. As a result, a new breed of attacks is developing that are targeting the workings of machine learning applications.
Why AI is a blind spot in cybersecurity
Cybersecurity teams typically assess an AI application via traditional security processes such as hardening, patching, and vulnerability assessments, carried out at the infrastructure and application levels. While this is all good practice, these assurance processes do not cover AI-specific attacks such as data poisoning, membership inference, and model evasion.
In these types of attacks, cybercriminals are not interested in compromising the underlying infrastructure or carrying out SQL injections but in manipulating the way in which AI and machine learning applications reach decisions.
This allows them to:
- Interfere with the workings of AI applications and make them reach the wrong decisions.
- Find out how the model works so they can reverse engineer it for further attacks.
- Find out what data the model was trained on, revealing sensitive characteristics they are not supposed to know.
These attacks have been growing in number and have been successfully carried out against production AI applications.
Let us take a look at some of the most common attacks, namely inference, evasion, and poisoning, and how we can harden our ML applications against them.
Inference attacks
In an inference attack on an AI application, an attacker attempts to discover the inner workings of a model or the data it was trained on. The APIs exposed by ML models often return confidence scores, and those scores tend to be stronger when the submitted data resembles the data the model was trained on. With access to such an API, the attacker can run queries and analyze the model's responses. In one example, attackers reconstructed the faces used to train a machine learning model by analyzing the confidence scores it returned for different images. By submitting many random images and studying the responses, the attackers were able to reconstruct the training images with up to 95% accuracy.
This sort of attack can result in the AI model disclosing highly sensitive data, especially in industries that deal with personally identifiable information. Most companies do not build machine learning models from scratch and usually rely on pre-built models, which are hosted on cloud platforms. A successful attack against one of the models can enable the attacker to compromise multiple AI applications in a supply chain attack.
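As a toy sketch of the idea, a membership inference attack boils down to querying the model's API and thresholding its confidence. Everything here is hypothetical: the names, scores, and the simulated `query_model` stand in for a real prediction endpoint.

```python
import random

# Hypothetical stand-in for a deployed model's prediction API. Overfit
# models tend to return higher confidence on records they were trained
# on -- the signal a membership inference attack exploits.
TRAINING_SET = {"alice", "bob", "carol"}

def query_model(record: str) -> float:
    """Return a confidence score (simulated: higher for training members)."""
    base = 0.6 if record in TRAINING_SET else 0.3
    return min(1.0, max(0.0, base + random.uniform(-0.05, 0.05)))

def infer_membership(record: str, threshold: float = 0.5) -> bool:
    """Guess that records scoring above the threshold were in the training set."""
    return query_model(record) > threshold

# The attacker only needs API access, not the model internals.
guesses = {name: infer_membership(name)
           for name in ["alice", "mallory", "bob", "trent"]}
```

Note that nothing in this loop looks like a traditional exploit; to the application, it is just a client making legitimate queries, which is exactly why such probing slips past conventional controls.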
Poisoning attacks
Another attack targets the training data itself: the attacker “pollutes” the data on which the model is trained to tamper with its decision-making. As with pre-built models, most companies do not want to create training data from scratch and often leverage pre-built data sets, which they run through their machine learning models. If an attacker can compromise such a data repository via a security vulnerability and inject their own records into it, the model will be trained to accept malicious input right from the start. For example, an attacker could tamper with the data store a self-driving vehicle uses to learn to recognize objects. By changing the labels on that data, the vehicle's actual behavior on the road can be altered.
Attackers usually bide their time and wait for a data store to reach a certain level of market acceptance before trying to “poison” it. Model training is also not a one-time activity: a data store might be completely fine in the beginning and be polluted further down the road, once the attacker is confident they will not be detected.
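A minimal, hypothetical illustration of label flipping: a toy one-dimensional nearest-centroid classifier is retrained on a data store into which an attacker has injected mislabeled records, dragging its decision boundary. Real poisoning targets far richer models, but the mechanism is the same.

```python
# Toy 1-D nearest-centroid "classifier" and a hypothetical label-flipping
# attack; the feature values and labels are invented for illustration.
def train_centroids(samples):
    """Average the feature value per label to get one centroid per class."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, value):
    """Pick the class whose centroid is nearest to the input."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

clean = [(0.1, "stop"), (0.2, "stop"), (0.9, "yield"), (1.0, "yield")]
centroids = train_centroids(clean)

# An attacker with write access to the data store injects "yield"-like
# records mislabeled as "stop"; retraining drags the "stop" centroid over.
poisoned = clean + [(0.9, "stop"), (0.95, "stop"), (1.0, "stop")]
bad_centroids = train_centroids(poisoned)
# An input the clean model called "yield" is now classified as "stop".
```

The poisoned records are individually plausible, which is why detecting this after the fact requires re-validating model behavior, not just scanning the data.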
Evasion attacks
A third class of attacks on AI systems is evasion, in which attackers trick models by feeding them subtly altered data. It has been shown that small changes to an image that are not noticeable to a human can result in dramatically different decisions by a machine learning model. Such an input is referred to as an adversarial sample, and it can trick AI-powered systems such as facial recognition applications or self-driving cars.
For example, simply putting pieces of tape on a stop sign can prevent a machine learning model from recognizing it, which could cause car accidents. Similarly, an adversarial sample could trick a medical diagnosis system for the purpose of committing fraud.
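A sketch of how a small perturbation flips a decision, in the spirit of the fast gradient sign method but on a made-up linear classifier. The weights, features, and labels are all invented for illustration; real evasion attacks perturb high-dimensional images the same way.

```python
# Minimal evasion sketch on a toy linear classifier; the weights and
# feature values are made up for illustration.
def sign(x):
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def predict(weights, bias, features):
    """Return 'stop' when the linear score is positive, else 'other'."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "stop" if score > 0 else "other"

weights, bias = [0.9, -0.4, 0.6], -0.1
image = [0.3, 0.2, 0.1]   # a feature vector the model classifies as "stop"

# Nudge each feature by a small epsilon against the weight direction --
# a small change overall, but enough to flip the model's decision.
epsilon = 0.2
adversarial = [f - epsilon * sign(w) for f, w in zip(image, weights)]
```

Each individual feature moves only slightly, yet the perturbations are chosen to all push the score the same way, which is what makes adversarial samples so effective against otherwise accurate models.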
The way forward
AI-based attacks are becoming more and more common, and cybersecurity teams need to upskill to understand this new breed of application attacks. At this year's Machine Learning Security Evasion Competition (MLSEC 2022), contestants demonstrated that facial recognition models could be evaded with trivially minor changes.
Cybersecurity teams need to be trained and made aware of these attacks so they can be proactively highlighted in initial design reviews. MITRE ATLAS, which describes itself as a knowledge base of attacks against machine learning models, is a great resource for teams to get up to speed quickly.
As mentioned before, traditional cybersecurity controls will not protect against these vulnerabilities, and new types of controls need to be put in place. Just as application security evolved into a separate domain within cybersecurity, AI security needs to do the same, and quickly. AI applications are already involved in critical decision-making in industries such as healthcare, finance, and law enforcement, making them a prime target for cyber attackers.
Given that there is no quick patch to fix these issues, a culture of AI security needs to be developed so that these controls are implemented at multiple levels. Some of the key controls that can be implemented are:
- Threat modeling of machine learning models should be carried out and made a standard practice before choosing any new or pre-built model. Resources like the UK National Cyber Security Centre's “Principles for the security of machine learning” are a great reference point.
- Detection controls need to be updated to alert when a particular client is repeatedly querying a machine learning API, which could be indicative of an inference attack.
- Models should be hardened to sanitize the confidence scores in their responses. A balance between usability and security needs to be struck: developers may be given full confidence score details, while end users only need a summarized score. This greatly increases the difficulty of assessing the model's underlying logic or training data.
- Machine learning models should be trained on adversarial samples to assess their resilience to such attacks. By subjecting a model to such samples early, companies can quickly identify gaps in its learning and remediate them.
- Data stores used to train machine learning models should undergo rigorous security testing to make sure they do not contain vulnerabilities that may allow attackers to gain access to the data and poison it. Similarly, data that was “clean” at one time might be poisoned at a later stage, so it is essential to verify that a model works correctly every time it refreshes its training data or is trained on new data. Quality teams should subject the model to various tests and verify previous results to confirm it is still working optimally and no “poisoning” has occurred.
- Lastly, companies should have a policy around using public or open-source data for training. While such data sets make training easier, a compromise of the data store could corrupt the model during training.
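Two of the controls above can be sketched in a few lines: sanitizing confidence scores before they leave the API, and flagging clients that query it unusually often. The thresholds, labels, and class names here are illustrative assumptions, not recommendations for production values.

```python
from collections import Counter

def sanitize_scores(scores, decimals=1, top_k=1):
    """Expose only the top-k labels, with coarsely rounded confidences."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return {label: round(conf, decimals) for label, conf in ranked[:top_k]}

class QueryMonitor:
    """Count queries per client and flag likely inference probing."""
    def __init__(self, threshold=100):
        self.threshold = threshold
        self.counts = Counter()

    def record(self, client_id):
        """Return True once a client exceeds the query threshold."""
        self.counts[client_id] += 1
        return self.counts[client_id] > self.threshold

raw = {"cat": 0.8731, "dog": 0.1012, "fox": 0.0257}
public = sanitize_scores(raw)   # end users see only the rounded top label
```

Coarse scores starve inference attacks of the fine-grained signal they rely on, while the per-client counter gives detection teams a concrete event to alert on; a real deployment would use a sliding time window rather than a lifetime count.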
It is clear that AI attacks are only poised to increase with time, and awareness amongst cybersecurity teams is currently lacking. Unless proper technical controls and governance are implemented, these attacks will create the same havoc as SQL injections did a few decades back.