Skip to content

This repository provide the studies on the security of language models for code (LM4Code).

License

Notifications You must be signed in to change notification settings

wssun/TiSE-LM4Code-Security

Repository files navigation

⚔🛡 Security of Language Models for Code

This repository provides a summary of recent advancements in the security landscape surrounding Language Models for Code (also known as Neural Code Models), including backdoor, adversarial attacks, corresponding defenses and so on.

NOTE: We collect the original code from the papers and the code we have reproduced. While, our reproduced code is not guaranteed to be fully accurate and is for reference only. For specific issues, please consult the original authors.

arXiv GitHub stars Awesome

Overview

Language Models for Code (LM4Code) are advanced deep learning models that excel in programming language understanding and generation. LM4Code has achieved impressive results across various code intelligence tasks, such as code generation, code summarization, vulnerability/bug detection, and so on. However, with the growing use of LM4Code in sensitive applications, they have become a prime target for security attacks, which exploit the vulnerabilities inherent in machine learning models. This repository organizes the current knowledge on Security Threats and Defense Strategies for LM4Code.

Table of Contents

NOTE: Our paper is labeled with 🚩.

📃Survey

The survey analyzes security threats to LM4Code, categorizing existing attack types such as backdoor and adversarial attacks, and explores their implications for code intelligence tasks.

Year Conf./Jour. Paper
2023 CoRR A Survey of Trojans in Neural Models of Source Code: Taxonomy and Techniques.
2024 CoRR Robustness, Security, Privacy, Explainability, Efficiency, and Usability of Large Language Models for Code.
2024 《软件学报》 深度代码模型安全综述 🚩

⚔Security Threats

According to the document, security threats in LM4Code are mainly classified into two categories: backdoor attacks and adversarial attacks. Backdoor attacks occur during the training phase, where attackers implant hidden backdoors in the model, allowing it to function normally on benign inputs but behave maliciously when triggered by specific patterns. In contrast, adversarial attacks happen during the testing phase, where carefully crafted perturbations are added to the input, causing the model to make incorrect predictions with high confidence while remaining undetectable to humans.

An overview of attacks in LM4Code.

Backdoor Attacks

Backdoor attacks inject malicious behavior into the model during training, allowing the attacker to trigger it at inference time using specific triggers:

  • Data poisoning attacks: Slight changes to the training data that cause backdoor behavior.
Year Conf./Jour. Paper Code Repository Reproduced Repository
2021 USENIX Security Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers. Octocat
2021 USENIX Security You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion.
2022 ICPR Backdoors in Neural Models of Source Code. Octocat
2022 FSE You See What I Want You to See: Poisoning Vulnerabilities in Neural Code Search. Octocat
2023 ICPC Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks. Octocat
2023 ACL Backdooring Neural Code Search. 🚩 Octocat
2024 TSE Stealthy Backdoor Attack for Code Models. Octocat
2024 SP Trojanpuzzle: Covertly Poisoning Code-Suggestion Models. Octocat
2024 TOSEM Poison Attack and Poison Detection on Deep Source Code Processing Models. Octocat
  • Model poisoning attacks: Changes that do not alter the functionality of the code but trick the model.
Year Conf./Jour. Paper Code Repository Reproduced Repository
2021 USENIX Security You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion.
2023 CoRR BadCS: A Backdoor Attack Framework for Code search.
2023 ACL Multi-target Backdoor Attacks for Code Pre-trained Models. Octocat
2023 USENIX Security PELICAN: Exploiting Backdoors of Naturally Trained Deep Learning Models In Binary Code Analysis. Octocat

Adversarial Attacks

These attacks manipulate the input data to deceive the model into making incorrect predictions. Including two categories:

  • White-box attacks: Attackers have complete knowledge of the target model, including model structure, weight parameters, and training data.
Year Conf./Jour. Paper Code Repository Reproduced Repository
2018 CoRR Adversarial Binaries for Authorship Identification.
2020 OOPSLA Adversarial Examples for Models of Code. Octocat
2020 ICML Adversarial Robustness for Code. Octocat
2021 ICLR Generating Adversarial Computer Programs using Optimized Obfuscations. Octocat
2022 TOSEM Towards Robustness of Deep Program Processing Models - Detection, Estimation, and Enhancement. Octocat
2022 ICECCS Generating Adversarial Source Programs Using Important Tokens-based Structural Transformations.
2023 CoRR Adversarial Attacks against Binary Similarity Systems.
2023 SANER How Robust Is a Large Pre-trained Language Model for Code Generation𝑓 A Case on Attacking GPT2.
  • Black-box attacks: Adversaries attackers can only generate adversarial examples by obtaining limited model outputs through model queries.
Year Conf./Jour. Paper Code Repository Reproduced Repository
2019 USENIX Security Misleading Authorship Attribution of Source Code using Adversarial Learning.
2019 CODASPY Adversarial Authorship Attribution in Open-source Projects.
2020 CoRR STRATA: Simple, Gradient-free Attacks for Models of Code.
2020 AAAI Generating Adversarial Examples for Holding Robustness of Source Code Processing Models. Octocat
2021 TIFS A Practical Black-box Attack on Source Code Authorship Identification Classifiers.
2021 ICST A Search-Based Testing Framework for Deep Neural Networks of Source Code Embedding. Octocat
2021 QRS Generating Adversarial Examples of Source Code Classification Models via Q-Learning-Based Markov Decision Process.
2021 GECCO Deceiving Neural Source Code Classifiers: Finding Adversarial Examples with Grammatical Evolution. Octocat
2021 CoRR On Adversarial Robustness of Synthetic Code Generation. Octocat
2022 TOSEM Adversarial Robustness of Deep Code Comment Generation. Octocat
2022 ICSE Natural Attack for Pre-trained Models of Code. Octocat Localcat
2022 ICSE RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation. Octocat
2022 EMNLP TABS: Efficient Textual Adversarial Attack for Pre-trained NL Code Model Using Semantic Beam Search.
2023 AAAI CodeAttack: Code-Based Adversarial Attacks for Pre-trained Programming Language Models. Octocat
2023 PACM PL Discrete Adversarial Attack to Models of Code.
2023 CoRR Adversarial Attacks on Code Models with Discriminative Graph Patterns.
2023 Electronics AdVulCode: Generating Adversarial Vulnerable Code against Deep Learning-Based Vulnerability Detectors.
2023 ACL DIP: Dead code Insertion based Black-box Attack for Programming Language Model.
2023 CoRR A Black-Box Attack on Code Models via Representation Nearest Neighbor Search. Octocat
2023 CoRR SHIELD: Thwarting Code Authorship Attribution.
2023 ASE Code Difference Guided Adversarial Example Generation for Deep Code Models. Octocat
2024 JSEP CodeBERT‐Attack: Adversarial Attack against Source Code Deep Learning Models via Pre‐trained Model.

Other Threats

This includes xxx

🛡Defensive Strategies

In response to the growing security threats, researchers have proposed various defense mechanisms:

Backdoor Defense

Methods for defending against backdoor attacks include:

Year Conf./Jour. Paper Code Reporisty Reproduced Reporisty
2022 ICPR Backdoors in Neural Models of Source Code. Octocat
2023 CoRR Occlusion-based Detection of Trojan-triggering Inputs in Large Language Models of Code.
2024 TOSEM Poison Attack and Poison Detection on Deep Source Code Processing Models.
2024 CoRR Eliminating Backdoors in Neural Code Models via Trigger Inversion. 🚩
          |

Adversarial Defense

Approaches to counter adversarial attacks include:

Year Conf./Jour. Paper Code Reporisty Reproduced Reporisty
2022 SANER Semantic Robustness of Models of Source Code. Octocat
2022 COLING Semantic-Preserving Adversarial Code Comprehension. Octocat
2023 ICSE RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation. Octocat
2023 PACM PL Discrete Adversarial Attack to Models of Code.
2023 CCS Large Language Models for Code: Security Hardening and Adversarial Testing. Octocat
2023 CoRR Enhancing Robustness of AI Offensive Code Generators via Data Augmentation.

Citation

If you find this repository useful for your work, please include the following citation:

@article{xxx,
  title={Security of Language Models for Code},
  author={xxx and xxx and xxx and xxx and xxx},
  journal={arXiv preprint arXiv:xxxxx},
  year={2024}
}