Publication
RedShell: A Generative AI-Based Approach to Ethical Hacking
| Abstract: | The application of Machine Learning techniques in code generation is now common practice for most developers. Tools such as ChatGPT from OpenAI leverage the natural language processing capabilities of Large Language Models to generate machine code from natural language descriptions. In the cybersecurity field, red teams can also take advantage of generative models to build malicious code generators, bringing more automation to pentest audits. However, the application of Large Language Models to malicious code generation remains challenging due to the lack of data to train and evaluate offensive code generators. In this work, we propose RedShell, a tool that allows ethical hackers to generate malicious PowerShell code. We also introduce a ground truth dataset, combining publicly available code samples to fine-tune models for malicious PowerShell generation. Our experiments demonstrate the strong capabilities of RedShell in generating syntactically valid PowerShell, with over 90% of the generated samples successfully parsed without errors. Furthermore, our specialized model was able to produce samples that were semantically consistent with reference snippets, achieving competitive performance on standard output similarity metrics such as edit distance and METEOR, with similarity scores exceeding 50% and 40%, respectively. We also conducted a functional evaluation of the snippets generated by our tool, emphasizing their strong effectiveness in a wide range of offensive cybersecurity operations. This work sheds light on the state-of-the-art research in the field of Generative AI applied to pentesting and also serves as a stepping stone for future advancements, highlighting the potential benefits these models hold within such controlled environments. |
|---|---|
| Main authors: | Bessa, Ricardo Jorge Matos |
| Subject: | Cybersecurity; Ethical Hacking; Pentesting; Large Language Models |
| Year: | 2025 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |