Microsoft releases PyRIT: a Red Teaming tool for generative AI

Generative AI

Microsoft has released an open access automation framework called YOU TRY (short for Python Risk Identification Tool) to proactively identify risks in generative artificial intelligence (AI) systems.

The red teaming tool is designed to “empower any organization around the world to innovate responsibly with the latest advances in artificial intelligence,” Ram Shankar Siva Kumar, AI red team lead at Microsoft, said.

The company said PyRIT can be used to assess the robustness of large language model (LLM) endpoints against various harm categories such as fabrication (e.g. hallucinations), misuse (e.g. bias), and prohibited content (e.g. harassment).

It can also be used to identify security harms ranging from malware generation to jailbreaking, as well as privacy harms such as identity theft.

Cybersecurity

PyRIT comes with five interfaces: target, datasets, scoring engine, the ability to support multiple attack strategies, and to include a memory component that can take the form of JSON or a database to store the intermediate input and output interactions.

The scoring engine also offers two different options for scoring the output of the target AI system, allowing red teamers to use a classic machine learning classifier or use an LLM endpoint for self-assessment.

“The goal is to enable researchers to have a baseline of how well their model and the entire inference pipeline does against different damage categories and to be able to compare that baseline against future iterations of their model,” Microsoft said.

Generative AI

“This allows them to have empirical data on how well their model is doing today, and detect any deterioration in performance based on future improvements.”

That said, the tech giant is careful to emphasize that PyRIT is not a replacement for manual red teaming of generative AI systems and that it complements a red team’s existing domain expertise.

In other words, the tool aims to highlight risk “hotspots” by generating clues that can be used to evaluate the AI ​​system and highlight areas that require further investigation.

Cybersecurity

Microsoft further acknowledged that bringing together generative AI systems requires examining both security and responsible AI risks simultaneously and makes the exercise more probabilistic, while also pointing out the major differences in generative AI system architectures.

“Manual probing, although time-consuming, is often necessary to identify potential blind spots,” Siva Kumar said. “Automation is necessary for scale-up, but is not a substitute for manual probing.”

The development comes as Protect AI revealed multiple critical vulnerabilities in popular AI supply chain platforms such as ClearML, Hugging Face, MLflow and Triton Inference Server, which could result in arbitrary code execution and disclosure of sensitive information.



#Microsoft #releases #PyRIT #Red #Teaming #tool #generative

Total
0
Shares
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
Previous Post
Nova Sentinel Malware

Dormant PyPI package compromised to spread Nova Sentinel malware

Next Post
Zero-Click Shortcuts Vulnerability

Researchers describe Apple’s recent zero-click shortcuts vulnerability

Related Posts