While many existing risks and controls may apply to generative AI, the breakthrough technology also has many nuances that require new tactics.
Models are prone to hallucinations, the production of inaccurate content. Other risks include leakage of sensitive data through a model’s output, compromise of models that enables prompt manipulation, and bias stemming from poor training data selection or insufficiently controlled tuning and training.
Ultimately, conventional cyber detection and response must be expanded to monitor for AI abuse, and AI must in turn be put to work for defenders, says Phil Venables, CISO of Google Cloud.
“The safe, secure, and trusted use of AI involves a range of techniques that many teams have historically not brought together,” Venables noted during a virtual session at the recent Cloud Security Alliance Global AI Symposium.
Lessons learned at Google Cloud
Venables stressed the importance of providing controls and common frameworks so that every AI instance or implementation doesn’t have to start from scratch.
“Remember that the problem is an end-to-end business process or mission objective, not just a technical problem in the environment,” he said.
By now, almost everyone is familiar with many of the risks associated with the potential misuse of training and fine-tuning data. “Mitigating the risks of data poisoning is critical, as is ensuring the suitability of the data in light of other risks,” Venables said.
Importantly, companies must ensure that the data used for training and tuning is cleansed and protected, and that the provenance and lineage of that data is maintained with “strong integrity.”
“Of course you can’t just wish this were true,” Venables acknowledged. “You have to actually do the work to manage and track the use of data.”
This requires implementing specific controls and tools, with security built in, that work together across model training, fine-tuning, and testing. That is especially important for ensuring models are not tampered with, whether in the software, the weights, or any of their other parameters, Venables noted.
“If we don’t take care of this, we expose ourselves to multiple types of backdoor risks that could compromise the security and safety of the deployed business or mission process,” he said.
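As a rough illustration of what “doing the work” on data integrity can look like, the sketch below records a hash and origin for each training artifact and refuses to proceed if anything has changed. The manifest format and the record_artifact/verify_artifact helpers are illustrative assumptions, not a description of Google Cloud’s tooling.

```python
# Minimal provenance sketch: hash training data and model weights at
# ingestion time, then verify them before any training or deployment step.
# The manifest format and helper names here are illustrative assumptions.
import hashlib
import json
from pathlib import Path

MANIFEST = Path("artifact_manifest.json")

def _sha256(path: Path) -> str:
    """Stream a file through SHA-256 so large datasets and weights are fine."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_artifact(path: Path, source: str) -> None:
    """Record the hash and origin of a dataset or weights file."""
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    manifest[str(path)] = {"sha256": _sha256(path), "source": source}
    MANIFEST.write_text(json.dumps(manifest, indent=2))

def verify_artifact(path: Path) -> bool:
    """Fail closed if the artifact is unknown or has been modified."""
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    entry = manifest.get(str(path))
    return entry is not None and entry["sha256"] == _sha256(path)

# Example: refuse to fine-tune on data whose hash no longer matches.
# if not verify_artifact(Path("train.jsonl")):
#     raise RuntimeError("training data failed integrity check")
```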
Filter to prevent prompt injection
Another major problem is model abuse by outsiders. Models can become contaminated by training data or other parameters that cause them to behave against broader controls, Venables said. This can include adversarial tactics such as prompt manipulation and subversion.
Venables pointed out that there are plenty of examples of people manipulating prompts, both directly and indirectly, to produce unintended results from “naively defended or downright unprotected models.”
This could be text embedded in images or other inputs to single-modal or multimodal models, where problematic prompts “disrupt the output.”
“A lot of the headline-grabbing attention goes into generating unsafe content, and some of this can be quite funny,” Venables said.
It is important to ensure that inputs are filtered for a range of trust, safety, and security goals, he said. This should include “ubiquitous logging” and observability, as well as strong access controls enforced on models, code, data, and test data.
“The test data can influence model behavior in interesting and potentially risky ways,” Venables said.
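In practice, an input-side gate along these lines might look like the following sketch: every prompt is logged before the model sees it and checked against a deny-list of known injection phrasings. The patterns and policy are placeholders; a production filter would typically lean on a dedicated safety classifier rather than regexes alone.

```python
# Sketch of an input filter with "ubiquitous logging": every prompt is
# logged, then screened for known prompt-injection patterns before it
# reaches the model. Patterns and policy here are illustrative only.
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("prompt-filter")

# Placeholder deny-list; real systems usually pair this with a trained classifier.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.I),
    re.compile(r"reveal (the )?system prompt", re.I),
    re.compile(r"disregard your (rules|guidelines)", re.I),
]

def screen_prompt(user_id: str, prompt: str) -> str:
    """Log the prompt, then reject it if it matches a known injection pattern."""
    log.info("prompt received user=%s chars=%d", user_id, len(prompt))
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            log.warning("blocked prompt user=%s pattern=%s", user_id, pattern.pattern)
            raise ValueError("prompt rejected by input filter")
    return prompt

# screen_prompt("alice", "Please summarize this report.")      # passes
# screen_prompt("bob", "Ignore previous instructions and ...")  # blocked
```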
Also check the output
The fact that users can cause models to misbehave underscores the need to manage not just inputs but also outputs, Venables pointed out. Companies can create filters and outbound controls, or “circuit breakers,” around how a model can manipulate data or control physical processes.
“It’s not just adversarial behavior, it’s also accidental model behavior,” Venables said.
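An outbound “circuit breaker” can be as simple as a wrapper that inspects every model response before it reaches users or downstream systems. The checks in the sketch below, a size cap and a basic sensitive-data regex, are stand-ins for whatever policies an organization actually enforces.

```python
# Sketch of an output "circuit breaker": model responses are checked
# before they can reach users or downstream systems. The specific checks
# (a size cap and a simple sensitive-data regex) are illustrative placeholders.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MAX_OUTPUT_CHARS = 20_000

class CircuitBreakerTripped(Exception):
    """Raised when model output violates an outbound policy."""

def check_output(text: str) -> str:
    """Block oversized responses and anything that looks like leaked PII."""
    if len(text) > MAX_OUTPUT_CHARS:
        raise CircuitBreakerTripped("output exceeds size limit")
    if EMAIL_RE.search(text) or SSN_RE.search(text):
        raise CircuitBreakerTripped("possible sensitive data in output")
    return text

# Wrap the model call so nothing bypasses the breaker:
# response = check_output(model.generate(prompt))
```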
Organizations must also monitor and address software vulnerabilities in the supporting infrastructure itself, Venables advised. End-to-end platforms can control the data and software lifecycle and help manage the operational risk of integrating AI into business and mission-critical processes and applications.
“Ultimately, this is about mitigating the operational risk of the actions of the model’s output, essentially controlling agent behavior, providing defense in depth for unintended actions,” Venables said.
He recommended sandboxing and least-privilege enforcement for all AI applications. Models must be governed, protected, and properly shielded through independent monitoring, API filters, or constructs that validate and regulate behavior. Applications should also run in locked-down workloads, and companies should focus on observability and logging.
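Read literally, least privilege for an AI agent means the model can only ever trigger the narrow operations the application genuinely needs. The tool names and allowlist below are hypothetical, but they show the shape of that control.

```python
# Sketch of least-privilege enforcement for an AI agent: every action the
# model proposes is validated against a per-application allowlist before
# it runs. Tool names and the policy table are hypothetical.
from typing import Any, Callable, Dict

# Only these operations are exposed to the model for this application.
ALLOWED_TOOLS: Dict[str, Callable[..., Any]] = {
    "search_docs": lambda query: f"results for {query!r}",   # read-only
    "get_ticket":  lambda ticket_id: {"id": ticket_id},      # read-only
    # Note: no "delete", "send_email", or shell access is registered at all.
}

def dispatch(tool_name: str, **kwargs: Any) -> Any:
    """Run a model-proposed action only if it is on the allowlist."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        raise PermissionError(f"tool {tool_name!r} is not permitted for this app")
    return tool(**kwargs)

# dispatch("search_docs", query="quarterly risk report")   # allowed
# dispatch("run_shell", command="rm -rf /")                # PermissionError
```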
“Ultimately, it’s all about cleaning, protecting, and managing your training, tuning, and testing data,” Venables concluded. “It’s about enforcing strong access controls on the models, the data, the software, and the deployed infrastructure. It’s about filtering inputs and outputs to and from those models, and ultimately sandboxing uses and applications within a risk and control framework that provides defense in depth.”