AI Safety
"An ounce of prevention is worth a pound of cure"
"An ounce of prevention is worth a pound of cure"
My efforts in the area of AI SEA (Safety, Ethics, and Alignment) have broadly fallen into two areas:
(1) dataset auditing and bias mitigation, and (2) model auditing, red-teaming, and adversarial attacks.
Here is a list of my recent publications spanning these two areas:
Model auditing, Red-Teaming, and Adversarial attacks
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
Model weight theft with just noise inputs: The curious case of the petulant attacker
Understanding adversarial robustness through loss landscape geometries
Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations
On detecting adversarial inputs with entropy of saliency maps
Art-attack! on style transfers with textures, label categories and adversarial examples
Smile in the face of adversity much? A print based spoofing attack
OODles of ODDs: The landscape of Out-of-distribution vulnerabilities of vision models
Did They Direct the Violence or Admonish It? A Cautionary Tale on Contronomy, Androcentrism and Back-Translation Foibles [Video]
Dataset auditing and bias mitigation