Shashwat Goel
Title
Cited by
Year
Representation Engineering: A Top-Down Approach to AI Transparency
A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...
arXiv preprint arXiv:2310.01405, 2023
Cited by 107, 2023
Towards adversarial evaluations for inexact machine unlearning
S Goel, A Prabhu, A Sanyal, SN Lim, P Torr, P Kumaraguru
arXiv preprint arXiv:2201.06640, 2023
Cited by 23*, 2023
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ...
ICML, 2024
Cited by 10, 2024
Proportional Aggregation of Preferences for Sequential Decision Making
N Chandak, S Goel, D Peters
AAAI Outstanding Paper Award, 2024
Cited by 6, 2024
Bilingual dictionary generation and enrichment via graph exploration
S Goel, J Gracia, ML Forcada
Semantic Web 13 (6), 1103-1132, 2022
Cited by 5, 2022
Corrective Machine Unlearning
S Goel, A Prabhu, P Torr, P Kumaraguru, A Sanyal
ICLR Data-Centric Machine Learning (DMLR) Workshop, 2024
Cited by 3, 2024
From Pivots to Graphs: Augmented Cycle Density as a Generalization to One Time Inverse Consultation
S Goel, KSS Grover
arXiv preprint arXiv:2108.12459, 2021
Cited by 2, 2021
Low impact agency: review and discussion
D Naiff, S Goel
arXiv preprint arXiv:2303.03139, 2023
2023
Probing Negation in Language Models
S Singh, S Goel, S Vaduguru, P Kumaraguru
ACL Repl4NLP Workshop, 2023
2023