Sharan Chetlur
Title
Cited by
Year
cuDNN: Efficient primitives for deep learning
S Chetlur, C Woolley, P Vandermersch, J Cohen, J Tran, B Catanzaro, ...
arXiv preprint arXiv:1410.0759, 2014
Cited by: 2229, Year: 2014
Shelhamer, E. cudnn: Efficient primitives for deep learning. arXiv 2014
S Chetlur, C Woolley, P Vandermersch, J Cohen, J Tran, B Catanzaro
arXiv preprint arXiv:1410.0759, 2019
Cited by: 59, Year: 2019
cuDNN: Efficient primitives for deep learning. CoRR abs/1410.0759 (2014)
S Chetlur, C Woolley, P Vandermersch, J Cohen, J Tran, B Catanzaro, ...
arXiv preprint arXiv:1410.0759, 2014
Cited by: 58, Year: 2014
cuDNN: Efficient primitives for deep learning, CoRR abs/1410.0759
S Chetlur, C Woolley, P Vandermersch, J Cohen, J Tran, B Catanzaro, ...
arXiv preprint arXiv:1410.0759, 2014
Cited by: 13, Year: 2014
Wafer-scale fast Fourier transforms
M Orenes-Vera, I Sharapov, R Schreiber, M Jacquelin, P Vandermersch, ...
Proceedings of the 37th International Conference on Supercomputing, 180-191, 2023
Cited by: 5, Year: 2023
What does it take to accelerate SPICE on the GPU?
M Naumov, F Lannutti, S Chetlur, L Chien, P Vandermersch
GPU Technology Conference, 2013
Cited by: 4, Year: 2013
System and method for re-factorizing a square matrix into lower and upper triangular matrices on a parallel processor
PV Maxim Naumov, Sharanyan Chetlur, Lung Sheng Chien, Robert Strzodka
US Patent US 9170836 B2, 2014
2014