Ashish Vaswani

From Wikipedia, the free encyclopedia
Ashish Vaswani
Born: 1986 (age 38–39)
Known for: Transformer (deep learning architecture)
Scientific career
Thesis: Smaller, Faster, and Accurate Models for Statistical Machine Translation (2014)
Doctoral advisors:
  • David Chiang
  • Liang Huang

Ashish Vaswani (born 1986)[1] is an Indian computer scientist. He is the co-founder and CEO of Essential AI. He previously worked as a research scientist at Google Brain and at the Information Sciences Institute.

Vaswani is known for his contributions to deep learning, most notably as a co-author of the 2017 paper "Attention Is All You Need", which introduced the Transformer neural network architecture.[2] This breakthrough in artificial intelligence laid the foundation for GPT, BERT, ChatGPT, and their successors.

Career

Vaswani completed his engineering degree in computer science at BIT Mesra in 2002. In 2004, he moved to the US to pursue higher studies at the University of Southern California,[3] where he completed his PhD under the supervision of David Chiang.[4] He worked as a researcher at Google,[5] where he was part of the Google Brain team. He co-founded Adept AI Labs but has since left the company.[6][7]

Notable works

Vaswani's most notable paper, "Attention Is All You Need", was published in 2017.[8] The paper introduced the Transformer model, which eschews the use of recurrence in sequence-to-sequence tasks and relies entirely on self-attention mechanisms. The model has been instrumental in the development of several subsequent state-of-the-art models in NLP, including BERT,[9] GPT-2, and GPT-3.
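
As an illustration of what relying on self-attention rather than recurrence means, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification and not the paper's implementation: the function names are hypothetical, and the input vectors are used directly as queries, keys, and values, whereas the paper first applies learned linear projections and uses multiple attention heads.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention over a sequence of vectors.

    Assumption for illustration: queries, keys, and values are all the
    raw inputs `x` (no learned projections, single head).
    """
    d = len(x[0])
    output = []
    for query in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        weights = softmax(scores)
        # Each output vector is a weighted average of all value vectors.
        output.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)]
        )
    return output


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(sequence)
```

Every position is computed from all other positions in a single step, with no state carried from one position to the next, which is the property that distinguishes this approach from recurrent models.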

References

  1. Nichil, Geoffrey (November 16, 2024). "Who is Ashish Vaswani?". Synaptiks. Archived from the original on December 15, 2024.
  2. Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia (June 12, 2017). "Attention is All you Need" (PDF). Advances in Neural Information Processing Systems 30. arXiv:1706.03762. Wikidata Q30249683.
  3. Team, OfficeChai (February 4, 2023). "The Indian Researchers Whose Work Led To The Creation Of ChatGPT". OfficeChai.
  4. "Ashish Vaswani's webpage at ISI". www.isi.edu.
  5. "Transformer: A Novel Neural Network Architecture for Language Understanding". ai.googleblog.com. August 31, 2017.
  6. Rajesh, Ananya Mariam; Hu, Krystal (March 16, 2023). "AI startup Adept raises $350 mln in fresh funding". Reuters.
  7. Tong, Anna; Hu, Krystal (May 4, 2023). "Top ex-Google AI researchers raise funding from Thrive Capital". Reuters. Retrieved July 11, 2023.
  8. Dawson, Caitlin (March 9, 2023). "USC Alumni Paved Path for ChatGPT". USC Viterbi | School of Engineering.
  9. Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (May 24, 2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". arXiv:1810.04805 [cs.CL].