Transformer (machine learning model)
A transformer is a deep learning model, a kind of machine learning where computers learn from data on their own. Transformers were introduced in the 2017 paper "Attention Is All You Need" by a team at Google Brain.[1] They are widely used for training large language models. Text given to a transformer is first tokenized, meaning the words are changed into a format, such as a list of numbers, that the model can analyze.[2] Transformers then process all parts of an input sequence at the same time (a small code sketch of this is shown below).[3] This is in contrast to older, slower sequential models, such as recurrent neural networks, which process data one step at a time.[4] Transformers are used in many fields, including language, images, and audio, and are the basis of models such as GPT, which powers the chatbot ChatGPT.
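The following Python sketch (using NumPy) is a toy illustration of the two ideas above: tokenization, which turns words into numbers, and self-attention, which lets every position in the sequence be processed at the same time. The five-word vocabulary, the vector size, and the attention function here are made up for illustration; this is not code from any real transformer library.

```python
import numpy as np

# (1) Tokenization: map each word to an integer ID (a toy vocabulary).
sentence = "attention is all you need"
vocab = {word: i for i, word in enumerate(sorted(set(sentence.split())))}
token_ids = [vocab[word] for word in sentence.split()]
print(token_ids)  # [1, 2, 0, 4, 3]

# (2) Each token ID is looked up in an embedding table,
# turning every word into a list of numbers (a vector).
d_model = 8  # size of each token's vector (illustrative value)
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), d_model))
x = embeddings[token_ids]  # shape: (sequence length, d_model)

# (3) Scaled dot-product attention over the whole sequence at once.
# One matrix product compares every position with every other position;
# there is no step-by-step loop over the sequence, which is why
# transformers can run in parallel, unlike recurrent models.
def attention(q, k, v):
    scores = q @ k.T / np.sqrt(q.shape[-1])  # pairwise similarity of positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax rows
    return weights @ v  # each output is a weighted mix of all positions

output = attention(x, x, x)  # self-attention: q, k, v all come from x
print(output.shape)  # (5, 8): one updated vector per token
```

In a real transformer the embedding table and attention weights are learned during training, and many attention layers are stacked, but the parallel, all-positions-at-once computation is the same.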
References
- "Attention Is All You Need". arxiv. Google Brain. Retrieved 14 August 2023.
- Lokare, Ganesh. "Preparing Text Data for Transformers: Tokenization, Mapping and Padding". medium. Retrieved 14 August 2023.
- "Parallel Attention Mechanisms in Neural Machine Translation". arxiv. 17th IEEE International Conference on Machine Learning and Applications 2018. Retrieved 14 August 2023.
- "Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks". arXiv. NAACL 2016. Retrieved 14 August 2023.