1. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).
2. Devlin, J. et al. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
3. Radford, A. et al. Improving language understanding by generative pre-training. OpenAI: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (2018).
4. Touvron, H. et al. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).
5. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, https://arxiv.org/pdf/2303.08774.pdf (2023).