This page collects, chapter by chapter, the URLs that appear in 「誰でもわかる大規模言語モデル入門」. Use it to reach the linked pages easily.

■Before Reading This Book

Nikkei BP website (to find this book, search by its title or its ISBN; when searching by ISBN, enter it without the hyphens)

https://bookplus.nikkei.com/catalog
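If you keep ISBNs in hyphenated form, the hyphen rule above can also be applied programmatically before pasting into the search box. The following is a minimal Python sketch, not taken from the book; the ISBN shown is a hypothetical placeholder used only to illustrate the hyphen removal.

def normalize_isbn(isbn: str) -> str:
    # Strip hyphens and surrounding whitespace so the ISBN matches
    # the format the search form expects.
    return isbn.replace("-", "").strip()

# Hypothetical example ISBN (placeholder digits), for illustration only.
print(normalize_isbn("978-4-296-00000-0"))  # -> 9784296000000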

■Chapter 1

Footnote URLs

■Chapter 2

In-text URLs

Footnote URLs

Figure URLs

■Chapter 3

Footnote URLs

Figure URLs

■Chapter 4

Footnote URLs

■Chapter 5

In-text URLs

Footnote URLs

Figure URLs

■Chapter 6

Footnote URLs

Figure URLs

■Chapter 7

Footnote URLs

■Chapter 8

In-text URLs

Footnote URLs

Figure URLs

■Chapter 9

Footnote URLs

Figure URLs

■Chapter 10

In-text URLs

Figure URLs

■Chapter 11

Figure URLs

■Chapter 12

Footnote URLs

Figure URLs

■Chapter 13

Footnote URLs

Figure URLs

■References

Chapter 1

[1] OpenAI et al., “GPT-4 Technical Report,” arXiv [cs.CL], Mar. 15, 2023. [Online]. Available: http://arxiv.org/abs/2303.08774

[2] Gemini Team et al., “Gemini: A Family of Highly Capable Multimodal Models,” arXiv [cs.CL], Dec. 19, 2023. [Online]. Available: http://arxiv.org/abs/2312.11805

[3] J. Kaplan et al., “Scaling Laws for Neural Language Models,” arXiv [cs.LG], Jan. 23, 2020. [Online]. Available: http://arxiv.org/abs/2001.08361

[4] J. Wei et al., “Emergent Abilities of Large Language Models,” arXiv [cs.CL], Jun. 15, 2022. [Online]. Available: http://arxiv.org/abs/2206.07682

[5] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288

Chapter 2

[1] D. Jurafsky and J. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition with Language Models, 3rd Edition. 2024. [Online]. Available: https://web.stanford.edu/~jurafsky/slp3/

[2] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288

[3] J. FitzGerald et al., “MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds., Toronto, Canada: Association for Computational Linguistics, Jul. 2023, pp. 4277–4302.

[4] A. Radford and K. Narasimhan, “Improving Language Understanding by Generative Pre-Training,” 2018, Accessed: Feb. 18, 2024. [Online]. Available: https://www.semanticscholar.org/paper/cd18800a0fe0b668a1cc19f2ec95b5003d0a5035

[5] J. Kaplan et al., “Scaling Laws for Neural Language Models,” arXiv [cs.LG], Jan. 23, 2020. [Online]. Available: http://arxiv.org/abs/2001.08361

[6] J. Hoffmann et al., “Training Compute-Optimal Large Language Models,” arXiv [cs.CL], Mar. 29, 2022. [Online]. Available: http://arxiv.org/abs/2203.15556

[7] P. Villalobos, J. Sevilla, L. Heim, T. Besiroglu, M. Hobbhahn, and A. Ho, “Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning,” arXiv [cs.LG], Oct. 26, 2022. [Online]. Available: http://arxiv.org/abs/2211.04325

Chapter 3

[1] T. Kudo and J. Richardson, “SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing,” arXiv [cs.CL], Aug. 19, 2018. [Online]. Available: http://arxiv.org/abs/1808.06226

Chapter 4

[1] A. Vaswani et al., “Attention Is All You Need,” arXiv [cs.CL], Jun. 12, 2017. [Online]. Available: http://arxiv.org/abs/1706.03762

[2] S. Wu et al., “BloombergGPT: A Large Language Model for Finance,” arXiv [cs.LG], Mar. 30, 2023. [Online]. Available: http://arxiv.org/abs/2303.17564

[3] J. Cui, Z. Li, Y. Yan, B. Chen, and L. Yuan, “ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases,” arXiv [cs.CL], Jun. 28, 2023. [Online]. Available: http://arxiv.org/abs/2306.16092

Chapter 5

[1] L. Berglund et al., “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A,’” arXiv [cs.CL], Sep. 21, 2023. [Online]. Available: http://arxiv.org/abs/2309.12288

[2] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288

[3] N. Sachdeva et al., “How to Train Data-Efficient LLMs,” arXiv [cs.LG], Feb. 15, 2024. [Online]. Available: http://arxiv.org/abs/2402.09668

[4] S. Zhang et al., “OPT: Open Pre-trained Transformer Language Models,” arXiv [cs.CL], May 02, 2022. [Online]. Available: http://arxiv.org/abs/2205.01068

Chapter 6

[1] L. Ouyang et al., “Training language models to follow instructions with human feedback,” arXiv [cs.CL], Mar. 04, 2022. [Online]. Available: http://arxiv.org/abs/2203.02155

[2] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288

[3] V. Sanh et al., “Multitask Prompted Training Enables Zero-Shot Task Generalization,” arXiv [cs.LG], Oct. 15, 2021. [Online]. Available: http://arxiv.org/abs/2110.08207

[4] Y. Wang et al., “Self-Instruct: Aligning Language Models with Self-Generated Instructions,” arXiv [cs.CL], Dec. 20, 2022. [Online]. Available: http://arxiv.org/abs/2212.10560

[5] S. Gehman, S. Gururangan, M. Sap, Y. Choi, and N. A. Smith, “RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models,” arXiv [cs.CL], Sep. 24, 2020. [Online]. Available: http://arxiv.org/abs/2009.11462

[6] S. Lin, J. Hilton, and O. Evans, “TruthfulQA: Measuring How Models Mimic Human Falsehoods,” arXiv [cs.CL], Sep. 08, 2021. [Online]. Available: http://arxiv.org/abs/2109.07958

[7] Y. Bai et al., “Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback,” arXiv [cs.CL], Apr. 12, 2022. [Online]. Available: http://arxiv.org/abs/2204.05862

Chapter 7

[1] A. Vaswani et al., “Attention Is All You Need,” arXiv [cs.CL], Jun. 12, 2017. [Online]. Available: http://arxiv.org/abs/1706.03762

[2] J. Yang et al., “Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond,” arXiv [cs.CL], Apr. 26, 2023. [Online]. Available: http://arxiv.org/abs/2304.13712

[3] W. X. Zhao et al., “A Survey of Large Language Models,” arXiv [cs.CL], Mar. 31, 2023. [Online]. Available: http://arxiv.org/abs/2303.18223v13

[4] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288

Chapter 8

[1] T. B. Brown et al., “Language Models are Few-Shot Learners,” arXiv [cs.CL], May 28, 2020. [Online]. Available: http://arxiv.org/abs/2005.14165

Chapter 11

[1] A. Kong et al., “Better Zero-Shot Reasoning with Role-Play Prompting,” arXiv [cs.CL], Aug. 15, 2023. [Online]. Available: http://arxiv.org/abs/2308.07702

[2] OpenAI et al., “GPT-4 Technical Report,” arXiv [cs.CL], Mar. 15, 2023. [Online]. Available: http://arxiv.org/abs/2303.08774

Chapter 12

[1] W.-L. Chiang et al., “Chatbot Arena: An open platform for evaluating LLMs by human preference,” arXiv [cs.AI], Mar. 06, 2024. Accessed: Oct. 13, 2024. [Online]. Available: http://arxiv.org/abs/2403.04132

[2] OpenAI et al., “GPT-4 Technical Report,” arXiv [cs.CL], Mar. 15, 2023. [Online]. Available: http://arxiv.org/abs/2303.08774

[3] D. Hendrycks et al., “Measuring massive multitask language understanding,” arXiv [cs.CY], Sep. 07, 2020. Accessed: Oct. 13, 2024. [Online]. Available: http://arxiv.org/abs/2009.03300

[4] R. Zellers, A. Holtzman, Y. Bisk, A. Farhadi, and Y. Choi, “HellaSwag: Can a machine really finish your sentence?,” arXiv [cs.CL], May 19, 2019. [Online]. Available: http://arxiv.org/abs/1905.07830

[5] K. Sakaguchi, R. Le Bras, C. Bhagavatula, and Y. Choi, “WinoGrande: An adversarial Winograd Schema Challenge at scale,” Proc. Conf. AAAI Artif. Intell., vol. 34, no. 05, pp. 8732–8740, Apr. 2020.

[6] M. Chen et al., “Evaluating large language models trained on code,” arXiv [cs.LG], Jul. 07, 2021. [Online]. Available: http://arxiv.org/abs/2107.03374