This page collects, chapter by chapter, the URLs that appear in 「誰でもわかる大規模言語モデル入門」. Use it to access each linked page easily.
■Before Reading This Book
Nikkei BP website (to find this book, search by its title or its ISBN; when searching by ISBN, enter the number without the hyphens)
https://bookplus.nikkei.com/catalog
■Chapter 1
Footnote URLs
- Note 1 https://techcrunch.com/2023/11/06/openais-chatgpt-now-has-100-million-weekly-active-users
- Note 2 https://fortune.com/2023/08/30/chatgpt-creator-openai-earnings-80-million-a-month-1-billion-annual-revenue-540-million-loss-sam-altman
- Note 4 https://openai.com/index/openai-and-apple-announce-partnership
- Note 5 https://x.ai/blog/grok-2
- Note 6 https://jpn.nec.com/techrep/journal/g23/n02/230204.html
- Note 9 https://blog.google/technology/ai/google-gemini-ai/#capabilities
■Chapter 2
URLs in the main text
- ChatGPT (https://chat.openai.com)
Footnote URLs
- Note 10 Results of the FY2020 (Reiwa 2) "Survey on the Current State of School Libraries" (URL: https://www.mext.go.jp/a_menu/shotou/dokusho/link/1410430_00001.htm)
- Note 14 "Launch of one of Japan's largest computing platforms for generative AI development and full-scale start of development of a domestic large language model (LLM)" (https://www.softbank.jp/corp/news/press/sbkk/2023/20231031_01)
- Note 16 https://www.forbes.com/sites/craigsmith/2023/09/08/what-large-models-cost-you--there-is-no-free-ai-lunch/?sh=1b2cbe314af7
Figure URLs
- Figure 2-15 The ability to perform multi-step arithmetic (left), to pass college-level exams (center), and to identify the intended meaning of a word in context (right). Characterizing Emergent Phenomena in Large Language Models (URL: https://research.google/blog/characterizing-emergent-phenomena-in-large-language-models)
■Chapter 3
Figure URLs
- Figure 3-11 The LLaMA-2 GitHub page (URL: https://github.com/meta-llama/llama)
- Figure 3-12 Meta's official LLaMA homepage (https://llama.meta.com)
■Chapter 4
Footnote URLs
- Note 1 https://corp.rakuten.co.jp/news/press/2024/0321_01.html
- Note 2 https://www.softbank.jp/corp/news/press/sbkk/2023/20231031_01
■Chapter 5
URLs in the main text
- CommonCrawl (https://commoncrawl.org)
- Wikipedia (https://ja.wikipedia.org/wiki/Wikipedia:Database_download)
- BookCorpus (https://huggingface.co/datasets/bookcorpus)
- Project Gutenberg (https://www.gutenberg.org)
- OpenWebTextCorpus (https://skylion007.github.io/OpenWebTextCorpus)
Footnote URLs
- Note 2 https://openai.com/blog/chatgpt
- Note 3 https://ja.wikipedia.org/wiki/日本の首都
- Note 4 https://ja.wikipedia.org/wiki/日本の首都
- Note 6 AWS EC2 P4 instances (URL: https://aws.amazon.com/jp/ec2/instance-types/p4)
Figure URLs
- Figure 5-13 Tay's inappropriate remarks *Quoted from the BBC article (https://www.bbc.com/news/technology-35890188)
■Chapter 6
Footnote URLs
- Note 1 https://openai.com/research/instruction-following#sample6
- Note 8 Paper: (Bai et al. 2022); dataset: https://huggingface.co/datasets/Anthropic/hh-rlhf
Figure URLs
- Figure 6-5 Screenshot of the LLaMA-2 model list (https://huggingface.co/collections/meta-llama/llama-2-family-661da1f90a9d678b6f55773b)
- Figure 6-9 The Hugging Face page hosting the P3 dataset (https://huggingface.co/datasets/bigscience/P3)
- Figure 6-10 The Hugging Face page hosting the Self-Instruct dataset (https://huggingface.co/datasets/yizhongw/self_instruct)
- Figures 6-14, 6-15, 6-16 Comparison of the base model (GPT), the fine-tuned model (Supervised Fine-tuning), and the human-feedback model (InstructGPT) (graphs from OpenAI, "Aligning language models to follow instructions": https://openai.com/index/instruction-following)
- Figure 6-18 A portion of the Anthropic Helpful and Harmless dataset (quoted from https://huggingface.co/datasets/Anthropic/hh-rlhf)
- Figure 6-20 The RLHF workflow in ChatGPT (source: https://openai.com/index/chatgpt)
■Chapter 7
Footnote URLs
- Note 1 Understanding searches better than ever before (https://blog.google/products/search/search-language-understanding-bert)
- Note 4 https://github.com/jessevig/bertviz
■Chapter 8
URLs in the main text
- Gemini API URL: https://aistudio.google.com/app/apikey
Figure URLs
- Figure 8-1 Gemini plans (URL: https://ai.google.dev/pricing?hl=ja)
- Figure 8-3 Part of the Gemini API terms of service (URL: https://ai.google.dev/gemini-api/terms?hl=ja)
- Figure 8-4 Part of OpenAI's terms of use (URL: https://openai.com/ja-JP/policies/row-privacy-policy)
- Figure 8-5 Description of Google Colab (URL: https://colab.research.google.com/?hl=ja)
- Figure 8-25 GenerationConfig documentation (https://ai.google.dev/api/generate-content?hl=ja#generationconfig)
■Chapter 9
Footnote URLs
- Note 1 https://openai.com/index/gpt-3-apps
- Note 2 https://www.amazon.science/blog/using-generative-ai-to-improve-extreme-multilabel-classification
- Note 3 https://matplotlib.org
- Note 4 https://seaborn.pydata.org
Figure URLs
- Figure 9-2 An example of a service built on GPT-3: Viable (source: https://openai.com/index/gpt-3-apps)
- Figure 9-3 Amazon's generative-AI review highlights feature *Not yet available in Japan as of May 2024 (source: https://www.aboutamazon.com/news/amazon-ai/amazon-improves-customer-reviews-with-generative-ai)
- Figure 9-17 List of Gemini API errors (https://ai.google.dev/gemini-api/docs/troubleshooting?hl=ja)
- Figure 9-18 List of OpenAI API errors (https://platform.openai.com/docs/guides/error-codes/api-errors)
- Figure 9-20 Matplotlib homepage (https://matplotlib.org)
■Chapter 10
URLs in the main text
- Quickstart for authentication with OAuth (Figure 10-16): https://ai.google.dev/gemini-api/docs/oauth?hl=ja
Figure URLs
- Figure 10-17 Gemini can also be fine-tuned in the browser (URL: https://aistudio.google.com/app/tune)
■Chapter 11
Figure URLs
- Figure 11-1 Google's prompt gallery (source: https://ai.google.dev/gemini-api/prompts?hl=ja)
■Chapter 12
Footnote URLs
- Note 1 Site URL: https://lmarena.ai; paper: (Chiang et al. 2024)
- Note 2 https://huggingface.co/tokyotech-llm
- Note 3 https://huggingface.co/Qwen
- Note 4 https://huggingface.co/mistralai
- Note 5 https://huggingface.co/AdaptLLM/law-LLM
- Note 8 https://blog.google/technology/ai/google-gemini-ai/#performance
- Note 9 "Solving math word problems," OpenAI (URL: https://openai.com/research/solving-math-word-problems)
Figure URLs
- Figure 12-2 Comparison on Chatbot Arena (Overall) (source: https://lmarena.ai/?leaderboard)
- Figure 12-4 Example of using the token counter provided on the official OpenAI site (URL: https://platform.openai.com/tokenizer)
- Figure 12-5 LLMs published by OpenAI (source: https://platform.openai.com/docs/models)
- Figure 12-6 Hugging Face homepage (URL: https://huggingface.co)
- Figure 12-7 Open LLM Leaderboard (URL: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Figure 12-8 LLaMA-3 model list (URL: https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6)
- Figure 12-9 Gemini benchmark results (source: https://blog.google/technology/ai/google-gemini-ai/#performance)
■Chapter 13
Figure URLs
- Figure 13-1 Connecting an LLM with various capabilities. Quoted from the official LangChain site (https://www.langchain.com/langchain)
- Figure 13-5 Official LangChain documentation (https://python.langchain.com/docs/introduction)
■References
Chapter 1
[1] OpenAI et al., “GPT-4 Technical Report,” arXiv [cs.CL], Mar. 15, 2023. [Online]. Available: http://arxiv.org/abs/2303.08774
[2] Gemini Team et al., “Gemini: A Family of Highly Capable Multimodal Models,” arXiv [cs.CL], Dec. 19, 2023. [Online]. Available: http://arxiv.org/abs/2312.11805
[3] J. Kaplan et al., “Scaling Laws for Neural Language Models,” arXiv [cs.LG], Jan. 23, 2020. [Online]. Available: http://arxiv.org/abs/2001.08361
[4] J. Wei et al., “Emergent Abilities of Large Language Models,” arXiv [cs.CL], Jun. 15, 2022. [Online]. Available: http://arxiv.org/abs/2206.07682
[5] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288
Chapter 2
[1] D. Jurafsky and J. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition with Language Models, 3rd Edition. 2024. [Online]. Available: https://web.stanford.edu/~jurafsky/slp3/.
[2] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288
[3] J. FitzGerald et al., “MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds., Toronto, Canada: Association for Computational Linguistics, Jul. 2023, pp. 4277–4302.
[4] A. Radford and K. Narasimhan, “Improving Language Understanding by Generative Pre-Training,” 2018, Accessed: Feb. 18, 2024. [Online]. Available: https://www.semanticscholar.org/paper/cd18800a0fe0b668a1cc19f2ec95b5003d0a5035
[5] J. Kaplan et al., “Scaling Laws for Neural Language Models,” arXiv [cs.LG], Jan. 23, 2020. [Online]. Available: http://arxiv.org/abs/2001.08361
[6] J. Hoffmann et al., “Training Compute-Optimal Large Language Models,” arXiv [cs.CL], Mar. 29, 2022. [Online]. Available: http://arxiv.org/abs/2203.15556
[7] P. Villalobos, J. Sevilla, L. Heim, T. Besiroglu, M. Hobbhahn, and A. Ho, “Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning,” arXiv [cs.LG], Oct. 26, 2022. [Online]. Available: http://arxiv.org/abs/2211.04325
Chapter 3
[1] T. Kudo and J. Richardson, “SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing,” arXiv [cs.CL], Aug. 19, 2018. [Online]. Available: http://arxiv.org/abs/1808.06226
Chapter 4
[1] A. Vaswani et al., “Attention Is All You Need,” arXiv [cs.CL], Jun. 12, 2017. [Online]. Available: http://arxiv.org/abs/1706.03762
[2] S. Wu et al., “BloombergGPT: A Large Language Model for Finance,” arXiv [cs.LG], Mar. 30, 2023. [Online]. Available: http://arxiv.org/abs/2303.17564
[3] J. Cui, Z. Li, Y. Yan, B. Chen, and L. Yuan, “ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases,” arXiv [cs.CL], Jun. 28, 2023. [Online]. Available: http://arxiv.org/abs/2306.16092
Chapter 5
[1] L. Berglund et al., “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A’,” arXiv [cs.CL], Sep. 21, 2023. [Online]. Available: http://arxiv.org/abs/2309.12288
[2] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288
[3] N. Sachdeva et al., “How to Train Data-Efficient LLMs,” arXiv [cs.LG], Feb. 15, 2024. [Online]. Available: http://arxiv.org/abs/2402.09668
[4] S. Zhang et al., “OPT: Open Pre-trained Transformer Language Models,” arXiv [cs.CL], May 02, 2022. [Online]. Available: http://arxiv.org/abs/2205.01068
Chapter 6
[1] L. Ouyang et al., “Training language models to follow instructions with human feedback,” arXiv [cs.CL], Mar. 04, 2022. [Online]. Available: http://arxiv.org/abs/2203.02155
[2] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288
[3] V. Sanh et al., “Multitask Prompted Training Enables Zero-Shot Task Generalization,” arXiv [cs.LG], Oct. 15, 2021. [Online]. Available: http://arxiv.org/abs/2110.08207
[4] Y. Wang et al., “Self-Instruct: Aligning Language Models with Self-Generated Instructions,” arXiv [cs.CL], Dec. 20, 2022. [Online]. Available: http://arxiv.org/abs/2212.10560
[5] S. Gehman, S. Gururangan, M. Sap, Y. Choi, and N. A. Smith, “RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models,” arXiv [cs.CL], Sep. 24, 2020. [Online]. Available: http://arxiv.org/abs/2009.11462
[6] S. Lin, J. Hilton, and O. Evans, “TruthfulQA: Measuring How Models Mimic Human Falsehoods,” arXiv [cs.CL], Sep. 08, 2021. [Online]. Available: http://arxiv.org/abs/2109.07958
[7] Y. Bai et al., “Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback,” arXiv [cs.CL], Apr. 12, 2022. [Online]. Available: http://arxiv.org/abs/2204.05862
Chapter 7
[1] A. Vaswani et al., “Attention Is All You Need,” arXiv [cs.CL], Jun. 12, 2017. [Online]. Available: http://arxiv.org/abs/1706.03762
[2] J. Yang et al., “Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond,” arXiv [cs.CL], Apr. 26, 2023. [Online]. Available: http://arxiv.org/abs/2304.13712
[3] W. X. Zhao et al., “A Survey of Large Language Models,” arXiv [cs.CL], Mar. 31, 2023. [Online]. Available: http://arxiv.org/abs/2303.18223v13
[4] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” arXiv [cs.CL], Jul. 18, 2023. [Online]. Available: http://arxiv.org/abs/2307.09288
Chapter 8
[1] T. B. Brown et al., “Language Models are Few-Shot Learners,” arXiv [cs.CL], May 28, 2020. [Online]. Available: http://arxiv.org/abs/2005.14165
Chapter 11
[1] A. Kong et al., “Better Zero-Shot Reasoning with Role-Play Prompting,” arXiv [cs.CL], Aug. 15, 2023. [Online]. Available: http://arxiv.org/abs/2308.07702
[2] OpenAI et al., “GPT-4 Technical Report,” arXiv [cs.CL], Mar. 15, 2023. [Online]. Available: http://arxiv.org/abs/2303.08774
Chapter 12
[1] W.-L. Chiang et al., “Chatbot Arena: An open platform for evaluating LLMs by human preference,” arXiv [cs.AI], Mar. 06, 2024. Accessed: Oct. 13, 2024. [Online]. Available: http://arxiv.org/abs/2403.04132
[2] OpenAI et al., “GPT-4 Technical Report,” arXiv [cs.CL], Mar. 15, 2023. [Online]. Available: http://arxiv.org/abs/2303.08774
[3] D. Hendrycks et al., “Measuring massive multitask language understanding,” arXiv [cs.CY], Sep. 07, 2020. Accessed: Oct. 13, 2024. [Online]. Available: http://arxiv.org/abs/2009.03300
[4] R. Zellers, A. Holtzman, Y. Bisk, A. Farhadi, and Y. Choi, “HellaSwag: Can a machine really finish your sentence?,” arXiv [cs.CL], May 19, 2019. [Online]. Available: http://arxiv.org/abs/1905.07830
[5] K. Sakaguchi, R. Le Bras, C. Bhagavatula, and Y. Choi, “WinoGrande: An adversarial Winograd Schema Challenge at scale,” Proc. Conf. AAAI Artif. Intell., vol. 34, no. 05, pp. 8732–8740, Apr. 2020.
[6] M. Chen et al., “Evaluating large language models trained on code,” arXiv [cs.LG], Jul. 07, 2021. [Online]. Available: http://arxiv.org/abs/2107.03374