{"id":3024,"date":"2025-12-10T00:21:20","date_gmt":"2025-12-09T16:21:20","guid":{"rendered":"https:\/\/www.tiptinker.com\/attention-is-all-you-need-a-visual-guide-to-the-transformer-architecture\/"},"modified":"2025-12-10T03:34:19","modified_gmt":"2025-12-09T19:34:19","slug":"attention-is-all-you-need-a-visual-guide-to-the-transformer-architecture","status":"publish","type":"post","link":"https:\/\/www.tiptinker.com\/ja\/attention-is-all-you-need-a-visual-guide-to-the-transformer-architecture\/","title":{"rendered":"Attention Is All You Need: A Visual Guide to the Transformer Architecture"},"content":{"rendered":"<p>In 2017, the paper \u300cAttention Is All You Need\u300d, published by researchers at Google Brain, marked a turning point in the history of AI. The <strong>Transformer<\/strong>, the technology behind today's ChatGPT, Claude, and DeepL, was born from this paper.<\/p>\n<p>How did the Transformer eliminate the fatal bottlenecks of earlier RNNs (recurrent neural networks) and LSTMs, namely their inability to process in parallel and to retain long-range context? This article explains its core, the \"Attention mechanism\", and the structure of the architecture, both visually and technically.<\/p>\n<h2>Why the Transformer Is Revolutionary<\/h2>\n<p>Before the Transformer, natural language processing had to handle words one at a time, starting from the beginning of the sentence. By using <strong>Attention<\/strong>, the Transformer made it possible to compute the relationships between all words in a sentence <strong>simultaneously (in parallel)<\/strong>.<\/p>\n<p>The core ideas come down to three points:<\/p>\n<ol>\n<li><strong>Parallelism:<\/strong> All data can be processed at once, so training speed improves dramatically.<\/li>\n<li><strong>Self-Attention:<\/strong> Quantifies how strongly each word in a sentence relates to every other word.<\/li>\n<li><strong>Positional Encoding:<\/strong> Mathematically injects word-position information into a structure that otherwise has no notion of order.<\/li>\n<\/ol>\n<hr 
\/>\n<h2>Architecture Diagram: How Data Flows<\/h2>\n<p>The Transformer's structure looks complex, but it splits broadly into an encoder (which compresses information) and a decoder (which generates output). The diagram below visualizes the data flow.<\/p>\n<div class=\"easy-mermaid-wrapper\">\n<pre><code class=\"language-mermaid\">graph TD\r\n    subgraph \"Encoder Stack (Input Processing)\"\r\n    A[\"Input Text\"] --&gt; B[\"Input Embedding\"]\r\n    B --&gt; C[\"Positional Encoding\"]\r\n    C --&gt; D[\"Multi-Head Attention\"]\r\n    D --&gt; E[\"Add &amp; Norm\"]\r\n    E --&gt; F[\"Feed Forward\"]\r\n    F --&gt; G[\"Add &amp; Norm\"]\r\n    end\r\n\r\n    subgraph \"Decoder Stack (Output Generation)\"\r\n    H[\"Outputs \/ Shifted Right\"] --&gt; I[\"Output Embedding\"]\r\n    I --&gt; J[\"Positional Encoding\"]\r\n    J --&gt; K[\"Masked Multi-Head Attention\"]\r\n    K --&gt; L[\"Add &amp; Norm\"]\r\n    G --&gt; M[\"Encoder-Decoder Attention (Cross Attention)\"]\r\n    L --&gt; M\r\n    M --&gt; N[\"Add &amp; Norm\"]\r\n    N --&gt; O[\"Feed Forward\"]\r\n    O --&gt; P[\"Add &amp; Norm\"]\r\n    end\r\n\r\n    P --&gt; Q[\"Linear\"]\r\n    Q --&gt; R[\"Softmax\"]\r\n    R --&gt; S[\"Output Probabilities\"]\r\n\r\n    style A fill:#f9f,stroke:#333,stroke-width:2px\r\n    style S fill:#9ff,stroke:#333,stroke-width:2px\r\n    style D fill:#ff9,stroke:#333,stroke-width:2px\r\n    style M fill:#ff9,stroke:#333,stroke-width:2px\r\n<\/code><\/pre>\n<\/div>\n<hr \/>\n<h2>Core Code: Scaled Dot-Product Attention<\/h2>\n<p>The heart of the Transformer is a matrix computation over three vectors: <strong>Query (Q), Key (K), and Value (V)<\/strong>. It works much like a database lookup: you issue a query, match it against keys, and retrieve the corresponding values.<\/p>\n<p>Below is a PyTorch implementation of the paper's formula $\\text{Attention}(Q, K, V) = \\text{softmax}(\\frac{QK^T}{\\sqrt{d_k}})V$.<\/p>\n<pre><code class=\"language-python\">import torch\r\nimport torch.nn.functional as F\r\nimport math\r\n\r\ndef scaled_dot_product_attention(query, key, value, mask=None):\r\n    \"\"\"\r\n    Implementation of Scaled Dot-Product Attention\r\n    Args:\r\n        query: search queries (Batch, Heads, 
Seq_Len_Q, Depth)\r\n        key: keys to match against (Batch, Heads, Seq_Len_K, Depth)\r\n        value: values to aggregate (Batch, Heads, Seq_Len_K, Depth)\r\n        mask: optional mask for ignoring certain positions\r\n    \"\"\"\r\n    d_k = query.size(-1) # depth of each head\r\n    \r\n    # 1. Dot product of Q with K transposed (relevance scores)\r\n    scores = torch.matmul(query, key.transpose(-2, -1))\r\n    \r\n    # 2. Scale by the square root of the depth (prevents vanishing gradients)\r\n    scores = scores \/ math.sqrt(d_k)\r\n    \r\n    # 3. Apply the mask if given (e.g. to hide future tokens)\r\n    if mask is not None:\r\n        scores = scores.masked_fill(mask == 0, -1e9)\r\n    \r\n    # 4. Softmax turns the scores into a probability distribution (weights)\r\n    attention_weights = F.softmax(scores, dim=-1)\r\n    \r\n    # 5. Weighted average of the Values\r\n    output = torch.matmul(attention_weights, value)\r\n    \r\n    return output, attention_weights\r\n<\/code><\/pre>\n<hr \/>\n<h2>Step by Step: How the Processing Works<\/h2>\n<p>When a Transformer translates or interprets the sentence \"I love AI\", the following happens internally.<\/p>\n<ol>\n<li><strong>Tokenization and Embedding<\/strong>\n<ul>\n<li>The text is converted to numeric IDs and then to dense vectors (e.g. sequences of 512 numbers). At this stage, semantically similar words such as \"King\" and \"Queen\" are placed mathematically close together.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Positional Encoding<\/strong>\n<ul>\n<li>Because the Transformer processes everything in parallel, it has no inherent notion of order. Sine and cosine values are therefore added to each vector, embedding the information \"this is the 1st word\", \"this is the 2nd word\", and so on.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Self-Attention<\/strong>\n<ul>\n<li>This is the most important step. Given a word like \"it\", the model determines whether it refers to \"animal\" or \"street\" in the preceding text by computing relevance scores against the surrounding words.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Feed-Forward and Normalization<\/strong>\n<ul>\n<li>The extracted features are processed further and passed to the next layer. Residual connections keep training stable even in deep stacks.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Decoding and Output<\/strong>\n<ul>\n<li>The decoder receives the context built by the encoder and predicts\/generates the next most probable word one token at a time (autoregressively).<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<hr \/>\n<h2>Comparison: RNN vs Transformer<\/h2>\n<p>Why did the world abandon RNNs and LSTMs in favor of the Transformer?<\/p>\n<table>\n<thead>\n<tr>\n<th align=\"left\">Feature<\/th>\n<th align=\"left\">RNN \/ LSTM (legacy)<\/th>\n<th align=\"left\">Transformer (current)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td align=\"left\"><strong>Processing<\/strong><\/td>\n<td 
align=\"left\">Sequential (one word at a time)<\/td>\n<td align=\"left\"><strong>Parallel (all words at once)<\/strong><\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Speed<\/strong><\/td>\n<td align=\"left\">Slow (cannot exploit GPU parallelism)<\/td>\n<td align=\"left\"><strong>Extremely fast<\/strong><\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Long-range dependencies<\/strong><\/td>\n<td align=\"left\">Weak (forgets context as sentences grow longer)<\/td>\n<td align=\"left\"><strong>Strong (can attend to any position, however distant)<\/strong><\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Training data scale<\/strong><\/td>\n<td align=\"left\">Limited<\/td>\n<td align=\"left\"><strong>Can be trained on web-scale data<\/strong><\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Main uses<\/strong><\/td>\n<td align=\"left\">Earlier machine translation, time-series data<\/td>\n<td align=\"left\"><strong>LLMs (GPT-4), BERT, image generation<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<h2>Pro-Tips: Implementation and Practical Pointers<\/h2>\n<p>Practical hints for working with Transformer models, or for deepening your understanding of them.<\/p>\n<ul>\n<li><strong>Understand the context window:<\/strong><br \/>\nThe input-token limits of Transformer models (e.g. 128k for GPT-4) stem from the fact that the memory used by the attention matrix grows quadratically with sequence length ($O(N^2)$). Keep this computational cost in mind when sending long prompts.<\/li>\n<li><strong>Fine-tune efficiently:<\/strong><br \/>\nTraining a Transformer from scratch requires enormous resources. In practice, the common approach is to take a pretrained model (such as BERT or Llama) and adjust only a subset of the attention-layer weights using techniques like <strong>LoRA (Low-Rank Adaptation)<\/strong>.<\/li>\n<li><strong>Set the temperature deliberately:<\/strong><br \/>\nThe temperature parameter applied to the final softmax layer controls how flat the output probability distribution is. Raise it when you need creativity (a flatter distribution makes rarer words more likely to be picked); lower it when you need accuracy.<\/li>\n<\/ul>\n<hr \/>\n<p>\u300cAttention Is 
All You Need\u300d is more than a paper title: it expresses the basic principle of modern AI. By discarding recurrence and convolution entirely and concentrating its resources on attention alone, this architecture now underpins the generative-AI boom.<\/p>\n<p>The next time you use ChatGPT, picture the hundreds of millions of matrix operations (attention computations) being carried out, in an instant, over every word you typed.<\/p>\n<h3>References<\/h3>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1706.03762\">Attention Is All You Need (Vaswani et al., 2017) &#8211; arXiv<\/a><\/li>\n<li><a href=\"https:\/\/jalammar.github.io\/illustrated-transformer\/\">The Illustrated Transformer (Jay Alammar)<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In 2017, the paper \u300cAttention Is All You Need\u300d, published by researchers at Google Brain, marked a turning point in the history of AI. The Transformer, the technology behind today's ChatGPT, Claude, and DeepL, was born from this paper. 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3017,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[56],"tags":[],"class_list":["post-3024","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tips-tutorials"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/posts\/3024","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/comments?post=3024"}],"version-history":[{"count":0,"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/posts\/3024\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/media\/3017"}],"wp:attachment":[{"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/media?parent=3024"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/categories?post=3024"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tiptinker.com\/ja\/wp-json\/wp\/v2\/tags?post=3024"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}