{"id":3019,"date":"2025-12-10T00:21:20","date_gmt":"2025-12-09T16:21:20","guid":{"rendered":"https:\/\/www.tiptinker.com\/attention-is-all-you-need-a-visual-guide-to-the-transformer-architecture\/"},"modified":"2025-12-10T03:30:54","modified_gmt":"2025-12-09T19:30:54","slug":"attention-is-all-you-need-a-visual-guide-to-the-transformer-architecture","status":"publish","type":"post","link":"https:\/\/www.tiptinker.com\/ko\/attention-is-all-you-need-a-visual-guide-to-the-transformer-architecture\/","title":{"rendered":"Attention Is All You Need: A Complete Visual Guide to the Transformer Architecture"},"content":{"rendered":"<p>At the heart of every modern AI model we use, including ChatGPT, Claude, and Gemini, lies an architecture called the <strong>Transformer<\/strong>. The 2017 Google paper <em>&#8220;Attention Is All You Need&#8221;<\/em> rewrote the history of artificial intelligence, but the paper itself, dense with equations, is not easy to understand.<\/p>\n<p>Instead of complex mathematics, this article uses <strong>intuitive analogies and visual structure<\/strong> to lay out clearly how the Transformer understands context and generates language.<\/p>\n<h2>1. Core Concept: Why the Transformer? 
(RNN vs Transformer)<\/h2>\n<p>Earlier language models (RNNs, LSTMs) read a sentence the way a human does: one word at a time, left to right. This approach had a fatal flaw: the longer the sentence, the more the model forgot what came at the beginning.<\/p>\n<p>The Transformer solved this problem with two weapons: <strong>parallel processing<\/strong> and <strong>attention<\/strong>.<\/p>\n<ul>\n<li><strong>Parallel processing:<\/strong> The entire sentence is fed in at once, and every word is analyzed simultaneously.<\/li>\n<li><strong>Attention:<\/strong> The model computes how the words in a sentence relate to one another, that is, where each word should focus.<\/li>\n<\/ul>\n<h3>\ud83d\udca1 Analogy: The Librarian<\/h3>\n<ul>\n<li><strong>RNN:<\/strong> A librarian reads a book line by line from the first page to the last, struggling to remember everything. If the book is thick, the earlier content is forgotten.<\/li>\n<li><strong>Transformer:<\/strong> A librarian spreads every page of the book out on the floor and connects related passages with a highlighter, grasping the full context at a glance.<\/li>\n<\/ul>\n<h2>2. Code Block: A Mathematical Implementation of the Attention Mechanism (Python)<\/h2>\n<p>Implementing the Transformer&#8217;s core operation, &#8216;Scaled Dot-Product Attention&#8217;, in Python makes the principle much clearer.<\/p>\n<pre><code class=\"language-python\">import numpy as np\r\nimport torch\r\nimport torch.nn.functional as F\r\n\r\ndef scaled_dot_product_attention(query, key, value, mask=None):\r\n    \"\"\"\r\n    Compute attention outputs from Query, Key, and Value matrices.\r\n    \"\"\"\r\n    d_k = query.size(-1) # dimensionality of the key\/query vectors\r\n    \r\n    # 1. Compute attention scores (dot product of Query and Key, scaled)\r\n    scores = torch.matmul(query, key.transpose(-2, -1)) \/ np.sqrt(d_k)\r\n    \r\n    # 2. Masking (optional: e.g., keeps the decoder from attending to future tokens)\r\n    if mask is not None:\r\n        scores = scores.masked_fill(mask == 0, -1e9)\r\n    \r\n    # 3. Apply softmax (convert the scores into a probability distribution)\r\n    attention_weights = F.softmax(scores, dim=-1)\r\n    \r\n    # 4. 
Weighted sum with Value (produces the final context vector)\r\n    output = torch.matmul(attention_weights, value)\r\n    \r\n    return output, attention_weights\r\n<\/code><\/pre>\n<h2>3. Step-by-Step Guide: How the Transformer Works Internally<\/h2>\n<p>Let&#8217;s visualize, in five steps, how the Transformer processes the sentence &#8220;The bank of the river&#8221;.<\/p>\n<ol>\n<li><strong>Input Embedding &amp; Positional Encoding<\/strong>\n<ul>\n<li><strong>What it does:<\/strong> Converts each word into a numeric vector, and adds positional (order) information at the same time.<\/li>\n<li><strong>Why:<\/strong> Because the Transformer processes all words simultaneously, it needs coordinates that tell it the order of the words.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Self-Attention: The Heart of Context Understanding<\/strong>\n<ul>\n<li><strong>What it does:<\/strong> Computes how strongly each word is related to every other word in the sentence.<\/li>\n<li><strong>The Q, K, V concepts:<\/strong>\n<ul>\n<li><strong>Query (question):<\/strong> &#8220;Do you have information relevant to me (the current word)?&#8221;<\/li>\n<li><strong>Key (index):<\/strong> &#8220;Here is what I contain.&#8221;<\/li>\n<li><strong>Value 
(content):<\/strong> &#8220;And here is my actual data.&#8221;<\/li>\n<\/ul>\n<\/li>\n<li><strong>Example:<\/strong> For the word &#8220;bank&#8221;, the model assigns a high attention score to the nearby word &#8220;river&#8221;, recognizing that it means a riverbank rather than a financial institution.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Multi-Head Attention<\/strong>\n<ul>\n<li><strong>What it does:<\/strong> Runs several copies of self-attention (e.g., 8 heads) in parallel.<\/li>\n<li><strong>Why:<\/strong> A single perspective is not enough. One head may track grammar, another meaning, and another pronoun references.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Feed-Forward Networks<\/strong>\n<ul>\n<li><strong>What it does:<\/strong> Processes the information gathered by attention independently for each word, extracting deeper features.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Encoder-Decoder Attention<\/strong>\n<ul>\n<li><strong>What it does:<\/strong> In tasks such as translation, it passes the source-sentence information analyzed by the encoder to the decoder. When deciding which word to generate, the decoder chooses which parts of the source sentence to consult.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<h2>4. Visual Data: Transformer Architecture Diagram<\/h2>\n<p>The diagram below shows the overall data flow through the Transformer.<\/p>\n<div class=\"easy-mermaid-wrapper\">\n<pre><code class=\"language-mermaid\">graph TD\r\n    Input[\"Input Sentence (Inputs)\"] --&gt; Emb[\"Embedding + Positional Encoding\"]\r\n    Emb --&gt; EncBlock[\"Encoder Block\"]\r\n    \r\n    subgraph Encoder_Layer [\"Encoder Layer (repeated)\"]\r\n        direction TB\r\n        MultiHead1[\"Multi-Head Attention (Self-Attention)\"]\r\n        AddNorm1[\"Add &amp; Norm\"]\r\n        FFN1[\"Feed Forward\"]\r\n        AddNorm2[\"Add &amp; Norm\"]\r\n        \r\n        MultiHead1 --&gt; AddNorm1\r\n        AddNorm1 --&gt; FFN1\r\n        FFN1 --&gt; AddNorm2\r\n    end\r\n    \r\n    EncBlock --&gt; Encoder_Layer\r\n    \r\n    subgraph Decoder_Layer [\"Decoder Layer (repeated)\"]\r\n        direction TB\r\n        MaskedAtt[\"Masked Multi-Head Attention\"]\r\n        AddNorm3[\"Add &amp; Norm\"]\r\n        EncDecAtt[\"Multi-Head Attention (Encoder-Decoder)\"]\r\n        AddNorm4[\"Add &amp; Norm\"]\r\n        FFN2[\"Feed Forward\"]\r\n        AddNorm5[\"Add &amp; Norm\"]\r\n        \r\n        MaskedAtt --&gt; AddNorm3\r\n        AddNorm3 --&gt; EncDecAtt\r\n        EncDecAtt --&gt; AddNorm4\r\n        AddNorm4 --&gt; FFN2\r\n        FFN2 --&gt; AddNorm5\r\n    end\r\n\r\n    
Encoder_Layer -- \"pass K, V matrices\" --&gt; EncDecAtt\r\n    OutputShifted[\"Output Sentence (shifted right)\"] --&gt; EmbDec[\"Embedding + Positional Encoding\"]\r\n    EmbDec --&gt; Decoder_Layer\r\n    Decoder_Layer --&gt; Linear[\"Linear &amp; Softmax\"]\r\n    Linear --&gt; FinalProb[\"Next-Word Probabilities\"]\r\n<\/code><\/pre>\n<\/div>\n<h3>Key Transformer Components at a Glance<\/h3>\n<table>\n<thead>\n<tr>\n<th align=\"left\">Component<\/th>\n<th align=\"left\">Role<\/th>\n<th align=\"left\">Analogy<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td align=\"left\"><strong>Self-Attention<\/strong><\/td>\n<td align=\"left\">Captures relationships between words in a sentence<\/td>\n<td align=\"left\">Turning your ear toward a conversation at a party the moment you hear your name<\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Multi-Head Attention<\/strong><\/td>\n<td align=\"left\">Understands context from multiple perspectives<\/td>\n<td align=\"left\">Several experts analyzing the same sentence from different angles: grammar, meaning, speaker, and so on<\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Positional Encoding<\/strong><\/td>\n<td align=\"left\">Injects word-order information<\/td>\n<td align=\"left\">Numbers written on the backs of scattered puzzle pieces<\/td>\n<\/tr>\n<tr>\n<td align=\"left\"><strong>Residual Connection<\/strong><\/td>\n<td align=\"left\">Prevents information loss during training<\/td>\n<td align=\"left\">A shortcut that passes the original data straight to the next stage so nothing is forgotten<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>5. Pro-Tips<\/h2>\n<ul>\n<li><strong>Batch size matters:<\/strong> Transformers are optimized for parallel processing. During training, increasing the batch size as far as GPU memory allows speeds up training dramatically.<\/li>\n<li><strong>Leverage pre-training:<\/strong> Training a model from scratch is inefficient. Taking a pre-trained BERT- or GPT-family model from Hugging Face and fine-tuning it is the shortcut to strong performance.<\/li>\n<li><strong>Warm-up steps:<\/strong> Early in training, start with a very low learning rate and raise it gradually (the &#8216;warm-up&#8217; strategy) to keep the model from diverging unstably at the start.<\/li>\n<\/ul>\n<h2>6. Conclusion and Summary<\/h2>\n<p>&#8220;Attention Is All You Need&#8221; has become more than a paper title; it is the philosophy of modern AI. By abandoning sequential processing and focusing on <strong>relationships (attention)<\/strong>, machines finally learned to understand context in depth.<\/p>\n<p>If you remember just the <strong>Query, Key, Value<\/strong> concepts and the <strong>multi-head structure<\/strong> introduced here, upcoming LLM papers will be far easier to follow. We recommend coding a simple Transformer model yourself right now with the official PyTorch or TensorFlow tutorials.<\/p>\n<h2>7. References<\/h2>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1706.03762\">Attention Is All You Need (Original Paper)<\/a><\/li>\n<li><a href=\"http:\/\/jalammar.github.io\/illustrated-transformer\/\">The Illustrated Transformer by Jay Alammar<\/a><\/li>\n<li><a href=\"https:\/\/pytorch.org\/tutorials\/beginner\/transformer_tutorial.html\">PyTorch Transformer Tutorial<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>At the heart of every modern AI model we use, including ChatGPT, Claude, and Gemini, lies an architecture called the &#8216;Transformer&#8217;. The 2017 Google paper &#8220;Attention Is All You Need&#8221; rewrote the history of artificial intelligence, but the paper itself, dense with equations, is not easy to understand. Instead of complex mathematics, this article uses intuitive analogies and visual structure to lay out clearly how the Transformer understands context and generates language. 1. Core Concept: [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3017,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[57],"tags":[],"class_list":["post-3019","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tips-tutorials"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/posts\/3019","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/comments?post=3019"}],"version-history":[{"count":0,"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/posts\/3019\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/media\/3017"}],"wp:attachment":[{"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/media?parent=3019"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https
:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/categories?post=3019"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tiptinker.com\/ko\/wp-json\/wp\/v2\/tags?post=3019"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}