-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathllm.html
352 lines (308 loc) · 15.5 KB
/
llm.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Smile - LLM</title>
<meta name="description" content="Statistical Machine Intelligence and Learning Engine">
<!-- prettify js and CSS -->
<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js?lang=scala&lang=kotlin&lang=clj"></script>
<style>
.prettyprint ol.linenums > li { list-style-type: decimal; }
</style>
<!-- Bootstrap core CSS -->
<link href="css/cerulean.min.css" rel="stylesheet">
<link href="css/custom.css" rel="stylesheet">
<script src="https://code.jquery.com/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script>
<!-- slider -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.carousel.min.js"></script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.carousel.css" type="text/css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.transitions.css" type="text/css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.theme.min.css" type="text/css" />
<!-- table of contents auto generator -->
<script src="js/toc.js" type="text/javascript"></script>
<!-- styles for pager and table of contents -->
<link rel="stylesheet" href="css/pager.css" type="text/css" />
<link rel="stylesheet" href="css/toc.css" type="text/css" />
<!-- Vega-Lite Embed -->
<script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
<!-- Google tag (gtag.js) -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-57GD08QCML"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-57GD08QCML');
</script>
<!-- Sidebar and testimonial-slider -->
<script type="text/javascript">
$(document).ready(function(){
// scroll/follow sidebar
// #sidebar is defined in the content snippet
// This script has to be executed after the snippet loaded.
// $.getScript("js/follow-sidebar.js");
$("#testimonial-slider").owlCarousel({
items: 1,
singleItem: true,
pagination: true,
navigation: false,
loop: true,
autoPlay: 10000,
stopOnHover: true,
transitionStyle: "backSlide",
touchDrag: true
});
});
</script>
</head>
<body>
<div class="container" style="max-width: 1200px;">
<header>
<div class="masthead">
<p class="lead">
<a href="index.html">
<img src="images/smile.jpg" style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px; margin-right: 20px;">
<span class="tagline">Smile — Statistical Machine Intelligence and Learning Engine</span>
</a>
</p>
</div>
<nav class="navbar navbar-default" role="navigation">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<!-- Collect the nav links, forms, and other content for toggling -->
<div class="collapse navbar-collapse" id="navbar-collapse">
<ul class="nav navbar-nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Overview <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="quickstart.html">Quick Start</a></li>
<li><a href="overview.html">What's Machine Learning</a></li>
<li><a href="data.html">Data Processing</a></li>
<li><a href="visualization.html">Data Visualization</a></li>
<li><a href="vegalite.html">Declarative Visualization</a></li>
<li><a href="gallery.html">Gallery</a></li>
<li><a href="faq.html">FAQ</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Supervised Learning <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="classification.html">Classification</a></li>
<li><a href="regression.html">Regression</a></li>
<li><a href="deep-learning.html">Deep Learning</a></li>
<li><a href="feature.html">Feature Engineering</a></li>
<li><a href="validation.html">Model Validation</a></li>
<li><a href="missing-value-imputation.html">Missing Value Imputation</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Unsupervised Learning <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="clustering.html">Clustering</a></li>
<li><a href="vector-quantization.html">Vector Quantization</a></li>
<li><a href="association-rule.html">Association Rule Mining</a></li>
<li><a href="mds.html">Multi-Dimensional Scaling</a></li>
<li><a href="manifold.html">Manifold Learning</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">LLM & NLP <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="llm.html">Large Language Model (LLM)</a></li>
<li><a href="nlp.html">Natural Language Processing (NLP)</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Math <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="linear-algebra.html">Linear Algebra</a></li>
<li><a href="statistics.html">Statistics</a></li>
<li><a href="wavelet.html">Wavelet</a></li>
<li><a href="interpolation.html">Interpolation</a></li>
<li><a href="graph.html">Graph Data Structure</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">API <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="api/java/index.html" target="_blank">Java</a></li>
<li><a href="api/scala/index.html" target="_blank">Scala</a></li>
<li><a href="api/kotlin/index.html" target="_blank">Kotlin</a></li>
<li><a href="api/clojure/index.html" target="_blank">Clojure</a></li>
<li><a href="api/json/index.html" target="_blank">JSON</a></li>
</ul>
</li>
<li><a href="https://mybinder.org/v2/gh/haifengl/smile/notebook?urlpath=lab%2Ftree%2Fshell%2Fsrc%2Funiversal%2Fnotebooks%2Findex.ipynb" target="_blank">Try It Online</a></li>
</ul>
</div>
<!-- /.navbar-collapse -->
</nav>
</header>
<div id="content" class="row">
<div class="col-md-3 col-md-push-9 hidden-xs hidden-sm">
<div id="sidebar">
<div class="sidebar-toc" style="margin-bottom: 20px;">
<p class="toc-header">Contents</p>
<div id="toc"></div>
</div>
<div id="search">
<script>
(function() {
var cx = '010264411143030149390:ajvee_ckdzs';
var gcse = document.createElement('script');
gcse.type = 'text/javascript';
gcse.async = true;
gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
'//cse.google.com/cse.js?cx=' + cx;
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(gcse, s);
})();
</script>
<gcse:searchbox-only></gcse:searchbox-only>
</div>
</div>
</div>
<div class="col-md-9 col-md-pull-3">
<h1 id="llm-top" class="title">Large Language Model</h1>
<p>A large language model (LLM) is a computational model notable for its ability
to achieve general-purpose language generation and other NLP tasks such as
classification. Transformer, the state-of-the-art LLM architecture, is based on
the multi-head attention mechanism. Text is converted to numerical
representations called tokens, and each token is converted into a vector
via looking up from a word embedding table. At each layer, each token is
then contextualized within the scope of the context window with other
(unmasked) tokens via a parallel multi-head softmax-based attention mechanism
allowing the signal for key tokens to be amplified and less important tokens
to be diminished. GPTs (Generative pre-trained transformers) are based on
the decoder-only transformer architecture. Each generation of GPT models
is significantly more capable than the previous, due to increased model size
(number of trainable parameters) and larger training data.</p>
<h2 id="llama" class="title">Llama 3</h2>
<p>Smile provides an inference implementation of Llama 3 from Meta AI,
the latest version of Llama that is accessible to individuals, organizations,
and businesses of all sizes.</p>
<p>To build a model instance, call <code>Llama.build()</code> with the directory
path of checkpoint files, the path of tokenizer model file, maximum batch size, maximum
sequence length, and optionally CUDA device id:</p>
<ul class="nav nav-tabs">
<li class="active"><a href="#java_1" data-toggle="tab">Java</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane active" id="java_1">
<div class="code" style="text-align: left;">
<pre class="prettyprint lang-java"><code>
import smile.llm.llama.*;
var model = Llama.build("model/Llama3-8B-Instruct",
"model/Llama3-8B-Instruct/tokenizer.model",
4, // maximum batch size
8192, // maximum sequence length for input text
0 // CUDA device id
);
</code></pre>
</div>
</div>
</div>
<p>For a pretrained model, one should call <code>generate()</code> method with a batch
of prompots to generate the text. For fine-tuned chat models, we should instead
call <code>chat()</code> method to generate conversation responses.</p>
<ul class="nav nav-tabs">
<li class="active"><a href="#java_2" data-toggle="tab">Java</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane active" id="java_2">
<div class="code" style="text-align: left;">
<pre class="prettyprint lang-java"><code style="white-space: preserve nowrap;">
// List of conversational dialogs, where each dialog is a list of messages.
Message[][] dialogs = {
{
new Message(Role.user, "what is the recipe of mayonnaise?"),
},
{
new Message(Role.user, "I am going to Paris, what should I see?"),
new Message(Role.assistant, """
Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:
1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.
These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world."""),
new Message(Role.user, "What is so great about #1?"),
},
{
new Message(Role.system, "Always answer with Haiku"),
new Message(Role.user, "I am going to Paris, what should I see?"),
},
{
new Message(Role.system, "Always answer with emojis"),
new Message(Role.user, "How to go from Beijing to NY?"),
},
};
model.chat(dialogs,
2048, // maximum length of the generated text sequence
0.6, // temperature for controlling randomness in sampling
0.9, // top_p probability threshold for nucleus sampling
false, // true to compute token log probabilities
null, // the optional random number generation seed
null // streaming publisher
);
</code></pre>
</div>
</div>
</div>
<h2 id="serving" class="title">Serving</h2>
<p>Smile also provides an LLM inference server for quick start. The script <code>bin/serve.sh</code>
builds and starts the inference server. You should have Node.js and npm installed to build the
frontend.</p>
<ul class="nav nav-tabs">
<li class="active"><a href="#bash_3" data-toggle="tab">Shell</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane active" id="bash_3">
<div class="code" style="text-align: left;">
<pre class="prettyprint lang-bash"><code>
$ bin/serve.sh --help
SmileServe - Large Language Model (LLM) Inference Server
Usage: smile-serve [options]
--model <value> The model checkpoint directory path
--tokenizer <value> The tokenizer model file path
--max-seq-len <value> The maximum sequence length
--max-batch-size <value> The maximum batch size
--device <value> The CUDA device ID
--host <value> The IP address to listen on (0.0.0.0 for all available addresses)
--port <value> The port number
--help Display the usage information
</code></pre>
</div>
</div>
</div>
<p>By default, SmileServe binds at localhost:3801. If you prefer a different port and/or
want to expose the server to other hosts, you may set the binding interface and port
with <code>--host=0.0.0.0</code> and <code>--port=8000</code>, for example. The service
exposes the API <code>/v1/chat/completions</code> that is compatible with OpenAI API.</p>
<div id="btnv">
<span class="btn-arrow-left">← </span>
<a class="btn-prev-text" href="graph.html" title="Previous Section: Graph"><span>Graph</span></a>
<a class="btn-next-text" href="faq.html" title="Next Section: FAQ"><span>FAQ</span></a>
<span class="btn-arrow-right"> →</span>
</div>
</div>
<script type="text/javascript">
$('#toc').toc({exclude: 'h1, h5, h6', context: '', autoId: true, numerate: false});
</script>
</div>
</div>
<a href=https://github.com/haifengl/smile><img style="position: fixed; top: 0; right: 0; border: 0" src=/images/forkme_right_orange.png alt="Fork me on GitHub"></a>
<!-- Place this tag right after the last button or just before your close body tag. -->
<script async defer id="github-bjs" src="https://buttons.github.io/buttons.js"></script>
</body>
</html>