LLM

Learn what LLM (Large Language Model) means in AI and machine learning, with examples and related concepts.

Definition

LLM stands for Large Language Model — a type of AI model trained on massive amounts of text data that can understand and generate human language.

LLMs are the technology behind tools like ChatGPT, Claude, and Gemini. They work by predicting the next word (or token) in a sequence, but this simple mechanism produces remarkably sophisticated behavior: answering questions, writing code, translating languages, and reasoning through complex problems.

The “large” in LLM refers to both the size of the training data (often trillions of words from books, websites, and code) and the number of parameters in the model (ranging from a few billion to over a trillion).
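A quick back-of-the-envelope calculation makes these scales concrete. The sketch below estimates the memory needed just to store model weights; the 2-bytes-per-parameter figure assumes half precision (fp16/bf16) and ignores activations and inference overhead.

```python
# Rough memory needed to store model weights, by parameter count.
# Assumes 2 bytes per parameter (fp16/bf16); activations and
# KV-cache memory during inference would add to this.
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    return num_params * bytes_per_param / 1e9

for name, params in [("7B", 7e9), ("70B", 70e9), ("1T", 1e12)]:
    print(f"{name} parameters: ~{weight_memory_gb(params):.0f} GB in fp16")
```

This is why the largest models cannot fit on a single consumer GPU and are instead sharded across many accelerators.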

How It Works

An LLM processes text in three stages:

  1. Tokenization — Input text is broken into tokens (roughly word pieces). “I love programming” might become ["I", " love", " program", "ming"].
  2. Processing — Tokens pass through dozens of Transformer layers, where attention mechanisms determine how each token relates to every other token.
  3. Prediction — The model outputs a probability distribution over the vocabulary for the next token, then samples from it.

Input: "The capital of France is"
        ↓ Tokenize
Tokens: [The, capital, of, France, is]
        ↓ Transformer layers (96+ layers)
        ↓ Attention + Feed-forward
Output: { "Paris": 0.95, "Lyon": 0.02, "the": 0.01, ... }
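The three stages above can be sketched in miniature. This is a toy illustration, not a real model: the tokenizer splits on whitespace (real tokenizers use subword pieces), and the hard-coded logit table stands in for the Transformer layers.

```python
import math
import random

def tokenize(text: str) -> list[str]:
    # Stage 1: real tokenizers use subword pieces (e.g. BPE);
    # whitespace splitting is a simplified stand-in.
    return text.split()

def softmax(logits: dict[str, float]) -> dict[str, float]:
    # Turn raw scores into a probability distribution over the vocabulary.
    z = sum(math.exp(v) for v in logits.values())
    return {tok: math.exp(v) / z for tok, v in logits.items()}

tokens = tokenize("The capital of France is")

# Stage 2 stand-in: pretend the Transformer layers attended over the
# tokens and emitted these logits for the next position.
logits = {"Paris": 6.0, "Lyon": 2.2, "the": 1.5}
probs = softmax(logits)

# Stage 3: sample the next token from the distribution.
next_token = random.choices(list(probs), weights=probs.values(), k=1)[0]
print(tokens, probs, next_token)
```

Sampling (rather than always taking the most likely token) is what makes LLM output varied; temperature and top-p settings control how much of the distribution's tail is explored.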

The key insight is that by training on enough text, the model learns not just grammar but facts, reasoning patterns, and even coding conventions.

Why It Matters

LLMs have transformed software development and knowledge work. The practical impact: tasks that took hours (writing documentation, analyzing data, translating content) now take minutes.

Example

from anthropic import Anthropic

client = Anthropic()

# Basic LLM usage — ask a question
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Explain quantum computing in one paragraph."}
    ]
)

print(response.content[0].text)

# The same request using OpenAI's API
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the difference between RAM and ROM?"}
    ]
)

print(response.choices[0].message.content)

Key Takeaways

  1. An LLM is a neural network trained on massive amounts of text to predict the next token in a sequence; that single objective yields question answering, coding, translation, and reasoning.
  2. The “large” refers to both the training data (often trillions of words) and the parameter count (from a few billion to over a trillion).
  3. LLMs are the technology behind tools like ChatGPT, Claude, and Gemini, and are typically accessed through chat-style APIs.

Part of the DeepRaft Glossary — AI and ML terms explained for developers.