The Big Picture
Think of traditional AI/ML as systems that learn patterns to make predictions, whereas generative AI/LLMs learn representations of the world and use them to generate novel things: text, images, code, music, or even steps in reasoning.
In short:
- Traditional AI/ML → predicts.
- Generative AI/LLMs → create and comprehend.
Traditional AI / Machine Learning — The Foundation
1. Purpose
Traditional AI and ML are mainly discriminative, meaning they classify, forecast, or rank things based on existing data.
For example:
- Predict whether an email is spam or not.
- Detect a tumor in an MRI scan.
- Estimate tomorrow’s temperature.
- Recommend the product that a user is most likely to buy.
The focus is on producing structured outputs from structured or semi-structured data.
2. How It Works
Traditional ML follows a well-defined process:
- Collect and clean labeled data (inputs + correct outputs).
- Select features: the variables that truly matter.
- Train a model, such as logistic regression, random forest, SVM, or gradient boosting.
- Optimize metrics, whether accuracy, precision, recall, F1 score, RMSE, etc.
- Deploy and monitor for prediction quality.
Each model is purpose-built, meaning you train one model per task.
If you want to perform five tasks, say, detect fraud, recommend movies, predict churn, forecast demand, and classify sentiment, you build five different models.
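To make this concrete, here is a minimal sketch of that loop in Python with scikit-learn, using a bundled toy dataset in place of a real data-collection step:

```python
# A minimal sketch of the classic supervised-ML loop (scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# 1. Labeled data: inputs X plus correct outputs y.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 2-3. Train one purpose-built model for this one task.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# 4. Measure the metrics you care about.
preds = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))
print("F1 score:", f1_score(y_test, preds))
```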
3. Examples of Traditional AI
| Application | Example | Type |
| --- | --- | --- |
| Classification | Spam detection, image recognition | Supervised |
| Forecasting | Sales prediction, stock movement | Regression |
| Clustering | Market segmentation | Unsupervised |
| Recommendation | Product/content suggestions | Collaborative filtering |
| Optimization | Route planning, inventory control | Reinforcement learning (early) |
Many of them are narrow, specialized models that call for domain-specific expertise.
Generative AI and Large Language Models: The Revolution
1. Purpose
Generative AI, particularly LLMs such as GPT, Claude, Gemini, and LLaMA, shifts from analysis to creation. It creates new content with a human look and feel.
They can:
- Generate text, code, stories, summaries, answers, and explanations.
- Translate across languages and modalities, such as text → image, image → text, etc.
- Reason across diverse tasks without explicit reprogramming.
They’re multi-purpose, context-aware, and creative.
2. How It Works
LLMs are built on deep neural networks, specifically the Transformer architecture introduced by Google in 2017.
Unlike traditional ML:
- They train on massive unstructured data: books, articles, code, and websites.
- They learn the patterns of language and thought, not explicit labels.
- They predict the next token in a sequence, be it a word or a subword, and through this, they learn grammar, logic, facts, and how to reason implicitly.
These are pre-trained on enormous corpora and then fine-tuned for specific tasks like chatting, coding, summarizing, etc.
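Here is a toy illustration of next-token prediction in Python. A real LLM learns these probabilities with a Transformer trained on billions of tokens; this sketch fakes it with a simple bigram count table so the core mechanic is visible:

```python
import numpy as np

# Toy next-token predictor: a bigram count table over a tiny corpus.
# Real LLMs learn context-dependent probabilities, not raw counts.
corpus = "the cat sat on the mat the cat ran".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Count how often each token follows each other token.
counts = np.zeros((len(vocab), len(vocab)))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[idx[prev], idx[nxt]] += 1

def predict_next(word: str) -> tuple[str, float]:
    """Return the most likely next token and its probability."""
    probs = counts[idx[word]] / counts[idx[word]].sum()
    best = int(np.argmax(probs))
    return vocab[best], float(probs[best])

print(predict_next("the"))  # ('cat', 0.666...): "cat" follows "the" most often
```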
3. Example
Let’s compare directly:
| Task | Traditional ML | Generative AI / LLM |
| --- | --- | --- |
| Spam detection | Classifies a message as spam / not spam | Can write a realistic spam email, or explain why a message is spam |
| Sentiment analysis | Outputs “positive” or “negative” | Writes a movie review, adjusts its tone, or rewrites it neutrally |
| Translation | Rule-based / statistical models | Understands contextual meaning and idioms like a human |
| Chatbots | Pre-programmed, single-turn responses | Conversational, contextually aware responses |
| Data science | Predicts outcomes | Generates insights, explains data, and even writes code |
Key Differences — Side by Side
| Aspect | Traditional AI/ML | Generative AI/LLMs |
| --- | --- | --- |
| Objective | Predict or classify from data | Create something entirely new |
| Data | Structured (tables, numbers) | Unstructured (text, images, audio, code) |
| Training approach | Task-specific | General pretraining, fine-tuned later |
| Architecture | Linear models, decision trees, CNNs, RNNs | Transformers, attention mechanisms |
| Interpretability | Easier to explain | Harder to interpret (“black box”) |
| Adaptability | Must be retrained for new tasks | Adapts via few-shot prompting |
| Output type | Fixed labels or numbers | Free-form text, code, media |
| Human interaction | Input → output | Conversational, iterative, contextual |
| Compute scale | Relatively small | Extremely large (billions of parameters) |
Why Generative AI Feels “Intelligent”
Generative models learn latent representations, meaning abstract relationships between concepts, not just statistical correlations.
That’s why an LLM can:
- Write a poem in Shakespearean style.
- Debug your Python code.
- Explain a legal clause.
- Create an email based on mood and tone.
Traditional AI could never do all that in one model; it would take dozens of specialized systems.
Large language models are foundation models: enormous generalists that can be fine-tuned for many different applications.
The Trade-offs
| What Generative AI Brings | But Be Careful About |
| --- | --- |
| Creativity: produces human-like, contextual output | Can hallucinate or generate false facts |
| Efficiency: handles many tasks with one model | Extremely resource-hungry (compute, energy) |
| Accessibility: anyone can prompt it, no coding required | Hard to control or explain its inner reasoning |
| Generalization: works across domains | May reflect biases or ethical issues in the training data |
Traditional AI models are narrow but stable; LLMs are powerful but unpredictable.
A Human Analogy
Think of traditional AI as akin to a specialist, a person who can do one job extremely well if properly trained, whether that be an accountant or a radiologist.
Think of Generative AI/LLMs as a curious polymath, someone who has read everything, can discuss anything, yet often makes confident mistakes.
Both are valuable; it depends on the problem.
Real-World Impact
- Traditional AI powers what is under the hood: credit scoring, demand forecasting, route optimization, and disease detection.
- Generative AI powers human interfaces, including chatbots, writing assistants, code copilots, content creation, education tools, and creative design.
Together, they are transformational.
For example, in healthcare, traditional AI might analyze X-rays, while generative AI can explain the results to a doctor or patient in plain language.
The Future — Convergence
The future is hybrid AI:
- Employ traditional models for accurate, data-driven predictions.
- Use LLMs for reasoning, summarizing, and interacting with humans.
- Connect both with APIs, agents, and workflow automation.
This is where industries are heading: “AI systems of systems” that combine prediction and generation, analytics and conversation, data science and storytelling.
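As a sketch of what that hybrid wiring can look like in Python (`call_llm` below is a hypothetical placeholder, not a real API; swap in whichever LLM client you actually use):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy churn data: [monthly_spend, support_tickets] -> churned (1) or not (0).
X = np.array([[20, 5], [80, 0], [15, 7], [90, 1], [30, 4], [70, 0]])
y = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X, y)  # the traditional "predictor"

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM API call."""
    return f"[LLM response to: {prompt!r}]"

customer = np.array([[25, 6]])
churn_prob = model.predict_proba(customer)[0, 1]

# The LLM acts as the "communicator", turning a number into plain language.
print(call_llm(
    f"A churn model scored this customer at {churn_prob:.0%} risk. "
    "Explain the risk to an account manager in two sentences."
))
```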
In a Nutshell
| Dimension | Traditional AI / ML | Generative AI / LLMs |
| --- | --- | --- |
| Core idea | Learn patterns to predict outcomes | Learn representations to generate new content |
| Task focus | Narrow, single-purpose | Broad, multi-purpose |
| Input | Labeled, structured data | High-volume, unstructured data |
| Example | Predict loan default | Write a financial summary |
| Strengths | Accuracy, control | Creativity, adaptability |
| Limitation | Limited scope | Risk of hallucination, bias |
Human Takeaway
Traditional AI taught machines how to think statistically. Generative AI is teaching them how to communicate, create, and reason like humans. Both are part of the same evolutionary journey, from automation to augmentation, where AI doesn’t just do work but helps us imagine new possibilities.
The World of Tokens
Humans read sentences as words and meanings. An AI model first breaks a sentence down into manageable bits called tokens, which it then knows how to turn into numbers. “AI is amazing” might turn into the tokens [“AI”, “ is”, “ amazing”], or sometimes even smaller pieces: [“A”, “I”, “ is”, “ ama”, ...].
Each token gets a unique ID number, and these numbers are turned into embeddings: mathematical representations of meaning.
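To see real tokenization in action, OpenAI’s open-source tiktoken library exposes the encodings used by GPT-style models (the exact token IDs depend on the encoding you pick):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")

ids = enc.encode("AI is amazing")
print(ids)                             # a short list of integer token IDs
print([enc.decode([i]) for i in ids])  # the text piece each ID maps back to
```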
But There’s a Problem: Order Matters!
Let’s say we have two sentences:
- “The dog chased the cat.”
- “The cat chased the dog.”
They use the same words, but the order completely changes the meaning!
A regular bag of tokens doesn’t tell the AI which word came first or last.
That would be like giving somebody pieces of the puzzle and not indicating how to lay them out; they’d never see the picture.
So, how does the AI discern the word order?
An Easy Analogy: Music Notes
Imagine a song made of individual notes.
Each note, on its own, is just a sound.
Now imagine playing them out of order: the music would make no sense!
Positional encoding is like the sheet music, which tells the AI where each note (token) belongs in the rhythm of the sentence.
How the Model Uses These Positions
Once tokens are labeled with their positions, the model combines both signals:
- What each token means (its embedding).
- Where each token sits in the sequence (its positional encoding).
These two signals together permit the AI to follow word order, track which words relate to which, and interpret sentence structure correctly.
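For the curious, the original Transformer paper computed these position signals with fixed sine and cosine waves, which are simply added to the token embeddings. A short numpy sketch of that scheme:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal encoding in the style of "Attention Is All You Need" (2017).

    Each position gets a unique pattern of sine/cosine values, so the model
    can tell token 0 from token 5 even though embeddings alone carry no order.
    """
    positions = np.arange(seq_len)[:, None]           # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # (1, d_model / 2)
    angles = positions / (10000 ** (dims / d_model))  # (seq_len, d_model / 2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions get sine
    pe[:, 1::2] = np.cos(angles)  # odd dimensions get cosine
    return pe

# The model adds these to the token embeddings before the first layer:
# inputs = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
print(sinusoidal_positional_encoding(seq_len=8, d_model=16).shape)  # (8, 16)
```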
Why This Is Crucial for Understanding and Creativity
Put together, they represent the basis for how LLMs understand and generate human-like language.
In stories, order carries the plot; in code, it carries the logic.
This is why models like GPT or Gemini can write essays, summarize books, translate languages, and even generate code: they “see” text as an organized pattern of meaning and order, not just random strings of words.
How Modern LLMs Improve on This
Earlier models had fixed positional encodings, meaning they could handle only limited context (512 or 1,024 tokens, say).
But newer models (like GPT-4, Claude 3, Gemini 2.0, etc.) use rotary or relative positional embeddings, which let them process tens of thousands of tokens, entire books or multi-page documents, while still understanding how each sentence relates to the others.
That’s why you can now paste a 100-page report or a long conversation, and the model still “remembers” what came before.
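A rough numpy sketch of the rotary idea (simplified: real implementations apply the rotation inside attention, to the query and key vectors):

```python
import numpy as np

def rotary_embed(x: np.ndarray, positions: np.ndarray) -> np.ndarray:
    """Toy rotary positional embedding (RoPE).

    Instead of adding a position vector, RoPE rotates each pair of embedding
    dimensions by an angle proportional to the token's position, so relative
    distances between tokens survive in dot products.
    x: (seq_len, d_model) with even d_model.
    """
    seq_len, d_model = x.shape
    freqs = 1.0 / (10000 ** (np.arange(0, d_model, 2) / d_model))
    angles = positions[:, None] * freqs[None, :]  # (seq_len, d_model / 2)

    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * np.cos(angles) - x_odd * np.sin(angles)
    out[:, 1::2] = x_even * np.sin(angles) + x_odd * np.cos(angles)
    return out

x = np.random.randn(8, 16)                  # 8 tokens, 16 dimensions
print(rotary_embed(x, np.arange(8)).shape)  # (8, 16)
```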
Bringing It All Together
An LLM doesn’t respond coherently because it has memorized every sentence, but because it knows how meaning changes with position and context.
Final Thoughts
If you think of an LLM as a brain, then:
- Tokenization is its vocabulary: the way raw text becomes units it can process.
- Positional encoding is its sense of order: where each unit belongs in the sequence.
Together, they make language models capable of something almost magical: understanding human thought patterns through math and structure.