How LLMs Work (Tokens, Transformers, Embeddings)

🤖 GenAI from Scratch - Post 2 of 24 | Series: GenAI Foundations

← Post 1: What is GenAI? + UV Setup Post 3: Tokens, Embeddings & Transformers β†’

📋 Table of Contents

  1. Today's Blog Agenda
  2. The Full AI Tree - From ML to GenAI (Complete Map)
  3. Deep Learning Branches: ANN, CNN, RNN, GAN, Transformer
  4. NLP Evolution - Why RNNs Failed and Transformers Won
  5. The Modern Route to GenAI Engineer
  6. Career Path: DA → DE → DS → MLE → GenAI Eng
  7. The Complete GenAI Tools Ecosystem (30+ Tools)
  8. AI Coding Assistants: GitHub Copilot, Claude Code, OpenAI Codex
  9. Step-by-Step: Set Up GitHub Copilot in VS Code (Free)
  10. Complete UV Command Reference
  11. Common Errors and Fixes
  12. Key Takeaways
  13. What's Next

In the previous post, we set up our environment. Now it's time to answer the question every DBA asks when starting this journey: "Where exactly does Generative AI fit in the AI universe? And which tools do I actually need to learn?"

In this post, we'll map the complete AI landscape from ML all the way to Agentic AI, introduce the 30+ tools you'll use throughout this series, and set up GitHub Copilot in VS Code so AI starts helping you write code from today.

One of our teammates, a DB developer with 9 years of experience, asked: "I come from a database background, basic Python, no ML. How do I navigate all this?" The answer to that question shapes this entire post.

What you'll learn:

  • The complete AI tree: ML → DL → NLP → GenAI → Agentic AI, with all branches
  • Why RNNs and LSTMs are now largely obsolete and Transformers dominate
  • The "Modern Route": the shortcut path for DBAs jumping straight to GenAI
  • 30+ tools categorized by role (foundations, RAG, agents, cloud, evaluation)
  • How to set up GitHub Copilot free in VS Code, step by step with screenshots
  • The full UV command reference you'll use in every class

🔬 Lab Validated: GitHub Copilot setup and all UV commands tested in VS Code 1.88+ on Windows 11 and Ubuntu 22.04.

Prerequisites

  • ☑ Completed Post 1: UV installed, VS Code set up, project folder created
  • ☑ A free GitHub account: github.com
  • ☑ VS Code open with your genai-bootcamp project from Post 1

1. Today's Blog Agenda

Here's exactly what this post covers:

📋 Agenda - 28 March 2026

  1. Dashboard + Resources Walkthrough: GitHub repo, Notion notes, G-sheet calendar
  2. Pre-requisites: Python, Git, VS Code; ML/DL/NLP/CV (optional but helpful)
  3. VS Code + GitHub Copilot: set up your AI coding assistant
  4. GenAI Overview: a deep dive into where GenAI lives in the AI tree
  5. Module 1 Preview: Encoding, Embeddings, Tokenisation (next post)

This is the approach we'll follow throughout this series: understand the concept deeply, then use AI coding assistants to accelerate the implementation.

2. The Full AI Tree - From ML to GenAI (Complete Map)


                        AI (Artificial Intelligence)
                       /           |              \
                      /            |               \
              Machine           Deep              Reinforcement
              Learning         Learning             Learning
           (Statistics)    (Neural Networks)    (Q-Learning, PPO)
               |                  |
        pandas / numpy /     PyTorch / TF
        sklearn               (tools)
               |                  |
           Classification    ┌────┴─────────────────────┐
           Prediction        │                          │
           Regression      ANN/MLP    CNN    RNN    GAN    DRL
                                      │      │
                                   Vision  NLP
                                  (CV)   (Natural Lang)
                                            │
                                      LSTM / GRU
                                     (Advanced NLP)
                                            │
                                     Encoder-Decoder
                                            │
                                   ┌────────────────┐
                                   │  TRANSFORMER   │  ← 2017: Attention Is All You Need
                                   └────────────────┘
                                            │
                               LLMs / SLMs / MultiModal LLMs
                                            │
                                       Fine-Tuning
                                            │
                                       RAG / mmRAG
                                            │
                                      Agentic AI
                                            │
                                         LLMOps

🗄️ DBA Analogy - The AI Tree as a Database Architecture

Think of this tree like a database technology stack. AI is the enterprise platform (like Oracle). ML is the query engine (statistics-based). Deep Learning is the advanced optimizer (neural network-based). GenAI is the intelligent automation layer on top. And Agentic AI is the autonomous DBA: the system that reads your alerts, diagnoses problems, and executes fixes by itself.

3. Deep Learning Branches: ANN, CNN, RNN, GAN, Transformer

Deep Learning is the most important sub-domain to understand because every GenAI model is built on top of Deep Learning.

| Network Type | Full Name | What it does | Used for | Status in 2026 |
|---|---|---|---|---|
| ANN / MLP | Artificial Neural Network / Multi-Layer Perceptron | The fundamental building block: layers of neurons that learn patterns | Tabular data, basic classification | ✅ Foundation, still used |
| CNN | Convolutional Neural Network | Applies filters to detect spatial features (edges, shapes, patterns) | Computer Vision: image classification, object detection | ✅ Active for images/video |
| RNN | Recurrent Neural Network | Processes sequences by passing state forward; reads text word by word | Time series, early NLP | ❌ Largely obsolete |
| LSTM / GRU | Long Short-Term Memory / Gated Recurrent Unit | Advanced RNNs that remember longer context; they mitigated the "forgetting" problem of plain RNNs | Speech, translation, advanced NLP (dominant until around 2018) | ❌ Replaced by Transformers |
| GAN | Generative Adversarial Network | Two networks compete: the Generator creates fake data, the Discriminator detects fakes. Both improve. | Image generation, deepfakes, synthetic data | ⚠️ Niche, mostly replaced (by diffusion models for image generation) |
| Transformer | Transformer ("Attention Is All You Need") | Processes entire sequences at once using "attention"; captures context across long distances | LLMs, GenAI, BERT, GPT, virtually every modern language model | ✅ Dominant, the foundation of modern GenAI |

💡 Key Takeaway: For the Modern Route to GenAI, you only need a theoretical understanding of ANN, CNN, and RNN, not the code. Your practical focus starts from the Transformer onwards.

4. NLP Evolution - Why RNNs Failed and Transformers Won

The class notes show a clear evolution path with RNNs and LSTMs crossed out, meaning they are now largely obsolete for new NLP projects. Here's why that happened and why it matters for DBAs:

The NLP Timeline


TEXT  ──→  must be converted to NUMBERS for ML/AI
              │
              ▼
    ENCODING / EMBEDDING  ←──────────────────── LLMs use this TODAY
    (Vector + Tokenisation)
              │
      ┌───────┴───────────┐
      │                   │
   Classical           Modern
   Approach            Approach
      │                   │
  One-hot /           Contextual
  TF-IDF             Embeddings
  (numbers only,      (meaning +
  no meaning)         context)
              │
              ▼
   Old approach tried: Encoder-Decoder Architecture
   │
   └── RNN  ──→  LSTM/GRU  (LSTM: 1997, GRU: 2014; dominant until around 2018)
       ❌ Problem: couldn't handle very long sequences
       ❌ Slow: had to process tokens one at a time
       ❌ Forgot: context was lost over long text
              │
              ▼
   2017: TRANSFORMER ──→ "Attention Is All You Need"  ✅
       ✅ Processes ALL tokens simultaneously (parallel)
       ✅ Attention mechanism: focuses on relevant parts
       ✅ Scales with more data + compute
       ✅ Led to: BERT → GPT → ChatGPT → Claude → Gemini
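To make the "numbers only, no meaning" branch of the diagram concrete, here is a minimal sketch in plain Python. The three-word vocabulary and the hand-written 3-dimensional vectors are invented for illustration; real embeddings are learned by a model and have hundreds of dimensions. The sketch only shows the "meaning" half of the modern approach: contextual embeddings additionally change a word's vector depending on the sentence around it.

```python
import math

# Hypothetical toy vocabulary, for illustration only.
VOCAB = ["database", "table", "banana"]


def one_hot(word: str) -> list[float]:
    """Classical encoding: a 1 in the word's slot, 0 everywhere else."""
    return [1.0 if w == word else 0.0 for w in VOCAB]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: close to 1.0 = similar direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hand-written dense vectors standing in for learned embeddings (assumed values).
TOY_EMBEDDINGS = {
    "database": [0.9, 0.8, 0.1],
    "table": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.0, 0.9],
}

if __name__ == "__main__":
    # One-hot: every pair of different words looks equally unrelated.
    print(cosine(one_hot("database"), one_hot("table")))   # 0.0
    print(cosine(one_hot("database"), one_hot("banana")))  # 0.0

    # Dense vectors: related words score higher than unrelated ones.
    print(round(cosine(TOY_EMBEDDINGS["database"], TOY_EMBEDDINGS["table"]), 3))
    print(round(cosine(TOY_EMBEDDINGS["database"], TOY_EMBEDDINGS["banana"]), 3))
```

With one-hot vectors, "database" is exactly as far from "table" as it is from "banana". With dense vectors, the distance itself carries information, which is what vector search later relies on.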

🗄️ DBA Analogy - RNN vs Transformer = Full Table Scan vs Index

An RNN reads text like a full table scan: it processes records one by one, left to right, and by the time it reaches the end of a long table, it has forgotten what was at the beginning. A Transformer is like a bitmap index on every column simultaneously: it can see all relationships across the entire dataset at once. That's why Transformers scale and RNNs don't.

The practical impact: this is why ChatGPT can hold a long conversation and still remember what you said in the first message. An LSTM-based chatbot from 2019 couldn't do that reliably.
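The contrast can be sketched in a few lines of plain Python. This is a toy illustration, not a real model: there are no learned weights, the 2-dimensional token vectors are invented, and the "RNN" is reduced to a running blend of each new token into a single state vector. The attention part computes scaled dot-product attention weights for one query only; a real Transformer does this for every position at once using matrix multiplications.

```python
import math


def rnn_style_summary(tokens: list[list[float]], keep: float = 0.5) -> list[float]:
    """Sequential processing: one state vector, updated token by token.

    Each step overwrites part of the state, so early tokens fade.
    """
    state = [0.0] * len(tokens[0])
    for tok in tokens:  # must run in order: step t needs the state from step t-1
        state = [keep * s + (1 - keep) * x for s, x in zip(state, tok)]
    return state


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query: list[float], keys: list[list[float]]) -> list[float]:
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


if __name__ == "__main__":
    # Invented vectors: one "important" first token followed by 9 filler tokens.
    tokens = [[4.0, 0.0]] + [[0.0, 1.0]] * 9

    # RNN-style: the first token's contribution is halved at every later step.
    print(rnn_style_summary(tokens))

    # Attention: the query scores every token directly, however far back it is.
    query = [4.0, 0.0]  # assumed query that "looks for" the first token
    print([round(w, 3) for w in attention_weights(query, tokens)])
```

In the sequential version the first token has almost vanished from the state after ten steps, while the attention weights still put nearly all their mass on it. That direct access to every position is the "index" in the analogy above.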

5. The Modern Route to GenAI Engineer

This is the most valuable part of this blog for experienced engineers coming from a database or infrastructure background.

🚀 The Modern Route (for DBAs and Infra Engineers)

Skip: sklearn, Keras, NumPy in-depth, TensorFlow, PyTorch in-depth, classical ML from scratch, CNN coding, RNN coding.

Light theory only (no coding needed): ANN basics, basic NLP concepts, how RNNs/LSTMs work conceptually, Encoder-Decoder architecture.

Start your practical work from here:

  1. Transformer: understand the architecture conceptually
  2. PyTorch: basic understanding (not from-scratch training, just enough to follow Hugging Face code)
  3. Hugging Face (HF): load and run pre-trained models
  4. Unsloth: efficient fine-tuning of LLMs
  5. LangChain: build LLM applications (chains, agents, tools)
  6. LlamaIndex: build knowledge bases and RAG pipelines
  7. LangGraph: build multi-agent systems
  8. Data Parsing tools: process documents for AI (PDFs, Word, web)
  9. Vector Databases: store and search embeddings
  10. Cloud platforms: AWS Bedrock, Azure AI, GCP Vertex AI
  11. OpenAI SDK: call LLM APIs programmatically
  12. Guardrails: safety and validation for LLM outputs
  13. MCP (Model Context Protocol): connect LLMs to external tools and data

⚠️ Common Mistake: Many DBAs starting GenAI think they need to go back and learn sklearn, pandas, and classical ML from scratch first. You don’t. “You can directly start from GenAI. Give 2 to 3 months to yourself along with this course and you will get everything.” Your database and systems knowledge is an advantage, not a gap.

6. Career Path: DA → DE → DS → MLE → GenAI Engineer

Let's trace the traditional data career progression, then see where GenAI Engineers fit and how DBAs can take a shortcut straight there.

| Role | Focus | Tools | Path for DBAs |
|---|---|---|---|
| DA / BA (Data/Business Analyst) | Reporting, dashboards, business insights | SQL, Excel, Power BI, Tableau | You likely already have this foundation |
| DE (Data Engineer) | Pipelines, ETL, data infrastructure | Spark, Airflow, Kafka, dbt | Your DB admin skills overlap heavily here |
| DS (Data Scientist) | Statistical modeling, ML experiments | Python, sklearn, pandas, Jupyter | Optional for the Modern Route |
| MLE / MLOps (ML Engineer) | Model training, deployment pipelines | PyTorch, TF, SageMaker, MLflow | Optional for the Modern Route |
| GenAI Engineer (Modern Route) | Build LLM applications, RAG pipelines, agents | LangChain, HuggingFace, OpenAI SDK, LangGraph | ✅ Jump here directly from DBA/DE |
| Agentic AI Engineer (Next Level) | Build autonomous AI systems that take actions | LangGraph, AutoGen, MCP, cloud platforms | 🎯 Our next destination |

🗄️ DBA to GenAI Engineer - What Transfers Directly

| What you know as a DBA | Closest equivalent in GenAI |
|---|---|
| SQL queries and joins | Vector similarity search queries in VectorDBs |
| Stored procedures and functions | LangChain chains and tools |
| Database schemas | Pydantic data models for LLM inputs/outputs |
| Connection pooling and APIs | OpenAI SDK client management |
| Query performance tuning | Token cost optimization and prompt tuning |
| RMAN backup automation | LLMOps pipelines and evaluation automation |
| Alert logs and diagnostics | LLM observability (LangSmith, MLflow) |

7. The Complete GenAI Tools Ecosystem (30+ Tools)

This section is a categorized list of every tool you'll encounter in this series. It is your reference map for the entire 6-month series, so bookmark it.

💡 How to use this list: You don't learn all of these at once. Each tool category maps to a specific module in the bootcamp. We've noted which posts in this series cover each category.

🧠 LLM Foundations

The core libraries for working with pre-trained models:

| Tool | What it does | Covered in |
|---|---|---|
| Hugging Face (HF) | The GitHub of AI models. Download, run, and fine-tune 400,000+ pre-trained models. Your primary source for open-source LLMs. | Post 14 (Fine-tuning) |
| PyTorch | The deep learning framework behind most modern LLMs. You'll use it through Hugging Face, not from scratch. | Post 14 |

🔧 Fine-Tuning Tools

For adapting pre-trained LLMs to specific domains (e.g., a DBA-specific SQL generator):

| Tool | What it does | Covered in |
|---|---|---|
| Unsloth | Advertises roughly 2x faster fine-tuning with about 60% less memory. A go-to tool for fine-tuning on free Google Colab GPUs. | Post 14 |
| LLaMA Factory | Web UI for fine-tuning: configure and launch model training without writing code. | Post 14 |
| Hugging Face Trainer | Standard fine-tuning API from Hugging Face for full control. | Post 14 |

🗄️ Vector Databases (RAG Backbone)

Vector databases store and search embeddings; they are the storage layer for RAG applications. As a DBA, this is your territory:

| Tool | What it does | Best for |
|---|---|---|
| FAISS | Facebook's (Meta's) in-memory vector search library. Fast, open-source, no server needed. | Local development, small datasets |
| ChromaDB | The easiest vector DB to set up. Great for prototyping RAG apps locally. | Development and POC projects |
| Qdrant | Production-grade vector DB with filtering support. Rust-based, very fast. | Production RAG systems |
| Pinecone | Managed cloud vector database. No infrastructure to manage. | Cloud production, no-ops teams |
| pgvector | Vector search extension for PostgreSQL. Store and query embeddings directly in your existing PostgreSQL database. | DBAs who want to keep data in Postgres |
| Supabase | Open-source Firebase alternative with built-in pgvector support. | Full-stack apps with auth + storage |
| Milvus | Cloud-native vector DB, built for billion-scale similarity search. | Enterprise, very large datasets |
| AWS OpenSearch | AWS managed vector search; integrates natively with other AWS services. | AWS-centric enterprise deployments |
| Azure AI Search | Azure managed vector + full-text hybrid search. | Azure-centric enterprise deployments |
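Whatever the product, every tool in this table answers the same kind of query: "given this vector, return the k most similar stored vectors." A brute-force sketch in plain Python, the vector equivalent of a full table scan with ORDER BY ... LIMIT k, looks like this. The document IDs and 3-dimensional vectors are invented for illustration; production systems avoid scanning every row by using approximate indexes such as HNSW or IVF.

```python
import math

# Invented rows: (doc_id, embedding). Real embeddings come from a model.
ROWS = [
    ("ora-00060-deadlock", [0.9, 0.1, 0.0]),
    ("rman-backup-guide", [0.1, 0.9, 0.1]),
    ("lock-wait-tuning", [0.8, 0.2, 0.1]),
]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 2):
    """Score every row against the query and return the k best (doc_id, score)."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, round(score, 3)) for score, doc_id in scored[:k]]


if __name__ == "__main__":
    # Assumed query vector, imagined as the embedding of "session blocked by lock".
    for doc_id, score in top_k([1.0, 0.0, 0.0], ROWS, k=2):
        print(doc_id, score)
```

The two lock-related documents rank above the backup guide, even though no keyword matching took place. Swapping this loop for FAISS, pgvector, or Qdrant changes the speed and scale, not the idea.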

📦 Databases and Storage

Traditional databases used in GenAI applications for structured data storage alongside vector DBs:

| Tool | Role in GenAI |
|---|---|
| PostgreSQL | Relational storage for user data, conversation history, metadata + pgvector for embeddings |
| MongoDB | Document storage for unstructured data and conversation logs |
| Redis | Cache layer for LLM responses, session state, and rate limiting |
| Neo4j | Graph database for knowledge graphs in advanced RAG systems |
| SQLAlchemy | Python ORM for connecting GenAI apps to relational databases |
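The Redis row describes a cache-aside pattern that will feel familiar from database work. Here is a minimal sketch that uses an in-process dict in place of Redis and a stub in place of a real LLM call, so it runs without any services. The names `fake_llm` and `cached_completion` are hypothetical, chosen for this example.

```python
import hashlib
import time

# key -> (expiry timestamp, response). A stand-in for a Redis instance.
_cache: dict[str, tuple[float, str]] = {}
calls = {"llm": 0}


def fake_llm(prompt: str) -> str:
    """Stand-in for a slow, paid LLM API call."""
    calls["llm"] += 1
    return f"answer to: {prompt}"


def cache_key(model: str, prompt: str) -> str:
    """Hash model + prompt so keys have a fixed length."""
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()


def cached_completion(model: str, prompt: str, ttl_seconds: float = 300.0) -> str:
    """Return a cached response if it is still fresh, otherwise call the model."""
    key = cache_key(model, prompt)
    hit = _cache.get(key)
    if hit is not None and hit[0] > time.monotonic():
        return hit[1]  # cache hit: no LLM call, no token cost
    response = fake_llm(prompt)  # cache miss: call the model
    _cache[key] = (time.monotonic() + ttl_seconds, response)
    return response


if __name__ == "__main__":
    cached_completion("demo-model", "What is a deadlock?")
    cached_completion("demo-model", "What is a deadlock?")
    print("LLM calls made:", calls["llm"])
```

Exact-match caching like this only helps when the same prompt repeats. It is still a cheap first step before looking at semantic caching.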

⚙️ LLM Application Frameworks

The frameworks you’ll use daily to build GenAI applications:

| Tool | What it does | Covered in |
|---|---|---|
| LangChain | The most popular framework for building LLM applications. Chains, agents, tools, memory, all in one library. | Post 13 |
| LlamaIndex | Specialised in building knowledge bases and RAG pipelines. Often a better fit than LangChain for document-heavy applications. | Post 14 |
| LangGraph | Build multi-agent systems where multiple AI agents collaborate. Built on top of LangChain. | Post 16 |
| LangSmith | Observability and debugging for LangChain applications. Like a query execution plan viewer, but for LLM chains. | Post 13 |
| Haystack | Alternative to LangChain, focused on NLP pipelines and search. | Reference |

🛡️ Guardrails, Validation and Safety

| Tool | What it does |
|---|---|
| Guardrails.ai | Validate and fix LLM outputs: ensure responses match expected format, tone, and content rules |
| Nvidia NeMo Guardrails | Enterprise-grade conversational guardrails: block off-topic, unsafe, or hallucinated responses |
| OpenAI Moderation API | Built-in content safety filter for OpenAI-powered applications |
| Pydantic | Python data validation: enforce structured outputs from LLMs (e.g., ensure the LLM returns valid JSON) |
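To see what "enforce structured outputs" means in practice, here is a sketch that uses only the standard library (`json` and `dataclasses`) rather than Pydantic itself, so it runs without extra installs. Pydantic does the same job declaratively and with richer error reporting. The `SqlSuggestion` schema and the sample LLM replies are invented for illustration.

```python
import json
from dataclasses import dataclass

ALLOWED_RISK = {"low", "medium", "high"}


@dataclass
class SqlSuggestion:
    """The shape we expect the LLM to return (hypothetical schema)."""
    query: str
    risk: str


def parse_llm_reply(raw: str) -> SqlSuggestion:
    """Parse and validate a JSON reply; raise ValueError if it breaks the contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    query, risk = data.get("query"), data.get("risk")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("'query' must be a non-empty string")
    if not isinstance(risk, str) or risk not in ALLOWED_RISK:
        raise ValueError(f"'risk' must be one of {sorted(ALLOWED_RISK)}")
    return SqlSuggestion(query=query, risk=risk)


if __name__ == "__main__":
    good = '{"query": "SELECT 1", "risk": "low"}'
    bad = "Sure! Here is your query: SELECT 1"
    print(parse_llm_reply(good))
    try:
        parse_llm_reply(bad)
    except ValueError as err:
        print("rejected:", err)
```

Think of it as a CHECK constraint on model output: anything that does not match the schema is rejected before it reaches the rest of your application.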

🖥️ Local LLMs and Runtime

| Tool | What it does |
|---|---|
| Ollama | Run LLMs locally on your laptop: Llama 3, Mistral, Gemma, and more. No API key, no cost, no data leaves your machine. Essential for enterprise environments with data residency requirements. |
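Because Ollama exposes a local HTTP API, you can call a model with nothing but the standard library. The sketch below assumes Ollama is running on its default port (11434) and that a model named `llama3` has already been pulled; both are assumptions about your machine, not something this post sets up.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the POST request; stream=False asks for one complete JSON reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_llm(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]


if __name__ == "__main__":
    # Requires a running Ollama server with the (assumed) model available.
    print(ask_local_llm("llama3", "Explain a database index in one sentence."))
```

Nothing in this request leaves localhost, which is the data residency point made in the table.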

🔗 MCP (Model Context Protocol)

| Tool | What it does |
|---|---|
| FastMCP | Python framework for building MCP servers that connect LLMs to external tools and databases. MCP itself is the protocol that is becoming the standard for Agentic AI integrations. |

📄 Document AI and Parsing

For processing documents before feeding them to LLMs (PDFs, Word files, web pages):

| Tool | Best for |
|---|---|
| LlamaParse (listed in class as "Llama Parser") | Parse complex PDFs with tables, charts, and layouts accurately |
| Docling (IBM) | Convert any document format to clean markdown for LLM ingestion |
| PyMuPDF | Fast, reliable PDF text extraction in Python |
| pdfplumber | Extract tables from PDFs; great for financial and DBA reports |
| python-docx | Read and write Word documents programmatically |
| Amazon Textract | AWS OCR service: extract text from scanned documents and images |
| Azure Document Intelligence | Azure OCR + document understanding with pre-built models |
| Unstructured.io | Universal document parser: handles 20+ file types with one library |

🤖 Agentic AI Frameworks

| Tool | What it does | Covered in |
|---|---|---|
| LangGraph | Graph-based multi-agent orchestration: the leading framework for complex agent workflows | Post 16 |
| AutoGen (Microsoft) | Multi-agent conversation framework: agents that talk to each other to solve problems | Post 16 |
| DeepAgent | Deep research agent: autonomously browses, reads, and synthesises information | Post 16 |
| n8n | Visual workflow automation: connect AI agents to external services without code | Post 16 |

☁️ Cloud GenAI Platforms

| Platform | What it provides | Covered in |
|---|---|---|
| AWS Bedrock | Managed foundation models (Claude, Llama, Titan): no infrastructure, IAM-controlled, data stays in AWS | Post 19 |
| Azure AI Foundry | Microsoft's unified GenAI platform: models, fine-tuning, evaluation, deployment | Post 20 |
| GCP Vertex AI | Google Cloud's AI platform: Gemini models, AutoML, deployment pipelines | Post 20 |
| Google ADK | Agent Development Kit: build and deploy Google AI agents | Post 16 |
| Agent Core (AWS) | AWS fully managed agent runtime for enterprise Agentic AI | Post 19 |

📊 Evaluation and Observability

| Tool | What it does |
|---|---|
| RAGAS | Evaluate RAG pipeline quality: faithfulness, relevance, context precision scores |
| TruLens | Evaluate and track LLM app performance over time |
| DeepEval | Unit testing framework for LLM outputs: write pytest-style tests for AI quality |
| MLflow | Experiment tracking: log model versions, parameters, metrics across fine-tuning runs |
| LangSmith | LangChain's native observability: trace every step of your chain execution |
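The idea of "pytest-style tests for AI quality" can be sketched without any evaluation library. The example below is not DeepEval's or RAGAS's API. It is a plain-Python illustration with a stubbed pipeline (`answer_question`, hypothetical) and a crude keyword-grounding score written for this post. Real evaluation tools typically use an LLM or an embedding model as the judge rather than string matching.

```python
def answer_question(question: str, context: str) -> str:
    """Stub standing in for a RAG pipeline; returns a canned answer."""
    return "The backup runs nightly at 02:00 using RMAN."


def keyword_grounding(answer: str, context: str) -> float:
    """Fraction of the answer's longer words that also appear in the context.

    A crude stand-in for a faithfulness metric, for illustration only.
    """
    words = [w.strip(".,").lower() for w in answer.split()]
    content = [w for w in words if len(w) > 3]
    if not content:
        return 0.0
    ctx = context.lower()
    return sum(1 for w in content if w in ctx) / len(content)


def test_answer_is_grounded_in_context():
    """pytest discovers this by name; it fails if the answer drifts from the context."""
    context = "RMAN backup is scheduled nightly at 02:00 on the standby host."
    answer = answer_question("When does the backup run?", context)
    assert keyword_grounding(answer, context) >= 0.6


if __name__ == "__main__":
    test_answer_is_grounded_in_context()
    print("grounding check passed")
```

Once quality is expressed as an assertion with a threshold, it can run in CI like any other regression test.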

🚀 Deployment, Infrastructure and CI/CD

| Tool | Role in GenAI deployment |
|---|---|
| FastAPI | Build REST APIs that wrap your LLM applications; a common standard for serving AI models |
| AWS ECS + Fargate + EC2 | Container-based deployment for GenAI apps on AWS |
| AWS SageMaker | Managed ML platform: fine-tuning, hosting, and monitoring at scale |
| Azure AKS / Container Apps | Kubernetes-based deployment for Azure-hosted GenAI services |
| GitHub Actions | CI/CD pipelines for automated testing and deployment of GenAI apps |
| Apache Airflow | Pipeline orchestration for batch GenAI workflows (data ingestion, embedding generation) |
| pytest | Unit testing: write tests for your LLM application logic |

8. AI Coding Assistants: GitHub Copilot, Claude Code, OpenAI Codex

The first practical tool is GitHub Copilot: "a game changer; even in industry, this thing is required." Here's the landscape:

| Coding Assistant | Access | Cost | Works in | Best for |
|---|---|---|---|---|
| GitHub Copilot | GitHub account | Free (basic) / $10/month (premium: GPT-4.1, 4o, GPT-5) | VS Code, JetBrains, Neovim, GitHub.com | Inline code suggestions, tab-complete, chat |
| Claude Code | Anthropic API key | Usage-based | VS Code, PyCharm (via extension), terminal | Complex refactoring, multi-file editing, agentic tasks |
| OpenAI Codex | OpenAI API key | Usage-based | VS Code, browser | Code generation from natural language, terminal agent |
| Cursor | cursor.sh | Free tier / $20/month | Standalone IDE (VS Code fork) | AI-native development, entire codebase context |
| Windsurf (Antigravity) | windsurf.ai | Free tier available | Standalone IDE | Agentic coding, multi-file edits, AI flow |

Recommended setup for this series: Start with the GitHub Copilot free tier in VS Code. It's enough for all the labs in this series, and it's free. We'll show you how to set it up right now.

9. Step-by-Step: Set Up GitHub Copilot in VS Code (Free)

Follow these exact steps:

Step 1: Sign Up / Sign In to GitHub

If you don't have a GitHub account, create one for free at github.com. The GitHub Copilot free tier requires a verified GitHub account.

Step 2: Enable GitHub Copilot Free

  1. Go to github.com/settings/copilot
  2. Click "Start using Copilot for free" (no credit card needed for the free tier)
  3. The free tier gives you 2,000 code completions/month + 50 chat requests/month, with Claude Sonnet or GPT-4o as the base model

Step 3: Install GitHub Copilot Extension in VS Code

  1. Open VS Code
  2. Press Ctrl+Shift+X (Windows/Linux) or Cmd+Shift+X (Mac) to open Extensions
  3. Search: GitHub Copilot
  4. Click Install on "GitHub Copilot" by GitHub (the first result)
  5. Also install "GitHub Copilot Chat"; this adds the chat panel

Step 4: Sign In to GitHub from VS Code

  1. After installation, a notification appears: "Sign in to use GitHub Copilot". Click it.
  2. Alternatively: Click the Accounts icon (person icon) at the bottom left of VS Code
  3. Click "Sign in with GitHub to use GitHub Copilot"
  4. Your browser opens. Sign in with your GitHub account and authorize VS Code.
  5. Return to VS Code. You're connected.

📸 Screenshot: VS Code sign-in prompt; click to authorize with your GitHub account.

Step 5: Test GitHub Copilot is Working

Open any Python file in your project (or create a new one: test_copilot.py) and start typing:

# test_copilot.py
# Type the comment below and watch Copilot suggest the code

# Function to connect to Oracle database and run a query

After typing the comment, pause for 1-2 seconds. Copilot will show a greyed-out code suggestion. Press Tab to accept it, or Esc to dismiss.

📸 Screenshot: VS Code showing Copilot's inline suggestion (greyed text); press Tab to accept.

Step 6: Use Copilot Chat

This is the most powerful feature. Open the chat panel:

  • Press Ctrl+Alt+I (Windows/Linux) or Ctrl+Cmd+I (Mac)
  • Or click the Copilot icon in the left sidebar

Try asking it something DBA-specific:

Write a Python function that connects to PostgreSQL using psycopg2,
runs a query to find the top 10 longest-running queries from pg_stat_activity,
and returns the results as a list of dictionaries.

Copilot will generate a complete function; review it before you run it against a real database. This is exactly what Sunny demonstrated in class: "your thoughts work here, the AI writes the code."

📸 Screenshot: Copilot Chat panel showing the generated PostgreSQL function.
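Copilot's output differs from run to run, so treat the following as one plausible shape of the result rather than what you will see verbatim. It is a sketch that assumes psycopg2 is installed (for example with `uv add psycopg2-binary`) and that the connecting role is allowed to read `pg_stat_activity`. The function and constant names are illustrative.

```python
from typing import Any

# "Longest-running" is interpreted here as: non-idle sessions ordered by
# how long ago their current query started.
TOP_QUERIES_SQL = """
    SELECT pid,
           usename,
           state,
           now() - query_start AS duration,
           query
    FROM pg_stat_activity
    WHERE state <> 'idle'
      AND query_start IS NOT NULL
      AND pid <> pg_backend_pid()
    ORDER BY duration DESC
    LIMIT %s
"""


def rows_to_dicts(cursor: Any) -> list[dict[str, Any]]:
    """Turn a DB-API cursor's result set into a list of dictionaries."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def get_longest_running_queries(dsn: str, limit: int = 10) -> list[dict[str, Any]]:
    """Return the longest-running active queries from pg_stat_activity."""
    import psycopg2  # imported lazily so rows_to_dicts works without it

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(TOP_QUERIES_SQL, (limit,))
            return rows_to_dicts(cur)
    finally:
        conn.close()
```

A call would look like `get_longest_running_queries("dbname=postgres user=postgres")`, where the connection string is a placeholder for your own. When reviewing generated code like this, check the same things you would in a colleague's pull request: parameterised SQL, closed connections, and whether the query really answers the question you asked.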

💡 Copilot Keyboard Shortcuts (memorize these):

Tab: Accept full suggestion
Ctrl+→: Accept next word only
Esc: Dismiss suggestion
Alt+]: See next suggestion
Alt+[: See previous suggestion
Ctrl+Alt+I: Open Copilot Chat (Ctrl+Cmd+I on Mac)

10. Complete UV Command Reference

Here is the complete UV command reference you'll use throughout this bootcamp. It extends the setup from the previous post; keep it as your quick-reference guide.

Virtual Environment Management

# Create virtual environment (generic; uses the pinned Python version)
uv venv

# Create with a specific name
uv venv myenv

# Create with a specific Python version (exact build string, as listed by `uv python list`)
uv venv env --python cpython-3.12.18-windows-x86_64-none

# Generic version syntax
uv venv myenv --python 3.12

# Sync environment: creates/updates .venv to match pyproject.toml
# Run this after every git pull or when adding new packages
uv sync

Package Management

# Add a single package
uv add openai

# Add multiple packages at once
uv add openai python-dotenv langchain

# Add from a requirements.txt file
uv add -r requirements.txt

# Install using pip syntax (also works)
uv pip install openai
uv pip install -r requirements.txt

# Remove a package
uv remove numpy

# List all installed packages
uv pip list

Running Code

# Run a Python file (uses .venv automatically; no activation needed)
uv run hello_genai.py

# Run Python interactively
uv run python

# OR: Activate manually (classic method)
# Windows:
.venv\Scripts\activate
# macOS / Linux:
source .venv/bin/activate

# After activation, run normally
python hello_genai.py

# Deactivate
deactivate

Maintenance

# Clear UV cache (use when things get weird, e.g. package corruption)
uv cache clean

# Update UV itself
uv self update

What to Commit to Git

# ✅ COMMIT these files:
pyproject.toml      # your dependency manifest
uv.lock             # exact pinned versions
.python-version     # Python version pin

# ❌ DO NOT COMMIT:
.venv/              # virtual environment (too large, OS-specific)
.env                # API keys (security risk)
__pycache__/        # Python compiled bytecode
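The reason `.env` stays out of Git is that code should read secrets from the environment at run time instead of containing them. Here is a minimal standard-library sketch of that pattern. The small `.env` parser is a simplified stand-in for the python-dotenv package, and `OPENAI_API_KEY` is used only as a typical variable name.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read KEY=VALUE lines from a .env file into os.environ (simplified parser)."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)  # real environment variables win
        loaded[key] = value
    return loaded


def require_env(name: str) -> str:
    """Fetch a required setting or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or your shell")
    return value


if __name__ == "__main__":
    load_env_file()
    try:
        key = require_env("OPENAI_API_KEY")
        print("key loaded, ends with:", key[-4:])
    except RuntimeError as err:
        print(err)
```

The script prints only the last four characters of the key. Never log a full secret, even while debugging.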

11. Common Errors and Fixes

Error 1: VS Code not picking up UV environment

Symptom: Works with uv run file.py in terminal but fails when clicking the Play button in VS Code.

ModuleNotFoundError: No module named 'numpy'

Cause: VS Code's Play button runs whichever interpreter is selected in VS Code, which is often the system Python rather than your .venv. The Code Runner extension may also override the Python interpreter.

Fix:

# Step 1: Select the correct interpreter
# Press Ctrl+Shift+P → "Python: Select Interpreter"
# Choose: Python 3.12 ('.venv': venv) ./genai-bootcamp/.venv/...

# Step 2: If using Code Runner extension, configure it to use venv
# In VS Code Settings (Ctrl+,), search "code-runner.executorMap"
# Set python to: "${workspaceRoot}/.venv/Scripts/python" (Windows)
# or "${workspaceRoot}/.venv/bin/python" (Mac/Linux)

# Step 3 (simplest): Always use uv run instead of the Play button
uv run your_file.py
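If you are unsure which Python a given run is using, a small diagnostic script settles it. This sketch relies only on the standard library; the file name `which_python.py` is just a suggestion. Run it once with `uv run which_python.py` and once with the Play button, then compare the output.

```python
import sys


def in_virtualenv() -> bool:
    """True when running inside a venv (sys.prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> dict[str, str]:
    """Collect the facts that tell you which Python is running this file."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_venv": str(in_virtualenv()),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key:>10}: {value}")
```

If `executable` does not point inside your project's `.venv` folder, the ModuleNotFoundError is explained: the packages are installed, just not for the interpreter that ran the file.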

Error 2: GitHub Copilot shows "No completions available"

Cause: Not signed in, monthly free tier limit reached, or file type not supported.

Fix:

# Check sign-in: Click Accounts icon (bottom-left VS Code)
# Ensure GitHub account shown with "GitHub Copilot" listed

# Check usage: github.com/settings/copilot
# Shows remaining completions for current month

# If limit reached: upgrade to $10/month for unlimited
# OR use Claude Code / OpenAI Codex as alternative

Error 3: uv python list shows no Python versions

Symptom:

No Python versions found

Fix:

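# uv python list shows both installed and downloadable versions.
# If it is empty, a common cause on corporate networks is blocked or
# TLS-intercepted access to UV's Python registry.

# Behind a corporate proxy: tell UV to use the system certificate store.
# Keep comments on their own lines; in Windows cmd, trailing text after
# "set VAR=value" becomes part of the value.
# Windows (cmd):
set UV_NATIVE_TLS=true
# Mac/Linux:
export UV_NATIVE_TLS=true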

# Then retry
uv python list
uv python install 3.12

Error 4: uv sync fails with "No module named pip"

Fix:

# Environments created by UV do not include pip by default.
# If a tool insists on calling pip, bootstrap it inside the venv:
uv run python -m ensurepip --upgrade
# Then retry
uv sync

12. Key Takeaways

✅ What you learned in this post:

  • The full AI tree runs: ML (statistics) → DL (neural networks) → NLP → GenAI (Transformer-based) → Agentic AI. Each level builds on the one below.
  • Deep Learning has 4 main branches: ANN, CNN, RNN, GAN. For GenAI, all roads lead to Transformers; RNNs and LSTMs are now largely obsolete for new NLP work.
  • The Transformer (2017) won because it processes sequences in parallel with attention, unlike RNNs, which process token by token and forget long context.
  • DBAs and infra engineers take the Modern Route: skip classical ML/DL coding, start from Transformers, and learn the 13 key GenAI tools and frameworks directly.
  • The GenAI tools ecosystem in this post has 13 categories: vector DBs, LLM frameworks, document parsing, guardrails, cloud platforms, evaluation, and more. You don't learn them all at once; each module covers one category.
  • The GitHub Copilot free tier is sufficient for all labs in this series. Set it up now and use it for every coding session going forward.
  • Always use uv run file.py for running Python files; it automatically uses your .venv without activation.


👉 Next Post: Tokens, Embeddings and Transformers - Explained for DBAs with Code



Part of the GenAI from Scratch series for DBAs and Infrastructure Engineers. Published every Friday at gradeupnow.in/genai-blog/
