May 2025 · 10 min read · GenAI · Oracle 23ai · Concept
In Post 1, you learned that an LLM reads text and predicts the next word — billions of times — until it learns how language works. That was the what.
Now here’s the how: How does a computer, which only understands numbers, make sense of words like “database,” “query,” or “tablespace”?
The answer is vectors. Once you understand them, everything else in AI — embeddings, semantic search, Oracle 23ai Vector Search, RAG — will suddenly click.
What you’ll learn
- What a vector actually is (using a DBA analogy you already know)
- Why computers need to turn words into numbers
- What “embedding” means and why it’s not scary
- How Oracle 23ai uses vectors to power AI search
- Why this concept is the foundation of everything that follows
Prerequisites
- Read Post 1: What is AI, ML and LLM? — Plain English for Oracle DBAs
- Know what a database table is (basic DBA knowledge)
- No math, no code, no AI background required
✓ No lab needed. This is a concept-only post. No commands, no server, no OCI free tier required. Just 10 minutes.
Part 1 The Problem — Computers Only Understand Numbers
Here’s the fundamental problem.
You write: SELECT * FROM employees WHERE job = 'DBA';
The computer sees: a string of characters. It has no idea what a “DBA” is, what it means, or whether it’s related to “Database Administrator” or “database engineer.”
Computers are extraordinarily good at one thing: numbers. Addition, comparison, sorting, indexing — all of that happens at lightning speed. But meaning? Context? Similarity? That’s hard.
So AI researchers asked: what if we could turn every word into a number — or a list of numbers — in a way that preserves meaning?
That list of numbers is called a vector.
Part 2 What Is a Vector? — The DBA Way
Let’s forget math for a moment. Think about how you already describe things as DBAs.
Imagine you’re describing an Oracle database instance. You might record:
| Property | Value |
|---|---|
| Number of CPUs | 8 |
| RAM (GB) | 64 |
| Data size (TB) | 2.5 |
| Number of connections | 300 |
| Is it on OCI? | 1 (yes) |
That row of numbers — [8, 64, 2.5, 300, 1] — is a vector.
It’s just a list of numbers that describes something. The order matters. Each position means something specific.
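If you like to see ideas in code, here is a minimal Python sketch of the same instance. The field names are simply taken from the table above, and the variable names are illustrative, not part of any Oracle API.

```python
# Each position in the vector has a fixed meaning, like columns in a table.
FIELDS = ["cpus", "ram_gb", "data_tb", "connections", "on_oci"]

# The database instance from the table above, as a vector (a list of numbers).
instance = [8, 64, 2.5, 300, 1]

# Pair each position with its meaning to read the vector back.
described = dict(zip(FIELDS, instance))
print(described)

# Order matters: swapping two positions describes a different instance.
swapped = [64, 8, 2.5, 300, 1]  # now reads as 64 CPUs and 8 GB of RAM
print(instance == swapped)
```

The numbers alone carry no labels. The agreed order is what gives each one its meaning.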
The Key Insight
Imagine measuring a word along many different dimensions: how “technical” it is, how “positive” it sounds, how related to databases it is, and hundreds more. You end up with a long list of numbers that captures the meaning of the word in a way a computer can work with. (In a real model the dimensions are learned during training and rarely map to one nameable trait, but the intuition holds.)
That’s a word vector. That’s an embedding.
Part 3 Why Lists of Numbers Are Powerful
If two words have similar meaning, their vectors will be close together in that multi-dimensional space. If two words mean something very different, their vectors will be far apart.
Let’s make that concrete:
| Word pair | Vector relationship | Why |
|---|---|---|
| “DBA” ↔ “database administrator” | Very close | Same meaning, different words |
| “tablespace” ↔ “storage” | Fairly close | Related concepts |
| “tablespace” ↔ “pizza” | Very far apart | No meaningful relationship |
This is the magic: you can now measure similarity between words by measuring distance between numbers. And computers are incredible at measuring distance between numbers.
This is why AI can answer “What questions are similar to ‘How do I tune Oracle queries?’” by comparing vectors rather than matching exact words.
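To make "close" and "far" tangible, here is a small Python sketch that scores the word pairs from the table using cosine similarity, one common way to compare vectors. The 3-number vectors are hand-made for illustration only; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-number vectors, invented for illustration only.
# Real embeddings come from a model and have many more dimensions.
words = {
    "DBA":                    [0.90, 0.80, 0.10],
    "database administrator": [0.88, 0.82, 0.12],
    "tablespace":             [0.70, 0.30, 0.20],
    "storage":                [0.60, 0.20, 0.40],
    "pizza":                  [0.05, 0.10, 0.95],
}

pairs = [
    ("DBA", "database administrator"),
    ("tablespace", "storage"),
    ("tablespace", "pizza"),
]
for left, right in pairs:
    score = cosine_similarity(words[left], words[right])
    print(f"{left} vs {right}: {score:.3f}")
```

With these toy values the scores fall in the same order as the table: the first pair scores highest, the second a little lower, and the third far lower.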
Part 4 A Concrete Analogy — Map Coordinates
Here’s an analogy that might click immediately.
Imagine a map. Every location has two coordinates: latitude and longitude. For example:
| Location | Coordinates |
|---|---|
| Your office | [43.4516, -79.6790] |
| Nearest coffee shop | [43.4521, -79.6795] |
| Airport in another city | [43.6777, -79.6248] |
These are 2-dimensional vectors. From these numbers alone, you can instantly know: the coffee shop is close, the airport is far away.
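The same comparison can be sketched in a few lines of Python using the coordinates from the table. Treating degrees of latitude and longitude as flat grid units is a simplifying assumption; it is enough to rank near versus far, but it is not a real-world distance in kilometres.

```python
import math

# Coordinates from the table above, as 2-dimensional vectors [latitude, longitude].
office = [43.4516, -79.6790]
coffee_shop = [43.4521, -79.6795]
airport = [43.6777, -79.6248]

# math.dist returns the straight-line (Euclidean) distance between two points.
# Degrees are treated as flat grid units here, which is a simplification.
to_coffee = math.dist(office, coffee_shop)
to_airport = math.dist(office, airport)

print(f"office -> coffee shop: {to_coffee:.4f}")
print(f"office -> airport:     {to_airport:.4f}")
print("coffee shop is closer:", to_coffee < to_airport)
```

Nothing here knows what a coffee shop or an airport is. The ranking comes purely from arithmetic on the numbers.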
Scale That Up
Now imagine that instead of a 2-dimensional map, you have a 1,536-dimensional space, where each dimension captures some learned aspect of meaning. Every word gets placed in that space based on its meaning. Words that mean similar things end up in the same neighbourhood. Words that are unrelated end up in completely different parts of the space.
That’s exactly how word embeddings work. The AI model is the system that figures out where in that space each word belongs — by training on billions of documents.
Part 5 What Is an Embedding?
You’ll hear “embedding” and “vector” used almost interchangeably. Here’s the precise difference:
Vector
Any list of numbers
A list of numbers that describes something. Could describe a database instance, a point on a map, or anything else.
[8, 64, 2.5, 300, 1]
Embedding
A meaningful vector
A vector produced by an AI model specifically to capture the meaning of text. A subset of vectors.
[0.24, -0.87, 0.03, …, 1536 dims]
All embeddings are vectors. Not all vectors are embeddings.
When an embedding model converts the sentence “How do I create a tablespace in Oracle?” into a long list of numbers, that is called creating an embedding for that sentence. The key point: the embedding captures meaning, not just characters.
Part 6 Why Oracle DBAs Need to Care — Right Now
Oracle 23ai introduced a new data type: VECTOR.
You can now store vectors directly in an Oracle table — the same way you store a NUMBER or VARCHAR2. And Oracle 23ai includes built-in functions to search those vectors at scale, finding the most similar items rather than exact matches.
Here’s a real-world scenario:
Real-World Scenario — Support Ticket Search
Your company has 50,000 support tickets stored in Oracle. A new ticket arrives: “Database crashed after applying patch on primary server.”
Traditional SQL
Finds only tickets that contain the literal text “crash”. Misses anything that says the same thing differently.
WHERE description LIKE '%crash%'
Vector Search
Returns the 10 most semantically similar tickets, even if they say “ORA-04031 after patching” or “standby went down during maintenance.” Similar meaning, different words.
This is the foundational capability behind RAG (Retrieval Augmented Generation), AI chatbots over enterprise data, and Oracle 23ai AI features — all covered in later posts.
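The difference between the two approaches can be sketched in plain Python, with no database involved. Everything below is a toy: the ticket list is tiny, the first and last ticket texts are hypothetical, and the 3-number "embeddings" are hand-made stand-ins for what an embedding model would produce.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity, so smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Toy tickets with hand-made 3-number "embeddings" (illustration only).
tickets = [
    ("Database crashed during nightly backup",   [0.85, 0.20, 0.10]),
    ("ORA-04031 after patching",                 [0.80, 0.55, 0.05]),
    ("Standby went down during maintenance",     [0.75, 0.50, 0.15]),
    ("Request for a new reporting user account", [0.05, 0.10, 0.95]),
]

# Made-up vector standing in for the new ticket's embedding.
new_ticket_vec = [0.82, 0.52, 0.08]

# Keyword approach: equivalent in spirit to LIKE '%crash%'.
keyword_hits = [text for text, _ in tickets if "crash" in text.lower()]

# Vector approach: rank every ticket by distance, keep the closest k.
k = 2
ranked = sorted(tickets, key=lambda t: cosine_distance(t[1], new_ticket_vec))
vector_hits = [text for text, _ in ranked[:k]]

print("keyword:", keyword_hits)
print("vector: ", vector_hits)
```

With these toy values the keyword filter only finds the ticket containing "crash", while the vector ranking surfaces the patching and standby tickets. This sketch compares against every row; a vector index exists to avoid that full scan on large tables.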
Part 7 How Big Is a Vector?
Common sizes across models and use cases:
| Model / Use Case | Vector Dimensions |
|---|---|
| Simple word2vec model | 100–300 |
| BERT (base) | 768 |
| OpenAI text-embedding-3-small | 1,536 |
| Oracle AI Vector Search (typical) | 1,536 |
| OpenAI text-embedding-3-large | 3,072 |
For a table of 1 million support tickets, each with a 1,536-dimension vector: that’s 1 million × 1,536 numbers. Oracle 23ai handles this with specialised indexing — HNSW and IVF indexes — that make searching that space extremely fast. That’s a dedicated post in itself.
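To get a feel for the scale, here is a back-of-the-envelope calculation in Python. It assumes each dimension is stored as a 4-byte float (FLOAT32) and counts only the raw vector data, ignoring index structures and row overhead, so treat the result as a rough lower bound.

```python
# Rough storage estimate for the raw vector data only.
# Assumption: each dimension is stored as a 4-byte float (FLOAT32);
# index structures and row overhead are ignored.
rows = 1_000_000
dimensions = 1_536
bytes_per_number = 4

total_numbers = rows * dimensions
total_bytes = total_numbers * bytes_per_number

print(f"numbers stored: {total_numbers:,}")
print(f"raw size:       {total_bytes / 1024**3:.2f} GiB")
```

That works out to roughly 1.5 billion numbers and a little under 6 GiB of raw vector data, which is why index choice matters for search speed.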
Part 8 The End-to-End Flow
Let’s put it all together:
Input
Your text — a question, a support ticket, a document
“How do I create a tablespace in Oracle?”
Embedding Model
AI model converts text → list of numbers capturing its meaning
Vector
A 1,536-number fingerprint of the sentence’s meaning
[0.12, -0.45, 0.88, 0.03, …, 1536 numbers]
Oracle 23ai VECTOR Column
Stored alongside your regular data in a familiar Oracle table
Distance Function
Oracle compares vectors using cosine similarity or Euclidean distance — using SQL
SELECT * FROM tickets ORDER BY VECTOR_DISTANCE(v, :query_vec) FETCH FIRST 10 ROWS ONLY;
Results
The most semantically similar rows — even if the words are completely different
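The two distance measures named in the flow behave differently, and a short Python sketch shows how. The functions below are plain-Python illustrations of the textbook formulas, not Oracle's implementation, and the vectors are made up.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: sensitive to both direction and length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity: sensitive to direction only, not length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Made-up vectors: `longer` points the same way as `query` but is twice as long.
query = [0.12, -0.45, 0.88, 0.03]
longer = [2 * x for x in query]

print("euclidean:", round(euclidean_distance(query, longer), 4))
print("cosine:   ", round(cosine_distance(query, longer), 4))
```

Cosine distance treats the two vectors as essentially identical because they point in the same direction, while Euclidean distance sees them as clearly apart. Which one fits best generally depends on the embedding model that produced the vectors.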
Part 9 What You Don’t Need to Worry About
A lot of DBA anxiety about AI comes from seeing things like this in tutorials:
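import numpy as np
v1 = np.array([0.12, -0.45, 0.88, 0.03])
v2 = np.array([0.10, -0.40, 0.90, 0.05])  # second vector, made-up values
similarity = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))  # cosine similarity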
Good News for DBAs
You do not need to write this. You do not need to understand the linear algebra behind it. When you work with Oracle 23ai Vector Search, you’ll write SQL. Oracle handles all the vector mathematics internally. Your job is to understand what vectors represent so you can design schemas, choose the right index types, and explain the capability to business stakeholders.
The math is Oracle’s problem. The design is yours.
Key Takeaways
✓ A vector is simply a list of numbers: not a math concept to fear, but a data structure you already work with as a DBA
✓ AI models convert words, sentences, and documents into vectors so computers can measure similarity by measuring distance
✓ An embedding is a vector produced by an AI model that captures the meaning of text, not just its characters
✓ Oracle 23ai introduces the VECTOR data type, letting you store and search embeddings directly using SQL you already know
✓ Vectors are the foundation beneath LLMs, semantic search, RAG, and AI-powered applications; understanding them unlocks everything that follows
References
- Oracle 23ai AI Vector Search Documentation
- Oracle 23ai Free Edition Download
- Post 1: What is AI, ML and LLM? — Plain English for Oracle DBAs