Planet MySQL - https://planet.mysql.com
-
Scoped Vector Search with the MyVector Plugin for MySQL — Part III
From Concepts to Production: Real-World Patterns, Query Plans, and What’s Next
In Part I, we introduced scoped vector search in MySQL using the MyVector plugin, focusing on how semantic similarity and SQL filtering work together.
In Part II, we explored schema design, embedding strategies, HNSW indexing, hybrid queries, and tuning — and closed with a promise to show real-world usage and execution behavior.
This final part completes the series.
Semantic Search with Explicit Scope
In real systems, semantic search is almost never global. Results must be filtered by tenant, user, or domain before ranking by similarity.
SELECT id, title
FROM knowledge_base
WHERE tenant_id = 42
ORDER BY myvector_distance(embedding, ?, 'COSINE')
LIMIT 10;
This follows the same pattern introduced earlier in the series:
SQL predicates define scope
Vector distance defines relevance
MySQL remains in control of execution
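To make the "scope first, then rank" pattern concrete outside the database, here is a minimal pure-Python sketch. It is not how MyVector or MySQL executes the query; the row layout, the `scoped_search` helper, and the sample data are assumptions made only for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def scoped_search(rows, tenant_id, query_vec, limit=10):
    """Hypothetical helper: filter by tenant, then rank by cosine distance."""
    # Step 1: the predicate defines scope (like WHERE tenant_id = 42).
    in_scope = [r for r in rows if r["tenant_id"] == tenant_id]
    # Step 2: vector distance defines relevance (like ORDER BY ... LIMIT).
    in_scope.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return [(r["id"], r["title"]) for r in in_scope[:limit]]


# Illustrative data only; real embeddings have hundreds of dimensions.
rows = [
    {"id": 1, "tenant_id": 42, "title": "Reset a password", "embedding": [1.0, 0.0]},
    {"id": 2, "tenant_id": 42, "title": "Rotate API keys", "embedding": [0.6, 0.8]},
    {"id": 3, "tenant_id": 7, "title": "Reset a password (other tenant)", "embedding": [1.0, 0.0]},
]
results = scoped_search(rows, tenant_id=42, query_vec=[0.9, 0.1], limit=10)
print(results)
```

Row 3 is as close to the query as row 1, yet it never appears in the results because it is out of scope before ranking starts.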
Real-Time Document Recall (Chunk-Based Retrieval)
Document-level embeddings are often too coarse. Most AI workflows retrieve chunks.
SELECT chunk_text
FROM document_chunks
WHERE document_id = ?
ORDER BY myvector_distance(chunk_embedding, ?, 'L2')
LIMIT 6;
This query pattern is commonly used for:
Knowledge-base lookups
Assistant context retrieval
Pre-RAG recall stages
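The article does not say how the rows in document_chunks are produced. One common approach, shown here purely as an assumption, is to split text into fixed-size word windows with a small overlap so that sentences cut at a boundary still appear intact in one chunk. The `chunk_text` helper and its default sizes are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Hypothetical helper: split text into overlapping word windows."""
    if chunk_size <= 0 or overlap < 0 or overlap >= chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text(sample, chunk_size=4, overlap=1)
print(chunks)
```

Each chunk would then be embedded by an external model and stored alongside its document_id, which is what the query above filters on.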
Chat Message Memory and Re-Ranking
Chronological chat history is rarely useful on its own. Semantic re-ranking allows systems to recall relevant prior messages.
SELECT message
FROM chat_history
WHERE session_id = ?
ORDER BY myvector_distance(message_embedding, ?, 'COSINE')
LIMIT 8;
The result set can be fed directly into an LLM prompt as conversational memory.
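As a rough sketch of that last step, the snippet below turns recalled messages into a prompt string. The prompt wording, the `build_prompt` helper, and the character budget are assumptions, not part of MyVector; a real application would likely budget by tokens instead.

```python
def build_prompt(recalled_messages, user_question, max_chars=2000):
    """Hypothetical helper: pack recalled messages into a prompt, most relevant first."""
    lines = []
    used = 0
    for msg in recalled_messages:
        entry = f"- {msg}"
        # Stop adding memory once the (assumed) character budget is exhausted.
        if used + len(entry) > max_chars:
            break
        lines.append(entry)
        used += len(entry)
    memory = "\n".join(lines)
    return (
        "Relevant earlier messages:\n"
        f"{memory}\n\n"
        f"User question: {user_question}"
    )


# Messages are assumed to arrive already ordered by distance from the query above.
recalled = ["My order number is 1182.", "I prefer email over phone."]
prompt = build_prompt(recalled, "Where is my order?")
print(prompt)
```

Because the query already returns rows ordered by distance, truncating from the end drops the least relevant messages first.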
Using MyVector in RAG Pipelines
MyVector integrates naturally into Retrieval-Augmented Generation workflows by acting as the retrieval layer.
SELECT id, content
FROM documents
WHERE MYVECTOR_IS_ANN('mydb.documents.embedding', 'id', ?)
LIMIT 12;
At this point:
Embeddings are generated externally
Retrieval happens inside MySQL
Generation happens downstream
No additional vector database is required.
Query Execution and Fallback Behavior
ANN Execution Path (HNSW Enabled)
Once an HNSW index is created and loaded, MySQL uses the ANN execution path provided by the plugin. Candidate IDs are retrieved first, followed by row lookups.
This behavior is visible via EXPLAIN.
Brute-Force Fallback (No HNSW Index)
When no ANN index is available, MyVector falls back to deterministic KNN evaluation.
SELECT id
FROM documents
ORDER BY myvector_distance(embedding, ?, 'L2')
LIMIT 20;
This results in a full scan and sort — slower, but correct and predictable.
Understanding this fallback is critical for production sizing and diagnostics.
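One practical use of the exact path in diagnostics is as ground truth for measuring ANN recall. The sketch below, in plain Python rather than SQL, computes an exact top-k by scanning every row and compares it with a list of ANN candidate IDs. The helpers, the sample rows, and the ANN result are all hypothetical.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(rows, query_vec, k):
    """Brute-force top-k: score every (id, embedding) row, keep the k nearest."""
    # Ties are broken by id so the result is deterministic.
    nearest = heapq.nsmallest(
        k, rows, key=lambda r: (l2_distance(r[1], query_vec), r[0])
    )
    return [row_id for row_id, _ in nearest]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


rows = [(1, [0.0, 0.0]), (2, [1.0, 0.0]), (3, [0.0, 2.0]), (4, [3.0, 3.0])]
exact = exact_knn(rows, query_vec=[0.1, 0.0], k=3)
ann_candidates = [1, 2, 4]  # hypothetical ANN output, not produced by MyVector
recall = recall_at_k(ann_candidates, exact)
print(exact, round(recall, 3))
```

The cost of the exact path grows linearly with the number of rows scanned, which is why scoping predicates matter even more when no index is loaded.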
Project Update: MyVector v1.26.1
The project continues to move quickly.
MyVector v1.26.1 is now available, introducing enhanced Docker support for:
MySQL 8.4 LTS
MySQL 9.0
This release significantly improves:
Local testing
CI pipelines
Evaluation and onboarding
Repository: https://github.com/askdba/myvector
Release v1.26.1: https://github.com/askdba/myvector/releases/tag/v1.26.1
Stop Moving Data — Start Searching It Where It Lives
Across all three parts, the conclusion is consistent:
Vector search does not require a separate database.
With MyVector, you can:
Keep data in MySQL
Apply strict SQL scoping
Use ANN when available
Fall back safely when it isn’t
All with observable execution plans and predictable behavior.
Join the Community
Development happens in the open:
GitHub: https://github.com/askdba/myvector
Releases: https://github.com/askdba/myvector/releases
Feedback and contributions are welcome.
Next Up: Powering AI-Ready MySQL — When MyVector Meets ProxySQL
The next step is production architecture.
In the next post, we’ll explore:
Integrated MCP Server
Improved Full Text Search operations
Routing vector-heavy queries with ProxySQL
Isolating ANN workloads from OLTP traffic
Designing AI-ready MySQL deployments that scale safely
MyVector brings semantic search into MySQL. ProxySQL helps it run at scale.
Stay tuned…
-
Separating FUD and Reality: Has MySQL Really Been Abandoned?
Over the past weeks, we have seen renewed discussion/concern in the MySQL community around claims that “Oracle has stopped developing MySQL” or that “MySQL is being abandoned.” These concerns were amplified by graphs showing an apparent halt in GitHub commits after October 2025, as well as by blog posts and forum discussions that interpreted these […]
-
Native Password Legacy for 9.6
In the previous article, I shared a solution for people who want to try the latest and greatest MySQL version. We just released MySQL Innovation 9.6, and for those willing to test it with their old application and require the unsafe old authentication method, here are some RPMs of the legacy authentication plugin for EL/OL […]
-
Where can you find MySQL during January to April 2026
As a follow-up to our previous blog post, we are excited to invite you to a variety of shows, meetups, and events that we will be participating in from January 2026 through April 2026. Below, you will find the specific dates and locations. We look forward to connecting with you and sharing valuable insights during […]
-
What Oracle Missed, We Fixed: More Performant Query Processing in Percona Server for MySQL, Part 2
Remember when Percona significantly improved query processing time by fixing the optimizer bug? I have described all the details in More Performant Query Processing in Percona Server for MySQL blog post. This time, we dug deeper into all the ideas from Enhanced for MySQL and based on our analysis, we proposed several new improvements. All […]