[Documents] → [Chunking] → [Embeddings] → [Qdrant]
↓
[Question] → [Query embedding] → [Similarity search] → [LLM] → [Answer]
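
To make the similarity search step in the diagram concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the metric (one of the distances Qdrant supports) and uses made-up 3-dimensional vectors in place of real embeddings; the names `cosine_similarity`, `top_k`, and `toy_index` are hypothetical and not part of the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors, which have no direction
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=3):
    """Return the k (score, text) pairs most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (hypothetical data).
toy_index = [
    ("chunk about VFIO", [0.9, 0.1, 0.0]),
    ("chunk about ZFS", [0.0, 1.0, 0.1]),
    ("chunk about Qdrant", [0.1, 0.2, 0.9]),
]
best = top_k([1.0, 0.0, 0.0], toy_index, k=2)
```

A vector database performs the same ranking, but typically with an approximate index so that it does not have to compare the query against every stored vector.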
🔧 Tech stack
| Component | Role | Version or variant |
|---|---|---|
| Qdrant | Vector database | Latest |
| Embeddings | Text-to-vector conversion | nomic-embed-text |
| LangChain | RAG pipeline orchestration | 0.3.x |
| Ollama | LLM that generates the final answer | Local |
📂 ZFS datasets
| Dataset | Contents |
|---|---|
| rag | Source documents (PDF, TXT, MD) |
| embeddings | Precomputed vectors |
| qdrant | Persisted Qdrant storage |
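
As an illustration of how the `rag` dataset could be inventoried before ingestion, the sketch below lists files with the three extensions named in the table. The function name `list_source_documents` and the constant `SOURCE_EXTENSIONS` are hypothetical, and the directory passed in is whatever path the dataset is mounted at.

```python
from pathlib import Path

# Formats named in the dataset table; extend if other types are stored.
SOURCE_EXTENSIONS = {".pdf", ".txt", ".md"}


def list_source_documents(root):
    """Return sorted paths under root whose extension is a known source format."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SOURCE_EXTENSIONS
    )
```

Matching on the lowercased suffix means files such as `REPORT.PDF` are picked up too, and `rglob` walks subdirectories so nested folders are included.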
🐍 Python pipeline (target)
```python
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Qdrant
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load the source documents
loader = DirectoryLoader("/mnt/qdrant/rag/")
docs = loader.load()

# 2. Split into overlapping chunks (sizes are in characters)
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 3. Embed the chunks and index them in Qdrant
embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Qdrant.from_documents(
    chunks,
    embeddings,
    url="http://localhost:6333",
    collection_name="portfolio",
)

# 4. Retrieve the 3 chunks most similar to the question
results = vectorstore.similarity_search("What is VFIO?", k=3)
```
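
The pipeline stops at retrieval, while the diagram continues to the LLM and the answer. One possible way to bridge that gap is to assemble the retrieved chunks into a grounded prompt. This is a minimal sketch only: `build_prompt`, its `max_chars` limit, and the prompt wording are all assumptions, not part of the project.

```python
def build_prompt(question, chunks, max_chars=2000):
    """Assemble a prompt that asks the LLM to answer from the given chunks only."""
    context_parts = []
    used = 0
    for i, text in enumerate(chunks, start=1):
        entry = f"[{i}] {text.strip()}"
        # Stop adding chunks once the (hypothetical) context budget is exceeded.
        if used + len(entry) > max_chars:
            break
        context_parts.append(entry)
        used += len(entry)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder chunk texts; in the pipeline these would come from the search results.
prompt = build_prompt(
    "What is VFIO?",
    ["placeholder text of chunk 1", "placeholder text of chunk 2"],
)
```

With the pipeline above, the chunk texts could be taken as `[doc.page_content for doc in results]`, and the resulting string passed to whichever Ollama model serves as the LLM.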