Quick Tutorial¶
This tutorial walks you through a complete RAG pipeline from scratch using RAGWire and Ollama (fully local, no API key required). By the end you will have ingested a PDF and retrieved relevant chunks for any query.
Prerequisites¶
Before starting, make sure you have:
- A recent Python installation with pip
- Docker (to run Qdrant)
- Ollama installed and running locally
Step 1 — Install RAGWire¶
The `fastembed` package is required for hybrid search (dense + sparse vectors).
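A minimal install might look like the following. The PyPI package name `ragwire` is an assumption; the `fastembed` and `markitdown[pdf]` dependencies match the troubleshooting table at the end of this tutorial:

```shell
pip install ragwire fastembed "markitdown[pdf]"
```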
Step 2 — Start Qdrant¶
Qdrant is the vector database where your document chunks are stored.
Verify it is running: a healthy instance answers HTTP requests on port 6333.
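The commands for this step can be sketched as follows. The `docker run` invocation matches the troubleshooting table below; the health-check path is an assumption and may differ between Qdrant versions:

```shell
# Start Qdrant, exposing its HTTP API on port 6333
docker run -p 6333:6333 qdrant/qdrant

# In another terminal, check that it responds
curl http://localhost:6333/healthz
```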
Step 3 — Pull Ollama Models¶
You need two models — one for embeddings, one for metadata extraction.
```shell
ollama pull nomic-embed-text   # Embedding model (274MB)
ollama pull qwen3.5:9b         # Chat model for metadata extraction
```
Verify both are available:
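`ollama list` prints the models available locally; both names above should appear in its output:

```shell
ollama list
```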
Step 4 — Create Configuration¶
Create a config.yaml file in your project directory:
```yaml
loader:
  extensions: [".pdf", ".docx", ".xlsx", ".pptx", ".txt", ".md"]

splitter:
  chunk_size: 10000
  chunk_overlap: 2000
  strategy: "markdown"

embeddings:
  provider: "ollama"
  model: "nomic-embed-text"
  base_url: "http://localhost:11434"

llm:
  provider: "ollama"
  model: "qwen3.5:9b"
  base_url: "http://localhost:11434"
  num_ctx: 16384

vectorstore:
  url: "http://localhost:6333"
  collection_name: "my_docs"
  use_sparse: true
  force_recreate: false

retriever:
  search_type: "hybrid"
  top_k: 5
  auto_filter: false  # set true to enable LLM-based filter extraction from every query

logging:
  level: "INFO"
  console_output: true
  colored: true
```
Step 5 — Add Documents¶
Create a data/ folder and place your PDF (or DOCX, XLSX, etc.) inside:
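In the shell, this step might look like the following (the source location of your PDF is up to you; the `cp` path is illustrative):

```shell
mkdir -p data
# then copy your document in, e.g.:
# cp ~/Downloads/Apple_10k_2025.pdf data/
```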
Step 6 — Run the Pipeline¶
Create a Python script run.py:
```python
from ragwire import RAGWire

# Initialize — loads config, connects to Qdrant, initializes models
rag = RAGWire("config.yaml")

# Ingest documents
stats = rag.ingest_documents(["data/Apple_10k_2025.pdf"])

print(f"Processed : {stats['processed']}/{stats['total']}")
print(f"Skipped   : {stats['skipped']} (already ingested)")
print(f"Chunks    : {stats['chunks_created']}")
```
Run it with `python run.py`. Expected output:
```
INFO - Created new collection: my_docs
INFO - Processed data/Apple_10k_2025.pdf: 42 chunks
Processed : 1/1
Skipped   : 0 (already ingested)
Chunks    : 42
```
Deduplication
Run the script a second time — the file will be skipped automatically, because RAGWire checks each file's SHA-256 hash before ingesting.
Step 7 — Retrieve Documents¶
Add retrieval to your script:
```python
from ragwire import RAGWire

rag = RAGWire("config.yaml")

queries = [
    "What is the total revenue?",
    "What are the main product categories?",
    "What are the key risk factors?",
]

for query in queries:
    print(f"\nQuery: {query}")
    print("-" * 50)

    results = rag.retrieve(query, top_k=3)
    for i, doc in enumerate(results, 1):
        print(f"\nResult {i}:")
        print(f"  Company  : {doc.metadata.get('company_name', 'Unknown')}")
        print(f"  Doc Type : {doc.metadata.get('doc_type', 'Unknown')}")
        print(f"  Year     : {doc.metadata.get('fiscal_year', 'Unknown')}")
        print(f"  Content  : {doc.page_content[:300]}...")
```
Step 8 — Explore Component Usage¶
You can also use individual components directly:
```python
from ragwire import (
    MarkItDownLoader,
    get_markdown_splitter,
    get_splitter,
    get_embedding,
    QdrantStore,
    MetadataExtractor,
)

# Load a document
loader = MarkItDownLoader()
result = loader.load("data/Apple_10k_2025.pdf")
print(f"Loaded: {result['file_name']}, chars: {len(result['text_content'])}")

# Split into chunks
splitter = get_markdown_splitter(chunk_size=10000, chunk_overlap=2000)
chunks = splitter.split_text(result["text_content"])
print(f"Chunks: {len(chunks)}")

# Create embeddings
embedding = get_embedding({
    "provider": "ollama",
    "model": "nomic-embed-text",
    "base_url": "http://localhost:11434",
})
vector = embedding.embed_query("test query")
print(f"Embedding dimension: {len(vector)}")

# Connect to vector store
store = QdrantStore(
    config={"url": "http://localhost:6333"},
    embedding=embedding,
    collection_name="my_docs",
)
vectorstore = store.get_store(use_sparse=True)
results = vectorstore.similarity_search("total revenue", k=3)
print(f"Retrieved: {len(results)} chunks")
```
Step 9 — Hybrid Search¶
Hybrid search combines dense (semantic) and sparse (keyword) vectors for better recall. It is enabled by default when `use_sparse: true` and `search_type: "hybrid"` are set in `config.yaml`.
You can also call it directly:
```python
from ragwire import RAGWire

rag = RAGWire("config.yaml")

# Hybrid search (dense + sparse)
results = rag.hybrid_search("Apple total revenue fiscal 2025", k=5)
for doc in results:
    print(doc.page_content[:200])
```
Step 10 — Switch Providers¶
Switching providers only requires changing `config.yaml` — no code changes are needed.
Switching embedding models
If you change the embedding model, you must set `force_recreate: true` under `vectorstore` once to rebuild the collection (the new model's vector dimension will likely differ) — then set it back to `false`.
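For example, pointing embeddings at a hosted provider might look like the fragment below. The provider name and field keys are illustrative assumptions — check RAGWire's provider documentation for the exact values:

```yaml
embeddings:
  provider: "openai"            # assumption: provider key name
  model: "text-embedding-3-small"
  api_key: "${OPENAI_API_KEY}"  # assumption: env-var interpolation
```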
Troubleshooting¶
| Error | Fix |
|---|---|
| Qdrant connection refused | Run `docker run -p 6333:6333 qdrant/qdrant` |
| Ollama model not found | Run `ollama pull <model-name>` |
| `fastembed` missing | `pip install fastembed` |
| `markitdown[pdf]` missing | `pip install "markitdown[pdf]"` |
| Embedding dimension mismatch | Set `force_recreate: true` once, then back to `false` |
| Collection has no sparse vectors | Set `force_recreate: true` once, then back to `false` |