RAGWire with Google Gemini
Use Google Gemini for both embeddings and the metadata extraction LLM.
Prerequisites
- Google AI API key (aistudio.google.com)
- RAGWire installed: pip install "ragwire[google]"
- Qdrant running: docker run -d -p 6333:6333 qdrant/qdrant
1. Install Dependencies
2. Set API Key
# Linux / macOS
export GOOGLE_API_KEY="AIza..."
# Windows (PowerShell)
$env:GOOGLE_API_KEY="AIza..."
Or add it to a .env file at the project root, as a GOOGLE_API_KEY=... entry.
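Whether RAGWire reads a .env file on its own is not stated here, so the following is only a minimal sketch of loading one by hand with the standard library before constructing the pipeline. The helper names `load_env_file` and `require_google_api_key` are hypothetical and not part of RAGWire. The sketch assumes the file holds simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def load_env_file(path=".env", environ=None):
    """Read KEY=VALUE pairs from a .env file into the environment.

    Existing variables are not overwritten, so a key exported in the
    shell takes precedence. Returns the pairs parsed from the file.
    """
    environ = os.environ if environ is None else environ
    env_path = Path(path)
    if not env_path.is_file():
        return {}

    parsed = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        parsed[key] = value
        environ.setdefault(key, value)
    return parsed


def require_google_api_key(environ=None):
    """Return the key, or fail early with a clear message."""
    environ = os.environ if environ is None else environ
    key = environ.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; export it or add it to .env")
    return key
```

Calling `load_env_file()` followed by `require_google_api_key()` at the top of a script surfaces a missing key before any request is sent. A dedicated package such as python-dotenv covers more of the .env syntax if that is needed.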
3. Configuration
embeddings:
  provider: "google"
  model: "models/gemini-embedding-001"  # Stable, recommended for production
llm:
  provider: "google"
  model: "gemini-2.5-flash"  # Best price/performance
  # model: "gemini-2.5-pro"  # Most advanced, deep reasoning
vectorstore:
  url: "http://localhost:6333"
  collection_name: "my_docs"
  use_sparse: true
  force_recreate: false
retriever:
  search_type: "hybrid"
  top_k: 5
  auto_filter: false  # set true to enable LLM-based filter extraction from every query
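A quick sanity check on the parsed configuration can catch typos before anything is ingested. The sketch below works on a plain dictionary and is not part of RAGWire; `check_gemini_config` is a hypothetical helper. The rule that hybrid search needs `use_sparse: true` is an assumption based on how the example config pairs the two settings, not a documented requirement.

```python
ACCEPTED_PROVIDERS = {"google", "gemini"}  # both spellings are accepted per the Notes


def check_gemini_config(config):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []

    for section in ("embeddings", "llm"):
        settings = config.get(section, {})
        provider = settings.get("provider")
        if provider not in ACCEPTED_PROVIDERS:
            problems.append(
                f"{section}.provider should be 'google' or 'gemini', got {provider!r}"
            )
        if not settings.get("model"):
            problems.append(f"{section}.model is missing")

    vectorstore = config.get("vectorstore", {})
    retriever = config.get("retriever", {})

    # Assumption: hybrid search relies on sparse vectors in the collection.
    if retriever.get("search_type") == "hybrid" and not vectorstore.get("use_sparse"):
        problems.append(
            "retriever.search_type is 'hybrid' but vectorstore.use_sparse is not true"
        )

    # top_k is only checked when present; a default may apply otherwise.
    top_k = retriever.get("top_k")
    if top_k is not None and (
        isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1
    ):
        problems.append(f"retriever.top_k should be a positive integer, got {top_k!r}")

    return problems
```

If PyYAML is installed, the dictionary can come from `yaml.safe_load(open("config.yaml"))`. An empty list means none of these particular checks failed.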
4. Python Usage
from ragwire import RAGWire
rag = RAGWire("config.yaml")
# Ingest
stats = rag.ingest_documents(["data/Apple_10k_2025.pdf"])
print(f"Chunks created: {stats['chunks_created']}")
# Retrieve
results = rag.retrieve("What is Apple's total revenue?", top_k=5)
for doc in results:
    print(doc.metadata.get("company_name"), doc.page_content[:200])
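Since `ingest_documents` takes a list of paths, a small helper can build that list from a folder instead of naming each file. This is a sketch using only the standard library; `collect_documents` is a hypothetical name, and the default of PDF only reflects the single file type shown in the usage above, not a statement about which formats RAGWire supports.

```python
from pathlib import Path


def collect_documents(root, suffixes=(".pdf",)):
    """Return sorted file paths under root whose extension is in suffixes.

    The extension match is case-insensitive and the search is recursive.
    """
    wanted = {suffix.lower() for suffix in suffixes}
    return sorted(
        str(path)
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )
```

With the `rag` object from the usage above, `rag.ingest_documents(collect_documents("data"))` would then ingest every PDF under data/ in one call.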
5. Run the Example
Embedding Model Comparison
| Model | Notes |
|---|---|
| models/gemini-embedding-001 | Stable, recommended for production |
| models/gemini-embedding-2-preview | Newer multimodal embedding (preview) |
Chat Model Comparison
| Model | Notes |
|---|---|
| gemini-2.5-flash | Best price/performance (recommended) |
| gemini-2.5-pro | Most advanced, deep reasoning |
| gemini-2.5-flash-lite | Fastest and most budget-friendly |
Notes
- Use provider: "google" or provider: "gemini"; both are accepted.
- The API key can also be passed directly in config as api_key: "AIza...", but environment variables are preferred.
- The free tier has rate limits. For production use, upgrade to a paid plan.
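On the free tier, one common way to cope with rate limits is to retry a failed call with exponential backoff. The wrapper below is a generic sketch, not a RAGWire feature. `call_with_backoff` is a hypothetical helper, and because the exception raised on a rate-limit error is not documented here, it retries on any `Exception` by default. Narrowing `retry_on` to the specific error type is advisable once that type is known.

```python
import random
import time


def call_with_backoff(func, *args, retries=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with exponential backoff and jitter.

    Makes at most retries + 1 attempts and re-raises the last error.
    """
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            # Upper bound doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random wait in [0, delay] spreads out retries from concurrent clients.
            sleep(random.uniform(0, delay))
```

For example, `call_with_backoff(rag.retrieve, "What is Apple's total revenue?", top_k=5)` would retry the retrieval from the usage section a few times before giving up.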