RAGWire with OpenAI
Use OpenAI for both embeddings and the metadata extraction LLM.
Prerequisites
- OpenAI API key: platform.openai.com
- RAGWire installed:
  pip install "ragwire[openai]"
- Qdrant running:
  docker run -d -p 6333:6333 qdrant/qdrant
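1. Install Dependencies
2. Set API Key
Set the key in the OPENAI_API_KEY environment variable (the name the OpenAI client libraries read by default), or add it to a .env file at the project root.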
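A startup check can catch a missing key before any ingestion work begins. The sketch below assumes the key lives in `OPENAI_API_KEY` and that a .env file uses plain `KEY=value` lines. `parse_env_line` and `require_openai_key` are hypothetical helper names for illustration, not part of RAGWire.

```python
import os


def parse_env_line(line: str):
    """Parse one KEY=value line from a .env file; return None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None
    key, _, value = stripped.partition("=")
    # Drop optional surrounding quotes around the value.
    return key.strip(), value.strip().strip('"').strip("'")


def require_openai_key(env=None) -> str:
    """Return the key from the given mapping (default: os.environ) or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it or add it to .env")
    return key
```

Calling `require_openai_key()` before constructing `RAGWire` turns a late authentication failure into an immediate, readable error.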
3. Configuration
embeddings:
  provider: "openai"
  model: "text-embedding-3-small"    # 1536-dim, best price/performance
  # model: "text-embedding-3-large"  # 3072-dim, highest quality

llm:
  provider: "openai"
  model: "gpt-5.4-nano"    # Latest: fast, affordable, good for metadata extraction
  # model: "gpt-4o-mini"   # Previous generation

vectorstore:
  url: "http://localhost:6333"
  collection_name: "my_docs"
  use_sparse: true
  force_recreate: false

retriever:
  search_type: "hybrid"
  top_k: 5
  auto_filter: false  # set true to enable LLM-based filter extraction from every query
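A quick consistency check on the parsed configuration can surface mismatches before ingestion. The sketch below works on a plain dict shaped like the YAML above. It assumes, as is typical for hybrid search on Qdrant, that `search_type: "hybrid"` needs sparse vectors (`use_sparse: true`). That coupling and the `check_config` helper are assumptions for illustration, not documented RAGWire behavior.

```python
def check_config(cfg: dict) -> list:
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []
    for section in ("embeddings", "llm", "vectorstore", "retriever"):
        if section not in cfg:
            problems.append(f"missing section: {section}")

    retriever = cfg.get("retriever", {})
    vectorstore = cfg.get("vectorstore", {})

    # Assumption: hybrid retrieval combines dense and sparse vectors,
    # so the collection must have been built with use_sparse enabled.
    if retriever.get("search_type") == "hybrid" and not vectorstore.get("use_sparse", False):
        problems.append("search_type 'hybrid' expects vectorstore.use_sparse: true")

    top_k = retriever.get("top_k")
    if top_k is not None and (not isinstance(top_k, int) or top_k < 1):
        problems.append("retriever.top_k must be a positive integer")
    return problems


config = {
    "embeddings": {"provider": "openai", "model": "text-embedding-3-small"},
    "llm": {"provider": "openai", "model": "gpt-5.4-nano"},
    "vectorstore": {"url": "http://localhost:6333", "use_sparse": True},
    "retriever": {"search_type": "hybrid", "top_k": 5},
}
print(check_config(config))  # prints []
```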
4. Python Usage
from ragwire import RAGWire

rag = RAGWire("config.yaml")

# Ingest
stats = rag.ingest_documents(["data/Apple_10k_2025.pdf"])
print(f"Chunks created: {stats['chunks_created']}")

# Retrieve
results = rag.retrieve("What is Apple's total revenue?", top_k=5)
for doc in results:
    print(doc.metadata.get("company_name"), doc.page_content[:200])
5. Run the Example
Embedding Model Comparison
| Model | Dimensions | Notes |
|---|---|---|
| text-embedding-3-small | 1536 | Best price/performance; recommended |
| text-embedding-3-large | 3072 | Highest quality, multilingual |
| text-embedding-ada-002 | 1536 | Legacy; avoid for new projects |
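The dimension column translates directly into storage cost. As a rough sketch, assuming each vector component is stored as a 4-byte float32 (Qdrant's default vector datatype) and ignoring index, sparse-vector, and payload overhead, raw dense-vector size can be estimated as follows. `raw_vector_bytes` is a hypothetical helper, not a RAGWire function.

```python
# Dimensions taken from the comparison table above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(model: str, num_chunks: int) -> int:
    """Bytes needed for the dense vectors alone (no index or payload overhead)."""
    return EMBEDDING_DIMS[model] * BYTES_PER_FLOAT32 * num_chunks


for model in EMBEDDING_DIMS:
    mib = raw_vector_bytes(model, 100_000) / (1024 ** 2)
    print(f"{model}: {mib:.1f} MiB for 100,000 chunks")
```

For 100,000 chunks this works out to about 586 MiB with a 1536-dimension model and about 1,172 MiB with the 3072-dimension model, so doubling the dimensions doubles the raw vector footprint.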
6. Build a RAG Agent
Use create_agent to wrap the retriever as a tool and build a conversational Q&A app:
from langchain.agents import create_agent
from langchain.tools import tool
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver

from ragwire import RAGWire

rag = RAGWire("config.yaml")
rag.ingest_directory("data/")


@tool
def search_documents(query: str) -> str:
    """Search the document knowledge base for relevant information."""
    results = rag.retrieve(query, top_k=5)
    if not results:
        return "No relevant documents found."
    return "\n\n---\n\n".join(
        f"[{doc.metadata.get('file_name')}]\n{doc.page_content}"
        for doc in results
    )


agent = create_agent(
    model=ChatOpenAI(model="gpt-5.4-nano"),
    tools=[search_documents],
    system_prompt=(
        "You are a helpful document assistant. "
        "Always use search_documents to retrieve information before answering; "
        "never answer from general knowledge. "
        "If no relevant documents are found, say so; do not guess or fabricate an answer. "
        "Always cite the source document in your answer."
    ),
    checkpointer=InMemorySaver(),
)

# The thread_id ties successive calls to the same conversation in the checkpointer.
config = {"configurable": {"thread_id": "session-1"}}
response = agent.invoke(
    {"messages": [HumanMessage("What is the total revenue?")]},
    config=config,
)
print(response["messages"][-1].content)
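The formatting step inside `search_documents` can be pulled into a pure function so it is testable without calling OpenAI or Qdrant. The sketch below uses a stand-in `Doc` dataclass with the same `metadata` and `page_content` attributes the example relies on. `Doc` and `format_results` are illustrative names, not RAGWire APIs, and the `'unknown'` fallback for a missing file name is an added assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a retrieved document (hypothetical, for testing only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_results(results) -> str:
    """Join retrieved documents into one tool response, each tagged with its source file."""
    if not results:
        return "No relevant documents found."
    return "\n\n---\n\n".join(
        f"[{doc.metadata.get('file_name', 'unknown')}]\n{doc.page_content}"
        for doc in results
    )


sample = [
    Doc("Total net sales were reported in the annual filing.", {"file_name": "Apple_10k_2025.pdf"}),
    Doc("A chunk without source metadata."),
]
print(format_results(sample))
```

Keeping the source tag on every chunk is what lets the agent follow the system prompt's instruction to cite the source document.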
See RAG Agent for the full guide including multi-turn memory and structured output.
Notes
- If you change the embedding model after ingestion, set force_recreate: true once to rebuild the collection (dimensions may differ, and vectors from different models are not comparable even when the dimensions match).
- The API key can also be passed directly in config as api_key: "sk-...", but environment variables are preferred.
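The first note can be turned into a small guard that runs before ingestion. As a sketch, using the dimensions from the comparison table, the hypothetical helper `recreate_reason` below explains why a rebuild is needed when the configured model differs from the one used to build the collection. How the previously used model name is recorded is left to the caller; RAGWire is not assumed to track it.

```python
# Dimensions taken from the comparison table above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def recreate_reason(old_model: str, new_model: str):
    """Return why the collection must be rebuilt, or None if the model is unchanged."""
    if old_model == new_model:
        return None
    old_dim, new_dim = EMBEDDING_DIMS[old_model], EMBEDDING_DIMS[new_model]
    if old_dim != new_dim:
        return f"dimension mismatch: {old_dim} vs {new_dim}"
    # Equal sizes still come from different embedding spaces.
    return "same dimensions, but vectors from different models are not comparable"


print(recreate_reason("text-embedding-3-small", "text-embedding-3-large"))
# prints: dimension mismatch: 1536 vs 3072
```

A non-None result is the signal to set force_recreate: true for one run and then switch it back to false.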