Inspect What's in Your Collection¶
Before querying, understand what's actually stored.
Collection Stats¶
from ragwire import RAGWire
rag = RAGWire("config.yaml")
stats = rag.get_stats()
print(f"Collection : {stats['collection_name']}")
print(f"Total chunks: {stats['total_documents']}")
print(f"Vector size : {stats['vector_size']}")
Discover Metadata Fields¶
# What metadata fields exist across all stored documents?
fields = rag.discover_metadata_fields()
print(f"Fields: {fields}")
# → ['company_name', 'doc_type', 'fiscal_year', 'file_name', 'file_type', 'chunk_index', ...]
Get Unique Values per Field¶
# What values are stored for the key fields?
values = rag.get_field_values(["company_name", "doc_type", "fiscal_year"])
print(values)
# → {
# 'company_name': ['apple', 'microsoft', 'google'],
# 'doc_type': ['10-k', '10-q'],
# 'fiscal_year': ['2024', '2025'],
# }
get_field_values uses Qdrant's native facet API — fast and exact regardless of collection size. Results are ordered by frequency (most common values first).
Single Field Shorthand¶
Increase the Value Limit¶
By default up to 50 unique values are returned per field. Increase for high-cardinality fields:
Use Inspection to Build a Smart Agent¶
Feed the discovered values into your agent's system prompt so it knows what to filter on:
values = rag.get_field_values(["company_name", "doc_type", "fiscal_year"])
system_prompt = f"""
You are a financial document assistant.
Available companies: {values['company_name']}
Available doc types: {values['doc_type']}
Available years: {values['fiscal_year']}
"""
See RAG Agent for the full metadata-aware agent example.