Troubleshooting FAQ Chatbot: Document Retrieval Issues Explained
Hey there 👋 I'm setting up an FAQ chatbot that sources content from a Confluence space (a set of ~250 pages). I’ve noticed the Agent is returning irrelevant answers for some queries. After some digging, it looks like the RAG system isn't retrieving documents dated before Nov 2024 (see attached message), which is likely why relevant content is being missed. A few questions:
Why is document retrieval limited by date? Shouldn't semantic relevance take precedence?
With only ~250 pages in the space, I wouldn't expect a hard date cutoff to be necessary.
Is there something I’m overlooking, or a way to override this behavior? Thanks a lot for your insights.