Hey!
I like the Include Data feature/tool. However, I’m running into a limitation that significantly reduces its value: it seems to be constrained by the context window in a single pass.
For example, when I query “check calls transcripts over the last 3 months,” it only includes documents since Aug 13th (that is, yesterday only, as shown in my screenshot), even when using a Gemini model with a 1M-token context window.
This makes the tool feel quite limited for broader queries.
Suggestion 1: Could Dust handle this automatically with some kind of iterative loop or chunking mechanism? The tool would be incredibly powerful if it could:
Automatically paginate through results beyond the context window
Summarize/compress information from multiple passes
Intelligently prioritize the most relevant documents
Show users what was included vs. what was filtered out
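A rough sketch of the loop described above, in Python. This is not how Dust works internally. Every name here (`Doc`, `estimate_tokens`, `summarize`, `include_data_in_passes`) is hypothetical, the 4-characters-per-token estimate is a crude assumption, and `summarize` is a stand-in for a real model call.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    relevance: float  # hypothetical relevance score from the search step


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def summarize(texts: list[str], max_chars: int = 200) -> str:
    # Placeholder for a model call: it only truncates, so the sketch runs offline.
    return " | ".join(t[:max_chars] for t in texts)


def include_data_in_passes(docs, budget_per_pass, max_passes):
    """Batch documents by token budget, most relevant first, one summary per pass."""
    ranked = sorted(docs, key=lambda d: d.relevance, reverse=True)
    batches, current, used = [], [], 0
    skipped = []
    for doc in ranked:
        cost = estimate_tokens(doc.text)
        if cost > budget_per_pass:
            skipped.append(doc.doc_id)  # too large for any single pass
            continue
        if used + cost > budget_per_pass:
            batches.append(current)  # current pass is full, start a new one
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)

    included, summaries = [], []
    for i, batch in enumerate(batches):
        if i >= max_passes:
            # Out of passes: report these instead of dropping them silently.
            skipped.extend(d.doc_id for d in batch)
            continue
        summaries.append(summarize([d.text for d in batch]))
        included.extend(d.doc_id for d in batch)
    return summaries, included, skipped
```

The point of returning `included` and `skipped` alongside the summaries is that the user can see exactly which documents made it in and which were left out, rather than guessing from the answer.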
Suggestion 2: It would be really helpful to display the actual token consumption for the search process. Sometimes I struggle to understand when/why I’m hitting limits, and seeing something like “Used 850K/1M tokens for document retrieval” would provide much better visibility into what’s happening under the hood.
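As an illustration of the kind of status line meant here, a small Python sketch. The function names are made up for this example and do not correspond to anything in Dust.

```python
def format_tokens(n: int) -> str:
    # Compact display: 850000 -> "850K", 1000000 -> "1M".
    if n >= 1_000_000:
        return f"{n / 1_000_000:g}M"
    if n >= 1_000:
        return f"{n / 1_000:g}K"
    return str(n)


def usage_line(used: int, limit: int, step: str = "document retrieval") -> str:
    # Hypothetical status line shown next to the tool's output.
    return f"Used {format_tokens(used)}/{format_tokens(limit)} tokens for {step}"
```

Calling `usage_line(850_000, 1_000_000)` produces the example string quoted above.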
Thanks team 🙏