Enhancing Include Data Tool: Suggestions for Improved Functionality
Hey! I like the Include Data feature/tool. However, I’m running into a limitation that significantly reduces its value: it seems to be constrained by what fits in the context window in a single pass. For example, when I query “check calls transcripts over the last 3 months,” it only includes documents since Aug 13th (that is, yesterday only, as shown in my screenshot), even when using a Gemini model with a 1M-token context window. This makes the tool feel quite limited for broader queries.
Suggestion 1: Could Dust automatically handle this with some kind of iterative loop or chunking mechanism? The tool would be incredibly powerful if it could:
Automatically paginate through results beyond the context window
Summarize/compress information from multiple passes
Intelligently prioritize the most relevant documents
Show users what was included vs. what was filtered out
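To make the idea concrete, here is a rough Python sketch of what such a loop could look like. Everything in it is hypothetical: `Doc`, `estimate_tokens`, `batch_by_budget`, and `include_data` are made-up names for illustration, not Dust APIs, and the "4 characters per token" estimate is a crude stand-in for a real tokenizer. It only shows the shape of the idea: rank documents, pack them into passes that each fit a token budget, summarize each pass, and report what was included, what was filtered out, and how many tokens were used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Doc:
    title: str
    text: str
    relevance: float  # hypothetical score coming from the retrieval step


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token. A real tokenizer will differ.
    return max(1, len(text) // 4)


def batch_by_budget(docs: List[Doc], budget: int) -> Tuple[List[List[Doc]], List[Doc]]:
    """Pack docs, most relevant first, into batches that each fit the token budget."""
    batches: List[List[Doc]] = []
    skipped: List[Doc] = []
    current: List[Doc] = []
    used = 0
    for doc in sorted(docs, key=lambda d: d.relevance, reverse=True):
        cost = estimate_tokens(doc.text)
        if cost > budget:
            # A single document larger than one pass is reported as filtered out.
            skipped.append(doc)
            continue
        if current and used + cost > budget:
            # The current pass is full: close it and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches, skipped


def include_data(docs: List[Doc], budget: int,
                 summarize: Callable[[List[Doc]], str]) -> Dict[str, object]:
    """Run one summarization pass per batch and report what happened."""
    batches, skipped = batch_by_budget(docs, budget)
    summaries = [summarize(batch) for batch in batches]
    tokens_used = sum(estimate_tokens(d.text) for b in batches for d in b)
    return {
        "summaries": summaries,
        "included": [d.title for b in batches for d in b],
        "filtered_out": [d.title for d in skipped],
        "passes": len(batches),
        "tokens_used": tokens_used,
    }
```

In this sketch `summarize` would be the model call that compresses one pass; the returned `included`, `filtered_out`, and `tokens_used` fields are the kind of information that could be surfaced to the user.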
Suggestion 2: It would be really helpful to display the actual token consumption for the search process. Sometimes I struggle to understand when/why I’m hitting limits, and seeing something like “Used 850K/1M tokens for document retrieval” would provide much better visibility into what’s happening under the hood.
Thanks team 🙏
