Clarifying Query Structure for Effective Data Source in App Development

Hi there, I’ve built an app that I use in one of my bots. The first block of the app is a data_source block, which I use to find related discussions (from call transcript files). From what I understand, the query section in the block specifies what the semantic search should target. However, no matter what I input in the query, I consistently get the same 24 chunks returned—even though there are over 100 documents with many more chunks available. I might be misunderstanding how to structure the query. I’ve gone through the documentation, but it’s still unclear. Could you provide a clear example of a properly written query so I can ensure I’m using it correctly? Thanks in advance!

  • Nathys Marchal

    Also, I cannot access this section of the docs: https://docs.dust.tt/docs/core-blocks. I'm getting a blank page.

  • Remi

    Hey Nathys Marchal 👋 Thanks for your question! Does the following answer from @help help? Happy to forward this to someone from the team if needed.

  • Remi

    "Also cannot access this section of the doc:"

    Not sure what you mean here. Could you clarify?

  • Nathys Marchal

    The page just does not load; it stays completely white.

  • Nathys Marchal

    Yes, I tried telling it "find discussions about...", but whatever I put after "about", it returns the same chunks and the same number of chunks, so it doesn't look like it's working.

  • Nathys Marchal

    Even worse: when I say "find discussions about companyX", 10% of the time I get 26 items returned and 90% of the time I get nothing returned. The query isn't working well and isn't consistent. Do you have similar issues on your side?

  • Remi

    Can you share a screenshot of your Dust app? 🙏

  • Nathys Marchal

    With the query, I'm getting 24 chunks (all from the same document), whereas when I remove the query I get all 800 chunks. Checking manually, most of those chunks mention and discuss the topic of the query.
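
One way to check the symptom described above is to run the block with two unrelated queries and compare the chunks that come back. The sketch below assumes each result has been reduced to a (document ID, chunk offset) pair; that representation, the function name, and the sample values are hypothetical and not taken from the Dust API.

```python
def chunk_overlap(results_a, results_b):
    """Jaccard overlap between two sets of retrieved chunks (1.0 = identical)."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical chunk identifiers: (document_id, chunk_offset)
query_1 = [("call-001", 0), ("call-001", 1), ("call-001", 2)]
query_2 = [("call-001", 0), ("call-001", 1), ("call-001", 2)]
query_3 = [("call-007", 4), ("call-001", 1), ("call-012", 0)]

print(chunk_overlap(query_1, query_2))  # identical result sets
print(chunk_overlap(query_1, query_3))  # mostly different result sets
```

An overlap close to 1.0 for unrelated queries would suggest that the query text is not influencing retrieval at all.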

  • Remi

    Thanks! Just to make sure I understand, why did you decide to create an app instead of an assistant with a "search" tool plugged into your data source?

  • Nathys Marchal

    Because I have thousands of files and each file has many pages. I already tried using the search function directly and the results were very poor. The purpose of the app is to pass as sources only the parts of each file that relate to the topic, instead of the full file.

  • Nathys Marchal

    That works pretty well compared to the normal search, but the semantic search isn't working well at all.

  • Nathys Marchal

    The app also lets me use scripts to remove specific elements, such as salesperson speeches, which tend to confuse the results.
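
As an illustration of that kind of cleanup, here is a minimal sketch that drops one speaker's lines from a transcript chunk. It assumes each line has the form `Speaker: utterance`; the speaker labels, the sample text, and the function name are hypothetical, and this is not the script used in the app.

```python
def drop_speakers(chunk_text, speakers_to_drop):
    """Return the chunk text without lines spoken by the given speakers."""
    dropped = {s.strip().lower() for s in speakers_to_drop}
    kept = []
    for line in chunk_text.splitlines():
        speaker, sep, _ = line.partition(":")
        # Lines without a "Speaker:" prefix are kept untouched, so a
        # multi-line utterance would need extra handling.
        if sep and speaker.strip().lower() in dropped:
            continue
        kept.append(line)
    return "\n".join(kept)


# Hypothetical transcript chunk
chunk = (
    "Sales rep: Let me walk you through our pricing.\n"
    "Customer: We mostly struggle with onboarding.\n"
    "Sales rep: Understood, we can help with that.\n"
    "Customer: Good, that is our priority this quarter."
)
cleaned = drop_speakers(chunk, ["Sales rep"])
print(cleaned)
```

Filtering before the chunks reach the model keeps only the customer's statements as sources.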

  • Remi

    OK! I forwarded the thread to the team. Someone will jump in soon 😊

  • Nathys Marchal

    Appreciate it 🙂

  • Guillaume Sondag

    Hello Remi, I'm looking to do more or less the same thing as Nathys Marchal. I have a folder with one file per transcript, and I want to search by ID so that only the data in that file is returned and I can work with that information alone. What did you end up doing?

  • Remi

    Guillaume Sondag, I think the best way to do this is to add the ID as a tag on each doc using the API, then use a "data source" block and add a "tag" filter with a variable for the ID. Let me know if you need more specific guidance; I can ask someone else on the team. 😊
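
The approach above can be sketched as follows. This only illustrates the idea and is not the Dust API: the `transcript_id:` tag prefix, the payload field names, and both helper functions are assumptions, and the actual endpoint and schema should be taken from the Dust API reference.

```python
def build_document_payload(transcript_id, text):
    """Build an upsert body that carries the transcript ID as a tag.

    The field names ("text", "tags") are assumed for illustration.
    """
    return {"text": text, "tags": [f"transcript_id:{transcript_id}"]}


def filter_by_tag(documents, transcript_id):
    """Mimic a tag filter: keep only documents tagged with the given ID."""
    wanted = f"transcript_id:{transcript_id}"
    return [doc for doc in documents if wanted in doc["tags"]]


# Hypothetical transcripts, one document per file.
documents = [
    build_document_payload("1001", "Call with company A about onboarding."),
    build_document_payload("1002", "Call with company B about pricing."),
]

matches = filter_by_tag(documents, "1002")
print([doc["text"] for doc in matches])
```

In the app, the ID would come from an input variable rather than a literal, so the search only sees chunks from the one tagged file.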