Hello team 🙂 The PDF parser doesn't work (most of the time, the assistant tells me that it cannot retrieve the text from the PDF). But if I upload the same PDF in Claude, it can retrieve the content without any problem. It would be super cool to have a better PDF parser in Dust 🙂 Thanks!
Thank you for flagging, Eliot Hallak! Do you mean when pushing the PDF using the "upload file" icon in the chat interface? Or do you mean something else?
Yes, that's what I mean. Haha, I just realized from your screenshot that this format is not supported yet
PDF should be supported 👀 However, there is an intermediary step that converts the PDF to a raw text file. Some info can be "lost" along the way, in particular everything in graphs, images, etc. Could this be the problem?
And just to confirm: Dust can’t read the PDFs in a Drive folder? Like, we need .docx files etc.? I find that curious too
Thanks for sharing, Greyg Sinigaglia de Malibran! By default it's indeed deactivated, but it can be activated for your workspace if needed 😊 The reason is that many companies have a ton of "irrelevant" PDFs in their drive that were adding more noise than help.
Ahhhh! Yesss please 🙏 Thanks Remi!
Greyg Sinigaglia de Malibran just to follow up on this - can you share an example of a conversation where we were not able to retrieve the content of your PDF? And just to confirm: your PDF is not only images/graphs?
Hey Adèle Dugré, I’ve just done some tests on Security / Compliance questions. The documents are 100% text (classic big legal documents 😂)
Greyg Sinigaglia de Malibran would you mind sharing the conversation ID or URL? Also, are you able to share the PDF (in private) or is it too sensitive? Thank you!