The other day I saw chatpdf.com on HN. It works fine, but it’s paid. Moreover, I’m not sure how they handle sensitive data in PDFs (can we actually trust privacy policy statements?). So I decided to look for alternatives. The discussion on HN gave some interesting clues.
Bing+Edge is underrated IMO. These days I’m more and more motivated to switch from Brave to Edge because of its Bing integration. My jaw dropped when I saw someone on YT ask questions about a PDF right in the Edge browser. In my experience, it doesn’t work all the time (some PDFs are suspiciously resistant to Bing), but when it does, it’s amazing! I was worried that Bing wouldn’t be able to read the whole PDF because of GPT-4’s limited context window. But surprisingly, Bing was able to answer questions about content on page 67! It even mentioned the page number.
How do we handle long PDF files? Chunk the PDF text and create an embedding for each chunk. Compute the cosine similarity between the user query’s embedding and each chunk’s embedding, then send OpenAI the top N chunks that fit within the model’s token limit.
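The steps above can be sketched roughly like this. It's a minimal illustration, not how chatpdf.com or Bing actually does it. To keep it runnable without an API key, `embed` is a toy bag-of-words stand-in for a real embedding model, and `count_tokens` just counts words instead of using a real tokenizer. All function names and default parameters here are my own assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    In practice you would call a real embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def count_tokens(text):
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


def select_chunks(query, chunks, top_n=3, token_budget=500):
    """Pick the top_n most similar chunks, skipping any that exceed the budget."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, embed(c)),
        reverse=True,
    )
    selected, used = [], 0
    for chunk in ranked[:top_n]:
        cost = count_tokens(chunk)
        if used + cost > token_budget:
            continue  # this chunk would overflow the prompt; skip it
        selected.append(chunk)
        used += cost
    return selected
```

You would then paste the selected chunks into the prompt along with the user's question. The overlap between chunks is there so that a sentence split across a chunk boundary still shows up whole in at least one chunk.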