ScholarTurbo: Use ChatGPT to chat with PDFs (supports GPT-4)

120

by marcametz

s1mon

I've tried this and chatpdf with some Computer Aided Geometric Design mathematics theses and papers. These tools do not seem to "read" the math parts of the papers, only the text. It would be amazing to have something which could digest a bunch of papers and start to help me navigate them, but a lot of the answers I get are "I don't have enough context to help you". Is there a better tool or way to use LLMs with scholarly mathematics papers?

alsodumb

You could just put the LaTeX source of the paper in the context window if you want it to be aware of the math. You can download the LaTeX source for any paper on arXiv.

If you have many papers, perhaps upload the LaTeX source of all of them into a vector database like Pinecone and use LangChain for retrieval.
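
A minimal sketch of that retrieval idea, using only the Python standard library. The bag-of-words `embed` function is a toy stand-in for a real embedding model, and the in-memory list stands in for a vector database such as Pinecone; the function names and the sample LaTeX snippets are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical chunks of LaTeX source, kept as raw strings so the math survives.
latex_chunks = [
    r"The Lyapunov function $V(x) = x^T P x$ satisfies $\dot{V}(x) < 0$.",
    r"A Bezier curve is $B(t) = \sum_{i=0}^{n} b_i B_i^n(t)$ with Bernstein basis.",
    r"The Kalman gain is $K_k = P_k H^T (H P_k H^T + R)^{-1}$.",
]

print(top_k("What is the Kalman gain?", latex_chunks, k=1))
```

The retrieved chunk still contains the raw LaTeX, so the equations reach the model as text instead of being lost in PDF extraction.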

muffles

Great point, I think this is important knowledge to spread. Using LaTeX is a sweet hack for doing math-based analysis of papers with LLM frameworks, as long as your work is already in LaTeX or you can convert it to LaTeX. It seems to be able to understand Lyapunov stability equations, Kalman filter analysis, or Hamiltonian equations when I paste them in. There are definitely hallucinations here and there, but the understanding is amazing. For instance, you can convert a Word document with MathType equations to LaTeX, copy it into ChatGPT, and it can articulate what every line-item equation is doing with reasonable coherence. It's likely not accurate enough for cutting-edge scientists, but it's pretty sweet for engineers.

arthurcolle

this is a great idea, totally doable with what is out there

boredemployee

It's not always great, but try explainpaper [1].

I have nothing to do with it, but to me it's the best; they seem to have put real effort into it since day one.

[1] explainpaper.com

erichocean

Comment of the day, this is why I keep checking HN daily.

boredemployee

you're welcome :D

nextworddev

It’s because embedding math equations and data requires a different paradigm

stuckkeys

Yeah, it is mediocre. I think the issue is having to first upload/read and then push back data. It has not been optimized yet. It needs a lot of work. I am not sure which open-source project has the capability to run this locally yet. Perhaps that would be faster, but I have not played with or explored that area.

technics256

Do you have any example papers I can play with?

toddmorey

In answer to file security: "We utilize Amazon Web Services (AWS) for uploading and storing files, ensuring robust security measures, including data encryption and access control."

Well, sure, but that's the file at rest. What about the security / privacy of the data as it's fed to ChatGPT? More and more of these "ask your doc / data a question" apps are popping up. Isn't this about equivalent to putting the document on the public internet or is there some sort of sandboxing involved?

slashdev

If you use the API, which this clearly does, they don't train on the input. You can see their guarantees in their terms and conditions, but you have a reasonable expectation of privacy.

I say reasonable, because in a company growing that fast, everything is on fire all the time and security is the last thing considered.

aik

I would imagine Azure OpenAI’s GPT APIs are more secure, and they are now HIPAA compliant as well.

anonymouse008

I’d really love a consultant to stick their neck out and say these environments are HIPAA/HITECH compliant. A press release does not make a defense in a breach.

This may be a dumb question, but how would Azure ensure GPU clean tenancy and segmentation of data in these pipelines at those prices?

[0] https://learn.microsoft.com/en-us/compliance/regulatory/offe...

‘Microsoft in-scope cloud platforms & services Azure and Azure Government Azure DevOps Services Dynamics 365 and Dynamics 365 U.S. Government Intune Microsoft Defender for Cloud Apps Microsoft Healthcare Bot Service’

Are they piggybacking on that?

jeron

source for the curious: https://azure.microsoft.com/en-us/products/cognitive-service...

dndn1

I look at these tools to see how they communicate these data processing/security details to their users.

There appears to be a lot of apathy, or total disregard, toward giving users any clear clue.

It might not matter to me or this audience - we probably know the direction our data is heading - but it matters to less techie people, to ordinary employees and to civil servants and government people at the bottom and the top, who have no notion where their data is processed and who they are trusting with it.

beoberha

I think these GPT wrapper apps are not too unlike Dropbox. They face a good amount of (fair) criticism as too simple of a solution for any technically minded person. But if you wrap some of the annoying parts into a nice app, you could have a very successful business for non technical people. Even technical enterprises can benefit from enterprise level security features that they themselves may not want to deal with.

quickthrower2

I think Dropbox won by doing something well when anything similar really sucked, and I think a lot of that was in making the syncing 'just work'. A lot of online cloud drives were (and still are) untenable because of the babysitting they need and the frustration when files don't sync or are slow to access.

With all these ChatGPT wrappers, the hard bit is pretty much a monopolized commodity, so they better do the annoying bits really really well.

This is the second PDF chatbot I have tried, and they both suffer from the technical constraint of how many tokens you can fit into ChatGPT, meaning they refuse to answer global questions like "are there any mistakes in the document?", since ChatGPT may be unaware that there is a document at all, but they can answer, for example for a CV, "what jobs might you hire this person for?".

This might be fixable by making round trips, for example asking ChatGPT for a YES/NO answer to "does this query apply to this 1000-token section of a larger document?", and so on.

And of course if they get this right, and do it well, it becomes very useful. If it can answer based on 10000 stored pdfs, word docs, JIRA tickets, confluence pages and slack threads, then it would be super useful for organizations. It would hardly be just a wrapper then though.
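
A rough sketch of that round-trip idea, under stated assumptions: `ask_model` is a hypothetical callable that sends a prompt to whatever LLM you use and returns its text reply, and sections are split by word count as a crude proxy for a 1000-token limit. The `fake_model` stub only exists so the example runs without any API access.

```python
def split_into_sections(text, max_words=1000):
    """Split text into consecutive sections of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def relevant_sections(question, document, ask_model, max_words=1000):
    """Ask the model YES/NO for each section and keep the sections it says YES to."""
    kept = []
    for section in split_into_sections(document, max_words):
        prompt = (
            f"Question: {question}\n"
            f"Section:\n{section}\n"
            "Does the question apply to this section? Answer YES or NO."
        )
        if ask_model(prompt).strip().upper().startswith("YES"):
            kept.append(section)
    return kept


def fake_model(prompt):
    """Stub standing in for a real LLM call: says YES if the section contains 'teh'."""
    section = prompt.split("Section:\n", 1)[1]
    return "YES" if "teh" in section.split() else "NO"


doc = "alpha beta teh gamma delta epsilon"
print(relevant_sections("Are there any mistakes?", doc, fake_model, max_words=3))
```

This costs one model call per section, so it trades latency and API spend for coverage of the whole document.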

kelseyfrog

Finally, I can have a conversation with Pepsi Gravitational Field[1] by Arnell Group, and Chicken Chicken[2].

1. https://archive.org/details/pepsi_gravitational_field/mode/2...

2. https://isotropic.org/papers/chicken.pdf

shanusmagnus

The chicken one speaks for itself, but any context on wtf is the origin story on the Pepsi one?

kelseyfrog

It has been discussed before

https://news.ycombinator.com/item?id=32064324 (2022)

https://news.ycombinator.com/item?id=19756602 (2019)

https://news.ycombinator.com/item?id=11038059 (2016)

DeltaCoast

That Pepsi doc made the rounds like 15 years ago. I remember seeing it first on Reddit; someone had claimed it was a branding pitch. There was a lot of back and forth on whether or not it's a hoax because it's absurd, but I've been in some meetings where we've gone through branding presentations that sound pretty similar.

yuanyuanshi

I want to try this, but it's stuck at processing, showing 100% but not usable.

I've tried a lot of chat-with-pdf apps, including ChatDOC, ChatPDF, Humata and others. For serious paper reading, I must say that ChatDOC (https://chatdoc.com/) is the best option.

The accuracy of ChatDOC’s understanding of tables, data, and texts is significantly higher than other products. Also you can upload a folder of files and chat with the collection, easily navigating through each one.

For example: https://chatdoc.com/chatdoc/#/share/ub8Z7aarcXJHY_SzGwl3LFVw...

I uploaded a paper about GPT and asked ChatDOC to summarize the main content, explain the experimental data, and find relevant data. It completed all these tasks very well. Amazing.

Amelia21

ChatDOC’s document processing ability is excellent. I can preview the document online and find the corresponding original content for the answer, but the other app does not have this feature yet

yu07

Same problem here... I can't upload my PDF. Thanks for the recommendation! I just tried ChatDOC and I really like that the answers list all the page numbers of the cited sources. Cool! I'll continue to try out the multi-file chat feature you mentioned. Hope it can save me from a pile of files!

marcametz

Since HN took off we reached max. capacity with our vector database. Working on it!

traxleo

ChatDOC can also support doc/docx files. Nice!

Xunxi

There's a gentleman I follow on Twitter who ran a bootcamp after ChatGPT was unveiled: https://twitter.com/mayowaoshin

He had demo videos on how to build similar applications. From what I understand the first cohort was made up of people from all manner of professions, notably consulting, finance, academia, manufacturing etc.

I knew it was only a matter of time before we see clones offering this as a service. To put it mildly, some even went as far as copying his code verbatim and using it to launch such services without acknowledging his effort.

One major downside of releasing content and open source code is the handful of users who copy and paste for commercial use without giving credit.

All of a sudden there's a wave of "chat with pdf" youtube videos using my exact content, diagrams and images.

https://twitter.com/mayowaoshin/status/1650181657746370561

nextworddev

No, he didn’t “invent” the Chat with PDF use case, given there were already Chat with PDF SaaS apps around even in January, and LangChain etc. had PDF loaders even back in January. Plus most of this code base seems to come straight out of the OpenAI cookbook.

But yes, Twitter is a Wild West right now of people ripping ideas off any trending repos.

pirate787

How do we know ChatGPT isn't hallucinating parts of its answers? I think this is an interesting service but what are the guardrails so we can trust the answer?

oofbey

GPT definitely makes mistakes. GPT4 less than GPT3.5, and way less than Bard.

Having spent quite a lot of professional time working through real-world problems with GPT4, I'll say it very rarely does something I'd call hallucination. It almost never invents crazy things out of whole cloth. Sometimes it makes wrong guesses about things.

You gotta keep your own head and be skeptical. Often you can tell if it's coming unhinged. If it makes mistakes and you notice, it'll definitely own them unlike many LLMs. But of course it's not perfect.

williamcotton

Hallucinations mainly occur when the prompt does not contain the information required for the completion.

When you add parts of the content of the PDF to the prompt, which is how these tools work, then it is very likely to produce a completion consistent with the PDF.

I say “mainly” and “very likely”, but in my experience I have never had a hallucination when the prompt is augmented with factual data.

Hallucinations occur with regularity when the prompt does not contain what is requested to be completed.
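
To make the prompt-augmentation step concrete, here is a minimal sketch. It assumes the relevant excerpts have already been retrieved from the PDF; the function name and the prompt wording are illustrative and are not taken from any particular tool.

```python
def build_prompt(question, excerpts):
    """Build a prompt that places numbered document excerpts before the question."""
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    return (
        "Answer using only the excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


print(build_prompt(
    "What filter is used?",
    ["The state is estimated with a Kalman filter.", "Noise is assumed Gaussian."],
))
```

Numbering the excerpts also lets the answer point back to which passage it relied on.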

boredemployee

That's the problem, we don't. All the tools on top of GPT that I tried have the same problem.

freedmand

That's a big part of the reason I built the open source tool Semantra[1]. I only really trust the semantic search part of this problem as it is constrained by context, so I tried to make the user experience good on top of simpler semantic search without any chatbot capabilities.

[1] https://github.com/freedmand/semantra

boredemployee

that seems a really good tool, will try it later. thanks for sharing, I hope you get the deserved traction with it!

slashdev

The same you get with ChatGPT - no guarantees. It's BYOB (Bring Your Own Brain), so don't trust it more than you would anything else you read on the Internet.

I get that people have an objection to how LLMs invent things, but so do people. You should always verify things.

paulnovacovici

What sets this apart from https://www.chatpdf.com/?

gharman

Looks like ScholarTurbo has GPT-4 and ChatPDF is still on the waitlist. But yeah, these chat-with-your-doc webapps are coming out of the woodwork.

ajhai

For fun, I whipped up a quick one using our platform at https://trypromptly.com/app/65573e71-c05b-4567-87e8-49688c82.... (Initial chunking of the PDF takes a bit of time because of rate limits on my account.)

vyrotek

Also https://pdf.ai

mattfrommars

Woah, really like the UI/UX of this site. Anyone know what UI framework they could be using?

ch1234

Looks like Tailwind, judging from the stylesheet code.

wskish

My partner and I recently launched something similar we are calling JiggyBase:

https://jiggy.ai

The core is open source (https://github.com/jiggy-ai)

hammeiam

How far do we think this product is from something that uses LangChain to parse PDFs, create and store chunks/embeddings, and feed them to ChatGPT (or a local model)? Do we think there's any secret sauce?
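
For reference, the chunking step in a pipeline like that can be sketched in a few lines. This is an illustrative stdlib-only version that assumes the PDF text has already been extracted; it does not use LangChain, and the chunk size and overlap are arbitrary values counted in words.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into word chunks of `size`, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing some words twice.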

jrpt

I can say as someone who made something similar (Docalysis.com) that there's a lot of "secret sauce" in actually getting the outputs to be higher quality, especially with processing larger files. Just today I got another email from a user saying the quality was better than other AI products they tried.

I also don't use Langchain.

jerpint

This is probably it +/- implementation details

ralphc

I'm not using anything like this for personal or corporate information but this is real handy for laws or government documents. I uploaded the Durham Report, a 300 page PDF, and have been asking questions.

chopete3

This is similar to WordPress websites or Android apps. The iPhone App Store had a high toll bar: $99/year.

There will be a lot of these quick-to-build, interesting utilities to show off. There is room to feed the demand, as with WordPress sites and AdWords-based websites.

I think OpenAI will keep the platform cheap long enough to collect the small toll and make the innovation go wild.

There is perhaps room for one more OpenAI-style provider to feed this demand, like what Android did to iOS. It is a pity if it's not Google.

jxy

True. I can't believe how simple OpenAI made their API. They really made it trivial to start using it. Compared to Google's, words fail me to even describe the frustration with GCP just to start testing its text-bison.

matrix2596

Cool product! Seamless to try out and works quite well. But I am not able to ask questions that are outside the papers but relevant to them. "I'm sorry, but the given context does not provide information on the advantages of using quaternions over Euler angles in general. My programming is limited to answering questions related to the given context. Is there anything else I can help you with?"

ukuina

Isn't that kind of the point? I would greatly prefer it stays within the confines of the supplied documents.

florakel

If you are a ChatGPT Plus user you can also use ChatGPT 4 with the AskYourPDF plugin. The advantage is that you can just feed the PDF through your own URL and don't have to upload it to a third party.

alsodumb

I mean I'm assuming the plugin would still download the PDF from your url to process it and create embeddings, right? I don't see how it's any different from uploading it.

jll29

This demo (similar to ExplainPaper and pdf.ai) seems to be offered by a one-person company in the South Western German city of Überlingen (near Freiburg).

ada1981

Can’t you drop the full text into Claude?

airocker

We are doing a Hackathon currently with the exact same idea. What are the chances?

mhitza

High. Everybody, myself included, had this idea. There have also been (at least 2) posts in the past about similar products. I stopped testing these out as they all "cheat", or at least in my opinion they cheat, by creating embeddings and doing a vector search before hitting up GPT with the context/answer.

williamcotton

Why do you consider this cheating? The context window is currently too small for it to work in any other manner.

Terretta

You could ask Claude+ 100k to give you the text context(s) containing anything pertinent then have GPT4 prompted with that.

williamcotton

With an emphasis on the word “could”! Very few people have access to Anthropic’s services at this point!

Embeddings still seem like a more economical approach.

o_____________o

Very high, there are a lot of these. It's not too far from a hello world project.

airocker

when will I get over hello world? Gosh

victor9000

Specialize

bvan

Having to upload pdf files is a non-starter for any business user.

chaxor

It's pretty disgusting to see how many of these exist, yet so few are local-only with local LMs. It's exceedingly obvious that local LMs are desired and far better for this task, and only idiots would be interested in paying for software like this that uses OpenAI.

RhodesianHunter

A very healthy portion of paid software fits in the category of "things you could do on your own machine, but would rather pay someone to run on their machine with a simplified UI".

This doesn't make the users idiots, it just means they value their time.
