I've tried this and chatpdf with some Computer Aided Geometric Design mathematics theses and papers. These tools do not seem to "read" the math parts of the papers, only the text. It would be amazing to have something which could digest a bunch of papers and start to help me navigate them, but a lot of the answers I get are "I don't have enough context to help you". Is there a better tool or way to use LLMs with scholarly mathematics papers?
You could just put the LaTeX source of the paper in the context window if you want it to be aware of the math. You can download the LaTeX source for most papers on arXiv.
If you have many papers, perhaps upload the LaTeX source of all of them into a vector database like Pinecone and use them with LangChain for retrieval.
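A minimal sketch of the retrieval idea above, assuming a toy bag-of-words embedding in place of a real embedding model and a plain Python list in place of Pinecone or LangChain. The names `embed`, `cosine`, `top_k`, and the sample chunks are hypothetical and only illustrate the shape of the approach.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: counts of lowercase letter runs (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical LaTeX snippets standing in for chunks of downloaded paper sources.
latex_chunks = [
    r"A B\'ezier curve of degree $n$ is $B(t) = \sum_{i=0}^{n} b_i B_i^n(t)$.",
    r"The Kalman gain is $K_k = P_k H^T (H P_k H^T + R)^{-1}$.",
    r"A Lyapunov function satisfies $\dot{V}(x) < 0$ for $x \neq 0$.",
]

best = top_k("What is the Kalman gain?", latex_chunks, k=1)
```

The retrieved chunk keeps its LaTeX intact, so the equations reach the model as source rather than as text extracted from a PDF.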
Great point, I think this is important knowledge to spread. Using LaTeX is a sweet hack for math-based analysis of papers with LLM frameworks, as long as your work is already in LaTeX or can be converted to it. It seems to understand Lyapunov stability equations, Kalman filter analysis, or Hamiltonian equations when I paste them in. There are definitely hallucinations here and there, but the understanding is amazing. For instance, you can convert a Word document with MathType equations that use LaTeX, copy it into ChatGPT, and it can articulate what each equation is doing with reasonable coherence. It's likely not accurate enough for cutting-edge scientists, but it's pretty sweet for engineers.
this is a great idea, totally doable with what is out there
It's not always great, but try explainpaper [1].
I have nothing to do with it, but to me it's the best, they seem to put some real effort on it since day one.
[1] explainpaper.com
Comment of the day, this is why I keep checking HN daily.
you're welcome :D
It’s because embedding math equations and data requires a different paradigm
Yeah, it is mediocre. I think the issue is having to first upload/read and then push back data. It has not been optimized yet and needs a lot of work. I am not sure which open-source model has the capability to run this locally yet. Perhaps that would be faster, but I have not played with or explored that area.
Do you have any example papers I can play with?
In answer to file security: "We utilize Amazon Web Services (AWS) for uploading and storing files, ensuring robust security measures, including data encryption and access control."
Well, sure, but that's the file at rest. What about the security / privacy of the data as it's fed to ChatGPT? More and more of these "ask your doc / data a question" apps are popping up. Isn't this about equivalent to putting the document on the public internet or is there some sort of sandboxing involved?
If you use the API, which this clearly does, they don't train on the input. You can see their guarantees in their terms and conditions, but you have a reasonable expectation of privacy.
I say reasonable, because in a company growing that fast, everything is on fire all the time and security is the last thing considered.
I would imagine Azure OpenAI’s GPT APIs are more secure, and they are now HIPAA compliant as well.
I’d really love a consultant to stick their neck out and say these environments are HIPAA/HITECH compliant. A press release does not make a defense in a breach.
This may be a dumb question, but how would Azure ensure GPU clean tenancy and segmentation of data in these pipelines at those prices?
[0] https://learn.microsoft.com/en-us/compliance/regulatory/offe...
‘Microsoft in-scope cloud platforms & services Azure and Azure Government Azure DevOps Services Dynamics 365 and Dynamics 365 U.S. Government Intune Microsoft Defender for Cloud Apps Microsoft Healthcare Bot Service’
Are they piggybacking on that?
source for the curious: https://azure.microsoft.com/en-us/products/cognitive-service...
I look at these tools to see how they communicate these data processing/security details to their users.
There appears to be a lot of apathy, or total disregard, when it comes to giving users any clear clue.
It might not matter to me or this audience - we probably know the direction our data is heading - but it matters to less techie people, to ordinary employees and to civil servants and government people at the bottom and the top, who have no notion where their data is processed and who they are trusting with it.
I think these GPT wrapper apps are not too unlike Dropbox. They face a good amount of (fair) criticism as too simple of a solution for any technically minded person. But if you wrap some of the annoying parts into a nice app, you could have a very successful business for non technical people. Even technical enterprises can benefit from enterprise level security features that they themselves may not want to deal with.
I think Dropbox won on doing something well when anything similar really sucked, and I think a lot of that was in making the syncing 'just work'. A lot of online cloud drives were (and still are) untenable because of the babysitting and the frustration when files don't sync or are slow to access.
With all these ChatGPT wrappers, the hard bit is pretty much a monopolized commodity, so they better do the annoying bits really really well.
This is the second PDF chatbot I have tried, and they both suffer from the technical constraint of how many tokens you can fit into ChatGPT. They refuse to answer global questions like "are there any mistakes in the document?", since ChatGPT may be unaware that there is a document at all, but they can answer, for example for a CV, "what jobs might you hire this person for?".
This might be fixable by making round trips, for example asking ChatGPT for a YES/NO on whether the query applies to each 1000-token section of a larger document, and so on.
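A rough sketch of the round-trip idea, under stated assumptions: sections are split by whitespace-separated words rather than real tokens, and `keyword_judge` is a hypothetical stand-in for the model call that would return YES or NO for each section. In a real tool that function would send the question and the section to the chat API.

```python
def split_sections(text, size=1000):
    """Split text into sections of at most `size` words (approximation of tokens)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def relevant_sections(question, text, ask_yes_no, size=1000):
    """One round trip per section: keep the sections the judge answers YES for."""
    return [s for s in split_sections(text, size) if ask_yes_no(question, s) == "YES"]


def keyword_judge(question, section):
    """Hypothetical stand-in for the model: YES if a longer question word appears."""
    words = (w.lower().strip("?.,") for w in question.split())
    keywords = {w for w in words if len(w) > 4}
    return "YES" if any(k in section.lower() for k in keywords) else "NO"


document = "Education: BSc in physics. " * 3 + "Experience: five years as a data engineer."
kept = relevant_sections(
    "What experience does this person have?", document, keyword_judge, size=6
)
```

Only the sections that pass the filter would then be sent along with the actual question, which costs one extra call per section.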
And of course if they get this right, and do it well, it becomes very useful. If it can answer based on 10000 stored pdfs, word docs, JIRA tickets, confluence pages and slack threads, then it would be super useful for organizations. It would hardly be just a wrapper then though.
Finally, I can have a conversation with Pepsi Gravitational Field[1] by Arnell Group, and Chicken Chicken[2].
1. https://archive.org/details/pepsi_gravitational_field/mode/2...
The chicken one speaks for itself, but any context on wtf is the origin story on the Pepsi one?
It has been discussed before
https://news.ycombinator.com/item?id=32064324 (2022)
That Pepsi doc made the rounds like 15 years ago. I remember seeing it first on Reddit, where someone had claimed it was a branding pitch. There was a lot of back and forth on whether or not it's a hoax because it's absurd, but I've been in some meetings where we've gone through branding presentations that sound pretty similar.
I want to try this, but it's stuck at processing, showing 100% but not usable.
I've tried a lot of chat-with-pdf apps, including ChatDOC, ChatPDF, Humata and others. For serious paper reading, I must say that ChatDOC (https://chatdoc.com/) is the best option.
The accuracy of ChatDOC’s understanding of tables, data, and texts is significantly higher than other products. Also you can upload a folder of files and chat with the collection, easily navigating through each one.
For example: https://chatdoc.com/chatdoc/#/share/ub8Z7aarcXJHY_SzGwl3LFVw...
I uploaded a paper about GPT and asked ChatDOC to summarize the main content, explain the experimental data, and find relevant data. It completed all these tasks very well. Amazing.
ChatDOC’s document processing ability is excellent. I can preview the document online and find the corresponding original content for the answer, but the other app does not have this feature yet
same problem with this... can't upload my PDF...
Thanks for the recommendation! I just tried ChatDOC and I really like that the answers list all the page numbers of the cited sources. Cool! I'll continue to try out the multi-file chat feature you mentioned. Hope it can save me from a pile of files!
Since HN took off we reached max. capacity with our vector database. Working on it!
ChatDOC can also support doc/docx files. Nice!
There's a gentleman I follow on Twitter who ran a bootcamp after ChatGPT was unveiled: https://twitter.com/mayowaoshin
He had demo videos on how to build similar applications. From what I understand the first cohort was made up of people from all manner of professions, notably consulting, finance, academia, manufacturing etc.
I knew it was only a matter of time before we see clones offering this as a service. To put it mildly, some even went as far as copying his code verbatim and using it to launch such services without acknowledging his effort.
One major downside of releasing content and open source code is the handful of users who copy and paste for commercial use without giving credit.
All of a sudden there's a wave of "chat with pdf" youtube videos using my exact content, diagrams and images.
No, he didn’t “invent” the Chat with PDF use case, given there were already Chat with PDF SaaS apps around even in January, and LangChain etc. had PDF loaders even back in January. Plus most of this code base seems to come straight out of the OpenAI cookbook.
But yes, Twitter is a Wild West right now of people ripping ideas off any trending repos.
How do we know ChatGPT isn't hallucinating parts of its answers? I think this is an interesting service but what are the guardrails so we can trust the answer?
GPT definitely makes mistakes. GPT4 less than GPT3.5, and way less than Bard.
Having spent quite a lot of professional time working through real-world problems with GPT4, I'll say it very rarely does something I'd call hallucination. It almost never invents crazy things out of whole cloth. Sometimes it makes wrong guesses about things.
You gotta keep your own head and be skeptical. Often you can tell if it's coming unhinged. If it makes mistakes and you notice, it'll definitely own them unlike many LLMs. But of course it's not perfect.
Hallucinations mainly occur when the prompt does not contain the information required for the completion.
When you add parts of the content of the PDF to the prompt, which is how these tools work, then it is very likely to produce a completion consistent with the PDF.
I say “mainly” and “very likely”, but in my experience I have never had a hallucination when the prompt is augmented with factual data.
Hallucinations occur with regularity when the prompt does not contain what is requested to be completed.
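To illustrate what "adding parts of the content of the PDF to the prompt" can look like, here is a minimal sketch. The function name `build_prompt` and the instruction wording are assumptions for illustration, not how any of the tools in this thread actually phrase their prompts.

```python
def build_prompt(question, excerpts):
    """Place retrieved excerpts ahead of the question and ask the model to stay within them."""
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    return (
        "Answer using only the excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical excerpt standing in for text retrieved from an uploaded PDF.
prompt = build_prompt(
    "Who wrote the report?",
    ["The report was written by the audit team."],
)
```

Numbering the excerpts also gives the model something to cite, which makes it easier to check an answer against the source.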
That's the problem, we don't. All the tools on top of GPT that I tried have the same problem.
That's a big part of the reason I built the open source tool Semantra[1]. I only really trust the semantic search part of this problem as it is constrained by context, so I tried to make the user experience good on top of simpler semantic search without any chatbot capabilities.
that seems a really good tool, will try it later. thanks for sharing, I hope you get the deserved traction with it!
The same you get with ChatGPT - no guarantees. It's BYOB (Bring Your Own Brain), so don't trust it more than you would anything else you read on the Internet.
I get that people have an objection to how LLMs invent things, but so do people. You should always verify things.
What sets this apart from https://www.chatpdf.com/?
Looks like Scholar Turbo has GPT-4 and Chatpdf is still on the waitlist. But yeah, these chat-with-your-doc webapps are popping up out of the woodwork.
For fun, I whipped a quick one using our platform at https://trypromptly.com/app/65573e71-c05b-4567-87e8-49688c82.... (Initial chunking of PDF takes a bit of time because of rate limits with my account)
Also https://pdf.ai
Woah, really like the UI/UX of this site. Anyone know what UI framework they could be using?
Looks like Tailwind, judging from the stylesheet code.
My partner and I recently launched something similar we are calling JiggyBase:
The core is open source (https://github.com/jiggy-ai)
How far do we think this product is from something that uses LangChain to parse PDFs, create and store chunks/embeddings, and feed them to ChatGPT (or a local model)? Do we think there's any secret sauce?
I can say as someone who made something similar (Docalysis.com) that there's a lot of "secret sauce" in actually getting the outputs to be higher quality, especially with processing larger files. Just today I got another email from a user saying the quality was better than other AI products they tried.
I also don't use Langchain.
This is probably it +/- implementation details
I'm not using anything like this for personal or corporate information but this is real handy for laws or government documents. I uploaded the Durham Report, a 300 page PDF, and have been asking questions.
This is similar to WordPress websites or Android apps. The iPhone App Store had a high toll bar, $99/year.
There will be a lot of these quick-to-build, interesting utilities to show off. There is room to feed the demand, as with WordPress sites and AdWords-based websites.
I think OpenAI will keep the platform cheap long enough to collect the small toll and make the innovation go wild.
There is perhaps room for one more OpenAI-style provider to feed this demand, like what Android did to iOS. It is a pity if it's not Google.
True. I can't believe how simple OpenAI made their API. They really made it trivial to start using it. Comparing that to Google's, words fail me to even describe the frustration with GCP just to start testing its text-bison.
Cool product! seamless to try out and works quite well. But I am not able to ask questions outside the papers but relevant to it. "I'm sorry, but the given context does not provide information on the advantages of using quaternions over Euler angles in general. My programming is limited to answering questions related to the given context. Is there anything else I can help you with?"
Isn't that kind of the point? I would greatly prefer it stays within the confines of the supplied documents.
If you are a ChatGPT Plus user you can also use ChatGPT 4 with the AskYourPDF plugin. The advantage is that you can just feed the PDF through your own URL and don't have to upload it to a third party.
I mean I'm assuming the plugin would still download the PDF from your url to process it and create embeddings, right? I don't see how it's any different from uploading it.
This demo (similar to ExplainPaper and pdf.ai) seems to be offered by a one-person company in the South Western German city of Überlingen (near Freiburg).
Can’t you drop the full text into Claude?
We are doing a Hackathon currently with the exact same idea. What are the chances?
High. Everybody, myself included, had this idea. There have also been at least two posts in the past about similar products. I stopped testing these out as they all "cheat", or at least in my opinion they cheat, by creating embeddings and doing a vector search before hitting up GPT with the context/answer.
Why do you consider this cheating? The context window is currently too small for it to work in any other manner.
You could ask Claude+ 100k to give you the text context(s) containing anything pertinent then have GPT4 prompted with that.
With an emphasis on the word “could”! Very few people have access to Anthropic’s services at this point!
Embeddings still seem like a more economical approach.
Very high, there are a lot of these. It's not too far from a hello world project.
when will I get over hello world? Gosh
Specialize
Having to upload pdf files is a non-starter for any business user.
It's pretty disgusting to see how many of these exist, yet so few are local-only with local LMs. It's exceedingly obvious that local LMs are desired and far better for this task, and only idiots would be interested in paying for software like this that uses OpenAI.
A very healthy portion of paid software fits in the category of "things you could do on your own machine, but would rather pay someone to run on their machine with a simplified UI".
This doesn't make the users idiots, it just means they value their time.