How are "LLM hallucinations" different from a low-quality training dataset or randomly picked tokens due to overly random sampling settings?
This is great. We need more research into solving this fundamental problem, yet AI companies prefer to chase benchmarks and pump out value-added products.
The RAG-based mitigation is interesting, but quite limited, as mentioned. It only works if the user can provide ground truth data, which is relatively straightforward for code generation but much more difficult for most other factual information. We can't directly rely on data from the web, since the sources need to be carefully reviewed first, which is a labor-intensive task that requires human domain experts.
So this approach seems like a band-aid, and wouldn't be generally applicable. I'm not in the AI industry, but from the perspective of a user it seems that the hallucination problem requires a much more foundational solution.
I think there's room for more agentic systems that combine RAG, MCP, and traditional static analysis tools.
For instance, RAG could supply coding standards and best practices, such as sanitizing user inputs used in file system lookups. MCP could integrate up-to-date, authoritative (official) docs. Static analysis tools could run against the generated code and feed any errors back into the LLM for correction (rough sketch below).
It seems a lot of tools rely on raw LLM queries and expect the IDE or other tools to take over instead of providing a consolidated experience.
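Not something the tools above give you out of the box, but here is a minimal sketch of the "static tools feed errors back" part of that loop; `llm_generate` is a placeholder for whatever model call you use, and ruff is just one example of an analyzer:

```python
import subprocess
import tempfile

# Hypothetical stand-in for whatever LLM client is in use (not a real API).
def llm_generate(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def generate_with_static_feedback(task: str, max_rounds: int = 3) -> str:
    """Ask the LLM for code, run a static analyzer over it, and feed any
    findings back into the next prompt until the analyzer is satisfied."""
    prompt = task
    code = ""
    for _ in range(max_rounds):
        code = llm_generate(prompt)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        # 'ruff check' is one example of a static analysis step; any linter or
        # type checker that reports findings and a nonzero exit code works here.
        result = subprocess.run(["ruff", "check", path], capture_output=True, text=True)
        if result.returncode == 0:
            break  # no findings: accept the code
        prompt = (
            f"{task}\n\nA previous attempt produced these static analysis "
            f"findings; please fix them:\n{result.stdout}\n\nPrevious code:\n{code}"
        )
    return code
```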
My favorite is trying to use it to generate an IAM policy: keys are just hallucinated based on expectations of what the keys would be called and are either wrong or they flat out don't exist if you are dealing with more advanced conditions.
> keys are just hallucinated based on expectations of what the keys would be called and are either wrong or they flat out don't exist
To be fair this is also how I generally try to write IAM configs without AI.
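One crude mitigation is to diff the generated policy's condition keys against an allowlist taken from the official service authorization docs. A toy sketch of that check (the key list is heavily abbreviated, and `aws:RequestRegion` is deliberately a plausible-looking key that doesn't exist; the real one is `aws:RequestedRegion`):

```python
import json

# Tiny allowlist of real global condition keys; in practice you'd pull the
# full list from the AWS service authorization reference docs.
KNOWN_CONDITION_KEYS = {
    "aws:SourceIp",
    "aws:RequestedRegion",
    "aws:PrincipalOrgID",
    "aws:SecureTransport",
    "aws:MultiFactorAuthPresent",
}

def unknown_condition_keys(policy_json: str) -> list[str]:
    """Return condition keys in a generated policy that aren't in the
    allowlist, i.e. the ones most likely to have been hallucinated."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    unknown = []
    for stmt in statements:
        for operator_block in stmt.get("Condition", {}).values():
            for key in operator_block:
                if key not in KNOWN_CONDITION_KEYS:
                    unknown.append(key)
    return unknown

# Example: a generated deny-outside-region policy with a near-miss key.
generated = """{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Action": "*",
    "Resource": "*",
    "Condition": {"StringNotEquals": {"aws:RequestRegion": "eu-west-1"}}
  }]
}"""
print(unknown_condition_keys(generated))  # ['aws:RequestRegion']
```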
I thought this was a very good read about many of the issues that come up without any ground truth to reason against. It's interesting how many different ways people have developed to work around missing information, and the marginal improvements they make on some benchmarks.
On the other hand, I think it's cute when LLMs hallucinate a Python library that doesn't exist, because it probably means it's worth bringing into existence.
I suspect hallucinations in LLMs are the result of contradictions in their training sets that get trained into the model.
I suspect it's just like with humans. People who learn quickly and don't carefully curate their knowledge to resolve contradictions as they learn tend to make similar mistakes on subjects they didn't invest much time in studying thoroughly.
If I were an AI researcher, what I would try to do is find the highest-quality information possible on a very small set of axiomatic topics, with as few contradictions as possible, then train it into the LLM until it can generate text and basic reasoning that are fully accurate. Then, once we have this basic but fully rational AI, start feeding it new data, but before giving it any piece of data to learn from, first ask the AI whether the new data contradicts any of its current knowledge. Only let it update its weights with the new data as-is if it does not contradict its existing knowledge. If it does contradict its existing knowledge, either discard it, or feed it the data with some synthetic preamble like "Some people believe that..." so that it's aware of the existence of this belief system but knows that it's not to be internalized as its own beliefs.
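Something like this, as a rough sketch of the filtering step; `model_says_contradiction` is a placeholder for an actual query against the current model (e.g. a yes/no prompt), not a real training API:

```python
def model_says_contradiction(model, text: str) -> bool:
    # Placeholder: prompt the current model to judge whether `text`
    # contradicts its existing knowledge and parse a yes/no answer.
    raise NotImplementedError

def prepare_training_batch(model, new_examples: list[str]) -> list[str]:
    """Filter new training data through the model's own consistency check."""
    batch = []
    for text in new_examples:
        if not model_says_contradiction(model, text):
            batch.append(text)  # consistent: learn it as-is
        else:
            # contradictory: attribute the claim so the model learns that the
            # belief exists without adopting it (or drop it entirely)
            batch.append("Some people believe that: " + text)
    return batch
```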
Or maybe there's a way to detect contradictions by looking at the weights themselves: roll back a round of training if the weights update in a way that suggests a conflicting piece of information was learned in that round. Maybe a separate ANN that watches the LLM's weights during training could be trained to detect contradictions and decide when to roll back a round of training.
> I suspect hallucinations in LLMs are the result of contradictions in their training sets that get trained into the model.
A simpler explanation, and I posit a correct one, is that people anthropomorphize an algorithm: the result of a particular path through a statistical token-generating model gets called a "hallucination" because it's unexpected by the person interpreting the text.
> I suspect it's just like with humans.
Therein lies the problem.
Yes. "Hallucinations" are not an edge case or an error mode, but part of the normal operation of the LLM.
I still don't think hallucinations in generated code matter very much. They show up the moment you try to run the code, and with the current batch of "coding agent" systems it's the LLM itself that spots the error when it attempts to run the code.
I was surprised that this paper talked more about RAG solutions than tool-use based solutions. Those seem to me like a proven solution at this point.
I'm surprised to read that from a prominent figure in the industry such as yourself.
The problem is that many hallucinations do not produce a runtime error, and can be very difficult to spot by a human, even if the code is thoroughly reviewed, which in many cases doesn't happen. These can introduce security issues, do completely different things from what the user asked (or didn't ask), do things inefficiently, ignore conventions and language idioms, or just be dead code.
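A concrete example of the "runs fine, looks fine, still wrong" category: the directory and filenames below are made up, but this replace-based "sanitization" pattern never raises an error and still ships a vulnerability:

```python
import os

def read_user_file(filename: str) -> str:
    # Looks like path sanitization, runs without error, passes a casual review...
    safe_name = filename.replace("../", "")
    with open(os.path.join("/srv/app/uploads", safe_name)) as f:
        return f.read()

# ...but "....//....//....//etc/passwd" collapses to "../../../etc/passwd"
# after the replace, so the traversal still escapes the uploads directory.
# Nothing ever raises; the bug only shows up as a security hole.
```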
For runtime errors, feeding them back to the LLM, as you say, might fix it. But even in those cases, the produced "fix" can often contain more hallucinations. I don't use agents, but I've often experienced the loop of pasting the error back to the LLM, only to get a confident yet non-working response using hallucinated APIs.
So this problem is not something external tools can solve, and requires a much deeper solution. RAG might be a good initial attempt, but I suspect an architectural solution will be needed to address the root cause. This is important because hallucination is a general problem, and doesn't affect just code generation.
> The problem is that many hallucinations do not produce a runtime error [...]
Don't hallucinations mean nonexistent things, that is, in the case of code: functions, classes, etc.? How could they fail to lead to a runtime error, then? The fact that LLMs can produce unreliable or inefficient code is a different problem, isn't it?
This argument is the reason why LLM output failing to match reality was labelled 'hallucination'. It makes it seem like the LLM only makes mistakes in a neatly verifiable manner.
The 'jpeg of the internet' argument was more apt, I think. The output of an LLM might be congruent with reality, and with how the prompt contents represent reality. But it might also not be, and in subtle ways too.
If only all code that has any flaw in it would not run. That would be truly amazing. Alas, there are several orders of magnitude more sequences of commands that can be run than should be run.
Hopefully this will be a catalyst to bring language-integrated, machine-readable schema checking into wider use (à la Clojure), as static typing is crap for structured data/API stuff.
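Not tied to any particular library, just a toy illustration of what runtime, machine-readable schema checking buys you at an API boundary, including flagging fields that were simply invented:

```python
# Declarative schema for an API payload; a real project would likely use an
# existing validation library rather than this hand-rolled check.
USER_SCHEMA = {
    "id": int,
    "email": str,
    "roles": list,
}

def validate(data: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key, expected_type in schema.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            problems.append(
                f"{key}: expected {expected_type.__name__}, "
                f"got {type(data[key]).__name__}"
            )
    for key in data:
        if key not in schema:
            problems.append(f"unexpected key: {key}")  # e.g. a hallucinated field
    return problems

print(validate({"id": "42", "email": "a@b.c", "role": []}, USER_SCHEMA))
# ['id: expected int, got str', 'missing key: roles', 'unexpected key: role']
```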