Researchers at Apple have come out with a new paper showing that large language models can’t reason — they’re just pattern-matching machines. [arXiv, PDF] This shouldn’t be news to anyone here. We …
Please enlighten me on how? I admit I don’t know all the internals of the transformer model, but from what I know it encodes precisely only syntactical information, i.e. what next syntactical token is most likely to follow based on a syntactical context window.
How does it encode semantics? What is the semantics that it encodes? I doubt they have denatotational or operational semantics of natural language, I don’t think something like that even exists, so it has to be some smaller model. Actually, it would be enlightening if you could tell me at least what the semantical domain here is, because I don’t think there’s any naturally obvious choice for that.
Please enlighten me on how? I admit I don’t know all the internals of the transformer model, but from what I know it encodes precisely only syntactical information, i.e. what next syntactical token is most likely to follow based on a syntactical context window.
How does it encode semantics? What is the semantics that it encodes? I doubt they have denatotational or operational semantics of natural language, I don’t think something like that even exists, so it has to be some smaller model. Actually, it would be enlightening if you could tell me at least what the semantical domain here is, because I don’t think there’s any naturally obvious choice for that.