
Yes, but on this n-gram vs. transformers point: if you consider a more general paradigm, the self-attention mechanism is basically a special form of a graph neural network [1] (quick sketch below).

[1] Bridging Graph Neural Networks and Large Language Models: A Survey and Unified Perspective https://infoscience.epfl.ch/server/api/core/bitstreams/7e6f8...
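For intuition, here is a minimal sketch (my own, not taken from the survey): single-head scaled dot-product attention with an adjacency mask. With a complete graph (all-ones adjacency) it reduces to ordinary self-attention; with a sparse adjacency it becomes attention-weighted message passing over the graph's edges.

    # Hypothetical illustration: self-attention as attention over graph edges.
    import torch
    import torch.nn.functional as F

    def masked_self_attention(x, adj, wq, wk, wv):
        # x:   (n, d)  node/token features
        # adj: (n, n)  adjacency matrix, nonzero where a message is allowed
        q, k, v = x @ wq, x @ wk, x @ wv
        scores = (q @ k.T) / k.shape[-1] ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))  # drop non-edges
        return F.softmax(scores, dim=-1) @ v                  # aggregate over neighbors

    n, d = 5, 8
    x = torch.randn(n, d)
    wq, wk, wv = (torch.randn(d, d) for _ in range(3))

    full = torch.ones(n, n)                                    # complete graph -> standard self-attention
    ring = torch.eye(n) + torch.roll(torch.eye(n), 1, dims=0)  # sparse graph -> GNN-style message passing
    print(masked_self_attention(x, full, wq, wk, wv).shape)    # torch.Size([5, 8])
    print(masked_self_attention(x, ring, wq, wk, wv).shape)    # torch.Size([5, 8])

The only difference between the two calls is which edges are allowed to carry messages; the transformer case is just the fully connected graph.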



