Perceiver AR: general-purpose, long-context autoregressive generation

      We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. Perceiver AR can directly attend to over a hundred thousand tokens, enabling practical long-context density estimation without the need for hand-crafted sparsity patterns or memory mechanisms. Read More DeepMind Blog 

​  


Posted

in

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *