Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

Publication
arXiv:2106.02584