viztales

Dimension reduction, State, Orientation, and Speed

Phil 6.1.21

June!

This looks quite interesting:

An Attention Free Transformer

We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention. In an AFT layer, the key and value are first combined with a set of learned position biases, the result of which is multiplied with the query in an element-wise fashion. This new operation has a memory complexity linear w.r.t. both the context size and the dimension of features, making it compatible to both large input and model sizes. We also introduce AFT-local and AFT-conv, two model variants that take advantage of the idea of locality and spatial weight sharing while maintaining global connectivity. We conduct extensive experiments on two autoregressive modeling tasks (CIFAR10 and Enwik8) as well as an image recognition task (ImageNet-1K classification). We show that AFT demonstrates competitive performance on all the benchmarks, while providing excellent efficiency at the same time.
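To make the abstract's description concrete, here is a minimal sketch of the layer as I understand it from the paper's AFT-full variant: the keys plus learned pairwise position biases are exponentiated and used to take a weighted average of the values, and the result is gated element-wise by a sigmoid of the query. The function name `aft_full`, the plain-list representation, and the non-causal form are my own assumptions for illustration, not the authors' code. It is written for readability, not for the efficiency the paper claims.

```python
import math


def aft_full(q, k, v, w):
    """Sketch of an AFT-full style layer (assumed form, non-causal).

    q, k, v: T x d nested lists of floats (query, key, value).
    w:       T x T nested list of learned pairwise position biases.
    Returns a T x d nested list.
    """
    T, d = len(q), len(q[0])
    out = []
    for t in range(T):
        row = []
        for j in range(d):
            # Combine keys with the position biases for target position t.
            logits = [k[s][j] + w[t][s] for s in range(T)]
            # Subtract the max before exponentiating for numerical stability;
            # the shift cancels in the ratio below.
            m = max(logits)
            weights = [math.exp(x - m) for x in logits]
            # Weighted average of the values, per feature dimension.
            num = sum(wt * v[s][j] for s, wt in enumerate(weights))
            den = sum(weights)
            # Element-wise gate from the query (no dot product with keys).
            gate = 1.0 / (1.0 + math.exp(-q[t][j]))
            row.append(gate * num / den)
        out.append(row)
    return out
```

Note that no query-key dot product appears anywhere: each feature dimension is handled independently, which is the "attention free" part.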

SBIR

More writing. It turns out that the conference I was aiming for had a required early submission for US authors, which I missed. Sigh.
Wrote up a description of cloud computing for big science for Eric H
Worked on 2 proposal overviews for Orest

Book

Working on conspiracy article/chapter

GPT-Agents

Still running the statement that I put together Saturday

3:00 – ICWSM rehearsal – lots of good comments, which means lots of revisions. Another walkthrough this Friday at 3:30


This entry was posted in Phil on June 1, 2021 by pgfeldman.
