Phil 1.20.20

Transformers from Scratch

  • Transformers are a very exciting family of machine learning architectures. Many good tutorials exist, but in the last few years transformers have mostly become simpler, so that it is now much more straightforward to explain how modern architectures work. This post is an attempt to explain directly how modern transformers work, and why, without some of the historical baggage.
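  • For quick reference, the scaled dot-product self-attention at the core of these architectures, written in the standard notation rather than the post's own:
    % scaled dot-product self-attention; d_k is the key dimension
    \[
      \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
    \]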

Dissertation

  • Folding in Wayne’s edits
    • Made the Arendt paragraph in the velocity section less reflective and more objective.
    • TODO: Defend facts to opinion with examples of language, framing, what is interesting, etc. – done
    • TODO: Heavy thoughts, light and frivolous, etc. We ascribe these, but they are not there – done
    • TODO: We have a MASSIVE physical bias. Computers don’t. Done
    • TODO: Computers and people must work together
  • Title case all refs (Section, Table, etc.) – done
  • \texttt all URLs (reddit, etc.) – done
  • Search for and/or slashes
  • Fix underlines as per the snippet below – done!
    % for better underlining
    \usepackage[outline]{contour}
    \usepackage{ulem}
    \normalem % use classical emph

    \newcommand \myul[4]{%
    	\begingroup%
    	\renewcommand \ULdepth {#1}% rule depth below the baseline
    	\renewcommand \ULthickness {#2}% rule thickness
    	\contourlength{#3}% width of the white outline drawn around the text
    	\uline{\phantom{#4}}\llap{\contour{white}{#4}}% rule under a phantom copy, contoured text laid on top
    	\endgroup%
    }
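
    A sample call, with placeholder lengths (depth, rule thickness, contour width) rather than values from the note:
    % placeholder values; adjust to taste
    \myul{1.5pt}{0.5pt}{1pt}{underlining with descenders, e.g. typography}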

     
