Ronen Tamari @rtk254
Interesting paper - "by introducing position tokens (e.g., "3 10e1 2"), transformers learn to accurately add and subtract numbers up to 60 digits" Another angle - exposing model to "meaning behind the symbols" does wonders for performance. How to do this in NLU as well? https://t.co/eS7l9cRbzB
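The quoted example suggests an encoding where each digit is followed by an explicit power-of-ten marker. Below is a minimal sketch of that idea in Python, using a hypothetical helper `to_position_tokens` and following the tweet's convention of leaving the units digit unmarked; the paper's actual tokenization may differ (e.g., it may also tag the units digit with "10e0").

```python
def to_position_tokens(n: int) -> str:
    """Encode an integer with explicit position tokens, e.g. 32 -> "3 10e1 2".

    Hypothetical helper illustrating the idea in the tweet; not the paper's
    exact tokenization scheme.
    """
    digits = str(abs(n))
    tokens = []
    for i, d in enumerate(digits):
        power = len(digits) - 1 - i   # place value of this digit
        tokens.append(d)
        if power > 0:                 # mirror the tweet's example: no marker on the units digit
            tokens.append(f"10e{power}")
    if n < 0:
        tokens.insert(0, "-")
    return " ".join(tokens)


print(to_position_tokens(32))   # "3 10e1 2"
print(to_position_tokens(607))  # "6 10e2 0 10e1 7"
```

The point of the encoding is that positional information the model would otherwise have to infer from token order is spelled out in the surface form, which is the "meaning behind the symbols" angle the tweet highlights.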