maxOS Braindump

Transformer Model

About

A neural network architecture built around attention mechanisms rather than recurrence or convolution.

Multi-Head Attention
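
As a rough illustration of what multi-head attention computes, here is a minimal pure-Python sketch of self-attention. It is not taken from this note: the function names, the list-of-rows matrix layout, and the identity weights in the usage lines are all assumptions made for the example. It also leaves out masking, biases, dropout, and batching.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(xs):
    """Numerically stable softmax over a list of numbers."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for a single head."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x (seq_len x d_model), split across num_heads heads.

    w_q, w_k, w_v, w_o are d_model x d_model projection matrices
    (hypothetical parameters; a trained model would learn them).
    """
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    # Project the input once, then slice the columns into per-head chunks.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    heads = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        heads.append(attention([r[lo:hi] for r in q],
                               [r[lo:hi] for r in k],
                               [r[lo:hi] for r in v]))

    # Concatenate the heads per token, then apply the output projection.
    concat = [[val for head in heads for val in head[i]] for i in range(len(x))]
    return matmul(concat, w_o)

def identity(n):
    """Identity matrix, used here only as placeholder weights."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Usage: two tokens, model width 4, two heads, identity weights (assumption).
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
eye = identity(4)
out = multi_head_attention(x, eye, eye, eye, eye, num_heads=2)
```

Each head attends over its own slice of the projected vectors, so different heads can weight the same tokens differently before the results are concatenated.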

Links to this note

  • Artificial Intelligence Glossary: Neural Networks and Other Terms Explained
  • Attention Is All You Need
  • What Are Transformer Models and How Do They Work?
  • The Evolution of the Machine Mind / Understanding the AI-Driven Software 2.0 Intelligence Revolution