HelloAI glossary
Multi-head attention
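Running several attention operations, called heads, in parallel on the same input. Each head has its own learned query, key, and value projections, so it can act as a different filter and focus on a different part or aspect of the input. The heads' outputs are concatenated and passed through a final linear layer that combines them.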
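As a rough illustration of the definition, here is a minimal sketch in plain Python. It assumes scaled dot-product attention inside each head, uses random weights in place of learned ones, and works on nested lists rather than a tensor library. All names (`attention_head`, `multi_head_attention`, `w_o`, the sizes) are illustrative assumptions, not taken from any particular framework.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(x, w_q, w_k, w_v):
    """One head: scaled dot-product attention with its own projections."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    # scores[i][j]: how strongly position i attends to position j
    scores = matmul(q, [list(col) for col in zip(*k)])
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def multi_head_attention(x, heads, w_o):
    """Run every head on the same input, concatenate, then mix linearly."""
    outputs = [attention_head(x, *h)[0] for h in heads]
    # Concatenate the per-head outputs along the feature dimension.
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)

def random_matrix(rows, cols, rng):
    """Stand-in for learned weights (assumption: uniform values in [-1, 1])."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
seq_len, d_model, num_heads = 4, 8, 2
d_head = d_model // num_heads

x = random_matrix(seq_len, d_model, rng)
# Each head gets its own (w_q, w_k, w_v) triple, so heads can attend differently.
heads = [tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
         for _ in range(num_heads)]
w_o = random_matrix(num_heads * d_head, d_model, rng)

y = multi_head_attention(x, heads, w_o)
print(len(y), len(y[0]))  # prints "4 8": one d_model-sized vector per position
```

Each head works in a smaller space (`d_head = d_model / num_heads` here), so the concatenated result has the same width as the input before the final linear layer mixes the heads together.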