HelloAI glossary

Multi-head attention

Running several attention operations (heads) in parallel, each with its own learned query, key, and value projections, so each head can form a different filter and focus on a different part of the input. The heads' outputs are concatenated and passed through a linear layer to combine them.
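
The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes scaled dot-product attention inside each head, uses random weights in place of learned ones, and the helper names (`matmul`, `attention_head`, `multi_head_attention`, `rand_matrix`) and the sizes are made up for the example.

```python
import math
import random

def matmul(a, b):
    # Multiply an (n x m) matrix by an (m x p) matrix; matrices are lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(q, k, v):
    # One head: scaled dot-product attention (an assumption of this sketch).
    d_head = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
    return matmul(weights, v), weights

def multi_head_attention(x, heads, w_o):
    # heads: one (w_q, w_k, w_v) triple per head, each of shape d_model x d_head.
    outputs, all_weights = [], []
    for w_q, w_k, w_v in heads:
        out, weights = attention_head(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
        outputs.append(out)
        all_weights.append(weights)
    # Concatenate the heads' outputs token by token, then mix them with w_o.
    concat = [sum((out[t] for out in outputs), []) for t in range(len(x))]
    return matmul(concat, w_o), all_weights

def rand_matrix(rows, cols, rng):
    # Random stand-in for learned weights.
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
seq_len, d_model, num_heads = 4, 8, 2   # hypothetical sizes
d_head = d_model // num_heads
x = rand_matrix(seq_len, d_model, rng)
heads = [tuple(rand_matrix(d_model, d_head, rng) for _ in range(3))
         for _ in range(num_heads)]
w_o = rand_matrix(num_heads * d_head, d_model, rng)

output, weights = multi_head_attention(x, heads, w_o)
print(len(output), len(output[0]))  # one output vector of size d_model per token
```

Each head gets its own projections, so `weights[0]` and `weights[1]` are different attention patterns over the same four tokens; the final `w_o` multiplication is the linear layer that combines the concatenated head outputs.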
