HelloAILearn
Sign in →
HelloAI glossary

Masked attention

Attention in the decoder where each word may only score against the words before it; later words get a score that becomes zero after softmax. This forces the model to predict each token using only earlier ones, never future ones, which is what makes left-to-right text generation work.

Go beyond the definition

Terms like this come up in real clinical scenarios across the HelloAI courses: bite-sized modules with verifiable certificates. An account takes one minute, no password needed.

Sign in →
See all terms →