HelloAI glossary
Masked attention
Attention in the decoder where each word may only score against the words before it; later words get a score that becomes zero after softmax. This forces the model to predict each token using only earlier ones, never future ones, which is what makes left-to-right text generation work.
Go beyond the definition
Terms like this come up in real clinical scenarios across the HelloAI courses: bite-sized modules with verifiable certificates. An account takes one minute, no password needed.
Sign in →