HelloAI glossary

Vision transformer

A transformer applied to images rather than text, used as an alternative to convolutional neural networks (CNNs). It splits an image into fixed-size patches, treats each patch as a token, and uses self-attention to relate every patch to every other patch. Like text transformers, it scales well to large datasets and learns how image regions relate from the data itself, with fewer built-in assumptions than the local filters of a CNN. Vision transformers are increasingly common in radiology and pathology AI.
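The two steps in the definition, splitting into patches and relating them with attention, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how any particular model is implemented: the function names `split_into_patches` and `attention_weights` are hypothetical, the image is a toy 4x4 grayscale grid, and the learned projection matrices that a real vision transformer applies to each patch are left out.

```python
import math


def split_into_patches(image, patch_size):
    """Split a 2D grayscale image (list of rows) into flattened square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    patches = []
    # Walk the image in row-major order, one patch-sized tile at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def attention_weights(tokens):
    """Scaled dot-product attention weights between every pair of tokens.

    Simplification (assumption): the tokens are used directly as queries and
    keys. A real model first multiplies them by learned weight matrices.
    """
    dim = len(tokens[0])
    weights = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        # Softmax, shifted by the maximum score for numerical stability.
        top_score = max(scores)
        exps = [math.exp(score - top_score) for score in scores]
        total = sum(exps)
        weights.append([value / total for value in exps])
    return weights


# Toy 4x4 "image" whose pixel values are 0..15, split into 2x2 patches.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, 2)
print(len(patches), patches[0])  # 4 [0, 1, 4, 5]

# One row per patch; each row says how strongly that patch attends to the others.
weights = attention_weights(patches)
```

Each row of `weights` sums to 1, so it can be read as how one patch distributes its attention across all patches, including distant ones. That global view from the first layer is the main contrast with a CNN, whose filters only see a small neighbourhood at a time.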
