Multimodal AI is a type of artificial intelligence that can understand and process several types of information, such as text, images, audio, and video, at the same time. Multimodal generative AI models produce outputs based on these combined inputs.
Source: McKinsey
What is multimodal AI?