Meet MiniGPT-4: The Open Source Vision-Language Model that Matches the Performance of GPT-4
The model extends Vicuna with BLIP-2-style vision capabilities in one of the most interesting open source releases in the multi-modality space.
MiniGPT-4 has been one of the coolest releases in the space of multi-modal foundation models in recent days. Created by a group of researchers from King Abdullah University of Science and Technology, MiniGPT-4 combines Vicuna with the visual components of BLIP-2 to produce one of the first open source multi-modal foundation models ever released. Perhaps surprisingly, the model shows performance comparable to GPT-4 across different tasks.
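To make the composition concrete: MiniGPT-4 keeps both the BLIP-2-style visual encoder and the Vicuna language model frozen, and trains only a single linear projection layer that maps visual features into Vicuna's embedding space. The PyTorch sketch below illustrates that wiring under stated assumptions; the class, module names, and dimensions are illustrative stand-ins, not the project's actual API.

```python
import torch
import torch.nn as nn


class MiniGPT4Sketch(nn.Module):
    """Conceptual sketch of the MiniGPT-4 design: a frozen BLIP-2-style
    visual encoder bridged to a frozen Vicuna LLM through a single
    trainable linear projection. All module internals are stand-ins."""

    def __init__(self, vision_encoder: nn.Module, llm: nn.Module,
                 vision_dim: int = 768, llm_dim: int = 4096):
        super().__init__()
        self.vision_encoder = vision_encoder  # stands in for frozen ViT + Q-Former (BLIP-2)
        self.llm = llm                        # stands in for frozen Vicuna
        # The only trainable component: projects visual features into
        # the LLM's token-embedding space.
        self.proj = nn.Linear(vision_dim, llm_dim)
        for p in self.vision_encoder.parameters():
            p.requires_grad = False
        for p in self.llm.parameters():
            p.requires_grad = False

    def forward(self, image_feats: torch.Tensor, text_embeds: torch.Tensor):
        # Encode the image, project into the LLM's space, and prepend
        # the resulting visual tokens to the text embeddings.
        visual_feats = self.vision_encoder(image_feats)    # (B, N, vision_dim)
        visual_tokens = self.proj(visual_feats)            # (B, N, llm_dim)
        inputs = torch.cat([visual_tokens, text_embeds], dim=1)
        return self.llm(inputs)


if __name__ == "__main__":
    B, N, T = 2, 32, 16
    vision = nn.Identity()  # placeholder for the frozen visual encoder
    llm = nn.Identity()     # placeholder for frozen Vicuna
    model = MiniGPT4Sketch(vision, llm, vision_dim=768, llm_dim=4096)
    image_feats = torch.randn(B, N, 768)    # pretend pre-extracted features
    text_embeds = torch.randn(B, T, 4096)
    out = model(image_feats, text_embeds)
    print(out.shape)  # torch.Size([2, 48, 4096])
```

Because the two large models stay frozen, only the projection layer's weights receive gradients, which is what makes this alignment step cheap enough to run as an open source project.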