Qwen releases Qwen2.5-VL-32B multimodal model, outperforming its larger 72B predecessor

PANews reported on March 25 that, according to an announcement from the Qwen team, the Qwen2.5-VL-32B-Instruct model has been officially open sourced. At a 32B parameter scale, it delivers strong performance on tasks such as image understanding, mathematical reasoning, and text generation. The model was further optimized through reinforcement learning so that its responses better match human preferences, and it surpasses the previously released 72B model on multimodal benchmarks such as MMMU and MathVista. Compared with earlier Qwen2.5-VL series models, the 32B model brings the following improvements:

- Responses better aligned with human preferences: the output style has been adjusted so answers are more detailed, more consistently formatted, and closer to human subjective preferences.
- Mathematical reasoning: accuracy on complex mathematical problems is significantly improved.
- Fine-grained image understanding and reasoning: stronger accuracy and finer-grained analysis in tasks such as image parsing, content recognition, and visual logic deduction.

Author: PA一线 (PANews frontline desk)

This content is for informational purposes only and does not constitute investment advice.
