OpenAI launches "Safety Evaluations Hub" to regularly publish model safety performance data

PANews reported on May 15 that, according to an official OpenAI announcement, OpenAI has launched the "Safety Evaluations Hub" to improve transparency around model safety. The hub will continuously publish its models' safety performance results in areas such as harmful content, jailbreak attacks, hallucination, and instruction hierarchy. Unlike system cards, which disclose data only once when a model is released, the hub will be updated periodically as models are updated and will support side-by-side comparisons across models, with the aim of deepening the community's understanding of AI safety and regulatory transparency. Currently, GPT-4.5 and GPT-4o perform best on jailbreak resistance and factual accuracy.


Author: PA一线

This content is for informational purposes only and does not constitute investment advice.
