GPT-4/5 in capability and safety
Capability: How good are open-source models compared to GPT-4/5?
They’re already there — or nearly so — in many ways.
Over the past two years, open-source models have advanced rapidly. Meta’s Llama 3, Mistral’s Mixtral, Cohere’s Command R+, and Microsoft’s Phi-3 have shown that smaller or open-weight models can match or come very close to GPT-4 on several benchmarks, especially in areas such as reasoning, retrieval-augmented generation (RAG), and coding.
Models are becoming:
The open-source community is rapidly closing the gap, building on research published (or leaked) by big labs. The gap between open and closed models used to be 2–3 years; now it is down to perhaps 6–12 months, and on some tasks it is nearly even.
However, when it comes to truly frontier models — like GPT-4, GPT-4o, Gemini 1.5, or Claude 3.5 — there’s still a noticeable lead in:
So yes, open source is closing in, but there is still an infrastructure and quality gap at the top. The difference lies not only in model weights but also in tooling, infrastructure, evaluation, and guardrails.
Safety: Are open models as safe as closed models?
That is a much harder question.
Open-source models are transparent: you know what you’re dealing with, you can audit the weights, and, in theory, you can inspect the training data. That’s a major safety and trust benefit.
But there’s a downside:
Private labs like OpenAI, Anthropic, and Google build in:
And centralized control, which, for better or worse, allows them to enforce safety policies and ban bad actors.
This centralization can feel like “gatekeeping,” but it’s also what enables strong guardrails — which are harder to maintain in the open-source world without central infrastructure.
That said, there are a few open-source projects at the forefront of community-driven safety tools, including:
So while open-source safety is behind the curve, it is improving fast, and more cooperatively.
The Bigger Picture: Why this question matters
Fundamentally, this question is about who gets to determine the future of AI.
The most promising future likely exists in hybrid solutions:
TL;DR — Final Thoughts