    AI News First
    Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark

    April 12, 2025

    Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on a crowdsourced benchmark, LM Arena. The incident prompted the maintainers of LM Arena to apologize, change their policies, and score the unmodified, vanilla Maverick.

    Turns out, it’s not very competitive.

    The unmodified Maverick, “Llama-4-Maverick-17B-128E-Instruct,” was ranked below models including OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro as of Friday. Many of these models are months old.

    The release version of Llama 4 has been added to LMArena after it was found out they cheated, but you probably didn’t see it because you have to scroll down to 32nd place which is where is ranks pic.twitter.com/A0Bxkdx4LX

    — ρ:ɡeσn (@pigeon__s) April 11, 2025

    Why the poor performance? Meta’s experimental Maverick, Llama-4-Maverick-03-26-Experimental, was “optimized for conversationality,” the company explained in a chart published last Saturday. Those optimizations evidently played well to LM Arena, which has human raters compare the outputs of models and choose which they prefer.

    As we’ve written about before, for various reasons, LM Arena has never been the most reliable measure of an AI model’s performance. Still, tailoring a model to a benchmark — besides being misleading — makes it challenging for developers to predict exactly how well the model will perform in different contexts.

    In a statement, a Meta spokesperson told TechCrunch that Meta experiments with “all types of custom variants.”

    “‘Llama-4-Maverick-03-26-Experimental’ is a chat optimized version we experimented with that also performs well on LM Arena,” the spokesperson said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We’re excited to see what they will build and look forward to their ongoing feedback.”


