How Qwen 2.5 Just Beat the Top AI Models — HubSpot SVP of Marketing Shares The Industry Impact
The AI model races are heating up. Right on the heels of DeepSeek-R1’s release, the industry is reeling from yet another powerful AI model hitting the market. I test drove the latest iteration of Alibaba’s Qwen models — Qwen 2.5.

In this post, I’ll break down what Qwen 2.5 is, how you can use it, and how it compares to OpenAI o1 and DeepSeek-R1. I’ll also explore what this means for the AI industry moving forward. Let’s dive in.
What Makes Qwen 2.5 Different
Qwen 2.5 was released as a surprise launch on January 29, 2025. Like its competitors, Qwen 2.5 offers natural language processing, versatile use cases, and integrations with multilingual support. It’s fast and trained on a massive amount of data. It can search the web, write text, and code.
Unlike models from OpenAI and Anthropic, Qwen 2.5 is open source, which opens up a realm of possibilities for companies and developers.
Beyond that, you can go to Qwen’s website and sign up to start using it today for free. Early testing suggests that Qwen 2.5 performs similarly to ChatGPT’s o1 and o3 models, which cost $200 per month. For a company or an individual looking to leverage complex reasoning and build a custom AI model, that’s significant savings.
Qwen 2.5 is also multimodal, meaning it can process and generate content based on both text and image inputs. This approach makes the tool incredibly versatile. With Qwen 2.5, I can:
- Generate images and videos.
- Create structured outputs for forms and invoices.
- Conduct spatial reasoning tasks.
- Convert images into formats like HTML, JSON, and more.
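To make the structured-output use case concrete, here is a minimal Python sketch. It assumes the model has been asked to return invoice fields as JSON. The Invoice fields, the parse_invoice helper, and the sample reply are hypothetical illustrations, not part of any Qwen API. The only point is that a model's reply should be checked before it goes into a form or a finance system.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical fields chosen for illustration only.
    vendor: str
    invoice_number: str
    total: float


def parse_invoice(model_reply: str) -> Invoice:
    """Parse a JSON reply from a model into an Invoice, failing loudly on gaps."""
    data = json.loads(model_reply)
    required = ("vendor", "invoice_number", "total")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Invoice(
        vendor=str(data["vendor"]),
        invoice_number=str(data["invoice_number"]),
        total=float(data["total"]),  # accepts numbers or numeric strings
    )


# Made-up reply standing in for what a model might return.
sample_reply = '{"vendor": "Acme Co", "invoice_number": "INV-001", "total": "129.50"}'
invoice = parse_invoice(sample_reply)
print(invoice)
```

In practice, the reply would come from whichever model or service you use. The validation step stays the same either way.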
How Qwen 2.5 Compares to Other AI Models
Take a look at this performance comparison of Qwen 2.5 versus the other leading models, including ChatGPT-4, Claude 3.5 Sonnet, DeepSeek-V3, and Llama-3.1.
Qwen outperforms all other models on Arena-Hard (complex problem-solving) and LiveBench (competence in real-world AI tasks). Other tests have found that the model performs better at mathematical reasoning and vision-language modeling, where it needs to process both image and text inputs.
Qwen performs on par with or better than paid models from comparable U.S. companies on a variety of tasks. Now, let’s dive into the use cases. Here’s what you can actually do with Qwen 2.5.
Four Ways to Use Qwen 2.5
1. Create images, videos, and text-based content.
First off, Qwen 2.5’s image and video creation rivals DALL-E and Sora. Here’s an AI-generated image someone created of a dog drinking a beer. It’s not perfect, but it’s a decent first take.
Then, for videos, check out this example from Shruti Mishra of a lifelike ride with huskies.
Whether you need to create images, videos, or text-based content like a blog post, Qwen 2.5 can give you a decent first draft to work with.
2. Create your own AI agents.
AI agents can reason through complex instructions and act on them without heavy oversight from you. With Qwen 2.5, you can create one of these agents to use on your computer, similar to Claude’s computer use feature or OpenAI’s Operator.
For example, you could build an agent to update your calendar, interpret structured data, or book flights for you online. I recommend starting small by building agents for personal use (think, your own assistant to manage chores or social engagements). From there, you can scale up to AI agents for your business.
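If you want a feel for what "starting small" looks like, here is a rough Python sketch of the core loop behind a personal agent: the model picks a tool, and your code runs it. Everything here is an assumption for illustration. The add_event tool, the TOOLS registry, and the JSON reply format are hypothetical, and the in-memory list stands in for a real calendar.

```python
import json
from typing import Callable, Dict, List

calendar: List[dict] = []  # in-memory stand-in for a real calendar


def add_event(title: str, date: str) -> str:
    """Hypothetical tool: record an event and confirm it."""
    calendar.append({"title": title, "date": date})
    return f"Added '{title}' on {date}"


# Registry of the only tools the agent is allowed to call.
TOOLS: Dict[str, Callable[..., str]] = {"add_event": add_event}


def run_action(model_reply: str) -> str:
    """Execute one tool call that the model described as JSON."""
    action = json.loads(model_reply)
    tool = TOOLS.get(action.get("tool"))
    if tool is None:
        return f"Unknown tool: {action.get('tool')}"
    return tool(**action.get("args", {}))


# Made-up reply standing in for a model's decision.
reply = '{"tool": "add_event", "args": {"title": "Dentist", "date": "2025-02-14"}}'
result = run_action(reply)
print(result)
```

Keeping the tool list explicit is a deliberate choice: the agent can only do what you have registered, which makes a personal assistant easier to trust before you scale it up for a business.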
3. Synthesize large datasets and multimodal formats.
Qwen 2.5 can handle document parsing, meaning that it can understand not only text but also tables, charts, and images. It’s particularly good at understanding long videos, which other models aren’t.
If I want to train the model on my proprietary data, I can feed it information in many formats. Qwen 2.5 also uses long-context training, which gives the model a specific history to draw from. The model can then identify patterns and contextualize your business data over time. That’s useful for the next step I want to complete — complex reasoning.
4. Perform complex reasoning.
Early testing shows that Qwen 2.5 outperforms its competitors in math and logical inference. That means it’s well set up to perform complex reasoning.
I wanted to test Qwen 2.5 with one of the advanced prompts that I use to do my work. So, I asked the model to build a dashboard for a SaaS brand.
First, I started with a reasoning prompt. Remember:
- For a non-reasoning prompt, I provide step-by-step instructions on what I want the model to do.
- With a reasoning model, step-by-step logic is built into the system. All I have to do is provide it with the problem I want to solve and the output I want. From there, AI should be able to figure out a path forward.
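The difference between the two prompt styles above can be sketched in a few lines of Python. The function names and the sample wording are my own hypothetical choices, not a template from any vendor. The first helper spells out the steps, and the second states only the problem and the desired output.

```python
from typing import List


def build_step_by_step_prompt(task: str, steps: List[str]) -> str:
    """Non-reasoning style: spell out each step the model should follow."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"Task: {task}\nFollow these steps in order:\n{numbered}"


def build_reasoning_prompt(problem: str, desired_output: str) -> str:
    """Reasoning style: state the problem and the output, not the path."""
    return f"Problem: {problem}\nDesired output: {desired_output}"


step_prompt = build_step_by_step_prompt(
    "Review SaaS metrics",
    ["List the key metrics", "Flag notable trends", "Rank metrics by priority"],
)
reasoning_prompt = build_reasoning_prompt(
    "Which SaaS metrics should I prioritize, and why?",
    "A ranked list of metrics with a short forecast and strategy",
)
print(step_prompt)
print(reasoning_prompt)
```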
I asked Qwen 2.5 to go through my most important business metrics, identify any trends I should be aware of, and tell me which metrics I should prioritize. Then, I asked it to build a forecast model and create a strategy for me to hit that forecast.
I compared this side-by-side with ChatGPT o1 pro, which I regularly use for this purpose, and Qwen 2.5 did a really good job. It broke down my key growth levers and identified almost the same core metrics as ChatGPT. Then, it identified core bottlenecks and created the requested forecast model. I still have work to do after this, but as a first pass, it’s pretty incredible.
Qwen 2.5 really excels when the model needs to run through five or more steps to complete your goal or objective. In my experience, that’s when other options start to struggle. The more powerful the model, the more strategic it can be and the more steps it can run through to outperform lesser models.
What All of This Means for the AI Industry
When DeepSeek and Qwen 2.5 launched, stock for AI chip providers crashed. These Chinese models are trained on lower-capability chips than U.S. options, but they match their competitors in power. Further, it’s mind-blowing that DeepSeek and Qwen 2.5 are free for anyone to use. A $0 price point puts pressure on competitors who charge for access to equivalent models.
I believe this market shift will allow custom AI tools to evolve faster. Teams can build on these great foundational models to create experiences specific to their audiences. As companies like Alibaba push the boundaries with models like Qwen 2.5, the industry is forced to adapt, leading to a future where open-source, custom models become the norm. So, AI giants are in for some competition.
Finding Your Place in the Next Chapter of the AI Race
Overall, Qwen 2.5 offers a cost-effective and powerful solution for AI applications, from content creation to data analysis and strategic planning. It’s fast, multimodal, multilingual, and can support large context windows and complex reasoning.
For individuals and business users, the hardest part will be discovering how to integrate this AI in a useful and impactful way. Ultimately, user habits and workflows are the hardest things to change.
If you want people to use these models, you have to figure out how to integrate them into their work on a day-to-day basis. That means testing use cases, sharing best practices, and building apps that make the AI easier to actually consume and use.
If you can crack that code, you can leverage the power of models like Qwen 2.5 to grow.
To learn more about lead-scoring tactics and marketing growth strategies, check out the full episode of Marketing Against the Grain below:
This blog series is in partnership with Marketing Against the Grain, the video podcast. It digs deeper into ideas shared by marketing leaders Kipp Bodnar (HubSpot’s CMO) and Kieran Flanagan (SVP, Marketing at HubSpot) as they unpack growth strategies and learn from standout founders and peers.