Alibaba’s Qwen Team Releases QVQ-72B Open Source Vision AI Model in Preview
Alibaba’s Qwen research team has released another open-source artificial intelligence (AI) model in preview. Dubbed QVQ-72B, it is a vision-based reasoning model that can analyze visual information from images and understand the context behind them. The tech giant has also shared benchmark scores for the AI model and highlighted that on one specific test, it outperformed OpenAI’s o1 model. Notably, Alibaba has released several open-source AI models recently, including the QwQ-32B and Marco-o1 reasoning-focused large language models (LLMs).

Alibaba’s Vision-Based QVQ-72B AI Model Launched

In a Hugging Face listing, Alibaba’s Qwen team detailed the new open-source AI model. Calling it an experimental research model, the researchers highlighted that QVQ-72B comes with enhanced visual reasoning capabilities. Interestingly, vision and reasoning are two separate capabilities, which the researchers have combined in this model.

Vision-based AI models are plentiful. They include an image encoder and can analyze the visual information in an image and the context behind it. Similarly, reasoning-focused models such as o1 and QwQ-32B come with test-time compute scaling capabilities, which let the model spend more processing time on a query. This enables the model to break down the problem, solve it in a step-by-step manner, assess the output, and correct it against a verifier.

With QVQ-72B’s preview model, Alibaba has combined these two functionalities. It can analyze information from images and answer complex queries by using reasoning-focused structures. The team highlights that this has significantly improved the performance of the model. Sharing evals from internal testing, the researchers claimed that QVQ-72B scored 71.4 percent on the MathVista (mini) benchmark, outperforming the o1 model (71.0). It is also said to score 70.3 percent on the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark.

Despite the improved performance, there are several limitations, as is the case with most experimental models. The Qwen team stated that the AI model occasionally mixes different languages or unexpectedly switches between them, an issue known as code-switching. Additionally, the model is prone to getting caught in recursive reasoning loops, which affects the final output.
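For readers curious about the test-time compute idea mentioned above, here is a minimal toy sketch of the general "propose, check against a verifier, retry" pattern. It is not Alibaba's or OpenAI's implementation; the names `solve_with_verifier`, `toy_proposer`, and `toy_verifier` are hypothetical, and the hard-coded guesses merely stand in for a model's successive reasoning attempts.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, Optional


def solve_with_verifier(
    candidates: Iterable[int],
    verifier: Callable[[int], bool],
    max_attempts: int,
) -> Optional[int]:
    """Return the first candidate the verifier accepts within the budget, else None."""
    # A larger max_attempts means more compute spent at inference time.
    for answer in islice(candidates, max_attempts):
        if verifier(answer):
            return answer
    return None


def toy_proposer() -> Iterator[int]:
    # Hypothetical stand-in for a model's attempts at "17 * 24".
    yield 398  # first attempt: wrong
    yield 418  # second attempt: wrong
    yield 408  # third attempt: correct


def toy_verifier(answer: int) -> bool:
    # Assumed checker: recomputes the product independently.
    return answer == 17 * 24


small_budget = solve_with_verifier(toy_proposer(), toy_verifier, max_attempts=1)
large_budget = solve_with_verifier(toy_proposer(), toy_verifier, max_attempts=3)
print(small_budget, large_budget)
```

In this toy setup, a budget of one attempt returns no accepted answer, while a budget of three attempts reaches the answer the verifier accepts. That is the basic trade-off behind spending more processing time per query.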