
Microsoft’s new free tool OmniParser V2 gives more power to large language models (LLMs)

OmniParser V2 is trained with a larger set of interactive element detection data and icon functional caption data. By decreasing the image size of the icon caption model, OmniParser V2 reduces the latency by 60% compared to the previous version

By Telangana Today
Published Date - 16 February 2025, 07:45 PM

Hyderabad: Microsoft has unveiled a new AI model, OmniParser V2. The open-source model allows large language models (LLMs), which are deep-learning models pre-trained on vast amounts of data, to act as agents capable of using a computer.

According to Microsoft, graphical user interface (GUI) automation requires agents that can understand and interact with user screens.

However, using general-purpose LLMs as GUI agents poses two challenges: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen.

OmniParser closes this gap by ‘tokenising’ UI screenshots, converting raw pixels into structured elements that LLMs can interpret.

This enables an LLM to perform retrieval-based next-action prediction over the set of parsed interactable elements.
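
To make the idea concrete, the sketch below shows one way parsed screen elements might be represented and turned into text that an LLM can read. It is illustrative only: the ParsedElement class, its fields and the build_prompt function are hypothetical names chosen for this example, and they are not OmniParser's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class ParsedElement:
    """One interactable region found in a screenshot (hypothetical schema)."""
    element_id: int
    kind: str      # e.g. "button", "icon", "text_field"
    caption: str   # short description of what the element does
    bbox: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), normalised to 0-1


def build_prompt(task: str, elements: list[ParsedElement]) -> str:
    """Serialise parsed elements into plain text an LLM can choose from."""
    lines = [f"Task: {task}", "Interactable elements:"]
    for el in elements:
        lines.append(f"[{el.element_id}] {el.kind}: {el.caption}")
    lines.append("Reply with the id of the element to act on.")
    return "\n".join(lines)


# Made-up example elements, standing in for the output of a screen parser.
elements = [
    ParsedElement(0, "text_field", "Search box", (0.10, 0.05, 0.70, 0.10)),
    ParsedElement(1, "button", "Submit search", (0.72, 0.05, 0.80, 0.10)),
]
prompt = build_prompt("Search for the weather in Hyderabad", elements)
print(prompt)
```

The point of the numbered list is that the model only has to pick an id from a short menu, rather than guess pixel coordinates from the raw image.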

OmniParser V2 takes this capability to the next level. Compared to its predecessor, it achieves higher accuracy in detecting smaller interactable elements and faster inference, making it a useful tool for GUI automation.

In particular, OmniParser V2 is trained with a larger set of interactive element detection data and icon functional caption data. By decreasing the image size of the icon caption model, OmniParser V2 reduces the latency by 60% compared to the previous version.

Notably, OmniParser combined with GPT-4o achieves a state-of-the-art average accuracy of 39.6 on ScreenSpot Pro, a recently released grounding benchmark that features high-resolution screens and tiny target icons. This is a substantial improvement on GPT-4o’s original score of 0.8.

In simple terms, OmniParser V2 is a tool designed to help AI models interact with graphical user interfaces (GUIs), like the ones you see on your computer screen. When AI models are asked to automate tasks in a GUI, they face two main problems:
1. Recognising which parts of the screen can be interacted with (like buttons, icons, etc.).
2. Understanding what each part of the screen means and knowing what action should be taken on it (like clicking a button or entering text).

OmniParser V2 solves these problems by taking a screenshot of the GUI and breaking it down into structured, understandable elements.

It converts the visual information (the pixels) into parts that AI models can easily interpret.

This makes it possible for AI to predict what the next action should be based on the parsed elements, such as which button to press or field to fill in.
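
As a rough illustration of that last step, the sketch below turns a chosen element into a screen position to click. Everything in it is an assumption made for the example: the click_point function, the dictionary of parsed elements, the normalised bounding-box format and the screen size are hypothetical, and the chosen id is hard-coded where a real agent would use the model's reply.

```python
def click_point(bbox, screen_width, screen_height):
    """Return the pixel at the centre of a normalised (0-1) bounding box."""
    x_min, y_min, x_max, y_max = bbox
    x = round((x_min + x_max) / 2 * screen_width)
    y = round((y_min + y_max) / 2 * screen_height)
    return x, y


# Hypothetical parsed elements: id -> (caption, normalised bounding box).
parsed = {
    0: ("Search box", (0.10, 0.05, 0.70, 0.10)),
    1: ("Submit search", (0.70, 0.05, 0.80, 0.10)),
}

chosen_id = 1  # stand-in for the id an LLM might return
caption, bbox = parsed[chosen_id]
x, y = click_point(bbox, screen_width=1920, screen_height=1080)
print(f"Click '{caption}' at ({x}, {y})")
```

Clicking the centre of the box is a simple default; a real agent might also need to type text, scroll or wait for the screen to change.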

(Source: Microsoft.com)
