
Microsoft’s new free tool OmniParser V2 gives more power to large language models (LLMs)

OmniParser V2 is trained with a larger set of interactive element detection data and icon functional caption data. By decreasing the image size of the icon caption model, OmniParser V2 reduces the latency by 60% compared to the previous version

By Telangana Today
Published Date - 16 February 2025, 07:45 PM

Hyderabad: Microsoft has unveiled a new AI model, OmniParser V2. The open-source model allows large language models (LLMs), which are deep-learning models pre-trained on vast amounts of data, to act as agents capable of using a computer.

According to Microsoft, graphical user interface (GUI) automation requires agents with the ability to understand and interact with user screens.


However, using general-purpose LLMs as GUI agents poses two challenges: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen.

OmniParser closes this gap by ‘tokenising’ UI screenshots, converting them from pixel space into structured elements that LLMs can interpret.

This enables LLMs to perform retrieval-based next-action prediction over a set of parsed interactable elements.
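
As a rough illustration of what ‘tokenising’ a screenshot could mean in practice, the sketch below represents each detected element as a small record and renders the list as text a language model can read. The field names (`element_id`, `kind`, `caption`, `bbox`) and the `to_prompt` helper are assumptions made for this example, not OmniParser's actual output format.

```python
from dataclasses import dataclass


@dataclass
class ParsedElement:
    """One interactable region found in a screenshot (hypothetical schema)."""
    element_id: int
    kind: str       # e.g. "icon" or "text"
    caption: str    # short functional description of the element
    bbox: tuple[float, float, float, float]  # (x1, y1, x2, y2), normalised to 0..1


def to_prompt(elements: list[ParsedElement], task: str) -> str:
    """Render parsed elements as plain text so a language model can pick one by ID."""
    lines = [f"Task: {task}", "Interactable elements:"]
    for el in elements:
        lines.append(f"[{el.element_id}] {el.kind}: {el.caption}")
    lines.append("Reply with the ID of the element to act on.")
    return "\n".join(lines)


# Illustrative elements only; a real parser would produce these from pixels.
elements = [
    ParsedElement(0, "icon", "Open settings", (0.90, 0.02, 0.96, 0.08)),
    ParsedElement(1, "text", "Search box", (0.20, 0.02, 0.80, 0.08)),
]
prompt = to_prompt(elements, "Search for the weather in Hyderabad")
print(prompt)
```

The point of the sketch is that the model never sees raw pixels in this step: it chooses among a short, numbered list of described elements.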

OmniParser V2 takes this capability to the next level. Compared to its predecessor, it achieves higher accuracy in detecting smaller interactable elements and faster inference, making it a useful tool for GUI automation.

In particular, OmniParser V2 is trained with a larger set of interactive element detection data and icon functional caption data. By decreasing the image size of the icon caption model, OmniParser V2 reduces the latency by 60% compared to the previous version.

Notably, OmniParser+GPT-4o achieves a state-of-the-art average accuracy of 39.6 on ScreenSpot Pro, a recently released grounding benchmark that features high-resolution screens and tiny target icons. This is a substantial improvement on GPT-4o’s original score of 0.8.
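
To put the two scores side by side, the short calculation below uses only the figures quoted above (39.6 and 0.8); the variable names are illustrative.

```python
baseline_score = 0.8     # GPT-4o alone on ScreenSpot Pro, as quoted above
combined_score = 39.6    # OmniParser+GPT-4o, as quoted above

absolute_gain = combined_score - baseline_score   # difference in points
relative_gain = combined_score / baseline_score   # how many times higher

print(f"Absolute gain: {absolute_gain:.1f} points")
print(f"Relative gain: {relative_gain:.1f}x")
```

On these numbers the gain works out to 38.8 points, or roughly 49.5 times the original score.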

In simple terms, OmniParser V2 is a tool designed to help AI models interact with graphical user interfaces (GUIs), like the ones you see on your computer screen. When AI models are asked to automate tasks in a GUI, they face two main problems:
1. Recognising which parts of the screen can be interacted with (like buttons, icons, etc.).
2. Understanding what each part of the screen means and knowing what action should be taken on it (like clicking a button or entering text).

OmniParser V2 solves these problems by taking a screenshot of the GUI and breaking it down into structured, understandable elements.

It converts the visual information (the pixels) into parts that AI models can easily interpret.

This makes it possible for AI to predict what the next action should be based on the parsed elements, such as which button to press or field to fill in.
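
The last step, turning a chosen element into an action, can be sketched as follows. This assumes each parsed element carries a bounding box in normalised (0 to 1) coordinates and that the model replies with an element ID; both are assumptions for illustration, and `click_point` is a hypothetical helper rather than part of OmniParser.

```python
def click_point(bbox, screen_width, screen_height):
    """Return the pixel at the centre of a normalised (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = bbox
    centre_x = (x1 + x2) / 2 * screen_width
    centre_y = (y1 + y2) / 2 * screen_height
    return round(centre_x), round(centre_y)


# Hypothetical parsed elements, keyed by ID.
boxes = {
    0: (0.25, 0.50, 0.75, 0.60),   # a "Submit" button
    1: (0.10, 0.10, 0.30, 0.20),   # a "Name" text field
}

chosen_id = 0  # stand-in for the ID a language model would return
x, y = click_point(boxes[chosen_id], screen_width=1920, screen_height=1080)
print(f"Click element {chosen_id} at ({x}, {y})")
```

Clicking the centre of the box is just one simple choice; an automation tool could equally pass the whole region to whatever performs the click.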

(Source: Microsoft.com)


