Microsoft’s new free tool OmniParser V2 gives more power to large language models (LLMs)

OmniParser V2 is trained with a larger set of interactive element detection data and icon functional caption data. By decreasing the image size of the icon caption model, OmniParser V2 reduces the latency by 60% compared to the previous version

By Telangana Today
Published Date - 16 February 2025, 07:45 PM

Hyderabad: Microsoft has unveiled a new AI model, OmniParser V2. The open-source model allows large language models (LLMs), which are deep-learning models pre-trained on vast amounts of data, to act as agents capable of using a computer.

According to Microsoft, graphical user interface (GUI) automation requires agents with the ability to understand and interact with user screens.


However, using general-purpose LLMs as GUI agents poses two challenges: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen.

OmniParser closes this gap by ‘tokenising’ UI screenshots, converting them from pixel space into structured elements that LLMs can interpret.

This enables the LLMs to do retrieval-based next-action prediction given a set of parsed interactable elements.
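As a rough illustration of what such a structured representation might look like, the Python sketch below turns a list of parsed elements into numbered text lines that an LLM could choose from. The `ParsedElement` class, its fields and the `format_for_llm` helper are hypothetical names made up for this example; they are not OmniParser's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class ParsedElement:
    """One interactable region found in a screenshot (hypothetical schema)."""
    element_id: int
    caption: str   # short description of what the element does
    bbox: tuple    # (x_min, y_min, x_max, y_max) in pixels


def format_for_llm(elements):
    """Render parsed elements as numbered text lines an LLM can pick from."""
    return "\n".join(
        f"[{e.element_id}] {e.caption} at {e.bbox}" for e in elements
    )


# Made-up elements standing in for the result of parsing a screenshot.
elements = [
    ParsedElement(0, "Search box", (40, 20, 400, 50)),
    ParsedElement(1, "Submit button", (420, 20, 500, 50)),
]
prompt_body = format_for_llm(elements)
print(prompt_body)
```

With a list like this in the prompt, the model only has to name an element ID rather than guess pixel coordinates from the raw image.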

OmniParser V2 takes this capability to the next level. Compared to its predecessor, it achieves higher accuracy in detecting smaller interactable elements and faster inference, making it a useful tool for GUI automation.

In particular, OmniParser V2 is trained with a larger set of interactive element detection data and icon functional caption data. By decreasing the image size of the icon caption model, OmniParser V2 reduces the latency by 60% compared to the previous version.
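To put the 60% figure in perspective, the short Python calculation below converts a latency reduction into a speed-up factor. The one-second baseline is an arbitrary value chosen for illustration, not a measured number from Microsoft.

```python
def speedup_from_reduction(reduction):
    """Speed-up factor implied by a fractional latency reduction."""
    return 1.0 / (1.0 - reduction)


baseline_latency_s = 1.0   # assumed baseline, for illustration only
reduction = 0.60           # the 60% reduction reported for V2

new_latency_s = baseline_latency_s * (1.0 - reduction)
speedup = speedup_from_reduction(reduction)
print(f"new latency: {new_latency_s:.2f} s, speed-up: {speedup:.1f}x")
```

In other words, a 60% cut in latency means each call takes 40% of the time it used to, which is roughly a 2.5-fold speed-up.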

Notably, OmniParser+GPT-4o achieves a state-of-the-art average accuracy of 39.6 on ScreenSpot Pro, a recently released grounding benchmark that features high-resolution screens and tiny target icons. This is a substantial improvement on GPT-4o’s original score of 0.8.
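The size of that gain can be worked out directly from the two reported scores, as in the Python snippet below. The variable names are only for illustration.

```python
gpt4o_alone = 0.8        # GPT-4o's reported score on ScreenSpot Pro
with_omniparser = 39.6   # OmniParser+GPT-4o's reported score

gain = with_omniparser - gpt4o_alone    # absolute improvement
ratio = with_omniparser / gpt4o_alone   # relative improvement
print(f"gain: {gain:.1f} points, ratio: {ratio:.1f}x")
```

That works out to an improvement of 38.8 points, or about 49.5 times the original score.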

In simple terms, OmniParser V2 is a tool designed to help AI models interact with graphical user interfaces (GUIs), like the ones you see on your computer screen. When AI models are asked to automate tasks in a GUI, they face two main problems:
1. Recognising which parts of the screen can be interacted with (like buttons, icons, etc.).
2. Understanding what each part of the screen means and knowing what action should be taken on it (like clicking a button or entering text).

OmniParser V2 solves these problems by taking a screenshot of the GUI and breaking it down into structured, understandable elements.

It converts the visual information (the pixels) into parts that AI models can easily interpret.

This makes it possible for AI to predict what the next action should be based on the parsed elements, such as which button to press or field to fill in.
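One way to picture that last step is sketched in Python below: once a model has picked an element, its bounding box can be turned into a concrete click or typing action. The element dictionary, the `bbox_centre` and `to_action` helpers and the action format are all assumptions made for this example, not part of OmniParser itself.

```python
def bbox_centre(bbox):
    """Centre point of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = bbox
    return ((x_min + x_max) // 2, (y_min + y_max) // 2)


def to_action(elements, chosen_id, text=None):
    """Build a click or type action for the element the model selected."""
    x, y = bbox_centre(elements[chosen_id]["bbox"])
    if text is None:
        return {"kind": "click", "x": x, "y": y}
    return {"kind": "type", "x": x, "y": y, "text": text}


# Hypothetical parsed elements, keyed by an ID the model can refer to.
elements = {
    0: {"caption": "Search box", "bbox": (40, 20, 400, 50)},
    1: {"caption": "Submit button", "bbox": (420, 20, 500, 50)},
}

click = to_action(elements, 1)                     # press the button
typing = to_action(elements, 0, text="OmniParser") # fill in the field
print(click)
print(typing)
```

Aiming at the centre of the box is a simple choice for this sketch; a real agent might target a different point or check that the element is visible first.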

(Source: Microsoft.com)
