Prasad Thammineni
prasadt2
AI & ML interests
None yet
Organizations
None yet
Generative UI
Screen agents
MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Paper • 2410.13757 • Published • 33
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Paper • 2410.08164 • Published • 26
WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration
Paper • 2408.15978 • Published
Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents
Paper • 2409.17140 • Published
LAMs
Trained models
Memory
Voice agents
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
Paper • 2410.15316 • Published • 12
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
Paper • 2410.19168 • Published • 22
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects
Paper • 2504.19838 • Published • 22
Reasoning
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Paper • 2410.13639 • Published • 19
MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Paper • 2410.13757 • Published • 33
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Paper • 2411.14405 • Published • 61
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS
Paper • 2411.18478 • Published • 37
Agents
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
Paper • 2406.12045 • Published • 9
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31
SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion
Paper • 2412.04301 • Published • 41
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems
Paper • 2501.11067 • Published • 13
Datasets
RAG