

Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control

Longtao Zheng, Rundong Wang, Xinrun Wang, Bo An

ICLR 2024 Conference

May 2024

Keywords: AI Agents, Large Language Models, Prompting

Abstract:

Building agents with large language models (LLMs) for computer control is a burgeoning research area, where the agent receives computer states and performs actions to complete complex tasks. Previous computer agents have demonstrated the benefits of in-context learning (ICL); however, their performance is hindered by several issues. First, the limited context length of LLMs and complex computer states restrict the number of exemplars, as a single webpage can consume the entire context. Second, the exemplars in current methods, such as high-level plans and multi-choice questions, cannot represent complete trajectories, leading to suboptimal performance in long-horizon tasks. Third, existing computer agents rely on task-specific exemplars and overlook the similarity among tasks, resulting in poor generalization to novel tasks. To address these challenges, we introduce Synapse, a computer agent featuring three key components: i) state abstraction, which filters out task-irrelevant information from raw states, allowing more exemplars within the limited context, ii) trajectory-as-exemplar prompting, which prompts the LLM with complete trajectories of the abstracted states and actions to improve multi-step decision-making, and iii) exemplar memory, which stores the embeddings of exemplars and retrieves them via similarity search for generalization to novel tasks. We evaluate Synapse on MiniWoB++, a standard task suite, and Mind2Web, a real-world website benchmark. In MiniWoB++, Synapse achieves a 99.2% average success rate (a 10% relative improvement) across 64 tasks using demonstrations from only 48 tasks. Notably, Synapse is the first ICL method to solve the book-flight task in MiniWoB++. Synapse also exhibits a 56% relative improvement in average step success rate over the previous state-of-the-art prompting scheme in Mind2Web.
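The exemplar memory and trajectory-as-exemplar prompting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the names `ExemplarMemory`, `retrieve`, and `build_prompt`, as well as the prompt layout, are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumed stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ExemplarMemory:
    """Stores (embedding, task, trajectory) entries; retrieves by similarity."""

    def __init__(self):
        self.entries = []

    def add(self, task, trajectory):
        # trajectory: list of (abstracted_state, action) pairs
        self.entries.append((embed(task), task, trajectory))

    def retrieve(self, query, k=1):
        # Rank stored exemplars by similarity to the query task description.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [(task, traj) for _, task, traj in ranked[:k]]


def build_prompt(memory, task, k=1):
    """Prepend the k most similar complete trajectories to the new task."""
    parts = []
    for ex_task, traj in memory.retrieve(task, k):
        parts.append(f"Task: {ex_task}")
        for state, action in traj:
            parts.append(f"Observation: {state}")
            parts.append(f"Action: {action}")
    parts.append(f"Task: {task}")
    return "\n".join(parts)
```

In this sketch, each exemplar is a full sequence of abstracted states and actions rather than a single plan or question, and a new task description selects its exemplars by similarity instead of by a hand-assigned task label.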

