Human in the Loop Agent UI and Agent interface #290
Conversation
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Issue | Status |
|---|---|
| Missing action match validation | 🧠 Incorrect |
| Missing configuration interface definition | 🧠 Not in standard |
| Inefficient Image Processing Loop | ✅ Fix detected |
| Unclear State Update Parameter Specification | ✅ Fix detected |
| Document factory function purpose and types | ✅ Fix detected |
| Unused intervention round counter | ✅ Fix detected |
| Optional Agent Name | 🧠 Not in scope |
| Document complex interaction flow | 🧠 Not in standard |
| Inconsistent Function Name Case | ✅ Fix detected |
| Improper Type Hint Comment | 🧠 Incorrect |
Files scanned
| File Path | Reviewed |
|---|---|
| src/agentlab/agents/hilt_agent/base_multi_candidate_agent.py | ✅ |
| src/agentlab/agents/hilt_agent/hint_labelling.py | ✅ |
| src/agentlab/agents/hilt_agent/multi_candidate_generic_agent.py | ✅ |
| src/agentlab/agents/hilt_agent/hilt_agent.py | ✅ |
| src/agentlab/agents/hilt_agent/generic_human_guided_agent.py | ✅ |
| src/agentlab/agents/hilt_agent/hint_labelling_ui_files/hint_labeling_ui.html | ✅ |
```python
class MultiCandidateAgentArgs(AgentArgs):
    def make_agent(self) -> MultiCandidateAgent: ...
```
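For context, here is a self-contained sketch of how the args/factory split and the multi-candidate protocol might fit together. The `StubAgent` and `StubAgentArgs` names are invented for illustration and are not part of the PR:

```python
from dataclasses import dataclass
from typing import Protocol


class MultiCandidateAgent(Protocol):
    """Interface sketch: agents that propose several candidate actions per step."""

    def get_candidate_generations(self, obs, hint=None, n_candidates=3) -> list: ...
    def update_agent_state_from_selected_candidate(self, output: dict) -> None: ...


class StubAgent:
    """Minimal stand-in satisfying the protocol (hypothetical)."""

    def get_candidate_generations(self, obs, hint=None, n_candidates=3):
        return [{"action": f"noop({i})"} for i in range(n_candidates)]

    def update_agent_state_from_selected_candidate(self, output: dict):
        self.selected = output


@dataclass
class StubAgentArgs:
    agent_name: str = "MC-stub"

    def make_agent(self) -> MultiCandidateAgent:
        # Factory: build a fresh agent instance from serializable args.
        return StubAgent()
```

The args object stays a plain dataclass so experiment configs remain serializable, while `make_agent` defers construction of the (stateful) agent.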
```python
    """
    ...

def update_agent_state_from_selected_candidate(self, output: dict):
```
```python
self.ui = None

@cost_tracker_decorator
def get_action(self, obs):
```
```python
def __post_init__(self):
    """Prefix subagent name with 'MC-'."""
    super().__post_init__()
    if hasattr(self, 'agent_name') and self.agent_name:
```
```python
def __init__(
    self,
    subagent_args,  # Type: any object with MultiCandidateAgent interface
```
```python
def get_base_human_in_the_loop_genericagent(llm_config):
    from agentlab.agents.generic_agent.tmlr_config import BASE_FLAGS
    from agentlab.llm.llm_configs import CHAT_MODEL_ARGS_DICT
    from agentlab.agents.hilt_agent.hilt_agent import HumanInTheLoopAgentArgs
```
```python
step_n_human_intervention_rounds += 1
suggestions = [{'action': c['action'], 'think': c['agent_info'].think} for c in candidates]
# List of images as base64 - create overlay screenshots for each suggested action
screenshots = [overlay_action(obs, choice["action"]) for choice in suggestions]
```
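The `overlay_action` helper itself is not shown in this excerpt. A minimal, self-contained sketch of what it might do is below; the `bbox` parameter (x, y, w, h) is a stand-in for however the real implementation locates the action's target element from the action string:

```python
import numpy as np
from PIL import Image, ImageDraw


def overlay_action(obs: dict, action: str, bbox=None) -> Image.Image:
    """Draw a red box over the action's target element (illustrative sketch only)."""
    img = Image.fromarray(obs["screenshot"]).convert("RGB")
    # `bbox` is hypothetical here; the real code would resolve the target
    # element's bounding box from `action` and the observation.
    if bbox is not None:
        x, y, w, h = bbox
        draw = ImageDraw.Draw(img)
        draw.rectangle([x, y, x + w, y + h], outline=(255, 0, 0), width=2)
    return img
```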
```python
choice_idx = None
for i, candidate in enumerate(suggestions):
    if candidate["action"] == selected_action:
        choice_idx = i
        break
selected_candidate = candidates[choice_idx]
```
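Note that in the loop above, `choice_idx` stays `None` when the selected action matches no suggestion, so `candidates[choice_idx]` raises a `TypeError`; this is the "missing action match validation" issue flagged in the review table. A hedged sketch of validated matching (`select_candidate` is a name invented here):

```python
def select_candidate(candidates: list, suggestions: list, selected_action: str):
    """Match the human's selection back to a candidate, failing loudly on a miss."""
    choice_idx = next(
        (i for i, s in enumerate(suggestions) if s["action"] == selected_action),
        None,
    )
    if choice_idx is None:
        # The UI returned an action matching no suggestion (e.g. a free-form
        # edit); surface it explicitly instead of crashing on candidates[None].
        raise ValueError(f"selected action not among suggestions: {selected_action!r}")
    return candidates[choice_idx]
```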
```python
step_n_human_intervention_rounds = 0
step_hint = []

# Initialize UI once outside the loop
if self.ui is None:
    self.ui = HintLabeling(headless=False)
    # Show initial waiting state
    initial_inputs = HintLabelingInputs(
        goal=(
            obs.get("goal_object", [{}])[0].get("text", "")
            if obs.get("goal_object")
            else ""
        ),
        error_feedback="",
        screenshot=(img_to_base_64(obs["screenshot"]) if "screenshot" in obs else ""),
        screenshots=[],  # no overlay screenshots yet
        axtree=obs.get("axtree_txt", ""),
        history=[],
        hint="",
        suggestions=[],  # no suggestions yet
    )
    self.ui.update_context(initial_inputs)

# Generate first candidates
candidates = self.subagent.get_candidate_generations(obs, hint=None, n_candidates=3)
step_n_human_intervention_rounds += 1
```
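The `get_action` logic quoted above can be condensed into a rough sketch of a single HITL step. The `ui.ask` call and its reply format are invented here for illustration; the real UI exchanges context via `update_context` and the hint-labeling page:

```python
def human_in_the_loop_step(subagent, ui, obs, n_candidates=3, max_rounds=5):
    """One HITL step (sketch): propose candidates, let the human hint or pick.

    `ui.ask` is hypothetical and returns either
    {"type": "hint", "hint": str} or {"type": "select", "index": int}.
    """
    hint = None
    for _ in range(max_rounds):
        candidates = subagent.get_candidate_generations(
            obs, hint=hint, n_candidates=n_candidates
        )
        reply = ui.ask(candidates)
        if reply["type"] == "hint":
            hint = reply["hint"]  # regenerate candidates guided by the new hint
            continue
        selected = candidates[reply["index"]]
        subagent.update_agent_state_from_selected_candidate(selected)
        return selected["action"]
    raise RuntimeError("maximum human-intervention rounds exceeded")
```

Bounding the loop with `max_rounds` avoids the unbounded-intervention case; the PR's `step_n_human_intervention_rounds` counter plays a similar bookkeeping role.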
```python
    return img_to_base_64(act_img)


def img_to_base_64(image: Image.Image | np.ndarray) -> str:
```
Maybe this could live in a utility file somewhere where we reconcile to avoid duplicates, e.g. the `_url` version would call this one.
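One way the deduplicated helper could look, as a sketch under the reviewer's suggestion; the `img_to_base_64_url` name and delegation are hypothetical:

```python
import base64
import io

import numpy as np
from PIL import Image


def img_to_base_64(image) -> str:
    """Encode a PIL Image or HxWxC uint8 array as a base64 PNG string."""
    if isinstance(image, np.ndarray):
        image = Image.fromarray(image)
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode("utf-8")


def img_to_base_64_url(image) -> str:
    """Data-URL variant that reuses the base encoder, per the review suggestion."""
    return "data:image/png;base64," + img_to_base_64(image)
```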
```python
from agentlab.agents.agent_args import AgentArgs
from agentlab.agents.hilt_agent.base_multi_candidate_agent import MultiCandidateAgent
from agentlab.agents.hilt_agent.hint_labelling import (
```
The linter doesn't resolve this import. Are we missing an `__init__.py`?
Introduces a new agent interface to make any agent human-in-the-loop. At each step, the agent proposes candidate next actions, and humans either select one of the actions or provide a natural-language hint to guide the generation of proposed actions.
Includes a companion UI that enables human intervention: users can add one or more hints to regenerate candidate actions and review the past history of interaction.
Adds a stable human_guided_generic_agent.
Adds a draft human-in-the-loop (HITL) agent interface: Multi-candidate Generic Agent.
Updates Xray to view any added hints in the agent_info tab.
Registered the new script entry point `agentlab-mentor` for launching the HITL agent UI. You can run the UI and the stable human-guided generic agent as follows:

```shell
agentlab-mentor --benchmark miniwob --task-name "miniwob.book-flight" --seed 7
```

Run the `agentlab-mentor` Generic Agent UI on a benchmark task.
This pull request introduces a new "Human-in-the-Loop" (HILT) agent architecture for web automation tasks, enabling a human operator to select among multiple candidate actions proposed by an underlying agent. The changes modularize the agent design and add protocol definitions for multi-candidate agents.
Human-in-the-loop agent workflow and UI integration:
Generic multi-candidate agent implementation:
[Stable] Added `MultipleProposalGenericAgent` and its argument class in `generic_human_guided_agent.py`, providing a concrete agent that generates multiple candidate actions using LLM prompts, parses structured responses, and integrates with a hint-labeling UI for human selection.

Human-in-the-Loop agent architecture and protocol:

[Draft] Added the `MultiCandidateAgent` protocol in `base_multi_candidate_agent.py`, defining a standard interface for agents that generate multiple candidate actions and update their internal state based on the selected candidate. Also introduced `MultiCandidateAgentArgs` for agent argument handling and naming conventions.

[Draft] Implemented the `HumanInTheLoopAgent` class in `hilt_agent.py`, which wraps any multi-candidate agent and presents candidate actions to a human via a UI, allowing hints and selection, and updating agent state accordingly. Includes error handling and UI integration.

User interface and action visualization:

Utility functions (`overlay_action`, `img_to_base_64`) in both agent files overlay proposed actions on screenshots and encode images for the UI, enhancing human interpretability of agent suggestions. [1] [2]

Description by Korbit AI
What change is being made?
Introduce Human in the Loop (HILT) Agent UI and interfaces for agent implementations that support multiple candidate actions and user guidance.
Why are these changes being made?
These changes are being made to facilitate a more interactive approach where a human can guide the decision-making process of an AI agent by selecting among multiple proposed actions. The implementation allows agents to propose several viable actions in complex environments, with a user interface enabling human users to provide hints and select preferred actions. This setup enhances the adaptability and effectiveness of agents in dynamic environments by leveraging human intuition and expertise.