From the course: Prompting with Agentic Techniques
Agentic browsers with chatbot integration - ChatGPT Tutorial
- Agentic browsers are a new category of chatbots that sit on top of existing browsers and give you some unique capabilities that aren't really available anywhere else. Their main feature is that the chatbot is an integral part of the browser. As you can see here, I have the Atlas browser, which is ChatGPT's version, and every time I open a new tab, by default I get a full ChatGPT interface with all the features here, and the sidebar over here. So you don't even have to go to ChatGPT anymore. You just use this browser and it's always there.
And if you've watched any of my recent videos, you may have noticed that my daily browser is Comet from Perplexity, which is also an agentic browser. It's been around a little bit longer and it has a lot of great features. Perplexity is always on, whether you're opening a new window or a new tab. And if I go to a website, I can open up the assistant and ask it a question about the page that I'm looking at. I'm gonna say, "Write five bullet points for a PowerPoint presentation about this page, about 150 characters each. Use an informative but matter-of-fact style." And this will take the current page and all of its content and summarize it for me rather quickly. There's actually a shortcut for just summarizing the page that you're on. So if you don't want to be specific about the five bullet points, you can just hit that button and it's gonna give you a quick summary of the page.
One of the interesting things about this browser is that it remembers your browser history, so you can ask questions about something that you remember looking at a while back. So I'm gonna say, "Find the study that I was looking at earlier about the remote labor index and summarize the key findings for an executive update. Make it one or two sentences, easy to understand."
And since it has my browser history, it found the PDF that I was looking at earlier, and it's able to give me a quick summary for a meeting. If I wanted to, I could ask it to open this up in a window.
The chatbot has full control over the browser. I can ask it to do all kinds of things with tabs and history. So I can say things like, "Close all the tabs that I haven't looked at in 30 minutes," or even ask it to organize tabs into different groups.
Just like with agent mode in ChatGPT, you can ask it to perform a task that would normally require you to do it by hand. Here I'm on my list of courses. When I get to a certain point, the page automatically loads more of the page. I'd like to get a list of all my courses, so I'll open up the assistant and say, "I need to create a list of all my courses on this page. Go ahead and put together a list with a title, a description from the course, and a link to the course."
And you'll see this blue outline that appears on the page. That means that the browser now has control of my window, and it's going to scroll through all the courses and gather all the information. Notice that it's making a to-do list of what it needs to do, it's taking some screenshots as it performs the task, and it's trying to find a way of getting that description. It needs to click on each of the courses to get the descriptions. Because doing this is going to require visiting 55 pages, it's asking me to confirm that that's something I want to do. I'm gonna say, "Please do it." And eventually it would be able to finish the complete list that I asked for, although this is an operation that would take a lot of time.
You can see how useful these agentic browsers can be. They make a specific chatbot immediately available everywhere. They can quickly summarize documents and even take over existing tasks for you.
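The confirmation step in the course-list demo, where the assistant asks before visiting 55 pages, follows a common pattern for agents: plan first, then check with the user before starting a large job. Here is a minimal Python sketch of that idea. It is not how Comet or Atlas is actually implemented; the names `AgentPlan`, `run_with_confirmation`, `visit`, and `confirm`, and the 10-page threshold, are all hypothetical and chosen just for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentPlan:
    """Hypothetical to-do list an agent builds before it starts acting."""
    goal: str
    pages_to_visit: List[str] = field(default_factory=list)


def run_with_confirmation(
    plan: AgentPlan,
    visit: Callable[[str], str],
    confirm: Callable[[str], bool],
    max_pages_without_asking: int = 10,  # assumed threshold, not a real setting
) -> List[str]:
    """Visit every page in the plan, asking the user first if the job is large."""
    count = len(plan.pages_to_visit)
    if count > max_pages_without_asking:
        question = f"This will open {count} pages. Continue?"
        if not confirm(question):
            return []  # the user declined, so no page is visited
    # Either the job was small or the user approved it.
    return [visit(url) for url in plan.pages_to_visit]


# Example: a 55-page job, with stand-ins for the browser and the user.
plan = AgentPlan(
    goal="List all courses with title, description, and link",
    pages_to_visit=[f"https://example.com/course/{i}" for i in range(55)],
)
results = run_with_confirmation(
    plan,
    visit=lambda url: f"description from {url}",  # stand-in for opening a page
    confirm=lambda question: True,                # stand-in for "please do it"
)
```

In this sketch, a small job runs without interruption, while a large one does nothing until the user approves, which mirrors what happens in the demo before the browser starts clicking through each course.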