$\begingroup$

In Mathematica it's easy to Import data from the web and perform all sorts of Web Operations. But the latter do not allow simulating more interactive browser operations, such as a "Click" action on a button or link. This may be relevant, for instance, in this question, where a follow-up comment asked for the "next page".

How can we open a web page, click a particular button or link, and extract relevant information such as the URL, page source, and list of links of the new web page?

$\endgroup$

1 Answer

$\begingroup$

Since Mathematica 11.3 there is an experimental implementation of WebDriver, a proposed W3C standard; the implementation currently supports the Chrome and Firefox web browsers. It allows actions such as "OpenWebPage", "CaptureWebPage" (which returns an Image), "JavascriptExecute", and mouse operations such as "ClickElement" and "HoverElement", among others.

There is no documented option to get the URL or the page source, but a simple inspection of the implementation reveals some internal functions of interest, including

WebUnit`GetURL
WebUnit`PageLinks
WebUnit`GetPageHtml

and more.
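As a minimal sketch, these functions might be bundled into one helper that answers the question directly. `pageInfo` is a hypothetical name, not part of WebUnit. The calls to `WebUnit`GetURL` and `WebUnit`PageLinks` follow the way they are used in the Chrome example in this answer; that `WebUnit`GetPageHtml` takes the same internal session object is an assumption, since none of these functions is documented and they may change between versions.

```mathematica
(* Hypothetical helper, not part of WebUnit.
   iws is the internal session object, e.g. the result of
   ExternalEvaluateWebDriver`Private`websession[] *)
pageInfo[iws_] := <|
  "URL" -> WebUnit`GetURL[iws],
  "Links" -> Union[WebUnit`PageLinks[iws]],
  (* Assumption: GetPageHtml accepts iws like the two functions above *)
  "Source" -> WebUnit`GetPageHtml[iws]
  |>
```

Calling `pageInfo[iws]` after the page has loaded should then give the URL, the de-duplicated links and the page source in a single Association.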

The following code opens Chrome on Google Maps, captures an image of the page, and extracts the current URL, which Google Maps has modified to include the detected location.

Module[
 {
  session = StartExternalSession["WebDriver-Chrome"],
  iws, chromedo, img, links
  },
 chromedo[cmd_] := ExternalEvaluate[session, cmd];
 Pause[1];
 iws = ExternalEvaluateWebDriver`Private`websession[];
 Pause[1]; (* Time to load chrome *)
 chromedo["OpenWebPage" -> "https://maps.google.com"];
 Pause[5]; (* Time to load the page *)
 img = chromedo["CaptureWebPage"];
 Echo@WebUnit`GetURL[iws];
 (*links=Union@WebUnit`PageLinks[iws];*)
 DeleteObject[session];
 img
 ]


The following opens Firefox, searches DuckDuckGo, locates a link to a Twitter account, shows it (scrolling down), clicks on it, and then closes the browser.

Module[
 {
  session = StartExternalSession["WebDriver-Firefox"],
  query = 
   StringTemplate["https://duckduckgo.com/html/?q=%22``%22"]@
    URLEncode["mathematica stackexchange"],
  ffoxdo, located
  },
 ffoxdo[cmd_] := ExternalEvaluate[session, cmd];
 Pause[10];
 ffoxdo["OpenWebPage" -> query];
 Pause[1];
 located = 
  ffoxdo["LocateElements" -> <|"PartialLinkText" -> "@StackMma"|> ];
 ffoxdo["ShowElement" -> located];
 Pause[1];
 ffoxdo["ClickElement" -> First[located]];
 Pause[10];
 DeleteObject[session];
 ]
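To read information from the page reached by the click without relying on internal functions, one option is the "JavascriptExecute" command mentioned above. The following is only a sketch: `currentPageInfo` is a hypothetical helper, `do` stands for a command runner such as `ffoxdo` or `chromedo`, and it assumes that the value of a JavaScript `return` statement is passed back to Mathematica.

```mathematica
(* Hypothetical helper; do is a command runner such as ffoxdo or chromedo.
   Assumption: "JavascriptExecute" passes the script's return value back. *)
currentPageInfo[do_] := <|
  "URL" -> do["JavascriptExecute" -> "return window.location.href;"],
  "Source" -> do["JavascriptExecute" ->
     "return document.documentElement.outerHTML;"],
  "Links" -> do["JavascriptExecute" ->
     "return Array.from(document.links).map(function (a) { return a.href; });"]
  |>
```

It would be called as `currentPageInfo[ffoxdo]` after the pause that follows "ClickElement" and before `DeleteObject[session]`, so that the browser is still open and the new page has had time to load.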


$\endgroup$
  • $\begingroup$ Interesting to see that WebUnit is how they're doing this. It was originally an Arnoud Buzing project. $\endgroup$ Commented Jul 26, 2018 at 0:07
