2
$\begingroup$

Given a list of web links:

listLinks={
   "https://lex.uz/en/docs/-6880024?query=food%20safety#sr-1",
   "https://lex.uz/en/docs/6360915?query=food%20safety#sr-1",
   "https://lex.uz/en/docs/6112700?query=food%20safety#sr-1"
};

How can I download the files attached to the links in listLinks?

$\endgroup$

2 Answers

6
$\begingroup$

Look at the documentation for URLDownload:

listLinks=URL/@{
   "https://lex.uz/en/docs/-6880024?query=food%20safety#sr-1",
   "https://lex.uz/en/docs/6360915?query=food%20safety#sr-1",
   "https://lex.uz/en/docs/6112700?query=food%20safety#sr-1"
};
URLDownload[listLinks]
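
Called with a single argument, URLDownload saves to temporary files. If readable names are wanted, it also accepts a target file as a second argument. A minimal sketch, assuming listLinks as defined above; the doc1.html-style names and the choice of the current working directory are illustrative, not part of the original answer:

```wolfram
(* Save each page under an explicit name: doc1.html, doc2.html, ... *)
(* The "doc" prefix and Directory[] as target folder are assumptions. *)
files = MapIndexed[
   URLDownload[
     #1,
     FileNameJoin[{Directory[], "doc" <> ToString[First[#2]] <> ".html"}]
   ] &,
   listLinks
];
```

Each element of files should then be a File object pointing at the saved page, which can be opened in a browser or passed to Import.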
$\endgroup$
5
  • $\begingroup$ Thanks for the answer. Your code gives me a list of .tmp files, but I cannot access the actual text of the files. $\endgroup$ Commented Oct 21, 2024 at 18:22
  • $\begingroup$ You asked to download the files from URL, not to parse the website to extract files. I think I answered what you asked. Perhaps you should ask a new question with more details. $\endgroup$ Commented Oct 21, 2024 at 18:25
  • 2
    $\begingroup$ @TugrulTemel Rename these ".tmp" files to ".html" and open them in a browser, or use URLDownload[url,file] with an explicit file name. $\endgroup$ Commented Oct 21, 2024 at 18:39
  • $\begingroup$ @azerbajdzan: How do you automatically rename these tmp files? $\endgroup$ Commented Oct 21, 2024 at 21:13
  • 2
    $\begingroup$ @TugrulTemel did you actually look at the documentation as suggested? Not only did azerbajdzan tell you how to rename it, there's also an example like this explicitly listed in the details. I suggest you also take a look at Map $\endgroup$ Commented Oct 22, 2024 at 6:30
1
$\begingroup$

Thanks go to @rhermans and @azerbajdzan for the following answer, which may help others in future:

(* Hyperlinks are retrieved.*)
links = Import["https://lex.uz/en/search/nat?query=food%20safety", "Hyperlinks"]; 

(*The first 15 links are selected.*)
urlLinks = Take[URL[#] & /@ links, 15]; 

(*The documents in the selected links are saved as individual HTML files.*)
Table[URLDownload[urlLinks[[i]], ToString[deneme[i]] <> ".html"], {i,15}] 

This code downloads the first 15 documents found among the page's links, with a caveat: the saved files are not in a clean format, because the imported documents contain many undesirable comments.
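
One possible way around the formatting problem is to import the visible text of each page instead of saving its raw HTML. The following is only a sketch under assumptions: it reuses links from above, guesses that document pages contain "/docs/" in their URL (based on the URLs in the question), and uses hypothetical output names text1.txt, text2.txt, and so on. Whether it yields the full legal text depends on how the site serves its pages.

```wolfram
(* Keep only links that look like document pages. *)
(* The "/docs/" pattern is an assumption based on the question's URLs. *)
docLinks = Select[links, StringQ[#] && StringContainsQ[#, "/docs/"] &];

(* Import the visible text of each page rather than its HTML source. *)
texts = Import[#, {"HTML", "Plaintext"}] & /@ Take[docLinks, UpTo[15]];

(* Write each text to a plain-text file; the file names are hypothetical. *)
MapIndexed[
  Export["text" <> ToString[First[#2]] <> ".txt", #1, "Text"] &,
  texts
]
```

UpTo[15] avoids an error when fewer than 15 matching links are found.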

$\endgroup$
1
  • 2
    $\begingroup$ ToString[deneme[i].html] is not correct code; use this instead: ToString[deneme[i]] <> ".html" $\endgroup$ Commented Oct 22, 2024 at 16:49
