How to quickly download a large number of small images on a Debian server?

Hi,

There is a remote VPS with Debian on board. Periodically I need to download a large number of small images by their URLs, rename them, and then compress them. The task itself is trivial; the question is speed.

What are the options? I would appreciate any pointers.

At first I thought of using "wget -i pupkinlist.txt", but that way you can't specify a destination name for each file.

Right now I'm doing it in what is probably an amateurish way: a PHP script gets the list of files and runs exec('wget ...'), then exec('convert ...'), and I launch several PHP CLI instances (hoping that they run in parallel and don't interfere with each other). Is there really no better option?
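For reference, the current loop looks roughly like this (the list format and the convert options are illustrative):

<?php
// pupkinlist.txt: one "target_name url" pair per line (format is illustrative)
$lines = file('pupkinlist.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($lines as $line) {
    list($name, $url) = explode(' ', $line, 2);
    // download under the target name, then recompress in place
    exec('wget -q -O ' . escapeshellarg($name) . ' ' . escapeshellarg($url));
    exec('convert ' . escapeshellarg($name) . ' -quality 80 ' . escapeshellarg($name));
}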

(Plus there is a related task in the same vein: a mass check for the existence of files in a folder, like file_exists, but for many files at once.)
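For example, something along these lines, reading the directory listing once instead of stat-ing each file (the folder path and list file name are illustrative):

<?php
// Build a lookup set from a single directory read
// instead of one stat() per file_exists() call.
$present = array_flip(scandir('/path/to/images'));

// files_to_check.txt: one expected file name per line (illustrative)
$expected = file('files_to_check.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$missing = [];
foreach ($expected as $name) {
    if (!isset($present[$name])) {
        $missing[] = $name;
    }
}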

Maybe I should combine multiple exec calls into one with a bunch of arguments? How would that be done, and would it actually speed things up? I feel that console commands are the way to go, but I don't know how to run them in parallel while making sure there aren't too many at once.
What would you recommend?
June 8th 19 at 17:22
7 answers
June 8th 19 at 17:32
It's easy enough, really, but describe the whole algorithm in detail: what are you downloading, what access do you have, what kind of file is it, does it get updated, and so on.
June 8th 19 at 17:24
# images.txt: one "some_id1 image.jpg" pair per line; gensub requires gawk
$ awk '{ ext = gensub(/.*(\.[^.]*$)/, "\\1", 1);
  print "wget " $2 " -O " $1 ext }' images.txt

wget image.jpg -O some_id1.jpg
wget image2.jpg -O some_id2.jpg
wget image3.jpg -O some_id3.jpg
June 8th 19 at 17:26
Pack everything into a tar with zero compression, then unpack it on the other side. The overhead will be about 2-3%.
)) Unfortunately, this is not about transferring files from server to server. - afton42 commented on June 8th 19 at 17:29
You can use HTTP/2 and send a bunch of requests at once over a single connection.
The speedup only comes from reducing the overhead of establishing connections.
Tar is one of the options, but not an entirely successful one: no resuming of interrupted downloads and so on. - Sandra_Kautzer42 commented on June 8th 19 at 17:32
June 8th 19 at 17:28
Put the list of URLs into a file (or feed it in directly) and do something like:
cat img_links | xargs -n 1 -P 5 wget
Here -P is the number of parallel wget processes.

You can add a variable to give the files whatever names you need (wget's -O option).

P.S. To check whether a file exists at a given link, use a construct like:
if [[ `wget -S --spider http://link_to_image.jpg 2>&1 | grep 'HTTP/1.1 200 OK'` ]]; then
...
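If all of this is driven from PHP, the whole xargs pipeline above can be started with a single exec call instead of one per file (the list file name, wget flags, and parallelism level are illustrative):

<?php
// One exec launches the whole parallel download; -P 5 means 5 wget processes at a time.
exec('cat img_links | xargs -n 1 -P 5 wget -q 2>&1', $output, $status);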
June 8th 19 at 17:30
Of course, I'm a web developer and admin, so my advice will be web-oriented, but I would use sockets and send everything as a batch over a single stream.
June 8th 19 at 17:34
If you're already using PHP, then I would try the following:

1) read a batch of URLs from the list (10, 20, ...)
2) initialize and start them via php.net/manual/ru/function.curl-multi-init.php
3) wait until all the transfers finish
4) rename the files
5) compress them
6) repeat from step 1
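A minimal sketch of steps 1-4 with curl_multi (the $batch array mapping target file names to URLs is illustrative):

<?php
// One batch of the list: target file name => URL (illustrative values).
$batch = [
    'some_id1.jpg' => 'http://example.com/image1.jpg',
    'some_id2.jpg' => 'http://example.com/image2.jpg',
];

$mh = curl_multi_init();
$handles = [];

foreach ($batch as $name => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$name] = $ch;
}

// Run all transfers in parallel and wait until they finish.
do {
    curl_multi_exec($mh, $running);
    if ($running > 0 && curl_multi_select($mh) === -1) {
        usleep(1000); // avoid busy-waiting if select() is unavailable
    }
} while ($running > 0);

// Save each body under its target name, then clean up.
foreach ($handles as $name => $ch) {
    if (curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200) {
        file_put_contents($name, curl_multi_getcontent($ch));
    }
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

Step 5 could then be one convert/mogrify call per batch instead of one per file.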
June 8th 19 at 17:36
rsync
)) Unfortunately, this is not about transferring files from server to server. - afton42 commented on June 8th 19 at 17:39
rsync also works locally, not necessarily from server to server. You can, for example, sync from directory 1 of server A to directory 2 of server A. - Sandra_Kautzer42 commented on June 8th 19 at 17:42

Find more questions by tags: PHP, Linux, Wget, Debian