How do I emulate a browser request to Google with cURL?

I need to get the first 100 results for a particular keyword (using a scraper), but from time to time Google blocks the IP and asks for a captcha. Is it possible to emulate a browser request with cURL?

cURL configuration:

$curl = curl_init();
 curl_setopt($curl, CURLOPT_URL, $url);
 curl_setopt($curl, THIS, 0);
 curl_setopt($curl, CURLOPT_VERBOSE, true);
 curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
 curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
 curl_setopt($curl, CURLOPT_COOKIEFILE, $GLOBALS['cookie_file']);
 curl_setopt($curl, CURLOPT_TIMEOUT, 30);
 curl_setopt($curl, CURLOPT_HEADER, 0);
 curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
 $html = curl_exec($curl);
October 3rd 19 at 04:20
7 answers
October 3rd 19 at 04:22
If you want to pretend to be a browser, the best way is to be one. For example, use PhantomJS.
Or Selenium - cristopher_Mo commented on October 3rd 19 at 04:25
That is overkill for queries like the author's: it wastes memory, and you would have to invent some kind of wrapper to run it in multiple threads. - ricardo_Haley commented on October 3rd 19 at 04:28
October 3rd 19 at 04:24
Google has an API, so why reinvent the wheel?
October 3rd 19 at 04:26
1. Make the request in a browser
2. See which headers are sent to Google (look in Chrome DevTools, Firebug, Fiddler, etc.)
3. Program curl to send identical headers
4....
5. PROFIT!
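Step 3 could be sketched roughly as follows. This is only an illustration: the function names browserHeaders and fetchLikeBrowser are hypothetical, and the header values and the User-Agent string are example placeholders, not a set known to satisfy Google. Copy the real values from your own browser.

```php
<?php
// Example header values only (assumptions); replace them with the ones
// your own browser actually sends, as seen in its network panel.
function browserHeaders(): array
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
        'Upgrade-Insecure-Requests: 1',
    ];
}

// Hypothetical helper: one GET request that sends the headers above.
// Returns the response body as a string, or false on failure.
function fetchLikeBrowser(string $url, string $cookieFile)
{
    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER    => true,
        CURLOPT_TIMEOUT        => 30,
        CURLOPT_ENCODING       => '',          // accept any encoding libcurl supports and decode it
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // and write them back to it
        // Example User-Agent (assumption); use your browser's actual string.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0',
        CURLOPT_HTTPHEADER     => browserHeaders(),
    ]);
    return curl_exec($curl);
}
```

It could then be called as $html = fetchLikeBrowser('https://www.google.com/search?q=linux', $GLOBALS['cookie_file']); identical headers alone may still not be enough once request limits are exceeded.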
October 3rd 19 at 04:28
Suppose you send browser headers; what difference does it make? Google will still block you once you exceed certain limits.
The thing is, when the request is made with curl, Google asks for a captcha, but when I open the same URL in a browser, no captcha is required - cristopher_Mo commented on October 3rd 19 at 04:31
Typically, a User-Agent header naming the "right" browser may not be enough. - ricardo_Haley commented on October 3rd 19 at 04:34
It is enough until the thresholds are exceeded. Regular users sometimes suffer from this too: instead of search results, they are prompted to enter a captcha. - Santos_Moore1 commented on October 3rd 19 at 04:37
The captcha can be solved automatically through pixodrom.com: put a dollar in and you can easily send a thousand requests to Google - ayla.Lowe78 commented on October 3rd 19 at 04:40
Right, the captcha can be solved without pretending to be a browser. - Dalton57 commented on October 3rd 19 at 04:43
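Since the comments above agree that the block kicks in once some threshold is exceeded, a scraper can at least notice the block and slow down. A minimal sketch, with loud assumptions: the names looksBlocked and backoffSeconds are hypothetical, and treating HTTP 429 or a redirect to a URL containing "/sorry/" as the captcha page is an assumption about Google's behaviour, not something stated in this thread.

```php
<?php
// Hypothetical check: treats HTTP 429 or a final URL containing "/sorry/"
// as the captcha wall. Both signals are assumptions, not documented behaviour.
function looksBlocked(int $httpCode, string $effectiveUrl): bool
{
    return $httpCode === 429 || strpos($effectiveUrl, '/sorry/') !== false;
}

// Seconds to wait before the next attempt: doubles with every
// consecutive block and is capped at $max.
function backoffSeconds(int $consecutiveBlocks, int $base = 5, int $max = 600): int
{
    return (int) min($max, $base * (2 ** $consecutiveBlocks));
}
```

After curl_exec, the two inputs can be read with curl_getinfo($curl, CURLINFO_HTTP_CODE) and curl_getinfo($curl, CURLINFO_EFFECTIVE_URL); the script can then sleep(backoffSeconds($n)) before retrying. The base and cap values are arbitrary and would need tuning.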
October 3rd 19 at 04:30
Google blocks requests from scripts and shows a captcha for a reason, so you need to prove to it that your script is not a bot (by solving the captcha). I have noticed that it does this when the requested words are not morphologically related and the requests come en masse from the same address (which, moreover, probably belongs to a well-known datacenter?). You can also try sending proper "browser" headers: User-Agent and the others.
Even if I parse the results page with a single request, the captcha still appears. - cristopher_Mo commented on October 3rd 19 at 04:33
October 3rd 19 at 04:32
In the console, roughly like this: curl -A Mozilla "www.google.com/search?q=linux" | html2text -width 70
October 3rd 19 at 04:34
What does Google gain from bots? They do not look at ads; they are only a burden and do harm. Google is doing the right thing; I would block scripts too.
