I recently started learning Python and want my first project to be parsing data from a closed area of a site (one that requires authorization).
I watched one lesson
(the gist, but there is also a link to the video on YouTube), in which everything is quite clear. But the author does not use any modules for authentication, does not send headers, does not use proxies, etc., so the following questions arise:
- If I have to parse a few thousand pages, what precautions should I take to avoid being banned?
- Presumably, if I put a pause between requests, I can avoid a ban? (And how do you size up the situation in advance, to tell a site you can parse easily from one that shows a complex captcha after the first 3 requests?)
- Is it OK to parse from a desktop machine (as the author did)?
- Which simple HTTP client can you recommend?
- Is it enough to send headers similar to the ones my browser sends?
The data to parse is, in general, simple: titles, travel contacts, no JS, pagination.
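To make the questions concrete, here is a rough sketch of what I have in mind: browser-like headers plus a random pause between requests. It uses only the standard library's `urllib.request`. The URL pattern, the header values, the pause range, and the function names (`build_request`, `fetch`, `crawl`) are all my own placeholders, not something from the lesson, and the login step is left out entirely.

```python
import random
import time
import urllib.request

# Hypothetical values for illustration only; not taken from the lesson.
BASE_URL = "https://example.com/list?page={page}"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.8",
}


def build_request(page):
    """Build a GET request for one listing page with browser-like headers."""
    return urllib.request.Request(BASE_URL.format(page=page), headers=HEADERS)


def fetch(request):
    """Download one page and return its body as text."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(pages, min_pause=2.0, max_pause=5.0, fetch=fetch, sleep=time.sleep):
    """Yield (page, html) pairs, sleeping a random interval between requests."""
    for index, page in enumerate(pages):
        if index > 0:
            # Assumed pause range; the right values depend on the site.
            sleep(random.uniform(min_pause, max_pause))
        yield page, fetch(build_request(page))
```

Something like `for page, html in crawl(range(1, 4)): ...` would then walk the pagination. Whether a pause of a few seconds and these headers are actually enough is exactly what I am asking.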