selenium - I am using Scrapy to crawl data, but the server blocks my IP
As the picture shows, I am using Scrapy to crawl data from a server, and the server seems to have blocked my IP. I am curious: did the server block the IP of my Mac or the IP of my router?
It is the router's public IP that is blocked.
In this scenario there are two networks.
One, the public internet, to which the server (hosting the website you crawl) is connected.
Two, your private home network, to which your Mac is connected.
Your router acts as a gateway between the private home network and the internet, and helps your Mac talk to the server.
To act as a "gateway", the router has two IP addresses: one private IP address on the home network and one public IP address. Only the public IP address is visible to the server. From the server's viewpoint, the public IP address is where the crawl requests come from.
Hence it is the router's public IP that is blocked.
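As a rough illustration of the gateway behaviour described above (commonly called network address translation, or NAT), here is a minimal Python sketch. Both addresses and the function name source_ip_seen_by_server are made up for the example; your router's real public address is assigned by your ISP.

```python
import ipaddress

# Hypothetical addresses, for illustration only.
MAC_PRIVATE_IP = ipaddress.ip_address("192.168.1.23")  # handed out by the router on the home network
ROUTER_PUBLIC_IP = ipaddress.ip_address("8.8.8.8")     # placeholder for the ISP-assigned public address


def source_ip_seen_by_server(sender_ip, router_public_ip):
    """Return the source address a remote server would see (simplified NAT model)."""
    # Private addresses are not routable on the public internet, so the
    # router substitutes its own public address on outgoing traffic.
    if sender_ip.is_private:
        return router_public_ip
    return sender_ip


seen = source_ip_seen_by_server(MAC_PRIVATE_IP, ROUTER_PUBLIC_IP)
print(f"Mac's private IP:  {MAC_PRIVATE_IP}")
print(f"IP the server sees: {seen}")
```

Under this simplified model, every device on the home network shows up at the server with the same public address, which is why a ban affects the whole household and not just the Mac.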
Also, please respect the website's terms of service and crawl responsibly.
If you don't want to get banned, try the following settings in settings.py:
- limit CONCURRENT_REQUESTS
- set DOWNLOAD_DELAY
Reference: http://doc.scrapy.org/en/latest/topics/settings.html
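A settings.py fragment along these lines might look like the sketch below. The numbers are illustrative assumptions, not recommended values; tune them to what the target site tolerates. The extra settings beyond the two named above (CONCURRENT_REQUESTS_PER_DOMAIN, RANDOMIZE_DOWNLOAD_DELAY, AUTOTHROTTLE_ENABLED, ROBOTSTXT_OBEY) are standard Scrapy settings added here as optional suggestions.

```python
# settings.py (sketch; all values are illustrative assumptions)

# Cap how many requests Scrapy keeps in flight at once.
CONCURRENT_REQUESTS = 4
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Wait roughly this many seconds between requests to the same site.
DOWNLOAD_DELAY = 2.0
# Vary the delay randomly (0.5x to 1.5x) so requests look less mechanical.
RANDOMIZE_DOWNLOAD_DELAY = True

# Optional: let Scrapy adapt the delay to the server's response times.
AUTOTHROTTLE_ENABLED = True

# Honour the site's robots.txt rules.
ROBOTSTXT_OBEY = True
```

Lower concurrency and a longer delay make the crawl slower but reduce the load on the server, which is usually what triggers an IP block in the first place.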