我们在访问网站的时候,发现有些网页ID 是按顺序排列的数字,这个时候我们就可以使用ID遍历的方式来爬取内容。但是局限性在于有些ID数字在10位数左右,那么这样爬取效率就会很低很低! import itertools from common import download def iteration(): max_errors = 5 # maximum number of consecutive download errors allowed num_errors = 0 # current number of consecutive download errors for page in itertools.count(1): url = 'http://example.webscraping.com/view/-{}'.format(page) html = download(url) if html is None: # received an error trying to download this webpage num_errors += 1 if num_errors == max_errors: # reached maximum amount of errors in a row so exit break # so assume have reached the last country ID and can stop downloading else: # success - can scrape the result # ... num_errors = 0