我要怎么知道網(wǎng)站篩選條件后有多少頁(yè),因?yàn)橛惺畮讉€(gè)條件循環(huán)爬取,不想每個(gè)都去翻開(kāi)看。這是網(wǎng)站http://www.landchina.com/defa...
然后我的代碼中???部分就是頁(yè)數(shù)。我該怎么獲取頁(yè)數(shù),求助!
假設(shè)有7頁(yè)時(shí)的代碼如下:而不知道有多少頁(yè)時(shí)該怎么寫??
class ListSpider(CrawlSpider):
# 爬蟲名稱
name = "tudiwang"
# 允許域名
allowed_domains = ["landchina.com"]
def start_requests(self):
cities = {'4a611fc4-42b1-4861-ac43-8d25b002dc2b%3A31%A8%88%7E%C9%CF%BA%A3%CA%D0',
'4a611fc4-42b1-4861-ac43-8d25b002dc2b%3A2101%A8%88%7E%C9%F2%D1%F4%CA%D0'}
for city in cities:
for i in range(1, 8):
yield FormRequest(url="http://www.landchina.com/default.aspx?tabid=262&ComName=default",
formdata={'TAB_QuerySubmitPagerData': str(i),
'hidComName': 'default',
'__VIEWSTATE': '/wEPDwUJNjkzNzgyNTU4D2QWAmYPZBYIZg9kFgICAQ9kFgJmDxYCHgdWaXNpYmxlaGQCAQ9kFgICAQ8WAh4Fc3R5bGUFIEJBQ0tHUk9VTkQtQ09MT1I6I2YzZjVmNztDT0xPUjo7ZAICD2QWAgIBD2QWAmYPZBYCZg9kFgJmD2QWBGYPZBYCZg9kFgJmD2QWAmYPZBYCZg9kFgJmDxYEHwEFIENPTE9SOiNEM0QzRDM7QkFDS0dST1VORC1DT0xPUjo7HwBoFgJmD2QWAgIBD2QWAmYPDxYCHgRUZXh0ZWRkAgEPZBYCZg9kFgJmD2QWAmYPZBYEZg9kFgJmDxYEHwEFhwFDT0xPUjojRDNEM0QzO0JBQ0tHUk9VTkQtQ09MT1I6O0JBQ0tHUk9VTkQtSU1BR0U6dXJsKGh0dHA6Ly93d3cubGFuZGNoaW5hLmNvbS9Vc2VyL2RlZmF1bHQvVXBsb2FkL3N5c0ZyYW1lSW1nL3hfdGRzY3dfc3lfamhnZ18wMDAuZ2lmKTseBmhlaWdodAUBMxYCZg9kFgICAQ9kFgJmDw8WAh8CZWRkAgIPZBYCZg9kFgJmD2QWAmYPZBYCZg9kFgJmD2QWAmYPZBYEZg9kFgJmDxYEHwEFIENPTE9SOiNEM0QzRDM7QkFDS0dST1VORC1DT0xPUjo7HwBoFgJmD2QWAgIBD2QWAmYPDxYCHwJlZGQCAg9kFgJmD2QWBGYPZBYCZg9kFgJmD2QWAmYPZBYCZg9kFgJmD2QWAmYPFgQfAQUgQ09MT1I6I0QzRDNEMztCQUNLR1JPVU5ELUNPTE9SOjsfAGgWAmYPZBYCAgEPZBYCZg8PFgIfAmVkZAICD2QWBGYPZBYCZg9kFgJmD2QWAmYPZBYCAgEPZBYCZg8WBB8BBYYBQ09MT1I6I0QzRDNEMztCQUNLR1JPVU5ELUNPTE9SOjtCQUNLR1JPVU5ELUlNQUdFOnVybChodHRwOi8vd3d3LmxhbmRjaGluYS5jb20vVXNlci9kZWZhdWx0L1VwbG9hZC9zeXNGcmFtZUltZy94X3Rkc2N3X3p5X2RrZ3NfMDEuZ2lmKTsfAwUCNDYWAmYPZBYCAgEPZBYCZg8PFgIfAmVkZAIBD2QWAmYPZBYCZg9kFgJmD2QWAgIBD2QWAmYPFgQfAQUgQ09MT1I6I0QzRDNEMztCQUNLR1JPVU5ELUNPTE9SOjsfAGgWAmYPZBYCAgEPZBYCZg8PFgIfAmVkZAIDD2QWAgIDDxYEHglpbm5lcmh0bWwFgwc8cCBhbGlnbj0iY2VudGVyIj48c3BhbiBzdHlsZT0iZm9udC1zaXplOiB4LXNtYWxsIj4mbmJzcDs8YnIgLz4NCiZuYnNwOzxhIHRhcmdldD0iX3NlbGYiIGhyZWY9Imh0dHA6Ly93d3cubGFuZGNoaW5hLmNvbS8iPjxpbWcgYm9yZGVyPSIwIiBhbHQ9IiIgd2lkdGg9IjI2MCIgaGVpZ2h0PSI2MSIgc3JjPSIvVXNlci9kZWZhdWx0L1VwbG9hZC9mY2svaW1hZ2UvdGRzY3dfbG9nZS5wbmciIC8+PC9hPiZuYnNwOzxiciAvPg0KJm5ic3A7PHNwYW4gc3R5bGU9ImNvbG9yOiAjZmZmZmZmIj5Db3B5cmlnaHQgMjAwOC0yMDE0IERSQ25ldC4gQWxsIFJpZ2h0cyBSZXNlcnZlZCZuYnNwOyZuYnNwOyZuYnNwOyA8c2NyaXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCI+DQp2YXIgX2JkaG1Qcm90b2NvbCA9ICgoImh0dHBzOiIgPT0gZG9jdW1lbnQubG9jYXRpb24ucHJvdG9jb2wpID8gIiBodHRwczovLyIgOiAiIGh0dHA6Ly8iKTsNCmRvY3VtZW50LndyaXRlKHVuZXNjYXBlKCIlM0NzY3JpcHQgc3JjPSciICsgX2JkaG1Qcm90b2NvbCArICJobS5iYWlkdS5jb20vaC5qcyUzRjgzODUzODU5YzcyNDdjNWIwM2I1Mjc4OTQ2MjJkM2ZhJyB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnJTNFJTNDL3NjcmlwdCUzRSIpKTsNCjwvc2NyaXB0PiZuYnNwOzxiciAvPg0K54mI5p2D5omA5pyJJm5ic3A7IOS4reWbveWcn+WcsOW4guWcuue9kSZuYnNwOyZuYnNwO+aKgOacr+aUr+aMgTrmtZnmsZ/oh7vlloTnp5HmioDogqHku73mnInpmZDlhazlj7gmbmJzcDvkupHlnLDnvZE8YnIgLz4NCuWkh+ahiOWPtzog5LqsSUNQ5aSHMDkwNzQ5OTLlj7cg5Lqs5YWs572R5a6J5aSHMTEwMTAyMDAwNjY2KDIpJm5ic3A7PGJyIC8+DQo8L3NwYW4+Jm5ic3A7Jm5ic3A7Jm5ic3A7PGJyIC8+DQombmJzcDs8L3NwYW4+PC9wPh8BBWRCQUNLR1JPVU5ELUlNQUdFOnVybChodHRwOi8vd3d3LmxhbmRjaGluYS5jb20vVXNlci9kZWZhdWx0L1VwbG9hZC9zeXNGcmFtZUltZy94X3Rkc2N3MjAxM195d18xLmpwZyk7ZGRj0Fiy7GPyRpmTRnlY5U0IOrZGzSKlbJQP2TxpPz5bSg==',
'__EVENTVALIDATION': '/wEWAgLRs7z/DgLN3cj/BExRVsB2t10osIeeCUJDCHk68eSfFrDyuUmn8ownEC0+',
'TAB_QuerySortItemList': '20da3312-3b36-4e96-9398-fc8c5174b02c:False',
'TAB_QuerySubmitConditionData': unquote(city),
'TAB_QuerySubmitOrderData': '20da3312-3b36-4e96-9398-fc8c5174b02c:False',
'TAB_QuerySubmitSortData': '',
'TAB_RowButtonActionControl': ''
}
)
# 解析內(nèi)容函數(shù)
def parse(self, response):
for sel in response.xpath('//tr[@onmouseout="this.className=rowClass"]'):
item = TudiItem()
no = sel.xpath('td[@class="gridTdNumber"]/text()')[0].extract()
item['no'] = no
district = sel.xpath('td[@class="queryCellBordy"]/text()')[0].extract()
item['district'] = district
time = sel.xpath('td[@class="queryCellBordy"]/text()')[1].extract()
item['time'] = time
yield item
北大青鳥(niǎo)APTECH成立于1999年。依托北京大學(xué)優(yōu)質(zhì)雄厚的教育資源和背景,秉承“教育改變生活”的發(fā)展理念,致力于培養(yǎng)中國(guó)IT技能型緊缺人才,是大數(shù)據(jù)專業(yè)的國(guó)家
北大青鳥(niǎo)中博軟件學(xué)院創(chuàng)立于2003年,作為華東區(qū)著名互聯(lián)網(wǎng)學(xué)院和江蘇省首批服務(wù)外包人才培訓(xùn)基地,中博成功培育了近30000名軟件工程師走向高薪崗位,合作企業(yè)超4
中公教育集團(tuán)創(chuàng)建于1999年,經(jīng)過(guò)二十年潛心發(fā)展,已由一家北大畢業(yè)生自主創(chuàng)業(yè)的信息技術(shù)與教育服務(wù)機(jī)構(gòu),發(fā)展為教育服務(wù)業(yè)的綜合性企業(yè)集團(tuán),成為集合面授教學(xué)培訓(xùn)、網(wǎng)
達(dá)內(nèi)教育集團(tuán)成立于2002年,是一家由留學(xué)海歸創(chuàng)辦的高端職業(yè)教育培訓(xùn)機(jī)構(gòu),是中國(guó)一站式人才培養(yǎng)平臺(tái)、一站式人才輸送平臺(tái)。2014年4月3日在美國(guó)成功上市,融資1
曾工作于聯(lián)想擔(dān)任系統(tǒng)開(kāi)發(fā)工程師,曾在博彥科技股份有限公司擔(dān)任項(xiàng)目經(jīng)理從事移動(dòng)互聯(lián)網(wǎng)管理及研發(fā)工作,曾創(chuàng)辦藍(lán)懿科技有限責(zé)任公司從事總經(jīng)理職務(wù)負(fù)責(zé)iOS教學(xué)及管理工作。
浪潮集團(tuán)項(xiàng)目經(jīng)理。精通Java與.NET 技術(shù), 熟練的跨平臺(tái)面向?qū)ο箝_(kāi)發(fā)經(jīng)驗(yàn),技術(shù)功底深厚。 授課風(fēng)格 授課風(fēng)格清新自然、條理清晰、主次分明、重點(diǎn)難點(diǎn)突出、引人入勝。
精通HTML5和CSS3;Javascript及主流js庫(kù),具有快速界面開(kāi)發(fā)的能力,對(duì)瀏覽器兼容性、前端性能優(yōu)化等有深入理解。精通網(wǎng)頁(yè)制作和網(wǎng)頁(yè)游戲開(kāi)發(fā)。
具有10 年的Java 企業(yè)應(yīng)用開(kāi)發(fā)經(jīng)驗(yàn)。曾經(jīng)歷任德國(guó)Software AG 技術(shù)顧問(wèn),美國(guó)Dachieve 系統(tǒng)架構(gòu)師,美國(guó)AngelEngineers Inc. 系統(tǒng)架構(gòu)師。