
PHP curl scraping: the page is reachable, but the request returns a 500 error


The internal server error usually appears once $i reaches somewhere between 35 and 45.



class Pacong extends Base
{
    public function test()
    {
        header("content-type:text/html;charset=utf-8");
        set_time_limit(0); // no execution time limit
        ini_set('memory_limit', '-1'); // no memory limit

        $array = [
            0 => 'https://haikou.anjuke.com/ask/fl-qita/p',
            1 => 'https://haikou.anjuke.com/ask/fl-daikuan/p',
            2 => 'https://haikou.anjuke.com/ask/fl-maifang/p',
            3 => 'https://haikou.anjuke.com/ask/fl-maifanga/p',
            4 => 'https://haikou.anjuke.com/ask/fl-zufang/p',
            5 => 'https://haikou.anjuke.com/ask/fl-jiaoyiguohu/p',
        ];
        foreach ($array as $k => $v) {
            // echo $v;
            $header = array();
            for ($i = 1; $i < 100; $i++) {
                // sleep(5);
                echo $i;
                $curlobj = curl_init();
                // set the URL to request
                curl_setopt($curlobj, CURLOPT_URL, $v . $i . "/");
                // echo $array['0'] . $i . "/";
                curl_setopt($curlobj, CURLOPT_TIMEOUT, 0);
                curl_setopt($curlobj, CURLOPT_CONNECTTIMEOUT, 0);
                // return the response instead of printing it directly
                curl_setopt($curlobj, CURLOPT_RETURNTRANSFER, true);
                curl_setopt($curlobj, CURLOPT_HEADER, 1);
                // curl_setopt($curlobj, CURLOPT_RETURNTRANSFER, 1);
                curl_setopt($curlobj, CURLOPT_HTTPHEADER, $header);
                curl_setopt($curlobj, CURLOPT_COOKIE, 'aQQ_ajkguid=65E1E78E-6422-B2AF-B73F-000C2FA17625; ctid=49; 58tj_uuid=dfc8d57c-c982-4cb0-9693-9e3b688e6d97; als=0; _ga=GA1.2.1683086249.1524535452; _gid=GA1.2.785162605.1524535452; isp=true; lps=http%3A%2F%2Fhaikou.anjuke.com%2Fask%2Ffl-qita%2Fp30%7C; twe=2; sessid=39A3A71B-0A07-8C1E-1B9D-9409C5F93F8B; init_refer=; new_uv=5; new_session=0; __xsptplusUT_8=1; __xsptplus8=8.5.1524622380.1524623253.8%234%7C%7C%7C%7C%7C%23%23yn1dRjCaHplJ6-hXacmy8mfcE82lTJHz%23;');

                curl_setopt($curlobj, CURLOPT_SSL_VERIFYPEER, false); // skip certificate verification
                curl_setopt($curlobj, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36");
                $output = curl_exec($curlobj); // execute the request and fetch the content
                $info = curl_getinfo($curlobj);
                print_r($info);
                echo "<pre>";print_r(curl_error($curlobj));echo "</pre>";
                echo "<pre>";print_r(curl_getinfo($curlobj));echo "</pre>";
                echo "<pre>";print_r($header);echo "</pre>";
                curl_close($curlobj); // close the curl handle
            }
        }
    }
}
Answers
瘋浪

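The server may be running anti-crawler checks: if it detects frequent requests together with a few other telltale characteristics and concludes that a crawler is scraping it, it returns a 500 error page. You can read the response status code with curl_getinfo (here $ch is the curl handle, $curlobj in the question's code):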

$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); 
May 20, 2017, 01:23
吢涼

How do I catch this 500 error? I want to catch the 500 and then either continue or call sleep(30).

June 20, 2017, 00:42
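
The status check suggested in the first answer can be combined with the "sleep, then continue" behaviour asked about in the follow-up. The following is a minimal sketch, not a definitive implementation. The helper names classifyStatus and fetchPage, the retry policy (retry on 0, 429 and 5xx; skip other non-2xx codes), and the timeout and pause values are all assumptions made for illustration. Only curl_getinfo with CURLINFO_HTTP_CODE comes from the answer above.

```php
<?php
// Hypothetical helper: map an HTTP status code to an action.
// The policy is an assumption, not something the target site documents.
function classifyStatus(int $httpCode): string
{
    if ($httpCode >= 200 && $httpCode < 300) {
        return 'ok';
    }
    // 0 means curl received no HTTP response at all (timeout, DNS failure, ...).
    if ($httpCode === 0 || $httpCode === 429 || $httpCode >= 500) {
        return 'retry';
    }
    return 'skip';
}

// Hypothetical helper: fetch one URL, pausing and retrying on 5xx responses.
// Returns the response body, or null if the page should be skipped.
function fetchPage(string $url, int $maxAttempts = 3, int $pauseSeconds = 30): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); // assumed value
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);        // assumed value
        $body = curl_exec($ch);
        $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        $action = classifyStatus($httpCode);
        if ($action === 'ok' && $body !== false) {
            return $body;
        }
        if ($action === 'skip') {
            return null;
        }
        // 'retry': wait before the next attempt, but not after the last one.
        if ($attempt < $maxAttempts) {
            sleep($pauseSeconds);
        }
    }
    return null;
}
```

Inside the question's inner loop, this could be used as `$html = fetchPage($v . $i . "/"); if ($html === null) { continue; }`. If the 500 really does come from rate limiting, re-enabling a short sleep between successful requests may also reduce how often it is triggered.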