Why can't I fetch any content from this site with file_get_contents?

Posted: 2021-07-01 10:21:17

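http://www.hdwallpapersimages.com/
The site displays fine in my browser. I first tried file_get_contents and got an empty result. I then used ChinaZ's Baidu-spider and Google-spider simulators, and those requests timed out as well. So I simply copied my browser's request headers and fetched the page with file_get_contents again, and the result was still empty. Here is my code: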

// Request headers copied verbatim from the browser.
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n".
            // Note: the http:// wrapper does not decompress gzip/deflate bodies automatically.
            "Accept-Encoding:gzip, deflate, sdch\r\n".
            "Accept-Language:zh-CN,zh;q=0.8,en;q=0.6\r\n".
            "Cache-Control:max-age=0\r\n".
            "Cookie:viewed_cookie_policy=yes; __utmt=1; __utma=37938810.875942873.1452954236.1453114091.1453209277.3; __utmb=37938810.30.10.1453209277; __utmc=37938810; __utmz=37938810.1452954236.1.1.utmcsr=bing|utmccn=(organic)|utmcmd=organic|utmctr=hd%20wallpaper; __unam=eb5fde1-1524ad24043-4a580705-62\r\n".
            "Host:www.hdwallpapersimages.com\r\n".
            "Proxy-Connection:keep-alive\r\n".
            "Upgrade-Insecure-Requests:1\r\n".
            "User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36\r\n"
    )
);
$context = stream_context_create($opts);
echo file_get_contents('http://www.hdwallpapersimages.com', false, $context);

Replies:

Is the site you are trying to scrape unreachable?

I can't open the site either, haha. On the machine where you run the script, try curl directly and see whether you get any content back.
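
If running curl on that machine is inconvenient, a similar check can be sketched in PHP itself. The function below is a hypothetical helper (the name probe_url, its parameters, and the generic User-Agent string are assumptions for illustration, not part of the original question). It opens the URL with a timeout and reports whether the connection failed or the server answered, which helps tell "the site is unreachable from this machine" apart from "the server returned an empty body".

```php
<?php
// Hypothetical diagnostic helper: fetch a URL and report what actually happened.
function probe_url(string $url, float $timeoutSeconds = 10.0): array
{
    $context = stream_context_create([
        'http' => [
            'method'        => 'GET',
            'timeout'       => $timeoutSeconds, // give up after this many seconds
            'ignore_errors' => true,            // still read the body on 4xx/5xx responses
            'header'        => "User-Agent: Mozilla/5.0 (compatible; probe)\r\n",
        ],
    ]);

    error_clear_last();
    // Suppress the warning here; the message is read back via error_get_last().
    $stream = @fopen($url, 'r', false, $context);
    if ($stream === false) {
        $last = error_get_last();
        return [
            'ok'          => false,
            'status_line' => null,
            'length'      => 0,
            'timed_out'   => false,
            'error'       => $last['message'] ?? 'unknown error',
        ];
    }

    $body = stream_get_contents($stream);
    $meta = stream_get_meta_data($stream); // read after the body so timed_out is meaningful
    fclose($stream);

    // For http:// streams, wrapper_data holds the raw response header lines.
    $headers = $meta['wrapper_data'] ?? [];

    return [
        'ok'          => true,
        'status_line' => $headers[0] ?? null, // first status line, e.g. before any redirect
        'length'      => $body === false ? 0 : strlen($body),
        'timed_out'   => (bool) ($meta['timed_out'] ?? false),
        'error'       => null,
    ];
}
```

Calling print_r(probe_url('http://www.hdwallpapersimages.com/')) on the server would then show either an error message (for example a connection failure or timeout) or the status line and body length. If 'ok' is false, the problem is most likely network reachability rather than the request headers.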