
PHP code for logging search engine crawler visits

Published: 2021-07-01 10:21:17

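This article presents PHP code for logging visits from search engine crawlers, followed by a supplementary script that detects which search engine spider is making the request. Readers who need this can use it as a reference.

Here is the complete code: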

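// Log search engine crawler visits
$searchbot = get_naps_bot();
if ($searchbot) {
    $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']);
    // The Referer header is often missing for crawlers; kept from the original but not written to the log
    $url = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
    $file = WEB_PATH.'robotslogs.txt';
    $date = date('Y-m-d H:i:s');
    $data = fopen($file, 'a');
    // Note: the "URL" field holds the User-Agent string; "\r\n" ends the log line
    fwrite($data, "Time:$date robot:$searchbot URL:$tlc_thispage\r\n");
    fclose($data);
}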

WEB_PATH is the root directory path defined in index.php, which means the robotslogs.txt file is stored in the site root.
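The article does not show how WEB_PATH is defined. As a minimal sketch, assuming index.php sits in the site root, the definition could look like this (the guard and the use of __DIR__ are assumptions, not the author's code):

```php
<?php
// index.php (assumed to live in the site root)
// Hypothetical definition: the original article does not show how WEB_PATH is set.
if (!defined('WEB_PATH')) {
    define('WEB_PATH', __DIR__ . DIRECTORY_SEPARATOR);
}

// The log file path used by the logging snippet then resolves to the site root.
$file = WEB_PATH . 'robotslogs.txt';
```

Because the constant ends with a directory separator, the file name can be appended directly, as the logging code does.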

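get_naps_bot() identifies which search engine crawler is making the request. The User-Agent string is then escaped with addslashes and stored in the variable $tlc_thispage.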
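Note that addslashes only escapes quotes, backslashes, and NUL bytes; it leaves line breaks untouched, so a crafted User-Agent could still add extra lines to the log. A minimal sketch of one way to guard against that is shown here. The helper name clean_log_field and its length limit are hypothetical and not part of the original code:

```php
<?php
/**
 * Hypothetical helper (not in the original article): make a header value
 * safe to write as a single log field.
 */
function clean_log_field($value, $maxLength = 255)
{
    // Replace runs of ASCII control characters (including \r and \n) with one space
    $value = preg_replace('/[\x00-\x1F\x7F]+/', ' ', (string) $value);
    // Trim and cap the length so one request cannot bloat the log
    return substr(trim($value), 0, $maxLength);
}

// The User-Agent header may be absent, so fall back to an empty string
$tlc_thispage = clean_log_field(
    isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : ''
);
```

This could be used in place of, or in addition to, the addslashes call, depending on how the log is read later.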

fopen opens robotslogs.txt in append mode, fwrite writes the log entry, and fclose closes the file.
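The same open, write, close sequence can also be sketched with file_put_contents(), which is a substitute for the article's fopen/fwrite/fclose calls rather than the author's method. The function name log_bot_visit is hypothetical; FILE_APPEND and LOCK_EX are standard PHP flags:

```php
<?php
/**
 * Hypothetical helper: append one crawler visit to a log file.
 * Returns true on success, false if the write failed.
 */
function log_bot_visit($file, $bot, $userAgent)
{
    $line = sprintf(
        "Time:%s robot:%s URL:%s\r\n",
        date('Y-m-d H:i:s'),
        $bot,
        $userAgent
    );
    // FILE_APPEND adds to the end of the file;
    // LOCK_EX takes an exclusive lock for the duration of the write
    return file_put_contents($file, $line, FILE_APPEND | LOCK_EX) !== false;
}
```

A call such as log_bot_visit(WEB_PATH . 'robotslogs.txt', $searchbot, $tlc_thispage) would then replace the three file calls. The lock helps keep entries from interleaving when several crawlers hit the site at once.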

I no longer needed this on my own site and removed the code, so there is no live demo to show.

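PS: PHP code for detecting each search engine spider

The following search engines are supported: Baidu, Google, Bing, Yahoo, Soso, Sogou, and Yodao. Visits from their crawlers are recorded.

Code: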

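<?php
/**
 * Get the name of the search engine crawler making the request
 * edit by www.gxlcms.com
 */
function get_naps_bot()
{
    // The User-Agent header may be absent, so fall back to an empty string
    $useragent = isset($_SERVER['HTTP_USER_AGENT']) ? strtolower($_SERVER['HTTP_USER_AGENT']) : '';
    if (strpos($useragent, 'googlebot') !== false) {
        return 'Google';
    }
    if (strpos($useragent, 'baiduspider') !== false) {
        return 'Baidu';
    }
    // Bing's crawler was formerly msnbot and now identifies itself as bingbot
    if (strpos($useragent, 'msnbot') !== false || strpos($useragent, 'bingbot') !== false) {
        return 'Bing';
    }
    if (strpos($useragent, 'slurp') !== false) {
        return 'Yahoo';
    }
    if (strpos($useragent, 'sosospider') !== false) {
        return 'Soso';
    }
    if (strpos($useragent, 'sogou spider') !== false) {
        return 'Sogou';
    }
    if (strpos($useragent, 'yodaobot') !== false) {
        return 'Yodao';
    }
    // Not a recognized crawler
    return false;
}

// Current time in the format used for log entries
function nowtime()
{
    $date = date("Y-m-d.G:i:s");
    return $date;
}

$searchbot = get_naps_bot();
if ($searchbot) {
    $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']);
    // The Referer header may be missing; kept from the original but not written to the log
    $url = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
    $file = "www.gxlcms.com.txt";
    $time = nowtime();
    $data = fopen($file, "a");
    // Note: the "URL" field holds the User-Agent string
    fwrite($data, "Time:$time robot:$searchbot URL:$tlc_thispage\n");
    fclose($data);
}
?>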
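
The chain of if statements in get_naps_bot() can also be written as a lookup table, which makes adding a crawler a one-line change. This is only a sketch: the function name detect_bot and its parameter are hypothetical, and the signatures are the ones from the code above plus bingbot, which Bing's current crawler sends.

```php
<?php
/**
 * Hypothetical table-driven variant of get_naps_bot().
 * Takes the user agent as a parameter so it can be tested without $_SERVER.
 */
function detect_bot($userAgent)
{
    // Lowercase signature => engine name
    $signatures = array(
        'googlebot'    => 'Google',
        'baiduspider'  => 'Baidu',
        'msnbot'       => 'Bing',
        'bingbot'      => 'Bing',
        'slurp'        => 'Yahoo',
        'sosospider'   => 'Soso',
        'sogou spider' => 'Sogou',
        'yodaobot'     => 'Yodao',
    );

    $userAgent = strtolower((string) $userAgent);
    foreach ($signatures as $needle => $name) {
        if (strpos($userAgent, $needle) !== false) {
            return $name;
        }
    }
    // No signature matched
    return false;
}
```

It could be called as detect_bot(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : ''). Since the User-Agent header is supplied by the client, a match only shows what the visitor claims to be.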
