
Detecting a robot

Hi everyone,
I'm building a website, a catalog (categories and subcategories), and for each category and subcategory I want to store when the Google and Seznam robots last visited it.
A friend recommended using $_SERVER['HTTP_USER_AGENT'].
I want to store each robot visit in MySQL, i.e. the type of robot and the time it was here.
Does anyone have a script like that, or something similar?
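
A minimal sketch of the detection step, assuming only the two robots named in the question need to be recognised. The function name `detect_robot` and the labels `google` and `seznam` are hypothetical, chosen for illustration. Keep in mind that the User-Agent header is sent by the client and can be spoofed.

```php
<?php
// Hypothetical helper: maps a User-Agent string to a short robot label.
// The substring list and the labels are assumptions made for this sketch.
function detect_robot($userAgent)
{
    $robots = array(
        'googlebot' => 'google',
        'seznambot' => 'seznam',
    );
    foreach ($robots as $needle => $label) {
        // stripos() is a case-insensitive substring search.
        if (stripos($userAgent, $needle) !== false) {
            return $label;
        }
    }
    return null; // not one of the robots we track
}

// Typical call site; the header is optional, so default to an empty string.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$robot = detect_robot($userAgent);
```

A `null` result means an ordinary visitor (or an untracked robot), so nothing needs to be stored for that request.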
13. 11. 2008 16:49:56
https://webtrh.cz/diskuse/zjisteni-robota#reply159356
ondrej.baar
13. 11. 2008 17:06:14
This has already been discussed here:
http://webtrh.cz/17491-neznamy-prilis-aktivni-bot?p=132456
It should be enough to adapt this:
<?php
/**
* customizedscripts.biz Search Engine Tool
* version 1.0
* Copyright (C) 2006 Peter Soper http://www.customizedscripts.biz/
* PHP script to log search engines spider visits.
* FREE TO USE BUT DO NOT REMOVE COPYRIGHT NOTICES OR CHANGE ANYTHING
* EMAIL info@customizedscripts.biz for customisation or any info
*/
/**
* for email reports place your email between the quotes (optional)
*/
$email = "tvuj@mail.sem";
/**
* Location of the log file (optional)
*/
$log = "/bots.log";
/**
* Date/Time format
*/
$dateTime = date("r");
// DO NOT MODIFY BELOW THIS //////////////////////////////////////////
$agents[] = 'scooter';
$spiders[] = "Scooter (Altavista's robot www.altavista.com)";
$agents[] = 'googlebot';
$spiders[] = 'Google';
$agents[] = 'slurp';
$spiders[] = "Slurp (Inktomi's robot, HotBot)";
$agents[] = 'webmoose';
$spiders[] = "Webmoose (MSN.com's robot, www.msn.com)";
$agents[] = 'gulliver';
$spiders[] = "Gulliver (Northern Light's robot, www.northernlight.com)";
$agents[] = 'lycos';
$spiders[] = 'Lycos www.lycos.com';
$agents[] = 'wombat';
$spiders[] = "WebWombat's robot, www.webwombat.com.au";
$agents[] = 'infoseek';
$spiders[] = 'Infoseek www.infoseek.com';
$agents[] = 'askjeeves';
$spiders[] = 'Askjeeves www.askjeeves.com';
$agents[] = 'freecrawl';
$spiders[] = "Free Crawl (Euroseek's robot, www.euroseek.com)";
$agents[] = 'robozilla';
$spiders[] = "Robozilla (DMOZ's Directory link checker robot, www.dmoz.com)";
$agents[] = 'zyborg';
$spiders[] = "ZyBorg (WiseNut's robot, www.wisenutbot.com)";
$agents[] = 'Gigabot';
$spiders[] = 'Gigabot, www.gigablast.com';
$agents[] = 'Ask Jeeves/Teoma';
$spiders[] = 'Ask Jeeves/Teoma, www.teoma.com, www.askjeeves.com';
$agents[] = 'grub-client';
$spiders[] = 'Grub (Looksmart Grub client robot, www.grub.org)';
$agents[] = 'linkwalker';
$spiders[] = 'Linkwalker (SevenTwentyFour link checker robot, www.seventwentyfour.com)';
$agents[] = 'ia_archiver';
$spiders[] = "Internet Archive (Alexa & WayBackMachine's robot, www.archive.org, www.alexa.com)";
$agents[] = 'TurnitinBot';
$spiders[] = 'TurnitinBot (Anti-Plagiarism robot, www.turnitin.com)';
$agents[] = 'atSpider';
$spiders[] = 'atSpider (Email Collector/Spam)';
$agents[] = 'autoemailspider';
$spiders[] = 'autoemailspider (Email Collector/Spam)';
$agents[] = 'cherrypicker';
$spiders[] = 'cherrypicker (Email Collector/Spam)';
$agents[] = 'DSurf';
$spiders[] = 'DSurf (Email Collector/Spam)';
$agents[] = 'DTS Agent';
$spiders[] = 'DTS Agent (Email Collector/Spam)';
$agents[] = 'EliteSys Entry';
$spiders[] = 'EliteSys Entry (Email Collector/Spam)';
$agents[] = 'EmailCollector';
$spiders[] = 'EmailCollector (Email Collector/Spam)';
$agents[] = 'EmailSiphon';
$spiders[] = 'EmailSiphon (Email Collector/Spam)';
$agents[] = 'EmailWolf';
$spiders[] = 'EmailWolf (Email Collector/Spam)';
$agents[] = 'Mail Sweeper';
$spiders[] = 'Mail Sweeper (Email Collector/Spam)';
$agents[] = 'msnbot';
$spiders[] = 'MSN Robot (MSN Search, search.msn.com)';
$agents[] = 'whatuseek';
$spiders[] = 'What You Seek';
$agents[] = 'yahoo! slurp';
$spiders[] = 'Yahoo! Slurp';
$agents[] = 'seznambot';
$spiders[] = 'Seznam Bot';
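$found = false;
// The User-Agent header is optional, so fall back to an empty string.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
for ($spi = 0; $spi < count($agents); $spi++) {
    // Case-insensitive substring match; eregi() was removed in PHP 7.
    if ($found = (stripos($userAgent, $agents[$spi]) !== false)) {
        break;
    }
}
if ($found) {
    // Rebuild the requested URL from host, script path and query string.
    $url = "http://" . $_SERVER['SERVER_NAME'] . $_SERVER['PHP_SELF'];
    if (!empty($_SERVER['QUERY_STRING'])) {
        $url .= '?' . $_SERVER['QUERY_STRING'];
    }
    $line = $dateTime . " " . $spiders[$spi] . " @ " . $url;
    if ($log != "") {
        if (@file_exists($log)) {
            $mode = "a";
        } else {
            $mode = "w";
        }
        if ($f = @fopen($log, $mode)) {
            @fwrite($f, $line . "\n");
            @fclose($f);
        }
    }
    if ($email != "") {
        // Mail headers are separated by CRLF.
        $headers = "From: <$email>\r\n";
        $headers .= "X-Sender: <$email>\r\n";
        $headers .= "X-Mailer: Search Engine Bot detector\r\n";
        $headers .= "X-Priority: 3\r\n";
        $subject = $spiders[$spi] . " crawled your site";
        @mail($email, stripslashes($subject), wordwrap(stripslashes($line)), $headers);
    }
}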
?>
https://webtrh.cz/diskuse/zjisteni-robota#reply159355
Do you want someone to write it for you, or advice on how to do it? :-)
13. 11. 2008 17:08:13
https://webtrh.cz/diskuse/zjisteni-robota#reply159354
ondrej.baar
13. 11. 2008 17:13:45
I also found this:
<?php
// Robot Logger by Jakub Stacho
// based on Web Browser Identifier v0.7
// Written by Marcin Krol
// License: free for non-commercial use
function get_robot_name($user_agent) {
    $robot = ''; // empty string means no known robot matched
    // Version patterns use [0-9.]+ (assumed; it matches e.g. "Googlebot/2.1").
    // Googlebot
    if (preg_match('/Googlebot\/([0-9.]+).*/s', $user_agent)) {
        $robot = "Googlebot";
    }
    // Googlebot Image
    if (preg_match('/Googlebot-Image\/([0-9.]+).*/s', $user_agent)) {
        $robot = "Googlebot Image";
    }
    // Gigabot
    if (preg_match('/Gigabot\/([0-9.]+).*/s', $user_agent)) {
        $robot = "Gigabot";
    }
    // W3C Validator
    if (preg_match('/^W3C_Validator\/([0-9.]+)$/s', $user_agent)) {
        $robot = "W3C Validator";
    }
    // W3C CSS Validator
    if (preg_match('/W3C_CSS_Validator_[A-Za-z]+\/([0-9.]+)$/s', $user_agent)) {
        $robot = "W3C CSS Validator";
    }
    // MSN Bot
    if (preg_match('/msnbot(-media|)\/([0-9.]+).*/s', $user_agent)) {
        $robot = "MSNBot";
    }
    // Psbot
    if (preg_match('/psbot\/([0-9.]+).*/s', $user_agent)) {
        $robot = "Psbot";
    }
    // IRL crawler
    if (preg_match('/IRLbot\/([0-9.]+).*/s', $user_agent)) {
        $robot = "IRL crawler";
    }
    // Seekbot
    if (preg_match('/Seekbot\/([0-9.]+).*/s', $user_agent)) {
        $robot = "Seekport Robot";
    }
    // Microsoft Research Bot
    if (preg_match('/^MSRBOT /s', $user_agent)) {
        $robot = "MSRBot";
    }
    // cfetch/voyager
    if (preg_match('/^(cfetch|voyager)\/([0-9.]+)$/s', $user_agent)) {
        $robot = "voyager";
    }
    // BecomeBot
    if (preg_match('/BecomeBot\/([0-9.]+).*/si', $user_agent)) {
        $robot = "BecomeBot";
    }
    // Alexa
    if (preg_match('/^ia_archiver$/s', $user_agent)) {
        $robot = "Alexa";
    }
    // Inktomi Slurp
    if (preg_match('/Slurp.*inktomi/s', $user_agent)) {
        $robot = "Inktomi Slurp";
    }
    // Yahoo Slurp
    if (preg_match('/Yahoo!.*Slurp/s', $user_agent)) {
        $robot = "Yahoo! Slurp";
    }
    // Ask.com
    if (preg_match('/Ask Jeeves\/Teoma/s', $user_agent)) {
        $robot = "Ask.com";
    }
    // Seznam.cz
    if (preg_match('/SeznamBot/s', $user_agent)) {
        $robot = "Seznam Bot";
    }
    // Jyxo.cz
    if (preg_match('/Jyxobot/s', $user_agent)) {
        $robot = "Jyxo Bot";
    }
    // Centrum.cz
    if (preg_match('/^holmes/s', $user_agent)) {
        $robot = "Centrum Bot";
    }
    return $robot;
}
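function log_robot($robot) {
    $page = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
    // mysqli replaces the mysql_* functions, which were removed in PHP 7.
    $db = mysqli_connect('127.0.0.1', 'user', 'password', 'database');
    if (!$db) {
        return;
    }
    // REQUEST_URI is client-controlled, so bind it instead of interpolating it.
    // NULL assumes the first column of `robots` is an AUTO_INCREMENT id.
    $stmt = mysqli_prepare($db, "INSERT INTO robots VALUES (NULL, ?, ?, NOW())");
    mysqli_stmt_bind_param($stmt, 'ss', $robot, $page);
    mysqli_stmt_execute($stmt);
    mysqli_close($db);
}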
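// The User-Agent header is optional, so fall back to an empty string.
$user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$robot_name = get_robot_name($user_agent);
if ($robot_name) log_robot($robot_name);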
?>
http://webtrh.cz/13601-nekdo-skript-detekci-botu
By the way, that took about 2 to 3 minutes of searching; start considering whether it's better to search or to ask...
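
The logger in the script above adds a new row on every hit, while the original question asks only for the last visit per category. One possible variant is sketched here, assuming PDO and a table named `robot_visits` with a composite primary key; the table, its columns and both function names are hypothetical. `REPLACE INTO` is MySQL syntax that overwrites the row with the same key.

```php
<?php
// Sketch only: the table and column names are assumptions, not from the thread.
//
//   CREATE TABLE robot_visits (
//       category_id INT NOT NULL,
//       robot       VARCHAR(32) NOT NULL,
//       last_visit  DATETIME NOT NULL,
//       PRIMARY KEY (category_id, robot)
//   );

function record_robot_visit(PDO $db, $categoryId, $robot)
{
    // REPLACE INTO overwrites the row with the same primary key, so the table
    // holds exactly one "last visit" per (category, robot) pair.
    $stmt = $db->prepare(
        'REPLACE INTO robot_visits (category_id, robot, last_visit) VALUES (?, ?, ?)'
    );
    $stmt->execute(array($categoryId, $robot, date('Y-m-d H:i:s')));
}

function last_robot_visit(PDO $db, $categoryId, $robot)
{
    $stmt = $db->prepare(
        'SELECT last_visit FROM robot_visits WHERE category_id = ? AND robot = ?'
    );
    $stmt->execute(array($categoryId, $robot));
    $value = $stmt->fetchColumn(); // false when no row matches
    return $value === false ? null : $value;
}
```

With a connection such as `new PDO('mysql:host=127.0.0.1;dbname=database', 'user', 'password')`, the category page would call `record_robot_visit($db, $categoryId, $robot)` whenever a tracked robot is detected.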
https://webtrh.cz/diskuse/zjisteni-robota#reply159353
Sorry about that... it didn't occur to me at all to search first. I rushed into it.
But thank you very much for your time.
13. 11. 2008 17:35:36
https://webtrh.cz/diskuse/zjisteni-robota#reply159352
ondrej.baar
13. 11. 2008 17:44:42
No problem, may the scripts serve you well...
https://webtrh.cz/diskuse/zjisteni-robota#reply159351
MzK
14. 11. 2008 08:25:02
Instead of a database I would write it to a file; that is simpler for a PHP beginner.
https://webtrh.cz/diskuse/zjisteni-robota#reply159350
ondrej.baar
14. 11. 2008 13:28:41
I disagree. Searching in a file? For a beginner? So they have to remember all the whitespace and so on? Better to learn a database quickly and see what its advantages are, and to store in files at most long XML documents... generated whenever the db changes .o)
The file approach really isn't a good idea.
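
For comparison, this is roughly what the flat-file variant discussed in the last two posts could look like. It is a sketch under assumptions: both function names and the tab-separated line format (time, robot, page) are hypothetical. Writing is a single call, but every lookup has to scan and parse the whole file, which is the cost the objection above points at.

```php
<?php
// Hypothetical helpers illustrating the flat-file variant; the tab-separated
// line format (time, robot, page) is an assumption made for this sketch.

function log_visit_to_file($path, $robot, $page)
{
    $line = date('Y-m-d H:i:s') . "\t" . $robot . "\t" . $page . "\n";
    // FILE_APPEND adds to the end; LOCK_EX guards against interleaved writes.
    return file_put_contents($path, $line, FILE_APPEND | LOCK_EX) !== false;
}

function last_visit_from_file($path, $robot, $page)
{
    $last = null;
    if (!is_readable($path)) {
        return $last;
    }
    // Every lookup reads the whole file; a database index avoids this scan.
    foreach (file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        $fields = explode("\t", $line);
        if (count($fields) === 3 && $fields[1] === $robot && $fields[2] === $page) {
            $last = $fields[0]; // later lines overwrite earlier ones
        }
    }
    return $last;
}
```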
https://webtrh.cz/diskuse/zjisteni-robota#reply159349