Search Engine Spider Ping Tool
Last night I got to thinking about some search engine possibilities and creative ways to tackle a couple of problems I am having. All of my ideas basically boiled down to needing more research data, and I wanted to try a couple of experiments myself.
In the end I got an idea that I decided could actually help me quite a bit in terms of investigating things. I am not so sure if it will be an actionable tool, but hey, maybe you guys can expand on the idea and make it awesome.
I want a web-based app that simply pings me every time a search engine spider visits my site. I don’t want a notification for each page, just a single notification covering when the spider arrived, how long it stayed, how many pages it crawled, and when it left. I am sure this can be done with a bit of PHP and a look at the server logs, but I don’t have the skills to build this myself.
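As a rough sketch of the server-log approach, here is one way the arrival time, stay length, page count and departure time could be pulled out of an access log. It is written in Python rather than PHP purely for illustration. It assumes the server writes the common Apache/Nginx "combined" log format, that a spider can be recognised by a substring of its user agent, and that 30 minutes of silence means the spider has left. The names `BOT_MARKERS`, `identify_bot` and `summarize_visits` are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Assumed user-agent substrings; adjust to the crawlers you care about.
BOT_MARKERS = {"googlebot": "Google", "bingbot": "Bing", "msnbot": "MSN", "slurp": "Yahoo"}

# Assumes the "combined" access log format.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def identify_bot(agent):
    """Return a crawler name if the user agent contains a known marker, else None."""
    lowered = agent.lower()
    for marker, name in BOT_MARKERS.items():
        if marker in lowered:
            return name
    return None


def summarize_visits(lines, idle_gap=timedelta(minutes=30)):
    """Group crawler hits into visits; a new visit starts after idle_gap of silence."""
    visits = []
    open_visits = {}  # crawler name -> the visit currently being extended
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        bot = identify_bot(match["agent"])
        if bot is None:
            continue  # ordinary visitor, not a known spider
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        visit = open_visits.get(bot)
        if visit is None or when - visit["left"] > idle_gap:
            visit = {"bot": bot, "arrived": when, "left": when, "pages": 0}
            visits.append(visit)
            open_visits[bot] = visit
        visit["left"] = when
        visit["pages"] += 1
    return visits
```

Each entry in the returned list is one visit, so a single notification per visit could be built from `arrived`, `left`, `left - arrived` and `pages`. Actually sending the ping (email, SMS, and so on) is left out here.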
What got me thinking about some other possibilities was a little tool that I stumbled upon yesterday. It is called Serperture and seems to be an “SEO reporting and analysis tool”. There is pretty much only a landing page up at the moment, but it looks like it could be a really useful tool for SEOs like me out there. The more tools and scripts I have at my disposal, the better.
What do you guys think about that little script? Any other ways that you could use it or expand on it?
- Captain Awesome
Comments
Rocking! I will definitely play around with that later today. Yeah, like you say, I am sure I can find the extra info I need in analytics. As an actual tool it would rock to have all of that together, but that script does what I want it to do :) Thanks!
Elaborate on your thoughts as to what other information would be useful and I can try to build on top of it. Tracking the pages could, I guess, work by storing the agent in a session, then adding the URL of each page to that session as the crawler moves around. That being said, that’s a bit of a mission to write and I’m not sure I have the motivation to do that ;)
Any other cool ideas for the script?
PS. I’ll probably turn it into a WordPress plugin with some options.
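For anyone curious what the session idea might look like, here is a minimal sketch in Python (not the script discussed in this thread). It assumes sessions are keyed by the user agent string rather than by a cookie, since crawlers often do not send cookies back, and that a session ends after a period of silence. `CrawlerSessions` and `record_hit` are hypothetical names, and everything lives in memory only.

```python
import time


class CrawlerSessions:
    """Keeps one open session per crawler user agent (in memory only)."""

    def __init__(self, idle_seconds=1800):
        self.idle_seconds = idle_seconds
        self.sessions = {}  # user agent -> {"started", "last_seen", "urls"}

    def record_hit(self, agent, url, now=None):
        """Call once per crawler request.

        Returns the previous session for this agent if it had gone idle
        (so the caller can send one summary notification), else None.
        """
        now = time.time() if now is None else now
        finished = None
        session = self.sessions.get(agent)
        if session is not None and now - session["last_seen"] > self.idle_seconds:
            finished = self.sessions.pop(agent)  # old visit is over
            session = None
        if session is None:
            session = {"started": now, "last_seen": now, "urls": []}
            self.sessions[agent] = session
        session["last_seen"] = now
        session["urls"].append(url)
        return finished
```

One limitation of this design is that a finished session is only reported when the same crawler comes back; a scheduled job that sweeps idle sessions would be needed to report it sooner.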



I saw you mention the ping tool on Twitter earlier, and it is indeed possible with some PHP to deduce whether the “person” on the site is an engine or not. The crawling of pages, time spent on site and so forth is a little trickier. That being said, Google Analytics can provide that information; at least you could drill down to engine visits, pages visited on average and so forth. I would imagine that Google’s Real-time stats could do something similar in real time, though I’m not sure if reporting is available for that.
OK, so anyway, I decided to write a little script to check when Google, MSN or Yahoo visits a website. It was quite straightforward in the end, and it’s a start.
Here’s the code: http://pastebin.com/aP5RBNZm
What do you think?
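The pastebin script is not reproduced here. As a rough illustration of the kind of check described in this comment, the sketch below (in Python, not PHP) matches the request's user agent against substrings that the Google, MSN and Yahoo spiders have used. The `ENGINE_MARKERS` table and `detect_engine` function are made up for this example, and the substrings are assumptions that may need updating.

```python
# Assumed user-agent substrings for the three engines named above.
ENGINE_MARKERS = {
    "googlebot": "Google",
    "msnbot": "MSN",
    "slurp": "Yahoo",
}


def detect_engine(user_agent):
    """Return the engine name if the user agent looks like a known spider, else None."""
    lowered = (user_agent or "").lower()
    for marker, engine in ENGINE_MARKERS.items():
        if marker in lowered:
            return engine
    return None


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(detect_engine(sample))
```

Keep in mind that a user agent string can be faked by any client, so a match on its own only suggests that the visitor is a search engine spider.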