Monday, November 17, 2008

Thumbs Up for Sphider

The model club site that I run has been steadily growing. We have been adding lots of great content - articles, tips and photos, lots of photos. Our gallery now contains nearly 5000 of them. I am using a gallery application called Coppermine (PHP front-end, MySQL back-end). It's very nice. Each photo has a title and many have more detailed descriptions.

With all this content I felt that the site could really benefit from a comprehensive search mechanism. Coppermine has its own search, which works fine, but I wanted a way to search the whole site in one go.

My first thought was to use Google Custom Search, which I had implemented with some success on another site. I was able to implement it on the club site without any trouble. The issue that I had was getting it to re-index in a timely fashion when I made changes. I decided that I wanted a mechanism that gave me more control over the indexing. As I have no budget for the club site, I also wanted something that was free.

I found no shortage of free search engines out there and tried a few. But the problem I kept running into was that the free versions had a limit on the number of pages they would index. The limit was high - usually several thousand - but I kept exceeding it because of the photo gallery. Nearly 5000 photos, each of which gets indexed as its own page, plus gallery sub-area pages and so on. That ends up being a lot of pages to index.

I finally found a search engine that I could run locally on my site that was free and had no page limit - Sphider. Sphider uses PHP and requires a MySQL database to store its indexes. It is really quite nice. Not only can you re-index at will, you can choose to index just certain parts of your site by setting up "sites" in the admin panel that limit their inclusion to just certain areas. This was especially useful for me because I often want to re-index everything except the gallery, which is pretty time-consuming to index due to its sheer size.

It took me a little time to get the filters right for indexing the gallery. I had to keep it from indexing certain ancillary pages that had no business showing up in a search result. But Sphider has some decent include/exclude filtering mechanisms to facilitate that. It also respects any directives in your robots.txt file.
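To give a feel for how this kind of filtering works, here is a rough Python sketch of the idea. It is not Sphider's code (Sphider is written in PHP), and the plain substring matching, the function names, the example.org URLs and the specific gallery page names are all my own assumptions for illustration. The robots.txt check uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser


def should_index(url, must_include=(), must_not_include=()):
    """Return True if the URL passes the include/exclude rules.

    Assumption: rules are plain substrings; a real indexer may match differently.
    """
    # Any exclude rule that matches rejects the URL outright.
    if any(pattern in url for pattern in must_not_include):
        return False
    # If include rules are given, at least one of them has to match.
    if must_include and not any(pattern in url for pattern in must_include):
        return False
    return True


# Hypothetical robots.txt content, parsed from lines instead of fetched.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /gallery/login.php",
]

robots = RobotFileParser()
robots.parse(ROBOTS_LINES)


def crawlable(url, must_include=(), must_not_include=()):
    """Combine the robots.txt check with the include/exclude rules."""
    return robots.can_fetch("*", url) and should_index(
        url, must_include, must_not_include
    )


# Hypothetical rules: index the gallery but skip ancillary pages.
INCLUDE = ["/gallery/"]
EXCLUDE = ["ecard.php", "ratepic.php"]

for candidate in [
    "http://example.org/gallery/displayimage.php?pos=-12",
    "http://example.org/gallery/ecard.php?pid=12",
    "http://example.org/gallery/login.php",
    "http://example.org/articles/tips.html",
]:
    print(candidate, crawlable(candidate, INCLUDE, EXCLUDE))
```

Checking the exclude rules first means a page that matches both lists stays out of the index, which is the behavior I wanted for the ancillary gallery pages.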

It provides some nice statistics on what search terms your visitors are entering, most popular searches and so on.

Implementation was fairly easy. It uses a template with a header, footer, etc., which gives you enough flexibility to make it a seamless part of your site. Once your MySQL database is in place, you just pass Sphider's admin panel your db user credentials and it takes it from there.

All in all, really not bad. I have had it in place for about 2 months now and it seems to work really well.
