Searching Again #announcement fstdt.com blog

So I have a much more solid idea of what needs to be done for the search, since I've been actively playing with the mechanics involved. I've found something surprising: it actually works better than I expected.

Say I search for David J. Stewart. The new functionality I'm using would find that term, or David J Stewart, or David Jay Stewart. The new search functionality I'm looking at has some built-in thesaurus comparisons it will make. While it won't automatically look for everything starting with a J, it will look for anything that actually sounds like J. And this is all accomplished using one of the simpler syntax possibilities.

Which brings me to the next part: the difference between what I can do and what I probably should. This is the definition for the term I'm using; you may note it's a tad complicated, and, well, it gets a whole lot worse with more search terms. If I took the time and had the knowledge, I could set up all kinds of potentially useful syntax on the search page. Except I'm a lazy git, despite having changed the name of this little bloggy bit.

What I'm most likely going to do is set up the search interface so that the given search parameters are handled appropriately based on their type. Which is easier said than done. For instance, that David J. Stewart example above has some problems depending on the syntax used. If I search for all three parts separately by sending '"David" AND "J" AND "Stewart"' into the contains function, so that it tries to find each word in the author field, it returns nothing. I'm still figuring out why, but if I just search for david and stewart it returns everything. That was a few hours of my week. But oddly enough, if I send in '"David J Stewart"' I get the results I mentioned above, where it checks thesauruses and is generally really bloody useful.
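To make the two forms concrete, here's a rough sketch of building each search string from the same input. The helper names are made up for illustration, and stripping stray double quotes out of a word is my own assumption about how to keep the string well-formed, not something the search engine requires.

```python
def quote_term(word: str) -> str:
    """Wrap one word in double quotes, dropping any quotes inside it (assumed cleanup)."""
    return '"' + word.replace('"', "") + '"'


def as_keywords(text: str) -> str:
    """Keyword form: every word is its own quoted term, joined with AND."""
    return " AND ".join(quote_term(w) for w in text.split())


def as_phrase(text: str) -> str:
    """Phrase form: the whole input becomes a single quoted term."""
    return quote_term(" ".join(text.split()))


print(as_keywords("David J Stewart"))  # "David" AND "J" AND "Stewart"
print(as_phrase("David J Stewart"))    # "David J Stewart"
```

Same three words in, two very different searches out, which is why the interface has to decide which one a given field gets.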

But there's still a catch: all that nice functionality involves very carefully deciding how something is searched. When I send it all in wrapped in "" it's looked for as a phrase, which is fine for authors or sources, maybe even URLs, but completely useless for searching the content of quotes. Wait, it gets even better. This search relies on full-word matches unless it's explicitly told that what it's receiving is a prefix, and then it requires that the term MUST be a prefix or nothing will be found. Send in Test* and you get back Testing but not Test. All of which has been fun to play with, but frustrating to figure out how to really implement.
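One possible way around the Test* problem, sketched below, is to ask for the exact word and the prefix at the same time. This assumes the search syntax accepts a quoted prefix term and an OR between terms; I've only described AND above, so treat the OR as a guess, and the function names as placeholders.

```python
def as_prefix(word: str) -> str:
    """Build a quoted prefix term, e.g. Test or Test* becomes "Test*"."""
    stem = word.replace('"', "").rstrip("*")
    return f'"{stem}*"'


def exact_or_prefix(word: str) -> str:
    """Ask for the bare word OR anything starting with it (OR support is assumed)."""
    stem = word.replace('"', "").rstrip("*")
    return f'"{stem}" OR "{stem}*"'


print(as_prefix("Test*"))        # "Test*"
print(exact_or_prefix("Test*"))  # "Test" OR "Test*"
```

If that holds up, a search for Test* would bring back both Test and Testing instead of only the latter.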

What I could do right now is just always assume people are searching for phrases, but I know that's not true. I do keyword searches regularly, and I know a few others did back when search worked reliably. What I need to do is figure out how I'm going to parse what is received in a quote or comment search and how best to instruct the search engine to handle it. I doubt people want to have to figure out this syntax for themselves, so I'm going to do what I can to simplify it a bit and make the sensible assumptions for them. This is my project for the next few days, and I'll poke my head back up when I have something to mention.
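Here's a minimal sketch of the sort of parsing I have in mind, not the final thing. The rules are my working assumptions: anything the user wraps in double quotes is a phrase, a word ending in * is a prefix, every other word is a plain keyword, and everything gets joined with AND. The function name is just a placeholder.

```python
import re

# Either a double-quoted chunk, or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def build_condition(user_input: str) -> str:
    """Turn what the user typed into a single AND-joined search string."""
    parts = []
    for phrase, word in TOKEN.findall(user_input):
        if phrase.strip():
            # Quoted by the user: keep it together as one phrase.
            parts.append('"' + " ".join(phrase.split()) + '"')
        elif word:
            cleaned = word.replace('"', "")  # drop unbalanced quotes
            if cleaned.endswith("*"):
                stem = cleaned.rstrip("*")
                if stem:
                    parts.append(f'"{stem}*"')  # prefix term
            elif cleaned:
                parts.append(f'"{cleaned}"')    # plain keyword
    return " AND ".join(parts)


print(build_condition('"David J Stewart" evolution creat*'))
# "David J Stewart" AND "evolution" AND "creat*"
```

The point is that nobody has to learn the syntax: type words for a keyword search, put quotes around anything that should stay together, and the page does the rest.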
