Allow search for one-character words

Started by Sir Osis of Liver, July 12, 2022, 05:22:35 PM

Sir Osis of Liver

For reasons unknown, forum search does not allow one character words, such as the article 'a'.  Makes it difficult to search for book titles.  How to fix this?
Ashes and diamonds, foe and friend,
 we were all equal in the end.

                                     - R. Waters

Arantor

You basically can't.

Depending on exactly how search is configured, that might literally require rewriting parts of MySQL *itself* to change.

There are two reasons, both of which you'll argue are stupid, but for the sake of sanity I will explain anyway.

Firstly, regardless of which search system you're using, searching for a single character can and will match way too many things to be of any use. You're searching for a title of a book. Let's say, I don't know, first thing I see on my bookshelf, the James Bond story "View to a Kill". Contains the word 'a'. If you're not treating things as words (like the non-index method does), that matches many many many things - vastly too many to be of any use to anyone (just count how many a's matched in the last few sentences alone). It'd match half the forum if not more. As a result, one character matches are excluded.
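
To put rough numbers on that, here is a minimal Python sketch (an illustration only, not SMF's or MySQL's actual code) that compares substring matching with whole-word matching for the letter 'a'. The sample text and the function names `substring_hits` and `word_hits` are made up for this example.

```python
import re

# Sample text adapted from the paragraph above (an assumption for illustration).
SAMPLE = ("You're searching for a title of a book. Let's say the James Bond "
          "story \"View to a Kill\".")


def substring_hits(text, needle):
    """Count every occurrence of needle, even inside longer words."""
    return text.lower().count(needle.lower())


def word_hits(text, needle):
    """Count only standalone words that are exactly equal to needle."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for w in words if w == needle.lower())


if __name__ == "__main__":
    print("substring hits:", substring_hits(SAMPLE, "a"))
    print("whole-word hits:", word_hits(SAMPLE, "a"))
```

Even on two short sentences the substring count is higher than the whole-word count, and the gap grows with the size of the text being searched.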

Secondly, if you are using one of the more complex search methods, they all exclude words below a minimum letter count. All of them, whether that's MySQL fulltext, SMF custom index, Sphinx, Elasticsearch, they all start at either 3 or 4 letters as the minimum size of a word. Mostly to eliminate the things that are either noise (because they're not words, e.g. someone mentions the ship from Star Trek, the NCC-1701-A... there's an A that isn't part of a word, or is it?) or so frequently mentioned in text that searching on it would be mostly useless.
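
The effect of a minimum word length can be sketched in a few lines of Python. This is a toy tokenizer, not the code of any of the engines named above; the threshold of 3 and the names `MIN_WORD_LEN` and `index_terms` are assumptions chosen for the example.

```python
import re

# Assumed threshold for illustration; real engines use their own configured value.
MIN_WORD_LEN = 3


def index_terms(query, min_len=MIN_WORD_LEN):
    """Split a query into words and separate those shorter than min_len.

    Returns a (kept, dropped) pair of lists, both lowercased.
    """
    words = re.findall(r"[A-Za-z0-9]+", query.lower())
    kept = [w for w in words if len(w) >= min_len]
    dropped = [w for w in words if len(w) < min_len]
    return kept, dropped


if __name__ == "__main__":
    for query in ("View to a Kill", "NCC-1701-A"):
        kept, dropped = index_terms(query)
        print(query, "-> kept:", kept, "dropped:", dropped)
```

With this threshold the 'a' in a book title and the trailing 'A' in NCC-1701-A are both discarded before the search runs, which is the behaviour described above.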

The problem here is that you're doing something that it wasn't *really* designed for and now you're complaining that it doesn't work the way you'd like it to, and there really isn't a good way to fix it other than to implement a search actually geared meaningfully to the actual content you're trying to index.

Sir Osis of Liver

I'm not complaining, I'm trying to understand how it works.  The point of this project is to take a platform that's unsuitable for the purpose it's being used for and modify it so she doesn't have to move to a different platform and redo years of work.  Search works OK with strings containing two-letter words ("Leaves of Grass"), and titles containing 'a' can be searched if placed in quotes.  Maybe I'll add a message to that effect on the search page.

Arantor

Quote from: Sir Osis of Liver on July 12, 2022, 05:46:13 PM
The point of this project is to take a platform that's unsuitable for the purpose it's being used for

I hate to be that guy but maybe that's a red flag from the start?

Sir Osis of Liver

Yes, but it's far too late to change course now.  And it's been an interesting (if exasperating) process so far.

Anyway, here's a usable solution:

You cannot view this attachment.

Sir Osis of Liver

Quote from: Arantor on July 12, 2022, 05:49:24 PM
I hate to be that guy but ....

Does anyone else think that should be the core error message?  Would allow users to successfully complete the search rather than leave them with nothing.

But what do I know.  ::)

Arantor

No, I don't, because that changes what the search actually does. There's a reason the hint text spells it out.

Sir Osis of Liver

But it works, search is successful, and I haven't changed anything other than the error message.

Sir Osis of Liver

Quote from: Arantor on July 12, 2022, 08:04:58 PM
No, I don't because that changes what the search actually does. There's a reason the hint text spells it out.

Don't understand what you're getting at.  There are only two one-letter words in English, 'a' and 'I', and occasionally an integer or other single letter may appear in a book title.  But the point is search fails if there's a single character in the search text, with a message to that effect.  But if you enclose it in quotes to search the exact string, it works, it does exactly what it's supposed to do, and what users expect it to do.  That's pretty much standard for search on most websites.  Don't see how that changes anything.
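
The behaviour described here can be modelled with a small Python sketch. It is a toy model built only from the description in this thread, not SMF's actual search code: the functions `parse_query` and `search`, the two-character minimum, and the error text are all assumptions for illustration.

```python
import re


def parse_query(query):
    """Return (phrases, words): quoted phrases kept whole, the rest split on spaces."""
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    return phrases, remainder.split()


def search(titles, query, min_len=2):
    """Toy search: bare words must be at least min_len long; phrases match exactly."""
    phrases, words = parse_query(query)
    # Assumed rule: a too-short bare word makes the whole search fail.
    if any(len(w) < min_len for w in words):
        raise ValueError("each word must be at least %d characters long" % min_len)
    results = []
    for title in titles:
        low = title.lower()
        if (all(p.lower() in low for p in phrases)
                and all(w.lower() in low for w in words)):
            results.append(title)
    return results


if __name__ == "__main__":
    titles = ["View to a Kill", "Leaves of Grass"]
    print(search(titles, "Leaves of Grass"))
    print(search(titles, '"View to a Kill"'))
```

In this model the unquoted query `View to a Kill` raises the length error because of the bare 'a', while the quoted form is treated as one exact string and matches. Note that the quoted search is an exact-phrase match rather than a word search, so the two forms are not the same kind of query.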
