News:

Want to get involved in developing SMF, then why not lend a hand on our github!

Main Menu

What's wrong with my db_query 2.x code?

Started by Biology Forums, March 04, 2019, 09:09:29 PM

Previous topic - Next topic

Biology Forums

Running this part of my code results in an error:



$topicName_1 = 'the idea that families are systems--more';

$dbresult = $smcFunc['db_query']('', '
SELECT id_msg
FROM {db_prefix}messages
WHERE id_msg <> {int:id_msg} AND {raw:body}',
array(
'id_msg' => 0,
'body' => 'body LIKE "%'.$topicName_1.'%"',
)
);


Error:

Hacking attempt...

SELECT id_msg
FROM `XXX`.smf_messages
WHERE id_msg <> 0 AND body LIKE %the idea that families are systems--more%


What am I doing wrong?

Thanks

vbgamer45

Looks like quotes are stripped.
And -- is a mysql comment
Community Suite for SMF - Take your forum to the next level built for SMF, Gallery,Store,Classifieds,Downloads,more!

SMFHacks.com -  Paid Modifications for SMF

Mods:
EzPortal - Portal System for SMF
SMF Gallery Pro
SMF Store SMF Classifieds Ad Seller Pro

WebSwinger

Keep the actual SQL in the query itself, and pass just the variable itself. This keeps better separation of concerns and doesn't escape things that shouldn't be, as vbgamer45 mentioned.


$topicName_1 = 'the idea that families are systems--more';

$dbresult = $smcFunc['db_query']('', '
SELECT id_msg
FROM {db_prefix}messages
WHERE id_msg <> {int:id_msg}
                                    AND body LIKE "%{raw:like_clause}%"',
array(
'id_msg' => 0,
'like_clause' => $topicName_1,
)
);

Biology Forums

Thank you, fellas...

Need your opinions though, let's say I'm using the query above to search the exact phrase "stays invariant is termed ________, while"

HOWEVER, in the database, it exists as "stays invariant is termed ____, while". Notice the subtle difference, the blank (___) is shorter in the one recorded in the database, and therefore no match is made :(

How would you suggest I tackle this problem?

albertlast

use the fulltext search function or the search api from smf.

Biology Forums

It's my first time working with 2.x, do you mind sharing some examples regarding both suggestions?

albertlast

When you search at the index page,
you will notice that the core function behind this is PlushSearch2 on search.php,
to work with this function would be the api way.

With fulltext search function i mean do direclty the database calls,
which not so easy (reason why plushsearch2 exists).

Biology Forums

QuoteWhen you search at the index page,
you will notice that the core function behind this is PlushSearch2 on search.php,
to work with this function would be the api way.

Could you provide a code-based example, what source files would be required?

Arantor

A decent chunk of Search.php is required, look in the stated function...

Biology Forums

Could anyone point me to a mod that has used the search API just so that I get an idea how this works?

Arantor

The only mod I've ever seen doesn't use this, but does add another backend.

The reason being is that this is all geared to performing searches, something normally done in PlushSearch2(). And it gets significantly more complex in 2.1, just FYI.

But as usual, you start from a really specific point and expect us to extrapolate how to help you - you've started from 'how do I match string' to 'how do I vaguely match string', as opposed to 'I'm trying to do this, and I think the way to do it is vaguely matching strings'. Maybe there's a much better way that will solve what you're trying to do but it's hard to know since all we have is how you think you have to solve the problem to extrapolate from.

Biology Forums

Is there such thing as:

$string = 'Why is the sky blue?'

if( search_function_api($string) == true )
echo 'search results found';
else
echo 'nothing found';


Please don't make it more complicated than it has to be.

albertlast

Why you than don't use the normal search input box?
Or place a different one and call as target the search2 action.

Arantor

Quote from: Study Force on March 05, 2019, 03:42:13 PM
Is there such thing as:

$string = 'Why is the sky blue?'

if( search_function_api($string) == true )
echo 'search results found';
else
echo 'nothing found';


Please don't make it more complicated than it has to be.

It is so much more complicated than that, because it has to be.

Let me give you a quick rundown of search theory (including the underpinnings of how Google does it)

You run a process called indexing; it doesn't matter whether it uses the MySQL full text backend or the custom one or the Sphinx one or something based on Lucene (e.g. ElasticSearch or Solr), they all have the same job: figuring out what words you have.

Why does that matter? Because assuming you don't want to manually check every character by hand every time (in large sites this is complicated and slow), you want to find the relevant words in the relevant order. So what these things do is go through and work out what words you have and builds a list that says, "ok, the word 'the' is word 1, the word 'SMF' is word 2" and then lists that a given post (or document, as it usually is termed) contains words 1, 3, 4, 279 in that order.

This is much faster than looking up words by hand, and you can also do things like stemming (e.g. matching similar words) if you want to. MySQL fulltext certainly does.

Now, why is that relevant? Let's look at your original question, the partial match: "stays invariant is termed ________, while" - none of the systems out of the box will care about your line of underscores. MySQL can't, custom indexes won't, Sphinx could be made to with some serious effort, as I think Elastic could. Solr would need more effort. And remember, you need to remember to exclude commas from being parts of words otherwise you end up with 'words' that end in commas and 'words' that are the same words without commas.

And you're inherently asking the question 'how do I do vague text matching on things that literally aren't words' and you're hoping it can be boiled down into something simple.

This is why I was trying to get to what you're trying to achieve because this ****** is actually hard, and you continue to focus on how you think you have to solve the problem but you don't even understand why your problem is many many times more complex than you realise.

Biology Forums

Quote from: albertlast on March 05, 2019, 03:49:43 PM
Why you than don't use the normal search input box?

I can't because the query is being run on a loop, server-side.

@Arantor -- a simple 'no' would suffice ;D Thanks for the explanation, it made a lot of sense. But going back to my request, I was hoping the "search api" had such function that I exemplified, but unfortunately from the vibe I'm getting, there isn't one.

Arantor

Nor can there ever be one. You want, essentially, a magic function that is Google in a box. It doesn't exist.

Biology Forums

But why not? it seems so simple in theory.

Using the "sky is blue" example above, I want to simulate the SMF search with quotations, see how many exact results come up, record the number of results as a $variable... then argue:

if $variable > 0,
echo 'found';

Honestly all we're doing here is extending the search results to count the number of searches. That's all the function would return.

Arantor

Is all this because you want a number of matches? (Depending on a bunch of things, the answer might be irrelevant as SMF may likely truncate the number of matches it finds)

Please stop dancing around with how you think you need to achieve something, you might actually get somewhere. What exactly are you trying to do with search?

Biology Forums

I'm not dancing around the topic, and in fact, I've been completely transparent. I have a bunch of questions running on a loop, and I'm checking if each question matches those topics already found in the database; if a match is made, the question is skipped from being inserted. I take pieces of the question, e.g. first 4 words and 4 last words and I check using %LIKE%. Clearly it's not the best method, but it does catch them, preventing 90% of duplicates from being entered. That's why I'm looking for a better method; I think how SMF search does it, when the text is quoted produced better results. All I want to do is simulate the search results to see if matches are made.

Arantor

This is the first time in this topic you've actually explained something of what you're trying to achieve.

The problem is, to actually shove this into SMF's code, even if you strip out roughly half of PlushSearch2 (since you don't need to pre-prepare a whole bunch of stuff like ranking), you still need to go through all the hoops needed to make it work.

Looking at PlushSearch2 in SMF 2.0.15, the fun starts at line 318 with db_extend('search'); where it loads the DB driver code needed.

Lines 320-337 deal with loading the $searchAPI instance you need.

344-358 can be skipped, they're pulling in the parameters from URL if provided, but that's not relevant to you.

361-573 handle importing all the values for all the options that can be selected from the search UI (limit to forum, board, topic; min age; max age; post by a given user)

578-590 handle the sorting options; you probably don't care but you will need to at least notionally pick something to hand it off to the search API.

601-774 handle splitting up the entered text into multiple arrays of settings based on the phrase and what options were picked. If nothing else, a subset of this will need to be done.

776-881 handle the spell checker and 'did you mean' which you can skip.

884-944 are for putting things back into the URL for later and setting up CAPTCHA.

980-1673 do the actual search. This is the bit you mostly want to take and carve up assuming you don't want to pollute various DB tables for caching purposes and log_search_results table will give you the results number you actually wanted.

That's a fun day's work to pull out, especially given the many underlying issues you may not actually solve with it - you say you have 90% match rate as is, but changing to this may actually drastically lower it or drastically change the false positive rate depending on 'match all words vs match any words'.

Biology Forums

#20
Thanks for the breakdown.

After reading that, it makes me not want to open up the can of worms :-. Sure wish someone had a function done up already.

Anyway, will work on it eventually and update this thread accordingly

vbgamer45

I have some full text searching in gallery pro or you can check out he related topics mod on the mod site as well.
Community Suite for SMF - Take your forum to the next level built for SMF, Gallery,Store,Classifieds,Downloads,more!

SMFHacks.com -  Paid Modifications for SMF

Mods:
EzPortal - Portal System for SMF
SMF Gallery Pro
SMF Store SMF Classifieds Ad Seller Pro

live627

Maybe you could try fulltext searching in mysql (match... against...)

Arantor

But that requires having the index built. This may not be possible depending on various factors (notably table type).

Advertisement: