Anyone know how to implement Sphinx search on SMF?
http://www.sphinxsearch.com/
Have been searching all over the web without results.
TIA!
Try using the search on the site next time?
(It's the first result when searching for "Sphinx" on the community.)
http://www.simplemachines.org/community/index.php?topic=127672.0
I have tried the search and seen that search result, however, I cannot get into that specific thread.
Every time the site gives me the following message:
An Error Has Occurred!
The topic or board you are looking for appears to be either missing or off limits to you.
So it's hidden in some secret cave?
Sorry, not in a search result, I saw the link in another thread.
But I don't have access to that forum for some reason, big forums or what it is called.
hmmmm..... odd.
Anything I can do to get the information?
Bribe you? ;)
well, it's not my mod package to distribute...
So you don't want to help (or aren't allowed to) with any information on how to do this.
Can you let me know who is the maintainer of the package so I can contact this person?
Hi keptang,
I've forwarded your query to the relevant people. Hopefully they'll respond shortly
Thanks H. :)
Hi Keptang,
It was decided to release this mod (previously we supplied it only to those in the Big Boards section which we grant access to once you've got a forum with 750k+ posts).
Quote:
SMF has always considered search as one of the most important features. Especially when forums grow larger, search becomes more and more important.
Until now, SMF supported two types of indexes: fulltext (using MySQL's own indexing system) and custom (using an index created by SMF and stored on the database). Though for many forums one of these indexes is sufficient, the larger the forum gets, the harder it gets to query the indexes. Not only are there limits to what it can reasonably find within a second, a search query also puts pressure on the database by using resources and locking tables.
With this in mind, Andrew Aksyonoff started his own engine, outside of MySQL: Sphinx (www.sphinxsearch.com). This engine runs as a separate daemon process and provides query results to applications like PHP. A scheduled task retrieves the data from the database and rebuilds the indexes. This engine specializes in full-text search and often returns results a thousand times faster than MySQL.
Impressed as we were with these results, we immediately embraced the technology to see what it could do for SMF. We dived into it, created a script to get the configuration right and updated SMF to support the Sphinx index. Ben's Red and White Kop (http://www.redandwhitekop.com/forum) with 2.4 million messages was the ultimate test. Now that we've got it working there (ask Ben about the results, or try it yourself on his forum), we'd like to share the code with the group that probably needs it the most: the big forum administrators.
Based on your feedback we will improve the scripts and eventually have Sphinx built in as a feature in SMF.
A few notes:
- Please remember that the attached files are still in beta!
- You'd need root access and a few basic admin skills (though we tried to describe the install process in detail)
- The attached file sphinx_config.php contains detailed instructions on how to install Sphinx search for SMF. The script needs to be run from SMF's base dir and SSI.php needs to be present in that same directory.
- Installing and configuring Sphinx search will take about 10-20 minutes (the indexing probably less! 8))
- Sphinx can currently only be used in combination with the SMF 1.1 series
- Sphinx does not currently support phrase search. SMF's search engine will break phrases up into words.
Thanks a lot H! That worked great :)
Glad to hear it worked for you! :D
Is there a version for SMF 1.1.4 and Sphinx 0.9.8-rc1?
The one above will work fine, just ignore any changes sphinx_config.php suggests making to the sphinx files before compiling.
Just to be sure that I understand you clearly:
Can I use sphinx_config.php from this topic with the latest 0.9.8-rc1?
Yes.
I can't find any of the search/find lines in the Sphinx-0.9.8-rc2 src/sphinx.cpp file :(. Will there be a sphinx_config.php for the latest version?
Authors say:
This version is strongly recommended instead of older releases such as 0.9.7.
...
Quote from: Ben_S on March 26, 2008, 09:39:46 AM
The one above will work fine, just ignore any changes sphinx_config.php suggests making to the sphinx files before compiling.
:)
I got this message after running
# indexer --config /usr/local/etc/sphinx.conf --all
Sphinx 0.9.8-rc2 (r1234)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file '/usr/local/etc/sphinx.conf'...
WARNING: key 'strip_html' is deprecated in /usr/local/etc/sphinx.conf line 10; use 'html_strip (per-index)' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 39; use 'sql_attr_uint' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 40; use 'sql_attr_uint' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 41; use 'sql_attr_uint' instead.
WARNING: key 'sql_date_column' is deprecated in /usr/local/etc/sphinx.conf line 42; use 'sql_attr_timestamp' instead.
WARNING: 2 more warnings skipped.
indexing index 'smf_base_index'...
collected 248249 docs, 58.4 MB
sorted 9.6 Mhits, 100.0% done
total 248249 docs, 58379427 bytes
total 17.275 sec, 3379421.83 bytes/sec, 14370.44 docs/sec
indexing index 'smf_delta_index'...
collected 1 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 1 docs, 252 bytes
total 1.682 sec, 149.82 bytes/sec, 0.59 docs/sec
distributed index 'smf_index' can not be directly indexed; skipping.
searchd --config /usr/local/etc/sphinx.conf
Sphinx 0.9.8-rc2 (r1234)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file '/usr/local/etc/sphinx.conf'...
WARNING: key 'strip_html' is deprecated in /usr/local/etc/sphinx.conf line 10; use 'html_strip (per-index)' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 39; use 'sql_attr_uint' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 40; use 'sql_attr_uint' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 41; use 'sql_attr_uint' instead.
WARNING: key 'sql_date_column' is deprecated in /usr/local/etc/sphinx.conf line 42; use 'sql_attr_timestamp' instead.
WARNING: 2 more warnings skipped.
Will this cause any problems?
I don't think that should affect anything as it isn't skipping actual data and should make another index of its own :)
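For reference, the warnings above name the replacement for each deprecated key. Below is a minimal Python sketch of renaming the two source-level keys in a copy of sphinx.conf. The function name is made up for illustration, and strip_html is deliberately left alone: the warning says its replacement, html_strip, is a per-index setting, so it has to be moved by hand rather than renamed in place.

```python
import re

# Mapping taken from the indexer warnings quoted above.
# strip_html is not listed here because its replacement (html_strip)
# belongs in the index section, not the source section.
RENAMED_KEYS = {
    "sql_group_column": "sql_attr_uint",
    "sql_date_column": "sql_attr_timestamp",
}

# Leading whitespace, the key name, then optional whitespace and "=".
_KEY_RE = re.compile(r"^(\s*)(\w+)(\s*=)")


def rename_deprecated_keys(config_text):
    """Return config_text with the deprecated source keys renamed."""
    out = []
    for line in config_text.splitlines():
        m = _KEY_RE.match(line)
        if m and m.group(2) in RENAMED_KEYS:
            # Keep indentation and everything after the key name untouched.
            line = m.group(1) + RENAMED_KEYS[m.group(2)] + line[m.end(2):]
        out.append(line)
    return "\n".join(out)


sample = "sql_group_column = ID_TOPIC\nsql_date_column = posterTime\nstrip_html = 1"
print(rename_deprecated_keys(sample))
```

Run it on a copy of the config and compare the result with the original before replacing the live file.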
I don't have an install button for the mod in Packages...
Sphinx for SMF sphinx_0-9-7-rc2/smf_1-1-1 [ List Files ] [ Delete ]
Is there a version for 1.1.4?
Will changing the version in the xml help?
OK, I modified the xml and installed it, and it works almost fine. Speed is great, but I can't search for letters from my language like: šđčćž
The normal SMF search works fine with those letters.
To search for other letters you need to change charset_table in /usr/local/etc/sphinx.conf
For example:
# 'sbcs' defaults for English and Russian
charset_table = 0..9, A..Z->a..z, _, a..z, \
U+A8->U+B8, U+B8, U+C0..U+DF->U+E0..U+FF, U+E0..U+FF
or
# 'utf-8' defaults for English and Russian
charset_table = 0..9, A..Z->a..z, _, a..z, \
U+410..U+42F->U+430..U+44F, U+430..U+44F
(from documentation for Sphinx - http://sphinxsearch.com/doc.html#conf-charset-type)
PS Modification works great for SMF 1.1.5 with >215k posts! Thanks!
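If the forum data is UTF-8 and charset_type is set to utf-8, the extra letters can be added to charset_table in the same style as the examples above. Here is a small Python sketch that generates the entries for šđčćž. The helper name is hypothetical, and for an sbcs setup the values would instead have to be the byte codes of the forum's single-byte encoding, which this does not compute.

```python
def charset_entries(letters):
    """Build charset_table entries that fold each uppercase letter to lowercase."""
    entries = []
    for ch in letters:
        lower, upper = ch.lower(), ch.upper()
        lower_code = "U+%X" % ord(lower)
        # Map the uppercase form onto the lowercase one when a
        # single-character uppercase form exists.
        if upper != lower and len(upper) == 1:
            entries.append("U+%X->%s" % (ord(upper), lower_code))
        # The lowercase letter itself is indexed as-is.
        entries.append(lower_code)
    return ", ".join(entries)


print(charset_entries("šđčćž"))
# prints: U+160->U+161, U+161, U+110->U+111, U+111, U+10C->U+10D, U+10D, U+106->U+107, U+107, U+17D->U+17E, U+17E
```

The output would be appended to the existing charset_table value, after which the indexes need to be rebuilt for the change to take effect.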
Can anyone post a new mod/sphinx_config.php for newer versions of Sphinx?
There's a small bug in the sphinx_config.php script. If the database password has a # in it, it needs to be escaped with a \. When the Sphinx commands parse the config file, an unescaped # starts a comment, so the rest of the password is dropped. Escaping any # fixes the problem. And no, I'm not changing my password lol
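The escaping described above is a one-liner in any language. Here is a minimal Python sketch, assuming # is the only character that needs attention. The function name is made up, and it should be applied once to the raw password, since running it on an already escaped value would escape it a second time.

```python
def escape_sphinx_value(value):
    """Escape '#' so sphinx.conf does not treat the rest of the line as a comment."""
    return value.replace("#", "\\#")


# Hypothetical password, for illustration only.
print("sql_pass = " + escape_sphinx_value("s3cret#1"))
# prints: sql_pass = s3cret\#1
```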
Okay, I've got it up and running now. Compared to MySQL's unindexed search, holy crap, fast! Thanks for this!
For anyone who wishes to have sphinx start on reboot, here is my init.d script. Put it in /etc/init.d .
To make it start automagically, run 'update-rc.d sphinx defaults' (on Debian... may be different on other OSes).
Quote from: pcigre on January 10, 2009, 05:52:39 AM
Can anyone post a new mod/sphinx_config.php for newer versions of Sphinx?
No changes are necessary. Just modify the package-info.xml file for your version of SMF. Ignore the changes in the "Editing the sources of Sphinx" step.
Will Sphinx 0.9.9-rc1 work or should I stick to 0.9.8.1?
Also, does SMF 2.0 have support for Sphinx and is there a mod for that?
Thanks
Can I use Sphinx on a shared server, or is it only for VPS and dedicated servers?
Quote from: Phalloidium on January 16, 2009, 07:10:33 PM
Quote from: pcigre on January 10, 2009, 05:52:39 AM
Can anyone post a new mod/sphinx_config.php for newer versions of Sphinx?
No changes are necessary. Just modify the package-info.xml file for your version of SMF. Ignore the changes in the "Editing the sources of Sphinx" step.
OK, Near as I can tell I got Sphinx working. I followed the instructions, everything appears to be go. I have the option selected in the CP now, I even grabbed your script. Search works and blazing fast. However (you knew that was comin right?) I've noticed an oddity. It appears selective in how it returns results depending on what you search for.
Example. I search for rose or roses and I get the same results. Rose gets me rose and its derivations. If I search for compost I get nothing. If I search for composts I get results.
EDIT: Do the search settings in the CP work when Sphinx is the engine? Meaning weighting, mainly. What about forcing an index? Should I uncheck that now?
Sphinx is a great search engine. Especially useful for highly searched forums.
Unfortunately, it's pretty useless for frequently updated forums: it doesn't support live indexing, as it's a separate tool. Even with two indexes, one for the main content and the other for recent content, you won't be able to add new topics to the index sooner than 1 hour. If you set up an indexing cron job every 10 minutes you'll kill the server faster than using SMF's native search indexing.
I would think that the content of many forums would make it not that much of an issue even if the search index was only updated once a day, no?
I mean sure, realtime indexing is what everybody would want, but, especially once there's already a ton of content, does it really matter that much in most cases if the very latest posts aren't indexed for a little while?
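For what it's worth, the main/delta split in the sphinx.conf that sphinx_config.php generates (a copy is posted further down this thread) works on message ID ranges: the main source covers IDs from 1 up to a stored watermark (sphinx_indexed_msg_until), the delta source covers from that watermark up to maxMsgID, and each range is fetched in chunks of sql_range_step. Here is a Python sketch of how those chunks fall out, assuming Sphinx steps $start and $end through inclusive intervals as its manual describes. The function name is hypothetical and the ID values are invented.

```python
def range_steps(min_id, max_id, step=1000):
    """Yield the inclusive (start, end) pairs a ranged sql_query would be run with."""
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


# Main index: IDs 1 .. watermark (here 2345), in steps of 1000.
print(list(range_steps(1, 2345)))
# prints: [(1, 1000), (1001, 2000), (2001, 2345)]

# Delta index: watermark .. maxMsgID, i.e. only what arrived since the last main run.
print(list(range_steps(2345, 2350)))
# prints: [(2345, 2350)]
```

Because the delta range starts at the watermark itself, a delta run right after a full rebuild still picks up one message, which would be consistent with the "collected 1 docs" line in the indexer output earlier in the thread.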
Sphinx 0.9.9-rc1 supports phrase search. It's perfectly compatible with SMF.
Quote from: Phalloidium on January 16, 2009, 06:49:25 PM
Okay, I've got it up and running now. Compared to MySQL's unindexed search, holy crap, fast! Thanks for this!
Yep, it's so much quicker :) On my forums (centered around the Warhammer army "Tau"), searching for "Tau" used to take 30 seconds (it basically killed everything :P), but with Sphinx it takes around 4-8 seconds :)
Did you get the delta index to work or just rely on the main index daily refresh? I set the delta index to refresh every 10 minutes, but it doesn't index new content.
0-59/10 * * * * /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --rotate smf_delta_index
Gave me this error via e-mail:
using config file '/usr/local/etc/sphinx.conf'...
WARNING: key 'strip_html' is deprecated in /usr/local/etc/sphinx.conf line 10; use 'html_strip (per-index)' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 38; use 'sql_attr_uint' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 39; use 'sql_attr_uint' instead.
WARNING: key 'sql_group_column' is deprecated in /usr/local/etc/sphinx.conf line 40; use 'sql_attr_uint' instead.
WARNING: key 'sql_date_column' is deprecated in /usr/local/etc/sphinx.conf line 41; use 'sql_attr_timestamp' instead.
WARNING: 3 more warnings skipped.
indexing index 'smf_delta_index'...
collected 6 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 6 docs, 2205 bytes
total 0.010 sec, 220500.00 bytes/sec, 600.00 docs/sec
total 3 reads, 0.0 sec, 11.5 kb/read avg, 0.0 msec/read avg
total 7 writes, 0.0 sec, 1.0 kb/write avg, 0.0 msec/write avg
WARNING: access denied to PID 26499.
WARNING: indices NOT rotated.
I've chmodded all paths to the Sphinx index to 777, but no good.
Could this be happening because httpd and Sphinx were started under different users? What can I do?
Thank you
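One way to test the different-users theory: --rotate has to signal the running searchd, and a process can normally only signal processes owned by the same user (or do so as root). File permissions on the index directory don't change that. Here is a Python sketch that checks this without disturbing the daemon, meant to be run as the same user the cron job runs as. The function names are made up, and the pid file path is whatever pid_file is set to in the searchd section of sphinx.conf.

```python
import os


def read_pid(pid_file):
    """Read the daemon's process ID from its pid file."""
    with open(pid_file) as fh:
        return int(fh.read().strip())


def signal_status(pid):
    """Report whether the current user may signal the given process."""
    try:
        # Signal 0 delivers nothing; the kernel only performs the
        # existence and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        return "no such process"
    except PermissionError:
        return "access denied"
    return "ok"


# Example with this script's own process, which is always signalable:
print(signal_status(os.getpid()))
# For the real check, something like:
#   print(signal_status(read_pid("/path/to/searchd.pid")))
```

If that reports "access denied", the likely fix is to run the indexer cron job as the user that owns the searchd process (or start searchd as the cron user), rather than loosening file permissions.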
Quote from: Phalloidium on January 16, 2009, 07:09:06 PM
For anyone who wishes to have sphinx start on reboot, here is my init.d script. Put it in /etc/init.d .
To make it start automagically, run 'update-rc.d sphinx defaults' (on Debian... may be different on other OSes).
Very useful script, thank you.
To make it start automatically on RedHat / CentOS, run the following.
To add the service:
chkconfig --add sphinx
To check which runlevels it is enabled for:
chkconfig --list sphinx
You may also use the ntsysv console to set the services:
ntsysv
The mod posted in this topic is for Sphinx 0.9.7-rc2. The current version is now 0.9.9-rc2.
Does anyone have a newer version of the mod for SMF?
Welcome to SMF
Are you actually experiencing any problems with the current mod? Most likely nobody has updated it, unless there have been changes to Sphinx which actually prevent it from working
I'll double check, but I believe the one I posted above may still be the latest :)
Quote from: H on June 04, 2009, 11:54:38 AM
Welcome to SMF
Are you actually experiencing any problems with the current mod? Most likely nobody has updated it, unless there have been changes to Sphinx which actually prevent it from working
Thanks.
If morphology is enabled (morphology = stem_ru), the results show no quoted excerpt with the match highlighted when the match was found through stemming.
Who can correct it?
I am sorry for my English.
As an example, stem_en should truncate 'walking' to 'walk'.
So when I search for 'walking', results containing 'walking' show the highlighted word, but results that matched on 'walk' show no excerpt, only links to the topic and author (topic and last post).
Otherwise the mod works well; only sphinx.conf needs to be corrected a little for the new Sphinx syntax.
Is it possible to get the word Sphinx actually searched on (for example "walk" when searching for "walking")?
I have Sphinx 0.9.9-rc2 running on SMF 1.1.9. Things went well enough. Now that I am using Sphinx instead of fulltext, can I remove the fulltext index from the messages table and convert the table to InnoDB so that I can do incremental backups of the database?
In a default installation of SMF, the search results are ordered by relevance. With Sphinx this isn't happening, as you can see from the example in the attachment.
How do I fix this?
Quote from: Arantor on August 23, 2009, 07:05:22 PM
Can I see your configuration file? The one here is calculated from a custom formula; there is nothing in Sphinx by default to generate a 'relevance' like that.
I think the problem is how the search results are sorted; the relevance is calculated correctly. Does this happen for you?
#
# Sphinx configuration file (sphinx.conf), configured for SMF 1.1
#
# By default the location of this file would probably be:
# /usr/local/etc/sphinx.conf
source smf_source
{
type = mysql
strip_html = 1
sql_host = localhost
sql_user = *******
sql_pass = *******
sql_db = db
sql_port = 3306
sql_query_pre = \
REPLACE INTO `db`.smf_settings (variable, value) \
SELECT 'sphinx_indexed_msg_until', MAX(ID_MSG) \
FROM `db`.smf_messages
sql_query_range = \
SELECT 1, value \
FROM `db`.smf_settings \
WHERE variable = 'sphinx_indexed_msg_until'
sql_range_step = 1000
sql_query = \
SELECT \
m.ID_MSG, m.ID_TOPIC, m.ID_BOARD, IF(m.ID_MEMBER = 0, 4294967295, m.ID_MEMBER) AS ID_MEMBER, m.posterTime, m.body, m.subject, \
t.numReplies + 1 AS numReplies, CEILING(1000000 * ( \
IF(m.ID_MSG < 0.7 * s.value, 0, (m.ID_MSG - 0.7 * s.value) / (0.3 * s.value)) * 25 + \
IF(t.numReplies < 200, t.numReplies / 200, 1) * 20 + \
IF(m.ID_MSG = t.ID_FIRST_MSG, 1, 0) * 10 + \
IF(t.isSticky = 0, 0, 1) * 0 \
) / 55) AS relevance \
FROM `db`.smf_messages AS m, `db`.smf_topics AS t, `db`.smf_settings AS s \
WHERE t.ID_TOPIC = m.ID_TOPIC \
AND s.variable = 'maxMsgID' \
AND m.ID_MSG BETWEEN $start AND $end
sql_group_column = ID_TOPIC
sql_group_column = ID_BOARD
sql_group_column = ID_MEMBER
sql_date_column = posterTime
sql_date_column = relevance
sql_date_column = numReplies
sql_query_info = \
SELECT * \
FROM `db`.smf_messages \
WHERE ID_MSG = $id
}
source smf_delta_source : smf_source
{
sql_query_pre =
sql_query_range = \
SELECT s1.value, s2.value \
FROM `db`.smf_settings AS s1, `db`.smf_settings AS s2 \
WHERE s1.variable = 'sphinx_indexed_msg_until' \
AND s2.variable = 'maxMsgID'
}
index smf_base_index
{
source = smf_source
path = /var/sphinx/data/smf_sphinx_base.index
min_word_len = 2
charset_type = sbcs
charset_table = 0..9, A..Z->a..z, _, a..z
}
index smf_delta_index : smf_base_index
{
source = smf_delta_source
path = /var/sphinx/data/smf_sphinx_delta.index
}
index smf_index
{
type = distributed
local = smf_base_index
local = smf_delta_index
}
indexer
{
mem_limit = 32M
}
searchd
{
port = 3312
log = /var/sphinx/log/searchd.log
query_log = /var/sphinx/log/query.log
read_timeout = 5
max_children = 30
pid_file = /var/sphinx/data/searchd.pid
max_matches = 1000
}
There is no API in this thread, only the Sphinx configurator and the mod. I'm using Sphinx 0.9.9-rc2 and I copied api/sphinxapi.php from the Sphinx source files to the SMF Sources directory.
I'm using the mod from this thread. Yes, there are some errors when I start Sphinx, but no fatal errors.
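To make the relevance expression in the sql_query above easier to follow, here is the same arithmetic as a Python sketch. The function and parameter names are invented for illustration; the weights (25, 20, 10, 0) and the divisor 55 are copied from the posted config, and max_msg_id stands for the maxMsgID setting. Floating-point rounding could differ from MySQL's by one unit in edge cases.

```python
import math


def relevance(id_msg, max_msg_id, num_replies, is_first_msg, is_sticky):
    """Mirror the relevance attribute computed by sql_query in the posted sphinx.conf."""
    # Recency: 0 for the oldest 70% of message IDs, rising linearly to 1
    # for the newest message.
    cutoff = 0.7 * max_msg_id
    recency = 0.0 if id_msg < cutoff else (id_msg - cutoff) / (0.3 * max_msg_id)
    # Topic size: proportional to the reply count, capped at 200 replies.
    replies = num_replies / 200 if num_replies < 200 else 1.0
    first = 1 if is_first_msg else 0
    sticky = 1 if is_sticky else 0
    score = recency * 25 + replies * 20 + first * 10 + sticky * 0
    # Scaled to an integer in 0..1000000 so it can be stored as an attribute.
    return math.ceil(1000000 * score / 55)


# Old reply in a 100-reply topic: only the reply weight counts (10 of 55).
print(relevance(1, 1000, 100, False, False))   # prints: 181819
# Newest message, first post of a 200-reply topic: the maximum.
print(relevance(1000, 1000, 200, True, False))  # prints: 1000000
```

Note that this attribute depends only on the message's age, its topic's size and whether it is the first post, not on how well the text matches the query.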
I found it!
Change Sources/Search.php
From:
$mySphinx->SetGroupBy('ID_TOPIC', SPH_GROUPBY_ATTR);
To:
$mySphinx->SetGroupBy('ID_TOPIC', SPH_GROUPBY_ATTR, 'relevance DESC' );
any chance of an update to this mod with the latest version of Sphinx for smf 1.1.10?
ok but that is 2 years old.. can't someone do a quick edit for us who have smf 1.1.10 and Sphinx 0.9.9-rc2? PLEASE?
so how hard is it and how long would it take?
Quote from: Arantor on September 08, 2009, 10:47:00 AM
Out of interest, how big is your forum?
365k Posts in 93k Topics by 78k Members
Quote from: Arantor on September 08, 2009, 11:04:36 AM
Ah, so you're in the category of user that would begin to benefit from it, and you're probably on a VPS where you can compile your own software.
Are you using a Large Custom index in the interim?
i have my own full root server and no i use no index atm (i know, i know.. lol)
well i've tried Fulltext index but it's not any faster than No index, in fact it might be a little slower :\
so what's the point in a full text index if it's slower?
Arantor, let me know if you need a hand with the sphinx integration. Forum search is actually the slowest part of my site right now, and since my site search uses sphinx already I figured that'd be the natural progression, so I'm damn glad you're already working on this.
Jesus christ, just had a look at the search function and it's 1500 bloody lines... It's times like this I really wish SMF had nicer code. =/
Quote from: Ensiferous on September 12, 2009, 09:23:27 PM
Jesus christ, just had a look at the search function and it's 1500 bloody lines... It's times like this I really wish SMF had nicer code. =/
The code is actually pretty nice. it's the number of features that are responsible for the code size.
I have to respectfully disagree, it's all just mashed together. Just because there's a lot of code needed doesn't mean you can't structure it in a logical fashion.
For example, a function to set up the variables you need and to step through sub-parts.
A function for splitting words, a function for converting the search parameters into query parts, a function for actually performing the query, a function for parsing the output and so on and so forth.
Any decent editor will have a nice overview of the functions in a file so you can quickly find where stuff is.
A simple rule of thumb is that if your comments are explaining the what and not the why then your code is improperly structured. And in this case I highly agree with that.
SMF code is efficient enough, but it's hardly elegant or easy to understand for anyone who doesn't live and breathe it. And I know a lot of developers who agree with me on that part.
You bring up good points.
Personally, the only bit I find really complicated is the parse_bbc function. It's a mess.
I'm not really making a comment on the individual pieces of code. I run a forum with a whole lot of members and a decent amount of activity (2000-3000 posts a day) on just one server that also runs the main webapp. So I know the code is fast and efficient, but finding out where stuff happens is a mess; it really lacks a sensible structure, and the liberal use of globals and no clear definition of where things are defined certainly don't help. But I've ranted about that before here, so it's probably not a good idea to get into it now when you might as well just read my earlier posts on it. :)
Despite the horror that awaits me I'm still going to look into fixing this sphinx stuff up since some searches take upwards of 10 seconds to perform right now.
I'm running Sphinx 0.9.7RC2 with SMF 1.1.10 via this mod and it appears to be working just fine; a big thanks to the mods here and the folks over at Sphinx. What do I need to do to upgrade to 0.9.8 or 0.9.9RC2? I really could use the phrase search feature.
TIA.
Thanks Arantor, much appreciated.
The features listed in the 0.9.8.1 manual (http://www.sphinxsearch.com/docs/manual-0.9.8.html#features) include:
"supports boolean, phrase, and word proximity queries"
I just went back and looked at the feature list in the 0.9.7 manual (http://www.sphinxsearch.com/docs/manual-0.9.7.html#features), and it has the same statement.
Thanks for the clarification Arantor. Does that mean that phrase search works in 0.9.8.1?
Great! Thanks again Arantor.
Dumb question time ...
Sphinx has worked like a champ since I installed it several weeks ago, and the improvement was quite dramatic. However, after I restarted Apache today, SMF gives me a "Can't access the search daemon" error. Mark Rose's small script doesn't seem to work. How do I restart Sphinx without having to reinstall it and rebuild the index?
TIA.
That did it Arantor, thank you.
FWIW I first searched for the files (from the command line), but wasn't 100% sure of the file names.suffixes. If it helps anyone else, here's what I ended up running:
/usr/local/bin/searchd -c /usr/local/etc/sphinx.conf
Thanks again for help. Bookmarking this for future use.
BTW any idea what I'd need to do to make it restart automagically? Mark Rose's script didn't seem to work.
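A quick way to tell whether searchd is actually accepting connections, before digging into SMF itself, is to try opening a TCP connection to its port. Here is a minimal Python sketch; the function name is made up, and the default port 3312 is an assumption taken from the searchd section of the sphinx.conf posted earlier in this thread, so adjust it if your config differs.

```python
import socket


def searchd_is_listening(host="localhost", port=3312, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False


if searchd_is_listening():
    print("searchd port is open")
else:
    print("nothing is listening; start searchd with its --config option")
```

This only shows that a process is listening on the port, not that it is searchd or that its indexes are healthy, but it is enough to distinguish "daemon not running" from a problem on the SMF side.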
I'm sure that was covered on the Sphinx forum more than once. I haven't been there in a couple of weeks but searching for reboot should nail it.
Thanks. Looks like I was searching for the wrong term over there before I posted here. "Reboot" pops up some answers.
Quote:
The actual upgrading to 0.9.8 is fairly straightforward. I wouldn't be recommending 0.9.9 yet though.
Since 0.9.9 was released December 2, could it be used with the existing mod, or are other changes needed?
Any progress with integration with SMF2?
TIA for any info/advice.
There are, as far as I know, some posts in the big boards area on Sphinx with SMF 2 but I haven't tried it.
Well, I haven't tried using 0.9.9-release. I hope the bug that crashes the daemon on unsanitised input has been fixed, but seriously I have no interest in looking after Sphinx at all.
You'd think that the main support person would have been warned in advance prior to a release candidate, wouldn't you? The first I heard of any new version was by reading the news. I didn't even know 0.9.10 had string attributes until someone asked about it in the forum, because I'm no C++ programmer and thus didn't check the SVN logs routinely, which made me look kind of stupid in posts since as far as I knew, there were no plans to support string attributes at all.
Thanks. I don't qualify as a big board, so I can't see those messages. 0.9.7 has worked like a champ on SMF 1.1.x. I'd hate to have to revert back to a non-Sphinx search when I upgrade to SMF2 pretty soon.
Anyone from the big boards, or anyone having experience of Sphinx with SMF2, care to comment?
TIA.
Quote:
Anyone from the big boards, or anyone having experience of Sphinx with SMF2, care to comment?
Apologies for the bump, but was still hoping someone could help with the script for running Sphinx with SMF 2. It worked so well with SMF 1.x, and my forum is really missing it since I upgraded.
TIA.
What part of getting it running with 2.0 are you needing?
In an earlier message (maybe a different topic), you mentioned that a different script was required to run Sphinx with SMF 2, and you were having a dialog with someone else about the difficulty of understanding the coding in SMF 2. That's why I wondered how folks were able to install Sphinx with SMF 2.
Apologies if I misunderstood.
I later discovered a script was built for SMF 2, it's in the big boards area. However it still needs an overhaul to actually work without any errors; it uses the 0.9.7 style of configuration, which will throw errors in both 0.9.8 and 0.9.9. I also haven't tested 0.9.9-release to see if the bad bug that was in 0.9.9-rc1 (and possibly -rc2) has actually been fixed.
I did raise a further issue a bit back before I did any further work - there are licensing issues involved since the Sphinx API is a GPL library, and SMF is not GPL compliant.
Quote from: Arantor on January 11, 2010, 10:42:57 AM
I later discovered a script was built for SMF 2, it's in the big boards area. However it still needs an overhaul to actually work without any errors; it uses the 0.9.7 style of configuration, which will throw errors in both 0.9.8 and 0.9.9. I also haven't tested 0.9.9-release to see if the bad bug that was in 0.9.9-rc1 (and possibly -rc2) has actually been fixed.
I did raise a further issue a bit back before I did any further work - there are licensing issues involved since the Sphinx API is a GPL library, and SMF is not GPL compliant.
That is only a problem in that SMF cannot release a version that comes with sphinx support built in. It doesn't affect distributing a modification in any way. This is alright because the GPL allows you to use GPL code with non-GPL code privately, it just states that you can't distribute code like that, which isn't a problem anyway since SMF prohibits distribution itself.
As I understand it when I raised it, you cannot bridge non GPL software to a GPL library - that's kind of the point of the LGPL's existence.
See http://www.gnu.org/licenses/gpl-faq.html#NFUseGPLPlugins
Quote:
If the program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single program, which must be treated as an extension of both the main program and the plug-ins. In order to use the GPL-covered plug-ins, the main program must be released under the GPL or a GPL-compatible free software license, and that the terms of the GPL must be followed when the main program is distributed for use with these plug-ins.
More specifically, we cannot distribute a bridge like that... doing it on your own is fine, but the moment you (or we) distribute any sort of link between GPL and non-GPL software, it becomes a violation.
I did ask about this a couple of weeks back in the team board if anyone wanted to contact Andrew about a licensing exception (or even the API being relicensed generally) but since I didn't want to go request it myself for personal reasons, and this should come from higher up...
Does the latest GPL-related discussion mean that this script isn't available any more, or only available to the big boards? Reason I ask is that the script for SMF 1.x was originally only available to the big boards, but was subsequently made available to the general membership.
Quote:
I later discovered a script was built for SMF 2, it's in the big boards area. However it still needs an overhaul to actually work without any errors; it uses the 0.9.7 style of configuration, which will throw errors in both 0.9.8 and 0.9.9.
Any signs of progress, or maybe a methodology to work with/around the errors?
TIA.
None from me, because I wanted someone from the team to contact the Sphinx team regarding the license. Legally this site CANNOT distribute Sphinx files because SMF is not GPL compliant.
Even if the configurator was updated, the underlying code also needs work because it's possible to throw errors, ones that would cause a general error on 0.9.8 and ones that would cause searchd to crash in the 0.9.9 RCs (I *hope* it's been fixed by 0.9.9 final but after I left there last year, I stopped caring, sorry :()
OK thanks Arantor. Just wondering if/how the big boards guys are able to use Sphinx with SMF 2 (?)
Because in the big boards board there is a 2.0 version of this.
So why can't the "2.0 version" be shared with us who're not yet quite a big board?
Because it's ILLEGAL to do so? Strictly speaking even distributing the files here is breach of license.
Does that mean that the script on the Big Boards is also illegal?
Quote:
Because in the big boards board there is a 2.0 version of this.
Apologies if I appear dense.
Yes, it is at present.
I asked the higher-ups in the team to investigate over a month ago, I assume that no-one bothered since I've heard nothing in any direction.
OK thanks. Might be time to nudge the "higher ups".
Sphinx is a great search engine, and worked extremely well for me with SMF 1.x. I'd really like to use it with SMF 2.
Nope. I have to say that right now it isn't my problem - they didn't listen to me a month ago, I don't expect them to listen to me now I'm not even on the team.
OK, to THE TEAM - could someone please listen to and respond to Arantor?
We have...
the short answer is "no... we can not legally distribute the sphinx integration because of the license conflict."
the longer answer is, well... much longer and involved all sorts of work-arounds and legal grey areas.
GPL is possibly one of the worst licenses out there.
Thanks Kindred, I think.
Kindred: Please see the suggestion from me in the team boards, thread started December 25th. Sphinx is dual-licensed, SMF can request a non-GPL license for the API. Please can someone contact Sphinx Technologies about this.
that is something I will have to handle after we straighten out this other mess....
Quote from: Kindred
that is something I will have to handle after we straighten out this other mess....
Just curious, was this (license?) issue resolved, and are you able to share the script to integrate Sphinx with SMF 2.0 RC3?
TIA.
technically, RC3 is not compatible. We're working on the situation for 2.0 final though.
Thanks Kindred. Sounds like good news.
I might be wrong, but if my memory doesn't let me down, I saw this forum's search function (during a period of server problems) return the "Can't access the search daemon" message after a server restart. SMF 2.0... Sphinx...? ;)
means you haven't restarted sphinx
Quote from: Oya on October 09, 2010, 05:37:44 AM
means you haven't restarted sphinx
It's not about me. I was talking about THIS forum.
Quote from: exxocet on October 10, 2010, 06:38:26 AM
Quote from: Oya on October 09, 2010, 05:37:44 AM
means you haven't restarted sphinx
It's not about me. I was talking about THIS forum.
Means SMF hadn't started Sphinx. (Which they really should be doing automatically)
Quote from: Ensiferous on October 11, 2010, 01:33:58 PM
Means SMF hadn't started Sphinx. (Which they really should be doing automatically)
or not
i remember trying to install sphinx for one phpbb customer and saw what happened when you had the php layer starting a system service manually and controlling it from there (like forcing it to reindex and forcefully shut down sphinx with kill -9 from php exec)
it's a system daemon, and should be treated as one
I never said one should use PHP to do it, that'd be pretty insane considering there are much better alternatives such as supervisord (http://supervisord.org/) and Monit (http://mmonit.com/monit/).
i thought you were saying smf should start sphinx...
Pfufff, people say SMF 2 beta / RC is not compatible with Sphinx. This forum runs a version of SMF 2. Still I've encountered that Sphinx error.. get it now?
the license isn't compatible, never has been
it does also depend on what you search for, what did you search for?
so what Exxocet?
License complications mean that we are unable to officially distribute a mod to work with smf/sphinx. There is no difficulty in USING such a mod, just in the distribution.
but no.... I don't "get it now".
one-line comments do not provide any explanation...
I know about license incompatibility, but some people said Sphinx won't work with SMF 2. Technically. So I've mentioned it does work.
By the way, I'm running Sphinx' newest release (1.10 beta) on SMF 1 and it works flawlessly. Phrase search finally works nicely in this release.
Note: in my case, for some reason, Sphinx is slower than the regular "no index" search. ~800 MB db, VPS server, everything optimized. Tried with different versions (0.9.8.1, 0.9.9, 1.10beta), limited results to 500, with/without a stopwords list. It's twice as slow as SMF's integrated search methods. Pretty odd...
Quote from: Kindred
We're working on the situation for 2.0 final though.
Hate to be a pain, but are we any closer to resolution with RC4?
there will likely be no further development with RC4... We're working on the situation for 2.0 FINAL
Understood Kindred, thanks. I should have phrased my question a little differently. Sounds like we'll have to wait for SMF 2.0 final for any hopes of using Sphinx.
Is there any example of an SMF site using Sphinx? I would like to test how it works.
Many thanks.
This one uses Sphinx ;)
Quote from: Arantor on February 20, 2011, 09:17:04 AM
This one uses Sphinx ;)
wow even smf doesn't use their own search system? think that says a lot!!!
Not really.
When you have a forum with over two MILLION posts, you turn to the most optimised way of doing it. A solution that's compiled and runs directly on the server HAS to be faster than one running through two or three layers of code. That's how Sphinx works, it's compiled and runs directly on the server and it's built for the single purpose of searching text. It's not like Apache, or PHP or MySQL that's built to handle a range of tasks.
Quote from: Arantor on February 20, 2011, 09:30:24 AM
Not really.
When you have a forum with over two MILLION posts, you turn to the most optimised way of doing it. A solution that's compiled and runs directly on the server HAS to be faster than one running through two or three layers of code. That's how Sphinx works, it's compiled and runs directly on the server and it's built for the single purpose of searching text. It's not like Apache, or PHP or MySQL that's built to handle a range of tasks.
was just saying that smf's own search system isn't up to smf's own needs
why not make/implement something better? or just have a 'Sphinx' option built right in?
There's not a lot better than Sphinx.
And I don't think you understand me. There is NO WAY ON EARTH to make something in PHP that's even remotely close to Sphinx in speed and processing capacity.
EDIT: At least not that can be used by normal users. You might get something remotely close if you use something like HipHop and compile a PHP script, but even then it's still not going to be as efficient as Sphinx is.
Thanks, Arantor, I did not know that. Can one apply search weights with Sphinx? I.e. I use high search weight for a matching subject since it is a translation forum with matching translations in subjects (http://www.translatum.gr/forum/index.php?board=57.0) and Search enhancement mod is a good way of presenting this information.
I don't know how the search enhancement mod works but you can apply weightings per column in Sphinx so if a match occurs in a given column it gets ranked higher. (SMF 2 can do that too, btw)
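To make the per-column weighting idea concrete, here is a toy sketch in Python. It is NOT Sphinx's ranking algorithm (which is considerably more sophisticated) and it does not use the Sphinx API; the function names, the field names `subject` and `body`, and the weight values are all made up for illustration. It only shows why a match in a heavily weighted column pushes a result up the list.

```python
def weighted_score(doc: dict, terms: list, weights: dict) -> int:
    """Sum term hits per field, multiplied by that field's weight."""
    score = 0
    for field, text in doc.items():
        words = text.lower().split()
        hits = sum(words.count(term.lower()) for term in terms)
        # Fields without an explicit weight count once per hit.
        score += weights.get(field, 1) * hits
    return score


def rank(docs: list, terms: list, weights: dict) -> list:
    """Return indexes of matching docs, best score first (ties keep input order)."""
    scored = [(weighted_score(doc, terms, weights), i) for i, doc in enumerate(docs)]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ordered if score > 0]
```

With a subject weight of 10 and a body weight of 1, a post whose subject contains the search term once outranks a post that only mentions it twice in the body, which is the behaviour a translation forum with answers in the subject lines would want.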
Many thanks. The good thing about the search enhancement mod is that it creates a result summary (with anchored subject lines only) which when clicked go to the relevant message. I wonder whether something like this could be implemented on Sphinx.
Sphinx just physically handles the search and returns values back to the rest of the code to fetch the messages and display them. I see no reason why that display couldn't be managed.
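Roughly, the split of work looks like the sketch below (Python, with an in-memory SQLite table standing in for the forum database). The table name `messages`, the columns `id_msg` and `subject`, and the function name are assumptions for the example, not SMF's actual schema or code. The search engine hands back a ranked list of message IDs, and the application fetches those rows and keeps them in the ranked order for display.

```python
import sqlite3


def fetch_in_rank_order(conn: sqlite3.Connection, ranked_ids: list) -> list:
    """Fetch (id_msg, subject) rows for the given IDs, preserving rank order."""
    if not ranked_ids:
        return []
    placeholders = ",".join("?" for _ in ranked_ids)
    rows = conn.execute(
        f"SELECT id_msg, subject FROM messages WHERE id_msg IN ({placeholders})",
        list(ranked_ids),
    ).fetchall()
    # The database returns rows in its own order, so re-sort them
    # to match the order the search engine ranked them in.
    by_id = {row[0]: row for row in rows}
    # IDs that no longer exist in the database are silently skipped.
    return [by_id[i] for i in ranked_ids if i in by_id]
```

Since the display side only receives IDs, a summary view that lists anchored subject lines is just a different way of rendering the rows fetched here.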
Quote from: Arantor on February 20, 2011, 09:34:45 AM
And I don't think you understand me. There is NO WAY ON EARTH to make something in PHP that's even remotely close to Sphinx in speed and processing capacity.
no i totally understood you but what i was saying is that smf should support Sphinx as an option like how they support cache systems like Memcached and APC.. just make an option for it and IF it's installed to the server then it'll use it etc
Quote from: steve51184
no i totally understood you but what i was saying is that smf should support Sphinx as an option like how they support cache systems like Memcached and APC.. just make an option for it and IF it's installed to the server then it'll use it etc
Considering that you have to expressly configure it manually for SMF, you can't just have it as a simple option like that. Mind you, 2.0 does make it considerably easier to set up.
Sphinx only supports one master configuration file per instance of the daemon process, which in this case is either entirely SMF's indexes, or you already have it set up for something else. Either way you have to manually adjust its configuration to make that work properly, so there's no magic easy set up for it.
And because AFAIK none of the major distributions are shipping it as a service, you have to compile it from scratch anyway making it even less easy to have a simple configuration option of the nature you're asking about.
Are the files at the beginning of this topic the most up-to-date ones for a Sphinx installation, and which Sphinx version are they meant for? Do they run on both 1.1.x and 2.0 RC5? Are there any installation instructions?
Sorry for the many questions but I could not decide on these matters by reading this thread.
The one in this thread is for 1.1.x only. 2.0's is very very different (due to the pluggable architecture for search APIs in 2.0) and that's documented in the big boards area.
As for installation instructions see http://www.simplemachines.org/community/index.php?topic=127672.0
Many thanks. So I take it that the latest supported Sphinx version for SMF 1.1.x is sphinx_0-9-7-rc2 and one needs to download that specific version?
I wonder whether I qualify to access the big boards area for a forum with 300,000 posts and 100,000 topics.
The installation instructions link gives me:
The topic or board you are looking for appears to be either missing or off limits to you.
I really wouldn't recommend using 0.9.7 at all. Use 0.9.8 as a minimum: the 0.9.7 configuration files the mod generates are compatible with it, but I have a feeling they're not compatible with 0.9.9.
Argh, that link's in the big boards area, to which access is granted for forums of 500k posts or thereabouts :(
Lol, I must either push or wait for my forum to reach the 500,000 benchmark then ::)
Quote from: steve51184 on February 20, 2011, 09:51:15 AM
Quote from: Arantor on February 20, 2011, 09:34:45 AM
And I don't think you understand me. There is NO WAY ON EARTH to make something in PHP that's even remotely close to Sphinx in speed and processing capacity.
no i totally understood you but what i was saying is that smf should support Sphinx as an option like how they support cache systems like Memcached and APC.. just make an option for it and IF it's installed to the server then it'll use it etc
With SMF 2.1, Sphinx will likely be rolled out as an option coupled with SMF, HOWEVER, as Arantor said, it's not something you can just 'enable' and be on your way. It requires server level access to configure, even if it's already installed. If it's not installed, you need root access to the server to install it.
well i have root so i can't wait for that ;)
Sorry to bump an old thread, however I'm running a big board and I'm upgrading to 2.0.4, which means upgrading my old Sphinx stuff to 2.0+, but I don't have access to the big board area.
Can somebody help?
Edit: Apparently one can apply through one's profile, which I just did. Sorry for the bump.