Customizing SMF > SMF Coding Discussion
Best way to parse bbcode in database?
MrPhil:
I've done some reading on the subject, and it seems to be a matter of dispute whether <b> and <strong> make all that much difference in SEO results, or in how a screen reader will speak the word. Does anyone have some conclusive evidence on this front? If SEO doesn't, in fact, matter all that much, but screen readers do emphasize <strong>, just stick with <b> for such initialisms.
For a case such as "LOL => Laughing Out Loud", you might want to explore the use of [acronym=Laughing Out Loud]LOL[/acronym] or [abbr=Laughing Out Loud]LOL[/abbr] (in practical terms, I don't know if there's a distinction: LOL and LOL).
When separating complete words from initialisms, don't forget cases such as "permissions => Read, Write, and eXecute". This would require looking for letters before the highlighting tag, and not just after. Since this could mean quite severe changes in the BBCode parsing, I would think it better to devote a new (custom) tag to it, to avoid semantic problems with <b> or <strong>. Say, [initial] or [initial=L] to render as <strong> in text and <b> if a screen reader is in use?
Arantor:
--- Quote ---and it seems to be a matter of dispute whether <b> and <strong> make all that much difference in SEO results
--- End quote ---
There is very likely little difference in SEO terms for one vs the other. Certainly it is more semantically correct in most cases to use strong.
--- Quote --- or in how a screen reader will speak the word.
--- End quote ---
It depends on the screen reader: some will treat both as 'strong' and emphasise both, while some only emphasise strong. What is certain is that strong will always be emphasised regardless of medium (see footnote) while b might not be. The guidelines from WCAG take the view that strong should be used when you mean emphasis while b should only be used for visual effect.
--- Quote ---in practical terms, I don't know if there's a distinction
--- End quote ---
Most browsers will render them the same way, using the title to provide a tooltip but otherwise not differentiating the content.
--- Quote ---if a screen reader is in use?
--- End quote ---
There is no way to know that at the parsing stage, especially because more than one of the reader tools works as a browser plugin, and doesn't always identify itself.
Really what I think it means is for people to not abuse tags unnecessarily.
--- Quote ---Since this could mean quite severe changes in the BBCode parsing
--- End quote ---
As I intimated, the parser does not pass any content outside the tag - in either direction - to the tag handler. You'd have to do it at save time, or do it all as a post-process before leaving parse_bbc(); there is no way to handle this in either case from inside the main code of the parser.
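As a rough illustration of the post-process idea, here is a minimal JavaScript sketch that works on already-rendered HTML and looks at the characters on both sides of a highlighted single letter. The function name demoteInitialLetters is hypothetical, and the real post-process would live in PHP around parse_bbc(); this only shows the "look before and after the tag" logic under that assumption.

```javascript
// Hypothetical helper: turn <strong>X</strong> into <b>X</b> when the single
// highlighted letter is glued to other letters (before or after the tag),
// as in "e<strong>X</strong>ecute". Everything else is left untouched.
function demoteInitialLetters(html) {
    var isLetter = /[A-Za-z]/;
    return html.replace(/<strong>([A-Za-z])<\/strong>/g, function (match, letter, offset, whole) {
        // Characters immediately outside the tag, in either direction.
        var before = whole.charAt(offset - 1);
        var after = whole.charAt(offset + match.length);
        if (isLetter.test(before) || isLetter.test(after)) {
            return '<b>' + letter + '</b>';
        }
        return match;
    });
}
```

A standalone highlighted letter (followed by a space, say) and any multi-letter strong text both keep their strong tags, so genuine emphasis is not affected.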
Footnote: You can do insane things like the following CSS:
--- Code: ---strong { font-weight: normal; font-style: italic; }
--- End code ---
and it will make strong tags look like italics if you should so wish (and one website I know did actually do this very thing!) while speech readers will ignore such rules. My statement above, then, is reflective of what browsers will do provided boneheaded statements are not in use.
DaKrampus:
Well, thanks to all for the ideas and the responses.
As far as I know, SEO-wise there is hardly any difference. The thing is, in my case it's just visual, so the user on the page sees it; there is no point in using it for SEO.
As for the word (in our LOL example):
LOL will be in acronym tag... <acronym title="Laughing out loud">LOL</acronym>
For me (I think it is correct, though) the difference between an acronym and an abbreviation is that an acronym is spoken as a whole word and an abbreviation is spoken letter by letter. So LOL would be an acronym and USA or SMF would be an abbreviation.
All the other words will be in definition tags.
There are then also 2 different kinds of abbreviation: initialism and truncation...
I don't know if it's correct, but I use the additional CSS for screen readers shown on this page: http://www.lyberty.com/encyc/articles/abbr.html
I changed it a little, and am now using:
acronym {speak: normal;} /* ...say it as a word... */
abbr {speak: spell-out;} /* ...say each letter separately... */
As for the bold/strong issue, I think I found a solution... I won't let SMF parse it at all.
As I am only using 3 tags, I will send it to JS. That works very well and takes no PHP resources.
--- Code: ---function js_bbcode(text)
{
    // [\s\S] matches any character, including newlines, so tags may span lines.
    text = text.replace(/\[b\]([\s\S]*?)\[\/b\]/ig, '<b>$1</b>');
    text = text.replace(/\[i\]([\s\S]*?)\[\/i\]/ig, '<i>$1</i>');
    return text.replace(/\[u\]([\s\S]*?)\[\/u\]/ig, '<u>$1</u>');
}
--- End code ---
That means on the page where I use it (the only one) I will have <b>, and in the forums I will keep <strong>.
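One thing the function above does not do is escape HTML in the definition text before inserting the tags. If the text can contain user-entered markup, a variant along these lines may be safer. This is only a sketch: escapeHtml and renderSimpleBbcode are hypothetical names, and it assumes the raw, unescaped definition text is what reaches the browser.

```javascript
// Hypothetical helper: neutralise HTML special characters first, so only the
// tags generated below can end up as real markup.
function escapeHtml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Same three tags as js_bbcode(), handled in a loop after escaping.
function renderSimpleBbcode(text) {
    var html = escapeHtml(text);
    ['b', 'i', 'u'].forEach(function (tag) {
        // Builds e.g. /\[b\]([\s\S]*?)\[\/b\]/ig for each tag.
        var pattern = new RegExp('\\[' + tag + '\\]([\\s\\S]*?)\\[\\/' + tag + '\\]', 'ig');
        html = html.replace(pattern, '<' + tag + '>$1</' + tag + '>');
    });
    return html;
}
```

If the text coming from the database is already escaped, the escapeHtml() step should be dropped to avoid double-escaping.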
DaKrampus
MrPhil:
My understanding:
* Abbreviation: Truncated word, e.g., abbr. instead of abbreviation (often with period at end). Ideally a screen reader would pronounce the full word, so you would have to give the full word for its benefit. This has a long tradition in English (e.g., "Ltd." pronounced "limited").
* Initialism: First letters of words, e.g., SMF (all or mostly capital letters, periods optional). Ideally a screen reader should spell it out.
* Acronym: Initialism pronounceable as a word, e.g., laser (often written in lowercase as a normal word). A screen reader should pronounce it as a word.
The boundaries between these seem to be a bit fuzzy...
In your js_bbcode(), this is Javascript running on the browser? If so, when is it going to see BBCode (square brackets)?
DaKrampus:
Actually, you gave me an idea. When the user (admin) defines the word, and the word is an abbreviation, he will have to choose between two options: speak as a word, or spell out each letter.
That could give an extra class:
<abbr class="speakasword" title=....
or
<abbr class="speakeachletter" title=....
acronyms will be automatically spoken as word.
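To make the class idea concrete, a small sketch of how the markup could be generated from the admin's choice. The class names speakasword and speakeachletter are the ones above; the function glossaryAbbr, its parameters, and the escaping helper are hypothetical.

```javascript
// Hypothetical helper: escape text for use in an attribute value or element body.
function escapeForHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Build the <abbr> markup; spellOut is the admin's choice for this term.
function glossaryAbbr(term, definition, spellOut) {
    var cls = spellOut ? 'speakeachletter' : 'speakasword';
    return '<abbr class="' + cls + '" title="' + escapeForHtml(definition) + '">' +
        escapeForHtml(term) + '</abbr>';
}
```

The two classes could then carry the speak rules from the earlier CSS (normal for speakasword, spell-out for speakeachletter), with the caveat that screen readers may ignore them.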
--- Quote from: MrPhil on July 15, 2012, 03:13:26 PM ---In your js_bbcode(), this is Javascript running on the browser? If so, when is it going to see BBCode (square brackets)?
--- End quote ---
Well, I am using it for the tooltips (for the glossary), to parse the BBCode there. That works great. The text is the same text: it is the definition that is also shown on a full page. (For the moment I am using the DB cached version there; I haven't implemented the JS on that side.)
One way I can imagine doing it is setting all unparsed ones to visibility: hidden and iterating in a for loop through the names (fields have ids 0 to the last one),
and once a field is parsed, replacing its innerHTML and switching it from hidden to visible... But I agree, it's a bit clumsy.
I will play around a little (for the moment I am using the DB cached version), so there's no hurry. I'll see what runs better.
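The hide, parse, reveal loop described above could look roughly like this. It is a sketch only: parseAndReveal is a hypothetical name, the document object and the parse function are passed in as parameters, and it assumes the fields really do have the numeric ids 0 to last and start out with visibility: hidden.

```javascript
// Hypothetical helper: parse each field's content and make it visible.
// doc   - the document (or anything with getElementById)
// count - number of fields, i.e. last id + 1
// parse - function turning BBCode text into HTML, e.g. js_bbcode
// Returns how many fields were found and processed.
function parseAndReveal(doc, count, parse) {
    var done = 0;
    for (var id = 0; id < count; id++) {
        var el = doc.getElementById(String(id));
        if (!el) {
            continue; // skip gaps in the id sequence
        }
        el.innerHTML = parse(el.innerHTML);
        el.style.visibility = 'visible';
        done++;
    }
    return done;
}
```

In the page this would be called once the fields exist, for example as parseAndReveal(document, lastId + 1, js_bbcode), where lastId is whatever the page knows as the highest field id.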
DaKrampus