hypothetical language question

Started by drewactual, April 23, 2019, 03:17:20 PM


drewactual

In an effort to eliminate bloat, language files are in my crosshairs (along with a lot of other files outside of SMF).

I am fortunate to have a very specific crowd with no variance in languages, so I'm thinking everything but English will be dropped as an option. Preferably, the english-utf8 files will be the only ones retained (if they're available).

Question 1: is there any programmatic impact from doing this?
Question 2: let's suggest there is a mod utilizing english.php as opposed to english-utf8, and I eliminate the non-UTF-8 language file... will SMF pick up the available file by itself?

thanks in advance!!!


drewactual

.... oh, and to add: it's NOT about the size of the files or space, it's about the sheer number of them.

Kindred

SMF will use the file defined in the language settings.

If you have defined SMF to use UTF-8, it will use the UTF-8 version of the language file...  these will fall back to the english version if the specific file does not exist.

So, this means, if a mod touched modifications.english.php, but you also have modifications.english-utf8.php, then you need to make sure that the utf8 version is also updated.

Caveat - as of 2.1, UTF-8 is the only choice and the specific designator has been removed -- so english-utf8 is now referred to simply as english.


Overall, no... there is no issue.
Deleting languages which you don't use will have no impact whatsoever.
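
To illustrate the fallback, here's a minimal sketch of the idea in plain PHP. This is not SMF's actual loadLanguage() code, and the file names are only examples:

<?php
// Hypothetical sketch: try the requested language file first,
// then fall back to the english version if it does not exist.
function load_language_file($template, $language, $dir)
{
    $candidates = array(
        "{$dir}/{$template}.{$language}.php",  // e.g. Modifications.english-utf8.php
        "{$dir}/{$template}.english.php",      // fallback when the above is missing
    );
    foreach ($candidates as $file) {
        if (file_exists($file)) {
            require_once $file;
            return $file;
        }
    }
    return false; // neither file exists
}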
Glory to Ukraine

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Arantor

With one exception: english non-UTF-8 must remain even if you use UTF-8, otherwise some of the innards will cry.

Other than that, deleting what you're not using is fine, but honestly even the number isn't a significant deal: there won't be a magic performance benefit, and unless you're on a stupid-ass host, the performance during a backup is also a non-issue.

drewactual

Quote from: Arantor on April 23, 2019, 04:43:18 PM
With one exception: english non-UTF-8 must remain even if you use UTF-8, otherwise some of the innards will cry.

Other than that, deleting what you're not using is fine, but honestly even the number isn't a significant deal: there won't be a magic performance benefit, and unless you're on a stupid-ass host, the performance during a backup is also a non-issue.

the server is dedicated and managed solely by me, which is to confirm, with certainty, that my host is stupid-ass. :)

Kindred and Arantor, thanks! that was much appreciated.

Arantor

Nah, you're miles above the kind of people I was thinking of, where 'stupid' is above their skill level. In my time I've had people try to tell me that 100 files in a directory is a performance problem. For any modern operating system this simply isn't true (and it hasn't been true for decades; you have to go back to DOS 2.0 for that to be a realistic problem).

drewactual

The biggest problem I face (for a little while longer) in managing the site is the sheer number of files that have accumulated as the site has evolved over the past couple of years.

I likely have five times as many files not in any use as ones that are actively used. Of those, 85% is a good guess for the percentage I'll never use again, or that have no clever code snippets I'd want to retain for future use. It's more like cleaning out the console of your car than it's not. :) The biggest threat posed by having these files lying around is SOMETHING SHINY!!! whilst I'm actually attempting to do something... I find myself navigating to a file just to pause and ask myself "what was that file about?"... and down the wormhole I go...

I've kept mods at a minimum; I think I've got 5 running. Of those five, the authors were gracious enough to provide extensive language files. There are likely 20 mods I have that are no longer active, and they also have language-gracious authors, which means there are likely 65-70 total language files in that directory I don't need. That's literally all this is about... and the language files are actually safe compared to the other malformed and buggy files I started out with when I first started carving up the website.

Arantor

Thing is, I really don't get what the problem is. So you have 65-70 files you don't need... and?

I have platforms I manage that have 10,000 files that I know are not in use but I don't remove them because that ends up being higher maintenance in the long run.

drewactual

heheheheeee- ya see, I want to keep from having those 10k files if I don't need them. It's simply more to comb through when I don't have to. On top of that, and this is small I admit, if I have the server set up for shared IP space (which I'm considering if I actually partition) and I allow .htaccess files, the server has to comb through every directory in the path looking for an .htaccess, whether it finds one or not, on every call. The performance impact is minimal if any, and not the point at all; simply de-cluttering is the goal.
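
For reference, that per-directory .htaccess walk is standard Apache behaviour, and on a server you control it can be switched off entirely. A hedged sketch for a hypothetical vhost, not a drop-in config:

# With AllowOverride None, Apache stops checking each directory in the
# path for an .htaccess file on every request.
<Directory "/var/www/forum">
    AllowOverride None
    Require all granted
</Directory>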

As a for instance: I wanted to work on a file named a-adminaccessalerts, but there are three similar ones, a-adminaccessalerts, a-adminaccessaall, and a-adminaccessallocated... it's a simple thing to pay attention to detail, but several times I've found myself working on a file and wondering how it got so cluttered up, just to realize the code was intended for another file that was opened by accident. The a- designation was meant to thwart this, but now it contributes to it. It's just a matter of unneeded clutter.

I've seen some strategies use 'aaa' at the end of a file name to designate the first 'throw away', using the three letters so they catch your eye... the next throw-away of that file being bbb, then ccc, etc. This works for a 'live' file, meaning you're still using the one without the three-letter ending. I've even used that tactic some; hence, there are files on the server all the way up to 'xxx' in some cases. That's 24 versions of the same file, all retained in an effort to provide milestones in case you ever have to step back for any reason. Some of those, in my case, will be retained. Most of them won't. All in the name of de-cluttering.

I just wasn't sure what difference it would make if some language files were dispensed with... as it turns out, I think I'll be retaining them instead. They account for only a small fraction of the garbage, anyway.

Arantor

This is where the devops world has some ideas: put all the code into a git repo, and unless it's checked in, it doesn't get deployed to the server. So not only would you change a file, you'd also have to double-check it as you checked it in.

Peer review is also great for this if achievable.
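
A minimal sketch of that workflow, using standard git commands (the paths and commit message are hypothetical):

cd /path/to/working-copy           # hypothetical working copy of the site
git init                           # one repo as the single source of truth
git add Themes/default/languages/Modifications.english.php
git commit -m "update modification strings"
# deploy only what has been committed, e.g. by exporting a clean copy:
git archive HEAD | tar -x -C /var/www/forum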

GigaWatt

Quote from: Arantor on April 23, 2019, 05:28:28 PM
In my time I've had people try to tell me that 100 files in a directory is a performance problem. For any modern operating system this simply isn't true (and it hasn't been true for decades; you have to go back to DOS 2.0 for that to be a realistic problem).

Then why is it that when I try to open a directory containing 8000 files, the OS lags a bit?

With 100 files, yeah, I'd agree, but multiply that number by 100 and no, I'd argue that it's not true.

And let me just add that this is happening now, not a decade ago when the fastest thing you could have was some sort of quad core (which would probably lag badly opening a dir containing 10,000 files)... now, on a relatively up-to-date and powerful rig (Intel Core i7-4790K, 32GB DDR3 RAM, SSD).
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

Arantor

Firstly, straw man. I wasn't claiming that 8000 files in a folder wasn't a problem, because filesystems generally aren't designed for that. I was stating that 100 files in a folder is not a problem.

8000 files is starting to hit the point where things start to hurt, though it depends on which file system you're using. FAT32 flails first, NTFS is better but not great, ext2/3 similar, ext4 better again (that tends to struggle by 32k files).

But those are based on raw accesses to the FS, which I'm certain you're not talking about (therefore double straw man) because if you're talking about doing things with Windows Explorer, it will actually go reading files based on metadata (way more disk accesses than just getting directory entries) to vary how it displays.
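
To illustrate the distinction, a minimal PHP sketch (the path is hypothetical). The raw listing is a single pass over the directory entries; the per-file stat() calls, roughly what a graphical file manager adds to build its display, are what multiply the disk accesses:

<?php
$dir = '/var/www/forum/Themes/default/languages'; // hypothetical path

// Raw directory access: essentially one pass over the directory entries.
$entries = scandir($dir);

// What a file manager effectively adds: one metadata lookup per file.
// This, not the listing itself, is where huge directories start to hurt.
foreach ($entries as $entry) {
    if ($entry === '.' || $entry === '..') {
        continue;
    }
    $info = stat("$dir/$entry"); // size, mtime, etc. drive the display
}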

GigaWatt

Quote from: Arantor on April 25, 2019, 01:50:32 AM
I was stating that 100 files in a folder is not a problem.

Completely agree... on a rig that's not 15 years old... which practically means that, yes, listing 100 files in a folder is a "stress", except that it's not noticeable on modern hardware. That is what I was actually trying to say ;).

Try listing 100 files on a 386 CPU, see what happens :P :D.

Quote from: Arantor on April 25, 2019, 01:50:32 AM
But those are based on raw accesses to the FS, which I'm certain you're not talking about (therefore double straw man) because if you're talking about doing things with Windows Explorer, it will actually go reading files based on metadata (way more disk accesses than just getting directory entries) to vary how it displays.

Depends on whether you've got metadata features enabled on Windows. Yes, they're enabled by default on workstations, but you can disable them.

And yes, I am "talking about that" since I've got media features disabled and I was talking about files that either don't have extensions or have extensions that the OS just doesn't care about (.dat, .bin, etc.).

And the problem gets even more interesting on most mainstream Linux distros, since the file manager reads headers, not extensions, so things should get "laggier"... they're not; it's more or less the same as on Windows.

Beside the point. Yes, the lag is not noticeable with 100 files, but it sure as hell is when you've got 10.000 files in a dir.

Just trying to expand on your theory "100 files are not a problem"... yes, that is true, depending on the hardware.
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

Arantor

I had a 386 for many years, and 100 files in a folder using a sane file manager is not a problem. I learned how useful dir /w /p could be (the /p, for pagination, because of the number of files). Even the venerable Windows File Manager wasn't troubled much either, though I usually used XTree Gold for the most part instead. Even Win95 on that 386 didn't die tragically with 100 files in a folder, even if the machine struggled running Win95 itself (a 386-SX40 with 8MB RAM is basically the physical minimum for Win95).

But your "just trying to expand" argument is BS; it reads like you were looking for a reason to argue for the sake of arguing. If you expand a data point by two orders of magnitude, things are going to change, and no one except you was suggesting that 10,000 files wouldn't be problematic. I merely asserted that 100 files in a folder wasn't a significant problem. If it were, SMF 1.0 would have had some serious challenges: in 2004 (15 years ago) it had roughly 40 tables, and MySQL generated 3 files per table (this was MyISAM, InnoDB didn't exist yet, so there were the .frm, .myd and .myi files per table), all in the same folder (one folder per database). That's 40 × 3 = 120 files in a folder, in 2004, 15 years ago, but sure, that's a huge and catastrophic problem.

And you're telling me this would be a problem, but funny how it actually isn't, right? You are conflating filesystem performance with application performance, which only entered the discussion because you brought it up, and complaining that modern applications take more effort to do things, mostly because they do a lot more than they did 15 years ago. That part is true: developers are lazier now, programs are more feature-rich, and dev time is more expensive than hardware time, so yeah.

But it has made it very clear that I should step out of any conversation going forward because all that's going to happen is that you're going to spout your wisdom and I can't be bothered challenging the obvious fallacies and straw men any more.

GigaWatt

As usual Arantor, you're correct :).

And for the record, I wasn't arguing. As I said, I was just trying to make you realize that, yes, 100 files in a dir is not a problem for a modern CPU, but it is for an ancient one ;).
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

Arantor

Even after I flat out tell you that I was using it on an ancient one with no problems, unless suddenly a 1994-made 386SX-40 chip is now a modern CPU. I again call BS.


Kindred

Hey, when I ran my BBS on an 8088, eventually upgrading to 286 and 386SX machines -- starting with DR-DOS, eventually moving to MS-DOS, OS/2 Warp and finally Windows 3.1 -- I had no problem with 100 files in a folder. Heck, I had 342 files in my downloads folder and never noticed much if any slowdown in the system when accessing that folder.
