
PHP Parse error: syntax error, unexpected ''xxx'' (T_ENCAPSED_AND_WHITESPACE)

Started by shawnb61, August 11, 2020, 11:55:00 PM


shawnb61

Error message:
PHP Parse error: syntax error, unexpected ''a:379:{s:10:"smfVersion";s:6:' (T_ENCAPSED_AND_WHITESPACE) in /home/blahblah/public_html/cache/data_e5def45bc64284f0d829317a2d0686be-SMF-modSettings.php on line 1
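For context on the mechanics: SMF's file cache stores each entry as a small PHP file whose payload is a serialized array inside a single-quoted string literal, so a write that gets cut off partway leaves an unterminated or malformed string, and the next include of that file fails to parse on line 1. A rough illustration, with the entry layout approximated from the error message rather than taken from SMF's source:

<?php
// Approximate shape of an SMF file-cache entry (details vary by version):
//   <?php $value = 'a:379:{s:10:"smfVersion";s:6:"2.0.15";...}';
$entry = '<?php $value = \'' . addcslashes(serialize(array('smfVersion' => '2.0.15')), "\\'") . '\';';

// Simulate a concurrent writer cutting the file short mid-string.
file_put_contents('/tmp/cache_demo.php', substr($entry, 0, 30));

// The truncated file no longer parses; on a live forum, corrupted cache
// files surface as parse errors like the one quoted above, thrown from
// .../cache/...-SMF-modSettings.php.
include '/tmp/cache_demo.php';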

This error has been reported multiple times for both 2.0 & 2.1. For some folks (such as myself) the error message is fleeting and there is no impact on end users. For some forums, however, users are locked out of the forum until the cache is cleared.

My system uses simple SMF file caching, level 1, with no OS or host-level caching.

Sample threads:
https://www.simplemachines.org/community/index.php?topic=574234.0
https://www.simplemachines.org/community/index.php?topic=570852.0
https://www.simplemachines.org/community/index.php?topic=569135.0
https://www.simplemachines.org/community/index.php?topic=563664.0
https://www.simplemachines.org/community/index.php?topic=567903.0
https://www.simplemachines.org/community/index.php?topic=567931.0

This issue has been logged for 2.1 here:
https://github.com/SimpleMachines/SMF2.1/issues/5826

And for SMF 2.0.x internally as #69.

We believe we have a fix for this issue for 2.0.17.

Please let me know if you have been impacted by this & are available to help test the solution.

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Arantor

So which symptom are you fixing: null bytes, content cut off after 8192 bytes, or something else?

If it's the 8192-byte issue, that's probably best fixed by writing the file, doing a scandir, then reading the file size back to verify the write.
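Something like this, as a sketch (the helper name and path are illustrative, not SMF's actual code):

<?php
// Write, then force a fresh stat and compare the on-disk size against what
// we meant to write. clearstatcache() (or a scandir of the directory)
// matters because PHP caches stat results, so a bare filesize() can report
// a stale value.
function write_cache_verified($path, $data)
{
	$expected = strlen($data);
	if (file_put_contents($path, $data, LOCK_EX) !== $expected)
		return false;

	clearstatcache(true, $path);
	return filesize($path) === $expected;
}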

Or by not caching modSettings except on level 2+, or only caching it if you're on something other than file caching; it's just too volatile otherwise.

The 8192-byte problem applies in almost all cases, so even moving away from PHP files won't save you: it's not failing on read, it's failing to complete writes even with allegedly exclusive locks, which is common if you're not careful with NFS.

shawnb61

All of the above.

This is a much simpler issue than all that. Flat file I/O is not a big mystery, and hasn't been for decades. The only confusion here has been which PHP functions properly honor locks. Tests reveal that most don't... That's the problem.

Again, if I can find someone who has this issue, please let me know & we can coordinate a test.

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Arantor

So here's the thing: even when given exclusive locks, plus a test that "bytes written is bytes expected to be written", it still fails... because it thinks it has written the entire file when it hasn't.

Both before and after the introduction of file_put_contents, it still has this issue.

Like I said on the ticket, the supposedly solved problem of "flat file I/O" isn't really that well solved: I've seen file_exists tests failing on files I can clearly see. PHP's whole file handling is a bit shaky when you drop it onto NFS shares without the underlying file system configuration being tweaked (which most hosts do get right). The issue tends to cluster around attempted concurrent writes, where NFS doesn't lock as thoroughly as it should. That never shows up on dev machines, since 1) they're local, not networked, and 2) it only presents when the data is larger than a standard file system block of 8192 bytes. Moving to JSON will help, but not for the obvious reason of error catching; it's simply less bloated than PHP's serialiser.
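On the size point, the difference is easy to see; an illustrative array, not real modSettings:

<?php
// PHP's serializer prefixes every element with type and length markers,
// so for string-heavy arrays like modSettings it's noticeably fatter.
$settings = array_fill_keys(array('smfVersion', 'news', 'cookiename'), 'some setting value');

echo strlen(serialize($settings)), "\n";   // larger: s:10:"smfVersion";s:18:"..." etc.
echo strlen(json_encode($settings)), "\n"; // smaller for the same data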

Good luck getting someone to test, too many people just turn caching off and move on with their day.

I really hope you fixed it. I just don't think it's that clear cut.

(Goes back to trying to fix file_exists returning false for a file that is there, and has been there on the server for the best part of 10 years, but which, because it's networked, behaves weirdly.)

shawnb61

We've been able to reproduce the issue at will in both Linux & WAMP environments by simulating load, and we have confirmed this fix addresses it. In fact, I was getting the issue on my production forum, and the fix has addressed it there also.

If anyone is having this issue, please let me know & we'll work on a fix together.  The more confirmation the better; lots of environmental variation out there.

Quote from: Arantor on August 12, 2020, 06:54:27 AM
Both before and after the introduction of file_put_contents, it still has this issue.

Yep, understood. The problem with file_put_contents() is that, by default, it takes no lock at all. Lots of articles out there on that. Apparently, include() doesn't honor locks either.
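To pin down the default behaviour, since the distinction matters here: file_put_contents() takes no advisory lock unless you pass it the LOCK_EX flag, and readers like include() never check advisory flock() locks at all. A sketch of the difference (path illustrative):

<?php
$payload = '<?php $value = 1;';

// Default: no lock at all, so a concurrent reader can see a half-written file.
file_put_contents('/tmp/demo_cache.php', $payload);

// With the flag: an advisory exclusive lock is held during the write, but
// only code that also calls flock() will honour it; include() never does.
file_put_contents('/tmp/demo_cache.php', $payload, LOCK_EX);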
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Arantor

Here's the thing - before we had file_put_contents, we had an explicit fopen with flock(LOCK_EX), which still had the same problem. We moved *to* file_put_contents because the old code was still writing partial files.

And the fopen/flock() combination *does* respect locks. That's kind of its thing.
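The pattern I'm describing looked roughly like this (a sketch, not SMF's literal old code):

<?php
$payload = '<?php $value = \'cached\';';

// Exclusive lock around the write. Still advisory: it only protects against
// other flock()-aware code, and a lock-free reader sees whatever bytes are
// on disk mid-write. Note the subtlety: mode 'w' would truncate *before*
// the lock is taken, so 'c' plus ftruncate() after flock() is safer.
$fp = fopen('/tmp/demo_cache.php', 'c');
if ($fp !== false && flock($fp, LOCK_EX))
{
	ftruncate($fp, 0);
	fwrite($fp, $payload);
	fflush($fp); // push PHP's buffer to the OS before releasing the lock
	flock($fp, LOCK_UN);
}
if ($fp !== false)
	fclose($fp);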

Here's a real test: bloat out your settings table so that the JSON form ends up bigger than 8KB and check it *still* works.

shawnb61

Quote from: Arantor on August 13, 2020, 06:32:11 PM
Here's a real test: bloat out your settings table so that the JSON form ends up bigger than 8KB and check it *still* works.

Yep. 
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

shawnb61

Excuse me - I made a typo above.   ::)

The problem is *not* with file_put_contents().  It's with the read.  The read needs a LOCK_SH.  And yes, the fix works with very large cache files.
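The shape of the fix, as a sketch (names illustrative; the actual 2.0.17 patch may differ in detail):

<?php
// Take a shared lock before reading, so the read blocks while a LOCK_EX
// writer holds the file, instead of seeing a half-written cache entry.
function read_cache_locked($path)
{
	$fp = @fopen($path, 'rb');
	if ($fp === false)
		return null;

	$data = null;
	if (flock($fp, LOCK_SH))
	{
		$data = stream_get_contents($fp);
		flock($fp, LOCK_UN);
	}
	fclose($fp);
	return $data;
}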

modSettings is ~30K on my prod forum. Prior to the fix we had these errors 10 times in July, about once every 3 days. None since.

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Arantor

That's the thing though: you're saying it's the read, yet in *every* single case I ever saw of this being broken, the *written* file was truncated, so it wasn't the read that was the problem. (Unless you're talking about the read done to verify the write has happened.)

Honestly though if you're at the stage where you're getting these issues, you should be caching with memcache anyway.

shawnb61

I suspect your partial files were in fact partial reads that were saved off or cached somewhere in that broken state. 

Either way, reading a somewhat volatile file without a read lock is clearly a problem, and should be addressed.

As noted before, testing thus far has been great.  So, I'm just letting folks know that we'll work with you if you see this.
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Arantor

They were what people were saving via FTP and sending for analysis, which means you were getting partial *writes* being persisted, not partial reads.

Thing is, most people won't come into this topic if they have this issue; they'll post in support. We need to raise awareness of this with people like Shambles and Sir Osis, to suggest they look here rather than just 'delete it' or 'turn caching off'.

I'm glad that you seem to have fixed it; this has been niggling away at me for a while. But I still take exception to the suggestion that 'flat file I/O is well understood', because it's infinitely more complex than that if you're doing massive concurrency on NFS or, woe betide you, AWS EFS or even Gluster. We're at the stage where, despite being massive scale on AWS (like, 50+ containers to run a single site), we run a dedicated VPS just to store files on, because NFS plays better than EFS does.

I'm intrigued that you were able to replicate at will in a non-fileshare environment, because I've never been able to, not at any point in the last few years, not even with ApacheBench hammering away. How much load were you generating?

Also, if you've moved to JSON, that just means you're better equipped to handle soft failures by treating anything that won't decode as if it's not present... are you really getting 100% failure-free decodes, or is this really just 'gracefully handling the filesystem messing about'? (Mostly interested from an academic point of view; if users don't have dead sites, that's a win in any book.)

shawnb61

Two bonehead utilities. One loops through ~2000 times, creates a large random array of text & issues SMF's cache_put_data for that array (just like modSettings). The other loops 1000x & issues SMF's cache_get_data; this one reports if a NULL is returned or if the size of the returned cache is unexpected. Speed can be adjusted via the amount of sleep time between each attempt.

Running TWO of the PUT scripts plus one of the GET scripts, all simultaneously, does the trick. You want the locks on the PUTs to occasionally collide, causing a wait; that's when the GETs fail.
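Roughly, the pair looks like this as a sketch; cache_put_data()/cache_get_data() are SMF's own cache API, while the file names, loop counts, and sleeps are the illustrative tunables mentioned above:

<?php
// put_hammer.php - sketch of the PUT utility. Assumes SMF is bootstrapped
// so cache_put_data() is available. ~500 64-char strings serializes well
// past the 8192-byte block size discussed above.
for ($i = 0; $i < 2000; $i++)
{
	$big = array();
	for ($j = 0; $j < 500; $j++)
		$big['key' . $j] = substr(str_shuffle(str_repeat('abcdef0123456789', 8)), 0, 64);

	cache_put_data('stress_test', $big, 120);
	usleep(10000); // tune so PUT locks occasionally collide
}

<?php
// get_hammer.php - sketch of the GET utility: report NULLs and wrong sizes.
for ($i = 0; $i < 1000; $i++)
{
	$data = cache_get_data('stress_test', 120);
	if ($data === null)
		echo 'NULL at iteration ', $i, "\n";
	elseif (count($data) !== 500)
		echo 'Unexpected size at iteration ', $i, ': ', count($data), "\n";
	usleep(10000);
}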

The existing logic crashes a lot - it never completes on Linux. WAMP & Linux behave differently, but both will ultimately give you the T_ENCAPSED_AND_WHITESPACE error. Many, many NULLs too (sometimes 50%!), which is a real waste, because the caller then thinks it has to rebuild the cache for no good reason...

What was initially bugging me was: why are there so many collisions? But thinking about it, that expiration timestamp applies to everyone... Once the cache expires, all current sessions with activity may try to rebuild it at close to the same time.

Kinda like horses out of the gate at a horse race... 

The read lock gets rid of all the crashes, and even all of the NULL reads. If it's locked, you should wait until the lock is released before trying to read the data.
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Shambles

Quote from: Arantor
Need to raise awareness of this with people like Shambles and Sir Osis to suggest they look here rather than just 'delete it' or 'turn caching off'.

The way I see it, the "Admins" who report the error here are of the same ilk as those who ask "how do I stop guests browsing my forum?", and for them the easiest way to get their forums up and running is to suggest a quick and easy fix, as you exemplified.

