Originally, yes, that was the case. But RFC 959 has been superseded multiple times since 1985, so under the current specifications it is no longer a strictly 7-bit-safe environment.
Do you mean that the only difference between an ASCII transfer and a binary transfer is what happens to line endings? John Klensin says that any translation beyond the conversion to CR/LF for transfer (as specified by network ASCII) is performed by the client for the local system, not by the server (if the server follows the protocol). Wouldn't that imply that between two identical systems there would be no difference in line endings once the transfer is complete? If I transfer a binary file between two FreeBSD systems in ASCII mode, the LFs will be converted to CR/LFs for transfer, then the client will convert them back to LFs, so the file would essentially be the same, wouldn't it? So a binary file (say, an avatar) transferred as ASCII between identical systems would not be corrupted? (No, I haven't tested this yet; I'll have to do that sometime later.)
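The round trip described here can be modelled in a few lines. This is only a minimal sketch, assuming a naive sender that turns every LF into CR/LF and a naive receiver that either turns every CR/LF back into LF (an LF host such as FreeBSD) or keeps CR/LF as its native line ending. The function names are made up for illustration, and real clients and servers may do more (or less) than this.

```python
def to_network_ascii(local: bytes) -> bytes:
    """Sender on an LF host: every LF goes on the wire as CR/LF."""
    return local.replace(b"\n", b"\r\n")


def from_wire_lf_host(wire: bytes) -> bytes:
    """Receiver whose native line ending is LF: CR/LF becomes LF."""
    return wire.replace(b"\r\n", b"\n")


def from_wire_crlf_host(wire: bytes) -> bytes:
    """Receiver whose native line ending is CR/LF: nothing to convert."""
    return wire


# Stand-in for a binary file: every byte value, plus some awkward CR/LF mixes.
sample = bytes(range(256)) + b"\r\n\n\r\r\n"

wire = to_network_ascii(sample)
same_system = from_wire_lf_host(wire)    # LF host -> LF host
crlf_system = from_wire_crlf_host(wire)  # LF host -> CR/LF host

print("LF host round trip identical:", same_system == sample)
print("CR/LF host copy identical:   ", crlf_system == sample)
print("size before / after on CR/LF host:", len(sample), len(crlf_system))
```

In this simplified model the LF-to-LF round trip returns the original bytes, while the CR/LF host ends up with one extra byte for every LF in the file. That says nothing about implementations that treat other byte sequences specially, so it is not a substitute for actually testing the transfer.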
Arantor, I am the kind of person who likes to understand things, not just take someone else's word for them (not at all implying, though, that your word isn't good), and there is no question that you are far smarter than I am or ever will be, so how about sharing the source of these new (to me) revelations?
I have found RFC 2640, which is titled "Internationalization of the File Transfer Protocol," and says this:
The File Transfer Protocol, as defined in RFC 959 [RFC959] and RFC
1123 Section 4 [RFC1123], is one of the oldest and widely used
protocols on the Internet. The protocol's primary character set, 7
bit ASCII, has served the protocol well through the early growth
years of the Internet. However, as the Internet becomes more global,
there is a need to support character sets beyond 7 bit ASCII.
This document addresses the internationalization (I18n) of FTP, which
includes supporting the multiple character sets and languages found
throughout the Internet community. This is achieved by extending the
FTP specification and giving recommendations for proper
but it also says this:
As the Internet grows throughout the world the requirement to support
character sets outside of the ASCII [ASCII] / Latin-1 [ISO-8859]
character set becomes ever more urgent. For FTP, because of the
large installed base, it is paramount that this is done without
breaking existing clients and servers. This document addresses this
need. In doing so it defines a solution which will still allow the
installed base to interoperate with new clients and servers.
This document enhances the capabilities of the File Transfer Protocol
by removing the 7-bit restrictions on pathnames used in client
commands and server responses, RECOMMENDs the use of a Universal
Character Set (UCS) ISO/IEC 10646 [ISO-10646], RECOMMENDs a UCS
transformation format (UTF) UTF-8 [UTF-8], and defines a new command
for language negotiation.
The recommendations made in this document are consistent with the
recommendations expressed by the IETF policy related to character
sets and languages as defined in RFC 2277 [RFC2277].
The File Transfer Protocol was developed when the predominate
character sets were 7 bit ASCII and 8 bit EBCDIC. Today these
character sets cannot support the wide range of characters needed by
multinational systems. Given that there are a number of character
sets in current use that provide more characters than 7-bit ASCII, it
makes sense to decide on a convenient way to represent the union of
those possibilities. To work globally either requires support of a
number of character sets and to be able to convert between them, or
the use of a single preferred character set. To assure global
interoperability this document RECOMMENDS the latter approach and
defines a single character set, in addition to NVT ASCII and EBCDIC,
which is understandable by all systems. For FTP this character set
SHALL be ISO/IEC 10646:1993. For support of global compatibility it
is STRONGLY RECOMMENDED that clients and servers use UTF-8 encoding
when exchanging pathnames. Clients and servers are, however, under
no obligation to perform any conversion on the contents of a file for
operations such as STOR or RETR.
The character set used to store files SHALL remain a local decision
and MAY depend on the capability of local operating systems. Prior to
the exchange of pathnames they SHOULD be converted into a ISO/IEC
10646 format and UTF-8 encoded. This approach, while allowing
international exchange of pathnames, will still allow backward
compatibility with older systems because the code set positions for
ASCII characters are identical to the one byte sequence in UTF-8.
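The backward-compatibility claim in that last quoted paragraph is easy to check: ASCII characters encode to the same single bytes in UTF-8, while anything outside ASCII becomes a multi-byte sequence with the high bit set. A small sketch, using made-up pathnames (neither path is taken from the RFC):

```python
def utf8_pathname(path: str) -> bytes:
    """Encode a pathname the way RFC 2640 recommends for exchange: UTF-8."""
    return path.encode("utf-8")


# Hypothetical pathnames, for illustration only.
ascii_path = "Sources/Subs.php"
intl_path = "Themes/languages/\u00edndice.php"  # starts with i-acute (U+00ED)

ascii_wire = utf8_pathname(ascii_path)
intl_wire = utf8_pathname(intl_path)

# Pure-ASCII pathname: the UTF-8 bytes are exactly the ASCII bytes.
print(ascii_wire == ascii_path.encode("ascii"))
print(max(ascii_wire) < 0x80)

# Non-ASCII pathname: at least one byte needs the 8th bit.
print(max(intl_wire) >= 0x80)
print(len(intl_path), len(intl_wire))
```

An old 7-bit client therefore sees nothing unusual for pure-ASCII pathnames, and only pathnames containing other characters need the 8th bit.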
This seems to be talking only about pathnames changing, not the ASCII spec. But are the local systems changing the way things are done depending on their localization? Do they know that some characters in that localization require all 8 bits? Or does the local system determine that a file has non-ASCII characters and adapt accordingly?
I have found RFC 3659, which updates (but does not obsolete) 959, and it says this:
This document also uses notation defined in STD 9, RFC 959. In
particular, the terms "reply", "user", "NVFS" (Network Virtual File
System), "file", "pathname", "FTP commands", "DTP" (data transfer
process), "user-FTP process", "user-PI" (user protocol interpreter),
"user-DTP", "server-FTP process", "server-PI", "server-DTP", "mode",
"type", "NVT" (Network Virtual Terminal), "control connection", "data
connection", and "ASCII", are all used here as defined there.
which is defined there as:
The ASCII character set is as defined in the ARPA-Internet
Protocol Handbook. In FTP, ASCII characters are defined to be
the lower half of an eight-bit code set (i.e., the most
significant bit is zero).
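To make that definition concrete, here is a minimal sketch that looks for bytes whose most significant bit is set, and shows what would happen to them if something on the path cleared that bit. Whether any particular client or server actually does so is an assumption here, not something the quoted text states; the helper names are invented for illustration. The PNG signature is used only as a familiar example of binary data.

```python
# The fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def high_bit_offsets(data: bytes) -> list:
    """Offsets of bytes that fall outside FTP's 'lower half' ASCII."""
    return [i for i, b in enumerate(data) if b & 0x80]


def strip_high_bit(data: bytes) -> bytes:
    """Model of a strictly 7-bit path: force the most significant bit to 0."""
    return bytes(b & 0x7F for b in data)


plain_text = b"<?php echo 'hello'; ?>\n"

print(high_bit_offsets(plain_text))      # pure ASCII text
print(high_bit_offsets(PNG_SIGNATURE))   # binary data
print(strip_high_bit(PNG_SIGNATURE) == PNG_SIGNATURE)
```

Pure ASCII text passes through such a path untouched, while the first byte of the PNG signature (0x89) would not survive it. The signature also contains a CR/LF pair and a bare LF, so line-ending translation alone can alter it.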
This is knowledge-seeking on my part, not a "my * is bigger than your *" challenge to you.
If it were, every instance of SMF would be broken by FTP clients that treat PHP files as 'text' because there is a function in Subs.php that has 8-bit characters in it. As would any language file that isn't English.
Then if the 8th bit in ASCII transfers can now be 0 or 1, how come the images in the Avatars and Attachments directories get broken?