End of line characters in ASCII file transfers
Posted by Mohammad Jawwad on 19 March 2009 03:36 AM
A common problem during FTP/SFTP ASCII file transfers is that end-of-line (EOL) characters are either ignored or not properly converted. Two characters in the ASCII character set may be used as EOL characters:

1) LF

This is the line feed character, ASCII code 10. It is used by Unix.

2) CR

This is the carriage return character, ASCII code 13. It is used by classic Mac OS (versions before Mac OS X).
Windows uses CR/LF. (CR comes before LF, and the two characters are not interchangeable.)
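
As a quick illustration of the codes listed above, the following minimal sketch prints the ASCII values behind each line ending. The class and method names (EolBytes, asciiCodes) are made up for this example and are not part of any JSCAPE API.

```java
import java.nio.charset.StandardCharsets;

class EolBytes {
    // Returns the ASCII codes of the characters in s, separated by spaces.
    static String asciiCodes(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(bytes[i]); // appended as a decimal number
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("LF (Unix): " + asciiCodes("\n"));
        System.out.println("CR (classic Mac OS): " + asciiCodes("\r"));
        System.out.println("CR/LF (Windows): " + asciiCodes("\r\n"));
    }
}
```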

If you are transferring an ASCII file from Windows to Unix, the file will have CR/LF characters at the end of each line. Unix does not treat CR as part of the line ending: it is a non-printable control character (tools such as vi typically show it as ^M). The LF portion of CR/LF is recognized by Unix as the end-of-line character. What does this mean in the context of transferring files?
If the following code is used with Secure iNet Factory for Java or Secure FTP Factory for Java, transferring ASCII files will not present any major problems (assume the remote machine runs Unix and the client machine runs Windows):

Sftp s = new Sftp(new SshParameters("[unix_hostname]", "[username]", "[password]"));
s.upload(new File("D:\\sample.txt"));

Now let's assume you are transferring an ASCII text file from Unix to Windows and use the following code:

Sftp s = new Sftp(new SshParameters("[unix_hostname]", "[username]", "[password]"));
s.setAscii();
s.download("D:\\sample.txt", "sample.txt");

Open the downloaded file in Notepad and look at the line endings. Each line will have a small black rectangle at the end. This happens because the code above does not specify a line terminator, so the API does not know which one to use. The problem can be solved by calling setLineTerminator("\n"); before the download() call.

Why are we using "\n"? Because the machine we are connecting to runs Unix. This point is important to keep in mind when transferring files: setLineTerminator() requires the EOL of the OS that the remote machine is running.
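
To see why the remote terminator matters, it can help to picture the conversion as a search and replace: the remote EOL is what gets searched for, and the local EOL is what replaces it. The sketch below is only a conceptual illustration under that assumption, not the library's implementation; LineTerminatorSketch and convert are hypothetical names.

```java
class LineTerminatorSketch {
    // Replaces every occurrence of remoteEol in text with localEol.
    static String convert(String text, String remoteEol, String localEol) {
        return text.replace(remoteEol, localEol);
    }

    public static void main(String[] args) {
        String fromUnix = "first line\nsecond line\n";

        // Correct remote EOL: every LF becomes CR/LF for Windows.
        String converted = convert(fromUnix, "\n", "\r\n");

        // Wrong remote EOL: nothing matches, so the text comes back unchanged.
        String unchanged = convert(fromUnix, "\r\n", "\r\n");

        // Print with the control characters made visible.
        System.out.println(show(converted));
        System.out.println(show(unchanged));
    }

    // Renders CR and LF as the escape sequences \r and \n.
    static String show(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }
}
```

In this sketch, naming the wrong remote terminator leaves the Unix line endings in place, which is the same symptom Notepad shows as black rectangles.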

Add the call and run the code. Open the file in Notepad and you will find the annoying rectangles are gone. Note that Wordpad does not display the black rectangles, probably because it knows how to interpret Unix line endings, so it may give the false impression that the original code is working. You can also view the downloaded file in a hex editor to confirm that the line endings are valid.
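
If a hex editor is not at hand, a few lines of Java can do the same check by counting each kind of line ending in the file's bytes. This is a minimal sketch; EolCounter and count are names invented for the example. A correctly converted Windows file should report only CR/LF pairs.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class EolCounter {
    // Returns {CR/LF pairs, LF without preceding CR, CR without following LF}.
    static int[] count(byte[] data) {
        int crlf = 0;
        int loneLf = 0;
        int loneCr = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\r') {
                if (i + 1 < data.length && data[i + 1] == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CR/LF pair
                } else {
                    loneCr++;
                }
            } else if (data[i] == '\n') {
                loneLf++;
            }
        }
        return new int[] {crlf, loneLf, loneCr};
    }

    public static void main(String[] args) throws Exception {
        // Reads the file named on the command line, or uses a small sample.
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "one\r\ntwo\nthree\r".getBytes(StandardCharsets.US_ASCII);
        int[] c = count(data);
        System.out.println("CR/LF: " + c[0] + ", LF only: " + c[1] + ", CR only: " + c[2]);
    }
}
```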

The API does not know what OS the remote machine is running. You have to take care of this in the application logic yourself.
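
One way to handle this in application logic is a small lookup from the remote OS to its line terminator. The sketch below is an assumption about how an application might do it: RemoteEol and forOs are hypothetical names, the OS description is expected to come from your own configuration, and falling back to "\n" for unrecognized names is a choice made for this example.

```java
import java.util.Locale;

class RemoteEol {
    // Maps a caller-supplied description of the remote OS to its line terminator.
    static String forOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("windows")) {
            return "\r\n";
        }
        if (os.contains("classic mac")) {
            return "\r";
        }
        return "\n"; // assumed default: Unix and Unix-like systems
    }

    public static void main(String[] args) {
        String eol = forOs("Unix");
        // The result would then be passed to the call described in the article:
        // s.setLineTerminator(eol);
        System.out.println(eol.equals("\n") ? "LF" : "other");
    }
}
```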

Another version of this article with screen shots illustrating the problem :
