Do line endings differ between Windows and Linux? [closed]

I am trying to parse the Linux /etc/passwd file in Java. I'm currently reading each line with the java.util.Scanner class and then using java.lang.String.split(String) to split each line into fields.

The problem is that the line:

list:x:38:38:Mailing List Manager:/var/list:/bin/sh

is treated by the scanner as 3 different lines:

  1. list:x:38:38:Mailing
  2. List
  3. Manager...

When I type this out into a new file that I didn't get from Linux, Scanner parses it properly.

Is there something I'm not understanding about new lines in Linux?

Obviously a workaround is to parse it without using Scanner, but that wouldn't be elegant. Does anyone know of an elegant way to do it?

Is there a way to convert the file into one that would work with Scanner?


A related question was asked not even two days ago: "Historical reason behind different line ending at different platforms"

EDIT

Note from the original author:

"I figured out I have a different error that is causing the problem. Disregard question"


Solution 1:

From Wikipedia:

  • LF: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others
  • CR+LF: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M, MP/M, DOS, OS/2, Microsoft Windows, Symbian OS
  • CR: Commodore machines, Apple II family, Mac OS up to version 9 and OS-9

I translate this into these line endings in general:

  • Windows: '\r\n'
  • Mac (OS 9-): '\r'
  • Mac (OS 10+): '\n'
  • Unix/Linux: '\n'

You need to make your scanner/parser handle the Unix version, too.
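
One way to handle all three conventions at once is to split on a pattern that accepts any of them. This is only a minimal sketch: the class name LineEndings and the method splitLines are made up for illustration, and the sample text is not real /etc/passwd content.

```java
import java.util.Arrays;
import java.util.List;

class LineEndings {
    // CR+LF is listed first so that a Windows line ending is consumed
    // as one break instead of being seen as CR followed by LF.
    private static final String ANY_LINE_ENDING = "\r\n|\r|\n";

    // Hypothetical helper: splits text on Windows, old Mac, or Unix line endings.
    static List<String> splitLines(String text) {
        return Arrays.asList(text.split(ANY_LINE_ENDING));
    }

    public static void main(String[] args) {
        // Sample input that mixes all three conventions.
        String mixed = "first\r\nsecond\nthird\rfourth";
        System.out.println(splitLines(mixed)); // prints [first, second, third, fourth]
    }
}
```

Note that String.split drops trailing empty strings, so a final line ending does not produce an extra empty entry.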

Solution 2:

You can get the standard line ending for your current OS from:

System.getProperty("line.separator")
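
As a small sketch of what that property contains, the following prints it with the control characters made visible. The class name ShowSeparator and the helper escape are hypothetical. Keep in mind that the value describes the platform the JVM is running on, not the file being read, so a file copied from Linux to Windows still has '\n' endings.

```java
class ShowSeparator {
    // Hypothetical helper: makes CR and LF visible as the text \r and \n.
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String fromProperty = System.getProperty("line.separator");
        // Since Java 7, System.lineSeparator() is a shortcut for the same value.
        String fromMethod = System.lineSeparator();

        // Typically shows \r\n on Windows and \n on Linux.
        System.out.println(escape(fromProperty));
        System.out.println(escape(fromMethod));
    }
}
```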

Solution 3:

The Scanner is breaking at the spaces, not at the line endings.

EDIT: The 'Scanning' Java Tutorial states:

By default, a scanner uses white space to separate tokens. (White space characters include blanks, tabs, and line terminators. For the full list, refer to the documentation for Character.isWhitespace.)

You can use the useDelimiter() method to change these defaults.
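
A minimal sketch of that approach, assuming the goal is one token per line: the delimiter is set to the line endings only, so spaces inside a field no longer split it. The class name PasswdScan and the method parse are made up for illustration, and the Scanner reads from a String here only to keep the example self-contained; in practice it could be constructed from the file instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class PasswdScan {
    // Hypothetical helper: returns the colon-separated fields of each line.
    static List<String[]> parse(String text) {
        List<String[]> entries = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            // Only line endings separate tokens now; spaces stay inside the token.
            scanner.useDelimiter("\r\n|\r|\n");
            while (scanner.hasNext()) {
                // Limit -1 keeps trailing empty fields, such as an empty shell entry.
                entries.add(scanner.next().split(":", -1));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        String sample = "list:x:38:38:Mailing List Manager:/var/list:/bin/sh\n";
        for (String[] fields : parse(sample)) {
            // prints: 7 fields, name = Mailing List Manager
            System.out.println(fields.length + " fields, name = " + fields[4]);
        }
    }
}
```

An alternative that needs no custom delimiter is to loop with hasNextLine() and nextLine(), which return whole lines regardless of which line ending the file uses.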