Reading from ifstream won't read whitespace

There is a manipulator to disable the whitespace skipping behavior:

stream >> std::noskipws;

The operator>> eats whitespace (space, tab, newline). Use yourstream.get() to read each character.
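To make the difference concrete, here is a small sketch that reads the same text three ways. The helper names (read_with_extraction, read_with_get) are made up for this illustration, and a std::istringstream stands in for the file stream.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Reads every character with operator>>, optionally with std::noskipws set.
std::string read_with_extraction(const std::string& text, bool skip_whitespace)
{
  std::istringstream in(text);
  if (!skip_whitespace)
    in >> std::noskipws;   // keep spaces, tabs and newlines

  std::string result;
  char c;
  while (in >> c)
    result += c;
  return result;
}

// Reads every character with the unformatted get(), which never skips.
std::string read_with_get(const std::string& text)
{
  std::istringstream in(text);
  std::string result;
  char c;
  while (in.get(c))
    result += c;
  return result;
}
```

With the default flags, read_with_extraction("a b\tc\n", true) should drop the space, tab and newline, while the other two variants return the text unchanged.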

Edit:

Beware: Platforms (Windows, Un*x, Mac) differ in how they encode a newline. It can be '\n', '\r', or both ("\r\n"). What you actually read also depends on how you open the file stream (text or binary).
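If you are unsure which convention a file uses, one option is to open it in binary mode and count the line endings yourself. The following is only a sketch; NewlineCounts and count_newlines are hypothetical names, and the std::string overload exists just so it can be tried without a file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Tallies of the three newline conventions found in a stream.
struct NewlineCounts
{
  int crlf = 0;  // "\r\n" (Windows)
  int lf = 0;    // "\n" alone (Un*x)
  int cr = 0;    // "\r" alone (classic Mac)
};

// Counts line endings in a stream; open files with std::ios::binary
// so the runtime does not translate anything before you see it.
NewlineCounts count_newlines(std::istream& in)
{
  NewlineCounts counts;
  char c;
  while (in.get(c))
  {
    if (c == '\r')
    {
      if (in.peek() == '\n') { in.get(c); ++counts.crlf; }  // consume the pair
      else ++counts.cr;
    }
    else if (c == '\n')
    {
      ++counts.lf;
    }
  }
  return counts;
}

// Convenience overload for in-memory text.
NewlineCounts count_newlines(const std::string& text)
{
  std::istringstream in(text);
  return count_newlines(in);
}
```

For a real file you would pass something like std::ifstream f(filename, std::ios::binary) to the first overload.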

Edit (analyzing code):

After

  while(input.get(current) && current != L'\n');
  continue;

there will be a \n in current, unless end of file was reached. After that you continue with the outermost while loop, where the first character of the next line is read into current. Is that not what you wanted?

I tried to reproduce your problem (using char and cin instead of wchar_t and wifstream):

//: get.cpp : compile, then run: get < get.cpp

#include <iostream>

int main()
{
  char c;

  while (std::cin.get(c))
  {
    if (c == '/') 
    { 
      char last = c; 
      if (std::cin.get(c) && c == '/')
      {
        // std::cout << "Read to EOL\n";
        while(std::cin.get(c) && c != '\n'); // this comment will be skipped
        // std::cout << "go to next line\n";
        std::cin.putback(c);
        continue;
      }
      else { std::cin.putback(c); c = last; }
    }
    std::cout << c;
  }
  return 0;
}

This program, applied to itself, eliminates all C++ line comments in its output. The inner while loop doesn't eat up all text to the end of file. Please note the putback(c) statement. Without that the newline would not appear.

If it doesn't work the same for wifstream, that would be very strange, except for one possible reason: the opened text file is not saved with 16-bit characters, so the \n char ends up in the wrong byte...


You could open the stream in binary mode:

std::wifstream stream(filename, std::ios::binary);

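You'll lose the newline translation the stream otherwise performs if you do this. Note that binary mode on its own does not stop operator>> from skipping whitespace; you still need std::noskipws or get() for that.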

The other option is to read the entire stream into a string and then process the string:

std::wostringstream ss;
ss << filestream.rdbuf();

Of course, getting the string from the ostringstream requires an additional copy of the string, so you could consider changing this at some point to use a custom stream if you feel adventurous. EDIT: someone else mentioned istreambuf_iterator, which is probably a better way of doing it than reading the whole stream into a string.
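Wrapped into a helper, the rdbuf() approach could look roughly like this. The function name slurp is made up for the example, and it takes any std::wistream so it works for a wifstream as well as for an in-memory stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Copies everything left in `in` into a wide string, whitespace included.
std::wstring slurp(std::wistream& in)
{
  std::wostringstream ss;
  ss << in.rdbuf();   // stream-buffer to stream-buffer copy, no formatting involved
  return ss.str();    // this is the extra copy mentioned above
}
```

You can then walk the returned string with ordinary indexing or iterators, and every space, tab and newline the stream delivered will still be there.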


Wrap the stream (or its buffer, specifically) in a std::istreambuf_iterator? That should ignore all formatting, and also give you a nice iterator interface.
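A minimal sketch of the iterator approach, using std::istreambuf_iterator from the iterator header. The function names read_all and count_newline_chars are hypothetical; the point is only that the iterator pair hands you every character, whitespace included.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>   // std::wistringstream, handy for trying this without a file
#include <string>

// Builds a string from everything left in the stream, without skipping anything.
std::wstring read_all(std::wistream& in)
{
  return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                      std::istreambuf_iterator<wchar_t>());
}

// Counts L'\n' characters straight from the stream buffer, no intermediate string.
std::size_t count_newline_chars(std::wistream& in)
{
  return static_cast<std::size_t>(
      std::count(std::istreambuf_iterator<wchar_t>(in),
                 std::istreambuf_iterator<wchar_t>(),
                 L'\n'));
}
```

Both functions consume the stream, so call only one of them per stream (or reopen the file in between).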

Alternatively, a much more efficient, and fool-proof, approach might be to just use the Win32 API (or Boost) to memory-map the file. Then you can traverse it using plain pointers, and you're guaranteed that nothing will be skipped or converted by the runtime.