Can TXT files in Windows hold the same exact text but be of different sizes because one was created a long time before?

Solution 1:

I take "it was automatic" to mean that the new file was saved with the default encoding of Notepad or some other text editor.

The default text encoding in older versions of Notepad is ANSI, which uses one byte per character. As the old encoding was Unicode, which may use more than one byte per character, this may explain the size difference.
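To make the size difference concrete, here is a small Python sketch that encodes the same text three ways and counts the bytes. It assumes "ANSI" means code page 1252 (the usual one on Western-language Windows installs; yours may differ), and the sample text is made up for illustration.

```python
# Same text, three encodings: compare the number of bytes each one produces.
# Assumption: "ANSI" is Windows-1252 (cp1252); other locales use other code pages.
text = "Hello, world!\r\n" * 100  # 15 characters per line, Windows line endings


def encoded_size(s: str, encoding: str) -> int:
    """Return how many bytes the string occupies in the given encoding."""
    return len(s.encode(encoding))


sizes = {enc: encoded_size(text, enc) for enc in ("cp1252", "utf-8", "utf-16-le")}
for enc, n in sizes.items():
    print(f"{enc:10} {n} bytes")
```

For plain English text like this, the ANSI and UTF-8 sizes match and the UTF-16LE size is double. A file saved with a byte order mark would be a few bytes larger still, which this sketch does not include.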

You should check the result to make sure that no data was lost during the copy-paste, for example by non-ANSI characters being truncated or replaced with ANSI characters that only look similar.
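One way to run that check is to test which characters of the original text cannot be represented in the ANSI code page at all. The sketch below is only an illustration: the function name is hypothetical, and cp1252 is assumed to be the ANSI code page in use.

```python
def chars_lost_in_ansi(text: str, codepage: str = "cp1252") -> list:
    """Return the distinct characters that the given code page cannot encode.

    Assumption: cp1252 stands in for "ANSI"; pass another code page if needed.
    """
    lost = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            # This character has no byte in the code page, so saving as ANSI
            # would drop it or substitute something else.
            if ch not in lost:
                lost.append(ch)
    return lost


sample = "Price: 5 € for the naïve Ω-3 blend"
print(chars_lost_in_ansi(sample))
```

An empty list means the text survives the conversion unchanged. Anything listed is a character to look for in the new file.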

Solution 2:

Check the encoding by opening each file in Notepad and then opening the Save As dialog. The Encoding selector will be preset to the encoding that was used when the file was saved.
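If you would rather not open each file, a rough alternative is to look at the first few bytes for a byte order mark (BOM). This is a sketch, not a full detector: it only recognises the BOMs Notepad writes, and a file without a BOM could be ANSI or BOM-less UTF-8. The function name is made up for this example.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Guess an encoding from a leading byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Note: a UTF-32LE BOM also begins with these two bytes; that case is
    # ignored here on the assumption that the file came from Notepad.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "no BOM (ANSI or UTF-8 without BOM)"


# In-memory examples; for a real file, read its first bytes in binary mode.
print(sniff_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))
print(sniff_bom(b"hi"))
```

For a file on disk you would pass something like the result of `open(path, "rb").read(4)`, where `path` is whichever file you are checking.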

If it's not encoding, check for Alternate Data Streams in the larger file:

  1. Open a Command Prompt in the directory containing the file in question.
  2. A dir /R command will reveal any files with Alternate Data Streams:

(Files with ADS are most commonly found in the Downloads directory, where the streams serve as Zone.Identifier markers.)

C:\Users\keith\Downloads>dir /AA /R
 Volume in drive C is Windows
 Volume Serial Number is F057-590D

 Directory of C:\Users\keith\Downloads

10/02/2015  09:20 PM         1,730,272 ActiveSetupN.exe
05/12/2019  11:49 AM             8,312 Add_Settings_to_desktop_context_menu.reg
                                   280 Add_Settings_to_desktop_context_menu.reg:Zone.Identifier:$DATA
03/27/2018  10:29 PM         1,747,504 adksetup.exe
02/07/2018  09:18 PM            55,296 advchange.exe

...

06/03/2019  03:32 PM           165,184 amtrak-vacations-network-map-2019.jpg
                                   245 amtrak-vacations-network-map-2019.jpg:Zone.Identifier:$DATA

...

              92 File(s)  2,139,170,500 bytes
               0 Dir(s)  610,151,268,352 bytes free
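With a long listing, the stream lines are easy to miss. As a rough sketch, the Python below picks them out of saved dir /R output. It assumes the English-locale layout shown above (sizes with comma separators, stream lines ending in :$DATA); the function name and pattern are my own, not part of any Windows tool.

```python
import re

# A stream line is indented, starts with a size, and ends in name:stream:$DATA.
ADS_LINE = re.compile(r"^\s+([\d,]+)\s+(.+):([^:]+):\$DATA\s*$")


def find_streams(dir_output: str) -> list:
    """Return (file name, stream name, size in bytes) for each stream line."""
    found = []
    for line in dir_output.splitlines():
        m = ADS_LINE.match(line)
        if m:
            size = int(m.group(1).replace(",", ""))
            found.append((m.group(2), m.group(3), size))
    return found


sample = """\
05/12/2019  11:49 AM             8,312 Add_Settings_to_desktop_context_menu.reg
                                   280 Add_Settings_to_desktop_context_menu.reg:Zone.Identifier:$DATA
03/27/2018  10:29 PM         1,747,504 adksetup.exe
"""

for name, stream, size in find_streams(sample):
    print(f"{name} has stream {stream} ({size} bytes)")
```

You could feed it real output by redirecting the command to a file (dir /R > listing.txt) and reading that file in.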

Solution 3:

There are MANY "Unicode" character encodings. In Windows-speak, "Unicode" has usually meant UTF-16LE, so that English text would occupy 2 bytes per character. The default encoding in Notepad is now UTF-8, which would halve the size of the same standard English text, including the "usual" special characters.