    Standard ANSI and code still change to something else

    9 Posts 4 Posters 92 Views
    • N
      NolanNolan
      last edited by NolanNolan

      Hi

      I use v8.8.7 on Windows 11 Pro 25H2.

      When I right-click to create a new txt file, press Enter, and then double-click the file to open it in NPP, it is ANSI.

      When I enter some text and save, it is saved in ANSI as expected.

      But when I enter the Danish letters æøå and save, it is suddenly saved as something else, e.g. Windows 1255.

      It shows the encoding character set as Hebrew, which is wrong; it should not show anything.

      Why is that?

      How do I always create and save in ANSI?

      thanks

      Nolan

      • PeterJonesP
        PeterJones @NolanNolan
        last edited by PeterJones

        @NolanNolan ,

        “ANSI” is the American National Standards Institute. One of the things they did during 80s-era computing was define different encodings that put various character sets into the 256-character page limit of an 8-bit character. At some point, computer newbies confused the organization that defined them with the encodings themselves, and the world was stuck with incorrectly calling all those 256-character encodings “ANSI”, even when referring to the 8-bit encodings that were specific to Microsoft’s DOS and later Windows operating systems.

        When doing encodings for the “Windows” GUI OS, MS thoughtfully named their encoding standards as “WIN-####” or “Windows ####” (people write them both ways). For example, Windows 1252 is the Microsoft encoding that’s nearly identical to ISO-8859-1 (ISO is the International Organization for Standardization, an international standards body, whereas ANSI is US-specific). Windows 1255, which you mentioned, is an encoding for Hebrew.

        When Notepad++ says “ANSI”, what it means is, “using whichever 8-bit encoding your installation of Windows has set as the default character set / codepage”. (This gets even more confusing now that you can tell the OS to use 65001, the UTF-8 codepage, which causes many unexpected bugs, since Notepad++ is not expecting multi-byte characters when in ANSI mode.)

        But anyway, when you save a file, Notepad++ writes those bytes to disk based on the default Windows encoding, but the OS does not save any metadata about the encoding of the file. (Back in the DOS days, FAT and FAT32 didn’t have enough space to store such metadata; and when MS made NTFS for NT, they could have added metadata like encoding, but chose not to.) That means that when any application, Notepad++ or otherwise, reads the file later, it has no way to know for sure what encoding was used for a given “text file”, either from the information in the file itself or from the non-existent metadata.

        As a result, Notepad++ uses a set of heuristics to guess, based on byte frequency and byte sequences, what encoding it probably is. But it often guesses wrong, which is why my recommendation is to always turn off Settings > Preferences > MISC > Autodetect character encoding. Assuming that the majority of “ANSI” text files you are reading are made by you on your same computer with the same default codepage/encoding, you shouldn’t need Notepad++ to “guess” what encoding it thinks it is: you can just let it always apply the Windows default encoding when it reads the file.
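
To make the ambiguity concrete, here is a minimal Python sketch (not anything Notepad++ itself runs) that stores æøå as Windows-1252 bytes and then decodes those same bytes as Windows-1255. `cp1252` and `cp1255` are Python's codec names for those code pages; the variable names are only illustrative.

```python
import unicodedata

text = "æøå"

# What an "ANSI" save produces when the default code page is Windows-1252.
raw = text.encode("cp1252")
print(raw.hex(" "))

# Nothing in those bytes records which code page wrote them,
# so a reader has to assume or guess one.
as_western = raw.decode("cp1252")
as_hebrew = raw.decode("cp1255")

# Print character names (plain ASCII) so the output works on any console.
for label, decoded in (("cp1252", as_western), ("cp1255", as_hebrew)):
    print(label, [unicodedata.name(ch) for ch in decoded])
```

Both decodes succeed without any error, which is why a heuristic that picks the wrong code page goes unnoticed until the letters look wrong.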

        Or you could do something that brings you into this century, by using the UTF-8-with-BOM or one of the UTF-16 encodings, any of which will unambiguously be able to encode any of the 160k or so characters defined by Unicode – which allows you to mix characters from across the world without any ambiguity of 1980’s style 8-bit encodings. If you have a choice in your data, choose UTF-8 or UTF-16; if you have no choice, complain to whoever is not giving you the choice that they are hindering efficiency by forcing you to continue to use outdated 1980’s character sets instead of a modern encoding built to interface with the whole world.
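
As a rough illustration of how these choices differ on disk, the following Python sketch prints the bytes that æøå becomes under each encoding. It assumes Windows-1252 as the "ANSI" code page, which is typical for Western European systems but not guaranteed; the codec name `utf-8-sig` is Python's name for UTF-8 with BOM.

```python
text = "æøå"

ansi = text.encode("cp1252")         # one byte per letter, no signature
utf8 = text.encode("utf-8")          # two bytes per letter here, no signature
utf8_bom = text.encode("utf-8-sig")  # EF BB BF signature, then the UTF-8 bytes
utf16 = text.encode("utf-16")        # 2-byte BOM, then two bytes per letter here

for label, data in (
    ("cp1252", ansi),
    ("utf-8", utf8),
    ("utf-8-sig", utf8_bom),
    ("utf-16", utf16),
):
    print(f"{label:10} {len(data):2d} bytes  {data.hex(' ')}")
```

Only the BOM variants start with a signature that identifies the encoding; the other two are just bytes that a reader has to interpret on its own.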

        • N
          NolanNolan @PeterJones
          last edited by

          @PeterJones
          Thanks, I disabled autodetect just in case, but what I observed now was that after a reboot it seemed to work as expected again.

          Whatever the reason for this was, I don't know.

          The reason I use ANSI is the following:

          I use Danish Windows 11 Pro 25H2, i.e. the latest version. If I create a txt file with the built-in notepad.exe application, which uses UTF-8, and write the Danish characters æ, ø, å, then neither Windows Search nor Copernic Desktop Search can find any file containing æ, ø, å, because the content is interpreted as letters other than æ, ø, å. I assume this is dictated by the OS, which is from 2025. If I save txt files in ANSI, Windows Search and Copernic find the files with æ, ø, å perfectly, but not if they are saved in UTF-8.

          This is both annoying and very weird.

          Saving in ANSI does not seem like a proper solution, more of a workaround.

          It does not make sense to have to use ancient ANSI to make this work, and it seems contradictory that the OS does not read UTF-8 correctly when its own native notepad.exe saves in UTF-8.

          Any explanation or proper solution to this?

          best Nolan

          • PeterJonesP
            PeterJones @NolanNolan
            last edited by PeterJones

            @NolanNolan said in Standard ANSI and code still change to something else:

            @PeterJones
            Thanks, I disabled autodetect just in case, but what I observed now was that after a reboot it seemed to work as expected again.

            Glad that helped.

            … neither Windows Search nor Copernic Desktop Search can find any file containing æ, ø, å, because the content is interpreted as letters other than æ, ø, å. I assume this is dictated by the OS, which is from 2025.

            I guess I’d never tried using Windows search to look for UTF-8 characters. That’s really annoying if they don’t handle that right. You’d think Microsoft would’ve figured that out long ago.

            This is both annoying and very weird.

            Understandable.

            Any explanation or proper solution to this?

            Sorry, I have no insight into the OS level searches.

            Nor do I have a proper solution. But, as an alternate workaround, instead of using Windows Search, use Notepad++'s Find in Files to search your files for UTF-8 characters? ;-)


            BTW: You didn’t need to make that post twice. As the form tells you: until you have enough reputation/upvotes, you need to wait for a moderator to approve your post, so it won’t be visible immediately, so that’s why you couldn’t see your post. However, it looks like you now have enough upvotes so that your posts will go thru without moderator approval, so you shouldn’t have to wait for the post queue any more.

            • CoisesC
              Coises @NolanNolan
              last edited by

              @NolanNolan said in Standard ANSI and code still change to something else:

              I use Danish Windows 11 Pro 25H2, i.e. the latest version. If I create a txt file with the built-in notepad.exe application, which uses UTF-8, and write the Danish characters æ, ø, å, then neither Windows Search nor Copernic Desktop Search can find any file containing æ, ø, å, because the content is interpreted as letters other than æ, ø, å. I assume this is dictated by the OS, which is from 2025. If I save txt files in ANSI, Windows Search and Copernic find the files with æ, ø, å perfectly, but not if they are saved in UTF-8.

              Try saving (whether in Notepad or Notepad++) as UTF-8 with BOM. In the absence of a byte order mark, Windows assumes files use the legacy code page associated with the system locale.
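
The effect of that rule can be sketched in Python. This is only an illustration of the behavior described above, not the actual code of Windows Search or any other tool; the function name `decode_like_legacy_reader` is hypothetical, and Windows-1252 is assumed as the legacy code page.

```python
import codecs


def decode_like_legacy_reader(data: bytes, legacy_codepage: str = "cp1252") -> str:
    """Hypothetical reader: trust a BOM if present, else assume the legacy code page."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")  # the codec consumes the BOM itself
    return data.decode(legacy_codepage, errors="replace")


with_bom = "æøå".encode("utf-8-sig")
without_bom = "æøå".encode("utf-8")

seen_with_bom = decode_like_legacy_reader(with_bom)
seen_without_bom = decode_like_legacy_reader(without_bom)

# A search for "æ" only matches when the reader saw the BOM.
print("æ" in seen_with_bom)
print("æ" in seen_without_bom)
```

Under these assumptions the BOM-less file is read as six unrelated Western European characters, so a search for æ, ø, or å has nothing to match.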

              • N
                NolanNolan @PeterJones
                last edited by

                @PeterJones
                Thanks Peter, your help was again very helpful and insightful, much appreciated :-)

                best Nolan

                • N
                  NolanNolan @Coises
                  last edited by NolanNolan

                  @Coises

                  Yes, you are indeed right. I just picked ANSI as the first option that worked as a workaround for Windows Search to find æøå as content in txt files, but I tested UTF-8 BOM and this format also works, while UTF-8 without BOM does not. Thanks for your suggestion; that will be a more modern choice and my default encoding from here on.

                  But it is really weird that using Microsoft's own notepad.exe, which comes with a standard Windows installation, makes Windows Search not detect characters in txt files that belong to the installation language of the OS.

                  I have also tried to find a way to set the txt encoding system-wide in the OS, but couldn't find any. So I guess the way to go is to default to UTF-8 BOM through the Notepad++ app (by the way, this can't even be set in the native Microsoft notepad.exe app).

                  Thanks :-)

                  best Nolan

                  • N
                    NolanNolan @Coises
                    last edited by NolanNolan

                    @Coises

                    I have now changed the encoding to UTF-8 BOM, assuming this sets the default for all new txt files, but new files are still created as ANSI when I right-click empty space in File Explorer and create a new one. But when I open Notepad++ as an app, it opens with the default UTF-8 BOM as expected. Have I missed something regarding a setting?

                    5dc26c2f-b1ba-41a2-b8af-66fb86798dcb-image.png

                    best Nolan

                    • Thomas AndersonT
                      Thomas Anderson
                      last edited by

                      Notepad++ does not change the encoding as you type; with Autodetect character encoding enabled, it guesses the encoding from the bytes when it opens a file. The Danish letters æ, ø, å are part of Windows-1252, the usual ANSI code page on Western European systems, but once they are in the file the guess can go wrong (for example, misdetecting Windows-1255).

                      To always save in ANSI:

                      Go to Settings → Preferences → New Document → Encoding.

                      Select ANSI as the default.

                      Leave “Apply to opened ANSI files” unchecked; it belongs to the UTF-8 option, not to ANSI.

                      Note: “ANSI” here means the system's default code page. On Western European systems that is normally Windows-1252, which includes æøå.

                      This makes new files default to ANSI, but Notepad++ may still show a different code page when reopening a file if autodetection is left on.
