    Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format

    • byzod

      The “Apply to opened ANSI files” option is only available when UTF-8 is the default format.

      What I want:
      Use UTF-8-BOM as the default encoding, and also treat ANSI files as UTF-8 (no BOM).

      Why:
      UTF-8 without BOM is s**t; UTF-8-BOM is the better option for a gentleman. But if you use UTF-8-BOM as the default encoding, you can’t use the “Apply to opened ANSI files” option.
      Thus ANSI files are opened with ANSI encoding, which causes massive problems when you paste Unicode content into one and save it as is (I have had at least 3 applications messed up by this).
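To make the failure mode concrete, here is a minimal Python sketch. It assumes that “ANSI” means Windows code page 1252 (the actual code page depends on the system locale) and that unencodable characters are replaced with “?” on save; the sample string is made up for illustration.

```python
# Assumption: "ANSI" is Windows-1252 here; the sample text is hypothetical.
text = "price: 5 €, name: 日本"

# Saving as ANSI: characters outside the code page cannot be represented,
# so with errors="replace" each of them becomes a literal "?".
ansi_bytes = text.encode("cp1252", errors="replace")
ansi_round_trip = ansi_bytes.decode("cp1252")
print(ansi_round_trip)  # price: 5 €, name: ??

# Saving as UTF-8 keeps every character.
utf8_round_trip = text.encode("utf-8").decode("utf-8")
print(utf8_round_trip == text)  # True
```

The euro sign survives because code page 1252 happens to include it; the two CJK characters do not, and once the file is saved they cannot be recovered.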

      Cheap optional solution:
      Show a modal dialog warning that there might be an encoding problem when saving an ANSI file that contains Unicode characters, just like Microsoft notepad.exe does.
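The check behind such a warning could look roughly like the sketch below. This is not how Notepad++ or notepad.exe implement it; the function name `unencodable_chars` and the default code page `cp1252` are assumptions chosen for illustration.

```python
def unencodable_chars(text, codepage="cp1252"):
    """Return the distinct characters in `text` that `codepage` cannot encode.

    Hypothetical helper: an editor could call this before saving and show
    a warning dialog if the returned list is not empty.
    """
    bad = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


problems = unencodable_chars("naïve café → 日本")
if problems:
    print("Warning: these characters cannot be saved as ANSI:", problems)
```

Accented Latin letters such as “ï” and “é” pass the check because code page 1252 contains them; the arrow and the CJK characters are reported.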

      • PeterJones @byzod

        @byzod

        That feature does not currently exist.

        If you would like to request that feature, please see the FAQ, which explains how and where to request a feature: https://community.notepad-plus-plus.org/topic/15741/faq-desk-feature-request-or-bug-report

        • gstavi @byzod

          @byzod said:

          utf-8 without bom is s**t

          I am curious whether you know what a BOM is, because a BOM for UTF-8 is truly stupid. The BOM was designed for 16-bit encodings, and UTF-8 is NOT a 16-bit encoding (the 8 in the name is a clue).

          Admittedly the existence of BOM in utf-8 files became a simple method to identify utf-8 encoding when opening a file, but Notepad++ should definitely not add a (stupid) BOM to an ANSI/utf-8 file unless the user explicitly requested it.
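For readers who have not looked at the bytes, the sketch below shows what the UTF-8 BOM is and how it can serve as an encoding signature. It uses Python's standard `codecs` module and the `utf-8-sig` codec; the helper `starts_with_utf8_bom` is a hypothetical name, and this is not a description of how Notepad++ detects encodings.

```python
import codecs

# The UTF-8 BOM is U+FEFF encoded as UTF-8: the three bytes EF BB BF.
# It carries no byte-order information; it only acts as a signature.
print(codecs.BOM_UTF8)  # b'\xef\xbb\xbf'

with_bom = "hi".encode("utf-8-sig")   # "UTF-8-BOM": signature + data
without_bom = "hi".encode("utf-8")    # plain UTF-8: data only


def starts_with_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: True if `data` begins with the UTF-8 signature."""
    return data.startswith(codecs.BOM_UTF8)


print(starts_with_utf8_bom(with_bom))     # True
print(starts_with_utf8_bom(without_bom))  # False

# The "utf-8-sig" decoder strips the signature if present and
# also accepts data that has none.
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig"))  # True
```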

          There are dozens of posts about these ANSI/UTF-8 issues. Feel free to browse them. See other people’s problems and opinions before proposing changes.

          It is also not clear what your problem is exactly. The only time ANSI vs. UTF-8 (w/o BOM) actually matters is when you edit the first non-ANSI symbol into the file. Do you do that often?

          • Robert Carnegie @gstavi

            @gstavi said in Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format:

            It is also not clear what your problem is exactly. The only time ANSI vs. UTF-8 (w/o BOM) actually matters is when you edit the first non-ANSI symbol into the file. Do you do that often?

            I may be misspeaking but I think you should be saying “ASCII” not “ANSI”. UTF-8 corresponds to ASCII, 7-bit character set, and the first 128 characters of Unicode (0 to 127), as single byte values; Unicode characters outside the first 128 are encoded differently. A UTF-8 file with no BOM and no non-ASCII data is, in fact, an ASCII text file.
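The claim about pure-ASCII content is easy to check. The sketch below encodes an ASCII-only string (a made-up sample) with several codecs from Python's standard library and confirms that the resulting bytes are identical.

```python
# Hypothetical sample containing only characters in the 0-127 range.
ascii_text = "Hello, world!\n"

encodings = ["ascii", "cp1252", "cp437", "utf-8"]
encoded = {name: ascii_text.encode(name) for name in encodings}

# Every codec produces the same bytes, so such a file is valid ASCII,
# valid "ANSI" (code page 1252) and valid UTF-8 without BOM all at once.
all_same = len(set(encoded.values())) == 1
print(all_same)  # True
```

This is why an editor cannot tell these encodings apart until the first character above 127 appears in the file.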

            https://en.wikipedia.org/wiki/ANSI_character_set
            indicates that one “official” “ANSI” character set doesn’t exist, but the Microsoft Windows 8-bit “code page 1252” is commonly called “ANSI”, including by Microsoft and Windows I think. This differs from ASCII by including symbols such as British money £ with codes above 127, and differs from “PC code page 437” in where some of these extra symbols are in the encoding.
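The pound sign makes the difference between these encodings visible. The sketch below uses Python's built-in `cp1252` and `cp437` codecs; the variable names are illustrative only.

```python
pound = "£"

# The same character is stored as different bytes in each encoding.
pound_bytes = {name: pound.encode(name).hex() for name in ("cp1252", "cp437", "utf-8")}
print(pound_bytes)  # {'cp1252': 'a3', 'cp437': '9c', 'utf-8': 'c2a3'}

# ASCII has no pound sign at all.
try:
    pound.encode("ascii")
    in_ascii = True
except UnicodeEncodeError:
    in_ascii = False
print(in_ascii)  # False

# Reading UTF-8 bytes as "ANSI" (code page 1252) gives the familiar mojibake.
misread = pound.encode("utf-8").decode("cp1252")
print(misread)  # Â£
```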

            I posted on some recent threads, about Notepad++ options which I have and haven’t tried, that may allow you to run more than one Notepad++ window at once and to have different configured settings in each window. If this works, then to avoid confusion, another option to run Notepad++ without saving and reloading a set of documents currently open (-nosession) may be appropriate.

            That is to say, I think you could run one Notepad++ window for editing UTF-8 as proposed, and a second window for editing “ANSI” as “ANSI”. The second one should be with “-nosession”, probably. And you can also (since 8.0.0) add a message “ANSI”, for instance, to the second Notepad++ window title (?)

            https://community.notepad-plus-plus.org/topic/22298/notepad-encoding-auto-detect-potential-problems/7

            https://community.notepad-plus-plus.org/topic/22304/how-to-open-notepad-with-a-new-empty-file/4

            • Alan Kilborn @Robert Carnegie

              @robert-carnegie said in Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format:

              I may be misspeaking

              Yep.

              but I think you should be saying “ASCII” not “ANSI”

              Nope.
