    Notepad++ and NUL characters

    • Alan Kilborn

      A change is coming soon to Notepad++ which will allow proper handling of NUL characters when searching; see https://github.com/notepad-plus-plus/notepad-plus-plus/pull/16469 for a preview.

      This got me wondering how well Notepad++ is positioned to be a “file editor” rather than simply a “text editor”.

      Let’s discard for a moment that it doesn’t have native hex editor capabilities, and let’s pretend that arbitrary editing of files with encoded/binary content isn’t dangerous…

      I’m specifically wondering how ready Notepad++ is to not have ANY stumbling blocks over the NUL character. Historically this has been a problem because a NUL character has been used internally by Notepad++ to signify that “the content of this string variable ends here”. Ideally, a NUL character is not treated that way, and is the same as any other character.

      Does anyone interested in this conversation have examples of things you’d want to do with Notepad++ that you currently can’t do, because the NUL character gets in the way?

      • Mark Olson @Alan Kilborn

        @Alan-Kilborn
        Most plugin messages that select text from a file (e.g., SCI_GETTEXT, SCI_GETSELTEXT) still treat the text as a NUL-terminated string.

        • rdipardo @Mark Olson

          @Mark-Olson said in Notepad++ and NUL characters:

          Most plugin messages that select text from a file (e.g., SCI_GETTEXT, SCI_GETSELTEXT) still treat the text as a NUL-terminated string.

          Indeed, but I think the ultimate limiting factor is the Win32 API, which is basically ANSI C (properly speaking, it’s the C++98 standard, which is so old as to no longer even resemble C++ as most developers understand it today).

          The NUL byte will always have special significance as long as you are passing “string” data to C library functions. C has no dedicated “string” type; the closest thing is an array of char, and such an array has no way of storing its length (as, for example, a Pascal string can ¹) except by using a NUL byte to mark the end.
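To make the difference concrete, here is a minimal sketch in C. The `counted_str` type and the helper names are hypothetical and exist only for illustration; they are not taken from Notepad++, Scintilla, or the Win32 API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type (illustration only): the length
   travels with the data, so a NUL byte is just another byte. */
typedef struct {
    size_t len;
    const char *data;
} counted_str;

/* Length as a C library function sees it: counting stops at the first NUL. */
size_t c_string_length(const char *s) {
    assert(s != NULL);
    return strlen(s);
}

/* Length of a Pascal-style string: the first byte holds the length,
   which is why such strings top out at 255 bytes. */
size_t pascal_string_length(const char *p) {
    assert(p != NULL);
    return (size_t)(unsigned char)p[0];
}

/* Length of a counted string: whatever the creator recorded. */
size_t counted_length(counted_str s) {
    return s.len;
}
```

For the five bytes `a b NUL c d`, `c_string_length` reports 2, while a `counted_str` built with an explicit length of 5 still knows about all five bytes.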


          ¹

          The old Macintosh operating system used Pascal strings everywhere. Many C programmers on other platforms used Pascal strings for speed. Excel uses Pascal strings internally which is why strings in many places in Excel are limited to 255 bytes, and it’s also one reason Excel is blazingly fast.

          For a long time, if you wanted to put a Pascal string literal in your C code, you had to write:

          char* str = "\006Hello!";
          

          https://www.joelonsoftware.com/2001/12/11/back-to-basics

          • Ekopalypse @Alan Kilborn

            @Alan-Kilborn

            I have no idea what additional possibilities this would give me; for me, Npp is and remains a pure text editor.
            I’m a bit surprised that this is being implemented because, from my point of view, it signals that any file you can search this way can also be edited, and that’s just not the case. But it is what it is, or what it will be.

            • mkupper @Alan Kilborn

              Microsoft Windows’ copy/paste mechanism for text uses NUL-terminated text strings. You can get around this in Notepad++ using Edit / Paste Special / ... There are three sub-options that support NUL:

              • Copy Binary Content
              • Cut Binary Content
              • Paste Binary Content
              • Alan Kilborn @mkupper

                @mkupper said:

                Microsoft Windows’ copy/paste mechanism for text uses NUL-terminated text strings.

                Ah, okay, so here’s a current Notepad++ limitation: (Normal) Copying of some text that contains NUL character(s) won’t paste back the NUL character(s), even if one is staying within Notepad++ for both operations.

                • MarkusBodensee @Alan Kilborn

                  @Alan-Kilborn said in Notepad++ and NUL characters:

                  Ah, okay, so here’s a current Notepad++ limitation: (Normal) Copying of some text that contains NUL character(s) won’t paste back the NUL character(s), even if one is staying within Notepad++ for both operations.

                  Normal copying in Notepad++ converts NUL characters to spaces, so pasting back does not truncate the content: you get the same length that was copied, but with each NUL replaced by a space.

                  But as @mkupper posted, there are also actions that keep binary content as it is.

                  If Copy Binary Content is used and then Ctrl + V is used to paste, the pasted content is cut off at the first NUL character.
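The two behaviours described here can be sketched in C as follows. This is only an illustration of the observable effect, not Notepad++'s actual code; both function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replace every NUL in buf[0..len) with a space, in place.
   Returns the number of bytes replaced; the length stays the same,
   which mirrors the "normal copy" behaviour described above. */
size_t nul_to_space(char *buf, size_t len) {
    assert(buf != NULL);
    size_t replaced = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            buf[i] = ' ';
            replaced++;
        }
    }
    return replaced;
}

/* Number of bytes that survive when buf[0..len) is treated as a
   NUL-terminated string: everything before the first NUL, or all of
   it if there is none. This mirrors the truncated paste. */
size_t text_paste_length(const char *buf, size_t len) {
    assert(buf != NULL);
    const char *nul = memchr(buf, '\0', len);
    return nul ? (size_t)(nul - buf) : len;
}
```

For the five bytes `A NUL B NUL C`, a NUL-terminated paste keeps only 1 byte, whereas converting first keeps all 5 as `A B C`.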

                  • Alan Kilborn @Alan Kilborn

                    @Alan-Kilborn said:

                    here’s a current Notepad++ limitation

                    So, to be clear, I meant this as a limitation when considering Notepad++ being an “editor” and not just a “text editor”. In such a case, having to do special things, e.g. “Copy/Cut/Paste” Special wouldn’t be necessary…within Notepad++. Trying to get data to the outside world, e.g. via the external clipboard is a different endeavor.

                    • MarkusBodensee @Alan Kilborn

                      @Alan-Kilborn said in Notepad++ and NUL characters:

                      Trying to get data to the outside world, e.g. via the external clipboard is a different endeavor.

                      This depends more on how other applications handle the clipboard and paste command. Notepad++ is able to fill the clipboard with normal or special copy as you like.

                      • MarkusBodensee

                        Back to the initial post: I think the new feature does not change the category of editor in any way. I would consider it more of an alignment with what the user expects when searching in a file and looking at the search results.

                        My expectation of search results is that they display exactly the same content as the file I was searching in, regardless of whether the file content is only text or binary mixed with text.

                        Of course, Notepad++ is primarily a text editor. But the ++ indicates that it is so much more than that.

                        Opening a binary file with Notepad++ is a valid use case. Otherwise Notepad++ should prevent opening such binary files.

                        I am very happy that I am able to open binary files with Notepad++. It is useful, and it is useful to search in those binary files as well.

                        One use case, for example: I have a binary build result from C++ and want to compare its build date with other build results. I open the binary in Notepad++; the build date and time are there, and I can identify them quickly, or even search for them. No additional tool is needed. Sure, there might be more valid ways to do something like this, but why not?
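Finding a date string inside a binary only works if the search does not stop at the first NUL. A minimal sketch of a length-aware search in C is shown below; `find_bytes` is a hypothetical helper written for this example and is not how Notepad++ or Scintilla implement searching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Search hay[0..hay_len) for needle[0..needle_len) using explicit
   lengths, so embedded NUL bytes are compared like any other byte.
   Returns the offset of the first match, or (size_t)-1 if none. */
size_t find_bytes(const char *hay, size_t hay_len,
                  const char *needle, size_t needle_len) {
    assert(hay != NULL && needle != NULL);
    if (needle_len == 0 || needle_len > hay_len) {
        return (size_t)-1;
    }
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0) {
            return i;
        }
    }
    return (size_t)-1;
}
```

By contrast, `strstr` treats both arguments as NUL-terminated strings, so it never looks past the first NUL in the buffer and cannot search for a pattern that itself contains a NUL.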

                        Another use case: Enterprise Architect (a UML modelling tool) once exported a corrupted XML file; it had put some binary characters in, most likely because of an encoding issue. I was not able to import this XML file into Enterprise Architect on another PC. What I did: searched the corrupted XML with Notepad++, deleted these characters, and saved. After that, the import worked.
                        Using Notepad++ for this was a super easy solution. So yes, for me it is not only a text editor but a text editor ++ that also includes a bit of binary handling.

                        • mkupper @Alan Kilborn

                          @Alan-Kilborn said in Notepad++ and NUL characters:

                          So, to be clear, I meant this as a limitation when considering Notepad++ being an “editor” and not just a “text editor”.

                          It is possible to work around this. When an application loads stuff into the Windows copy/paste buffer the application provides the information in multiple formats. For example, if you are using a web browser and copy something from a web page then the web browser will be uploading various forms of plain text and also various forms of HTML, and usually more.

                          When you use “paste” in an application the code examines the list of available formats and picks the one that seems like the best fit. Notepad++ picks one of the plain text formats. Notepad++'s Edit / Paste Special menu has options for grabbing an HTML format blob and another option for grabbing an RTF format blob and dropping the results into Notepad++'s editing area.

                          The workaround is that applications are allowed to define their own formats, including ones that are generated on the fly. Notepad++ could take advantage of this by creating an npp_binary_text format. If the text being copied contains a NUL, then upload the data in the binary blob format and also upload npp_binary_text with the string value true. On pasting, Notepad++ would examine the list of available formats, and if npp_binary_text is there and set to true, it would grab the binary data format. This would allow Notepad++ to signal to itself that it is dealing with a Notepad++ to Notepad++ copy/paste while maintaining compatibility with older versions of Notepad++ and other applications.
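The copy and paste logic proposed above could look roughly like the following C sketch. It uses a toy in-memory clipboard rather than the real Windows clipboard so that it stays portable; the `clipboard` type, the helper functions, and the format names `"text"` and `"binary"` are all assumptions made for illustration. Only `npp_binary_text` comes from the proposal itself. On Windows, the equivalent steps would presumably go through calls such as `RegisterClipboardFormat`, `SetClipboardData`, and `IsClipboardFormatAvailable`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FORMATS 8

/* One clipboard entry: a named format plus length-tagged bytes. */
typedef struct {
    const char *format;
    const char *data;
    size_t len;
} clip_entry;

/* Toy stand-in for the system clipboard (not the Win32 API). */
typedef struct {
    clip_entry entries[MAX_FORMATS];
    size_t count;
} clipboard;

static void clip_put(clipboard *cb, const char *format,
                     const char *data, size_t len) {
    assert(cb->count < MAX_FORMATS);
    cb->entries[cb->count++] = (clip_entry){ format, data, len };
}

static const clip_entry *clip_get(const clipboard *cb, const char *format) {
    for (size_t i = 0; i < cb->count; i++) {
        if (strcmp(cb->entries[i].format, format) == 0) {
            return &cb->entries[i];
        }
    }
    return NULL;
}

/* Copy: always publish a plain-text view (here simply cut at the first
   NUL, a simplification). If the selection contains a NUL, also publish
   the full bytes plus the npp_binary_text marker set to "true". */
void copy_selection(clipboard *cb, const char *sel, size_t len) {
    cb->count = 0;
    const char *nul = memchr(sel, '\0', len);
    size_t text_len = nul ? (size_t)(nul - sel) : len;
    clip_put(cb, "text", sel, text_len);
    if (nul != NULL) {
        clip_put(cb, "binary", sel, len);
        clip_put(cb, "npp_binary_text", "true", 4);
    }
}

/* Paste: prefer the binary entry when the marker is present and true;
   otherwise fall back to plain text, as other applications would. */
const clip_entry *choose_paste(const clipboard *cb) {
    const clip_entry *marker = clip_get(cb, "npp_binary_text");
    if (marker && marker->len == 4 && memcmp(marker->data, "true", 4) == 0) {
        const clip_entry *bin = clip_get(cb, "binary");
        if (bin) {
            return bin;
        }
    }
    return clip_get(cb, "text");
}
```

An application that knows nothing about the marker only ever asks for the plain-text entry, which is what keeps the scheme compatible with older versions of Notepad++ and with other programs.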

                          Microsoft Office uses its own internal formats that allow for a richer experience when copying and pasting within Office applications. For example, there are format blobs that contain metadata about what is available in the normal well-known formats. You can copy data from an Office application and it’ll still look good if you paste into a non-Office application. If you paste the same thing into an Office application then you get extras such as the document’s original time stamps, the name(s) of the document creators and editors, etc.

                          • ryangray01

                            Re: Notepad++ and NUL characters
                            That’s a really insightful direction, Alan. The shift toward proper NUL character handling in Notepad++ is a big step forward — especially for folks who use it for tasks beyond basic text editing. I’ve run into limitations before when inspecting log files or network dumps where embedded NULs would either cut off the view or cause weird behavior. One thing I’d personally love is being able to search and navigate those NULs just like any other character — maybe even replace them selectively without having to rely on external hex editors. If Notepad++ can bridge that gap, it would get a lot closer to being a lightweight but capable “file editor,” which would be huge.

                            • PeterJones @ryangray01

                              @ryangray01 ,

                              note from moderator: please reply to the original topic, rather than creating a new topic, otherwise readers lose all context. I have fixed this post (and one other), but it’s really better if you just keep the reply in the same topic to begin with.

                              update: two of your three posts so far seemed like AI nonsense; two of your three posts so far have replied into a separate topic from the post being replied to; it is really looking like you are just here to disrupt communication, rather than participate in community. If your next post is as meaningless as two of yours, or if it requires extra moderator effort to re-connect it to context, I am going to conclude that you are an AI, and thus react according to our forum’s policy on AI nonsense.
