    Regex: select/match the numbers that are repeated most often

    • Vasile Caraus

      Hello guy038. I must say, I never thought about this method.

      But, you are the best.

      Thanks A LOT ! WORKS !

      • Vasile Caraus

        BUT the only problem is that it works on your examples, not on mine.

        Can the \R in your regular expressions be replaced with something else?

          • Vasile Caraus

            @guy038 said:

            SEARCH ^(\d(\d(\d(\d)?)?)?)(?:\t|\R)
            REPLACE (?2:0)(?3:0)(?4:0)\1\r\n

            This regex of yours, ^(\d(\d(\d(\d)?)?)?)(?:\t|\R), doesn’t work for me. It is the first one and the most important. The other regex works fine.

            But I found another way to do this. Suppose I have:

            17 25 30 37 38 47
            2 6 7 17 30 42
            3 17 20 38 44 45
            4 5 6 30 36 42

            Search: (Leave a single space)
            Replace by: \r

            then

            Search: ^(a*) (this matches at the beginning of each line)
            Replace by: 00

            and I will get something like this:

            0017
            0025
            0030
            0037
            0038
            0047
            002
            006
            007
            0017
            0030
            0042
            003
            0017
            0020
            0038
            0044
            0045
            004
            005
            006
            0030
            0036
            0042
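Note that prefixing every number with 00 leaves the results with different widths (0017 but 002). For comparison, here is a minimal Python sketch of fixed-width padding, which is what the (?2:0)(?3:0)(?4:0)\1 replacement quoted above appears to aim for. The function name pad_numbers and the default width of 4 are assumptions made for illustration only.

```python
def pad_numbers(text, width=4):
    """Return one zero-padded number per line for every number in text."""
    # str.split() with no argument splits on spaces, tabs and line breaks alike
    return "\n".join(number.zfill(width) for number in text.split())


sample = "17 25 30\n2 6 7"
padded = pad_numbers(sample)
print(padded)
```

With equal widths, sorting the lines as text puts identical numbers next to each other.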

            • Vasile Caraus

              @guy038 said:

              SEARCH (\d{4})\R\1

              REPLACE \1 \1 , with a space character, between the two back-references, \1

              This, again, does not work for me: (\d{4})\R\1. I pressed the “Replace All” button many times.
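For reference, this search only matches when each line holds exactly four digits and identical values sit on adjacent lines. Matches also cannot overlap, so one pass joins pairs only and several passes are needed. Below is a minimal Python sketch of that behaviour. It uses \r?\n as a stand-in for \R (an assumption, since Python's re module has no \R), and join_repeats is a hypothetical helper name.

```python
import re

# Same idea as (\d{4})\R\1, with \r?\n standing in for \R
PAIR = re.compile(r"(\d{4})\r?\n\1")


def join_repeats(text):
    """Join identical consecutive 4-digit lines with spaces."""
    while True:
        joined = PAIR.sub(r"\1 \1", text)  # one "Replace All" pass
        if joined == text:                 # nothing left to join
            return text
        text = joined


one_pass = PAIR.sub(r"\1 \1", "0017\n0017\n0017\n0025")
all_passes = join_repeats("0017\n0017\n0017\n0025")
print(repr(one_pass))
print(repr(all_passes))
```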

              • Claudia Frank @Vasile Caraus

                @Vasile-Caraus

                I know you are a regex fan, but just to give you an idea of what a Python script
                to solve such a problem would look like:

                from collections import Counter

                # editor and console are objects provided by the Python Script plugin
                numbers = editor.getText().split()                   # split on any whitespace, so no empty strings
                counted_list = Counter(numbers)                      # dict-like: number (as a string) -> count
                for item in counted_list.most_common(4):             # iterate over the top 4 (number, count) pairs
                    console.write('{}\n'.format(item))               # and print each pair to the console
                

                I used the list of 1000 integers @guy038 posted.
                The result in the console would be:

                ('7', 45)
                ('27', 41)
                ('8', 40)
                ('13', 40)

                meaning that the number 7 occurred 45 times.
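The same counting can be tried outside Notepad++. The sketch below is plain Python, without the plugin's editor and console objects. The sample text is made up for illustration: it reuses four lines of numbers quoted earlier in this thread, not the list of 1000 integers.

```python
from collections import Counter

# Made-up sample: four lines of numbers with Windows line endings
text = "17 25 30 37 38 47\r\n2 6 7 17 30 42\r\n3 17 20 38 44 45\r\n4 5 6 30 36 42"

numbers = text.split()            # split on any whitespace, so no empty strings
counts = Counter(numbers)         # dict-like: number (as a string) -> occurrences
top = counts.most_common(2)       # list of (number, count) pairs, highest count first

for number, count in top:
    print("{} occurred {} times".format(number, count))
```

The numbers stay strings here, so '7' and '07' would be counted separately.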

                Cheers
                Claudia

                • Vasile Caraus

                  @Claudia-Frank said:

                  n idea how a pytho

                  Hello Claudia, I don’t know Python, so I really don’t know what to do with the Python script you wrote above.

                  • guy038

                    Hello Claudia,

                    I’ve just tested your Python solution, changing it to report the six most common numbers with the counted_list.most_common(6) expression, and it returns exactly the numbers that I had previously found for the list of 1000 random integers :-)

                    How elegant a Python (or Lua, I suppose) script is, compared to my complicated regex cooking!

                    Cheers,

                    guy038

                    • Vasile Caraus

                      Claudia and guy038, please tell me how to use this Python script!

                      • Vasile Caraus

                        A short tutorial for this example would be great!

                        • Claudia Frank @Vasile Caraus

                          @Vasile-Caraus

                          What needs to be done first is described here.

                          In case you haven’t installed the Python Script plugin yet, I would suggest using the MSI package instead of the plugin manager.

                          Short version: once the Python Script plugin has been installed, go to
                          Plugins->Python Script->New Script
                          give it a name and press save.
                          A new empty editor should appear.
                          Copy the content into it and save it.
                          Do NOT reformat the code, as Python is strict about whitespace.

                          Open the python script console by clicking on
                          Plugins->Python Script->Show Console

                          Open your file with the numbers and run the script by clicking on
                          Plugins->Python Script->Scripts->NAME_OF_YOUR_SCRIPT
                          Cheers
                          Claudia

                          • Vasile Caraus

                            WORKS GREAT Claudia.

                            Thanks a lot !

                            • Vasile Caraus

                              By the way, Claudia, how can I use Python (like your script) to actually modify the .txt file? For now, Python only shows the results of some functions from the script in the console. How can I use a Python script to search and replace something in .txt files?

                              • Claudia Frank @Vasile Caraus

                                @Vasile-Caraus

                                If you want to dive into Python, the first thing, of course, is to get some basic knowledge of the language itself.
                                Either use one of the YouTube videos or, if you prefer to read, https://www.python.org/about/gettingstarted/.
                                Note that the plugin uses Python 2, NOT 3 (there are differences, nothing too critical, but they can be confusing
                                if you are just starting to learn the language and try something that works in Python 3 but not in Python 2).

                                Next, the help pages which come with the plugin itself:
                                Plugins->Python Script->Context-Help

                                And last but not least, the Scintilla documentation at http://www.scintilla.org/ScintillaDoc.html, to get a better
                                understanding of how the editor works.

                                The console is a good starting point to test things first.
                                To list all functions and attributes of a Python object, you can use the dir command.
                                So, if you run the following in the console, you will get the list of functions of this object:

                                dir(editor)
                                

                                I prefer not to have to scroll sideways, so I use

                                print '\n'.join(dir(editor))
                                

                                To see what the parameters of a function are, use the help command, like

                                help(editor.insertText)   
                                

                                Next, if you search the forum you will find many scripts that solve particular issues.
                                One of my first posts answered a question about unit conversion:
                                https://notepad-plus-plus.org/community/topic/10966/unit-conversion-plugin/13

                                And finally, ask here if you have a specific question.

                                Cheers
                                Claudia

                                Ahh… I would suggest making the following change in Notepad++:
                                under Settings->Preferences->Language, check “replace by space”, because
                                Python doesn’t like it if you mix tabs and spaces for indentation.
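On the question of changing the file rather than printing to the console, here is a minimal sketch in plain Python, independent of the plugin. It reads a text file, applies a regex replacement and writes the result back. The helper name replace_in_file and the file name numbers.txt are hypothetical, and the syntax shown is for Python 3 (the plugin itself uses Python 2, as noted above). Inside the plugin, dir(editor) is the place to look for the editor object's own text-changing functions.

```python
import os
import re
import tempfile


def replace_in_file(path, pattern, replacement):
    """Apply a regex replacement to a text file in place; return the match count."""
    # newline="" reads and writes line endings exactly as they are in the file
    with open(path, "r", encoding="utf-8", newline="") as handle:
        text = handle.read()
    new_text, count = re.subn(pattern, replacement, text)
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(new_text)
    return count


# Demonstration on a throwaway file with Windows line endings
demo_path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(demo_path, "w", encoding="utf-8", newline="") as handle:
    handle.write("17 25 30\r\n2 6 7\r\n")

# Put a 0 in front of every single-digit number
changed = replace_in_file(demo_path, r"\b(\d)\b", r"0\1")
print(changed)
```

Working on a copy of the file first is a sensible precaution, since the original content is overwritten.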

                                • Scott Sumner @Claudia Frank

                                  @Claudia-Frank

                                  Regarding print '\n'.join(dir(editor))

                                  I don’t think that ‘print’ outputs to the Pythonscript console window by default.

                                  From the following in the original startup.py:

                                  # This sets the stdout to be the currently active document, so print "hello world",
                                  # will insert "hello world" at the current cursor position of the current document
                                  sys.stdout = editor

                                  This is of dubious value, especially since a ‘print’ used in this way inserts the text specified plus a UNIX-style line ending into your current file (which likely has Windows-style line endings!).

                                  I, and likely also Claudia, have changed this line in startup.py to be:

                                  sys.stdout = console

                                  thus changing ‘print’ statements to output their data to the Pythonscript console (great for debugging your scripts!)

                                  As alluded to above, the Pythonscript console seems to use UNIX-style line endings. I found this out in an odd way. If you copy-and-paste from the console to an editing window with Windows line endings, the line-endings on the source text will be changed at the time of the paste to match the destination file format, so all is good. HOWEVER, what I did one time was to paste via the “Clipboard History” window. This action seems to preserve the original UNIX-style line endings at the destination! I was quite confused as to why I had inconsistent line-endings in my document, until I figured it out.
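The effect of assigning to sys.stdout can also be seen outside Notepad++. In this minimal sketch, the hypothetical class FakeConsole stands in for the plugin's console object (any object with a write method will do), and the captured text shows the single \n that print appends.

```python
import sys


class FakeConsole(object):
    """Hypothetical stand-in for the plugin's console: it just collects text."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

    def flush(self):
        pass  # nothing is buffered, but print may call flush()


fake_console = FakeConsole()
original_stdout = sys.stdout
sys.stdout = fake_console          # same idea as sys.stdout = console
try:
    print("hello world")           # goes to fake_console, not the terminal
finally:
    sys.stdout = original_stdout   # always restore the real stdout

captured = "".join(fake_console.chunks)
print(repr(captured))
```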

                                  • Claudia Frank @Scott Sumner

                                    @Scott-Sumner

                                    Scott, you are absolutely correct. I’ve changed this in startup.py,
                                    and for me this is much more convenient than using console.write to
                                    print text to the console.
                                    Just a side note: the command
                                    print '\n'.join(dir(editor))
                                    should be executed in the console itself, and there it works.
                                    If someone used it in a script, it would print to the editor unless
                                    they made the change Scott mentioned.

                                    Thx for the info about copy/paste - I do this a lot but luckily I didn’t use the history ;-)

                                    Cheers
                                    Claudia
