    Define custom syntax highlighting with Pythonscript plugin

    • Mikhail V

      I want to try to customize the syntax colors using the Pythonscript plugin.
      I’ve looked into examples by @Claudia-Frank here:
      custom colors for methods

      One thing I want to know: is it possible to make the code more ‘procedural-style’?

      In those examples I am trying to understand what all those class, __metaclass__, etc.
      things are and how it all works together.
      Looks really hacky, to me at least. I am quite proficient with programming
      but when I see a word class, I already get some kind of disturbance.

      So the question is basically - is involving classes and method tweaking the only
      approach in this case?

      I am thinking about the algorithm and how I would go about colorising
      characters, and it comes down to this pseudo-program:

      1. Take the indexes of start and end of visible chunk -
        no problem, I can do it.
      2. Find the indexes of needed characters sequence -
        and store it in say “group 1”. No problem so far.
      3. Do same for some other searches, store as “group 2”,
        “3”, … etc. So end up with N groups of indexes.
      4. In a loop, take each index group and mark it with its own style, i.e.
        group i -> style i

      It seems everything is super-simple and I can cope with each step easily.
      But is it possible to make the above steps into working example without
      going into PhD-level meta-programming techniques?
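One way the four steps above might look in procedural style is sketched below. This is only a minimal sketch, not the code from the linked example: the function names `find_groups` and `apply_groups`, the starting indicator number 8 and the sample patterns are assumptions made for illustration. It uses indicators, as the linked example does, and takes the editor object as a parameter so that the search step can be tried outside Notepad++. Scintilla positions are byte offsets, so the sketch assumes the searched text is a byte string or plain ASCII.

```python
import re


def find_groups(text, patterns, offset=0):
    """Steps 2 and 3: one list of (start, length) pairs per regex.

    `offset` is the document position at which `text` starts (step 1).
    """
    groups = []
    for pattern in patterns:
        spans = []
        for match in re.finditer(pattern, text):
            length = match.end() - match.start()
            if length > 0:  # skip empty matches, there is nothing to mark
                spans.append((offset + match.start(), length))
        groups.append(spans)
    return groups


def apply_groups(ed, groups, first_indicator=8):
    """Step 4: mark group i with indicator number first_indicator + i.

    `ed` is expected to be PythonScript's `editor` object. The default of 8
    is an assumption; pick numbers that nothing else is using.
    """
    for i, spans in enumerate(groups):
        ed.setIndicatorCurrent(first_indicator + i)
        for start, length in spans:
            ed.indicatorFillRange(start, length)


# Inside Notepad++ the visible chunk (step 1) could be read roughly like
# this, ignoring folding and line wrapping:
#   first = editor.getFirstVisibleLine()
#   last = min(first + editor.linesOnScreen(), editor.getLineCount() - 1)
#   start = editor.positionFromLine(first)
#   end = editor.getLineEndPosition(last)
#   text = editor.getTextRange(start, end)
#   apply_groups(editor, find_groups(text, [r"\bdef\b", r"\d+"], start))
```

How each indicator looks (color, shape) would still have to be set up once, for example with `editor.indicSetStyle` and `editor.indicSetFore`, and old marks would need clearing with `editor.indicatorClearRange` before the visible chunk is marked again.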

      • Scott Sumner @Mikhail V

        @Mikhail-V said:

        but when I see a word class, I already get some kind of disturbance

        Well-worded…is the feeling similar to motion sickness? :-D

        • Mikhail V @Scott Sumner

          @Scott-Sumner said:

          Well-worded…is the feeling similar to motion sickness? :-D

          Yes similar to that :)
          Or for example a feeling when I have to fill in an application form
          which should be easy, but the wording of questions gets irritating.

          • Claudia Frank

            @Mikhail-V

            I’m confident that the classes can be rewritten to use a more procedural-style as you said.

            There are two reasons why I used classes here.

            The first one involves the metaclass.
            The script uses callbacks, and the intention is to use it on several
            documents, with the possibility to assign a different lexer and, if
            needed, reassign this “pseudo”-lexer. I wanted to make sure that the
            callback function is registered only once; otherwise the callback
            function gets called multiple times. That already explains what the
            metaclass feature does: it ensures that you always get the same
            object when calling EnhancedUDLLexer.

            The second reason is similar to the first one - make sure that the variables/functions
            used are not overwritten by another (or the same) script as it would break the script immediately.

            So, if you plan to use such a script on multiple documents then you need to solve this issue
            otherwise you might see unexpected behavior.
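For a procedural script, the "register the callback only once" part might be handled with a small registry instead of a metaclass. This is only a sketch under assumptions: it relies on the names a script defines surviving between runs (which the second reason above implies), and `register_once`, `_callbacks`, `add` and `remove` are hypothetical names, not part of PythonScript.

```python
# Keep the registry from a previous run if there is one (assumed to work
# because scripts share their global names between runs).
try:
    _callbacks
except NameError:
    _callbacks = {}


def register_once(name, callback, add, remove):
    """Make `callback` the only function registered under `name`.

    `add` and `remove` do the actual (un)registering, so the same logic
    works with any callback API.
    """
    previous = _callbacks.get(name)
    if previous is not None:
        remove(previous)  # drop the function left over from an earlier run
    add(callback)
    _callbacks[name] = callback
```

In PythonScript, `add` might be `lambda f: editor.callback(f, [SCINTILLANOTIFICATION.UPDATEUI])` and `remove` might be `editor.clearCallbacks`; running the script twice should then leave one callback registered rather than two.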

            Cheers
            Claudia

            • Mikhail V @Claudia Frank

              @Claudia-Frank said:

              I’m confident that the classes can be rewritten to use a more procedural-style as you said.
              There are two reasons why I used classes here.

              Thanks for the clarification :) I hope it does not look like I want to make fun of
              your coding style. It is just my pet peeve (OOP).

              So I was able to run that example in the linked post.
              Though I can run only on the new Npp version with the latest Pythonscript plugin.
              I suspect you have some knowledge of non-documented features ;)

              So I am still trying to understand the possibilities in the first place.
              Earlier you have written about possibilities:

              a) writing a lexer entirely with python script
              b) writing an “xml-linter” with python script

              This is taken out of context, but it seems you have some experience.
              So maybe you know the answer for these 2 questions:

              1. Can I apply a styler to specific range? I make
                emphasis on “styler” because in your examples you deal only with
                “indicator” which only changes the color, but I want to change font and
                size of the characters if it is possible.

              2. Can I get the information about currently applied styler or
                any info about the lexer state for the character at specific index?

              In Scintilla Documentation I find this:

              SCI_GETLINESTATE(int line) → int
              As well as the 8 bits of lexical state stored for each character there 
              is also an integer stored for each line. 
              

              This mentions some “state for each character”. ?? But I can’t find any other calls
              that refers to this state. I guess the active lexer should store this useful info, but how to read it?

              I am just trying to understand how plausible is the idea of writing a fully
              custom highlighting with the PS plugin, and I am still in doubt.

              • Mikhail V

                Follow-up: regarding question #1, there are related Scintilla API calls:

                SCI_GETENDSTYLED → position
                SCI_STARTSTYLING(int start, int unused)
                SCI_SETSTYLING(int length, int style)
                SCI_SETSTYLINGEX(int length, const char *styles)
                SCI_SETIDLESTYLING(int idleStyling)
                SCI_GETIDLESTYLING → int
                SCI_SETLINESTATE(int line, int state)
                SCI_GETLINESTATE(int line) → int
                SCI_GETMAXLINESTATE → int
                

                So I suppose these might work to set the styles somehow.
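In PythonScript these messages are exposed as methods on the `editor` object, so applying a style to a range could be tried along the lines below. This is a hedged sketch: the style number 40, the font and the size are arbitrary assumptions, `define_style` and `style_range` are hypothetical helper names, and the styling will only stick if no other lexer restyles the text afterwards.

```python
def define_style(ed, style, font, size, fore):
    """Describe how text carrying `style` should look."""
    ed.styleSetFont(style, font)
    ed.styleSetSize(style, size)
    ed.styleSetFore(style, fore)  # (red, green, blue) tuple


def style_range(ed, start, length, style):
    """Assign `style` to `length` bytes beginning at position `start`."""
    # SCI_STARTSTYLING: the second argument is documented as unused; 31 was
    # the usual style-bit mask in older Scintilla versions.
    ed.startStyling(start, 31)
    # SCI_SETSTYLING: styles `length` bytes and moves the styling position on.
    ed.setStyling(length, style)


# Possible use inside Notepad++ (values are examples only):
#   define_style(editor, 40, "Consolas", 14, (200, 0, 0))
#   style_range(editor, 0, 10, 40)
```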

                • Claudia Frank @Mikhail V

                  @Mikhail-V

                  I hope it does not look like I want to make fun of your coding style.

                  No problem, I didn’t understand it that way anyway. :-)
                  But I do hope to get criticism when I post something which could/should
                  be coded differently. For example, like guy038 does when I post regexes which
                  can be simplified or aren’t correct at all.
                  I’m still learning new things every day. If you had asked 6 months ago
                  whether it is possible to have two different lexers acting on the same document,
                  I would have posted NO; nowadays I know better, or I should say, nowadays I know a
                  way around that problem.

                  So I was able to run that example in the linked post.
                  Though I can run only on the new Npp version with the latest Pythonscript plugin.
                  I suspect you have some knowledge of non-documented features ;)

                  I hope I haven’t used undocumented features, but yes, pythonscript > 1.0.8 is needed
                  as we have added notepad functions like notepad.getLanguageName(notepad.getLangType())
                  only recently. I should have made that clear, thanks for pointing it out.

                  Before answering the two questions, let me clarify the two ways Scintilla supports
                  for colorizing documents, as far as I understand them.
                  The first one, used from the beginning of Scintilla, is styling; indicators were added later.
                  The idea behind indicators is totally different from styling and, as far as I understand, they were not
                  intended as another way of styling. I just misuse them as a kind of lightweight styler.
                  And yes, there are differences between styling and indicators.
                  A lexer (styler) gets the information about which part of the document needs to be restyled; indicators don’t get this info.
                  With styling you have full control over every aspect of a style, such as a different font;
                  with indicators you can only modify the foreground and background color
                  (ignoring, for the moment, the different shapes you can put around text).
                  But because the two are handled independently they can be used together, and from my understanding indicators are
                  the only safe way to enhance an existing lexer.

                  Can I apply a styler to specific range? I make
                  emphasis on “styler” because in your examples you deal only with
                  “indicator” which only changes the color, but I want to change font and
                  size of the characters if it is possible.

                  You can, by using the styling functions, but at the price of having to write your own lexer.
                  That means you cannot use it together with an existing lexer.

                  Can I get the information about currently applied styler or
                  any info about the lexer state for the character at specific index?

                  Styles can be retrieved, e.g. with the editor.getStyleAt function.
                  You cannot get a lexer state from a builtin lexer or a UDL lexer.
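As a small illustration of reading styles back, the sketch below walks a range with `editor.getStyleAt` and groups neighbouring positions that share a style. The helper name `style_runs` is hypothetical, and calling `getStyleAt` once per byte is simple rather than fast.

```python
def style_runs(ed, start, end):
    """Return [(run_start, run_length, style), ...] for positions start..end-1."""
    runs = []
    pos = start
    while pos < end:
        style = ed.getStyleAt(pos)
        run_start = pos
        # extend the run while the following positions carry the same style
        while pos < end and ed.getStyleAt(pos) == style:
            pos += 1
        runs.append((run_start, pos - run_start, style))
    return runs


# Possible use inside Notepad++:
#   for run_start, run_length, style in style_runs(editor, 0, editor.getLength()):
#       console.write("%d %d %d\n" % (run_start, run_length, style))
```

The style numbers that come back only say which style slot the active lexer assigned to each byte; what a given number means depends on that lexer.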

                  SCI_GETLINESTATE(int line) → int
                  As well as the 8 bits of lexical state stored for each character there
                  is also an integer stored for each line.
                  This mentions some “state for each character”. ?? But I can’t find any other calls
                  that refers to this state. I guess the active lexer should store this useful info, but how to read it?

                  My understanding is that this is stored by Scintilla internally rather than by the lexer, but I haven’t really checked the sources.
                  The additional line state functions were introduced to give a lexer a way to have sub-lexers working,
                  like in the html lexer, where different lines need to be colored differently depending on which sub-lexer (php, js, html …) is used.
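To illustrate the idea of one integer per line, here is a toy sketch that records which sub-language each line ends in and stores it with `editor.setLineState`. The state values, the use of `<?php` and `?>` as the only switches, and the function names are assumptions for illustration; the real html lexer is far more involved.

```python
HTML, PHP = 0, 1  # hypothetical state values


def end_states(lines, state=HTML):
    """Return the sub-language that is active at the end of each line."""
    states = []
    for line in lines:
        pos = 0
        while True:
            # look for the marker that would switch out of the current state
            marker = "<?php" if state == HTML else "?>"
            found = line.find(marker, pos)
            if found < 0:
                break
            state = PHP if state == HTML else HTML
            pos = found + len(marker)
        states.append(state)
    return states


def store_states(ed, states, first_line=0):
    """Save one integer per line so a later pass can resume from it."""
    for i, state in enumerate(states):
        ed.setLineState(first_line + i, state)
```

A later styling pass that starts at line n could then call `editor.getLineState(n - 1)` to learn which sub-language it is in, instead of rescanning the document from the top.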

                  I am just trying to understand how plausible is the idea of writing a fully
                  custom highlighting with the PS plugin, and I am still in doubt.

                  I would say it depends - a full python lexer is always slower than a builtin or UDL lexer,
                  but nowadays, with such computing power, you might not even notice the difference
                  if the documents to be colored have only thousands, and not millions, of lines.
                  Writing your own lexer gives you full control, which means you can possibly do what a builtin or UDL lexer can’t do,
                  like having a regex based lexer.
                  Whether it makes sense or not is always up to the one who thinks about a possible solution.
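What a regex based lexer could boil down to is sketched below: a list of (pattern, style) rules is turned into consecutive (length, style) runs that cover the whole text, which is the form `editor.setStyling` consumes. This is only a minimal illustration; the rule format, the style numbers and the names `regex_runs` and `apply_runs` are assumptions. It also assumes the text is a byte string or plain ASCII, since Scintilla counts in bytes.

```python
import re


def regex_runs(text, rules, default_style=0):
    """Split `text` into (length, style) runs; earlier rules win on overlap."""
    styles = [default_style] * len(text)
    # apply rules in reverse so that the first rule is written last and wins
    for pattern, style in reversed(rules):
        for match in re.finditer(pattern, text):
            for i in range(match.start(), match.end()):
                styles[i] = style
    runs = []
    for style in styles:
        if runs and runs[-1][1] == style:
            runs[-1] = (runs[-1][0] + 1, style)  # extend the current run
        else:
            runs.append((1, style))
    return runs


def apply_runs(ed, start, runs):
    """Hand the runs to Scintilla, starting at document position `start`."""
    ed.startStyling(start, 31)  # second argument unused in newer Scintilla
    for length, style in runs:
        ed.setStyling(length, style)
```

For example, `regex_runs("x = 42 # 7", [(r"#.*", 1), (r"\d+", 2)])` gives `[(4, 0), (2, 2), (1, 0), (3, 1)]`: the comment rule comes first, so the 7 inside the comment keeps the comment style. As far as I understand, such styling only survives when no builtin or UDL lexer is active on the buffer, which is the price mentioned above.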

                  I hope I was able to demystify this a little bit, if not, let me know.

                  Cheers
                  Claudia

                  • Claudia Frank @Mikhail V

                    @Mikhail-V

                    I have uploaded a proof-of-concept regex based lexer here.

                    Cheers
                    Claudia

                    • Mikhail V

                      @Claudia-Frank
                      Thanks for the input!

                      I have to do a lot of experimenting with it.
                      What comes to my mind: I could use some external libraries for
                      the lexical analysis, but then I have another task, to import and use 3rd-party
                      libraries with the PS plugin… but anyway, it seems that is not necessary; I have
                      regex and could loop over bytes as well, so that is not the main problem here.

                      All in all, sounds like an interesting challenge.
