
    Regex for Searching <HEAD> Section

    • aksarben

      I want to use Notepad++ to find soft hyphen characters (ISO 8859: 0xAD, Unicode U+00AD SOFT HYPHEN, HTML: &shy; or &#173;) in the <head> section of my HTML files. I tried the two regular expressions below, but both return zero hits.

      <head>.*?­.*?</head>
      <head>.*­.*</head>
      

      Curiously, the following regex does find soft hyphens in <figcaption> sections:

      <figcaption>.*?­.*?</figcaption>
      

      I suspect the issue is that the <head> section contains newlines. I tried the search with the “. matches newline” option both checked and unchecked, and still got zero hits either way.

      Is there a way to do this kind of search in Notepad++?

      • Alan Kilborn @aksarben

        @aksarben

        I think the code blocks you used above are hiding your soft-hyphen character, at least visually. I find that if I copy and paste them into Notepad++, the soft-hyphen character reappears.

        Anyway, I would try searching for: (?s)<head>.*?\x{00AD}.*?</head>

        I think there have been some recent postings about Unicode characters used explicitly in the Find-what box of the Find dialog not working correctly…?
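As a rough illustration outside Notepad++, the effect of the `(?s)` flag in the suggested pattern can be tried with Python's `re` module. This is only a sketch under assumptions: Python's engine is not the Boost engine that Notepad++ uses, it writes the code point as `\xAD` rather than `\x{00AD}`, and the `SAMPLE` string is made up for the test.

```python
import re

# Hypothetical sample: a multi-line <head> with one soft hyphen (U+00AD) in the title.
SAMPLE = (
    "<html>\n<head>\n<title>Soft\u00adhyphen</title>\n</head>\n"
    "<body></body>\n</html>\n"
)

# Without (?s), "." stops at newlines, so a multi-line <head> section cannot match.
no_dotall = re.search(r"<head>.*?\xAD.*?</head>", SAMPLE)

# With (?s), "." also matches newlines, so the whole section is matched.
with_dotall = re.search(r"(?s)<head>.*?\xAD.*?</head>", SAMPLE)

print(no_dotall)                # None
print(with_dotall is not None)  # True
```

Writing the character as an escape also sidesteps the problem of an invisible literal soft hyphen being lost when a pattern is copied and pasted.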

        • guy038

          Hello, @aksarben, @alan-kilborn and All,

          Simply use this regex search/replace (S/R):

          SEARCH (?s)(.*?<head>|\G)((?!</head>).)*?\K\xAD

          REPLACE Any SINGLE character or STRING

          Notes :

          • I assume, of course, that there is only one <head>........</head> section per file

          • The <head>........</head> section can either sit on a single line or be split across several lines

          • Any soft hyphen found before the opening <head> tag is ignored

          • Any soft hyphen between the opening and closing tags is matched individually

          • Any soft hyphen found after the closing </head> tag is ignored

          • When testing on a single file, preferably tick the Wrap around option, which forces the S/R process to start from the very beginning of the file
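The behaviour listed in the notes above can be approximated outside Notepad++ as a sanity check. In the search regex, `\G` anchors at the end of the previous match and `\K` drops everything matched so far, so only the soft hyphen itself is reported. Python's `re` module supports neither, so the sketch below takes a different, two-step route: locate the section first, then scan inside it. The function name `soft_hyphen_offsets` and the `SAMPLE` string are hypothetical, and a single <head> section per input is assumed.

```python
import re

def soft_hyphen_offsets(html):
    """Return the offset of every U+00AD inside the first <head>...</head> section."""
    # Assumption: only one <head> section exists; (?s) lets "." span newlines.
    head = re.search(r"(?s)<head>(.*?)</head>", html)
    if head is None:
        return []
    start = head.start(1)  # offset of the first character after <head>
    return [start + m.start() for m in re.finditer("\u00ad", head.group(1))]

# Hypothetical sample: soft hyphens before, inside (twice), and after the section.
SAMPLE = "a\u00adb<head>\nx\u00ady\n\u00ad</head>c\u00add"

print(soft_hyphen_offsets(SAMPLE))  # [11, 14]
```

Only the two soft hyphens between the tags are reported; the ones at offsets 1 and 23, outside the section, are skipped.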

          Best Regards,

          guy038
