HTML files with charset = iso: I don't wanna see the diacritics (accent marks) in bold
-
I used Python to parse text from an old website into a new website. The old website has charset=iso-8859-1 and the new one has charset="utf-8". What is the best way to keep letters with diacritics (accent marks) from showing up in bold? I tried switching the charset between iso-8859-1 and utf-8, in both directions. Same result: the diacritics are still highlighted.
This is the code of the text in the image:
<p class="text_obisnuit">Într-o oarecare măsură, fără să întămpin vreo dificultate, dacă mi-aş măsura capacităţile de inventator prin experienţa confruntării cu moartea, cu privire la modul de a schimba o constantă a reprezentării cuceririi ştiinţei pe o traiectorie axiomatică unitară, atunci probabil m-aş ciocni de o latură trecută cu vederea, o excepţie, ceva ce nu aş fi gândit că pot face. O fi doar o problemă de credinţă şi de for interior?</p>
<p class="text_obisnuit">Totuşi, sunt un inventator-autodidact, şi pe această cale sunt îndreptăţit să accept modul de încadrare a invenţiilor mele în categoria celor ce nu se rostesc, dar se închipuiesc. Fără excepţie, lucrul cu materia se poate transforma într-o relaţie desfăşurată între ceea ce proiectez ca intenție, și efectivitatea protecţiei pe care natura mi-o asigură cu un singur scop: pentru a-i lărgi valenţele de “miracol” în afara materiei vizibile.</p>
What should I do?
-
This forum is for Notepad++ questions. Your question has nothing to do with Notepad++: the answer will be the same whether you are using Notepad++, MS notepad.exe, or copy con. If you think “I am typing this with Notepad++, so it should be on topic,” then you haven’t read our FAQ, which explains why that is a false interpretation, using the example of baking cookies.
But I’ll give you a hint: on my machine, that HTML doesn’t display with bold characters.
(My guess is that it’s a font issue on your PC.)
Further, the snippet you showed has no characters outside of the ASCII range, so it doesn’t matter whether you have set charset="iso-8859-1" or charset="utf-8". If you don’t understand why having no characters outside of the ASCII range necessarily implies the “so…” part of my previous sentence, you need to go find a better tutorial on character encodings and HTML, because you obviously don’t understand the technology you are working with sufficiently. If you still don’t understand, you will have to find a forum that’s about HTML and web formatting, not one for a particular editor, and ask there. The Notepad++ Community Forum is not the right place for further discussion on this.
You can even use Notepad++ to prove to yourself that it doesn’t matter which charset you pick, given the data you showed:
- FIND = [^\x20-\x7e\r\n], which finds any character that is not between ASCII 32 (0x20) and ASCII 126 (0x7E) and is not a CR or LF newline character.
- COUNT
In your snippet, it finds 0 characters outside of that range. That means there is nothing in that snippet which is not ASCII, and thus nothing that will be different between ISO-8859-1 and UTF-8.
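The same check can be run outside the editor. Below is a minimal Python sketch, assuming the same character class as the FIND expression above; the function name count_non_ascii and the sample strings are illustrative and not from the original post.

```python
import re

# Same character class as the FIND expression: anything that is not
# printable ASCII (0x20-0x7E) and not a CR or LF.
NON_ASCII = re.compile(r"[^\x20-\x7e\r\n]")


def count_non_ascii(text: str) -> int:
    """Return how many characters fall outside printable ASCII, CR, and LF."""
    return len(NON_ASCII.findall(text))


if __name__ == "__main__":
    # Hypothetical sample strings, not taken from the thread.
    print(count_non_ascii('<p class="text_obisnuit">plain ASCII</p>\r\n'))
    print(count_non_ascii("caf\u00e9"))  # contains one accented character
```

A count of zero means the text is pure ASCII, so the declared charset cannot change how it is decoded.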

OTOH, if I add the characters ÀÁÅËË and do the COUNT again, it now counts 5 matches in the file, for those five characters.
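Why those five characters matter can be sketched in Python: ASCII-only text encodes to identical bytes under ISO-8859-1 and UTF-8, while the accented characters do not. The helper name same_bytes is an illustrative assumption, not something from the thread.

```python
def same_bytes(text: str) -> bool:
    """True if text encodes to identical bytes in ISO-8859-1 and UTF-8.

    Note: encoding to ISO-8859-1 raises UnicodeEncodeError for characters
    that do not exist in that charset; the samples below avoid those.
    """
    return text.encode("iso-8859-1") == text.encode("utf-8")


if __name__ == "__main__":
    accented = "\u00c0\u00c1\u00c5\u00cb\u00cb"  # the five characters ÀÁÅËË

    print(same_bytes('<p class="text_obisnuit">'))  # ASCII only
    print(same_bytes(accented))
    # ISO-8859-1 uses one byte per character here; UTF-8 uses two.
    print(len(accented.encode("iso-8859-1")), len(accented.encode("utf-8")))
```

If a file containing such characters is saved in one encoding but declared with the other charset, the browser decodes the bytes differently, which is when the charset choice starts to matter.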