Using Notepad++ to remove accents
Today I'm sharing a tip that makes it easy to replace accented characters using only Notepad++ and its HTML Tag plugin.
Let’s do it!
First of all, we need the HTML Tag plugin installed. If you already have it, skip the installation steps below.
- If HTML Tag is missing, open Plugin Manager->Show Plugin Manager and find the HTML Tag plugin in the list.
- Select the plugin and click the Install button. Notepad++ will restart to complete the installation.
Now we can convert the special characters to their HTML entities. We need to do this first in order to remove the accents from our text.
- Put your text in the editor window and select all of it with Ctrl + A or Edit->Select All in the main menu.
- Open Plugins->HTML Tag->Encode Entities in the main menu, or press Ctrl + E.
Now all special characters are encoded as HTML entities.
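For readers who want to see what this encoding step amounts to outside the editor, here is a minimal Python sketch. It is only an approximation of what the plugin does, not its actual implementation: the function name `encode_entities` is hypothetical, and the choice to leave plain ASCII characters untouched is an assumption.

```python
from html.entities import codepoint2name


def encode_entities(text: str) -> str:
    """Replace non-ASCII characters that have a named HTML entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        # Assumption: ASCII characters (including & and <) are left as they are.
        if ord(ch) > 127 and name is not None:
            out.append(f"&{name};")
        else:
            out.append(ch)
    return "".join(out)


print(encode_entities("Ação é ótimo"))  # A&ccedil;&atilde;o &eacute; &oacute;timo
```

Each accented letter becomes its base letter followed by the accent name, which is what makes the next step possible.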
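- Open the Replace dialog (Ctrl + H or Search->Replace) and fill the Find what field with
&([a-zA-Z])(grave|acute|circ|uml|aring|cedil|slash|tilde);
and the Replace with field with
$1
Select Regular expression under Search Mode and click the Replace All button.
Note that the entity for å is &aring;, that is, the letter a followed by ring, so add ring to the list of alternatives if your text contains that character.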
The accented letters are now replaced by their unaccented versions, while all other special characters remain as HTML entities.
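The same replacement can be tried in Python to see which entities the expression touches. This is a sketch under a few assumptions: the function name `strip_accent_entities` is hypothetical, Python's `re` module stands in for the Notepad++ regex engine, and the `ring` alternative is an addition that is not in the expression from the post (it covers `&aring;`, which is `a` followed by `ring`).

```python
import re

# Expression from the post, plus "ring" (an addition, see the note above).
ENTITY_ACCENT = re.compile(
    r"&([a-zA-Z])(grave|acute|circ|uml|aring|ring|cedil|slash|tilde);"
)


def strip_accent_entities(encoded: str) -> str:
    """Keep only the base letter of each accent entity (group 1)."""
    return ENTITY_ACCENT.sub(r"\1", encoded)


sample = "A&ccedil;&atilde;o &eacute; &oacute;timo &euro;5 &amp; &aring;"
print(strip_accent_entities(sample))  # Acao e otimo &euro;5 &amp; a
```

Entities such as `&euro;` and `&amp;` do not end in one of the accent names, so they are left alone for the decoding step.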
- Now we can convert the remaining entities back to their original characters, but it is better to have the text in UTF-8 encoding first. Go to the Encoding menu and change the text encoding if it is not UTF-8. You can switch back to another encoding after this step.
To decode the HTML entities, use Ctrl + Shift + E or Plugins->HTML Tag->Decode Entities.
I hope this helps someone.
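As a rough illustration of the decoding step and of why UTF-8 is the safer choice, here is a small Python sketch. The function name `decode_entities` is hypothetical, `html.unescape` stands in for the plugin's Decode Entities command, and `latin-1` is used only as an example of a narrower single-byte encoding.

```python
import html


def decode_entities(text: str) -> str:
    """Turn the remaining HTML entities back into characters."""
    return html.unescape(text)


decoded = decode_entities("Acao e otimo: &euro;5 &copy; 2020 &szlig;")
print(decoded)  # Acao e otimo: €5 © 2020 ß

# UTF-8 can store every decoded character.
decoded.encode("utf-8")

# A narrower encoding may not: latin-1 has no euro sign.
try:
    decoded.encode("latin-1")
except UnicodeEncodeError as exc:
    print("latin-1 cannot store:", exc.object[exc.start])
```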
11 Responses
It certainly helped me, that much I can say. Thank you very much for the post, Marcos.
Glad to hear it, Hans.
This was a lifesaver! Saved me hours of work, thank you!
Very good! Quite creative, and it helped a lot.
You, Of course Si crawls. Thank you so much.
Marcos, many thanks for your post; it has just saved me a lot of working time.
I'm going to improve my gallery's SEO a lot. Now it's up to me to write the right macros; I'm making progress in small steps…
A big thank you !!!
Thanks, it helped me a lot, thank you.
You may need to adjust the encoding before using this process so that the accents (diacritics) display correctly first. Use the Encoding menu to select another encoding or to convert to another one. As Marcos says, it is best to use UTF-8, so convert to it.
Thanks for your comment Dan Lopez.
Congratulations on the clever idea, and on the explanation: excellent.
Here is a possible contribution, in order to make the regular search expression simpler (Find):
&([a-zA-Z])(grave|acute|circ|uml|aring|cedil|slash|tilde);
The expression is the same as the original, except that it does not contain the characters ?: (question mark and colon), which are responsible for removing a group from capture (non-capturing group).
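To make the difference between the two kinds of group concrete, here is a short Python sketch. It assumes Python's `re` module behaves like the Notepad++ engine for this expression, and the variable names are only illustrative.

```python
import re

# Second group captures the accent name.
capturing = re.compile(
    r"&([a-zA-Z])(grave|acute|circ|uml|aring|cedil|slash|tilde);"
)
# Same expression with ?: so the accent name is matched but not captured.
non_capturing = re.compile(
    r"&([a-zA-Z])(?:grave|acute|circ|uml|aring|cedil|slash|tilde);"
)

sample = "caf&eacute;"
print(capturing.search(sample).groups())      # ('e', 'acute')
print(non_capturing.search(sample).groups())  # ('e',)

# The replacement only uses group 1, so both give the same result.
print(capturing.sub(r"\1", sample))      # cafe
print(non_capturing.sub(r"\1", sample))  # cafe
```

Since only the first group is used in the replacement, either form works here; the version without ?: is simply shorter to type.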