Thursday 14 November 2013

Bulk convert ANSI files to UTF-8

Bless Notepad++

If that is all you take from this post, it is enough.

Shoestring Code means no money and no time. And, true to form, this job was to convert 14,000 ANSI files to UTF-8.

Step up Notepad++  **APPLAUSE**

First, install the Python Script plugin for Notepad++:
Plugins>Plugin Manager>Show Plugin Manager


Create your little script
Plugins>Python Script>New script

Paste in this code:

import os

# Path to the folder with files to convert
filePathSrc = "H:\\Doc management\\Migrating\\3.8 20001-22000"

# 'notepad' is supplied by the Python Script plugin, so it needs no import
for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        if fn.endswith('.html'):  # File type to convert; change the extension (including the dot) to suit
            notepad.open(os.path.join(root, fn))
            notepad.runMenuCommand("Encoding", "Encode in ANSI")
            notepad.runMenuCommand("Encoding", "Convert to UTF-8 without BOM")
            notepad.save()
            notepad.close()


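Note the double \\ in the path. Inside a normal Python string a single backslash starts an escape sequence, so each one has to be doubled.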
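A quick sketch of why the doubling matters, using the same path as above. The variable names here are mine, purely for illustration. A raw string (the r prefix) is another way to write the same path without doubling anything.

```python
# Escaped backslashes and a raw string spell the same path
escaped = "H:\\Doc management\\Migrating\\3.8 20001-22000"
raw = r"H:\Doc management\Migrating\3.8 20001-22000"
same_path = (escaped == raw)

# With a single backslash, "\3" is read as an octal escape (a control character),
# so the folder name silently changes
broken = "Migrating\3.8"
has_control_char = "\x03" in broken
```

Both same_path and has_control_char come out True, which is exactly the kind of silent breakage the double \\ avoids.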
Save the code. VERY IMPORTANT: I mistakenly saved my little script somewhere I felt like instead of the default directory that Notepad++ suggests. Bad move. Keep it in the default directory unless you intend to become good at this. (Which I did not. A quick, easy fix, please. No epic learning curve.)

Then to run:
Plugins>Python Script>Scripts>[Saved name of your script will appear in the list]

Give it a second. It takes a little while to begin, so don't freak out and think it isn't working.
Boom! It'll start opening every .html file in that directory and its subfolders, running the menu commands on each one: in this case the Encoding menu, then the command "Encode in ANSI", then "Convert to UTF-8 without BOM".
It then saves the file and closes it.

Job Done.
And the first time I've seen Python. Very nice.
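
For the curious, the same job can probably be done in plain Python without opening anything in Notepad++. This is only a rough sketch and not what I actually ran. It assumes Python 3, and it assumes that "ANSI" means the Windows-1252 code page (cp1252), which is typical on Western European Windows but may not match your files. The function name convert_tree and its parameters are made up for this example.

```python
import os


def convert_tree(folder, extension=".html", source_encoding="cp1252"):
    """Rewrite every matching file under folder as UTF-8 without a BOM.

    source_encoding is an assumption: "ANSI" is whatever code page the
    files were saved with, and cp1252 is only a common guess.
    Returns the number of files rewritten.
    """
    converted = 0
    for root, dirs, files in os.walk(folder):
        for fn in files:
            if not fn.lower().endswith(extension):
                continue
            path = os.path.join(root, fn)
            # Read raw bytes so line endings are left exactly as they were
            with open(path, "rb") as f:
                data = f.read()
            # Raises UnicodeDecodeError if a byte is not valid in source_encoding
            text = data.decode(source_encoding)
            with open(path, "wb") as f:
                f.write(text.encode("utf-8"))
            converted += 1
    return converted
```

Calling convert_tree("H:\\Doc management\\Migrating\\3.8 20001-22000") would walk the same folder as the script above. It overwrites files in place, and running it twice on the same files will mangle any non-ASCII characters, so try it on a copy first.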