The iconv command converts text from one character encoding to another. If no input file is given, it reads from standard input; if no output file is given, it writes to standard output. If the from-encoding or the to-encoding is omitted, iconv uses the character encoding of the current locale.
Syntax:
iconv [options] [-f from-encoding] [-t to-encoding] [inputfile]...
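As a minimal sketch of the default stream behaviour, the pipeline below names no input file and no output file, so iconv reads standard input and writes standard output. The sample word, the variable name latin1_hex, and the od and tr calls used to show the bytes are illustrative choices, not part of iconv.

```shell
# Convert "café" from UTF-8 to ISO-8859-1 through a pipe and capture
# the resulting bytes as a hex string.
# \303\251 is the octal escape for the two UTF-8 bytes of "é".
latin1_hex=$(printf 'caf\303\251\n' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')
echo "$latin1_hex"
```

The two-byte UTF-8 sequence for "é" should come out as the single byte e9, which is how ISO-8859-1 encodes that character.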
Options:
- -f from-encoding, --from-code=from-encoding : Use from-encoding for input characters.
- -t to-encoding, --to-code=to-encoding : Use to-encoding for output characters.
- -l, --list : List all known character set encodings.
- -c : Silently discard characters that cannot be converted instead of terminating when encountering such characters.
- -o outputfile, --output=outputfile : Use outputfile for output.
- --verbose : Print progress information on standard error when processing multiple files.
Note:
- If the string //IGNORE is appended to to-encoding, characters that cannot be converted are discarded and an error is printed after conversion.
- If the string //TRANSLIT is appended to to-encoding, characters that cannot be represented in the target character set are approximated by one or several similar-looking characters.
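The difference between discarding and approximating can be seen in a short sketch. The sample word and the variable names dropped and approx are assumptions made for illustration, and the replacement that //TRANSLIT chooses may vary with the iconv implementation and the locale.

```shell
# "naïve" in UTF-8; \303\257 is the octal escape for "ï".
# -c silently drops the character that ASCII cannot represent.
# Some implementations exit non-zero when characters were discarded,
# hence the "|| true".
dropped=$(printf 'na\303\257ve\n' | iconv -c -f UTF-8 -t ASCII) || true
echo "with -c:         $dropped"

# //TRANSLIT asks for an approximation instead of dropping the character.
approx=$(printf 'na\303\257ve\n' | iconv -f UTF-8 -t ASCII//TRANSLIT) || true
echo "with //TRANSLIT: $approx"
```

Use -c when unconvertible characters can simply be lost, and //TRANSLIT when a readable approximation is preferable.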
Example script:
echo "Hello World w.r.t email - gill.tyson332@gamil.com {Test}" > ASCI.txt
echo "Hëllo World w.r.t email - gill.tyson332@gamil.com {Test}" > UTF-8.txt
iconv -f ascii -t UTF-16 ASCI.txt -o UTF-16.txt
iconv -f ascii -t UTF-32 ASCI.txt -o UTF-32.txt
iconv -f utf-8 -t CP1252 UTF-8.txt -o CP1252.txt
file -i *
hexdump *
Filter invalid characters in any encoding:
iconv -c -t UTF-8 < input.txt > output.txt
diff input.txt output.txt
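To see the filter act on a concrete case, the sketch below writes a file containing one byte that is not valid UTF-8 and passes it through the same -c filter. It names the source encoding explicitly with -f UTF-8 rather than relying on the locale. The temporary directory, the file contents, and the variable names are assumptions made for illustration.

```shell
workdir=$(mktemp -d)

# \377 (0xFF) can never appear in valid UTF-8.
printf 'good\377text\n' > "$workdir/input.txt"

# Some implementations exit non-zero when bytes were discarded,
# hence the "|| true".
iconv -c -f UTF-8 -t UTF-8 < "$workdir/input.txt" > "$workdir/output.txt" || true

filtered=$(cat "$workdir/output.txt")
echo "$filtered"

# cmp -s prints nothing; its exit status tells whether the files differ.
if cmp -s "$workdir/input.txt" "$workdir/output.txt"; then
    changed=no
else
    changed=yes
fi
echo "changed: $changed"

rm -r "$workdir"
```

Comparing the two files afterwards, as the diff command above does, is a quick way to check whether the input contained anything that had to be removed.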
Find non-ASCII characters in a file:
grep --color='auto' -P -n "[^\x00-\x7f]" test_file.txt
Print the octal values of the bytes in a file:
od -b test_file.txt
Examples:
To convert from UTF-8 to ASCII, transliterating where possible:
echo abc ß α € àḃç | iconv -f UTF-8 -t ASCII//TRANSLIT
Print the list of all known character set encodings:
iconv -l
Read from a file and write to a file:
iconv -f UTF-8 -t ASCII//TRANSLIT -o out.txt in.txt
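A file-to-file conversion can be checked with a round trip, sketched below. The file names, the sample text, and the variable roundtrip are hypothetical. The sketch assumes an iconv that supports -o, as described in the options above.

```shell
workdir=$(mktemp -d)

# "Grüße" in UTF-8, written with octal escapes so the script is pure ASCII.
printf 'Gr\303\274\303\237e\n' > "$workdir/in.txt"

# Write the UTF-16 version with -o.
iconv -f UTF-8 -t UTF-16 -o "$workdir/utf16.txt" "$workdir/in.txt"

# Convert back, this time using shell redirection instead of -o.
iconv -f UTF-16 -t UTF-8 "$workdir/utf16.txt" > "$workdir/back.txt"

# A lossless round trip reproduces the original file byte for byte.
if cmp -s "$workdir/in.txt" "$workdir/back.txt"; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "round trip: $roundtrip"

rm -r "$workdir"
```

The round trip is lossless only because UTF-16 can represent every character in the input. A conversion to ASCII with //TRANSLIT, as in the command above, cannot be reversed this way.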