iconv command in Linux

The iconv command converts text from one character encoding to another. If no input file is given, it reads from standard input; if no output file is given, it writes to standard output. If the from-encoding or the to-encoding is omitted, iconv uses the current locale's character encoding in its place.

Syntax:

iconv [options] [-f from-encoding] [-t to-encoding] [inputfile]...
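As a quick illustration of the default input and output behaviour, the sketch below passes no file names at all, so iconv reads standard input and writes standard output. The sample string, the variable name latin1_hex, and the od call are assumptions chosen only for illustration; printf octal escapes are used so that the script itself stays plain ASCII.

```shell
# "café" in UTF-8: the é is the two bytes c3 a9 (octal 303 251).
# With no input file iconv reads stdin; with no -o it writes stdout.
latin1_hex=$(printf 'caf\303\251\n' \
    | iconv -f UTF-8 -t ISO-8859-1 \
    | od -An -tx1 \
    | tr -d ' \n')

# In ISO-8859-1 the same character is the single byte e9.
echo "$latin1_hex"   # prints 636166e90a
```

The four-byte "caf" prefix and the newline are unchanged, while the two-byte é shrinks to one byte.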

Options:

  • -f from-encoding, --from-code=from-encoding : Use from-encoding for input characters.
  • -t to-encoding, --to-code=to-encoding : Use to-encoding for output characters.
  • -l, --list : List all known character set encodings.
  • -c : Silently discard characters that cannot be converted instead of terminating when encountering such characters.
  • -o outputfile, --output=outputfile : Use outputfile for output.
  • --verbose : Print progress information on standard error when processing multiple files.

Note:

  • If the string //IGNORE is appended to to-encoding, characters that cannot be converted are discarded and an error is printed after conversion.
  • If the string //TRANSLIT is appended to to-encoding, characters that cannot be represented in the target character set are approximated, where possible, by one or several similar-looking characters. Characters that cannot be transliterated are replaced with a question mark (?) in the output.
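The difference between the two suffixes is easiest to see side by side. The following is a minimal sketch, assuming GNU iconv (glibc); the sample text and the variable names sample, ignored, and approx are made up for illustration. The exact //TRANSLIT replacements depend on the iconv implementation and the current locale, so no particular output is promised for that line.

```shell
# Sample UTF-8 text "café €5", written with octal escapes so this script stays ASCII.
sample=$(printf 'caf\303\251 \342\202\2545')

# //IGNORE drops the characters ASCII cannot represent. GNU iconv may still
# report an error and exit non-zero afterwards, hence the "|| true".
ignored=$(printf '%s\n' "$sample" | iconv -f UTF-8 -t ASCII//IGNORE || true)

# //TRANSLIT substitutes similar-looking ASCII characters where it can and
# falls back to "?" where it cannot.
approx=$(printf '%s\n' "$sample" | iconv -f UTF-8 -t ASCII//TRANSLIT || true)

printf 'IGNORE:   %s\n' "$ignored"
printf 'TRANSLIT: %s\n' "$approx"
```

With //IGNORE the é and the € simply vanish, whereas //TRANSLIT keeps the text readable at the cost of locale-dependent substitutions.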

Example Script :

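# Create one pure-ASCII sample file and one UTF-8 sample file containing "ë"
echo "Hello World w.r.t email - gill.tyson332@gamil.com {Test}" > ASCI.txt
echo "Hëllo World w.r.t email - gill.tyson332@gamil.com {Test}" > UTF-8.txt
# Convert the ASCII file to UTF-16 and UTF-32, and the UTF-8 file to CP1252
iconv -f ascii -t UTF-16 ASCI.txt -o UTF-16.txt
iconv -f ascii -t UTF-32 ASCI.txt -o UTF-32.txt
iconv -f utf-8 -t CP1252 UTF-8.txt -o CP1252.txt
# Show the detected MIME type and charset of each file, then dump the raw bytes
file -i *
hexdump *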
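One way to check a conversion like the ones in the script is a round trip: convert to the new encoding, convert back, and compare the result with the original byte for byte. This is only a sketch; the temporary directory, the file names, and the roundtrip variable are assumptions for illustration, and shell redirection is used in place of -o purely for portability.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
printf 'Hello World {Test}\n' > "$workdir/ascii.txt"

# ASCII -> UTF-16, then UTF-16 -> ASCII again.
iconv -f ASCII -t UTF-16 "$workdir/ascii.txt" > "$workdir/utf16.txt"
iconv -f UTF-16 -t ASCII "$workdir/utf16.txt" > "$workdir/roundtrip.txt"

# cmp -s prints nothing and succeeds only if both files are byte-identical.
if cmp -s "$workdir/ascii.txt" "$workdir/roundtrip.txt"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "round trip: $roundtrip"

rm -rf "$workdir"
```

A lossless round trip is expected here because every ASCII character exists in UTF-16; the same check would fail for a lossy conversion such as UTF-8 to ASCII with -c.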

Filter Invalid Characters in Any Encoding :

iconv -c -t UTF-8 < input.txt > output.txt

diff input.txt output.txt
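To see the filter at work on known input, the sketch below builds a file containing one byte that is never valid in UTF-8 and then strips it. It assumes GNU iconv; the temporary directory, file names, and the cleaned variable are hypothetical, and -f UTF-8 is given explicitly so the result does not depend on the current locale.

```shell
workdir=$(mktemp -d)

# Octal 377 (0xff) can never occur in well-formed UTF-8.
printf 'good\377text\n' > "$workdir/input.txt"

# -c discards the invalid byte. GNU iconv may still exit non-zero to signal
# that something was dropped, hence the "|| true".
iconv -c -f UTF-8 -t UTF-8 < "$workdir/input.txt" > "$workdir/output.txt" || true

cleaned=$(cat "$workdir/output.txt")
echo "$cleaned"

rm -rf "$workdir"
```

Comparing the input and output files afterwards, as with the diff command above, shows exactly which lines lost bytes.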

Find non-ASCII characters in a file :

grep --color='auto' -P -n "[^\x00-\x7f]" test_file.txt

Print the octal values of the bytes in a file :

od -b test_file.txt

Examples:

To convert from UTF-8 to ASCII :

echo abc ß α € àḃç | iconv -f UTF-8 -t ASCII//TRANSLIT

Print the list of all character set encodings :

iconv -l
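The list is long, so in scripts it is often more useful to ask whether one particular encoding is available. The helper below is a hypothetical sketch (has_encoding is not part of iconv); it assumes the names in the list are separated by spaces, commas, slashes, or newlines, which covers the output of GNU iconv both on a terminal and in a pipe.

```shell
# has_encoding NAME: succeed if NAME appears in the output of "iconv -l".
has_encoding() {
    # Put every name on its own line, then look for an exact,
    # case-insensitive, fixed-string match of the whole line.
    iconv -l | tr -s ' ,/' '\n' | grep -qixF -- "$1"
}

if has_encoding UTF-8; then
    echo "UTF-8 is supported"
fi
```

Checking first avoids a failed conversion later in a script when an encoding name is misspelled or not installed.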

Reading and writing from a file :

iconv -f UTF-8 -t ASCII//TRANSLIT -o out.txt in.txt
