htmlencode

htmlencode; V 2; 2009-02-22; converts plain text to an HTML- or XHTML-encoded string

Synopsis and description

htmlencode [[--br] | [--xml-br]] [--leave-cr] [--no-lf] [[--spaces] | [--all-spaces]] [text]...

The program reads plain text given on the command line or from standard input and converts certain characters to HTML or XHTML entities.

The following characters are affected:

double quote	`"` → `&quot;`
ampersand	`&` → `&amp;`
less than	`<` → `&lt;`
greater than	`>` → `&gt;`

and optionally:

line break	U+000A → `<br>` or `<br />`
spaces	U+0020 → `&nbsp;`
carriage return	U+000D → nothing
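
As a rough illustration of the optional conversions listed above, here is a minimal C sketch. It is not taken from htmlencode.c: the names `enc_opts`, `put` and `encode_optional` are hypothetical, and it assumes that the line feed is kept after the inserted tag and that every space is converted, which may differ from what the real options do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option set; field names are illustrative only. */
struct enc_opts {
    const char *br;  /* "<br>" or "<br />", or NULL to leave U+000A alone */
    bool strip_cr;   /* drop U+000D */
    bool nbsp;       /* turn every U+0020 into &nbsp; (assumption) */
};

/* Append s to dst at offset *n, keeping room for the terminating NUL. */
static bool put(char *dst, size_t cap, size_t *n, const char *s)
{
    size_t len = strlen(s);
    if (*n + len >= cap)
        return false;
    memcpy(dst + *n, s, len);
    *n += len;
    return true;
}

/* Apply only the optional conversions to src.
   Returns the output length, or (size_t)-1 if dst is too small. */
size_t encode_optional(const char *src, char *dst, size_t cap,
                       const struct enc_opts *o)
{
    size_t n = 0;
    if (cap == 0)
        return (size_t)-1;
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        bool ok = true;
        if (*src == '\n' && o->br != NULL) {
            /* Assumption: the tag is emitted and the line feed is kept. */
            ok = put(dst, cap, &n, o->br) && put(dst, cap, &n, "\n");
        } else if (*src == '\r' && o->strip_cr) {
            /* carriage return → nothing */
        } else if (*src == ' ' && o->nbsp) {
            ok = put(dst, cap, &n, "&nbsp;");
        } else {
            ok = put(dst, cap, &n, one);
        }
        if (!ok)
            return (size_t)-1;
    }
    dst[n] = '\0';
    return n;
}
```

Passing `"<br />"` instead of `"<br>"` in `br` would give the XHTML form of the line break.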

Downloading and compiling

Source code and manual pages:

Requirements:

libarguments

Compiling and installing:


    gcc -Wall -s -o/usr/local/bin/htmlencode htmlencode.c -larguments

Examples

    $ htmlencode -bs $'Function "strcmp" returns:\n'\
    > $'-1 if a < b,\n0 if a = b, and\n'\
    > $'1 if a > b' ; echo
    Function &quot;strcmp&quot; returns:<br>
    -1 if a &lt; b,<br>
    0  if a = b, and<br>
    1  if a &gt; b

The example itself was generated by htmlencode ;)

Why not use sed?

Indeed, similar output could be generated with sed, for example `sed "s/\\&/\\&amp;/g; s/\"/\\&quot;/g; s/$/<br>/"`, or with awk. The reason to use htmlencode is performance. It looks each character up in an array, which is much faster than running the regexp engine of a general-purpose string processor. Moreover, it can convert text given on the command line, which saves you from setting up an extra pipe into the program.
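
To give an idea of what a character-array lookup can look like, here is a minimal C sketch. It is an assumption about the general technique, not the actual code of htmlencode.c; the names `entity_for` and `html_encode` are made up for this example. Each input byte indexes a 256-entry table, so no pattern matching is needed.

```c
#include <stddef.h>
#include <string.h>

/* Table indexed by byte value; NULL means "copy the byte unchanged".
   Only the four always-converted characters are filled in. */
static const char *const entity_for[256] = {
    ['"'] = "&quot;",
    ['&'] = "&amp;",
    ['<'] = "&lt;",
    ['>'] = "&gt;",
};

/* Encode src into dst (capacity cap, including the terminating NUL).
   Returns the encoded length, or (size_t)-1 if dst is too small. */
size_t html_encode(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return (size_t)-1;
    for (; *src != '\0'; src++) {
        /* One array access decides what to emit for this byte. */
        const char *rep = entity_for[(unsigned char)*src];
        size_t len = rep ? strlen(rep) : 1;
        if (n + len >= cap)
            return (size_t)-1;
        if (rep)
            memcpy(dst + n, rep, len);
        else
            dst[n] = *src;
        n += len;
    }
    dst[n] = '\0';
    return n;
}
```

The cost per input byte is a single table access plus a copy, regardless of how many characters have replacements.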