Welcome to my blog!

Here I try to keep useful information about IT, mostly related to Web development and Linux stuff. Any comments or feedback that you might have will be much appreciated!

Thanks,
Tomi

Encoding Word characters on PHP

Filed Under (PHP) by admin on 23-02-2009

Use the following PHP code to encode Microsoft Office Word special formatting characters (like m dash, bullets, typographic quotation marks, etc) as HTML entities.


function get_html_translation_table_CP1252() {
$trans = get_html_translation_table(HTML_ENTITIES);
$trans[chr(130)] = '‚'; // Single Low-9 Quotation Mark
$trans[chr(131)] = 'ƒ'; // Latin Small Letter F With Hook
$trans[chr(132)] = '„'; // Double Low-9 Quotation Mark
$trans[chr(133)] = '…'; // Horizontal Ellipsis
$trans[chr(134)] = '†'; // Dagger
$trans[chr(135)] = '‡'; // Double Dagger
$trans[chr(136)] = 'ˆ'; // Modifier Letter Circumflex Accent
$trans[chr(137)] = '‰'; // Per Mille Sign
$trans[chr(138)] = 'Š'; // Latin Capital Letter S With Caron
$trans[chr(139)] = '‹'; // Single Left-Pointing Angle Quotation Mark
$trans[chr(140)] = 'Œ '; // Latin Capital Ligature OE
$trans[chr(145)] = '‘'; // Left Single Quotation Mark
$trans[chr(146)] = '’'; // Right Single Quotation Mark
$trans[chr(147)] = '“'; // Left Double Quotation Mark
$trans[chr(148)] = '”'; // Right Double Quotation Mark
$trans[chr(149)] = '•'; // Bullet
$trans[chr(150)] = '–'; // En Dash
$trans[chr(151)] = '—'; // Em Dash
$trans[chr(152)] = '˜'; // Small Tilde
$trans[chr(153)] = '™'; // Trade Mark Sign
$trans[chr(154)] = 'š'; // Latin Small Letter S With Caron
$trans[chr(155)] = '›'; // Single Right-Pointing Angle Quotation Mark
$trans[chr(156)] = 'œ'; // Latin Small Ligature OE
$trans[chr(159)] = 'Ÿ'; // Latin Capital Letter Y With Diaeresis
ksort($trans);
return $trans;
}

Leave a Reply

You must be logged in to post a comment.