Unicode Tools - UTF-8 Converter
Latin to UTF-8 encoding conversion routines
 

 What is Unicode?

Unicode is a standard character encoding system that lets computers display text and symbols from writing systems around the world. It is coordinated by the Unicode Consortium. There are several Unicode encodings: the most popular is UTF-8; other examples are UTF-7 and UTF-16. UTF-8 is a variable-length encoding that uses one to four bytes per character, and all basic Latin characters are encoded with the same byte values as in ASCII. The Unicode website gives the following definition: "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language."
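To make the variable-length point concrete, here is a minimal Python 3 sketch. The helper name utf8_length is our own, chosen for illustration; it simply counts the bytes that a single character occupies once encoded as UTF-8.

```python
def utf8_length(char: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(char.encode("utf-8"))


# Basic Latin characters keep their single-byte ASCII value in UTF-8.
assert "A".encode("utf-8") == "A".encode("ascii") == b"A"

# Escapes are used so the example does not depend on the file's own encoding.
samples = [
    "A",           # U+0041, basic Latin
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]
for char in samples:
    print(f"U+{ord(char):04X} -> {utf8_length(char)} byte(s)")
```

The four samples need 1, 2, 3 and 4 bytes respectively, which is why byte counts and character counts differ for anything beyond plain ASCII.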



 Converting from Latin to UTF-8 and back in your code

PHP: Use utf8_decode($data) to convert from UTF-8 to ISO-8859-1 and utf8_encode($data) to convert from ISO-8859-1 to UTF-8. Both functions are deprecated as of PHP 8.2; where the mbstring extension is available, mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1') is the usual replacement. Some native PHP functions, such as strtolower(), strtoupper() and ucfirst(), do not always work correctly with UTF-8 strings. Possible solutions: convert to Latin first, or add the following line to your code: setlocale(LC_CTYPE, 'C'); Do not save your PHP files with a UTF-8 BOM (byte order mark): the BOM bytes are sent to the browser as output, so they might show up as stray characters between PHP pages on your site.
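One way to check files for the BOM mentioned above is a small script. The following Python 3 sketch is illustrative only; the function names has_utf8_bom and strip_utf8_bom are our own, and the sample bytes stand in for the contents of a PHP file.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Stand-in for the raw bytes of a PHP file saved with a BOM.
sample = codecs.BOM_UTF8 + b"<?php echo 'hello'; ?>"
print(has_utf8_bom(sample))    # True
print(strip_utf8_bom(sample))  # b"<?php echo 'hello'; ?>"
```

To apply it to a real file, read the file in binary mode, pass the bytes to strip_utf8_bom, and write the result back.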

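Perl: use Encode qw(from_to is_utf8); from_to($data, "iso-8859-1", "utf8"); This converts $data in place from ISO-8859-1 to UTF-8. is_utf8($data) reports whether Perl's internal UTF8 flag is set on the string; pass a true second argument, is_utf8($data, 1), to also check that the string contains well-formed UTF-8.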
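For comparison, a validity check can be sketched in Python 3 by attempting a strict decode. This is a sketch of the general idea, not a port of the Perl function; the name is_valid_utf8 is our own.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes are well-formed UTF-8."""
    try:
        data.decode("utf-8")  # strict decoding raises on malformed input
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(b"caf\xc3\xa9"))  # True: C3 A9 is a valid two-byte sequence
print(is_valid_utf8(b"caf\xe9"))      # False: a lone E9 byte is ISO-8859-1, not UTF-8
```

Note that pure ASCII text passes this check too, since ASCII is a subset of UTF-8.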

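Python: In Python 2, unicode(data, "utf-8") decodes a UTF-8 byte string into a Unicode string, and ustring.encode("utf-8") encodes a Unicode string back into UTF-8 bytes. To convert Latin bytes to UTF-8, decode with the source character set and encode with the target: data.decode("iso-8859-1").encode("utf-8"). In Python 3, where unicode() no longer exists, the same decode() and encode() calls apply to bytes and str objects.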
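A minimal Python 3 sketch of the round trip between Latin (ISO-8859-1) and UTF-8 might look as follows. The function names and the errors parameter are our own choices for illustration.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 bytes."""
    # Every byte value maps to a character in ISO-8859-1, so decoding never fails.
    return data.decode("iso-8859-1").encode("utf-8")


def utf8_to_latin1(data: bytes, errors: str = "strict") -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 bytes."""
    # Characters outside ISO-8859-1 (for example the euro sign) cannot be
    # represented: errors="strict" raises UnicodeEncodeError for them,
    # errors="replace" substitutes "?".
    return data.decode("utf-8").encode("iso-8859-1", errors=errors)


latin_bytes = b"caf\xe9"  # "cafe" with an acute accent, in ISO-8859-1
utf8_bytes = latin1_to_utf8(latin_bytes)
print(utf8_bytes)                  # b'caf\xc3\xa9'
print(utf8_to_latin1(utf8_bytes))  # b'caf\xe9'
```

Converting to UTF-8 is lossless, while converting back to Latin can lose characters, which is why the second function exposes an error policy.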

MySQL: MySQL uses character sets at every level: there are settings such as character_set_connection and collation_connection, and you can specify a character set at the database level, the table level and the field level. To convert a character set inside a MySQL query, use CONVERT: SELECT CONVERT(latin1field USING utf8). If you experience speed issues with table joins after converting the character sets of tables or fields, make sure that all ID fields use the same COLLATE setting.

HTML: You can specify your preferred character set using the content-type meta tag (example: <meta http-equiv="content-type" content="text/html; charset=UTF-8">). To avoid problems with various character sets, it is sometimes easier to convert your special characters to (plain ASCII) HTML code. HTML-encoded special characters are also readable by old browsers, whereas the content-type meta tag is not. A special character to HTML code converter can do this for you.
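The conversion of special characters to plain ASCII HTML code can be sketched in Python 3 with the built-in xmlcharrefreplace error handler, which writes every non-ASCII character as a numeric character reference. The function name to_html_ascii and its escape_markup option are our own; this is not the converter referred to above.

```python
import html


def to_html_ascii(text: str, escape_markup: bool = False) -> str:
    """Return text as pure ASCII, with non-ASCII characters as &#NNN; references."""
    if escape_markup:
        # Optionally escape &, <, > and quotes so the text is safe inside HTML.
        text = html.escape(text)
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_html_ascii("caf\u00e9 \u20ac5"))                    # caf&#233; &#8364;5
print(to_html_ascii("a < \u00e9", escape_markup=True))       # a &lt; &#233;
```

Because the output contains only ASCII, it displays the same way regardless of which character set the page declares.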

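Unix systems: Use the character set conversion tool iconv: iconv -f ISO-8859-1 -t UTF-8 filename.txt. The converted text is written to standard output, so redirect it to a new file if you want to keep it: iconv -f ISO-8859-1 -t UTF-8 filename.txt > newfile.txt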
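Where iconv is not available, the same file conversion can be approximated in a few lines of Python 3. This is a sketch under our own naming (convert_file and its parameters are hypothetical), and it reads the whole file into memory, so it suits small to medium files.

```python
from pathlib import Path


def convert_file(src: str, dst: str,
                 from_enc: str = "iso-8859-1", to_enc: str = "utf-8") -> None:
    """Read src in from_enc and write it to dst in to_enc."""
    # Work on raw bytes so line endings are preserved exactly as they are.
    text = Path(src).read_bytes().decode(from_enc)
    Path(dst).write_bytes(text.encode(to_enc))
```

Calling convert_file("filename.txt", "newfile.txt") corresponds roughly to the iconv command above with its output redirected to newfile.txt.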

Windows systems: Most good text editors offer Unicode support, such as UltraEdit (File → Conversions → ASCII to UTF-8 or ASCII to Unicode (16-Bit)).





 

www.unicodetools.com - © 2008 Misja.com