CharsetConverter implements SingletonInterface

Class for conversion between charsets

Table of Contents

Interfaces

SingletonInterface
"empty" interface for singletons (marker interface pattern)

Constants

FALLBACK_CHAR  = '?'
Fallback character for chars with no equivalent.

Properties

$toASCII  : array<string|int, mixed>
An array where charset-to-ASCII mappings are stored (cached)

Methods

UnumberToChar()  : string
Converts a Unicode number to a UTF-8 multibyte character. Algorithm based on a script found at http://czyborra.com/utf/. Unit-tested by Kasper.
utf8_char_mapping()  : string
Maps all characters of a UTF-8 string.
utf8_to_numberarray()  : array<string|int, mixed>
Converts all chars in the input UTF-8 string into integer numbers returned in an array.
utf8CharToUnumber()  : int
Converts a UTF-8 multibyte character to a Unicode number. Unit-tested by Kasper.
initUnicodeData()  : bool
This function initializes all UTF-8 character data tables.

Constants

FALLBACK_CHAR

Fallback character for chars with no equivalent.

protected mixed FALLBACK_CHAR = '?'

Properties

$toASCII

An array where charset-to-ASCII mappings are stored (cached)

protected array<string|int, mixed> $toASCII = []

Methods

UnumberToChar()

Converts a Unicode number to a UTF-8 multibyte character. Algorithm based on a script found at http://czyborra.com/utf/. Unit-tested by Kasper.

public UnumberToChar(int $unicodeInteger) : string

In UTF-8, the binary representation of the character's integer value is spread across the bytes, and the number of high bits set in the lead byte gives the number of bytes in the multibyte sequence:

bytes | bits | representation
    1 |    7 | 0vvvvvvv
    2 |   11 | 110vvvvv 10vvvvvv
    3 |   16 | 1110vvvv 10vvvvvv 10vvvvvv
    4 |   21 | 11110vvv 10vvvvvv 10vvvvvv 10vvvvvv
    5 |   26 | 111110vv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
    6 |   31 | 1111110v 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
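
The layout in the table can be sketched in a few lines of PHP. This is an illustrative sketch only, not the code of `UnumberToChar()`: the function name `sketchCodePointToUtf8` is hypothetical, and returning an empty string for out-of-range input is an assumption. Note that the 5- and 6-byte forms belong to the original UTF-8 definition; RFC 3629 limits UTF-8 to 4 bytes.

```php
<?php
// Hypothetical helper (not part of CharsetConverter): encodes a code point
// using the lead-byte / continuation-byte layout from the table above.
function sketchCodePointToUtf8(int $cp): string
{
    if ($cp < 0 || $cp > 0x7FFFFFFF) {
        return ''; // assumption: out-of-range input yields an empty string
    }
    if ($cp < 0x80) {
        return chr($cp); // 1 byte: 0vvvvvvv
    }
    // An n-byte sequence (n >= 2) carries 5n + 1 payload bits: 11, 16, 21, 26, 31.
    $length = 2;
    while ($length < 6 && $cp >= (1 << (5 * $length + 1))) {
        $length++;
    }
    // Continuation bytes (10vvvvvv), built from the lowest 6 bits upwards.
    $tail = '';
    for ($i = 1; $i < $length; $i++) {
        $tail = chr(0x80 | ($cp & 0x3F)) . $tail;
        $cp >>= 6;
    }
    // Lead byte: $length high bits set, then the remaining payload bits.
    $lead = (0xFF << (8 - $length)) & 0xFF;
    return chr($lead | $cp) . $tail;
}
```

For example, `bin2hex(sketchCodePointToUtf8(0x20AC))` gives `e282ac`, the three-byte encoding of the euro sign.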

Parameters
$unicodeInteger : int

UNICODE integer

Tags
see
utf8CharToUnumber()
Return values
string

UTF-8 multibyte character string

utf8_char_mapping()

Maps all characters of a UTF-8 string.

public utf8_char_mapping(string $str) : string
Parameters
$str : string

UTF-8 string

Return values
string

utf8_to_numberarray()

Converts all chars in the input UTF-8 string into integer numbers returned in an array.

public utf8_to_numberarray(string $str) : array<string|int, mixed>

All HTML entities (like &amp;amp; or &amp;pound; or &amp;#123; or &amp;#x3f5d;) will be detected as characters. Also, instead of integer numbers the real UTF-8 char is returned.

Parameters
$str : string

Input string, UTF-8

Return values
array<string|int, mixed>

Output array with the char numbers
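
As a rough illustration of what a string-to-number-array conversion involves, the sketch below splits a UTF-8 string into characters and decodes each one. It is not the implementation of `utf8_to_numberarray()`: the name `sketchUtf8ToCodePoints` is hypothetical, the sketch does no HTML entity handling, and returning an empty array for invalid UTF-8 is an assumption.

```php
<?php
// Hypothetical helper (not part of CharsetConverter): returns the code points
// of a UTF-8 string as a list of integers.
function sketchUtf8ToCodePoints(string $str): array
{
    // The /u modifier makes PCRE split on whole UTF-8 characters.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // assumption: invalid UTF-8 yields an empty array
    }
    $codePoints = [];
    foreach ($chars as $char) {
        $length = strlen($char);
        $lead = ord($char[0]);
        // Strip the length marker bits from the lead byte of multibyte chars.
        $cp = $length === 1 ? $lead : $lead & (0xFF >> ($length + 1));
        // Each continuation byte contributes its low 6 bits.
        for ($i = 1; $i < $length; $i++) {
            $cp = ($cp << 6) | (ord($char[$i]) & 0x3F);
        }
        $codePoints[] = $cp;
    }
    return $codePoints;
}
```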

utf8CharToUnumber()

Converts a UTF-8 multibyte character to a Unicode number. Unit-tested by Kasper.

public utf8CharToUnumber(string $str[, bool $hex = false ]) : int
Parameters
$str : string

UTF-8 multibyte character string

$hex : bool = false

If set, a hexadecimal number is returned.

Tags
see
UnumberToChar()
Return values
int

UNICODE integer
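
The reverse direction can be sketched by counting the high bits of the lead byte and then collecting 6 payload bits from each continuation byte. This is an illustrative sketch under stated assumptions, not the code of `utf8CharToUnumber()`: the name `sketchUtf8CharToCodePoint` is hypothetical, the `$hex` option is left out, and returning 0 for malformed input is an assumption.

```php
<?php
// Hypothetical helper (not part of CharsetConverter): decodes the first
// UTF-8 character of $char into its code point.
function sketchUtf8CharToCodePoint(string $char): int
{
    if ($char === '') {
        return 0; // assumption: empty input decodes to 0
    }
    $lead = ord($char[0]);
    if ($lead < 0x80) {
        return $lead; // 1 byte: 0vvvvvvv
    }
    // Count the high bits set in the lead byte; that is the sequence length.
    $length = 0;
    for ($mask = 0x80; $mask > 0x01 && ($lead & $mask); $mask >>= 1) {
        $length++;
    }
    // Reject stray continuation bytes, 0xFE/0xFF and truncated sequences.
    if ($length < 2 || $length > 6 || strlen($char) < $length) {
        return 0; // assumption: malformed input decodes to 0
    }
    // $mask now sits on the 0 bit after the length marker; keep the bits below it.
    $cp = $lead & ($mask - 1);
    for ($i = 1; $i < $length; $i++) {
        $cp = ($cp << 6) | (ord($char[$i]) & 0x3F);
    }
    return $cp;
}
```

A hexadecimal rendering, comparable to what the `$hex` flag describes, could be obtained by passing the result to PHP's `dechex()`.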

initUnicodeData()

This function initializes all UTF-8 character data tables.

protected initUnicodeData() : bool

PLEASE SEE: http://www.unicode.org/Public/UNIDATA/

Return values
bool

Returns FALSE on error, TRUE on success.


        