CharsetConverter implements SingletonInterface
Class for conversion between charsets
Table of Contents
Interfaces
- SingletonInterface
- "empty" interface for singletons (marker interface pattern)
Constants
- FALLBACK_CHAR = '?'
- Fallback character for chars with no equivalent.
Properties
- $toASCII : array<string|int, mixed>
- An array where charset-to-ASCII mappings are stored (cached)
Methods
- UnumberToChar() : string
- Converts a UNICODE number to a UTF-8 multibyte character. Algorithm based on a script found at http://czyborra.com/utf/. Unit-tested by Kasper.
- utf8_char_mapping() : string
- Maps all characters of a UTF-8 string.
- utf8_to_numberarray() : array<string|int, mixed>
- Converts all chars in the input UTF-8 string into integer numbers returned in an array.
- utf8CharToUnumber() : int
- Converts a UTF-8 multibyte character to a UNICODE number. Unit-tested by Kasper.
- initUnicodeData() : bool
- This function initializes all UTF-8 character data tables.
Constants
FALLBACK_CHAR
Fallback character for chars with no equivalent.
protected mixed FALLBACK_CHAR = '?'
Properties
$toASCII
An array where charset-to-ASCII mappings are stored (cached)
protected array<string|int, mixed> $toASCII = []
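As an illustration of how a cached to-ASCII table and a fallback character can work together, here is a minimal sketch. The function `mapToAscii()` and the sample table are hypothetical and not part of CharsetConverter; that the class combines `$toASCII` and `FALLBACK_CHAR` in this way is an assumption.

```php
<?php
/**
 * Hypothetical helper, not part of CharsetConverter: replaces every
 * non-ASCII character with its entry in $table, or with $fallback
 * when the table has no equivalent.
 *
 * @param array<string, string> $table UTF-8 character => ASCII replacement
 */
function mapToAscii(string $str, array $table, string $fallback = '?'): string
{
    // The /u modifier makes preg_split() cut between UTF-8 characters.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $out = '';
    foreach ($chars as $char) {
        if (strlen($char) === 1) {
            // Single-byte characters are already ASCII.
            $out .= $char;
            continue;
        }
        $out .= $table[$char] ?? $fallback;
    }
    return $out;
}

// Sample table (assumed values, for illustration only).
$table = ["\xC3\xBC" => 'ue', "\xC3\x9F" => 'ss'];
echo mapToAscii("Gr\xC3\xBC\xC3\x9Fe \xE2\x82\xAC", $table), "\n";
```

The euro sign has no entry in the sample table, so it is replaced by the fallback character.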
Methods
UnumberToChar()
Converts a UNICODE number to a UTF-8 multibyte character. Algorithm based on a script found at http://czyborra.com/utf/. Unit-tested by Kasper.
public UnumberToChar(int $unicodeInteger) : string
In UTF-8, the binary representation of the character's integer value is spread across the bytes, and the number of high bits set in the lead byte announces the number of bytes in the multibyte sequence:
bytes | bits | representation
1 | 7 | 0vvvvvvv
2 | 11 | 110vvvvv 10vvvvvv
3 | 16 | 1110vvvv 10vvvvvv 10vvvvvv
4 | 21 | 11110vvv 10vvvvvv 10vvvvvv 10vvvvvv
5 | 26 | 111110vv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
6 | 31 | 1111110v 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
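The byte layouts in the table can be sketched in a few lines of PHP. `encodeCodePoint()` is a hypothetical helper written for this illustration. It is not the implementation of `UnumberToChar()`, and it follows the six-row table literally. Note that current UTF-8 (RFC 3629) only permits sequences of up to 4 bytes, that is, code points up to U+10FFFF.

```php
<?php
/**
 * Hypothetical helper, not part of CharsetConverter: encodes an integer
 * of up to 31 bits using the 1- to 6-byte layouts from the table.
 */
function encodeCodePoint(int $codePoint): string
{
    if ($codePoint < 0 || $codePoint > 0x7FFFFFFF) {
        throw new InvalidArgumentException('Value does not fit into 31 bits');
    }
    if ($codePoint < 0x80) {
        // 1 byte, 7 bits: 0vvvvvvv
        return chr($codePoint);
    }
    // Pick the shortest sequence whose payload (11, 16, 21 or 26 bits)
    // can hold the value; anything larger needs 6 bytes.
    $byteCount = 6;
    foreach ([2 => 11, 3 => 16, 4 => 21, 5 => 26] as $count => $bits) {
        if ($codePoint < (1 << $bits)) {
            $byteCount = $count;
            break;
        }
    }
    // Continuation bytes carry 6 bits each (10vvvvvv), lowest bits last.
    $tail = '';
    for ($i = 1; $i < $byteCount; $i++) {
        $tail = chr(0x80 | ($codePoint & 0x3F)) . $tail;
        $codePoint >>= 6;
    }
    // The lead byte starts with $byteCount one-bits followed by a zero.
    $leadPrefix = (0xFF << (8 - $byteCount)) & 0xFF;
    return chr($leadPrefix | $codePoint) . $tail;
}

// U+20AC (euro sign) needs 3 bytes.
echo bin2hex(encodeCodePoint(0x20AC)), "\n"; // prints "e282ac"
```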
Parameters
- $unicodeInteger : int
- UNICODE integer
Return values
string —UTF-8 multibyte character string
utf8_char_mapping()
Maps all characters of a UTF-8 string.
public utf8_char_mapping(string $str) : string
Parameters
- $str : string
- UTF-8 string
Return values
string
utf8_to_numberarray()
Converts all chars in the input UTF-8 string into integer numbers returned in an array.
public utf8_to_numberarray(string $str) : array<string|int, mixed>
All HTML entities (like &amp; or &pound; or &#123; or &#x3f5d;) will be detected as characters. Also, instead of integer numbers the real UTF-8 char is returned.
Parameters
- $str : string
- Input string, UTF-8
Return values
array<string|int, mixed> —Output array with the char numbers
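The general idea of turning a UTF-8 string into per-character values, with HTML entities counted as single characters, can be sketched with standard PHP functions. `utf8ToCodePoints()` is a hypothetical stand-in, not the method documented here. It assumes PHP 7.4 or later with the mbstring extension, and it returns integers only.

```php
<?php
/**
 * Hypothetical helper, not part of CharsetConverter: decodes HTML
 * entities, then returns one Unicode code point per character.
 *
 * @return int[]
 */
function utf8ToCodePoints(string $str): array
{
    // Turn named and numeric entities into real UTF-8 characters first,
    // so that each entity counts as exactly one character.
    $decoded = html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Split into characters, then map each character to its code point.
    return array_map(
        fn (string $char): int => mb_ord($char, 'UTF-8'),
        mb_str_split($decoded, 1, 'UTF-8')
    );
}

print_r(utf8ToCodePoints('a&pound;&#123;'));
```

For the sample input the resulting array holds 97, 163 and 123.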
utf8CharToUnumber()
Converts a UTF-8 multibyte character to a UNICODE number. Unit-tested by Kasper.
public utf8CharToUnumber(string $str[, bool $hex = false ]) : int
Parameters
- $str : string
- UTF-8 multibyte character string
- $hex : bool = false
- If set, a hexadecimal number is returned.
Return values
int —UNICODE integer
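The reverse direction can be sketched as follows: count the leading one-bits of the lead byte to get the sequence length, then collect 6 payload bits from each continuation byte. `decodeUtf8Char()` is a hypothetical helper for illustration, not the implementation of `utf8CharToUnumber()`. The `$hex` option is approximated in the usage line with PHP's `dechex()`, which is an assumption about what "hex. number" means here.

```php
<?php
/**
 * Hypothetical helper, not part of CharsetConverter: decodes the first
 * UTF-8 character of $char into its integer value.
 */
function decodeUtf8Char(string $char): int
{
    if ($char === '') {
        throw new InvalidArgumentException('Empty input');
    }
    $lead = ord($char[0]);
    if ($lead < 0x80) {
        // Single-byte (ASCII) character: 0vvvvvvv
        return $lead;
    }
    // The number of leading one-bits equals the sequence length.
    $byteCount = 0;
    for ($mask = 0x80; ($lead & $mask) !== 0; $mask >>= 1) {
        $byteCount++;
    }
    if ($byteCount < 2 || $byteCount > 6 || strlen($char) < $byteCount) {
        throw new InvalidArgumentException('Invalid or truncated sequence');
    }
    // Keep only the payload bits of the lead byte.
    $codePoint = $lead & (0xFF >> ($byteCount + 1));
    for ($i = 1; $i < $byteCount; $i++) {
        $byte = ord($char[$i]);
        if (($byte & 0xC0) !== 0x80) {
            throw new InvalidArgumentException('Invalid continuation byte');
        }
        // Each continuation byte (10vvvvvv) contributes 6 bits.
        $codePoint = ($codePoint << 6) | ($byte & 0x3F);
    }
    return $codePoint;
}

// Euro sign, bytes E2 82 AC.
echo dechex(decodeUtf8Char("\xE2\x82\xAC")), "\n"; // prints "20ac"
```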
initUnicodeData()
This function initializes all UTF-8 character data tables.
protected initUnicodeData() : bool
PLEASE SEE: http://www.unicode.org/Public/UNIDATA/
Return values
bool —Returns FALSE on error, TRUE value on success