Unicode
Table of Contents
Classes
- SpoofDetector
- Class SpoofDetector
- Utf8String
- A class for manipulating UTF-8 strings.
Functions
- utf8_casefold_simple_maps() : array<string|int, mixed>
- Helper function for utf8_casefold.
- utf8_casefold_maps() : array<string|int, mixed>
- Helper function for utf8_casefold.
- utf8_strtolower_simple_maps() : array<string|int, mixed>
- Helper function for utf8_strtolower.
- utf8_strtolower_maps() : array<string|int, mixed>
- Helper function for utf8_strtolower.
- utf8_titlecase_simple_maps() : array<string|int, mixed>
- Helper function for utf8_convert_case.
- utf8_titlecase_maps() : array<string|int, mixed>
- Helper function for utf8_convert_case.
- utf8_strtoupper_simple_maps() : array<string|int, mixed>
- Helper function for utf8_strtoupper.
- utf8_strtoupper_maps() : array<string|int, mixed>
- Helper function for utf8_strtoupper.
- utf8_combining_classes() : array<string|int, mixed>
- Helper function for utf8_normalize_d.
- utf8_compose_maps() : array<string|int, mixed>
- Helper function for utf8_compose.
- utf8_confusables() : array<string|int, mixed>
- Helper function for SMF\Unicode\SpoofDetector::getSkeletonString.
- utf8_character_scripts() : array<string|int, mixed>
- Helper function for SpoofDetector::resolveScriptSet.
- utf8_regex_identifier_status() : array<string|int, mixed>
- Helper function for SpoofDetector::checkHomographNames.
- currencies() : array<string|int, mixed>
- Helper function for SMF\Localization\MessageFormatter::formatMessage.
- country_currencies() : array<string|int, mixed>
- Helper function for SMF\Localization\MessageFormatter::formatMessage.
- utf8_normalize_d_maps() : array<string|int, mixed>
- Helper function for utf8_normalize_d.
- utf8_normalize_kd_maps() : array<string|int, mixed>
- Helper function for utf8_normalize_kd.
- utf8_default_ignorables() : array<string|int, mixed>
- Helper function for utf8_normalize_kc_casefold.
- idna_maps() : array<string|int, mixed>
- Helper function for idn_to_* polyfills.
- idna_maps_deviation() : array<string|int, mixed>
- Helper function for idn_to_* polyfills.
- idna_regex() : array<string|int, mixed>
- Helper function for idn_to_* polyfills.
- plurals() : array<string|int, mixed>
- Helper function for SMF\Localization\MessageFormatter::formatMessage.
- utf8_regex_quick_check() : array<string|int, mixed>
- Helper function for utf8_is_normalized.
- utf8_regex_properties() : array<string|int, mixed>
- Helper function for utf8_sanitize_invisibles and utf8_convert_case.
- utf8_regex_variation_selectors() : array<string|int, mixed>
- Helper function for utf8_sanitize_invisibles.
- utf8_regex_joining_type() : array<string|int, mixed>
- Helper function for utf8_sanitize_invisibles.
- utf8_regex_indic() : array<string|int, mixed>
- Helper function for utf8_sanitize_invisibles.
Functions
utf8_casefold_simple_maps()
Helper function for utf8_casefold.
utf8_casefold_simple_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Simple casefolding maps.
utf8_casefold_maps()
Helper function for utf8_casefold.
utf8_casefold_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Full casefolding maps.
utf8_strtolower_simple_maps()
Helper function for utf8_strtolower.
utf8_strtolower_simple_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Simple uppercase to lowercase maps.
utf8_strtolower_maps()
Helper function for utf8_strtolower.
utf8_strtolower_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Full uppercase to lowercase maps.
utf8_titlecase_simple_maps()
Helper function for utf8_convert_case.
utf8_titlecase_simple_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Simple title case maps.
utf8_titlecase_maps()
Helper function for utf8_convert_case.
utf8_titlecase_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Full title case maps.
utf8_strtoupper_simple_maps()
Helper function for utf8_strtoupper.
utf8_strtoupper_simple_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Simple lowercase to uppercase maps.
utf8_strtoupper_maps()
Helper function for utf8_strtoupper.
utf8_strtoupper_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Full lowercase to uppercase maps.
utf8_combining_classes()
Helper function for utf8_normalize_d.
utf8_combining_classes() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Combining Class data for Unicode normalization.
utf8_compose_maps()
Helper function for utf8_compose.
utf8_compose_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Composition maps for Unicode normalization.
utf8_confusables()
Helper function for SMF\Unicode\SpoofDetector::getSkeletonString.
utf8_confusables() : array<string|int, mixed>
Returns an array of "confusables" maps that can be used for confusable string detection.
Data compiled from: https://www.unicode.org/Public/security/latest/confusables.txt
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - "Confusables" maps.
utf8_character_scripts()
Helper function for SpoofDetector::resolveScriptSet.
utf8_character_scripts() : array<string|int, mixed>
Each key in the returned array marks the END of a range of characters that all share the same script set. For example, the first key, "\x40", covers the characters from "\x0" to "\x40", and the second key, "\x5A", covers the characters from "\x41" to "\x5A".
The first entry in each value array indicates the primary script (i.e. the value of the Script property) for that set of characters. If those characters can also occur in a limited number of other scripts (i.e. the Script_Extensions property for those characters is not empty), those additional scripts are listed after the first.
See https://www.unicode.org/reports/tr24/ for more info.
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
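The range-end keys described above can be searched with a simple linear scan. The following is a minimal sketch, not SMF's actual lookup code: the function name lookup_script_set() and the four-entry $ranges table are hypothetical stand-ins for the real data returned by utf8_character_scripts(), and it assumes the mbstring extension (PHP 7.2 or later) is available for mb_ord().

```php
<?php
// Hypothetical miniature table in the shape described above: each key is the
// LAST character of a range, and each value lists the primary script first.
// The real table comes from utf8_character_scripts(); these entries are
// illustrative only.
$ranges = [
    "\x40" => ['Common'], // U+0000 to U+0040
    "\x5A" => ['Latin'],  // U+0041 to U+005A (A-Z)
    "\x60" => ['Common'], // U+005B to U+0060
    "\x7A" => ['Latin'],  // U+0061 to U+007A (a-z)
];

/**
 * Hypothetical helper: returns the script set of the first range whose end
 * key is greater than or equal to the given character, or null if the
 * character lies beyond the last range in the table.
 */
function lookup_script_set(array $ranges, string $char): ?array
{
    $cp = mb_ord($char, 'UTF-8');

    foreach ($ranges as $end => $scripts) {
        // PHP converts numeric string keys to integers, so cast back to a
        // string before taking the code point of the range end.
        if ($cp <= mb_ord((string) $end, 'UTF-8')) {
            return $scripts;
        }
    }

    return null;
}
```

With this sample table, lookup_script_set($ranges, 'A') returns ['Latin'] and lookup_script_set($ranges, '@') returns ['Common']. A linear scan keeps the sketch short; with the full data set, a binary search over the sorted keys would likely be a better fit.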
Return values
array<string|int, mixed> - Script data for ranges of Unicode characters.
utf8_regex_identifier_status()
Helper function for SpoofDetector::checkHomographNames.
utf8_regex_identifier_status() : array<string|int, mixed>
Returns an array of regexes that can be used to check the "identifier status" of characters in a string.
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character classes for identifier statuses.
currencies()
Helper function for SMF\Localization\MessageFormatter::formatMessage.
currencies() : array<string|int, mixed>
Data compiled from: https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/supplemental/currencyData.json
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Information about different currencies.
country_currencies()
Helper function for SMF\Localization\MessageFormatter::formatMessage.
country_currencies() : array<string|int, mixed>
Data compiled from: https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/supplemental/currencyData.json
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Information about currencies used in different countries.
utf8_normalize_d_maps()
Helper function for utf8_normalize_d.
utf8_normalize_d_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Canonical Decomposition maps for Unicode normalization.
utf8_normalize_kd_maps()
Helper function for utf8_normalize_kd.
utf8_normalize_kd_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Compatibility Decomposition maps for Unicode normalization.
utf8_default_ignorables()
Helper function for utf8_normalize_kc_casefold.
utf8_default_ignorables() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Characters with the 'Default_Ignorable_Code_Point' property.
idna_maps()
Helper function for idn_to_* polyfills.
idna_maps() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character maps for IDNA processing.
idna_maps_deviation()
Helper function for idn_to_* polyfills.
idna_maps_deviation() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - "Deviation" character maps for IDNA processing.
idna_regex()
Helper function for idn_to_* polyfills.
idna_regex() : array<string|int, mixed>
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Regular expressions useful for IDNA processing.
plurals()
Helper function for SMF\Localization\MessageFormatter::formatMessage.
plurals() : array<string|int, mixed>
Rules compiled from:
- https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/supplemental/plurals.json
- https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/supplemental/ordinals.json
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Pluralization rules for different languages.
utf8_regex_quick_check()
Helper function for utf8_is_normalized.
utf8_regex_quick_check() : array<string|int, mixed>
Character class lists compiled from: https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedNormalizationProps.txt
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character classes for disallowed characters in normalization forms.
utf8_regex_properties()
Helper function for utf8_sanitize_invisibles and utf8_convert_case.
utf8_regex_properties() : array<string|int, mixed>
Character class lists compiled from:
- https://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt
- https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt
- https://www.unicode.org/Public/UCD/latest/ucd/emoji/emoji-data.txt
- https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedGeneralCategory.txt
- https://www.unicode.org/Public/UCD/latest/ucd/auxiliary/WordBreakProperty.txt
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character classes for various Unicode properties.
utf8_regex_variation_selectors()
Helper function for utf8_sanitize_invisibles.
utf8_regex_variation_selectors() : array<string|int, mixed>
Character class lists compiled from:
- https://www.unicode.org/Public/UCD/latest/ucd/StandardizedVariants.txt
- https://www.unicode.org/Public/UCD/latest/ucd/emoji/emoji-variation-sequences.txt
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character classes for filtering variation selectors.
utf8_regex_joining_type()
Helper function for utf8_sanitize_invisibles.
utf8_regex_joining_type() : array<string|int, mixed>
Character class lists compiled from: https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedJoiningType.txt
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character classes for joining characters in certain scripts.
utf8_regex_indic()
Helper function for utf8_sanitize_invisibles.
utf8_regex_indic() : array<string|int, mixed>
Character class lists compiled from:
- https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedCombiningClass.txt
- https://www.unicode.org/Public/UCD/latest/ucd/IndicSyllabicCategory.txt
Developers: Do not update the data in this function manually. Instead, run "php -f other/update_unicode_data.php" on the command line.
Return values
array<string|int, mixed> - Character classes for Indic scripts that use viramas.