java.lang.Object
↳ sun.text.normalizer.UCharacterProperty
Internal class used for Unicode character property database.
This class stores binary data read from uprops.icu. It cannot parse the data into higher-level information; it only returns bytes of information when required.
Because of the form most commonly used for retrieval, an array of char is used to store the binary data.
UCharacterPropertyDB also contains information on accessing indexes to significant points in the binary data.
Responsibility for molding the binary data into a more meaningful form lies with UCharacter.
Constants | | |
---|---|---|
int | EXCEPTION_MASK | Exception test mask |
int | EXC_CASE_FOLDING_ | Exception indicator for case folding type |
int | EXC_COMBINING_CLASS_ | EXC_COMBINING_CLASS_ is not found in ICU. |
int | EXC_DENOMINATOR_VALUE_ | Exception indicator for denominator type |
int | EXC_LOWERCASE_ | Exception indicator for lowercase type |
int | EXC_MIRROR_MAPPING_ | Exception indicator for mirror type |
int | EXC_NUMERIC_VALUE_ | Exception indicator for numeric type |
int | EXC_SPECIAL_CASING_ | Exception indicator for special casing type |
int | EXC_TITLECASE_ | Exception indicator for titlecase type |
int | EXC_UNUSED_ | Exception indicator for digit type |
int | EXC_UPPERCASE_ | Exception indicator for uppercase type |
char | LATIN_SMALL_LETTER_I_ | Latin lowercase i |
int | TYPE_MASK | Character type mask |
Fields | |
---|---|
m_property_ | Character property table |
m_trieData_ | Optimization CharTrie data array |
m_trieIndex_ | Optimization CharTrie index array |
m_trieInitialValue_ | Optimization CharTrie data offset |
m_trie_ | Trie data |
m_unicodeVersion_ | Unicode version |
Public Methods |
---|
Gets the Unicode additional properties. |
Gets the "age" of the code point. |
Gets the exception value at the index, assuming that the data type is available. |
Gets the exception index for the given property. |
Gets the folded case value at the index. |
Called by com.ibm.icu.util.Trie to extract, from a lead surrogate's data, the index array offset of the indexes for that lead surrogate. |
Loads the property data and initializes the UCharacterProperty instance. |
Gets the property value at the index. |
Forms a supplementary code point from the argument characters. This is for internal use, so the validity of the surrogate characters is not checked. |
Gets the signed numeric value of a character embedded in the property argument. |
Determines whether the exception value passed in has the kind of information the indicator asks for, e.g. whether the exception value contains the digit value of the character. |
Checks whether the argument c is to be treated as white space in ICU rules. |
Java friends implementation |
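EXCEPTION_MASK | Exception test mask |
---|---|
EXC_CASE_FOLDING_ | Exception indicator for case folding type |
EXC_COMBINING_CLASS_ | EXC_COMBINING_CLASS_ is not found in ICU. Used to retrieve the combining class of the character in the exception value. |
EXC_DENOMINATOR_VALUE_ | Exception indicator for denominator type |
EXC_LOWERCASE_ | Exception indicator for lowercase type |
EXC_MIRROR_MAPPING_ | Exception indicator for mirror type |
EXC_NUMERIC_VALUE_ | Exception indicator for numeric type |
EXC_SPECIAL_CASING_ | Exception indicator for special casing type |
EXC_TITLECASE_ | Exception indicator for titlecase type |
EXC_UNUSED_ | Exception indicator for digit type |
EXC_UPPERCASE_ | Exception indicator for uppercase type |
LATIN_SMALL_LETTER_I_ | Latin lowercase i |
TYPE_MASK | Character type mask |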
m_property_ | Character property table |
---|---|
m_trieData_ | Optimization CharTrie data array |
m_trieIndex_ | Optimization CharTrie index array |
m_trieInitialValue_ | Optimization CharTrie data offset |
Gets the Unicode additional properties. The C version is getUnicodeProperties.
codepoint | code point whose additional properties are to be retrieved |
---|
Gets the "age" of the code point.
The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.
This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the code point.
codepoint | The code point. |
---|
Gets the exception value at the index, assuming that the data type is available. The result is undefined if the data is not available. Use hasExceptionValue() to determine whether the data is available.
etype | exception data type |
---|
Gets the exception index for the given property.
prop | character property |
---|
Gets the folded case value at the index.
index | index of the case value to be retrieved |
---|---|
count | number of characters to retrieve |
str | string buffer to which to append the result |
Called by com.ibm.icu.util.Trie to extract, from a lead surrogate's data, the index array offset of the indexes for that lead surrogate.
value | data value for a surrogate from the trie, including the folding offset |
---|
Loads the property data and initializes the UCharacterProperty instance.
RuntimeException | when data is missing or data has been corrupted |
---|
Gets the property value at the index. This lookup is optimized. Note that it differs slightly from CharTrie: the index into m_trieData_ is never negative.
ch | code point whose property value is to be retrieved |
---|
Forms a supplementary code point from the argument characters. This is for internal use, so the validity of the surrogate characters is not checked.
lead | lead surrogate character |
---|---|
trail | trailing surrogate character |
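The combination of a lead and trail surrogate can be illustrated with the standard UTF-16 arithmetic. This is a minimal sketch, not the class's actual code: the class name `SurrogateSketch`, the method name `toSupplementary`, and the constant `SURROGATE_OFFSET` are hypothetical. Like the method described above, it performs no validity checks on its arguments.

```java
// Hypothetical illustration of forming a supplementary code point
// from a UTF-16 surrogate pair without validating the inputs.
final class SurrogateSketch {
    // Folds the surrogate bases (0xD800, 0xDC00) and the 0x10000
    // supplementary base into a single constant.
    private static final int SURROGATE_OFFSET =
            0x10000 - (0xD800 << 10) - 0xDC00;

    // No checks are made that lead and trail are real surrogates.
    static int toSupplementary(char lead, char trail) {
        return (lead << 10) + trail + SURROGATE_OFFSET;
    }

    public static void main(String[] args) {
        int cp = toSupplementary('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(cp)); // prints "1f600"
    }
}
```

For valid surrogate pairs the result agrees with the standard library method `Character.toCodePoint(char, char)`.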
Gets the signed numeric value of a character embedded in the property argument.
prop | the character |
---|
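One common way to embed a signed value in a packed property word is to store it in the high bits and recover it with an arithmetic right shift, which preserves the sign. The sketch below assumes that layout; the shift width `VALUE_SHIFT`, the class name, and both method names are hypothetical and are not taken from the class documented here.

```java
// Hypothetical illustration: a signed value stored in the high bits
// of a 32-bit property word, recovered with an arithmetic shift.
final class SignedValueSketch {
    // Assumed position of the embedded value; the real layout may differ.
    static final int VALUE_SHIFT = 20;

    // Arithmetic shift (>>) copies the sign bit, so negative values survive.
    static int signedValue(int prop) {
        return prop >> VALUE_SHIFT;
    }

    // Helper for the example only: packs a value above some low flag bits.
    static int pack(int value, int lowBits) {
        return (value << VALUE_SHIFT) | (lowBits & ((1 << VALUE_SHIFT) - 1));
    }

    public static void main(String[] args) {
        int prop = pack(-3, 0x5);
        System.out.println(signedValue(prop)); // prints "-3"
    }
}
```

Using the logical shift `>>>` instead would return a large positive number for negative values, which is why the arithmetic form matters here.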
Determines whether the exception value passed in has the kind of information the indicator asks for, e.g. whether the exception value contains the digit value of the character.
index | exception index |
---|---|
indicator | type indicator |
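A check of this kind is typically a single bit test. The sketch below assumes each indicator is a bit position within the exception word at the given index. That layout, the class name, the method name, and the indicator values used in the example are all hypothetical and may not match the real data format.

```java
// Hypothetical illustration: testing whether an exception word carries
// the kind of data that an indicator (assumed to be a bit position) names.
final class ExceptionFlagSketch {
    // Example indicator positions, invented for this sketch only.
    static final int UPPERCASE_BIT = 0;
    static final int LOWERCASE_BIT = 1;
    static final int TITLECASE_BIT = 2;

    static boolean hasValue(int[] exceptions, int index, int indicator) {
        return (exceptions[index] & (1 << indicator)) != 0;
    }

    public static void main(String[] args) {
        // Word 0 declares uppercase and titlecase data, but no lowercase data.
        int[] exceptions = { (1 << UPPERCASE_BIT) | (1 << TITLECASE_BIT) };
        System.out.println(hasValue(exceptions, 0, UPPERCASE_BIT)); // prints "true"
        System.out.println(hasValue(exceptions, 0, LOWERCASE_BIT)); // prints "false"
    }
}
```

A caller would run this test first and read the exception value only when it returns true, since the result of reading unavailable data is undefined.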
Checks if the argument c is to be treated as a white space in ICU rules. Usually ICU rule white spaces are ignored unless quoted.
c | codepoint to check |
---|
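The text above does not list which code points count as rule white space. As a sketch, the example below assumes the set matches the Unicode Pattern_White_Space property (U+0009 to U+000D, U+0020, U+0085, U+200E, U+200F, U+2028, U+2029). The class and method names are hypothetical.

```java
// Hypothetical illustration: membership test for the Unicode
// Pattern_White_Space set, assumed here to match ICU rule white space.
final class RuleWhiteSpaceSketch {
    static boolean isPatternWhiteSpace(int c) {
        return (c >= 0x0009 && c <= 0x000D) // TAB, LF, VT, FF, CR
                || c == 0x0020              // SPACE
                || c == 0x0085              // NEXT LINE
                || c == 0x200E              // LEFT-TO-RIGHT MARK
                || c == 0x200F              // RIGHT-TO-LEFT MARK
                || c == 0x2028              // LINE SEPARATOR
                || c == 0x2029;             // PARAGRAPH SEPARATOR
    }

    public static void main(String[] args) {
        System.out.println(isPatternWhiteSpace(' '));    // prints "true"
        System.out.println(isPatternWhiteSpace(0x00A0)); // prints "false"
    }
}
```

Note that this set is narrower than what `Character.isWhitespace` accepts; for example, U+00A0 NO-BREAK SPACE and U+3000 IDEOGRAPHIC SPACE are not included.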