java.lang.Object | |
↳ | java.lang.Character |
The wrapper for the primitive type char. This class also provides a
number of utility methods for working with characters.
Character data is kept up to date as Unicode evolves.
See the Locale data section of the Locale documentation for details
of the Unicode versions implemented by current and historical Android releases.
The Unicode specification, character tables, and other information are available at http://www.unicode.org/.
Unicode characters are referred to as code points. The range of valid
code points is U+0000 to U+10FFFF. The Basic Multilingual Plane (BMP)
is the code point range U+0000 to U+FFFF. Characters above the BMP are
referred to as Supplementary Characters. On the Java platform, UTF-16
encoding and char pairs are used to represent code points in the
supplementary range. A pair of char values that represents a
supplementary character is made up of a high surrogate with a value
range of 0xD800 to 0xDBFF and a low surrogate with a value range of
0xDC00 to 0xDFFF.
On the Java platform a char value represents either a single BMP code
point or a UTF-16 unit that's part of a surrogate pair. The int type
is used to represent all Unicode code points.
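As a rough illustration of the surrogate-pair encoding described above, the following sketch encodes one BMP code point and one supplementary code point with Character.toChars and lists the resulting char units. The class name SurrogatePairDemo and the helper describe are hypothetical names chosen for this example, and U+1F600 is just an arbitrary supplementary code point.

```java
public class SurrogatePairDemo {
    /** Encodes a code point as UTF-16 and lists the resulting char units in hex. */
    public static String describe(int codePoint) {
        char[] units = Character.toChars(codePoint);
        StringBuilder sb = new StringBuilder(String.format("U+%04X ->", codePoint));
        for (char unit : units) {
            sb.append(String.format(" 0x%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0041));   // U+0041 -> 0x0041
        System.out.println(describe(0x1F600));  // U+1F600 -> 0xD83D 0xDE00

        // A supplementary code point round-trips through its surrogate pair.
        char[] pair = Character.toChars(0x1F600);
        System.out.println(Character.isHighSurrogate(pair[0]));                  // true
        System.out.println(Character.isLowSurrogate(pair[1]));                   // true
        System.out.println(Character.toCodePoint(pair[0], pair[1]) == 0x1F600);  // true
    }
}
```

The int type carries the full code point, while each char holds only one UTF-16 unit.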
Unicode categories
Here's a list of the Unicode character categories and the corresponding Java constants,
grouped semantically to provide a convenient overview. This table is also useful in
conjunction with \p and \P in regular expressions.
Category | Description | Java constant |
---|---|---|
Cn | Unassigned | UNASSIGNED |
Cc | Control | CONTROL |
Cf | Format | FORMAT |
Co | Private use | PRIVATE_USE |
Cs | Surrogate | SURROGATE |
Lu | Uppercase letter | UPPERCASE_LETTER |
Ll | Lowercase letter | LOWERCASE_LETTER |
Lt | Titlecase letter | TITLECASE_LETTER |
Lm | Modifier letter | MODIFIER_LETTER |
Lo | Other letter | OTHER_LETTER |
Mn | Non-spacing mark | NON_SPACING_MARK |
Me | Enclosing mark | ENCLOSING_MARK |
Mc | Combining spacing mark | COMBINING_SPACING_MARK |
Nd | Decimal digit number | DECIMAL_DIGIT_NUMBER |
Nl | Letter number | LETTER_NUMBER |
No | Other number | OTHER_NUMBER |
Pd | Dash punctuation | DASH_PUNCTUATION |
Ps | Start punctuation | START_PUNCTUATION |
Pe | End punctuation | END_PUNCTUATION |
Pc | Connector punctuation | CONNECTOR_PUNCTUATION |
Pi | Initial quote punctuation | INITIAL_QUOTE_PUNCTUATION |
Pf | Final quote punctuation | FINAL_QUOTE_PUNCTUATION |
Po | Other punctuation | OTHER_PUNCTUATION |
Sm | Math symbol | MATH_SYMBOL |
Sc | Currency symbol | CURRENCY_SYMBOL |
Sk | Modifier symbol | MODIFIER_SYMBOL |
So | Other symbol | OTHER_SYMBOL |
Zs | Space separator | SPACE_SEPARATOR |
Zl | Line separator | LINE_SEPARATOR |
Zp | Paragraph separator | PARAGRAPH_SEPARATOR |
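To show how the two-letter category codes relate to the Java constants, here is a small sketch that tests for the Lu category in two ways: through Character.getType and through a \p{Lu} regular expression. The class name CategoryDemo and both helper methods are hypothetical; only getType, toChars and java.util.regex.Pattern come from the platform.

```java
import java.util.regex.Pattern;

public class CategoryDemo {
    /** Checks the general category directly via the Java constant. */
    public static boolean isUppercaseByType(int codePoint) {
        return Character.getType(codePoint) == Character.UPPERCASE_LETTER;
    }

    /** Checks the same category via the two-letter code in a regular expression. */
    public static boolean isUppercaseByRegex(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return Pattern.matches("\\p{Lu}", s);
    }

    public static void main(String[] args) {
        System.out.println(isUppercaseByType('A'));   // true
        System.out.println(isUppercaseByRegex('A'));  // true
        System.out.println(isUppercaseByType('a'));   // false
        System.out.println(isUppercaseByRegex('a'));  // false
    }
}
```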
Nested Classes | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Character.Subset | |||||||||||
Character.UnicodeBlock | Represents a block of Unicode characters, as defined by the Unicode 4.0.1 specification. |
Constants | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
byte | COMBINING_SPACING_MARK | Unicode category constant Mc. | |||||||||
byte | CONNECTOR_PUNCTUATION | Unicode category constant Pc. | |||||||||
byte | CONTROL | Unicode category constant Cc. | |||||||||
byte | CURRENCY_SYMBOL | Unicode category constant Sc. | |||||||||
byte | DASH_PUNCTUATION | Unicode category constant Pd. | |||||||||
byte | DECIMAL_DIGIT_NUMBER | Unicode category constant Nd. | |||||||||
byte | DIRECTIONALITY_ARABIC_NUMBER | Unicode bidirectional constant AN. | |||||||||
byte | DIRECTIONALITY_BOUNDARY_NEUTRAL | Unicode bidirectional constant BN. | |||||||||
byte | DIRECTIONALITY_COMMON_NUMBER_SEPARATOR | Unicode bidirectional constant CS. | |||||||||
byte | DIRECTIONALITY_EUROPEAN_NUMBER | Unicode bidirectional constant EN. | |||||||||
byte | DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR | Unicode bidirectional constant ES. | |||||||||
byte | DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR | Unicode bidirectional constant ET. | |||||||||
byte | DIRECTIONALITY_LEFT_TO_RIGHT | Unicode bidirectional constant L. | |||||||||
byte | DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING | Unicode bidirectional constant LRE. | |||||||||
byte | DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE | Unicode bidirectional constant LRO. | |||||||||
byte | DIRECTIONALITY_NONSPACING_MARK | Unicode bidirectional constant NSM. | |||||||||
byte | DIRECTIONALITY_OTHER_NEUTRALS | Unicode bidirectional constant ON. | |||||||||
byte | DIRECTIONALITY_PARAGRAPH_SEPARATOR | Unicode bidirectional constant B. | |||||||||
byte | DIRECTIONALITY_POP_DIRECTIONAL_FORMAT | Unicode bidirectional constant PDF. | |||||||||
byte | DIRECTIONALITY_RIGHT_TO_LEFT | Unicode bidirectional constant R. | |||||||||
byte | DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC | Unicode bidirectional constant AL. | |||||||||
byte | DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING | Unicode bidirectional constant RLE. | |||||||||
byte | DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE | Unicode bidirectional constant RLO. | |||||||||
byte | DIRECTIONALITY_SEGMENT_SEPARATOR | Unicode bidirectional constant S. | |||||||||
byte | DIRECTIONALITY_UNDEFINED | Unicode bidirectional constant. | |||||||||
byte | DIRECTIONALITY_WHITESPACE | Unicode bidirectional constant WS. | |||||||||
byte | ENCLOSING_MARK | Unicode category constant Me. | |||||||||
byte | END_PUNCTUATION | Unicode category constant Pe. | |||||||||
byte | FINAL_QUOTE_PUNCTUATION | Unicode category constant Pf. | |||||||||
byte | FORMAT | Unicode category constant Cf. | |||||||||
byte | INITIAL_QUOTE_PUNCTUATION | Unicode category constant Pi. | |||||||||
byte | LETTER_NUMBER | Unicode category constant Nl. | |||||||||
byte | LINE_SEPARATOR | Unicode category constant Zl. | |||||||||
byte | LOWERCASE_LETTER | Unicode category constant Ll. | |||||||||
byte | MATH_SYMBOL | Unicode category constant Sm. | |||||||||
int | MAX_CODE_POINT | The maximum code point value, U+10FFFF. | |||||||||
char | MAX_HIGH_SURROGATE | The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uDBFF'. | |||||||||
char | MAX_LOW_SURROGATE | The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDFFF'. | |||||||||
int | MAX_RADIX | The maximum radix used for conversions between characters and integers. | |||||||||
char | MAX_SURROGATE | The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'. | |||||||||
char | MAX_VALUE | The maximum Character value. | |||||||||
int | MIN_CODE_POINT | The minimum code point value, U+0000. | |||||||||
char | MIN_HIGH_SURROGATE | The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uD800'. | |||||||||
char | MIN_LOW_SURROGATE | The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDC00'. | |||||||||
int | MIN_RADIX | The minimum radix used for conversions between characters and integers. | |||||||||
int | MIN_SUPPLEMENTARY_CODE_POINT | The minimum value of a supplementary code point, U+010000. | |||||||||
char | MIN_SURROGATE | The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'. | |||||||||
char | MIN_VALUE | The minimum Character value. | |||||||||
byte | MODIFIER_LETTER | Unicode category constant Lm. | |||||||||
byte | MODIFIER_SYMBOL | Unicode category constant Sk. | |||||||||
byte | NON_SPACING_MARK | Unicode category constant Mn. | |||||||||
byte | OTHER_LETTER | Unicode category constant Lo. | |||||||||
byte | OTHER_NUMBER | Unicode category constant No. | |||||||||
byte | OTHER_PUNCTUATION | Unicode category constant Po. | |||||||||
byte | OTHER_SYMBOL | Unicode category constant So. | |||||||||
byte | PARAGRAPH_SEPARATOR | Unicode category constant Zp. | |||||||||
byte | PRIVATE_USE | Unicode category constant Co. | |||||||||
int | SIZE | The number of bits required to represent a Character value in unsigned form. | |||||||||
byte | SPACE_SEPARATOR | Unicode category constant Zs. | |||||||||
byte | START_PUNCTUATION | Unicode category constant Ps. | |||||||||
byte | SURROGATE | Unicode category constant Cs. | |||||||||
byte | TITLECASE_LETTER | Unicode category constant Lt. | |||||||||
byte | UNASSIGNED | Unicode category constant Cn. | |||||||||
byte | UPPERCASE_LETTER | Unicode category constant Lu. |
Fields | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
TYPE | The Class object that represents the primitive type char . |
Public Constructors | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Constructs a new Character with the specified primitive char value. |
Public Methods | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Calculates the number of char values required to represent the specified Unicode code point. | |||||||||||
Gets the primitive value of this character. | |||||||||||
Returns the code point at index in the specified array of character units. | |||||||||||
Returns the code point at index in the specified array of character units, where index has to be less than limit. | |||||||||||
Returns the code point at index in the specified sequence of character units. | |||||||||||
Returns the code point that precedes index in the specified sequence of character units. | |||||||||||
Returns the code point that precedes the index in the specified array of character units and is not less than start. | |||||||||||
Returns the code point that precedes index in the specified array of character units. | |||||||||||
Counts the number of Unicode code points in the subsequence of the specified character sequence, as delineated by beginIndex and endIndex. | |||||||||||
Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by offset and count. | |||||||||||
Compares this object to the specified character object to determine their relative order. | |||||||||||
Convenience method to determine the value of the specified character c in the supplied radix. | |||||||||||
Convenience method to determine the value of the character codePoint in the supplied radix. | |||||||||||
Compares this object with the specified object and indicates if they are equal. | |||||||||||
Returns the character which represents the specified digit in the specified radix. | |||||||||||
Gets the Unicode directionality of the specified character. | |||||||||||
Gets the Unicode directionality of the specified character. | |||||||||||
Gets the numeric value of the specified Unicode code point. | |||||||||||
Returns the numeric value of the specified Unicode character. | |||||||||||
Gets the general Unicode category of the specified character. | |||||||||||
Gets the general Unicode category of the specified code point. | |||||||||||
Returns an integer hash code for this object. | |||||||||||
Indicates whether the specified code point is defined in the Unicode specification. | |||||||||||
Indicates whether the specified character is defined in the Unicode specification. | |||||||||||
Indicates whether the specified character is a digit. | |||||||||||
Indicates whether the specified code point is a digit. | |||||||||||
Indicates whether ch is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding. | |||||||||||
Indicates whether the specified character is an ISO control character. | |||||||||||
Indicates whether the specified code point is an ISO control character. | |||||||||||
Indicates whether the specified character is ignorable in a Java or Unicode identifier. | |||||||||||
Indicates whether the specified code point is ignorable in a Java or Unicode identifier. | |||||||||||
Indicates whether the specified code point is a valid part of a Java identifier other than the first character. | |||||||||||
Indicates whether the specified character is a valid part of a Java identifier other than the first character. | |||||||||||
Indicates whether the specified character is a valid first character for a Java identifier. | |||||||||||
Indicates whether the specified code point is a valid first character for a Java identifier. | |||||||||||
This method is deprecated. Use isJavaIdentifierStart(char) | |||||||||||
This method is deprecated. Use isJavaIdentifierPart(char) | |||||||||||
Indicates whether the specified character is a letter. | |||||||||||
Indicates whether the specified code point is a letter. | |||||||||||
Indicates whether the specified character is a letter or a digit. | |||||||||||
Indicates whether the specified code point is a letter or a digit. | |||||||||||
Indicates whether ch is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding. | |||||||||||
Indicates whether the specified code point is a lower case letter. | |||||||||||
Indicates whether the specified character is a lower case letter. | |||||||||||
Indicates whether the specified character is mirrored. | |||||||||||
Indicates whether the specified code point is mirrored. | |||||||||||
This method is deprecated. Use isWhitespace(char) | |||||||||||
Indicates whether the specified character is a Unicode space character. | |||||||||||
Indicates whether the specified code point is a Unicode space character. | |||||||||||
Indicates whether codePoint is within the supplementary code point range. | |||||||||||
Indicates whether the specified character pair is a valid surrogate pair. | |||||||||||
Indicates whether the specified code point is a titlecase character. | |||||||||||
Indicates whether the specified character is a titlecase character. | |||||||||||
Indicates whether the specified code point is valid as part of a Unicode identifier other than the first character. | |||||||||||
Indicates whether the specified character is valid as part of a Unicode identifier other than the first character. | |||||||||||
Indicates whether the specified character is a valid initial character for a Unicode identifier. | |||||||||||
Indicates whether the specified code point is a valid initial character for a Unicode identifier. | |||||||||||
Indicates whether the specified code point is an upper case letter. | |||||||||||
Indicates whether the specified character is an upper case letter. | |||||||||||
Indicates whether codePoint is a valid Unicode code point. | |||||||||||
Indicates whether the specified character is a whitespace character in Java. | |||||||||||
Indicates whether the specified code point is a whitespace character in Java. | |||||||||||
Determines the index in the specified character sequence that is offset codePointOffset code points from index. | |||||||||||
Determines the index in a subsequence of the specified character array that is offset codePointOffset code points from index. | |||||||||||
Reverses the order of the first and second byte in the specified character. | |||||||||||
Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array. | |||||||||||
Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array dst, starting at index dstIndex. | |||||||||||
Converts a surrogate pair into a Unicode code point. | |||||||||||
Returns the lower case equivalent for the specified character if the character is an upper case letter. | |||||||||||
Returns the lower case equivalent for the specified code point if it is an upper case letter. | |||||||||||
Converts the specified character to its string representation. | |||||||||||
Returns a string containing a concise, human-readable description of this object. | |||||||||||
Returns the title case equivalent for the specified character if it exists. | |||||||||||
Returns the title case equivalent for the specified code point if it exists. | |||||||||||
Returns the upper case equivalent for the specified character if the character is a lower case letter. | |||||||||||
Returns the upper case equivalent for the specified code point if the code point is a lower case letter. | |||||||||||
Returns a Character instance for the char value passed. |
Inherited Methods | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
From class
java.lang.Object
| |||||||||||
From interface
java.lang.Comparable
|
Constructs a new Character with the specified primitive char value.
value | the primitive char value to store in the new instance. |
---|
Calculates the number of char values required to represent the
specified Unicode code point. This method checks if the codePoint
is greater than or equal to 0x10000, in which case 2 is
returned, otherwise 1. To test if the code point is valid, use
the isValidCodePoint(int) method.
codePoint | the code point for which to calculate the number of required chars. |
---|
Returns 2 if codePoint >= 0x10000; 1 otherwise.
Gets the primitive value of this character.
Returns the code point at index in the specified array of
character units. If the unit at index is a high-surrogate unit,
index + 1 is less than the length of the array and the unit at
index + 1 is a low-surrogate unit, then the supplementary code
point represented by the pair is returned; otherwise the char
value at index is returned.
seq | the source array of char units. |
---|---|
index | the position in seq from which to retrieve the code point. |
Returns the code point or char value at index in seq.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if the index is negative or greater than or equal to the length of seq. |
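The lookup rule above lends itself to a forward iteration over a char array. The following is a minimal sketch, assuming a hypothetical class CodePointAtDemo with a helper toCodePoints; it advances by Character.charCount so that a surrogate pair is consumed as a single code point.

```java
public class CodePointAtDemo {
    /** Decodes a UTF-16 char array into an array of code points. */
    public static int[] toCodePoints(char[] seq) {
        int[] out = new int[Character.codePointCount(seq, 0, seq.length)];
        int n = 0;
        for (int i = 0; i < seq.length; ) {
            int cp = Character.codePointAt(seq, i);
            out[n++] = cp;
            // Skip one unit for a BMP code point, two for a surrogate pair.
            i += Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        char[] seq = "a\uD83D\uDE00b".toCharArray();
        for (int cp : toCodePoints(seq)) {
            System.out.printf("U+%04X%n", cp);  // U+0061, U+1F600, U+0062
        }
    }
}
```

An unpaired surrogate comes back as its own char value, so the loop still advances by one unit and terminates.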
Returns the code point at index in the specified array of
character units, where index has to be less than limit.
If the unit at index is a high-surrogate unit, index + 1
is less than limit and the unit at index + 1 is a
low-surrogate unit, then the supplementary code point represented by the
pair is returned; otherwise the char value at index is
returned.
seq | the source array of char units. |
---|---|
index | the position in seq from which to get the code point. |
limit | the index after the last unit in seq that can be used. |
Returns the code point or char value at index in seq.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if index < 0, index >= limit, limit < 0 or if limit is greater than the length of seq. |
Returns the code point at index in the specified sequence of
character units. If the unit at index is a high-surrogate unit,
index + 1 is less than the length of the sequence and the unit at
index + 1 is a low-surrogate unit, then the supplementary code
point represented by the pair is returned; otherwise the char
value at index is returned.
seq | the source sequence of char units. |
---|---|
index | the position in seq from which to retrieve the code point. |
Returns the code point or char value at index in seq.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if the index is negative or greater than or equal to the length of seq. |
Returns the code point that precedes index in the specified
sequence of character units. If the unit at index - 1 is a
low-surrogate unit, index - 2 is not negative and the unit at
index - 2 is a high-surrogate unit, then the supplementary code
point represented by the pair is returned; otherwise the char
value at index - 1 is returned.
seq | the source sequence of char units. |
---|---|
index | the position in seq following the code point that should be returned. |
Returns the code point or char value before index in seq.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if the index is less than 1 or greater than the length of seq. |
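Backward iteration follows the same pattern in reverse. The sketch below, with the hypothetical names CodePointBeforeDemo and reverseByCodePoint, walks a sequence from its end with Character.codePointBefore and rebuilds it in reverse order without splitting surrogate pairs. It assumes well-formed input; unpaired surrogates could end up next to each other in the result.

```java
public class CodePointBeforeDemo {
    /** Reverses a sequence code point by code point, keeping surrogate pairs intact. */
    public static String reverseByCodePoint(CharSequence seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length(); i > 0; ) {
            int cp = Character.codePointBefore(seq, i);
            sb.appendCodePoint(cp);
            // Step back one unit for a BMP code point, two for a surrogate pair.
            i -= Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String reversed = reverseByCodePoint("a\uD83D\uDE00b");
        System.out.println(reversed.equals("b\uD83D\uDE00a"));  // true
    }
}
```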
Returns the code point that precedes the index in the specified
array of character units and is not less than start. If the unit
at index - 1 is a low-surrogate unit, index - 2 is not
less than start and the unit at index - 2 is a
high-surrogate unit, then the supplementary code point represented by the
pair is returned; otherwise the char value at index - 1
is returned.
seq | the source array of char units. |
---|---|
index | the position in seq following the code point that should be returned. |
start | the index of the first element in seq. |
Returns the code point or char value before index in seq.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if index <= start, start < 0, index is greater than the length of seq, or if start is equal to or greater than the length of seq. |
Returns the code point that precedes index in the specified
array of character units. If the unit at index - 1 is a
low-surrogate unit, index - 2 is not negative and the unit at
index - 2 is a high-surrogate unit, then the supplementary code
point represented by the pair is returned; otherwise the char
value at index - 1 is returned.
seq | the source array of char units. |
---|---|
index | the position in seq following the code point that should be returned. |
Returns the code point or char value before index in seq.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if the index is less than 1 or greater than the length of seq. |
Counts the number of Unicode code points in the subsequence of the
specified character sequence, as delineated by beginIndex and
endIndex. Any unpaired surrogate is counted as one code point.
seq | the CharSequence to look through. |
---|---|
beginIndex | the inclusive index to begin counting at. |
endIndex | the exclusive index to stop counting at. |
NullPointerException | if seq is null . |
---|---|
IndexOutOfBoundsException | if beginIndex < 0 , beginIndex > endIndex or
if endIndex is greater than the length of seq . |
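The difference between the char length and the code point count can be seen in a short sketch. The class name CodePointCountDemo and the helper count are hypothetical names for this example.

```java
public class CodePointCountDemo {
    /** Counts the code points in the whole sequence. */
    public static int count(CharSequence seq) {
        return Character.codePointCount(seq, 0, seq.length());
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());  // 4 char units
        System.out.println(count(s));    // 3 code points

        // A lone high surrogate still counts as one code point.
        System.out.println(count("\uD83D"));  // 1
    }
}
```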
Counts the number of Unicode code points in the subsequence of the
specified char array, as delineated by offset and count.
Any unpaired surrogate is counted as one code point.
seq | the char array to look through |
---|---|
offset | the inclusive index to begin counting at. |
count | the number of char values to look through in
seq . |
NullPointerException | if seq is null . |
---|---|
IndexOutOfBoundsException | if offset < 0 , count < 0 or if
offset + count is greater than the length of
seq . |
Compares this object to the specified character object to determine their relative order.
c | the character object to compare this object to. |
---|
Returns 0 if the value of this character and the value of c are equal; a positive value if the value of this character is greater than the value of c; a negative value if the value of this character is less than the value of c.
Convenience method to determine the value of the specified character
c in the supplied radix. The value of radix must be
between MIN_RADIX and MAX_RADIX.
c | the character to determine the value of. |
---|---|
radix | the radix. |
Convenience method to determine the value of the character
codePoint in the supplied radix. The value of radix must
be between MIN_RADIX and MAX_RADIX.
codePoint | the character, including supplementary characters. |
---|---|
radix | the radix. |
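A typical use of digit is parsing a number one character at a time. The sketch below is illustrative only: DigitDemo and parse are hypothetical names, overflow is not handled, and the code relies on digit returning -1 for a character that is not a digit in the given radix (the behaviour specified for java.lang.Character in Java SE).

```java
public class DigitDemo {
    /** Parses a non-negative number in the given radix; does not guard against overflow. */
    public static int parse(String s, int radix) {
        int result = 0;
        for (int i = 0; i < s.length(); i++) {
            int d = Character.digit(s.charAt(i), radix);
            if (d < 0) {
                throw new NumberFormatException("Not a digit in radix " + radix + ": " + s.charAt(i));
            }
            result = result * radix + d;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16));  // 255
        System.out.println(parse("101", 2));  // 5
        System.out.println(Character.digit('9', 8));  // -1, since 9 is not an octal digit
    }
}
```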
Compares this object with the specified object and indicates if they are
equal. In order to be equal, object must be an instance of
Character and have the same char value as this object.
object | the object to compare this character with. |
---|
Returns true if the specified object is equal to this Character; false otherwise.
Returns the character which represents the specified digit in the
specified radix. The radix must be between MIN_RADIX and MAX_RADIX
inclusive; digit must not be negative and must be smaller than radix.
If any of these conditions does not hold, 0 (the null character) is returned.
digit | the integer value. |
---|---|
radix | the radix. |
Returns the character which represents the digit in the radix.
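forDigit is the inverse of digit, so it can be used to format a number in an arbitrary radix. The following sketch uses the hypothetical names ForDigitDemo and toRadixString and only handles non-negative values.

```java
public class ForDigitDemo {
    /** Formats a non-negative value in the given radix using Character.forDigit. */
    public static String toRadixString(int value, int radix) {
        if (value < 0) {
            throw new IllegalArgumentException("value must not be negative");
        }
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("radix out of range: " + radix);
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            // The remainder is always in [0, radix), so forDigit never fails here.
            sb.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadixString(255, 16));  // ff
        System.out.println(toRadixString(5, 2));     // 101
        // A digit that is not smaller than the radix yields the null character.
        System.out.println((int) Character.forDigit(16, 16));  // 0
    }
}
```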
Gets the Unicode directionality of the specified code point.
codePoint | the Unicode code point to get the directionality of. |
---|
Returns the Unicode directionality of codePoint.
Gets the Unicode directionality of the specified character.
c | the character to get the directionality of. |
---|
Returns the Unicode directionality of c.
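One way to use the directionality constants is to detect whether a string contains right-to-left text. This is a simplified sketch with hypothetical names (DirectionalityDemo, containsRightToLeft); it only checks the R and AL classes and is not a substitute for a full bidirectional algorithm.

```java
public class DirectionalityDemo {
    /** Reports whether any code point in the text has directionality R or AL. */
    public static boolean containsRightToLeft(CharSequence text) {
        for (int i = 0; i < text.length(); ) {
            int cp = Character.codePointAt(text, i);
            byte dir = Character.getDirectionality(cp);
            if (dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsRightToLeft("abc"));        // false
        System.out.println(containsRightToLeft("ab\u05D0"));   // true (Hebrew letter alef)
        System.out.println(containsRightToLeft("\u0627"));     // true (Arabic letter alef)
    }
}
```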
Gets the numeric value of the specified Unicode code point. For example, the code point 'Ⅻ' stands for the Roman numeral XII, which has the numeric value 12.
There are two points of divergence between this method and the Unicode specification. This method treats the letters a-z (in both upper and lower cases, and their full-width variants) as numbers from 10 to 35. The Unicode specification also supports the idea of code points with non-integer numeric values; this method does not (except to the extent of returning -2 for such code points).
codePoint | the code point |
---|
Returns a non-negative numeric integer value if a numeric value for codePoint exists, -1 if there is no numeric value for codePoint, -2 if the numeric value cannot be represented with an integer.
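The three kinds of result (a value, -1, and -2) can be told apart as in the sketch below. NumericValueDemo and describe are hypothetical names; the sample code points are written as escapes or hex literals to avoid source-encoding issues.

```java
public class NumericValueDemo {
    /** Turns the result of getNumericValue into a short description. */
    public static String describe(int codePoint) {
        int value = Character.getNumericValue(codePoint);
        if (value == -1) {
            return "no numeric value";
        }
        if (value == -2) {
            return "not an integer";
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(describe('7'));     // 7
        System.out.println(describe(0x216B));  // 12 (Roman numeral twelve)
        System.out.println(describe('a'));     // 10 (letters map to 10 through 35)
        System.out.println(describe(0x00BD));  // not an integer (vulgar fraction one half)
        System.out.println(describe('$'));     // no numeric value
    }
}
```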
Returns the numeric value of the specified Unicode character.
See getNumericValue(int).
c | the character |
---|
Returns a non-negative numeric integer value if a numeric value for c exists, -1 if there is no numeric value for c, -2 if the numeric value cannot be represented as an integer.
Gets the general Unicode category of the specified character.
c | the character to get the category of. |
---|
Returns the Unicode category of c.
Gets the general Unicode category of the specified code point.
codePoint | the Unicode code point to get the category of. |
---|
Returns the Unicode category of codePoint.
Returns an integer hash code for this object. By contract, any two
objects for which equals(Object) returns true must return
the same hash code value. This means that subclasses of Object
usually override both methods or neither method.
Note that hash values must not change over time unless information used in equals comparisons also changes.
See Writing a correct hashCode method if you intend to implement your own hashCode method.
Indicates whether the specified code point is defined in the Unicode specification.
codePoint | the code point to check. |
---|
Returns true if the general Unicode category of the code point is not UNASSIGNED; false otherwise.
Indicates whether the specified character is defined in the Unicode specification.
c | the character to check. |
---|
Returns true if the general Unicode category of the character is not UNASSIGNED; false otherwise.
Indicates whether the specified character is a digit.
c | the character to check. |
---|
Returns true if c is a digit; false otherwise.
Indicates whether the specified code point is a digit.
codePoint | the code point to check. |
---|
Returns true if codePoint is a digit; false otherwise.
Indicates whether ch is a high- (or leading-) surrogate code unit
that is used for representing supplementary characters in UTF-16
encoding.
ch | the character to test. |
---|
Returns true if ch is a high-surrogate code unit; false otherwise.
Indicates whether the specified character is an ISO control character.
c | the character to check. |
---|
Returns true if c is an ISO control character; false otherwise.
Indicates whether the specified code point is an ISO control character.
c | the code point to check. |
---|
Returns true if c is an ISO control character; false otherwise.
Indicates whether the specified character is ignorable in a Java or Unicode identifier.
c | the character to check. |
---|
Returns true if c is ignorable; false otherwise.
Indicates whether the specified code point is ignorable in a Java or Unicode identifier.
codePoint | the code point to check. |
---|
Returns true if codePoint is ignorable; false otherwise.
Indicates whether the specified code point is a valid part of a Java identifier other than the first character.
codePoint | the code point to check. |
---|
Returns true if codePoint is valid as part of a Java identifier; false otherwise.
Indicates whether the specified character is a valid part of a Java identifier other than the first character.
c | the character to check. |
---|
Returns true if c is valid as part of a Java identifier; false otherwise.
Indicates whether the specified character is a valid first character for a Java identifier.
c | the character to check. |
---|
Returns true if c is a valid first character of a Java identifier; false otherwise.
Indicates whether the specified code point is a valid first character for a Java identifier.
codePoint | the code point to check. |
---|
Returns true if codePoint is a valid start of a Java identifier; false otherwise.
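The start and part checks are usually combined to validate a whole identifier. The sketch below uses the hypothetical names IdentifierDemo and isValidIdentifier; it checks the characters only and does not reject reserved words such as class.

```java
public class IdentifierDemo {
    /** Checks that every code point is allowed at its position in a Java identifier. */
    public static boolean isValidIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidIdentifier("foo1"));  // true
        System.out.println(isValidIdentifier("_x"));    // true
        System.out.println(isValidIdentifier("1foo"));  // false, starts with a digit
        System.out.println(isValidIdentifier("a-b"));   // false, '-' is not an identifier part
    }
}
```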
This method is deprecated. Use isJavaIdentifierStart(char) instead.
Indicates whether the specified character is a Java letter.
c | the character to check. |
---|
Returns true if c is a Java letter; false otherwise.
This method is deprecated. Use isJavaIdentifierPart(char) instead.
Indicates whether the specified character is a Java letter or digit character.
c | the character to check. |
---|
Returns true if c is a Java letter or digit; false otherwise.
Indicates whether the specified character is a letter.
c | the character to check. |
---|
Returns true if c is a letter; false otherwise.
Indicates whether the specified code point is a letter.
codePoint | the code point to check. |
---|
Returns true if codePoint is a letter; false otherwise.
Indicates whether the specified character is a letter or a digit.
c | the character to check. |
---|
Returns true if c is a letter or a digit; false otherwise.
Indicates whether the specified code point is a letter or a digit.
codePoint | the code point to check. |
---|
Returns true if codePoint is a letter or a digit; false otherwise.
Indicates whether ch is a low- (or trailing-) surrogate code unit
that is used for representing supplementary characters in UTF-16
encoding.
ch | the character to test. |
---|
Returns true if ch is a low-surrogate code unit; false otherwise.
Indicates whether the specified code point is a lower case letter.
codePoint | the code point to check. |
---|
Returns true if codePoint is a lower case letter; false otherwise.
Indicates whether the specified character is a lower case letter.
c | the character to check. |
---|
Returns true if c is a lower case letter; false otherwise.
Indicates whether the specified character is mirrored.
c | the character to check. |
---|
Returns true if c is mirrored; false otherwise.
Indicates whether the specified code point is mirrored.
codePoint | the code point to check. |
---|
true
if codePoint
is mirrored, false
otherwise.
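Mirrored characters are those whose glyph is flipped horizontally in right-to-left text, such as brackets. A small sketch that filters a string down to its mirrored code points follows; `MirroredDemo` and `mirroredOnly` are hypothetical names.

```java
// Illustrative helper, not part of java.lang.Character.
class MirroredDemo {
    // Returns only the code points of s for which isMirrored is true.
    static String mirroredOnly(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isMirrored(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mirroredOnly("f(x)[a]"));   // ()[]
        System.out.println(Character.isMirrored('a')); // false
    }
}
```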
This method is deprecated.
Use isWhitespace(char)
Indicates whether the specified character is a Java space.
c | the character to check. |
---|
true
if c
is a Java space; false
otherwise.
Indicates whether the specified character is a Unicode space character. That is, if it is a member of one of the Unicode categories Space Separator, Line Separator, or Paragraph Separator.
c | the character to check. |
---|
true
if c
is a Unicode space character,
false
otherwise.
Indicates whether the specified code point is a Unicode space character. That is, if it is a member of one of the Unicode categories Space Separator, Line Separator, or Paragraph Separator.
codePoint | the code point to check. |
---|
true
if codePoint
is a Unicode space character,
false
otherwise.
Indicates whether codePoint
is within the supplementary code
point range.
codePoint | the code point to test. |
---|
true
if codePoint
is within the supplementary
code point range; false
otherwise.
Indicates whether the specified character pair is a valid surrogate pair.
high | the high surrogate unit to test. |
---|---|
low | the low surrogate unit to test. |
true
if high
is a high-surrogate code unit and
low
is a low-surrogate code unit; false
otherwise.
Indicates whether the specified code point is a titlecase character.
codePoint | the code point to check. |
---|
true
if codePoint
is a titlecase character,
false
otherwise.
Indicates whether the specified character is a titlecase character.
c | the character to check. |
---|
true
if c
is a titlecase character, false
otherwise.
Indicates whether the specified code point is valid as part of a Unicode identifier other than the first character.
codePoint | the code point to check. |
---|
true
if codePoint
is valid as part of a Unicode
identifier; false
otherwise.
Indicates whether the specified character is valid as part of a Unicode identifier other than the first character.
c | the character to check. |
---|
true
if c
is valid as part of a Unicode
identifier; false
otherwise.
Indicates whether the specified character is a valid initial character for a Unicode identifier.
c | the character to check. |
---|
true
if c
is a valid first character for a
Unicode identifier; false
otherwise.
Indicates whether the specified code point is a valid initial character for a Unicode identifier.
codePoint | the code point to check. |
---|
true
if codePoint
is a valid first character for
a Unicode identifier; false
otherwise.
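The start and part checks are typically combined to validate a whole identifier. The following is a sketch under that assumption; `UnicodeIdentifierDemo` and `isUnicodeIdentifier` are hypothetical names, and the empty string is treated as invalid by choice.

```java
// Illustrative helper, not part of java.lang.Character.
class UnicodeIdentifierDemo {
    // True if the first code point is a valid identifier start and every
    // following code point is a valid identifier part.
    static boolean isUnicodeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isUnicodeIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isUnicodeIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUnicodeIdentifier("abc1")); // true
        System.out.println(isUnicodeIdentifier("1abc")); // false: digit cannot start
        System.out.println(isUnicodeIdentifier("a-b"));  // false: '-' is not a part
    }
}
```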
Indicates whether the specified code point is an upper case letter.
codePoint | the code point to check. |
---|
true
if codePoint
is an upper case letter;
false
otherwise.
Indicates whether the specified character is an upper case letter.
c | the character to check. |
---|
true
if c
is an upper case letter; false
otherwise.
Indicates whether codePoint
is a valid Unicode code point.
codePoint | the code point to test. |
---|
true
if codePoint
is a valid Unicode code point;
false
otherwise.
Indicates whether the specified character is a whitespace character in Java.
c | the character to check. |
---|
true
if the supplied c
is a whitespace character
in Java; false
otherwise.
Indicates whether the specified code point is a whitespace character in Java.
codePoint | the code point to check. |
---|
true
if the supplied codePoint
is a whitespace character
in Java; false
otherwise.
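`isWhitespace` and `isSpaceChar` answer different questions: the former covers Java whitespace such as tab and line feed but excludes non-breaking spaces, while the latter follows the Unicode separator categories. A short comparison sketch follows; `SpaceKindsDemo` and `describe` are hypothetical names.

```java
// Illustrative helper, not part of java.lang.Character.
class SpaceKindsDemo {
    // Formats both classifications for a single char.
    static String describe(char c) {
        return String.format("U+%04X spaceChar=%b whitespace=%b",
                (int) c, Character.isSpaceChar(c), Character.isWhitespace(c));
    }

    public static void main(String[] args) {
        System.out.println(describe(' '));      // U+0020 spaceChar=true whitespace=true
        System.out.println(describe('\u00A0')); // U+00A0 spaceChar=true whitespace=false
        System.out.println(describe('\t'));     // U+0009 spaceChar=false whitespace=true
        System.out.println(describe('\n'));     // U+000A spaceChar=false whitespace=true
    }
}
```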
Determines the index in the specified character sequence that is offset
codePointOffset
code points from index
.
seq | the character sequence to find the index in. |
---|---|
index | the start index in seq . |
codePointOffset | the number of code points to look backwards or forwards; may be a negative or positive value. |
the index in seq that is codePointOffset code points away from index.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if index < 0, index is greater than the length of seq, or if there are not enough values in seq to skip codePointOffset code points forwards or backwards (if codePointOffset is negative) from index. |
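To make the index arithmetic concrete, the sketch below uses a four-`char` string that holds three code points. `OffsetInSequenceDemo` and `indexOfNthCodePoint` are hypothetical names chosen for this example.

```java
// Illustrative helper, not part of java.lang.Character.
class OffsetInSequenceDemo {
    // Returns the char index at which the n-th code point (0-based) begins.
    static int indexOfNthCodePoint(CharSequence seq, int n) {
        return Character.offsetByCodePoints(seq, 0, n);
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (two chars), 'b': length 4, three code points.
        String s = "a\uD83D\uDE00b";
        System.out.println(indexOfNthCodePoint(s, 2));              // 3
        // A negative offset walks backwards over whole code points.
        System.out.println(Character.offsetByCodePoints(s, 3, -1)); // 1
        try {
            indexOfNthCodePoint(s, 4); // only three code points exist
        } catch (IndexOutOfBoundsException e) {
            System.out.println("not enough code points");
        }
    }
}
```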
Determines the index in a subsequence of the specified character array
that is offset codePointOffset
code points from index
.
The subsequence is delineated by start
and count
.
seq | the character array to find the index in. |
---|---|
start | the inclusive index that marks the beginning of the subsequence. |
count | the number of char values to include within the subsequence. |
index | the start index in the subsequence of the char array. |
codePointOffset | the number of code points to look backwards or forwards; may be a negative or positive value. |
the index in seq that is codePointOffset code points away from index.
NullPointerException | if seq is null. |
---|---|
IndexOutOfBoundsException | if start < 0, count < 0, index < start, index > start + count, start + count is greater than the length of seq, or if there are not enough values in seq to skip codePointOffset code points forward or backward (if codePointOffset is negative) from index. |
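The array variant confines the walk to the window given by `start` and `count`. A sketch with a fixed five-element array follows; `OffsetInArrayDemo`, `DATA`, and `advance` are hypothetical names, and the window is hard-coded to indices 1 through 4 for brevity.

```java
// Illustrative helper, not part of java.lang.Character.
class OffsetInArrayDemo {
    // 'x' lies outside the window; the window holds 'a', U+1F600, 'b'.
    static final char[] DATA = {'x', 'a', '\uD83D', '\uDE00', 'b'};

    // Moves codePointOffset code points from index within DATA[1..4].
    static int advance(int index, int codePointOffset) {
        return Character.offsetByCodePoints(DATA, 1, 4, index, codePointOffset);
    }

    public static void main(String[] args) {
        System.out.println(advance(1, 2));  // 4: skips 'a' and the surrogate pair
        System.out.println(advance(5, -1)); // 4: index may equal start + count
        System.out.println(advance(4, -2)); // 1: back over the pair and 'a'
        try {
            advance(1, -1); // would step before start
        } catch (IndexOutOfBoundsException e) {
            System.out.println("cannot move before start");
        }
    }
}
```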
Reverses the order of the first and second byte in the specified character.
c | the character to reverse. |
---|
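One plausible use of byte reversal is converting UTF-16 code units that were read with the wrong endianness. The sketch below assumes that scenario; `ReverseBytesDemo` and `swapEndianness` are hypothetical names.

```java
// Illustrative helper, not part of java.lang.Character.
class ReverseBytesDemo {
    // Returns a copy of units with the two bytes of every char swapped.
    static char[] swapEndianness(char[] units) {
        char[] out = new char[units.length];
        for (int i = 0; i < units.length; i++) {
            out[i] = Character.reverseBytes(units[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // 0x1234 becomes 0x3412.
        System.out.println(Integer.toHexString(Character.reverseBytes((char) 0x1234))); // 3412
        // 0x4100 and 0x4200 are 'A' and 'B' with their bytes swapped.
        char[] swapped = {0x4100, 0x4200};
        System.out.println(new String(swapEndianness(swapped))); // AB
    }
}
```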
Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
codePoint | the Unicode code point to encode. |
---|
If codePoint is a supplementary code point, then the returned array contains two characters; otherwise it contains just one character.
IllegalArgumentException | if codePoint is not a valid code point. |
---|
Converts the specified Unicode code point into a UTF-16 encoded sequence
and copies the value(s) into the char array dst
, starting at
index dstIndex
.
codePoint | the Unicode code point to encode. |
---|---|
dst | the destination array to copy the encoded value into. |
dstIndex | the index in dst from where to start copying. |
the number of char value units copied into dst.
IllegalArgumentException | if codePoint is not a valid code point. |
---|---|
NullPointerException | if dst is null. |
IndexOutOfBoundsException | if dstIndex is negative, greater than or equal to dst.length, or equals dst.length - 1 when codePoint is a supplementary code point. |
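Both `toChars` overloads can be sketched together: the first allocates a new array, and the second writes into a caller-supplied buffer and reports how many `char` units it used. `ToCharsDemo` and `fromCodePoints` are hypothetical names; the buffer is sized for the worst case of two units per code point.

```java
// Illustrative helper, not part of java.lang.Character.
class ToCharsDemo {
    // Encodes the given code points as UTF-16 and returns them as a String.
    static String fromCodePoints(int... codePoints) {
        char[] buf = new char[codePoints.length * 2]; // worst case: all supplementary
        int len = 0;
        for (int cp : codePoints) {
            // The return value is the number of char units written (1 or 2).
            len += Character.toChars(cp, buf, len);
        }
        return new String(buf, 0, len);
    }

    public static void main(String[] args) {
        System.out.println(Character.toChars(0x41).length);    // 1
        System.out.println(Character.toChars(0x1F600).length); // 2
        System.out.println(fromCodePoints(0x41, 0x1F600).length()); // 3
        try {
            Character.toChars(0x110000); // beyond U+10FFFF
        } catch (IllegalArgumentException e) {
            System.out.println("invalid code point");
        }
    }
}
```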
Converts a surrogate pair into a Unicode code point. This method assumes
that the pair are valid surrogates. If the pair are not valid
surrogates, then the result is indeterminate. The
isSurrogatePair(char, char)
method should be used prior to this
method to validate the pair.
high | the high surrogate unit. |
---|---|
low | the low surrogate unit. |
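Because the result is indeterminate for invalid input, a caller might wrap the conversion in a validating helper. The following is a sketch of that pattern; `SurrogatePairDemo` and `decode` are hypothetical names, and throwing `IllegalArgumentException` is a design choice of this example rather than behaviour of `toCodePoint` itself.

```java
// Illustrative helper, not part of java.lang.Character.
class SurrogatePairDemo {
    // Validates the pair first, then combines it into one code point.
    static int decode(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        int cp = decode('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(cp));               // 1f600
        System.out.println(Character.isSupplementaryCodePoint(cp)); // true
        System.out.println(Character.isLowSurrogate('\uDE00'));     // true
    }
}
```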
Returns the lower case equivalent for the specified character if the character is an upper case letter. Otherwise, the specified character is returned unchanged.
c | the character |
---|
if c is an upper case character then its lower case counterpart, otherwise just c.
Returns the lower case equivalent for the specified code point if it is an upper case letter. Otherwise, the specified code point is returned unchanged.
codePoint | the code point to check. |
---|
if codePoint is an upper case character then its lower case counterpart, otherwise just codePoint.
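The code point overload matters for cased letters outside the BMP, which the `char` overload cannot represent. A sketch that lower-cases a string one code point at a time follows; `LowerCaseDemo` and `lower` are hypothetical names, and this per-character mapping is not a substitute for locale-aware `String.toLowerCase`.

```java
// Illustrative helper, not part of java.lang.Character.
class LowerCaseDemo {
    // Maps every code point of s through Character.toLowerCase(int).
    static String lower(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lower("AbC1")); // abc1
        // U+10400 (DESERET CAPITAL LETTER LONG I) maps to U+10428.
        System.out.println(Integer.toHexString(Character.toLowerCase(0x10400))); // 10428
    }
}
```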
Converts the specified character to its string representation.
value | the character to convert. |
---|
Returns a string containing a concise, human-readable description of this object. Subclasses are encouraged to override this method and provide an implementation that takes into account the object's type and data. The default implementation is equivalent to the following expression:
getClass().getName() + '@' + Integer.toHexString(hashCode())
See Writing a useful toString method if you intend to implement your own toString method.
Returns the title case equivalent for the specified character if it exists. Otherwise, the specified character is returned unchanged.
c | the character to convert. |
---|
the title case equivalent of c if it exists, otherwise c.
Returns the title case equivalent for the specified code point if it exists. Otherwise, the specified code point is returned unchanged.
codePoint | the code point to convert. |
---|
the title case equivalent of codePoint if it exists, otherwise codePoint.
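Title case differs from upper case for a handful of digraph characters, such as U+01C4/U+01C5/U+01C6. The sketch below capitalizes only the first code point of a string; `TitleCaseDemo` and `capitalizeFirst` are hypothetical names.

```java
// Illustrative helper, not part of java.lang.Character.
class TitleCaseDemo {
    // Converts the first code point of s to title case and keeps the rest.
    static String capitalizeFirst(String s) {
        if (s.isEmpty()) {
            return s;
        }
        int first = s.codePointAt(0);
        return new StringBuilder(s.length())
                .appendCodePoint(Character.toTitleCase(first))
                .append(s, Character.charCount(first), s.length())
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeFirst("hello")); // Hello
        // The lower case digraph U+01C6 has the title case form U+01C5,
        // which is distinct from its upper case form U+01C4.
        System.out.println(Character.toTitleCase('\u01C6') == '\u01C5'); // true
        System.out.println(Character.isTitleCase('\u01C5'));             // true
    }
}
```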
Returns the upper case equivalent for the specified character if the character is a lower case letter. Otherwise, the specified character is returned unchanged.
c | the character to convert. |
---|
if c is a lower case character then its upper case counterpart, otherwise just c.
Returns the upper case equivalent for the specified code point if the code point is a lower case letter. Otherwise, the specified code point is returned unchanged.
codePoint | the code point to convert. |
---|
if codePoint is a lower case character then its upper case counterpart, otherwise just codePoint.
Returns a Character
instance for the char
value passed.
If it is not necessary to get a new Character
instance, it is
recommended to use this method instead of the constructor, since it
maintains a cache of instances which may result in better performance.
c | the char value for which to get a Character instance. |
---|
a Character instance for c.
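The effect of the cache can be observed by comparing references, although code should rely on `equals` rather than identity. In the sketch below, `ValueOfDemo` and `sameInstance` are hypothetical names, and which values are cached is an implementation detail (Java SE documents the range U+0000 to U+007F).

```java
// Illustrative helper, not part of java.lang.Character.
class ValueOfDemo {
    // True if two valueOf calls for c hand back the same object.
    static boolean sameInstance(char c) {
        return Character.valueOf(c) == Character.valueOf(c);
    }

    public static void main(String[] args) {
        // Equality by value always holds.
        System.out.println(Character.valueOf('a').equals(Character.valueOf('a'))); // true
        // Identity depends on the cache; no output is promised for values
        // outside the cached range.
        System.out.println(sameInstance('a'));
        System.out.println(sameInstance('\u4E2D'));
    }
}
```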