ICU 64.2
C++ API: Unicode String. More...
#include <cstddef>
#include "unicode/utypes.h"
#include "unicode/char16ptr.h"
#include "unicode/rep.h"
#include "unicode/std_string.h"
#include "unicode/stringpiece.h"
#include "unicode/bytestream.h"
Data Structures | |
class | icu::UnicodeString |
UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer/StringBuilder classes. More... | |
Namespaces | |
icu | |
File coll.h. | |
Macros | |
#define | US_INV icu::UnicodeString::kInvariant |
Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string. More... | |
#define | UNICODE_STRING(cs, _length) icu::UnicodeString(TRUE, u ## cs, _length) |
Unicode String literals in C++. More... | |
#define | UNICODE_STRING_SIMPLE(cs) UNICODE_STRING(cs, -1) |
Unicode String literals in C++. More... | |
#define | UNISTR_FROM_CHAR_EXPLICIT |
This can be defined to be empty or "explicit". More... | |
#define | UNISTR_FROM_STRING_EXPLICIT |
This can be defined to be empty or "explicit". More... | |
#define | UNISTR_OBJECT_SIZE 64 |
Desired sizeof(UnicodeString) in bytes. More... | |
Typedefs | |
typedef int32_t | UStringCaseMapper(int32_t caseLocale, uint32_t options, icu::BreakIterator *iter, char16_t *dest, int32_t destCapacity, const char16_t *src, int32_t srcLength, icu::Edits *edits, UErrorCode &errorCode) |
Internal string case mapping function type. More... | |
Functions | |
int32_t | u_strlen (const UChar *s) |
Determine the length of an array of UChar. More... | |
U_COMMON_API UnicodeString | icu::operator+ (const UnicodeString &s1, const UnicodeString &s2) |
Create a new UnicodeString with the concatenation of two others. More... | |
C++ API: Unicode String.
Definition in file unistr.h.
#define UNICODE_STRING(cs, _length) icu::UnicodeString(TRUE, u ## cs, _length)
Unicode String literals in C++.
Note: these macros are not recommended for new code. Prior to the availability of C++11 and u"unicode string literals", these macros were provided for portability and efficiency when initializing UnicodeStrings from literals.
They work only for strings that contain "invariant characters", i.e., only Latin letters, digits, and some punctuation. See utypes.h for details.
The string parameter must be a C string literal. The length of the string, not including the terminating NUL, must be specified as a constant.
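The token pasting in the macro definition (u ## cs) can be illustrated without ICU. The following is a minimal sketch, not ICU code: SKETCH_U16, sketchFromLiteral, sketchViaMacro, and sketchViaLiteral are hypothetical names, and the helper copies into a std::u16string, which may differ from what the UnicodeString constructor used by the macro does with its buffer. It only shows that pasting the u prefix onto a C string literal yields the same code units as a C++11 u"..." literal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the token pasting in UNICODE_STRING:
// SKETCH_U16("abc") expands to u"abc", an array of char16_t.
#define SKETCH_U16(cs) u ## cs

// Hypothetical helper: copies `length` UTF-16 code units (NUL excluded).
inline std::u16string sketchFromLiteral(const char16_t *text, std::size_t length) {
    assert(text != nullptr);
    return std::u16string(text, length);
}

// The macro form and the plain C++11 literal produce the same code units.
inline std::u16string sketchViaMacro() {
    return sketchFromLiteral(SKETCH_U16("hello"), 5);
}

inline std::u16string sketchViaLiteral() {
    return std::u16string(u"hello");
}
```

As in the macro, the length passed alongside the literal (5 here) excludes the terminating NUL and is written as a constant.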
#define UNICODE_STRING_SIMPLE(cs) UNICODE_STRING(cs, -1)
Unicode String literals in C++.
Depending on platform properties, different UnicodeString constructors should be used to create a UnicodeString object from a string literal. The macros are defined for improved performance. They work only for strings that contain "invariant characters", i.e., only Latin letters, digits, and some punctuation. See utypes.h for details.
The string parameter must be a C string literal.
#define UNISTR_FROM_CHAR_EXPLICIT
#define UNISTR_FROM_STRING_EXPLICIT
This can be defined to be empty or "explicit".
If UNISTR_FROM_STRING_EXPLICIT is defined as explicit, then the UnicodeString(const char *) and UnicodeString(const char16_t *) constructors are marked as explicit, preventing their inadvertent use.
In particular, this helps prevent accidentally depending on ICU conversion code by passing a string literal into an API with a const UnicodeString & parameter.
Definition at line 166 of file unistr.h.
Referenced by icu::UnicodeString::UnicodeString().
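The effect of such a switch can be sketched with a small stand-in class. Everything named Sketch* or SKETCH_* below is hypothetical and is not part of ICU. The example assumes the macro is defined as explicit and uses type traits to show that direct construction remains possible while implicit conversion from a string literal is rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical switch mirroring the idea of UNISTR_FROM_STRING_EXPLICIT:
// define it as `explicit` (as here) or as empty to allow implicit conversion.
#define SKETCH_FROM_STRING_EXPLICIT explicit

// Hypothetical stand-in for a string class; this is not icu::UnicodeString.
class SketchString {
public:
    SKETCH_FROM_STRING_EXPLICIT SketchString(const char16_t *text) : text_(text) {}
    std::size_t length() const { return text_.size(); }
private:
    std::u16string text_;
};

// An API with a const reference parameter, like the ones the text mentions.
inline std::size_t sketchLength(const SketchString &s) { return s.length(); }

// Direct construction still works; implicit conversion does not.
static_assert(std::is_constructible<SketchString, const char16_t *>::value,
              "explicit construction is allowed");
static_assert(!std::is_convertible<const char16_t *, SketchString>::value,
              "implicit conversion is rejected");
```

With this setting, sketchLength(SketchString(u"abc")) compiles, whereas sketchLength(u"abc") would be a compile error, so the conversion has to be written out at the call site.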
#define UNISTR_OBJECT_SIZE 64
Desired sizeof(UnicodeString) in bytes.
It should be a multiple of sizeof(pointer) to avoid unusable space for padding. Ideally, the object size is also a multiple of 16 bytes, which is a common granularity for heap allocation.
Any space inside the object beyond sizeof(vtable pointer) + 2 is available for storing short strings inside the object. The bigger the object, the longer the string that can be stored inside it without additional heap allocation.
Depending on a platform's pointer size, pointer alignment requirements, and struct padding, the compiler will usually round up sizeof(UnicodeString) to 4 * sizeof(pointer) (or 3 * sizeof(pointer) for P128 data models), to hold the fields for heap-allocated strings. Such a minimum size also ensures that the object is easily large enough to hold at least 2 char16_ts, for one supplementary code point (U16_MAX_LENGTH).
sizeof(UnicodeString) >= 48 should work for all known platforms.
For example, on a 64-bit machine where sizeof(vtable pointer) is 8, sizeof(UnicodeString) = 64 would leave space for (64 - sizeof(vtable pointer) - 2) / U_SIZEOF_UCHAR = (64 - 8 - 2) / 2 = 27 char16_ts stored inside the object.
The minimum object size on a 64-bit machine would be 4 * sizeof(pointer) = 4 * 8 = 32 bytes, and the internal buffer would hold up to 11 char16_ts in that case.
Definition at line 204 of file unistr.h.
Referenced by icu::UnicodeString::UnicodeString().
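The capacity arithmetic above can be checked at compile time. This is a minimal sketch of the formula as stated in the text, not ICU code: kSketchSizeofUChar and sketchInlineCapacity are hypothetical names, and the pointer size is passed in explicitly so that the result does not depend on the build platform.

```cpp
#include <cstddef>

// Bytes per char16_t; the text refers to this value as U_SIZEOF_UCHAR.
constexpr std::size_t kSketchSizeofUChar = 2;

// Hypothetical helper: number of char16_ts that fit inside the object,
// following (objectSize - sizeof(vtable pointer) - 2) / U_SIZEOF_UCHAR.
constexpr std::size_t sketchInlineCapacity(std::size_t objectSize,
                                           std::size_t vtablePointerSize) {
    return (objectSize - vtablePointerSize - 2) / kSketchSizeofUChar;
}

// The two 64-bit cases worked out in the text.
static_assert(sketchInlineCapacity(64, 8) == 27, "64-byte object, 8-byte pointers");
static_assert(sketchInlineCapacity(32, 8) == 11, "32-byte minimum, 8-byte pointers");
```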
#define US_INV icu::UnicodeString::kInvariant
Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.
For details about invariant characters, see utypes.h. This constructor has no runtime dependency on conversion code and is therefore recommended over ones taking a charset name string (where the empty string "" indicates invariant-character conversion).
typedef int32_t UStringCaseMapper(int32_t caseLocale, uint32_t options, icu::BreakIterator *iter, char16_t *dest, int32_t destCapacity, const char16_t *src, int32_t srcLength, icu::Edits *edits, UErrorCode &errorCode)