Oracle® Database Globalization Support Guide
11g Release 2 (11.2)

Part Number E10729-07

B Unicode Character Code Assignments

This appendix offers an introduction to Unicode character assignments. It contains these topics: Unicode Code Ranges, UTF-16 Encoding, and UTF-8 Encoding.

Unicode Code Ranges

Table B-1 contains code ranges that have been allocated in Unicode for UTF-16 character codes.

Table B-1 Unicode Character Code Ranges for UTF-16 Character Codes

Types of Characters                         First 16 Bits   Second 16 Bits

ASCII                                       0000 - 007F     -

European (except ASCII), Arabic, Hebrew     0080 - 07FF     -

Indic, Thai, certain symbols (such as       0800 - 0FFF     -
the euro symbol), Chinese, Japanese,        1000 - CFFF     -
Korean                                      D000 - D7FF     -
                                            F900 - FFFF     -

Private Use Area #1                         E000 - EFFF     -
                                            F000 - F8FF     -

Supplementary characters: Additional        D800 - D8BF     DC00 - DFFF
Chinese, Japanese, and Korean characters;   D8C0 - DABF     DC00 - DFFF
historic characters; musical symbols;       DAC0 - DB7F     DC00 - DFFF
mathematical symbols

Private Use Area #2                         DB80 - DBBF     DC00 - DFFF
                                            DBC0 - DBFF     DC00 - DFFF
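
The ranges in Table B-1 can be checked mechanically. The following Python sketch is illustrative only: the function name `classify_utf16_unit` and the shortened category labels are assumptions made for this example and are not part of Oracle Database. It maps a single 16-bit code unit to the row of Table B-1 that contains it.

```python
def classify_utf16_unit(unit: int) -> str:
    """Return the Table B-1 category for one 16-bit UTF-16 code unit.

    The labels are shortened, hypothetical names for the table rows.
    """
    if not 0 <= unit <= 0xFFFF:
        raise ValueError("a UTF-16 code unit must fit in 16 bits")
    if unit <= 0x007F:
        return "ASCII"
    if unit <= 0x07FF:
        return "European (except ASCII), Arabic, Hebrew"
    if unit <= 0xD7FF or unit >= 0xF900:
        # 0800 - D7FF and F900 - FFFF
        return "Indic, Thai, symbols, Chinese, Japanese, Korean"
    if unit <= 0xDB7F:
        # D800 - DB7F
        return "first unit of a supplementary character"
    if unit <= 0xDBFF:
        # DB80 - DBFF
        return "first unit of a Private Use Area #2 character"
    if unit <= 0xDFFF:
        # DC00 - DFFF
        return "second unit of a two-unit character"
    # Only E000 - F8FF remains at this point.
    return "Private Use Area #1"


for unit in (0x0041, 0x20AC, 0xE000, 0xD834, 0xDD1E):
    print(f"{unit:04X}: {classify_utf16_unit(unit)}")
```

A unit in the range D800 - DBFF is meaningful only when it is followed by a unit in the range DC00 - DFFF, so a real decoder would examine the two units together rather than one at a time.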


Table B-2 contains code ranges that have been allocated in Unicode for UTF-8 character codes.

Table B-2 Unicode Character Code Ranges for UTF-8 Character Codes

Types of Characters                         First Byte   Second Byte  Third Byte   Fourth Byte

ASCII                                       00 - 7F      -            -            -

European (except ASCII), Arabic, Hebrew     C2 - DF      80 - BF      -            -

Indic, Thai, certain symbols (such as       E0           A0 - BF      80 - BF      -
the euro symbol), Chinese, Japanese,        E1 - EC      80 - BF      80 - BF      -
Korean                                      ED           80 - 9F      80 - BF      -
                                            EF           A4 - BF      80 - BF      -

Private Use Area #1                         EE           80 - BF      80 - BF      -
                                            EF           80 - A3      80 - BF      -

Supplementary characters: Additional        F0           90 - BF      80 - BF      80 - BF
Chinese, Japanese, and Korean characters;   F1 - F2      80 - BF      80 - BF      80 - BF
historic characters; musical symbols;       F3           80 - AF      80 - BF      80 - BF
mathematical symbols

Private Use Area #2                         F3           B0 - BF      80 - BF      80 - BF
                                            F4           80 - 8F      80 - BF      80 - BF


Note:

A hyphen (-) marks a position that does not apply to that type of character. Character codes are shown in hexadecimal representation.
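
The First Byte column of Table B-2 is enough to tell how long a UTF-8 sequence is. The following Python sketch shows that lookup. The function name `utf8_sequence_length` is hypothetical, and the cross-check relies on Python's built-in `utf-8` codec rather than on any Oracle component.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return the length in bytes of a UTF-8 sequence, judged by its first byte.

    The ranges are the First Byte values listed in Table B-2.
    """
    if 0x00 <= first_byte <= 0x7F:
        return 1   # ASCII
    if 0xC2 <= first_byte <= 0xDF:
        return 2   # European (except ASCII), Arabic, Hebrew
    if 0xE0 <= first_byte <= 0xEF:
        return 3   # Indic, Thai, symbols, CJK, Private Use Area #1
    if 0xF0 <= first_byte <= 0xF4:
        return 4   # supplementary characters, Private Use Area #2
    raise ValueError(f"0x{first_byte:02X} cannot start a UTF-8 sequence")


# Cross-check against Python's encoder for one sample character per length.
for ch in ("A", "\u00e9", "\u20ac", "\U0001d11e"):
    encoded = ch.encode("utf-8")
    assert utf8_sequence_length(encoded[0]) == len(encoded)
```

The sketch inspects only the first byte. A full validator would also check the second, third, and fourth bytes against the ranges in the matching row of the table.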

UTF-16 Encoding

As shown in Table B-1, UTF-16 represents some characters (additional Chinese, Japanese, and Korean characters and Private Use Area #2, among others) as two 16-bit units. These are supplementary characters. A supplementary character consists of two 16-bit values: the first (the high surrogate) falls in the range 0xD800 to 0xDBFF, and the second (the low surrogate) falls in the range 0xDC00 to 0xDFFF. Each range holds 1,024 values, so the pairs provide 1,024 × 1,024 = 1,048,576 additional character codes, which is how UTF-16 can represent more than one million characters. Without supplementary characters, only 65,536 character codes are available. The AL16UTF16 character set in Oracle Database supports supplementary characters.
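
The arithmetic behind the two 16-bit values can be sketched in a few lines of Python. The function names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical. The calculation itself is the standard UTF-16 rule: subtract 0x10000 from the code point, then place the upper 10 bits in the first value and the lower 10 bits in the second.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into two 16-bit values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # a 20-bit value
    high = 0xD800 + (offset >> 10)       # upper 10 bits: 0xD800..0xDBFF
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits: 0xDC00..0xDFFF
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine two 16-bit values into the supplementary code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1D11E MUSICAL SYMBOL G CLEF, one of the musical symbols in the supplementary range.
high, low = to_surrogate_pair(0x1D11E)
assert from_surrogate_pair(high, low) == 0x1D11E

# Cross-check against Python's own UTF-16 encoder (big-endian, no byte order mark).
assert "\U0001D11E".encode("utf-16-be") == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

The last supplementary code point, U+10FFFF, maps to the pair (0xDBFF, 0xDFFF), which is the upper end of both ranges quoted above.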

UTF-8 Encoding

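The UTF-8 character codes in Table B-2 show that the following conditions are true:

ASCII characters use 1 byte.

European (except ASCII), Arabic, and Hebrew characters use 2 bytes.

Indic, Thai, Chinese, Japanese, and Korean characters, as well as certain symbols such as the euro symbol, use 3 bytes.

Characters in Private Use Area #1 use 3 bytes.

Supplementary characters use 4 bytes.

Characters in Private Use Area #2 use 4 bytes.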

In Oracle Database, the AL32UTF8 character set supports 1-byte, 2-byte, 3-byte, and 4-byte values. The UTF8 character set supports 1-byte, 2-byte, and 3-byte values, but not 4-byte values.
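
To see which characters in a piece of text fall into the 4-byte category, a short check like the following may help. It is a minimal sketch that uses Python's built-in `utf-8` codec and the hypothetical helper name `four_byte_characters`. It does not connect to Oracle Database and says nothing about how a given database character set stores the data.

```python
def four_byte_characters(text: str) -> list[str]:
    """Return the characters in text that need 4 bytes in UTF-8.

    Code points above U+FFFF (supplementary characters and
    Private Use Area #2) are exactly the ones that take 4 bytes.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


sample = "A\u00e9\u20ac\U0001d11e"   # ASCII, Latin-1 letter, euro symbol, musical symbol

# Byte length of each character under standard UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in sample]
assert lengths == [1, 2, 3, 4]

# Only the musical symbol needs a 4-byte value.
assert four_byte_characters(sample) == ["\U0001d11e"]
```

Running such a check over sample data is one way to learn whether the text contains any 4-byte values before choosing between a character set that supports them and one that does not.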
