native2unicode (MATLAB Functions)

Convert numeric bytes to Unicode characters

Syntax

unicodestr = native2unicode(bytes)
unicodestr = native2unicode(bytes, charset)

Description

unicodestr = native2unicode(bytes) takes a vector of numeric values in the range [0,255], treats them as a stream of 8-bit bytes, and converts that stream to Unicode characters. The byte stream is assumed to be in the native character set of the machine. The return value unicodestr is a char vector that has the same general array shape as bytes.

The bytes input can be a vector of numeric or character data. You can use the functions fread, fgetl, and fscanf to generate input to this function.
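As a minimal sketch of feeding fread output to this function, the following writes three known bytes to a scratch file, reads them back, and converts them. The file name from tempname and the variable names are hypothetical, and the expected text assumes that the machine's native character set is ASCII-compatible, which this page does not guarantee.

```matlab
% Write three known bytes to a temporary file so the sketch is self-contained.
fname = [tempname '.txt'];          % hypothetical scratch file
fid = fopen(fname, 'w');
fwrite(fid, [72 105 33], 'uint8');  % the ASCII codes for 'Hi!'
fclose(fid);

% Read the raw bytes back as a row vector.
fid = fopen(fname, 'r');
bytes = fread(fid, '*uint8')';      % 1-by-3 uint8 row vector
fclose(fid);
delete(fname);

% Convert using the machine's native character set (no charset argument).
% Assumption: the native character set maps bytes 0-127 as US-ASCII does.
str = native2unicode(bytes);
disp(str);
```

Because all three bytes are below 128, the result should not depend on which ASCII-compatible native character set is in use.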

unicodestr = native2unicode(bytes, charset) performs the conversion on the assumption that the byte stream is encoded in the character set charset.
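The effect of the charset argument can be sketched with two short byte sequences. The byte values and variable names below are illustrative choices, and 'UTF-8' is an IANA-registered name that this page does not itself mention. In both cases several bytes decode to a single character.

```matlab
% Three UTF-8 bytes that together encode the euro sign, U+20AC.
euroBytes = uint8([226 130 172]);
euro = native2unicode(euroBytes, 'UTF-8');

% Two Shift_JIS bytes (hex 82 A0) that together encode hiragana "a", U+3042.
aBytes = uint8([130 160]);
a = native2unicode(aBytes, 'Shift_JIS');

% Inspect the resulting code points instead of relying on the display font.
euroCode = double(euro);
aCode = double(a);
```

Checking the code points with double avoids depending on whether the machine can render the characters.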

Examples

This example reads and displays some Japanese text. For the final command, disp(str), to display this text correctly, the contents of str must consist entirely of Unicode characters. The call to native2unicode converts text read from the file to Unicode and returns it in str:

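fid = fopen('japanese.txt');           % open the text file for reading
b = fread(fid, '*char')';              % read the file contents into a char row vector
fclose(fid);
str = native2unicode(b, 'Shift_JIS');  % interpret the contents as Shift_JIS and convert to Unicode
disp(str);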

The 'Shift_JIS' argument ensures that str contains the same string on any computer, regardless of how it is configured for language. Note that the computer must be configured to display Japanese (e.g., a Japanese Windows machine) for the output of disp(str) to be correct.

Common names for charset include 'US-ASCII' and 'Shift_JIS'. The charset string itself must be written in US-ASCII characters. Letter case does not matter. For the preferred list of names for charset, consult the Web site http://www.iana.org/assignments/character-sets.
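A short sketch of the letter-case rule: the same bytes are decoded with two spellings of the same charset name and the results compared. The byte values and variable names are illustrative.

```matlab
% Four Shift_JIS bytes (hex 82 A0 82 A2) encoding two hiragana characters.
bytes = uint8([130 160 130 162]);

s1 = native2unicode(bytes, 'Shift_JIS');   % mixed-case spelling
s2 = native2unicode(bytes, 'shift_jis');   % lowercase spelling

% Both spellings name the same character set, so the results should match.
same = isequal(s1, s2);
```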

See Also

unicode2native
