MATLAB Function Reference
native2unicode
Convert numeric bytes to Unicode characters
Syntax
unicodestr = native2unicode(bytes)
unicodestr = native2unicode(bytes, charset)
Description
unicodestr = native2unicode(bytes)
takes a vector of numeric values in the range [0,255], treats them as a stream of 8-bit bytes, and converts that stream to Unicode characters. The byte stream is assumed to be in the native character set of the machine. The return value unicodestr is a char vector that has the same general array shape as bytes.
The bytes input can be a vector of numeric or character data. You can use the functions fread, fgetl, and fscanf to generate input to this function.
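As a minimal sketch of the single-argument form, the snippet below converts a short vector of byte values. The values chosen are ASCII codes, on the assumption that the machine's native character set is ASCII-compatible; the variable names are illustrative only.

```matlab
% Five byte values that are the ASCII codes for 'Hello'.
bytes = uint8([72 101 108 108 111]);

% No charset is given, so the machine's native character set is assumed.
str = native2unicode(bytes);

class(str)   % 'char'
size(str)    % same 1-by-5 shape as bytes
```

On a machine whose native character set is not ASCII-compatible, the resulting characters may differ, which is the motivation for the two-argument form.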
unicodestr = native2unicode(bytes, charset)
performs the conversion on the assumption that the byte stream consists of characters from the character set charset.
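The two-argument form can be sketched with a hard-coded byte vector instead of a file. The four bytes below are assumed to be the Shift_JIS encoding of two kanji (0x93FA and 0x967B); they are an illustration chosen here, not data from this page.

```matlab
% Shift_JIS byte pairs 0x93 0xFA and 0x96 0x7B, written in decimal.
bytes = uint8([147 250 150 123]);

% Name the character set explicitly instead of relying on the native one.
str = native2unicode(bytes, 'Shift_JIS');

numel(str)    % 2: each pair of bytes decodes to one character
double(str)   % Unicode code points: 26085 26412
```

Because the character set is named explicitly, the result should not depend on the language configuration of the machine running the code.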
Examples
This example reads and displays some Japanese text. For the final command, disp(str), to display this text correctly, the contents of str must consist entirely of Unicode characters. The call to native2unicode converts the text read from the file to Unicode and returns it in str:
    fid = fopen('japanese.txt');
    b = fread(fid, '*char')';
    fclose(fid);
    str = native2unicode(b, 'Shift_JIS');
    disp(str);
The 'Shift_JIS' argument ensures that str contains the same string on any computer, regardless of how it is configured for language. Note that the computer must be configured to display Japanese (e.g., a Japanese Windows machine) for the output of disp(str) to be correct.
Common names for charset are 'US-ASCII' and 'Shift_JIS'. The charset string itself must be written using US-ASCII characters. Letter case does not matter. For the preferred list of names for charset, consult the Web site http://www.iana.org/assignments/character-sets.
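The statement that letter case does not matter can be checked with a short sketch such as the following. The byte values are an assumed Shift_JIS sample chosen for illustration.

```matlab
% Assumed Shift_JIS bytes for two kanji (0x93FA, 0x967B).
bytes = uint8([147 250 150 123]);

% Same character set name, written with different letter case.
a = native2unicode(bytes, 'Shift_JIS');
b = native2unicode(bytes, 'shift_jis');

isequal(a, b)   % expected to be true, given that case is ignored
```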
© 1994-2005 The MathWorks, Inc.