2ef49c56f0
2000-10-03 Benjamin Kosnik <bkoz@purist.soma.redhat.com> * docs/22_locale/howto.html: Add link to proto-documentation on locales. * docs/documentation.html: Rename links for clarity. * src/Makefile.am (headers): Remove unistd.h, wrap_unistd.h. Add fcntl.h, iolibio.h, libioP.h, pthread.h, iconv.h. * src/Makefile.in: Regenerate. From-SVN: r36708
237 lines
8.6 KiB
HTML
237 lines
8.6 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
|
|
<HTML>
|
|
<HEAD>
|
|
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
|
|
<META NAME="AUTHOR" CONTENT="pme@sources.redhat.com (Phil Edwards)">
|
|
<META NAME="KEYWORDS" CONTENT="HOWTO, libstdc++, egcs, g++, libg++, STL">
|
|
<META NAME="DESCRIPTION" CONTENT="HOWTO for the libstdc++ chapter 22.">
|
|
<META NAME="GENERATOR" CONTENT="vi and eight fingers">
|
|
<TITLE>libstdc++-v3 HOWTO: Chapter 22</TITLE>
|
|
<LINK REL="home" HREF="http://sources.redhat.com/libstdc++/docs/22_locale/">
|
|
<LINK REL=StyleSheet HREF="../lib3styles.css">
|
|
<!-- $Id: howto.html,v 1.5 2000/09/19 21:44:30 pme Exp $ -->
|
|
</HEAD>
|
|
<BODY>
|
|
|
|
<H1 CLASS="centered"><A NAME="top">Chapter 22: Localization</A></H1>
|
|
|
|
<P>Chapter 22 deals with the C++ localization facilities.
|
|
</P>
|
|
|
|
|
|
<!-- ####################################################### -->
|
|
<HR>
|
|
<H1>Contents</H1>
|
|
<UL>
|
|
<LI><A HREF="#1">Bjarne Stroustrup on Locales</A>
|
|
<LI><A HREF="#2">Nathan Myers on Locales</A>
|
|
<LI><A HREF="#3">class locale</A>
|
|
<LI><A HREF="#4">class codecvt</A>
|
|
<LI><A HREF="#5">class ctype</A>
|
|
<LI><A HREF="#6">Correct Transformations</A>
|
|
</UL>
|
|
|
|
<HR>
|
|
|
|
<!-- ####################################################### -->
|
|
|
|
<H2><A NAME="1">Stroustrup on Locales</A></H2>
|
|
<P>Dr. Bjarne Stroustrup has released a
|
|
<A HREF="http://www.research.att.com/~bs/3rd_loc0.html">pointer</A>
|
|
to Appendix D of his book,
|
|
<A HREF="http://www.research.att.com/~bs/3rd.html">The C++
|
|
Programming Language (3rd Edition)</A>. It is a detailed
|
|
description of locales and how to use them.
|
|
</P>
|
|
<P>He also writes:
|
|
<BLOCKQUOTE><EM>
|
|
Please note that I still consider this detailed description of
|
|
locales beyond the needs of most C++ programmers. It is written
|
|
with experienced programmers in mind and novices will do best to
|
|
avoid it.
|
|
</EM></BLOCKQUOTE>
|
|
</P>
|
|
<P>Return <A HREF="#top">to top of page</A> or
|
|
<A HREF="../faq/index.html">to the FAQ</A>.
|
|
</P>
|
|
|
|
<HR>
|
|
<H2><A NAME="2">Nathan Myers on Locales</A></H2>
|
|
<P> An article entitled "The Standard C++ Locale" was published in
|
|
Dr. Dobb's Journal and can be found
|
|
<A HREF="http://www.cantrip.org/locale.html">here</A>
|
|
</P>
|
|
<P>Return <A HREF="#top">to top of page</A> or
|
|
<A HREF="../faq/index.html">to the FAQ</A>.
|
|
</P>
|
|
|
|
<HR>
|
|
<H2><A NAME="5">class locale</A></H2>
|
|
<P> Notes made during the implementation of locales can be found
|
|
<A HREF="locale.html">here</A>.
|
|
</P>
|
|
<P>Return <A HREF="#top">to top of page</A> or
|
|
<A HREF="../faq/index.html">to the FAQ</A>.
|
|
</P>
|
|
|
|
<HR>
|
|
<H2><A NAME="4">class codecvt</A></H2>
|
|
<P> Notes made during the implementation of codecvt can be found
|
|
<A HREF="codecvt.html">here</A>.
|
|
</P>
|
|
|
|
<P> The following is the abstract from the implementation notes:
|
|
<BLOCKQUOTE>
|
|
The standard class codecvt attempts to address conversions
|
|
between different character encoding schemes. In particular, the
|
|
standard attempts to detail conversions between the
|
|
implementation-defined wide characters (hereafter referred to as
|
|
wchar_t) and the standard type char that is so beloved in classic
|
|
"C" (which can now be referred to as narrow characters.)
|
|
This document attempts to describe how the GNU libstdc++-v3
|
|
implementation deals with the conversion between wide and narrow
|
|
characters, and also presents a framework for dealing with the huge
|
|
number of other encodings that iconv can convert, including Unicode
|
|
and UTF8. Design issues and requirements are addressed, and examples
|
|
of correct usage for both the required specializations for wide and
|
|
narrow characters and the implementation-provided extended
|
|
functionality are given.
|
|
</BLOCKQUOTE>
|
|
|
|
<P>Return <A HREF="#top">to top of page</A> or
|
|
<A HREF="../faq/index.html">to the FAQ</A>.
|
|
</P>
|
|
|
|
<HR>
|
|
<H2><A NAME="5">class ctype</A></H2>
|
|
<P> Notes made during the implementation of ctype can be found
|
|
<A HREF="ctype.html">here</A>.
|
|
</P>
|
|
<P>Return <A HREF="#top">to top of page</A> or
|
|
<A HREF="../faq/index.html">to the FAQ</A>.
|
|
</P>
|
|
|
|
<HR>
|
|
<H2><A NAME="6">Correct Transformations</A></H2>
|
|
<!-- Jumping directly here from chapter 21. -->
|
|
<P>A very common question on newsgroups and mailing lists is, "How
|
|
do I do <foo> to a character string?" where <foo> is
|
|
a task such as changing all the letters to uppercase, to lowercase,
|
|
testing for digits, etc. A skilled and conscientious programmer
|
|
will follow the question with another, "And how do I make the
|
|
code portable?"
|
|
</P>
|
|
<P>(Poor innocent programmer, you have no idea the depths of trouble
|
|
you are getting yourself into. 'Twould be best for your sanity if
|
|
you dropped the whole idea and took up basket weaving instead. No?
|
|
Fine, you asked for it...)
|
|
</P>
|
|
<P>The task of changing the case of a letter or classifying a character
|
|
as numeric, graphical, etc, all depends on the cultural context of the
|
|
program at runtime. So, first you must take the portability question
|
|
into account. Once you have localized the program to a particular
|
|
natural language, only then can you perform the specific task.
|
|
Unfortunately, specializing a function for a human language is not
|
|
as simple as declaring
|
|
<TT> extern "Danish" int tolower (int); </TT>.
|
|
</P>
|
|
<P>The C++ code to do all this proceeds in the same way. First, a locale
|
|
is created. Then member functions of that locale are called to
|
|
perform minor tasks. Continuing the example from Chapter 21, we wish
|
|
to use the following convenience functions:
|
|
<PRE>
|
|
namespace std {
|
|
template <class charT>
|
|
charT
|
|
toupper (charT c, const locale& loc) const;
|
|
template <class charT>
|
|
charT
|
|
tolower (charT c, const locale& loc) const;
|
|
}</PRE>
|
|
This function extracts the appropriate "facet" from the
|
|
locale <EM>loc</EM> and calls the appropriate member function of that
|
|
facet, passing <EM>c</EM> as its argument. The resulting character
|
|
is returned.
|
|
</P>
|
|
<P>For the C/POSIX locale, the results are the same as calling the
|
|
classic C <TT>toupper/tolower</TT> function that was used in previous
|
|
examples. For other locales, the code should Do The Right Thing.
|
|
</P>
|
|
<P>Of course, these functions take a second argument, and the
|
|
transformation algorithm's operator argument can only take a single
|
|
parameter. So we write simple wrapper structs to handle that.
|
|
</P>
|
|
<P>The next-to-final version of the code started in Chapter 21 looks like:
|
|
<PRE>
|
|
#include <iterator> // for back_inserter
|
|
#include <locale>
|
|
#include <string>
|
|
#include <algorithm>
|
|
#include <cctype> // old <ctype.h>
|
|
|
|
struct Toupper
|
|
{
|
|
Toupper (std::locale const& l) : loc(l) {;}
|
|
char operator() (char c) { return std::toupper(c,loc); }
|
|
private:
|
|
std::locale const& loc;
|
|
};
|
|
|
|
struct Tolower
|
|
{
|
|
Tolower (std::locale const& l) : loc(l) {;}
|
|
char operator() (char c) { return std::tolower(c,loc); }
|
|
private:
|
|
std::locale const& loc;
|
|
};
|
|
|
|
int main ()
|
|
{
|
|
std::string s ("Some Kind Of Initial Input Goes Here");
|
|
Toupper up ( std::locale("C") );
|
|
Tolower down ( std::locale("C") );
|
|
|
|
// Change everything into upper case
|
|
std::transform (s.begin(), s.end(), s.begin(),
|
|
up
|
|
);
|
|
|
|
// Change everything into lower case
|
|
std::transform (s.begin(), s.end(), s.begin(),
|
|
down
|
|
);
|
|
|
|
// Change everything back into upper case, but store the
|
|
// result in a different string
|
|
std::string capital_s;
|
|
std::transform (s.begin(), s.end(), std::back_inserter(capital_s),
|
|
up
|
|
);
|
|
}</PRE>
|
|
</P>
|
|
<P>The final version of the code uses <TT>bind2nd</TT> to eliminate
|
|
the wrapper structs, but the resulting code is tricky. I have not
|
|
shown it here because no compilers currently available to me will
|
|
handle it.
|
|
</P>
|
|
<P>Return <A HREF="#top">to top of page</A> or
|
|
<A HREF="../faq/index.html">to the FAQ</A>.
|
|
</P>
|
|
|
|
|
|
|
|
|
|
<!-- ####################################################### -->
|
|
|
|
<HR>
|
|
<P CLASS="fineprint"><EM>
|
|
Comments and suggestions are welcome, and may be sent to
|
|
<A HREF="mailto:pme@sources.redhat.com">Phil Edwards</A> or
|
|
<A HREF="mailto:gdr@egcs.cygnus.com">Gabriel Dos Reis</A>.
|
|
<BR> $Id: howto.html,v 1.5 2000/09/19 21:44:30 pme Exp $
|
|
</EM></P>
|
|
|
|
|
|
</BODY>
|
|
</HTML>
|