Character Set News

//////////////////////////////////////////////////////////////////////////////

The culture of computer technology started out primarily speaking the
English language, but, as computers have become more a part of everyday
life, computer technology has been forced to pay attention to the diverse
languages that humans around the globe speak and write. Most of these
languages need an alphabet bigger than what ASCII provides. Therefore,
programmers have learned* the painstaking techniques of supporting diverse
character sets in their software. Thus we have Unicode, ISO-8859-1, UTF-8,
etc.

Other variations in computer character sets occur because vendors have
wished to support the drawing of lines and simple graphic images using
character-cell glyphs. With GUI environments now deployed everywhere, such
uses of different character sets are less common than they once were, but
the issue still sometimes arises.

This file contains a varying collection of notes and folklore on how to
set up and use different character sets.

..............................................................................
* Although some programmers, such as the inventors of PHP, ignored the
  issue as long as possible.

//////////////////////////////////////////////////////////////////////////////

A useful book for such matters is

   "Creating WorldWide Software:
    Solaris International Developer's Guide" (2nd ed.)
   by Bill Tuthill and David Smallberg
   Sun Microsystems Press, 1997, ISBN 0-13-494493-3, 382 pages, $65 US.
   http://vig.prenhall.com/catalog/academic/product?ISBN=0134944933

The book treats Unix, CDE, Motif, and X11, but the material applies to
Linux also. Topics include:

   * establishing locale environments
   * encoding character sets
   * displaying localized text
   * messaging for program translation
   * handling language input
   * localizing software after internationalization
   * gettext() vs. catgets()
   * discussion of EUC, Unicode, ISO-8859-X
   * diagrams of hand gestures that could be obscene in some cultures
   * how the mysterious abbreviations "I18N" and "L10N" were formed :-)

   Sun Microsystems Press/Prentice-Hall
   Professional, Technical, and Reference
   1 Lake Street
   Upper Saddle River, NJ 07458 USA

   Web: http://www.prenhall.com/
   US Orders: 1 800/282-0693
   Fax: +1 201/236-7141
   US Bulk: 1 800/382-3419
   UK Fax: +44 1279 414130
   AU Fax: +61 02 9453 0117
   SG Fax: +65 378 0370

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.dcom.telecom
Path: cs.utk.edu!emory!europa.eng.gtefsd.com!howland.reston.ans.net
      !spool.mu.edu!telecom-request
Date: 29 Oct 1993 12:20 -0600
Message-ID:
X-Telecom-Digest: Volume 13, Issue 725, Message 1 of 8
From: Rob Slade
Subject: Book Review: "The Unicode Standard"

BKUNICOD.RVW 980921

Addison-Wesley Publishing Co.
P.O. Box 520
26 Prince Andrew Place
Don Mills, Ontario M3C 2T8
416-447-5101 fax: 416-443-0948
or
1 Jacob Way
Reading, MA 01867-9984
800-527-5210 617-944-3700
or
5851 Guion Road
Indianapolis, IN 46254
800-447-2226
or
Unicode, Inc.
1965 Charleston Road
Mountain View, CA 94043
(415) 961-4189
Fax: (415) 966-1637
"The Unicode Standard", U$32.95/C$42.95

In the dim and distant past, the late (and generally unlamented) SUZY
Information System was born in Vancouver. Rather an oddball as far as
online services went, one "feature" was that the programmer had tried to
allow for the use of all of the IBM graphics characters. This led to an
entirely new field of "smiley" or "emoticon" (emotional icon) endeavours.
Instead of the usual sideways happy face of the colon, hyphen and right
parenthesis; ":-)"; we were able to use the "Ctrl-A" alternative of the
IBM PC character set. Having a decimal value of one, this character is an
upright happy face.
This allowed other expansions, such as Ctrl-A and the right square bracket, which looks like a face and a telephone handset, and was used (usually in the "chat" modes) for "I am on the phone." "How nice," I hear you mutter between clenched teeth. "Can we now get on with the review?" Patience, stout nerds. This *is* the review. As SUZY users, particularly those who had been introduced to computer communications on the system, moved on to other services or local bulletin boards, they were usually quite shocked to find that their favourite symbols no longer worked. The little diamond (Ctrl-C) would kill a message on a VAX. Fidonet users might find that the cute tagline they had formed from graphics characters completely disappeared when they sent the message through an Internet gateway. ASCII (the American Standard Code for Information Interchange) is widely, and mistakenly, believed to define two hundred and fifty-six characters. It doesn't. Furthermore, of the hundred and twenty-eight characters it does define, many are "control" rather than printable characters. (The "card suit" symbols on the IBM PC graphics set are defined as "end of text", "end of transmission", "enquiry" and "acknowledgement" under the real ASCII standard.) In addition, many believe ASCII to be a universal standard; also not true. An octet with the decimal value thirty-five, for example, is the number sign (sometimes called an "octothorpe") in the United States, but a pound sign (the British currency) in Britain. As with most fields of computer endeavor, the nice thing about standards is that there are so many to choose from. Many vary only slightly--but they vary. The point is that there are a number of symbols which we commonly know, but which cannot be consistently displayed on terminals or printers. Certain terminals will have certain "international" character sets, but not all are identical. 
Accents and other phonetic modifiers may be difficult to handle: entire character sets are given over strictly to accented characters. (In Canada we are acutely aware of the problems, with "French" keyboards used at many sites. On one, I was having difficulty finding some necessary punctuation marks for network addressing, and asked a Francophone programmer for help. "Who knows," he growled, "I never use the ____ things!") Unicode seeks to address this problem. Including not only the variations on the Latin alphabet, Unicode incorporates Greek, Cyrillic, Hebrew and other alphabets. It also includes punctuation, diacriticals, mathematical and scientific symbols and miscellaneous graphics. Asian ideographs are also assigned codes. This is no longer suitable, of course, for a seven-bit code, and Unicode is based on a sixteen-bit address space. The book gives some background and plans (chapter one), general principles and rules for conformance (chapter two). To comment on these in any meaningful way would be to rewrite these chapters. This is technical material, though not the same technology that computer types are used to. Some background study in linguistics would be a good idea, although it is not strictly necessary to understand and use the Unicode standard. There are, however, a wealth of symbols, punctuation marks and typesetting codes which Unicode gives standardized access to. On the other hand, any application which used the standard in a significant way would likely require a linguistics background in any case. The bulk of the books (two volumes) is, of course, taken up with the actual code charts. (Volume two, in fact, is almost completely concerned with Han ideographs. In spite of the recent widespread use of the English alphabet, this is still the standard written language of Chinese, Japanese and Korean: CJK in Unicode terminology.) The charts are augmented with verbal definitions of the symbols, and with cross references to similar forms. 
The Unicode standard is recent. In comparative terms its current [1993]
usage is negligible. However, it is the de facto standard for broadly
based international character sets. With the recent rejection of the
proposed ISO thirty-two bit standard, and the recasting of that standard
to follow Unicode's lead, Unicode is a significant factor in the
development of any international applications.

copyright Robert M. Slade, 1993 BKUNICOD.RVW 980921

(Postscriptum - Unicode Inc. maintains an FTP site at unicode.org
(192.195.185.2). Some of the mapping tables, and the Han cross reference
lists are available. Some tables are also available on IBM PC or Mac
compatible floppy disks.)

http://www.unicode.org/

Permission granted to distribute only with unedited copies of TELECOM
Digest and associated newsgroups/mailing lists.

DECUS Canada Communications, Desktop, Education and Security group
newsletters
Editor and/or reviewer ROBERTS@decus.ca, RSlade@sfu.ca,
Rob Slade at 1:153/733
DECUS Symposium '94, Vancouver, BC, Mar 1-3, 1994, contact: rulag@decus.ca

..............................................................................
..............................................................................

An older introductory book on this subject is
"Coded Character Sets: History and Development" by C. E. MacKenzie.
Reading: Addison-Wesley, 1980.

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
References: <8aoi4p$uv1$1@nnrp1.deja.com>
Message-ID: <8aooiv$2i5$1@newsmaster.cc.columbia.edu>
Organization: Columbia University
Date: 15 Mar 2000 19:33:51 GMT
From: Jeffrey Altman
Subject: Re: change from ASCII to ANSI character set in DOS window

In article <8aoi4p$uv1$1@nnrp1.deja.com>, wrote:
: Hi,
:
: I am running NT 4.0 SP3. My DOS window currently is displaying
: the ASCII character set. However I want it to display the ANSI
: character set. How do I do this?

The Console window is Unicode based.
The font that is displayed is Unicode if you are using a TrueType font
such as LucidaConsole or Code Page based (CP437, CP850, ...) if you are
using raster fonts. The console application has a choice of writing to
the screen using the active Code Page or Unicode. NT provides the proper
translations.

CP1252 is the Windows variation of ISO-Latin1 that you refer to as ANSI.
To use this code page in your application, use SetConsoleCP() and
SetConsoleOutputCP().

--
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
The Kermit Project * Columbia University
612 West 115th St #716 * New York, NY * 10025
http://www.kermit-project.org/k95.html *

..............................................................................

Message-ID:
References:
NNTP-Posting-Host: mail.pharmapartners.nl
Newsgroups: comp.mail.pine
Date: 6 Oct 2000 07:57:19 GMT
From: Villy Kruse
Subject: Re: Pine and French characters

On Thu, 5 Oct 2000 13:29:02 -0400, Gopi Sundaram wrote:
>On Thu, 5 Oct 2000, Samuel W. Heywood wrote:
>
>> If the character set used in Windows is not backward-compatible
>> with DOS, then Windows does not adhere to the standard.
>
>I don't know what standards you are talking about, but I'm glad that
>Windows finally used the ISO standard, whereas DOS didn't.

Well, actually Windows tries to "improve" on iso-8859-1 and calls that
windows-1252. The difference is that some values in the range 0x80 to
0x9f have been assigned to characters which are missing in iso-8859-1.
For example the euro sign is 0x80 in win1252, but doesn't exist in
iso-8859-1. However it will be 0xA4 in iso-8859-15 aka latin-9.

Check the alphabet soup at http://www.czyborra.com/ and see how standard
the various standards really are.

Villy

..............................................................................

Newsgroups: comp.mail.pine
Message-ID:
References:
Date: Mon, 9 Oct 2000 12:54:19 +0200
Organization: Knights of the Round Tuit
From: "Alan J. Flavell"
Subject: Re: Pine and French characters

On Sat, 7 Oct 2000, Samuel W. Heywood wrote:

> Thanks a lot for the URL. Now that I've read about QUOTED PRINTABLE I
> understand that it probably would be best to load a code page for
> handling this ISO-8859-1 character set.

Excuse me but you're not quite with us yet.

You would certainly be advised to load a code page that covers the
Latin-1 repertoire; but the recommendation would be to load the cp850
code page, which covers this repertoire but it's _not_ the iso-8859-1
character coding itself. PINE knows how to mediate between the two, as
we've already covered in this thread.

There is a relatively obscure code page, cp819, which represents the
iso-8859-1 character coding. However, if you load it, you are going to
find quite a number of conventional DOS applications displaying bizarre
characters in their menus etc, instead of the DOS "box drawing"
characters which they expected.

You'll find some brief (and old) notes of mine here
http://ppewww.ph.gla.ac.uk/~flavell/iso8859/iso8859-pointers.html#cp819
but I don't recommend that. Unless you have some special requirement
that we haven't discussed here, I recommend that you use cp850.

cheers

Alan

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
NNTP-Posting-Host: polaris.nada.kth.se
NNTP-Posting-Date: Tue, 22 Mar 2005 21:00:45 +0000 (UTC)
References: <1ba43d95.0503220854.7fabcd11@posting.google.com>
Message-ID:
Organization: Dept of Numerical Analysis and Computer Science, KTH
Date: Tue, 22 Mar 2005 22:00:35 +0100
From: Mårten Svantesson
Subject: Re: UNIX - Locale - conversion:char to UNICODE - glitch:code page is different !

eelagain@yahoo.com (Neel) writes:
>
> Hi,
>
> As subject says, I want to convert string from char to unicode but
> with different code page !
>
> I wanted to know as we have mbstowcs in windows, do we have any call
> which will convert char to UNICODE in UNIX/Solaris/Linux ?
>
> Additionally does this take care of different code pages too ? For
> instance if the string which needs to be converted is from different
> code page than current code page !

No problem. Though in Solaris I've never seen the term "code page". I
would think that you mean "code set". Anyway, the functions you are
looking for are

   iconv_open
   iconv
and
   iconv_close

The man pages iconv(3C) (contains a decent example) and iconv_unicode(5)
should get you started. (That is, you execute "man -s 3c iconv" and
"man -s 5 iconv_unicode".)

The functions are standardised in UNIX98 and should work in other unices
as well.

--
- Mårten

mail: msv@kth.se *** ICQ: 4356928 *** mobile: +46 (0)707390385

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
Message-ID:
References: <163a514a.0303291014.43db1538@posting.google.com>
            <3E85EAFE.AFC1579E@alcyone.com>
Date: Sat, 29 Mar 2003 11:42:04 -0800
From: Yongtao You
Subject: Re: iconv from Extended ASCII to UTF-8

"Erik Max Francis" wrote in message news:3E85EAFE.AFC1579E@alcyone.com...
> Yongtao wrote:
>
> > I have a program that calls iconv() to do conversion from 8859-1 to
> > UTF-8. Everything works fine when the inputs are standard ASCII chars
> > (0-127). However, it failed with an errno of 88 (EILSEQ) when there
> > are Extended ASCII chars (>128) present. What's interesting is, if I
> > use the /usr/bin/iconv program to do the exact same conversion (with
> > exactly the same Extended ASCII chars), it works.
>
>
> You are converting from Latin-1 to UTF-8, but you say the data is
> actually "extended ASCII." The problem is that "extended ASCII" just
> means "ASCII with unspecified 8-bit characters included," i.e., it means
> "ASCII and some other unknown stuff." That's not Latin-1, so a Latin-1
> conversion utility is almost certain to have problems with arbitrary
> "extended ASCII" data.
>
> The key to doing the proper conversion is to find out precisely what the
> data is.
It's not Latin-1, and it's not ASCII, so what is it? Once you > find out what it really is, you'll be able to convert it properly. > Conversion algorithms can only do the right thing when they're given > valid data; you're not giving it valid data. > > > The question is, how can I make the iconv() call do the same thing the > > /usr/bin/iconv program does? Before this, I thought they were doing > > exactly the same thing. > > Presumably the standalone program is running in a more permissive mode, > where invalid conversions are suppressed rather than ignored. Look for > a part of the low-level API that allows you to do this. > > -- > Erik Max Francis / max@alcyone.com / http://www.alcyone.com/max/ > __ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE > / \ Sit loosely in the saddle of life. > \__/ Robert Louis Stevenson > Discord / http://www.alcyone.com/pyos/discord/ > Convert dates from Gregorian to Discordian. Erik, Thanks for your reply. The two "Extended ASCII chars" I was talking about are 0xA7 and 0xDA. According to this page: http://www.utoronto.ca/webdocs/HTMLdocs/NewHTML/iso_table.html both are listed as valid ISO8859-1 chars. Should I expect them to be converted correctly? BTW, I am using Solaris 8. Thanks. Yongtao .............................................................................. Newsgroups: comp.unix.solaris Date: 29 Mar 2003 21:38:28 -0800 Organization: Twin Sun Inc, El Segundo, CA, USA Message-ID: <7wn0jdwh0r.fsf@sic.twinsun.com> References: <163a514a.0303291014.43db1538@posting.google.com> <3E85EAFE.AFC1579E@alcyone.com> From: Paul Eggert Subject: Re: iconv from Extended ASCII to UTF-8 "Yongtao You" writes: > The two "Extended ASCII chars" I was talking about are 0xA7 and 0xDA. > According to this page: > > http://www.utoronto.ca/webdocs/HTMLdocs/NewHTML/iso_table.html > > both are listed as valid ISO8859-1 chars. Should I expect them to be > converted correctly? Sure, if you specify 8859-1 rather than ASCII. 
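................................................
ARCHIVER'S NOTE
The point made just above can be illustrated with a minimal sketch. It is
written in Python rather than with the C iconv() calls discussed in the
thread, so it is an analogy and not a reconstruction of Yongtao's program;
the helper names latin1_to_utf8 and decodes_as_ascii are made up for this
note. The two bytes from the thread, 0xA7 and 0xDA, convert cleanly when
the source is declared as ISO-8859-1 and are rejected when it is declared
as ASCII.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Every byte value is a valid ISO-8859-1 character, so this decode
    # cannot fail; the UTF-8 encode of the result cannot fail either.
    return data.decode("iso-8859-1").encode("utf-8")


def decodes_as_ascii(data: bytes) -> bool:
    """Report whether the bytes are acceptable as 7-bit ASCII."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        # Any byte >= 0x80 is an illegal sequence for an ASCII source.
        return False


sample = bytes([0xA7, 0xDA])      # SECTION SIGN, CAPITAL U WITH ACUTE

print(latin1_to_utf8(sample))     # b'\xc2\xa7\xc3\x9a'
print(decodes_as_ascii(sample))   # False
```

Each of the two Latin-1 bytes becomes a two-byte UTF-8 sequence, which is
why a converter told that the source is ASCII has nothing sensible to do
with them.
................................................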
> BTW, I am using Solaris 8. Then I suggest that you install Sun patch 113261, if you're messing with this stuff. It's freely available from http://sunsolve.sun.com/ The current patch revision is 113261-02. ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.protocols.tcp-ip Organization: Sun Microsystems Inc. - BDC Message-ID: References: <%bEka.12$H12.409@paloalto-snr1.gtei.net> Date: 11 Apr 2003 08:24:37 -0400 From: James Carlson Subject: right to left [was Re: Mystic questions about TCP/IP] "Glen Herrmannsfeldt" writes: > > Are Hebrew numbers written MSB to the left or right? When I was at Data General, one of the projects I worked on was a semitic-mode (Arabic and Hebrew) terminal. Right-to-left is really quite special. Numbers and foreign (i.e., English) text are written left-to-right, but native text is written right-to-left. This means that when you're typing on such a terminal, the cursor starts at the right and moves left as you type. When you start typing a number, though, the cursor stops moving and the text shifts off to the left. When you stop typing the number or foreign text, the cursor jumps left over the text you've typed to start going right-to-left again. In addition to that, there's usually a big "mode switch" that allows the terminal to be used in right-to-left or left-to-right modes and can be switched on the fly. And in addition to that, Arabic (at least) has left- and right- connected forms for each of the 40 basic characters, and the connectedness of each one depends on what character is to the left and right of that one. (I.e., most characters have four forms: not connected, connected left, connected right, and connected both.) To say that it's hard to implement correctly (imagine what 'insert character' and 'delete character' do) is putting it mildly. (This is all from 12+ year old memory now ... so some of it might be slightly off. Corrections welcome, of course.) 
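................................................
ARCHIVER'S NOTE
Two of the properties described above are recorded in the Unicode
character database, and a short sketch can look them up. This uses
Python's standard unicodedata module and is only an illustration, not the
terminal firmware described in the post; the function names bidi_classes
and contextual_forms are invented for this note. The first function
reports the directional class of each character (Hebrew and Arabic letters
versus digits and Latin letters). The second scans the Arabic Presentation
Forms-B block for the isolated, final, initial and medial glyphs of one
base letter, which correspond to the four connection states mentioned in
the post.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]


def contextual_forms(base):
    """Map form name -> presentation-form character for one Arabic letter.

    Scans Arabic Presentation Forms-B (U+FE70..U+FEFF) for characters
    whose compatibility decomposition is a form tag plus the single base
    code point, e.g. "<initial> 0628".
    """
    wanted = f"{ord(base):04X}"
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[1] == wanted:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


# Hebrew alef, Arabic beh, a digit, a Latin letter
print(bidi_classes("\u05d0\u0628" + "1a"))

# The four shapes of ARABIC LETTER BEH (U+0628)
for name, ch in contextual_forms("\u0628").items():
    print(name, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Choosing which of the four shapes to display, and reordering mixed
right-to-left and left-to-right runs, is the hard part the post describes;
the sketch only shows the data such logic would start from.
................................................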
--
James Carlson, Solaris Networking
Sun Microsystems / 1 Network Drive         71.234W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.497N   Fax +1 781 442 1677

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
References: <14ad014.0504190527.332fa6da@posting.google.com>
Message-ID: <116ac9hbkra6nc5@corp.supernews.com>
Date: Tue, 19 Apr 2005 16:26:25 -0000
From: Thomas Dickey
Subject: Re: Putty input characters

Bjoern Wolfgardt wrote:
> Hi,

> I have a problem with Putty. I have a test tool on our host that
> displays special characters (umlaute, 'ä' ae, 'ü' ue...).
> They are displayed correctly. But if I press the 'ä' key, the character
> is not displayed. The host uses my input as a control key or something
> else.

> So my question is:
> How do I get input and output to work with german keyboard (and
> umlaute)?

See PuTTY's configuration (window/translations). Your session is
probably assuming that input is UTF-8 rather than ISO-8859-1.

(this should be in PuTTY's faq).

--
Thomas E. Dickey
http://invisible-island.net/
ftp://invisible-island.net/

..............................................................................

Newsgroups: comp.terminals
NNTP-Posting-Host: hb-server-02.buhlmann.de [217.7.105.122]
NNTP-Posting-Date: Wed, 20 Apr 2005 07:23:28 +0000 (UTC)
References: <14ad014.0504190527.332fa6da@posting.google.com>
            <116ac9hbkra6nc5@corp.supernews.com>
Message-ID: <14ad014.0504192323.1fd18816@posting.google.com>
Date: 20 Apr 2005 00:23:27 -0700
From: Bjoern Wolfgardt
Subject: Re: Putty input characters

Thomas Dickey wrote in message news:<116ac9hbkra6nc5@corp.supernews.com>...
>
> see PuTTY's configuration (window/translations). Your session is
> probably assuming that input is UTF-8 rather than ISO-8859-1.
>
> (this should be in PuTTY's faq).

Thank you,

It is not in the FAQ (or I didn't find it).

So it is not in Putty? It is a host configuration?
cu Bjoern .............................................................................. Newsgroups: comp.terminals NNTP-Posting-Host: rapun.sel.cam.ac.uk References: <14ad014.0504190527.332fa6da@posting.google.com> <116ac9hbkra6nc5@corp.supernews.com> Message-ID: <837jixzqsw.fsf@chiark.greenend.org.uk> Organization: University of Cambridge, England Date: 20 Apr 2005 11:27:27 +0100 From: Owen Dunn Subject: Re: Putty input characters Thomas Dickey writes: > > see PuTTY's configuration (window/translations). Your session is > probably assuming that input is UTF-8 rather than ISO-8859-1. > > (this should be in PuTTY's faq). Shockingly, we reserve our FAQ for questions which really are frequently asked :-). (S) ////////////////////////////////////////////////////////////////////////////// \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ Newsgroups: comp.std.internat,comp.protocols.tcp-ip Path: utkcs2!emory!samsung!cs.utexas.edu!sun-barr!decwrl!mcnc!uvaarpa!murdoch Message-ID: <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU> References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp> Sender: usenet@murdoch.acc.Virginia.EDU Organization: University of Virginia Lines: 60 Date: 10 Apr 1991 17:27:56 GMT From: randall@Virginia.EDU (Randall Atkinson) Subject: Re: universality of Latin-1 John Gilmore originally wrote: % % And my windows all use ISO Latin 1. If Torbj|rn would send the % umlauted letter in that standardized character set, it would look right % in both the States and in Sweden. In article <1110@sranha.sra.co.jp>, Erik M. van der Poel responded: > > Have you ever tried to send yourself a message in Latin-1? Did it > work? And even if *you* have a reasonable version of sendmail (one > that doesn't strip the 8th bit), what makes you so certain that > Torbj|rn's message and anyone else's won't pass through a site that > *does* strip the 8th bit? It does work for a fair and ever increasing subset of the Internet. 
BITNET doesn't do very well with it. Clearly we need to move towards
8-bit and 16-bit and 32-bit transparent mail-transport mechanisms.
Fortunately there are a number of possible transport mechanisms out
there to choose from, some of which are already 8-bit transparent.

> Also, what's so "standardized" about ISO Latin-1? What makes it more
> standard than, say, Latin-2?

ISO 8859/1 is NOT any "more standard" than ISO 8859/2, however sites in
the US are in fact migrating towards ISO 8859/1 from US ASCII and most
sites in the US are NOT migrating towards ISO 8859/2 (though they might
support it on the side as vendors begin to). The languages that are
most commonly used in the US are in ISO 8859/1 and the languages
supported by ISO 8859/2 are less commonly used (again in the US as a
whole).

Note that ISO Latin-1 is ISO 8859/1 which is the 8-bit character set
used for Western European languages. ISO Latin-2 is ISO 8859/2 which is
the 8-bit character set for Eastern European languages.

Clearly we need to add additional information to the header of mail
messages to indicate which character set to use. I'm not sure of the
current state of the Internet protocols (RFC 822 et al.) with respect to
this. If there isn't the equivalent of a "Character-set:" header yet,
serious consideration should be given to adding one with clearly defined
values for at least existing ANSI and ISO character sets.

[ARCHIVER'S NOTE: the Multipurpose Internet Mail Extensions (MIME)
protocol defines character-set-selection headers for SMTP e-mail. See
the Internet standards RFC1521, RFC1523, and RFC1425.]

Character sets that should have a defined string to use with such a
header field include at least:

   ASCII
   ISO 8859/1
   ...
   ISO 8859/N (where N is the last defined set)
   ISO 10646 (once it gets completed)

The Internet is the dominant mail transport network at present, partly
because so many other networks gateway with it.
Getting the Internet to convert to supporting such needs would be a big step in the right direction. Perhaps someone on the IETF can comment on their current activities in this area ?? Ran Atkinson randall@Virginia.EDU .............................................................................. Newsgroups: comp.std.internat,comp.protocols.tcp-ip Path: utkcs2!emory!swrinde!cs.utexas.edu!sun-barr!newstop!sun!amdcad!dgcad !dg-rtp!chutney!eliot Message-ID: <1991Apr12.124741.11555@dg-rtp.dg.com> References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp> <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU> Organization: Data General Corporation, Research Triangle Park, NC Date: 12 Apr 1991 12:47:41 GMT From: eliot@chutney.rtp.dg.com (Topher Eliot) Subject: Re: universality of Latin-1 In article <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>, randall@Virginia.EDU (Randall Atkinson) writes: |> |> In article <1110@sranha.sra.co.jp>, |> Erik M. van der Poel responded: |> >Have you ever tried to send yourself a message in Latin-1? Did it |> >work? And even if *you* have a reasonable version of sendmail (one |> >that doesn't strip the 8th bit), what makes you so certain that |> >Torbj|rn's message and anyone else's won't pass through a site that |> >*does* strip the 8th bit? |> It does work for a fair and ever increasing subset of the Internet. |> BITNET doesn't do very well with it. Clearly we need to move towards |> 8-bit and 16-bit and 32-bit transparent mail transport mechanisms. I expected to see someone else post a more authoritative answer, but since none has been forthcoming, I will venture. The folks who work on such things have been considering the 8-bit, different-codeset issues, as part of a much larger picture of including such things as graphics and other binary information in mail. Since those are harder problems, it means that they won't have solutions all that quickly. 
There is a mailing list on this subject; if you really need it I can
probably dig out a lead on how to get onto that mailing list.

|> Fortunately there are a number of possible transport mechanisms out
|> there to choose from, some of which are already 8-bit transparent.

Ack! "Fortunately"? There is an ancient curse: "may you live in
interesting times". I think its modern equivalent is "may you have many
standards to choose from".

--
Topher Eliot
Data General DG/UX Internationalization
(919) 248-6371
62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com
{backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.

//////////////////////////////////////////////////////////////////////////////

Newsgroups: alt.folklore.computers,bit.listserv.ibm-main
References: <9c9cmi$1e0d$2@news.ums.edu> <9c9vv1$2kqd$1@news.ums.edu>
            <3AE90D9C.14B35A71@trailing-edge.com> <9ckhon$2spv$1@news.ums.edu>
NNTP-Posting-Host: user-33qtp48.dialup.mindspring.com [199.174.228.136]
Organization: Wheeler&Wheeler
Message-ID:
Reply-To: Anne & Lynn Wheeler
Date: Mon, 14 May 2001 17:40:51 GMT
From: Anne & Lynn Wheeler
Subject: Re: Pre ARPAnet email?

Anne & Lynn Wheeler writes:
> new STD1 (2800) is out today with new format for some sections
> ... note verbiage for STD4, STD10 and a couple others.

also showed up today were a number of "old" rfcs recently converted to
machine readable from hardcopy

rfc3, rfc5, rfc6, rfc21, rfc23, rfc24, rfc25, rfc27, rfc28, rfc29,
rfc30, rfc344, rfc567, rfc593

RFC6 ... discussion about BB&N providing character code conversion.

This isn't an easy problem (in many cases). While undergraduate in '68 I
had put TTY/ASCII support into CP/67 ... which was incorporated and
distributed as part of the standard release. There were some codes that
it was very difficult to provide symmetric conversion for ... at least
in one case, I tried to map characters in ASCII to valid EBCDIC because
I needed some character in ASCII.
On the 2741, "at"-sign and "cent"-sign were on the same key and CP/67
had a convention that used (lowercase) "at"-sign (in line editing) for
character delete and "cent"-sign for line delete. The TTY keyboard
didn't have cent-sign ... so I mapped (been a number of years) "left"
bracket.

Then in late '68 because of various deficiencies in the mainframe 2702
terminal controller, four of us started a project to build the first
mainframe PCM control unit using Interdata3s. Had to build our own
channel attach card that attached the Interdata3 to the mainframe I/O
channel. An emulated line-scanner was built in the Interdata3 that was
targeted at supporting both dynamic line-speed recognition as well as
dynamic terminal-type recognition (as part of the original TTY support
in CP/67, I had expanded the existing dynamic terminal type recognition
to TTY ... however 2702 had a deficiency that while the line-scanner
could be changed for each line ... the hardware oscillator setting the
line speed was hard wired).

random refs:
http://www.garlic.com/~lynn/subtopic.html#subtopic

Network Working Note                               Steve Crocker, UCLA
RFC-6                                              10 April 1969

                    CONVERSATION WITH BOB KAHN

I talked with Bob Kahn at BB&N yesterday. We talked about code
conversion in the IMP's, IMP-HOST communication, and HOST software.

BB&N is prepared to convert 6, 7, 8, or 9 bit character codes into 8-bit
ASCII for transmission and convert again upon assembly at the
destination IMP. BB&N plans a one for one conversion scheme with tables
unique to the HOST. I suggested that places with 6-bit codes may also
want case shifting. Bob said this may result in overflow if too many
case shifts are necessary. I suggested that this is rare and we could
probably live with an overflow indication instead of a guarantee.

With respect to HOST-IMP communication, we now have a five bit link
field and a bit to indicate conversion.
Also possible is a 2-bit conversion indicator, one for converting before
sending and one for converting after. This would allow another handle
for checking or controlling the system.

--
Anne & Lynn Wheeler | lynn@garlic.com - http://www.garlic.com/~lynn/

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.misc
Path: utkcs2!emory!sol.ctr.columbia.edu!spool.mu.edu!agate!sunkist.berkeley.edu
Message-ID: <1991May29.000449.19048@agate.berkeley.edu>
Date: 29 May 1991 00:04:49 GMT
References: <10599@castle.ed.ac.uk>
Reply-To: raymond@math.berkeley.edu (Raymond Chen)
In-Reply-To: eanv20@castle.ed.ac.uk (John Woods)
From: raymond@math.berkeley.edu (Raymond Chen)
Subject: Re: Name that character! (definitive list)

Why does everyone feel compelled to post their favorite pronunciations?

In article <10599@castle.ed.ac.uk>, eanv20@castle (John Woods) writes:
>I wonder if there is a definitive list...

Indeed there is. It used to be part of the comp.unix.questions
Frequently Asked Questions file, but it has since moved into the
`Jargon File'. Many thanks to Maarten Litmaath for maintaining the
USENET ASCII Pronunciation Guide for many years. (Though the list below
does seem to be missing some of the cleverer names in Maarten's list.
Like `Donald Duck' for `&'.)

[American Standard Code for Information Interchange] /as'kee/ n.
Common slang names for ASCII characters are collected here. See
individual entries for , , , , , , , , , , , and . This list derives
from revision 2.2 of the USENET ASCII pronunciation guide. Single
characters are listed in ASCII order, character pairs are sorted in by
first member. For each character, "official" names appear first, then
others in order of popularity (more or less).

   !    exclamation point, exclamation, bang, factorial, excl, ball-bat,
        pling, smash, shriek, cuss, wow, hey, wham
   "    double quote, quote, dirk, literal mark, rabbit ears
   #    number sign, sharp, crunch, mesh, hex, hash, flash, grid,
        pig-pen, tictactoe, scratchmark, octothorpe, thud
   $    dollar sign, currency symbol, buck, cash, string (from BASIC),
        escape (from ), ding, big-money, cache
   %    percent sign, percent, mod, double-oh-seven
   &    ampersand, amper, and, address (from C), andpersand
   '    apostrophe, single quote, quote, prime, tick, irk, pop, spark
   ()   open/close parenthesis, left/right parenthesis, paren/thesis,
        lparen/rparen, parenthisey, unparenthisey, open/close round
        bracket, ears, so/already, wax/wane
   *    asterisk, star, splat, wildcard, gear, dingle, mult
   +    plus sign, plus, add, cross, intersection
   ,    comma, tail
   -    hyphen, dash, minus sign, worm
   .    period, dot, decimal point, radix point, point, full stop, spot
   /    virgule, slash, stroke, slant, diagonal, solidus, over, slat
   :    colon
   ;    semicolon, semi
   <>   angle brackets, brokets, left/right angle, less/greater than,
        read from/write to, from/into, from/toward, in/out,
        comesfrom/gozinta (all from UNIX), funnel, crunch/zap,
        suck/blow
   =    equal sign, equals, quadrathorp, gets, half-mesh
   ?    question mark, query, whatmark, what, wildchar, ques, huh, hook
   @    at sign, at, each, vortex, whorl, whirlpool, cyclone, snail,
        ape, cat
   V    vee, book
   []   square brackets, left/right bracket, bracket/unbracket, bra/ket,
        square/unsquare, U turns
   \    reversed virgule, backslash, bash, backslant, backwhack,
        backslat, escape (from UNIX), slosh.
   ^    circumflex, caret, uparrow, hat, chevron, sharkfin, to ("to the
        power of"), fang
   _    underscore, underline, underbar, under, score, backarrow
   `    grave accent, grave, backquote, left quote, open quote,
        backprime, unapostrophe, backspark, birk, blugle, back tick,
        push
   {}   open/close brace, left/right brace, brace/unbrace, curly
        bracket, curly/uncurly, leftit/rytit, embrace/bracelet
   |    vertical bar, bar, or, or-bar, v-bar, pipe, gozinta, thru,
        pipesinta (last four from UNIX)
   ~    tilde, squiggle, approx, wiggle, twiddle, swung dash, enyay

Some other common usages cause odd overlaps. The ``$'', ``#'', and
``&'' chars, for example, are all pronounced `hex' in different
communities because various assemblers use them as a prefix tag for
hexadecimal constants (in particular, $ in the 6502 world and & on the
Sinclair and some other Z80 machines).

................................................
ARCHIVER'S NOTE
The jest about Donald Duck comes from the name used for this Disney
character in Denmark: "Anders And".
................................................

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.std.internat
Path: utkcs2!emory!att!bu.edu!wang!ice
Message-ID:
Date: 14 Jun 1991 22:02:07 GMT
References: <5565@mrmarx.UUCP>
Organization: Addictive Technologies and Various Magick
From: ice@wang.com (Fredrik Nyman)
Subject: Re: HELP requested on internationalization

sgh@mrmarx.msc.com (Satyen Harve) writes:
>
>I have just been given the responsibility of coming up with a
>plan to internationalize our product. As a first step, I have
>to identify all the issues that are involved and determine
>their impact on our product. I would very much appreciate
>hearing from someone who has gone through or is going through
>this process.
>I'd particularly like to get any tips or information on what
>all is involved and where to go to read more about it. We are
>hoping to address both Europe and Asian markets.
I'd like to suggest that you get:

"Digital Guide to Developing International Software" from Digital Press.
Order # EY-F577E-DP
ISBN # 1-55558-063-7

The book is geared towards the DEC platforms and the various libraries
available to VMS, Ultrix and DECwindows programmers. Even if you couldn't
care less about these platforms, the book is very valuable. Among other
things, it describes common character sets and has quite extensive
guidelines for dealing with internationalization which are valid no
matter what platform you're using.

DEC can be reached at 1-800-DIGITAL if you want to order this manual.
Outside the US, in New Hampshire, Alaska and Puerto Rico: 1-603-884-6660

--
Fredrik Nyman [Surgically Enhanced Cyberdweeb] DoD #0328
Global Adaptation Center, Wang, M/S 019-490, NeXT:
One Industrial Ave., Lowell MA 01851, USA BITNET:

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.os.vms
Path: utkcs2!emory!swrinde!cs.utexas.edu!sun-barr!newstop!west!texsun!smunews!txsil!danmc
Message-ID: <475@txsil.lonestar.org>
Date: 15 Jun 1991 23:00:32 GMT
References: <199113.1053.9712@canrem.uucp>
Distribution: comp.os.vms
Organization: Summer Institute of Linguistics, Dallas
From: danmc@txsil.lonestar.org (Dan McDonald)
Subject: Re: vt3xx soft fonts??

In article <199113.1053.9712@canrem.uucp> "jonathan harley" writes:
>
>Do you know of any available packages that provide VT3xx (or better)
>downloadable soft fonts to emulate the IBM PCs graphics character set?

As for ones that emulate the IBM PC's, no, but it would probably only take
a couple of hours to make it - there are only 128 characters to set up.

>
>If so, where might I obtain the soft fonts, how much $ etc.
>

I wrote a program (in DCL - my favorite programming language) that would
take bitmaps in a form like:

A 65
1    X
2   X X
3  X   X
4 X     X
5 XXXXXXX
6 X     X
7 X     X
8

and would convert them to the down-line loadable format.
I use it mainly when I need to design another International Phonetic Alphabet softfont for someone writing a thesis around here. If you would like code and an example of how to use it, send me e-mail and I will be happy to dig it up and send it to you. ****************************************************************************** Dan McDonald * UUCP ...utafll!txsil!dalsil!mcdonald Summer Institute of Linguistics * Internet mcdonald@dallas.sil.org Dallas Computer Services * -OR- danmc@txsil.lonestar.org 7500 W Camp Wisdom Rd * SILnet DAN.MCDONALD@A1@DALLAS Dallas, TX 75236 * POTSnet (214)709-3389 USA * FAXnet (214)709-3387 ////////////////////////////////////////////////////////////////////////////// April 2003 SIL maintains a "Fonts in Cyberspace" resource page. http://www.sil.org/computing/fonts/ ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.fonts Path: cs.utk.edu!ornl!fnnews.fnal.gov!mp.cs.niu.edu!news.ecn.bgu.edu!wupost !howland.reston.ans.net!usc!elroy.jpl.nasa.gov!ames!pacbell.com!pacbell !boo!seer!ariel Summary: Hungarian alphabet is Latin alphabet Message-ID: <1993Apr22.153120.2440@seer.gentoo.com> Date: Thu, 22 Apr 1993 15:31:20 GMT References: <1993Apr21.150237.1930@wheaton.wheaton.edu> Organization: Brad Lanam, Walnut Creek, CA From: ariel@seer.gentoo.com (Cathy Hampton) Subject: Re: Hungarian Keyboard Layout The Hungarian language, or Magyar, uses the Latin alphabet. If no one here responds by tomorrow with the keyboard layout, I have it at home in one of language books, I think. (I lived in Vienna for quite a while and learned a little Hungarian.) 
Catherine Hampton ================================================================ Compuserve: 71601,3130 GEnie: ARIEL GEnie: AMNESTY Internet: ariel@seer.gentoo.com Internet/IGC: cah@igc.apc.org ================================================================ ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.fonts Path: cs.utk.edu!ornl!fnnews.fnal.gov!lll-winken.llnl.gov!uwm.edu!wupost !howland.reston.ans.net!ira.uka.de!Germany.EU.net!news.netmbx.de !mailgzrz.TU-Berlin.DE!fub!spoolbag.in-berlin.de!rainbow.in-berlin.de !rainbow.in-berlin.de!not-for-mail Message-ID: <1ra7df$pg0@rainbow.in-berlin.de> References: <1993Apr22.115504.17537@news.columbia.edu> NNTP-Posting-Host: rainbow.in-berlin.de Date: 24 Apr 1993 04:07:43 +0200 From: rj@rainbow.in-berlin.de (Robert Joop) Subject: Re: Latin 1 and Latin 3? pcj1@cunixf.cc.columbia.edu (Pierre Jelenc) writes: >I am looking for the assignments of characters to bytes in the Latin 1 >and Latin 3 character sets. In particular, I am concerned with the >discrepancies between the tables found in DOS and windows manuals and the >actual Latin 1 character set, and with the differences between Latin 1 and >Latin 3. from rfc1345 (Character Mnemonics & Character Sets): [...] &charset ISO_8859-1:1987 &rem source: ECMA registry &alias iso-ir-100 &g1esc x2d41 &g2esc x2e41 &g3esc x2f41 &alias ISO_8859-1 &alias ISO-8859-1 &alias latin1 &alias l1 &alias IBM819 &alias CP819 &code 0 NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US SP ! " Nb DO % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? At A B C D E F G H I J K L M N O P Q R S T U V W X Y Z <( // )> '> _ '! a b c d e f g h i j k l m n o p q r s t u v w x y z (! !! !) '? DT PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3 DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC NS !I Ct Pd Cu Ye BB SE ': Co -a << NO -- Rg '- DG +- 2S 3S '' My PI .M ', 1S -o >> 14 12 34 ?I A! A' A> A? A: AA AE C, E! 
E' E> E: I! I' I> I: D- N? O! O' O> O? O: *X O/ U! U' U> U: Y' TH ss a! a' a> a? a: aa ae c, e! e' e> e: i! i' i> i: d- n? o! o' o> o? o: -: o/ u! u' u> u: y' th y: [...] &charset ISO_8859-3:1988 &rem source: ECMA registry &alias iso-ir-109 &g1esc x2d43 &g2esc x2e43 &g3esc x2f43 &alias ISO_8859-3 &alias ISO-8859-3 &alias latin3 &alias l3 &code 0 NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US SP ! " Nb DO % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? At A B C D E F G H I J K L M N O P Q R S T U V W X Y Z <( // )> '> _ '! a b c d e f g h i j k l m n o p q r s t u v w x y z (! !! !) '? DT PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3 DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC NS H/ '( Pd Cu ?? H> SE ': I. S, G( J> -- ?? Z. DG h/ 2S 3S '' My h> .M ', i. s, g( j> 12 ?? z. A! A' A> ?? A: C. C> C, E! E' E> E: I! I' I> I: ?? N? O! O' O> G. O: *X G> U! U' U> U: U( S> ss a! a' a> ?? a: c. c> c, e! e' e> e: i! i' i> i: ?? n? o! o' o> g. o: -: g> u! u' u> u: u( s> '. [...] the mnemonics are explained in the rfc. rfc's can be found on many ftp sites. rj -- __________________________________________________ Robert Joop rj@{rainbow.in-berlin,fokus.gmd,cs.tu-berlin}.de s=joop;ou=fokus;ou=berlin;p=gmd;a=dbp;c=de ////////////////////////////////////////////////////////////////////////////// Newsgroups: bit.listserv.win3-l Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU Path: cs.utk.edu!darwin.sura.net!newsserver.jvnc.net!news.cac.psu.edu!psuvm !auvm!LUCS-01.NOVELL.LEEDS.AC.UK!ECL6TAM Return-Path: <@AUVM.AMERICAN.EDU,@VTBIT.BITNET:WIN3-L@UICVM.BITNET> Via: UK.AC.LEEDS.GPS; 2 JUL 93 8:56:26 BST Message-ID: Date: Fri, 2 Jul 1993 08:54:23 GMT Reply-To: T.A.McAllister@mailer.leeds.ac.uk Sender: Microsoft Windows Version 3 Forum From: Alec McAllister Subject: Re: Foreign language keyboards (German) Apologies if you already seen this. It was returned, implying that it had never reached the list. 
>Date: Thu, 1 Jul 1993 16:52:25 GMT >From: Alec McAllister >Subject: Re: Foreign language keyboards (German) > >>Date: Thu, 1 Jul 1993 10:16:50 -0500 >>From: Brian Madsen >>Subject: Foreign language keyboards (German) >> >>I occasionally use Windows for writing in German, and when I do, I switch >>the keyboard definition from US to German. This makes it lots easier to >>get at German foreign language characters (double ss's, umlauts, etc.) >> > >There's a better way. There's a piece of shareware called WinGreek. >That includes a program called Beta which "watches" your keyboard and >substitutes accented characters if you type certain combinations of >keys, e.g. if you type u followed by the plus-key on the numeric >keypad, Beta substitutes ANSI character 0252, u-umlaut. Similarly, >typing A followed by the plus-key makes Beta substitute ANSI 0196, A- >umlaut. The accents used in French, Spanish etc are just as quick and >easy to obtain. > >WinGreek and Beta work with any Windows product, not just word >processors. > >Beta plus a single font, the Times New Roman that comes with Windows, >can produce text in every major European language except Welsh (there >are no w-circumflex or y-circumflex characters). > >The beauty of this system is that you only have to learn one set of >special keys for all the languages: >/ = acute, >* = grave, >- = circumflex, >+ = umlaut, >tilde = tilde (Hurray!) and >the vertical gapped line = everything else (e.g. s followed by that >character gives you the German SZ that looks like a capital B, but A >followed by that character gives you the A with a ring above it which >is used in Scandinavian languages). > >WinGreek also gives you a superb Greek font with all the accents and >breathing-marks, a Hebrew font with (limited) right-to-left >processing, and even a font for Coptic. > >WinGreek is on archives such as CICA, but the authors are on email. I >can send their address if anyone is interested. > >. 
Alec McAllister Arts Computing Development Officer Computing Service University of Leeds LS2 9JT tel 0532 335399 ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals Path: cs.utk.edu!gatech!howland.reston.ans.net!Germany.EU.net!news.dfn.de !news.rwth-aachen.de!urmel.informatik.rwth-aachen.de!fangorn!michael Date: Tue, 10 Jan 1995 19:06:02 MET Organization: An old and gray machine, somewhere in Moria. Message-ID: <9501103491@fangorn> References: <3ekfe6$3ed@news1.shell> NNTP-Posting-Host: akela.informatik.rwth-aachen.de From: Michael Haardt Subject: Re: What is a lantern symbol... kshaw@shell.portal.com (kendall thomason shaw) writes: > My question > is what similar symbols might there be in McDOS code pages 850 or 437 > for the following symbols: > > lantern symbol > checker board (stipple) > board of squares > scan line 1 > scan line 9 > plus I don't know about DOS, but the characters look as following: checker board: # # # # # # # # # # # # # # # # # # # # # # # scan line 1 is a horizontal line at the top of a character, scan line 9 is a horizontal line at the bottom of a character. A vt100 has various such horizontal lines. plus is indeed a big cross, like used in conjuction with the corner and line symbols. lantern and board of squares I can not tell you right now, my vt100 is at home. It may be that it does not have them, at least the wyse 60 I am using does not have those in its emulation. The mapping characters are very closely connected to the vt100 and the AT&T4410. > And then I am still (of course?) baffled by the acsc/ac capability > syntax. Am I to put an octal escape for the literal character there? > (after the corresponding character expected, e.g. \305 for center line > drawing criss-cross type symbol? Yes, indeed you can do it that way. I used it a few years ago with Minix. ac=n\305 would map n to such a cross for native PC fonts. 
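................................................
ARCHIVER'S NOTE

The acsc/ac capability discussed above is read as a string of character
pairs: the first character of each pair is the VT100 line-drawing name
(n for the crossing lines, q for the horizontal line) and the second is
whatever the terminal itself needs to be sent. A minimal sketch in Python
follows. The function name parse_acsc is invented for this note, the
q\304 pair is added here only for illustration, and the input is assumed
to be the capability value after octal escapes such as \305 have already
been decoded.

```python
def parse_acsc(value):
    """Split a decoded acsc string into {vt100_name: terminal_char} pairs."""
    if len(value) % 2 != 0:
        raise ValueError("acsc must hold an even number of characters")
    return {value[i]: value[i + 1] for i in range(0, len(value), 2)}


# "n\305" is the mapping from the post; "q\304" is an illustrative extra pair.
mapping = parse_acsc("n\305q\304")

# On a PC console font (code page 437) byte 0xC5 is the crossing-lines glyph.
cross = bytes([ord(mapping["n"])]).decode("cp437")
```

Here mapping["n"] is the single character with code 0xC5 (octal 305),
which the native PC font draws as the criss-cross symbol.
................................................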
Michael
--
Twiggs and root are a wonderful tree (tm) Twiggs & root 1992 :-)
d? H- s(+)/(-) g! au a- w v(---) C++(+++) UL++++S++++?++++ L++ 3 E- N+++
tv b+ e+ h f+ m@ r++ n@ y+

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.programmer,comp.terminals
Path: cs.utk.edu!gatech!howland.reston.ans.net!pipex!sunic!news.funet.fi
 !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Followup-To: comp.terminals
Date: 16 Jan 1995 13:20:24 GMT
Organization: Finnish Meteorological Institute (FMI)
Lines: 36
Message-ID: <3fdrqo$ca4@kronos.fmi.fi>
References:
NNTP-Posting-Host: dionysos.fmi.fi
In-Reply-To: Article of Ryan Groth
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: ASCII CODES > 127 under VT100/ANSI & CURSES

[ Followups to comp.terminals ]

snakec@larry.wyvern.com (Ryan Groth) writes in comp.unix.programmer:
|
|I am writing a few applications under SCO unix (AT&T System V, POSIX...)
|using curses. I would like to use line drawing characters in the application
|which I am positive my terminal supports. I do not want to use the box()
|function however. If I addstr() with line drawing characters in the string I
|get M's and D's on the screen. Box does draw lines. Is there a way to use
|addstr() and send line characters? I am positive that my application will

These line drawing characters come from a different character set.

The usual assignment (with curses and a VT100) may be:

   Bank G0   US-ASCII           Assigned with ESC ( B
   Bank G1   Special Graphics   Assigned with ESC ) 0

   Selecting bank G0 for characters 32-127 with SI
   Selecting bank G1 for characters 32-127 with SO

   ESC is 0x1B or Ctrl-[
   SI  is 0x0F or Ctrl-O
   SO  is 0x0E or Ctrl-N

As you can see, drawing line characters isn't so simple (the VT100 DOESN'T
use characters > 127 -- the VT100 doesn't support them).

You can't do it with addstr() only, because the task also includes
character set assignments.
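................................................
ARCHIVER'S NOTE

The bank switching described above can be sketched in a few lines of
Python. This is an illustration only: the helper name horizontal_rule is
invented for this note, and it assumes a VT100-compatible terminal or
emulator that honors ESC ) 0, SO and SI. In the Special Graphics bank the
code for lowercase q is shown as a horizontal line segment, so a terminal
that ignores the switch prints a row of q's instead.

```python
ESC = b"\x1b"   # 0x1B, Ctrl-[
SI = b"\x0f"    # 0x0F, Ctrl-O: select bank G0 for characters 32-127
SO = b"\x0e"    # 0x0E, Ctrl-N: select bank G1 for characters 32-127


def horizontal_rule(width):
    """Return the bytes that draw `width` line segments, then restore G0."""
    designate_g1 = ESC + b")0"     # ESC ) 0: assign Special Graphics to G1
    return designate_g1 + SO + b"q" * width + SI
```

Writing horizontal_rule(20) to the terminal as raw bytes (for example with
sys.stdout.buffer.write) should give a line on a terminal that implements
the switch. Ending with SI matters: without it, later ordinary text would
still be drawn from bank G1.
................................................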
(If the terminal supports 8-bit characters, you can perhaps assign Special
Graphics to bank G1 and select bank G1 for characters 128-255 with ESC ~
However, I am not sure that these Special Graphics characters are
duplicated to the upper range -- perhaps they are.)

--
- Kari E. Hurtta / Elämä on monimutkaista
Kari.Hurtta@Fmi.FI puh. (90) 1929 658
{hurtta,root,Postmaster}@dionysos.fmi.fi

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
NNTP-Posting-Host: panix3.panix.com
NNTP-Posting-Date: Wed, 12 Aug 2009 01:53:41 +0000 (UTC)
References: <9Mpfm.103168$rg4.48662@newsfe02.iad> <05j*9b8Ns@news.chiark.greenend.org.uk>
Message-ID:
Organization: United Individualist
Date: 11 Aug 2009 21:53:41 -0400
From: Keith F. Lynch
Subject: Re: Shift Out Escape Sequence

Jacob Nevins wrote:
>
> (I don't think the line-drawing character set is actually part of
> the ISO 2022 standard as such -- I think it's a DEC invention -- but
> it's widely implemented in terminal emulators.)

I don't think it has anything to do with ISO 2022, which involves
two-byte characters.

There were two competing line-drawing character sets: The one used by
DEC, and the one used by DOS. DOS also had various accented letters whose
encoding was incompatible with anything else.

--
Keith F. Lynch - http://keithlynch.net/
Please see http://keithlynch.net/email.html before emailing me.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Newsgroups: comp.terminals
NNTP-Posting-Host: rapun.sel.cam.ac.uk
References: <9Mpfm.103168$rg4.48662@newsfe02.iad> <05j*9b8Ns@news.chiark.greenend.org.uk>
Message-ID:
Organization: Tartarus.Org
Date: 12 Aug 2009 09:02:15 +0100 (BST)
From: Simon Tatham
Subject: Re: Shift Out Escape Sequence

Keith F. Lynch wrote:
>
> I don't think it has anything to do with ISO 2022, which involves
> two-byte characters.
No, ISO 2022 is the standard which deals with the switching of character sets: sequences like ESC ( and ESC ) to designate specified character sets into G0-G3, control codes like SI and SO and SS2 that select particular ones of G0-G3 permanently or temporarily into the two actual halves of the byte space GL and GR, and a big list of actual character sets together with identifiers to select them by. _Some_ of the individual character sets are two-byte, but by no means all. There are also cut-down subsets of the full ISO 2022 standard (ISO-2022-JP, ISO-2022-KR and so on) which define a specific initial state and restrict the available control sequences; all of _those_ that I know of involve at least one two-byte sub-character-set, but full ISO 2022 has broader applicability. -- for k in [pow(x,37,0x13AC59F3ECAC3127065A9) for x in [0x195A0BCE1C2F0310B43C, 0x73A0CE584254AB23D5A0, 0x12878657EA814421CC92, 0x7373445BB3DA69996F4A, 0x77A7ED5BC3AA700E80B2, 0xE9C71C94ED87ADCF7367, 0xFE920395F414C1A5DB50]]: print "".join([chr(32+3*((k>>x)&1))for x in range(79)]) # ////////////////////////////////////////////////////////////////////////////// 2009 update: See this O'Reilly & Associates page on line-drawing characters: http://oreilly.com/catalog/docbook/chapter/book/iso-box.html ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.lang.cobol, comp.terminals, comp.unix.aix, comp.periphs Date: 5 Sep 1996 14:46:07 -0400 From: Richard Shuford Subject: Re: terminfo files for AIX 2.3 In article <322EB2E4.594F@lincsys.com>, Jim Egerton writes: > > Anyone have terminfo files (xterm, vt100, aixterm) that work > with the Microfocus toolbox on AIX? > > Using the files shipped with Microfocus V.3.2.37 I have tried > using an aixterm as well an xterm with TERM set to xterm and > vt100. With the aixterm or xterm and TERM=xterm, the video > didn't work properly (line's were displayed as qqqqqq). The display of a row of "qqqqqqq..." 
is a symptom of the client application wanting to use the DEC Line-Drawing Character Set, which is built into VT100s, VT320s, and any other DEC-like terminal built since 1980. With the proper character set mapped into the "alternate" character set, and if the terminal (or emulation) properly honors codeset switching, a horizontal line is displayed, instead of "qqqqqqqq...". (By the way, this is *not* the same as DEC's "advanced video option", or AVO. AVO on a VT100 gave you 24-line-by-132-column mode and the full four video attributes: underline, reverse, bold, & blink. Later DEC terminals had support for this as standard.) > With an xterm and TERM=vt100, the video is great (appears to use > the vt100 graphics character set to draw frames), but the > function keys didn't work. You don't say what kind of keyboard you are using. Makes a difference. > After copying the terminfo files to a local directory, > pointing COBTERMINFO and TERMINFO at the local directory, and > running the .src files through tic, the situation improved > slightly. The video for the aixterm and xterm with > TERM=xterm is better, but frames are drawn using +---+ > instead of the vt100 graphics characters. A reasonable thing to do, if the client cannot be certain that your xterm emulation supports the line-drawing characters. > I was able to modify the kf1 settings in vt100.src so that the function > keys are recognized, but the frames are drawn the same as > with the aixterm and xterm with TERM=xterm. > > I also pulled the example vt100 file from the Microfocus > Cobol home page and tried using this with an xterm. Same > results--no advanced video. > > If anyone has any terminfo files that appear to work in this > environment, or online documentation for the settings of sgr, > sgr0, enacs, rmacs, and acsc I'd really appreciate it. The global master database for terminfo and termcap descriptions is now maintained by Eric S. 
Raymond and is available from: http://www.ccil.org/~esr/ncurses.html [2004 working link: http://catb.org/%7Eesr/terminfo/index.html ] ........................................ Addendum: the master terminfo/termcap files contain a "klone+acs" entry that tries to use the line-drawing characters from the IBM PC alternate character set. This might work with any Intel console. ........................................ ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!news-res.gsl.net !news.gsl.net!news.mathworks.com!newsfeed.internetmci.com!demos !news.uni-stuttgart.de!uniol!uni-erlangen.de!lrz-muenchen.de !news.rz.uni-passau.de! Message-ID: <32102731.87@fmi.uni-passau.de> Organization: University of Passau, Germany To: Mike Ching X-Mailer: Mozilla 3.0b5 (X11; I; SunOS 5.5 sun4u) References: NNTP-Posting-Host: 132.231.20.18 Date: Tue, 13 Aug 1996 08:56:49 +0200 From: Martin Ramsch Subject: Re: I want lines, not q's! Mike Ching wrote: > > I'm trying to write a VT-100/ANSI terminal emulator in QuickBasic, but > I'm getting a bunch of > > qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq > > where there are supposed to be horizontal lines. How am I supposed to > recognize when there are supposed to be lines instead of q's? I've > noticed even some commercial programs with the same problem. I don't know exactly about VT-100/ANSI, but xterm's behaviour should be quite similiar (BTW, what are the differences?). 
What you observe is the switching between charsets:

  Control-N (SO, Shift Out):
      Switch to Alternate Character Set: invokes the G1 character set
  Control-O (SI, Shift In):
      Switch to Standard Character Set: invokes the G0 character set
      (the default)

What character sets G0 and G1 actually refer to is controlled by

  ESC ( : Designate G0 Character Set
      ESC ( B = United States (USASCII)
      ESC ( 0 = DEC Special Character and Line Drawing Set
  ESC ) : Designate G1 Character Set
      ESC ) B = United States (USASCII)
      ESC ) 0 = DEC Special Character and Line Drawing Set

I guess that by default G0 should refer to USASCII and G1 to the Line
Drawing Set.

So, in a nutshell, you have to pay attention to these code sequences!

See and

--
Sincerely/Mit freundlichen Gruessen
Martin Ramsch
Inbox/Fax: 02561/91371-6364

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.os.linux.development,comp.terminals
Followup-To: comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!swrinde!pipex!sunic
 !sunic.sunet.se!news.funet.fi!news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Date: 7 Apr 1995 07:21:53 GMT
Organization: Finnish Meteorological Institute (FMI)
Message-ID: <3m2p6h$kll@kronos.fmi.fi>
In-Reply-To: Article <3bjdl0$lfd@nyx10.cs.du.edu> of Colin Plumb
References: <784.2EDBB0B0@purplet.demon.co.uk> <3bi0he$c6v@trane.uninett.no>
 <3bi58q$8fv@kronos.fmi.fi> <3bjdl0$lfd@nyx10.cs.du.edu>
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m
 (was: Don't use ESC [ 11 m - was: Re: using the V ...)

[ This is a comment on a very old article from my archive :-) ]

colin@nyx10.cs.du.edu (Colin Plumb) writes in comp.terminals:
|I just went through RFC 1345 and the CCITT Red Book Recommendation T.51.
|It seems that the standard escape sequence looks like:
|CSI P P P ... P I...I F
|Where P are "parameters" taken from the 0x30..0x3F range (0123456789:;<=>?)
|I are magic modifier flags that can totally change the meaning of the escape
|sequence, taken from 0x20..0x2F ( !"#$%&'()*+,-./)
|And F is a final letter from 0x40..0x7E (@A..Z[\]^_`a..z{|}~) which specifies
|what the escape sequence is all about.
|The parameters P are decimal numbers separated by semicolons in the usual
|way. An all-zero field is synonymous with an empty field. Trailing empty
|fields and the separating semicolons can be stripped. Using a colon (:)
|is reserved for future standardization. If the parameters start with any
|of 0x3C..0x3F (<=>?), it's private-use.
|The top bit is ignored if set, although it's not supposed to be, in all
|the arguments.
|(That is taken from ISO 6429. It also says that F in the range of 0x70..0x7E
|is not to be standardized, but is for experimental use.)
|This applies to CSI, also known as ESC [. However, some of the ESC sequences
|described below also seem to use a similar pattern, although the last
|group of final characters isn't reserved and none of the sequences discussed
|here have parameters.
|As I understand it, you have two control sets available, C0 and C1.
|Characters from 0..0x1F are in C0, and 0x80..0x9F are in C1. In case you
|can't send 8-bit characters, ESC-@ through ESC-_ are synonyms for
|128 through 159. (ESC-x means x+64, for 64 <= x < 96.)
|You can select a C0 set with ESC ! F, where F is one of the final
|characters discussed above, and a C1 set with ESC " F.
|There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F).
|You can have 4 of these floating around, G0, G1, G2 and G3. The 0x20..0x7F
|and 0xA0..0xFF ranges are available to have these sets mapped into them.
|When you see a "0x3F", for example, you have to figure out which set (G0,
|G1, G2 or G3) is mapped into that space, and then figure out which character
|set is in force there.
|It's a bit like the 4 segment registers on the 8086.
|94-character sets are mapped in with ESC ( F, ESC ) F, ESC * F and ESC + F.
|These are the G0..G3 slots, respectively. There's also an overflow range
|which is used, ESC ( ! F, etc.
|96-character sets can only be mapped to the G1..G3 slots. That uses
|ESC - F, ESC . F and ESC / F. The "F" assignments are independent of
|the assignments for the 94-character sets.
|I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
|0xA0..0xFF, but I'm not finding it documented.
|Anyway, you can then choose the mapping of bytes to graphic character
|sets. This is done with LS0, LS1, LS2 and LS3 (locking Shift N)
|to place G0..G3 in the 0x20..0x7F range, and LS1R, LS2R and LS3R for
|the 0xA0..0xFF range. There's also SS2 and SS3 to shift the next character
|from G2 or G3 into the 0x20..0x7F range.
|In the document I have, SS2 is 0x19 (EM) and SS3 is 0x1D (GS).
|LS0 is 0x0F (SI), and LS1 is 0x0E (SO). LS2 is ESC n and LS3 is
|ESC o. LS1R is ESC ~, LS2R is ESC } and LS3R is ESC |.
|There are also multi-byte character sets, using either 94 or 96
|characters, selected with ESC $ F, ESC $ ) F, ESC $ * F and ESC $ + F
|for the 94-character case, and ESC $ - F, ESC $ . F and ESC $ / F for
|the 96-character case.
|You can have "dynamically reconfigurable character sets" (downloadable fonts),
|which are specified by inserting a space (0x20) between the character-set
|specifier and the final character. (If 63 is not enough, overflow using
|the ! hack is a possibility.)
|Oh, and finally, you can replace everything (all 128 or 256 characters)
|with ESC % F. What happens after that depends on the new character set,
|which may or may not define ESC to get at the old things.
|Now, what I don't understand is how 8-bit character sets work. RFC 1345
|specifies rather a lot of them, and generally uses the 96-character escapes
|for them, but there are a few 94-character escapes specified.
|In particular, ESC ( t and ESC ( | specify the NAPLPS and T.101-G2
|character sets, which are 8 bits.
|I could reconcile this if the G sets had room for two banks of characters
|(low and high), and 7-bit sets loaded both identically, while 8-bit
|sets loaded them differently, and the various shift functions fetched
|from the corresponding bank. But I can't find it referred to anywhere.

It seems that 94-banks really hold only 94 characters and 96-banks only
96 characters. In the case of 8-bit characters, the banks hold characters
161-254 (94-bank) or 160-255 (96-bank).

So once the bank has been selected, the highest bit of the character is
ignored. That highest bit affects only the choice between GR and GL, and
the GR/GL setting in turn determines which of the banks G0-G3 is used.
After that, the character is indexed from the bank as (char & 127) -- or
this is my impression from some documents (especially from
draft-ohta-text-encoding-01.txt).

Can you confirm this?

|Anyway, I don't think I've made any suggestions or asked any questions,
|but maybe this information dump will help some other people.
|--
| -Colin

[ CC'ed to colin@nyx10.cs.du.edu ]

--
- Kari E. Hurtta / Elämä on monimutkaista
Kari.Hurtta@FMI.FI puh. (90) 1929 658
{hurtta,root,Postmaster}@dionysos.FMI.FI

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!news.alpha.net
 !news.mathworks.com!transfer.stratus.com!xylogics.com!Xylogics.COM!carlson
Organization: Xylogics Incorporated
Message-ID: <3m39v7$2es@newhub.xylogics.com>
References: <784.2EDBB0B0@purplet.demon.co.uk> <3bi0he$c6v@trane.uninett.no>
 <3bi58q$8fv@kronos.fmi.fi> <3bjdl0$lfd@nyx10.cs.du.edu> <3m2p6h$kll@kronos.fmi.fi>
NNTP-Posting-Host: newhub.xylogics.com
Date: 7 Apr 1995 12:08:07 GMT
From: carlson@Xylogics.COM (James Carlson)
Subject: Re: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m
 (was: Don't use ESC [ 11 m - was: Re: using the V ...)

In article <3m2p6h$kll@kronos.fmi.fi>, hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes:
|> [...]
|> |You can select a C0 set with ESC !
F, where F is one of the final |> |characters discussed above, and a C1 set with ESC " F. Do you have a reference for that? I've never seen those described or used. (I'm not even sure what it would mean to have a "C0 set" ...) |> |There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F). |> |You can have 4 of these floating around, G0, G1, G2 and G3. The 0x20..0x7F |> |and 0xA0..0xFF ranges are available to have these sets mapped into them. |> |When you see a "0x3F", for example, you have to figure out which set (G0, |> |G1, G2 or G3) is mapped into that space,and then figure out which character |> |set is in force there. You left out GL and GR. GL (Graphics Left) is the pointer which maps the 20-7E characters into one of the Gx sets. Thus, GL has one of the values 0, 1, 2 or 3. GR (Graphics Right) is the pointer for the A0-FF set. This is usually restricted to 1, 2 or 3 (not 0). |> |I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in |> |0xA0..0xFF, but I'm not finding it documented. The default (at least for VT-series terminals) is GL=0, GR=2, G0=ascii, G1=ascii, G2=multinational and G3=multinational. |> |Anyway, you can then choose the mapping of bytes to graphic character |> |sets. This is done with LS0, LS1, LS2 and LS3 (locking Shift N) |> |to place G0..G3 in the 0x20..0x7F range, and LS1R, LS2R and LS3R for |> |the 0xA0..0xFF range. There's also SS2 and SS3 to shift the next character |> |from G2 or G3 into the 0x20..0x7F range. Actually, the locking-shift operators just change the GL and GR pointers. -- James Carlson Tel: +1 617 272 8140 Annex Software Support / Xylogics, Inc. 
+1 800 225 3317
53 Third Avenue / Burlington MA 01803-4491 Fax: +1 617 272 2618

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!atglab.bls.com!gatech
      !newsjunkie.ans.net!newstf01.news.aol.com!newsbf02.news.aol.com
Organization: America Online, Inc. (1-800-827-6364)
Message-ID: <3mjemn$5j5@newsbf02.news.aol.com>
References: <3m2p6h$kll@kronos.fmi.fi>
NNTP-Posting-Host: newsbf02.mail.aol.com
Date: 13 Apr 1995 11:07:03 -0400
From: "Peter Sichel"
Subject: Re: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)

In Message-ID: <3m2p6h$kll@kronos.fmi.fi> you wrote:

>Now, what I don't understand is how 8-bit character sets work.

8-bit character sets that follow the ISO structure (ISO 2022) are made up
of two 7-bit "halves". For example, ASCII in GL and ISO Latin-1
Supplemental in GR. The combined 8-bit set is called "ISO Latin Alphabet
Nr 1" or "ISO Latin-1" for short. [Ignoring the control sets C0 & C1 for
simplicity]

ISO 8859/1 (Latin-1) through ISO 8859/9 define additional 8-bit sets by
specifying the supplemental part to be used in GR along with ASCII in GL.

IBM Code Pages are different in that they have no structure for
designating and invoking (switching) character sets or components. Each
code page defines a fixed, application-specific repertoire. The term
"code page" refers to the page number on which the character set is
described in IBM's master book of character encodings.
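
The GL/GR arrangement discussed in this thread can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the class and function names are made up here, the startup state follows
the VT-style defaults quoted earlier in the thread (GL=0, GR=2) with ISO
Latin-1 Supplemental standing in for the set held in G2 and G3, and only
the locking shifts are modelled, not the ESC designation sequences or the
single shifts SS2/SS3.

```python
class Iso2022State:
    """Hypothetical model: G0..G3 designations plus the GL and GR pointers."""

    def __init__(self):
        # Assumed startup state: ASCII in G0/G1, Latin-1 supplement in G2/G3.
        self.g = ["ascii", "ascii", "latin1-supplement", "latin1-supplement"]
        self.gl = 0  # GL selects the bank used for bytes 0x20..0x7F
        self.gr = 2  # GR selects the bank used for bytes 0xA0..0xFF

    def locking_shift(self, name):
        """Apply LS0..LS3 (moves GL) or LS1R..LS3R (moves GR)."""
        bank = int(name[2])
        if name.endswith("R"):
            if bank == 0:
                raise ValueError("GR cannot point at G0")
            self.gr = bank
        else:
            self.gl = bank

    def decode(self, byte):
        """Return (set name, 7-bit index), or None for a control byte."""
        if 0x20 <= byte <= 0x7F:
            return self.g[self.gl], byte
        if 0xA0 <= byte <= 0xFF:
            # The high bit only chose GR; the bank is indexed with 7 bits.
            return self.g[self.gr], byte & 0x7F
        return None


def to_unicode(set_name, index):
    """Map a (set name, 7-bit index) pair to a Unicode character."""
    if set_name == "ascii":
        return chr(index)
    if set_name == "latin1-supplement":
        return chr(index | 0x80)
    raise KeyError(set_name)
```

With these assumed defaults, byte 0xE4 resolves through GR to the set in
G2 and comes out as "ä"; after an LS1R the same byte resolves to the
ASCII set in G1 and comes out as "d". The locking shift changes only a
pointer, never the contents of a bank.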
- Peter ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.protocols.kermit.misc Path: cs.utk.edu!cssun.mathcs.emory.edu!hobbes.cc.uga.edu!news-feed-1.peachnet.edu!news.netins.net!newshost.marcam.com!uunet!psinntp!nntp.hk.super.net!news.ust.hk!apang Message-ID: <1995Apr24.142214.28377@uxmail.ust.hk> Sender: usenet@uxmail.ust.hk Nntp-Posting-Host: cssu81.cs.ust.hk Organization: The Hong Kong University of Science and Technology X-Newsreader: TIN [version 1.2 PL2] Date: Mon, 24 Apr 1995 14:22:14 GMT From: apang@cs.ust.hk (Albert PANG) Subject: How To read/write Chinese at a remote host using UNIX C-Kermit How to read/write Chinese at a remote host using UNIX C-Kermit ============================================================== Software required: ----------------- 1) cxterm 2) kermit 'cxterm' is available at anonymous ftp ftp://cs.purdue.edu:/pub/ygz/cxterm-??.??.??.tar.Z Linuxers can also get a binary version on Linux at ftp://sunsite.unc.edu:/pub/Linux/X11/xutils/terms/cxterm-??.tar.gz C-Kermit 5A for your version of UNIX is available from ftp://kermit.columbia.edu/kermit/archives/cku190.tar.{Z,gz} Setup procedure: ---------------- 1. Make sure you have cxterm properly installed and can display/write Chinese characters in your local host. To get cxterm properly installed, the FAQ for cxterm, which is available at anonymous ftp: cs.purdue.edu:/pub/ygz/CXTERM.FAQ will be helpful. There are currently a few encoding methods for Chinese characters. They are Big5, GB and HZ. In HK and Taiwan, Big5 is more popular and in Mainland China, GB and HZ are more popular. 'cxterm' can be configured to support all of them. Anyway, this will not be relevant to kermit, as long as they are 8-bit code. 'cxterm' configured to a particular encoding will recognize that encoding only. 2. Open a cxterm and run kermit. 3. Configure kermit. 
Before you connect kermit to your modem, set the following parameters:

      set parity none
      set command bytesize 8
      set terminal bytesize 8
      set terminal character-set transparent

   Then connect as usual and log in to your remote host.

4. At your remote host, set the terminal to allow 8-bit characters with

      UNIX-Prompt> stty pass8

   This example works on SunOS, but the syntax might differ for other
   UNIX systems, for example "stty cs8" or "stty -parity". On non-UNIX
   systems use the appropriate command (like "set terminal /eightbit" on
   VMS).

   If you don't do this, you can still read Chinese, but you can't type
   it, since your terminal will strip the highest bit of each byte
   (unless, of course, your terminal has already been configured).

   You might like to include the above line in your shell rc script, so
   that you won't have to type it in every time you log in.

5. Voila! You should now be able to read/write Chinese in your cxterm.
   Go get a cup of tea or something and try reading some Chinese
   newsgroups:

      alt.chinese.txt.big5
      alt.chinese.txt
      tw.bbs.talk.joke

   Make sure you have the right kind of cxterm. cxterm configured to
   read Big5 will not recognize a passage written in GB, and vice versa.

And for information about how to read/write Chinese using MS-DOS Kermit,
see "Circumnavigating the Web" in Kermit News #6:

   ftp://kermit.columbia.edu/kermit/e/newsn6.{txt,ps}
   http://www.columbia.edu/kermit/newsn6.html

-- Albert Pang

//////////////////////////////////////////////////////////////////////////////

Kterm Announcement
Sat May 4 14:11:37 1991

Internet: mleisher@nmsu.edu
Bitnet  : mleisher@nmsu.bitnet

Mark Leisher
Computing Research Lab
New Mexico State University
Box 3CRL
Las Cruces, NM 88001-0001
+1 505 646-5711

INTRODUCTION
------------

Kterm is a modified version of xterm that is capable of displaying text
from character sets requiring 2 bytes per character as well as the
standard single byte character sets. The original kterm was designed to
support display of Japanese text.
This capability has been expanded to include Chinese and Korean as well.

CHARACTER SETS AND CODINGS
--------------------------

Version 4.1.2 of kterm can display Chinese, Japanese, and Korean text in
a number of coding systems. With the exception of the Korean N-byte
coding, all of the coding systems described below require two bytes per
character.

1. Chinese

   A. GB2312-1980 (GuoBiao) PRC standard

      GB is a seven bit standard that requires two bytes per character.
      It is most often used with the high (most significant) bit set on
      each byte of the character to distinguish the Chinese text from
      other seven bit text. The eight bit usage of GB is also used in
      CCDOS, the Chinese version of MS-DOS.

      NOTE: Perhaps the eight bit usage should be referred to as EUC
            (Extended Unix Code).

      CODE RANGE: 0xA1A1-0xFEFE

   B. Shift-GB

      Shift-GB is a mixed seven and eight bit coding, with the first
      byte always having the high (most significant) bit set to
      distinguish it from other seven bit text. Shift-GB was used by the
      Chinese Macintosh OS until recently.

      NOTE: I'm not sure if it is an official standard.

      CODE RANGE: 0x8140-0xAFFC (excluding 0x7F as a second byte)

   C. Big5

      Big5 is a mixed seven and eight bit coding, with the first byte
      always having the high (most significant) bit set to distinguish
      it from the other seven bit text. Big5 is at least a de facto
      standard in places like Hong Kong and Taiwan where the Traditional
      Chinese ideographs are used.

      NOTE: Rumor has it that it is, or will be a standard in Taiwan.
            I don't have any facts on this yet.

      CODE RANGE: 0xA140-0xF9FE

2. Japanese

   A. JIS (Japanese Industrial Standard X0208-1983)

      JIS is a seven bit standard that is usually distinguished from
      other seven bit text by a starting and ending escape sequence.

      START ESCAPE SEQUENCE: ESC $ B (NEW-JIS)
                             ESC $ @ (OLD-JIS)
      END ESCAPE SEQUENCE  : ESC ( B

      CODE RANGE: 0x2121-0x7E7E

   B. Shift-JIS

      Shift-JIS is a mixed seven and eight bit coding, with the high
      (most significant) bit of the first byte set to distinguish it
      from the other seven bit text.

      CODE RANGE: FIRST BYTE : 0x81-0x9F and 0xE0-0xEF
                  SECOND BYTE: 0x40-0xFC (excluding 0x7F)

   C. EUC

      EUC is an eight bit usage of JIS, with the high (most significant)
      bit of each byte set to distinguish it from other seven bit text.

      CODE RANGE: 0xA1A1-0xFEFE

3. Korean

   A. KSC5601-1987 (Jamos and Hangul)

      This version of kterm only supports the Jamos (Hangul elements)
      and Hangul portion of the KSC5601-1987 standard. The Hanja portion
      will come later.

      KS is a seven bit standard that requires two bytes per Hangul
      character. It is most often used with the high (most significant)
      bit set on each byte of the character to distinguish the Korean
      text from other seven bit text.

      NOTE: Perhaps the eight bit usage should be referred to as EUC
            (Extended Unix Code).

      CODE RANGE: JAMOS : 0xA4A1-0xA4FE
                  HANGUL: 0xB0A1-0xC8FE

   B. N-byte

      N-byte code is a way of representing Hangul text using only ASCII
      characters. It uses a variable number of bytes to select a
      particular Hangul syllable and is distinguished from other seven
      bit text by the SO (Shift Out) sequence and the SI (Shift In)
      sequence.

      START ESCAPE SEQUENCE: ^N (0x0E)
      END ESCAPE SEQUENCE  : ^O (0x0F)

      CODE RANGE: 0x41-0x7C (full range)

      NOTE: The code range actually varies. See the file "hgutil.c" for
            details.

4. X11 Compound Text

   Version 4.1.2 of kterm now recognizes most of the Compound Text
   approved standard encodings. It does not recognize the non-standard
   character set encodings or the directionality indicators.

   Even though the approved standard encodings are recognized, this is
   no guarantee that they will display text appropriately, specifically
   the right-to-left encodings. Code will have to be added to support
   this.
The 94^N Compound Text sequences for GB 2312-1980, JIS X0208-1983, and KS C5601-1987 will be interpreted correctly if the appropriate language is chosen when starting kterm, or if it is set in the application defaults file, KTerm.ad. FONTS ----- There are a number of freely available Chinese, Japanese and Korean X11 fonts available. Here are some anonymous ftp sites where the fonts are available: 1. HOST: crl.nmsu.edu [128.123.1.14] CRL has a relatively complete collection of the freely available Chinese, Japanese, and Korean X11 fonts. They are located in the subdirectories pub/chinese/fonts, pub/japanese/fonts, and pub/korean/. The CRL site also has lists of known anonymous ftp sites for software related to the language of interest. 2. HOST: miki.cs.titech.ac.jp [131.112.16.39] HOST: utsun.s.u-tokyo.ac.jp [133.11.11.11] These ftp sites have large collections of many Usenet and JUNET newsgroup archives. The fj.sources archives contain many of the Japanese X11 fonts that have been posted on JUNET. There are Index files in most of the directories describing which archive file has the font sources. 3. HOST: kum.kaist.ac.kr [137.68.1.65] There are a few Korean utilities available from this site as well as archives of a number of Usenet news groups. Most of the Korean related code and fonts are located in pub/hangul/. AUTHORS AND CONTRIBUTORS ------------------------ The initial conversion work on xterm for displaying Japanese text was done by kagotani@cs.titech.ac.jp (Hiroto Kagotani). The ANSI color support was added using the kterm 4.1.0 patches provided by mukawa@tn-sec.ntt.junet (Susumu Mukawa). The Multi-Byte Character Set Word Select feature was added using a modified version of Kiyoshi KANAZAWA's 4.1.0 MBCS_WSEL patches. The Chinese and Korean support was added by mleisher@nmsu.edu (Mark Leisher). 
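
The code ranges listed under CHARACTER SETS AND CODINGS can be turned
into a rough membership check. The Python sketch below is an
illustration only, not code from kterm: the function name is
hypothetical, and it assumes that a range such as 0xA140-0xF9FE may be
read as independent limits on the first and second byte. Shift-GB and
N-byte are left out.

```python
def candidate_codings(first, second):
    """Return the two-byte codings whose listed ranges admit this byte pair."""
    names = []
    # Seven bit JIS, GB and KS: both bytes in 0x21..0x7E.
    if 0x21 <= first <= 0x7E and 0x21 <= second <= 0x7E:
        names.append("7-bit")
    # Eight bit (EUC-style) usage of GB, JIS and KS: 0xA1A1-0xFEFE.
    if 0xA1 <= first <= 0xFE and 0xA1 <= second <= 0xFE:
        names.append("EUC")
    # Big5: 0xA140-0xF9FE, read here as per-byte limits (an assumption).
    if 0xA1 <= first <= 0xF9 and 0x40 <= second <= 0xFE:
        names.append("Big5")
    # Shift-JIS: first byte 0x81-0x9F or 0xE0-0xEF, second 0x40-0xFC except 0x7F.
    first_ok = 0x81 <= first <= 0x9F or 0xE0 <= first <= 0xEF
    if first_ok and 0x40 <= second <= 0xFC and second != 0x7F:
        names.append("Shift-JIS")
    return names
```

Many byte pairs satisfy more than one coding (0xB0 0xA1 fits both the
EUC and the Big5 ranges, for instance), which is why the language has to
be chosen when kterm is started rather than guessed from the data.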
CLOSING NOTES ------------- The {character set,font set,language,conversion} mechanisms are a little clumsy and should eventually be modified to be more in line with XPG3 locale specifications and the up-coming X11 i18n specifications. Hopefully, this won't be too far away. BUG REPORTS ----------- Please send bug reports and/or fixes for kterm 4.1.2 to mleisher@nmsu.edu or mleisher@nmsu.bitnet. THANKS ------ I would like to express my thanks to Mr. Kagotani for doing the initial conversion work. His code made it a lot easier for me to add support for Chinese and Korean. Thanks go to Ricky Yeung and F. F. Lee for making their Chinese code conversion programs freely available. I would also like to thank ujsung@solgai.kaist.ac.kr (UnJae Sung) for having the patience to answer my questions about Korean coding. And last but not least, thanks go to these people for significant bug reports and fixes: John Melby of Fujitsu Martin C. Fong of Sybase Yang Zhiwei of the German National Research Center for Computer Science Alton Harkcom (for help updating the Japanese manual page) ////////////////////////////////////////////////////////////////////////////// \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!vixen.cso.uiuc.edu !news-peer.sprintlink.net!news.sprintlink.net!Sprint!newsfeed.nacamar.de !wuff.mayn.de!wuff.franken.de!news-nue1.dfn.de!news-mue1.dfn.de !rzg.mpg.de!lrz-muenchen.de!not-for-mail Newsgroups: comp.unix.questions,comp.unix.admin,comp.windows.x, comp.std.internat,comp.software.international,at.general, soc.culture.german,soc.culture.french,soc.culture.belgium, soc.culture.quebec,soc.culture.nordic,soc.culture.spain, soc.culture.portuguese,soc.culture.latin-american, soc.culture.brazil,soc.culture.argentina,soc.culture.mexico, soc.culture.italian,soc.culture.colombia,soc.culture.venezuela, soc.culture.peru,soc.culture.chile,bit.listserv.catala Distribution: world References: 
Message-ID: <5r028v$4fn$1@sparcserver.lrz-muenchen.de> Organization: Leibniz-Rechenzentrum, Muenchen (Germany) Date: 21 Jul 1997 16:20:47 GMT From: Helmut.Richter@lrz-muenchen.de (Helmut Richter) Subject: Re: ISO 8859-1 National Character Set FAQ mike@vlsivie.tuwien.ac.at writes: >*****If you can confirm or deny this, please let me know.***** >Currently, each system vendor has his own set of locale names, which >makes portability a bit problematic. Supposedly there is some X/Open >document specifying a > _. >syntax for environment variables specifying a locale, but I'm unable >to confirm this. POSIX 1003.1 recommends (in the informative annex E.1.3) to use the following syntax of locale names: language_TERRITORY.Code, e.g.: de_AT.ISO8859-1 hu_HU.ISO8859-2 ja_JP.AJEC The funny thing is that they use a different syntax in the example in section B.8.1.2 (also an informative annex). ==== I think one should add some info on redefining a keyboard under X11 as to include additional characters. I have written a lengthy paper on the topic, albeit in German language (http://www.lrz-muenchen.de/services/software/x11/xmodmap/). I am ready to translate a part of it into English, but certainly not all of it. This is also interesting for emacs under X11: emacs does make a difference between a key combination like Meta-d and a key combination that has been redefined to mean a non-ASCII character (of course you must not use the Meta key, which is typically the same as the Alt key, as Mode_switch key). It is thus not necessary to quote such characters with Ctrl-Q to prevent them from being taken for emacs commands. 
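
As a small illustration of the language_TERRITORY.Code syntax quoted
above from POSIX 1003.1, a locale name can be split into its three parts
as follows. This is a minimal Python sketch; the function name is
hypothetical, and it assumes that the territory and code parts may be
absent (as in the plain "C" locale).

```python
def parse_locale_name(name):
    """Split language_TERRITORY.Code into (language, territory, code)."""
    # Everything after the first "." is the code set name.
    rest, _, code = name.partition(".")
    # Everything after the first "_" in the remainder is the territory.
    language, _, territory = rest.partition("_")
    # Missing parts are reported as None rather than as empty strings.
    return (language or None, territory or None, code or None)
```

For example, "de_AT.ISO8859-1" splits into the language "de", the
territory "AT" and the code "ISO8859-1".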
Helmut Richter

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.os.linux.development.apps,
      comp.os.linux.development.system,comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net
      !demon!doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi
      !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Organization: Finnish Meteorological Institute (FMI)
Lines: 53
Message-ID: <3pn802$sc1@kronos.fmi.fi>
References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu>
In-Reply-To: Article <3p9gne$mu7@uahcs2.cs.uah.edu> of Chris Ford
NNTP-Posting-Host: dionysos.fmi.fi
Date: 21 May 1995 11:25:54 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Character set assigments (Re: How do I display IBM PC characters?)

[ Added comp.os.linux.development.system as a recipient because the
  terminal driver is part of the kernel -- right? Added comp.terminals
  as a recipient, because this is a terminal (or terminal emulation)
  issue. ]

cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.development.apps:
|
|Peter Koenig (koenig@interaccess) wrote:
|: I'm trying to figure out how to display IBM PC characters. I know it's
|: possible, but doing a simple printf() with the value gets it masked to
|: 7-bits, and when I tried ncurses, it put the wrong character up... Any
|: pointers to more info on this?
| Before you do your printf, print this: "\033(U" and it will switch
|to the DOS character set. "\033(B" will switch back. Or vice versa.

Just a comment (and some surprising notes :-))

ESC ( U is quite an odd code from the standards point of view, as far as
I understand. ESC ( assigns bank G0, and bank G0 is not accessible in
the range 128-255. That is, GR (the right side; characters (128)160-255)
can never point to bank G0, only to banks G1-G3.

It would be more understandable if the codes were

   ESC - A    Assign Latin/1 (area (128)160-255) to G1
   ESC - U    Assign DOS character set (area 128-255) to G1

But it isn't that way :-)

And the codes 'ESC ( U', 'ESC ) U', 'ESC * U' and 'ESC + U' already have
another standard meaning (see below).

(Both ESC - and ESC ) assign G1; only the character set names differ.
 Hmm. ESC - can assign areas 160-255 (32-127),
      ESC ( can assign area 161-254 (33-126) -- yes, these are very
 confusing.)

By the way, where does the identifier "U" for the DOS character set come
from? Just curious.

Oops. The letter "U" is reserved for Latin-greek-1 (iso-ir-27) according
to RFC 1345 (which is an informational RFC).

RFC 1345 lists the following codes:

   ESC ( U    Assigns iso-ir-27 to G0
   ESC ) U    Assigns iso-ir-27 to G1
   ESC * U    Assigns iso-ir-27 to G2
   ESC + U    Assigns iso-ir-27 to G3

RFC 1345 doesn't list the codes

   ESC - U    Assign {something} (160-255 (32-127)) to G1
   ESC . U    Assign {something} (160-255 (32-127)) to G2
   ESC / U    Assign {something} (160-255 (32-127)) to G3

[ Hmm. Perhaps I will comment on some other issues later. ]

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.os.linux.development.system,
      comp.os.linux.development,comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net
      !demon!doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi
      !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Message-ID: <3pnf1l$nc@kronos.fmi.fi>
In-Reply-To: Article <3pn802$sc1@kronos.fmi.fi> of "Kari E. Hurtta"
References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu>
      <3pn802$sc1@kronos.fmi.fi>
NNTP-Posting-Host: dionysos.fmi.fi
Organization: Finnish Meteorological Institute (FMI)
Date: 21 May 1995 13:26:13 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: Character set assigments (Re: How do I display IBM PC characters?)

hurtta@dionysos.fmi.fi (Kari E.
Hurtta) writes: [ I promised to followup myself :-) ] |[ Added comp.os.linux.developments.system as receiver because terminal driver | is part of kernel -- right? Added comp.terminals as receiver, because that | is terminal (or terminal emulation) issue. ] [ Dropped comp.os.linux.development.apps from receivers. Added comp.os.linux.development as receiver :-) ] |cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps: || Before you do your printf, print this: "\033(U" and it will switch ||to the DOS character set. "\033(B" will switch back. Or vice versa. |These ESC ( U is guite odd code in standards view as |far I understands. ESC ( assigns bank G0. And bank G0 is on accessible |in range 128-255. That is GR (right side; characters (128)160-255) can newer |point to to bank G0. Only to banks G1-G3. |It should be more understandable if code is | | ESC - A Assing Latin/1 (area (128)160-255) to G1 | ESC - U Assign DOS character set (area 128-255) to G1 Because you want keep DEC special graphics in G1 (which is default for VT100), and GR is bydefault pointed to bank G2. Better use following codes: ESC . A Assign Latin/1 (area 160-255) to G2 ESC . U Assign DOS character set (area 160-255(*)) to G2 (*) There is still problem that C1 (128-159) is for control codes. At least some versions of Linux terminal driver interpreter one of these: CSI (9/11 or 0x9b) -- (IMHO -- it should interpreter all codes in C1 range or nothing them -- current situation confusing. Notice specially cursor control codes: IND (8/4 or 0x84), RI (8/13 or 0x8d) and NEL (8/5 or 0x85).) |But it isn't that way :-) |And codes 'ESC ( U', 'ESC ) U', 'ESC * U' and 'ESC + U' have already another |standard meaning (see later). |(both ESC - and ESC ) assigns G1 -- charset names are different. | Hmm. ESC - can assign areas 160-255 (32-127), | ESC ( can assign area 161-254 (33-126) -- yeas these are very confusing.) 
<...> |RFC 1345 lists following codes: | ESC ( U Assigns iso-ir-27 to G0 | ESC ) U Assigns iso-ir-27 to G1 | ESC * U Assigns iso-ir-27 to G2 | ESC + U Assigns iso-ir-27 to G3 |RFC 1345 don't list codes | ESC - U Assign {something} (160-255 (32-127)) to G1 | ESC . U Assign {something} (160-255 (32-127)) to G2 | ESC / U Assign {something} (160-255 (32-127)) to G3 RFC 1345 lists MS-DOS character set (charset: IBM437), but don't give character set assigment codes for this. |[ Hmm. Perhaps I comment some other issues later. ] [ I still seems to be some issue not to be covered yet. :-) ] ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.os.linux.development.system, comp.os.linux.development,comp.terminals Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net!demon !doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi!news.csc.fi !kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3pp9va$8je@kronos.fmi.fi> References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu> <3pn802$sc1@kronos.fmi.fi> <3pnf1l$nc@kronos.fmi.fi> In-Reply-To: Article <3pnf1l$nc@kronos.fmi.fi> of "Kari E. Hurtta" Organization: Finnish Meteorological Institute (FMI) Date: 22 May 1995 06:11:54 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: Character set assigments (Re: How do I display IBM PC characters?) hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes: [ I'm still followuping myself :-) ] ||cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps: ||| Before you do your printf, print this: "\033(U" and it will switch |||to the DOS character set. "\033(B" will switch back. Or vice versa. 
<...> ||It should be more understandable if code is || ESC - A Assing Latin/1 (area (128)160-255) to G1 || ESC - U Assign DOS character set (area 128-255) to G1 { ESC - U is just my suggestion, only prefix ESC - is standard } |Because you want keep DEC special graphics in G1 (which is default for VT100), |and GR is bydefault pointed to bank G2. Better use following codes: | ESC . A Assign Latin/1 (area 160-255) to G2 | ESC . U Assign DOS character set (area 160-255(*)) to G2 { ESC . U is just my suggestion, only prefix ESC . is standard } |(*) There is still problem that C1 (128-159) is for control codes. | At least some versions of Linux terminal driver interpreter one | of these: CSI (9/11 or 0x9b) -- (IMHO -- it should interpreter | all codes in C1 range or nothing them -- current situation confusing. Notice specially cursor control codes: IND (8/4 or 0x84), | RI (8/13 or 0x8d) and NEL (8/5 or 0x85).) In article "Re: DO use ESC [ 11 m (was: Don't use ESC 11 m"... in groups comp.os.linux.development and comp.terminals Colin Plumb (at 30 Nov 1994 19:50:08 -0700) was giving information what indicates that perhaps correct prefix is ESC % which changes whole set (all 128 or 255 characters). So perhaps yeat better codes are something like: ESC % A Assigns Latin/1 to G2, enables C1 (128-159) as control range, Assigns US-ASCII to G0 ESC % U Assigns MS-DOS to range 32-255 (G0,G2 and C1), disables C1 as control range { previous codes are just my suggestions, not from many specification. Only prefix ESC % can be taken from ISO 6429 } Hmm. According same article prefix ESC ! can be used for assign C0 (0-31) and prefix ESC " can be used to assign C1 (128-159). By to way, what codes was to assign UTF-8 and UTF-1 Was it ESC % {something} I think that I have hear code for UTF-1 to be assigned officially. 
<...> ||RFC 1345 lists following codes: || ESC ( U Assigns iso-ir-27 to G0 || ESC ) U Assigns iso-ir-27 to G1 || ESC * U Assigns iso-ir-27 to G2 || ESC + U Assigns iso-ir-27 to G3 ||RFC 1345 don't list codes || ESC - U Assign {something} (160-255 (32-127)) to G1 || ESC . U Assign {something} (160-255 (32-127)) to G2 || ESC / U Assign {something} (160-255 (32-127)) to G3 |RFC 1345 lists MS-DOS character set (charset: IBM437), but don't give |character set assigment codes for this. ||[ Hmm. Perhaps I comment some other issues later. ] |[ I still seems to be some issue not to be covered yet. :-) ] [ Perhaps I not followup myself -- I think that is going to be monology :-) ] ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.os.linux.development,comp.terminals Path: cs.utk.edu!gatech!swrinde!pipex!sunic!news.tele.fi!news.csc.fi !kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3bjv6b$mf4@kronos.fmi.fi> References: <784.2EDBB0B0@purplet.demon.co.uk> <3bi0he$c6v@trane.uninett.no> <3bi58q$8fv@kronos.fmi.fi> <3bjdl0$lfd@nyx10.cs.du.edu> Organization: Finnish Meteorological Institute (FMI) Date: 1 Dec 1994 07:49:31 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m was: Re: using the V colin@nyx10.cs.du.edu (Colin Plumb) writes: > It seems that the standard escape sequence looks like: > CSI P P P ... P I...I F > Where P are "parameters" taken from the 0x30..0x3F range (0123456789:;<=>?) > I are magic modifier flags that can totally change the meaning of the escape > sequence, taken from 0x20..0x2F ( !"#$%&'()*+,-./) > And F is a final letter from 0x40..0x7E (@A..Z[\]^_`a..z{|}_) which specifies > what the escape sequence is all about. Thanks. Yes. I was little careless. For character set changing DEC uses I modifiers and that F final letters. > There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F). 
> You can have 4 of these floating around, G0, G1, G2 and G3. The 0x20..0x7F > and 0xA0..0xFF ranges are available to have these sets mapped into them. > When you see a "0x3F", for example, you have to figure out which set (G0, > G1, G2 or G3) is mapped into that space, and then figure out which character > set is in force there. > It's a bit like the 4 segment registers on the 8086. > 94-character sets are mapped in with ESC ( F, ESC ) F, ESC * F and ESC + F. > These are the G0..G3 slots, respectively. There's also an overflow range > which is used, ESC ( ! F, etc. 94 -character sets seems to be (in VT420): B US-ASCII %5 DEC Multinational Following character sets haven't mentioned are they 94 or 96 character set -- I think that these are 94 -character sets: 0 DEC special graphics > DEC Technical < user-preferred supplemental (*) And also following national character sets (available only in national mode): A UK-ASCII (ISO United Kingdom) 4 DEC Dutch 5 DEC Finnish R ISO French 9 DEC French Canadian K ISO German Y ISO Italian 6 DEC Norwegian/Danish ' ISO Norwegian/Danish %6 DEC Portuguese Z ISO Spanish = DEC Swiss (*) DEC Multinational or ISO Latin/1 (selectable with DCS ... ST codes). > 96-character sets can only be mapped to the G1..G3 slots. That uses > ESC - F, ESC . F and ESC / F. The "F" assignments are independent of > the assignments for the 94-character sets. 96 -character sets seems to be (in VT420): A ISO Latin/1 > I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in > 0xA0..0xFF, but I'm not finding it documented. That is how VTxxx -series terminals does it. > There are also multi-byte character sets, using either 94 or 96 > characters, selected with ESC $ F, ESC $ ) F, ESC $ * F and ESC $ + F > for the 94-character case, and ESC $ - F, ESC $ . F and ESC $ / F for > the 960-character case. You mean: ... and ESC $ / F for the 96-character case. > Now, what I don't understand is how 8-bit character sets work. 
RFC 1345 > specifies rather a lot of them, and generally uses the 96-character escapes > for them, but there are a few 94-character escapes specified. > In particular, ESC ( t and ESC ( | specify the NAPLPS and T.101-G2 > character sets, which are 8 bits. > I could reconcile this if the G sets had room for two banks of characters > (low and high), and 7-bit sets loaded both identically, while 8-bit > sets loaded them differently, and the various shift functions fetched > from the corresponding bank. But I can't find it referred to anywhere. At least codes ESC ) < ESC * < ESC + < ESC ) %5 ESC * %5 ESC + %5 changes both low and high side of banks (I think that I don't have used other codes for selecting 8-bit character sets.) I don't have tried use high side of bank when to bank have assigned 7-bit character set. > Anyway, I don't think I've made any suggestions or asked any questions, > but maybe this information dump will help some other people. -- - Kari E. Hurtta / Elämä on monimutkaista Kari.Hurtta@Fmi.FI puh. (90) 1929 658 {hurtta,root,Postmaster}@dionysos.fmi.fi ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.unix.admin,comp.terminals Path: cs.utk.edu!gatech!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu !math.ohio-state.edu!cs.utexas.edu!convex!cnn.exu.ericsson.se !erinews.ericsson.se!sunic!sunic.sunet.se!news.funet.fi!news.csc.fi !kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3s15pj$4cs@kronos.fmi.fi> References: <3rli9f$3qd@linet02.li.net> NNTP-Posting-Host: dionysos.fmi.fi Organization: Finnish Meteorological Institute (FMI) Date: 18 Jun 1995 12:22:11 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: extended ascii characters [ Added comp.terminals as receiver. ] steven.g.johnson.1@gsfc.nasa.gov (steve johnson) writes in comp.unix.admin: |In article <3rli9f$3qd@linet02.li.net>, cagenjo@scls1 (Agenjo) wrote: |> |> I am part of a team setting up 54 libraries on an Internet system. 
|> I designed a welcome screen (using a DOS text editor) that I hoped would
|> greet new users. I used some extended ascii characters to create a nice
|> graphic, but when our sysadmin loaded it in, the characters we see upon
|> login are not what I used - they have become numbers, etc.
|
| unfortunately, different systems map differently.
|
|> He doesn't
|> think there is a way for his UNIX SunOS to properly display my file.
|> Does anyone know of a way to do this?
| i'm no expert on this, but what you probably want is one of the isolatin
| (iso8859) character sets. ascii is a proper subset of iso8859-1.

[ My answer is partially terminal specific and partially uses the
  document "ISO International Register of Coded Character Sets To Be
  Used With Escape Sequences". Sorry. ]

For box drawing ('nice graphics') he probably wants to use a special
graphics set such as the one in the VT100, i.e.:

 -- Assign the special graphics set to bank G1
       ESC ) 0          ESC is 0033 in octal
 -- Select bank G1 for characters 32-127
       SO               SO is 0016 in octal
 -- For boxes you can now use the characters
       upper left corner:   0154 in octal, 0x6C in hex
       upper right corner:  0153 in octal, 0x6B in hex
       lower left corner:   0155 in octal, 0x6D in hex
       lower right corner:  0152 in octal, 0x6A in hex
       horizontal line:     0161 in octal, 0x71 in hex
          (characters 0157 - 0163 are all horizontal lines)
       vertical line:       0170 in octal, 0x78 in hex
 -- To return to US-ASCII, select bank G0 for characters 32-127
       SI               SI is 0017 in octal
    (This assumes that G0 holds US-ASCII; if it does not, you can
     assign it with
       ESC ( B          ESC is 0033 in octal)

That special graphics set is DEC-specific, but xterm, for example, also
supports it (in theory).

To assign Latin/1 you need a VT300 or better:

 -- First assign US-ASCII to bank G0
       ESC ( B          ESC is 0033 in octal
 -- Select bank G0 for characters 32-127
       SI               SI is 0017 in octal
 -- Assign Latin/1 range 160-255 to bank G2
       ESC . A          ESC is 0033 in octal
 -- Select bank G2 for characters 160-255
       ESC }            ESC is 0033 in octal
  * Now you have Latin/1 available
  - If you are short of banks and don't want to use special graphics
    in bank G1, you can assign Latin/1 range 160-255 to bank G1
       ESC - A          ESC is 0033 in octal
    and select bank G1 for characters 160-255
       ESC ~            ESC is 0033 in octal

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!mp.cs.niu.edu
      !vixen.cso.uiuc.edu!howland.reston.ans.net!spool.mu.edu
      !bloom-beacon.mit.edu!crl.dec.com!crl.dec.com!nntpd.lkg.dec.com
      !regent.enet.dec.com!lasko
Organization: Digital Equipment Corporation
Message-ID: <3th8pc$dl@nntpd.lkg.dec.com>
References: <3tfc1c$qvl@senator-bedfellow.MIT.EDU>
Date: 6 JUL 1995 13:54:55
From: lasko@regent.enet.dec.com
X-From: (Tim Lasko, Digital Equipment Corp., Marlborough, MA)
Subject: Re: Hebrew keyboard mapping

In article <3tfc1c$qvl@senator-bedfellow.MIT.EDU>, igorlord@mit.edu (Igor Lyubashevskiy) writes...
>
>Hi, I am reading my VT420 manual, and it is totally clueless about the control
>sequences that envolve Hebrew modes.... Does anyone at DEC or otherwise know
>the correct values that go into those sequences ( CSI ? Pd h - like ).
>Also, what are the mode identifiers of DECHEM (Hebrew encoding mode) and
>DECNAKB (Greek Keyboard Mapping) since they are also mentioned to be either
> 34, 35, or 57 in the description, index, or examples.

There are actually four commands:

   DECRLM  - Cursor Right to Left Mode     ?34
   DECHEBM - Hebrew (Keyboard) Mode        ?35
   DECHEM  - Hebrew Encoding Mode          ?36
   DECNAKB - North American Keyboard Mode  ?57

I'm looking at my VT5xx programming manuals (available from Digital's
ftp site) and I still see a few typos, unfortunately.

>Finally, what is the function of
>DECNAKB and DECHEBM (two very similar functions) when SET?
The manual claims >that they function in an exactly opposite way to each other, which seems to me >highly illigical. They operate exactly as described. DECHEBM when reset and DECNAKB when set configure the terminal to use the North American keyboard layout. When DECHEBM is set and DECNAKB is reset, the corresponding "non North American" layout is configured. [Back when "specials" of the VT200 series terminals were done, commands to effect the similar operations (switching from a North American to a "non North American" keyboard for one) weren't always well rationalized with each other and these two got switched around. When those features were brought into the base VT400 unit, the definitions were kept the way they were for backwards compatibility with those units.] ------------------------------------------------------------------------------- Tim Lasko, Digital Equipment Corp., Marlborough MA (lasko@regent.enet.dec.com) My opinions are my own; the facts can speak for themselves. I'm on my own time. For Digital terminal support: call 1.800.777.4343 or email ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.mail.mime,comp.terminals,comp.software.international Path: cs.utk.edu!willis.cis.uab.edu!gatech!news.mathworks.com !newsfeed.internetmci.com!news.sprintlink.net!in2.uu.net!news.tele.fi !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Date: 31 Aug 1995 06:28:25 GMT Organization: Finnish Meteorological Institute (FMI) Message-ID: <423kq9$7e5@kronos.fmi.fi> Subject: Re: Security and MIME (Especially, metamail) [ Added comp.terminals and comp.software.international as receiver. ] NED@innosoft.com (Ned Freed) writes in comp.mail.mime: in article <01HUOLFW583090MTNI@INNOSOFT.COM> | |<...> |Designers of user agents (and as you say this is not limited to MIME agents or |even mail user agents) are caught between a rock and hard place on this issue. 
|On the one hand, escape sequences are often used in text objects and if you
|block them the text ends up looking like garbage. This is especially burden-
|some to users of Japanese, Chinese, and Korean character sets that employ
|escape sequence switching -- block the switching sequences and the result is
|completely useless. And on the other hand, not blocking such sequences opens
|the door to these kinds of attacks. And by the way, they aren't limited to
|programmable keys -- programmable answerback sequences can also be used and
|are a lot more common on older, poorly designed equipment.
|<...>

It is quite easy to allow only sequences that are _syntactically_ correct
according to ISO 2022. The switching sequences of the Japanese, Chinese,
and Korean character sets use ISO 2022 codes (as far as I know; I haven't
read every spec, only some). Answerback codes and the like don't match
these codes syntactically.

Note that syntactic matching doesn't require a list of all possible codes.

//////////////////////////////////////////////////////////////////////////////
Newsgroups: comp.protocols.kermit.misc
Path: cs.utk.edu!news.msfc.nasa.gov!newsfeed.internetmci.com!chi-news.cic.net
      !newsjunkie.ans.net!news.rmii.com!thoth.nilenet.com!ra.nilenet.com!gweisz
From: gweisz@nilenet.com (Gideon Weisz)
Subject: Hebrew e-mail, etc
Date: 21 Dec 1995 04:28:02 GMT
Organization: NileNet, Ltd
Lines: 29
Message-ID: <4banoi$5k0@thoth.nilenet.com>

For those who wish to do Hebrew e-mail, and already have a DOS PC and
a UNIX internet node, things are now pretty easy, particularly if you
have mskermit 3.14. We are even hoping [in 1995] that there will be a
Hebrew mailing list soon.
and with mskermit you can even compose hebrew messages in the recent
English PINE easily, with the help of some scripts: kermit enables you to
write in Hebrew characters and see them on your screen going the right way,
while the scripts enable you to reverse their actual direction and right
justify afterwards.

some helpful files have been posted and are available on jerusalem1.
e-brew.txt is a cookbook style info file
e-brew.zip and its complementary ebrewadd.zip are a quickstart program
package that can also serve as a convenient toolkit, and a later program
and script package that improves it.

the e-brew files are at
  ftp://ftp.jer1.co.il/pub/software/msdos/communication/e-brew.txt
  ftp://ftp.jer1.co.il/pub/software/msdos/communication/e-brew.zip
  ftp://ftp.jer1.co.il/pub/support/offline_mail/ebrewadd.zip
the locations might change, but that's where the files are now.

i don't want to use up bandwidth here, so anyone interested should contact
me for a copy of the full announcement or anything else that i might be
able to help with.

gideon
--
gideon weisz  גדעון  [boulder, colorado]

//////////////////////////////////////////////////////////////////////////////
Newsgroups: comp.fonts,comp.std.internat
Path: cs.utk.edu!gatech!news.mathworks.com!fu-berlin.de!news.belwue.de
      !news.uni-konstanz.de!Otto.Stolz
Date: 9 Jan 96 11:39:09 GMT
Organization: Universitaet Konstanz
From: Otto Stolz
To: kirshenbaum@hpl.hp.com
References: <499ccn$102o@info4.rus.uni-stuttgart.de>
            <819044957snz@sahaja.demon.co.uk>
X-URL: news:DKoI2K.1Ip@hplabsz.hpl.hp.com
Message-ID: <30f253dd.0@news.uni-konstanz.de>
Lines: 16
Subject: Re: Euro Currency Symbol (was: What does the "forin" char stand for?)

Christopher Fynn (cfynn@sahaja.demon.co.uk) wrote:
> Does anyone know if a new currency symbol for this monetary
> unit has been decided upon?

Stephen Baynes wrote:
> All the teletext standards [...] give [...] as the European
> Currency symbol [...]
a glyph of a combined C and E

evan@hpl.hp.com (Evan Kirshenbaum) wrote:
> it is the one given in the
> Unicode standard as character U+2040, "EURO-CURRENCY SIGN"

In my copy of ISO/IEC 10646-1: 1993(E), this is character number 20A0;
position 2040 is assigned to the CHARACTER TIE. Unicode, most probably,
complies with ISO 10646-1, in this respect.

Regards,
  Otto Stolz

//////////////////////////////////////////////////////////////////////////////
Newsgroups: comp.protocols.kermit.misc
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!cs.utexas.edu
      !howland.reston.ans.net!gatech!newsfeed.internetmci.com!panix
      !news.columbia.edu!watsun.cc.columbia.edu!fdc
From: fdc@watsun.cc.columbia.edu (Frank da Cruz)
Date: 2 Apr 1996 15:22:20 GMT
Organization: Columbia University
Lines: 45
Message-ID: <4jrgnc$l49@apakabar.cc.columbia.edu>
References: <4hf5q8$fkj@apakabar.cc.columbia.edu>
            <4hkbmo$gou@apakabar.cc.columbia.edu>
Subject: Re: Kermit 3.14 + Kanji = troubles

In article zippy@hairball.ecst.csuchico.edu (The Pinhead) writes:
: In article <4hkbmo$gou@apakabar.cc.columbia.edu>
: fdc@watsun.cc.columbia.edu (Frank da Cruz) writes:
:
::   set file character-set shift-jis          <-- (Irrelevant)
::   set terminal bytesize 8                   <-- Yes, you need this
::   set parity none                           <-- Ditto
::   set terminal character-set transparent    <-- Ditto
::
:: This is exactly the set of commands you need. If it doesn't work, that
:: most likely means that CP932 is not the active code page, or that you don't
:: have DOS/V in Japanese mode. Another possibility is that the host that
:: you are connecting to is not itself in 8-bit mode. For example, if it were
:: a SunOS 4.x system, you would need to tell it to:
::
::   stty pass8
::
:: before you could see 8-bit characters (the Kanji bytes of Shift-JIS have
:: their 8th bits set to 1). Use the equivalent command ("stty cs8" or
:: whatever) on other versions of UNIX or other operating systems.
:
: You've been really helpful, Frank! Thanks...
However, just one : problem remains... MSKermit 3.14 seems to be remapping the line : drawing characters. The application is an ACUCobol program in : shift-jis mode. Using CKermit 5A(190) under Linux with Japanese : extentions, the line drawing character 0xc4 displays properly as a : horizontal bar, but under MSKermit 3.14 it displays as the katakana : character "to" (0x44). Sorry for the delay in replying. This is from our informant in Japan: There are two problems: 1) There is no line drawing character in Japanese DOS/V character set (not only line drawing charters but also many symbols which are included US PC-DOS, e.g., copyright mark etc). 2) 0xc4 is officially defined as Katakana "to" in Shift-JIS code. If we change the mapping, it will cause many problems on many Japanese hosts where they use JIS-X-201 Katakana (Hankaku Katakana). (End quote) - Frank ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.os.linux.development.system,comp.std.internat,comp.terminals Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!math.ohio-state.edu !howland.reston.ans.net!gatech!newsfeed.internetmci.com!in2.uu.net !nntp.inet.fi!news.funet.fi!news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi !hurtta From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Date: 14 Apr 1996 10:29:40 GMT Organization: Finnish Meteorological Institute (FMI) Message-ID: <4kqk2k$5tm@kronos.fmi.fi> References: <4k73al$2mo@portal.gmu.edu> <4k87g6$5n@cortex.dialin.rrze.uni-erlangen.de> <4kandi$rm@cortex.dialin.rrze.uni-erlangen.de> <4kdcda$1fe@cortex.dialin.rrze.uni-erlangen.de> <316eb9a7.13321510@news.ucs.ubc.ca> In-Reply-To: Article <316eb9a7.13321510@news.ucs.ubc.ca> of Eric Gisin Subject: Re: Linux and UNICODE? [ Added comp.terminals as receiver. 
]

ericg@unixg.ubc.ca (Eric Gisin) writes in comp.os.linux.development
and comp.std.internat:
<...>
| I thought stateful encodings were added to standard C at IBM's request, whose
| EBCDIC-based multibyte character sets require it. What is ISO 2022, and is
| anyone using it? I wouldn't want to see GNU libc implement something that's
| never going to be used.
<...>

It is used, for example, in Digital's VT series terminals for selecting
and managing character sets... Look at, for example, the VT300 series or
newer. Linux's console driver implementation does not count (unless it
has improved since I last looked at it :-))

ISO 2022 is also used for Chinese and Japanese character sets.

So different parts of ISO 2022 are definitely in wide use...

//////////////////////////////////////////////////////////////////////////////
Newsgroups: alt.folklore.computers
Date: Wed, 12 Mar 1997 14:09:38 GMT
From: John Savard
Subject: Re: Pre-1988 Windows ECS

dski@cameonet.cameo.com.tw wrote:
>Anyone know what Windows used for an extended character set before
>ISO 8859-1 came along?
>Dan Strychalski
>dski@cameonet.cameo.com.tw

No, but ISO 8859-1 came along in 1985, since the Amiga used it then.
Only the multiplication and division signs weren't agreed on (and, from
the code chart, they're in a position that ought to be used for the OE
and oe ligatures, and, according to the manual for my inkjet printer, is
so used in some Unix character set).

The other possibility would have been to use the DOS character set, of
course, which is still used in some Windows fonts.
John Savard ////////////////////////////////////////////////////////////////////////////// Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!news1.radix.net!news4.agis.net !www.nntp.primenet.com!nntp.primenet.com!news-feed.inet.tele.dk !news.nacamar.de!howland.erols.net!worldnet.att.net !cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!newsxfer3.itd.umich.edu !newsxfer.itd.umich.edu!yale!oitnews.harvard.edu!fas-news.harvard.edu !newspump.wustl.edu!newsreader.wustl.edu!not-for-mail Date: Wed, 26 Mar 1997 17:53:44 -0600 Message-ID: <3339B708.7B98B902@artsci.wustl.edu> References: <5ha085$mh0@reader.seed.net.tw> From: Tom Stepleton Subject: Re: Strange IBM glyphs (Was: Amiga) dski@cameonet.cameo.com.tw wrote: > > I've heard it said the 5051 was conceived as a game machine. To look at > some of the characters IBM assigned to values in the ASCII control range > -- playing-card symbols and the like -- the idea doesn't seem so far- > fetched. And a bunch of I/O addresses were designated as the "game port." I wonder about this as well. Why did IBM use all of those bizarre glyphs for the control characters? The smiley faces (01,02), the game cards (03-07), the gender signs (0B,0C), and the 16th notes (0E) don't seem to serve too much of a purpose and aren't that easy for Joe BASIC Game Programmer to put on the screen with only PRINT statements. I remember hearing somewhere long ago that all these card symbols and such originated on Wang word processing systems, but I don't trust my memory... 
--Tom +-----------+---------------------------+ ____ | Stepleton | ssteplet@artsci.wustl.edu |>-------|\__/_/__ +-----------+---------------------------+ \________} ////////////////////////////////////////////////////////////////////////////// Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!chi-news.cic.net!metro.atlanta.com !feeder.chicago.cic.net!newsrelay.netins.net!news.ececs.uc.edu !newsfeeds.sol.net!news.maxwell.syr.edu!supernews.com!news Organization: All USENET -- http://www.Supernews.com Message-ID: <333c5a9c.12736765@news.comland.com> References: <5ha085$mh0@reader.seed.net.tw> <3339B708.7B98B902@artsci.wustl.edu> Date: Sat, 29 Mar 1997 00:00:30 GMT From: orestes@comland.com (William D. Leara) Subject: Re: Strange IBM glyphs (Was: Amiga) On Wed, 26 Mar 1997 17:53:44 -0600, Tom Stepleton wrote: > > I wonder about this as well. Why did IBM use all of those bizarre > glyphs for the control characters? The smiley faces (01,02), the game > cards (03-07), the gender signs (0B,0C), and the 16th notes (0E) don't > seem to serve too much of a purpose and aren't that easy for Joe BASIC > Game Programmer to put on the screen with only PRINT statements. > > I remember hearing somewhere long ago that all these card symbols and > such originated on Wang word processing systems, but I don't trust my > memory... You're right on the money. Check out the October 2, 1995 edition of FORTUNE magazine, specifically the interview with Paul Allen and Bill Gates. Bill says: "... we were also facinated by dedicated word processors from Wang, because we believed that general-purpose machines could do that just as well. That's why, when it came time to design the keyboard for the IBM PC, we put the funny Wang character set into the machine--you know, smiley faces and boxes and triangles and stuff. We were thinking we'd like to do a clone of Wang word-processing software someday." 
-- William Leara orestes@comland.com ////////////////////////////////////////////////////////////////////////////// Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!vixen.cso.uiuc.edu!howland.erols.net!europa.clark.net !newsfeeds.sol.net!noc.nyx.net!nyx10.cs.du.edu!not-for-mail Date: 29 Mar 1997 12:27:22 -0700 Organization: University of Denver, Dept. of Math & Comp. Sci. Message-ID: <5hjqeq$raa@nyx10.cs.du.edu> NNTP-Posting-Host: nyx10.nyx.net From: snorwood@nyx10.cs.du.edu (Scott Norwood) Subject: origins of '\' (backslash) on keyboards? I remember reading with interest the thread on the origins of the '\' as a directory separator for M$-DOS (as opposed to the '/' used in UNIX). Now, here's another question: at what point did the backslash key become standard for computer keyboards? It's not on my typewriter, nor is it on the keyboard of my Apple II (whose keyboard is essentially the same as a teletype terminal), but it does exist on early DEC terminals (VT-100, etc.) and other equipment of the late-1970's vintage. How did this practice start? Does it have any roots in the UNIX convention of using the backslash to indicate that the following character should be treated literally (as in referring to filenames with spaces or other 'weird' characters in them).? 
-- Scott Norwood: snorwood@nyx.net, snorwood@balloon.ml.org, senorw@mail.wm.edu Lame Home Page #1: http://balloon.ml.org/ <-- School year only Lame Home Page #2: http://www.nyx.net/~snorwood/ <-- Regular page ////////////////////////////////////////////////////////////////////////////// Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!newsfeed.nacamar.de !europa.clark.net!newsfeed.internetmci.com!news1.infoave.net!usenet Date: Sun, 30 Mar 1997 05:27:14 GMT Organization: Fantasy Farm Fibers Message-ID: <333df90f.107920290@news.swva.net> References: <5hjqeq$raa@nyx10.cs.du.edu> NNTP-Posting-Host: pem02-02.swva.net From: bernie@rev.net (Bernie Cosell) Subject: Re: origins of '\' (backslash) on keyboards? snorwood@nyx10.cs.du.edu (Scott Norwood) wrote: } } I remember reading with interest the thread on the origins of the '\' } as a directory separator for MS-DOS (as opposed to the '/' used in UNIX). } } Now, here's another question: at what point did the backslash key } become standard for computer keyboards? It has been on computer keyboards for a very long time. The early Model 33 Teletypes had forward-slash and reverse-slash on the keyboard. At the time, there was no particular preference for one over the other: the keyboard just included both slashes. [it also had "uparrow" and "backarrow", which a later revision of ASCII (at the time the model 37 came out) changed to caret and underscore, respectively]. Neither forward-slash nor reverse-slash were added for the convenience of computers... /bernie\ -- .............................................................................. Date: 30 Mar 1997 20:44:25 GMT From: "Douglas W. Jones,201H MLH,3193350740,3193382879" Newsgroups: alt.folklore.computers Subject: Re: origins of '\' (backslash) on keyboards? 
From article <5hjqeq$raa@nyx10.cs.du.edu>, by snorwood@nyx10.cs.du.edu (Scott Norwood): > Now, here's another question: at what point did the backslash key > become standard for computer keyboards? ... The Teletype Models 33 (ASR, KSR, etc) had a backslash (shift L, if memory serves), and anything that copied the Teletype character set verbatim also had it. The Model 33 was the first ASCII terminal, and the 64 character subset of ASCII it supported (upper case only!) included the backslash. Doug Jones .............................................................................. Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!chi-news.cic.net!arclight.uoregon.edu !feed1.news.erols.com!howland.erols.net!swrinde!ihnp4.ucsd.edu !newshub.nosc.mil!news!mshapiro Organization: NCCOSC RDT&E Division, San Diego, CA References: <5hjqeq$raa@nyx10.cs.du.edu> <859668981snz@tnglwood.demon.co.uk> Message-ID: <1997Mar31.215753.24813@nosc.mil> Date: Mon, 31 Mar 1997 21:57:53 GMT From: Michael D Shapiro Subject: Re: origins of '\' (backslash) on keyboards? In article , Al Castanoli wrote: >Robert Billing writes: > >[...] > >: The key was certainly on the old ASR33, long before there were >: VT-anything terminals. I suspect that it antedates UN*X itself, and >: goes back to the deep magic at the dawn of ASCII. > >It was not on the Model 28 ASR, though ... I remember having to put >"backwards slash" in messages with Mod 28 ASR's and KSR's. Probably >a tradeoff in cramming ASCII into the Baudot bitstream. > The reverse solidus (back slash) showed up fairly early in ASCII code development, which was (as I recall) in the early 1960s. An excellent background on the history of character sets is in the book "Coded Character Sets" (I was about to give a more complete reference but forgot the author and publisher). Please let me know if you want a more complete reference. 
Incidentally, the Japanese equivalent of ASCII, JISCII, places the yen
symbol in place of the reverse solidus.
--
Michael D. Shapiro, Ph.D.              Internet: mshapiro@nosc.mil
Code 4123, NCCOSC RDT&E Division (NRaD) San Diego CA 92152
Voice: (619) 553-4080   FAX: (619) 553-4808   DSN: 553-4080

..............................................................................
Newsgroups: alt.folklore.computers
Date: 1 Apr 1997 16:12:05 GMT
From: BBReynolds
Message-ID: <19970401161101.LAA26059@ladder01.news.aol.com>
Subject: Re: origins of '\' (backslash) on keyboards?

The complete reference is Charles E. Mackenzie, "Coded Character Sets,
History and Development", The Systems Programming Series, Reading,
Massachusetts, Addison-Wesley, 1980. Chapters 12 and 13 cover the
development of ASCII; the reverse solidus a/k/a (or is that a\k\a??)
backslash was part of the original specification.
--
Bruce B. Reynolds, Systems Consultant:
Founder of Trailing Edge Technologies---
Sweeping Up Behind Data Processing Dinosaurs

..............................................................................
Newsgroups: alt.folklore.computers
Date: 1 Apr 1997 22:02:06 GMT
Message-ID: <5hs0ku$mae$1@news.wizvax.net>
From: John Wilson
Subject: Re: origins of '\' (backslash) on keyboards?

In article <5hmjb9$afo@flood.weeg.uiowa.edu>,
Douglas W. Jones,201H MLH,3193350740,3193382879 wrote:
>The Teletype Models 33 (ASR, KSR, etc) had a backslash (shift L, if memory
>serves), and anything that copied the Teletype character set verbatim also
>had it. The Model 33 was the first ASCII terminal, and the 64 character
>subset of ASCII it supported (upper case only!) included the backslash.

What particularly impressed me about that nasty little mechanical keyboard
on the 33 was that its method of generating control characters was
consistent, i.e. shift-K gave you "[" and if you wanted ESCape (^[)
(N.B. NOT ALTMODE!), you typed ctrl-shift-K.
And if I remember right, there was an interlock so that when you had ctrl and shift down, only those data keys that would now send something different would allow themselves to be pressed. Kinda cute. That answerback drum is something pretty special too... -- John Wilson 0,3 @ SID .............................................................................. Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!chi-news.cic.net!arclight.uoregon.edu!europa.clark.net !cpk-news-hub1.bbnplanet.com!cam-news-hub1.bbnplanet.com !news.bbnplanet.com!howland.erols.net!EU.net!news.eunet.fi !news.microdata.fi!nntp.inet.fi!news.sci.fi!usenet Message-ID: <3340bb0f.65034735@news.sci.fi> Date: Tue, 01 Apr 1997 09:03:07 GMT From: Paul Keindnen Subject: Re: origins of '\' (backslash) on keyboards? mshapiro@nosc.mil (Michael D Shapiro) wrote: > >The reverse solidus (back slash) showed up fairly early in ASCII code >development, which was (as I recall) in the early 1960s. An excellent >background on the history of character sets is in the book "Coded >Character Sets" (I was about to give a more complete reference but >forgot the author and publisher). Please let me know if you want a >more complete reference. > >Incidentally, the Japanese equivalent of ASCII, JISCII, places the yen >symbol in place of the reverse solidus. While strictly speaking ASCII is a purely US standard, many national 7-bit character sets exist in the rest of the world, which are almost identical to the ASCII character set, but a few character positions are reserved for national variations. There are usually 6 to 9 character positions that differ from the ASCII representation. Crosshatch character, code 35 (decimal), is in many character sets the pound sign. Character codes after Z (91..94) and after z (123..126) are used for national variants. 
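
[ Editor's sketch, not part of the original post. The national-use
  positions named above can be listed, and one national variant decoded,
  in a few lines of Python. The position list is taken straight from the
  post; the Finnish/Swedish letter assignments and every name used here
  (NATIONAL_USE_CODES, FI_SE_LETTERS, ascii_glyphs, decode_fi_se) are
  assumptions added for illustration, and only the six letter positions
  are mapped. ]

```python
# Editor's sketch: positions the post lists as differing between
# national 7-bit sets: 35, 91..94 and 123..126 (decimal).
NATIONAL_USE_CODES = [35, *range(91, 95), *range(123, 127)]

# Assumed Finnish/Swedish 7-bit assignments for the six letter positions.
# The remaining national-use positions are deliberately left unmapped.
FI_SE_LETTERS = {
    "[": "Ä", "\\": "Ö", "]": "Å",
    "{": "ä", "|": "ö", "}": "å",
}


def ascii_glyphs(codes):
    """Return the US-ASCII characters found at the given code positions."""
    return "".join(chr(code) for code in codes)


def decode_fi_se(text):
    """Reinterpret 7-bit text using the Finnish/Swedish national letters."""
    return "".join(FI_SE_LETTERS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(ascii_glyphs(NATIONAL_USE_CODES))         # #[\]^{|}~
    print(decode_fi_se("El{m{ on monimutkaista"))   # Elämä on monimutkaista
```

[ Run as a script, this prints the nine US-ASCII glyphs at those
  positions, then shows how the same bytes read once the national
  letters are substituted. ]
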
The backslash (92) is in the Dutch character set '1/2', while in the Finnish, German and Swedish character sets it is upper case O with two dots, in the French and Italian character set it is c with cedilla, in the Norwegian set it is O with a slash and in the Spanish set it is N with tilde. Apparently when the terminal manufacturers had to make keyboards for these languages and include keys for the "extra" characters, it was not economical to manufacture keyboards with different number of keys for each market, the extra keys in the US version were used to generate the same character code as the foreign version, but was labelled with the backslash etc. key cap, which otherwise would not have "deserved" an own key. Paul Keinanen .............................................................................. Newsgroups: alt.folklore.computers Date: Thu, 03 Apr 1997 22:23:33 GMT Message-ID: <5i1au5$bpj@tor-nn1-hb0.netcom.ca> From: John Savard Subject: Re: origins of '\' (backslash) on keyboards? In <3340bb0f.65034735@news.sci.fi>, keinanen@sci.fi (Paul Keindnen) wrote: > > While strictly speaking ASCII is a purely US standard, many national > 7-bit character sets exist in the rest of the world, which are almost > identical to the ASCII character set, but a few character positions > are reserved for national variations. Yes, and these character sets belong to International Telegraph Alphabet No. 5, which is the international version of ASCII; so there is a worldwide standard based on ASCII. John Savard .............................................................................. Message-ID: <5i1aop$bpj@tor-nn1-hb0.netcom.ca> Date: Thu, 03 Apr 1997 22:20:40 GMT From: John Savard Newsgroups: alt.folklore.computers Subject: Re: origins of '\' (backslash) on keyboards? snorwood@nyx10.cs.du.edu (Scott Norwood) wrote: > I remember reading with interest the thread on the origins of the '\' > as a directory separator for M$-DOS (as opposed to the '/' used in UNIX). 
> Now, here's another question: at what point did the backslash key > become standard for computer keyboards? Well, back in 1964, when the original ASR-33 Teletype was produced-- when ASCII was invented, in other words--the backslash was part of the character set. Back then, the caret was instead an up arrow (which it should have remained, being useful as an exponentiation operator); the underscore was an arrow pointing left; and there were no lowercase characters; ` { | } and ~ did not exist yet. However, in addition to DEL, the last few characters before it were controls as well: ACK, ESC, and ALT MODE then are now printing characters, and a different control character is used for ESC. The last 8 of the first 32 characters did not have their present meanings; they were just S0 through S7. John Savard ////////////////////////////////////////////////////////////////////////////// Sender: Message-Id: <9207241339.AA04105@skinfaxe.diku.dk> Cc: Date: Fri, 24 Jul 1992 15:39:12 +0200 From: Steen Linden Subject: Re: Character sets In message <9207231551.AAgandalf08152@gandalf.uio.no> asked: > > [about character sets in email-directory support] > > If not, where do I start? Must I patch each DUA, or just some library? The ISO8859-1 conversion stuff is in libcommon.a. Take a look at isode-8.0/dsap/common/string.c. The most interesting functions are strprint() and iso8859print(). I haven't done any of the work you are requesting, though I could definitely use it. I was just looking through the code in search of the T.61 version of our Danish common national letters. 
Here they are by the way:

  T.61   X11 Keysym  ISO8859/1  ASCII
  ---------------------------------
  \f1    ae          0xE6       {
  \e1    AE          0xC6       [
  \f9    oslash      0xF8       |
  \e9    Ooblique    0xD8       \
  \caa   aring       0xE5       }
  \caA   Aring       0xC5       ]

--Steen

//////////////////////////////////////////////////////////////////////////////
Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!newsfeed.nacamar.de
      !news-feed.inet.tele.dk!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com
      !news-peer.sprintlink.net!news.sprintlink.net!sprint!howland.erols.net
      !spring.edu.tw!feeder.seed.net.tw!reader.seed.net.tw!!dski
Organization: Cameo Communications, Inc.
Message-ID: <5iujss$h0h@reader.seed.net.tw>
NNTP-Posting-Host: 192.72.104.4
Date: 15 Apr 1997 00:59:08 GMT
From: dski@cameonet.cameo.com.tw
Subject: Re: ASCII History - No Cents?

John Savard (seward@netcom.ca) wrote --

> Since ASCII was originally developed in 1963 for US use, the cents
> sign would perhaps have been a useful thing to include.

Both ISO and ASA (now called ANSI) began work on text encoding standards
in 1961. ASA being an ISO member body, it seems likely that the work was
coordinated to some degree. The 1963 ASA standard was for a six-bit
uppercase-only version of ASCII; seven-bit "ASCII" seems to have been
adopted first in *Europe*, in the form of ECMA-6, which the European
Computer Manufacturers Association approved in 1965. The earliest
seven-bit U.S. version I know of dates from 1968. This is also the year
in which President Johnson mandated ASCII for the federal government's
computer operations.

> I have felt that the following assignments would make a good 'National
> Use' version of ASCII for North American English-language word
> processing, considering the keyboard arrangement:

Which keyboard arrangement? Before IBM got into the microcomputer
sca^H^H^Hbiz, many non-alphanumeric marks were not where you find them now.
Most digital keyboards used ASCII-based pairings (["] with [2]; [&] with [6]; ['] with [7]; [(] and [)] with [8] and [9]; etc.). IBM used a layout similar to that of the Selectric. > Of course, I am deeply distressed both by the placement of the > multiplication and division signs in eight-bit ASCII where OE and oe > clearly belong... > > and, in the opposite direction, by the fact that the only standard > eight-bit ASCII is totally devoted to foreign-language word > processing. I wanted AND, OR, less than or equal to, not equal to, > greater than or equal to, and other symbols useful for _programming > languages_ ( X and -^H:, although useful for ALGOL, hardly count ) in > an eight-bit ASCII. Along, of course, with the _Greek_ alphabet [...] Code Page 437! You have it! I've seen so many different characters used for AND, OR, NOT, et al, in *typeset* material, I wonder if they just couldn't agree on which ones to use. Did these get standardized when I wasn't looking? > (although LLL 8-bit ASCII isn't perfect either, as APL should instead > get a character set of its own). "8-bit ASCII" is a contradiction in terms, I believe. What's LLL? -- Dan Strychalski ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!news.maxwell.syr.edu !news-peer.sprintlink.net!news.sprintlink.net!Sprint!howland.erols.net !europa.clark.net!newsfeed2!argos.tel.hr!news Date: 7 May 1997 19:36:08 GMT Organization: Croatian Post & Telecommunications Message-ID: <01bc5b1d$4d7c4420$86e11dc3@gost.hr> References: <01bc5620$cf1a97e0$62e31dc3@gost.hr> <01bc56f2$19bb2220$d2e31dc3@gost.hr> <5kepgr$rgf@neptune.theplanet.co.uk> NNTP-Posting-Host: ac23-p1-zg.tel.hr From: "Bernard Grgic" Subject: Re: Need help with Fonts in Hyper Terminal Win95 Only monospaced (unproportional) fonts are listed in Hyper Terminal font set, NOT every TT Font, as I expected in the beginnig. 
I had to redesigne one of them (make Croatian characters instead of some other characters like square brackets) The problem was in a fact that I need to communicate with UNIX through code page CP437, not CP852, where I have Croatian characters, by default. I have VGA driver for CP437 with (redesigned) Croatian characters, but it does not work with programs in graphic mode. Greetings, Bernard. oliver st.john wrote in article <5kepgr$rgf@neptune.theplanet.co.uk>... > > >Do not vaste your time reading the text below! > >The problem has been solved, successfuly. > >Bernard > > [ > [ HOW? I'm sure a few of us would like to know... > [ > > >Bernard Grgic wrote in article > > <01bc5620$cf1a97e0$62e31dc3@gost.hr>... > >> Hi, > >> How can I choose other TT Font which is not specified in Hyper Terminal. I > >> need to do it if I want to have Croatian characters on the screen, when > >> communicate with UNIX. I have such characters in TT Fonts but they are not > >> listed in Hyper Terminal font list. > >> > >> If anyone can help or give me any suggestion, please, do it. > >> Thank you. > >> Bernard ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.lang.pl1 Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!newsfeeds.sol.net !news.maxwell.syr.edu!feed1.news.erols.com!news.ecn.uoknor.edu !munnari.OZ.AU!news.mel.connect.com.au!harbinger.cc.monash.edu.au !news.rmit.EDU.AU!goanna.cs.rmit.edu.au!not-for-mail Message-ID: <5outhp$h6n$1@goanna.cs.rmit.edu.au> References: <33B2DACD.4279@ix.netcom.com> Organization: Comp Sci, RMIT University, Melbourne, Australia. Date: 27 Jun 1997 09:21:29 +1000 From: rav@goanna.cs.rmit.edu.au (robin) Subject: Re: CHARSET (48) vs CHARSET (60) In <33B2DACD.4279@ix.netcom.com>, dneubart@ix.netcom.com writes: >I'm trying to map the PL/I 48-character set to 60-character set. >I haven't been able to match anything more than > > .. (dot dot) to : (colon) > ,. 
(comma dot) to ; (semi-colon)
>
>Does anyone know the 48-character equivalents for
>
>   > (greater than)
>   < (less than)
>   | (logical or)
>   etc.

OTOMH,

   >   is GT
   <   is LT
   |   is OR
   &   is AND
   ||  is CAT
   >=  is GE
   <=  is LE
   ^=  is NE
   ^   is NOT
   ->  is PT
   ^>  is NG
   ^<  is NL

Are there any others?

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.lang.pl1
Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!vixen.cso.uiuc.edu
    !ais.net!news.maxwell.syr.edu!howland.erols.net!psinntp
    !clothos.candle.com!phobos.candle.com!news
Organization: Candle Corporation
Message-ID: <33B2FBEE.E9E@candle.com>
References: <33B2DACD.4279@ix.netcom.com>
Date: Thu, 26 Jun 1997 16:31:58 -0700
From: Eric Jackson
Subject: Re: CHARSET (48) vs CHARSET (60)

dneubart@ix.netcom.com wrote:
>
> I'm trying to map the PL/I 48-character set to 60-character set.

Boy, this takes me back. These are the ones I can think of off hand:

   GT   >
   LT   <
   OR   |
   LE   <=
   GE   >=
   NOT  *
   NE   *=
   CAT  ||
   AND  &

//////////////////////////////////////////////////////////////////////////////
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

Path: cs.utk.edu!darwin.sura.net!spool.mu.edu!bloom-beacon.mit.edu!ai-lab
    !prep.ai.mit.edu!gnu
Newsgroups: gnu.announce,gnu.utils.bug,comp.unix.shell,
    comp.unix.programmer,comp.unix.misc
Followup-To: gnu.utils.bug
Approved: info-gnu@prep.ai.mit.edu
Message-ID: <9312240149.AA11561@icule.UUCP>
To: info-gnu@prep.ai.mit.edu
Date: Fri, 24 Dec 1993 01:49:20 GMT
From: pinard%icule.UUCP@iro.umontreal.ca (Francois Pinard)
Subject: Release: GNU recode version 3.3

Here is my Christmas gift to GNU users. Best wishes to all of you!

GNU recode 3.3 should soon be available on prep.ai.mit.edu, as file
pub/gnu/recode-3.3.tar.gz. All reported bugs have been corrected.
Thanks to all those who contributed comments or suggestions.
recode converts files between character sets and usages. When exact transliterations are not possible, it may get rid of the offending characters or fall back on approximations. This program recognizes or produces nearly 150 different charsets, able to transliterate files between almost any pair. Most RFC 1345 charsets are supported. Please report bugs to: bug-gnu-utils@prep.ai.mit.edu Here is a list of user visible changes from version 3.2.4: * Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added. * Also, most RFC 1345 charsets and aliases are handled. That's a bunch! * Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead. * Old maci disappears because of RFC 1345's macintosh, use applemac instead. * Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead. * Recoding between latin1, ibmpc and applemac is (almost) reversible. * The texinfo documentation has been reorganized, this to be continued. * Long options are accepted, charset names may be abbreviated. * Option --list (-l) displays charsets, aliases and contents in many formats. * Option --strict (-s) asks for stricter, non-reversible recodings. * Option --graphics (-g) approximates ibmpc rulers with ASCII graphics. * Option --header (-h) produces C source for many recoding tables. * Option --auto-check (-a) reports about all possible recodings. * Option --ignore (-x) prevents a charset from being selected. * Execution has been sped up through step merging, hashing for charset names. * Many various buglets have been eradicated, portability increased. * Charsets may be edited out by modifying the Makefile only. * Configuration is made through the use of an external config.h file. -- Franc,ois Pinard ``Vivement GNU!'' pinard@iro.umontreal.ca About the League for Programming Freedom? Email me or lpf@uunet.uu.net [ Most GNU software is packed using the new `gzip' compression program. Source code is available on most sites distributing GNU software. 
For information on how to order GNU software on tape, floppy, or cd-rom, check the file etc/ORDERS in the GNU Emacs distribution or in GNUinfo/ORDERS on prep, or e-mail a request to: gnu@prep.ai.mit.edu By ordering your GNU software from the FSF, you help us continue to develop more free software. Media revenues are our primary source of support. Donations to FSF are deductible on US tax returns. The above software will soon be at these ftp sites as well. Please try them before prep.ai.mit.edu! thanx -gnu@prep.ai.mit.edu ASIA: ftp.cs.titech.ac.jp, utsun.s.u-tokyo.ac.jp:/ftpsync/prep, cair.kaist.ac.kr:/pub/gnu AUSTRALIA: archie.au:/gnu (archie.oz or archie.oz.au for ACSnet) AFRICA: ftp.sun.ac.za:/pub/gnu MIDDLE-EAST: ftp.technion.ac.il:/pub/unsupported/gnu EUROPE: irisa.irisa.fr:/pub/gnu, ftp.univ-lyon1.fr:pub/gnu, ftp.mcc.ac.uk, unix.hensa.ac.uk:/pub/uunet/systems/gnu, src.doc.ic.ac.uk:/gnu, ftp.win.tue.nl, ugle.unit.no, ftp.denet.dk, ftp.informatik.rwth-aachen.de:/pub/gnu, ftp.informatik.tu-muenchen.de, ftp.eunet.ch, nic.switch.ch:/mirror/gnu, ftp.funet.fi:/pub/gnu, isy.liu.se, ftp.stacken.kth.se, ftp.luth.se:/pub/unix/gnu, archive.eu.net WESTERN CANADA: ftp.cs.ubc.ca:/mirror2/gnu USA: wuarchive.wustl.edu:/mirrors/gnu, labrea.stanford.edu, ftp.digex.net:/pub/gnu, ftp.kpc.com:/pub/mirror/gnu, ftp.cs.widener.edu, uxc.cso.uiuc.edu, ftp.hawaii.edu:/mirrors/gnu, ftp.cs.columbia.edu:/archives/gnu/prep, col.hp.com:/mirrors/gnu, gatekeeper.dec.com:/pub/GNU, ftp.uu.net:/systems/gnu ] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.unix.solaris Date: 19 Mar 1998 08:53:34 GMT From: "Casper H.S. Dik - Network Security Engineer" Subject: Re: Special Symbol Characters (copyright, trademark, etc.)? 
[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

R!ch writes:

>On Wed, 18 Mar 1998, Akira Hangai wrote:

>> How could I type a special symbol character such as copyright,
>> trademark, registered, etc., in a program like Text Editor, Netscape
>> Mail, or even ShellTool/Dt Terminal?

>There's a section in one of the manuals that shows the Compose key
>sequences for most of these (stuff like £, ©, æ, etc) - but irritatingly,
>I've forgotten which one. IIRC, you have to be using an 8 bit locale
>to display the characters.

There's also the table in /usr/openwin/share/include/X11/Suncompose.h

ComposeTableEntry compose_table[] = {
...

Of course, as a native Dutch person I miss the ability to use
compose i-j to create a "ÿ" (y with diaeresis, compose " y)

Note that you can also use the reverse of the compositions as the
lookup routine will first sort the two characters on ASCII values
and then look up the entry in the table.

Casper

..............................................................................

Date: 21 Mar 1998 00:41:37 GMT
From: "Richard L. Hamilton"
Newsgroups: comp.unix.solaris
Subject: Re: Special Symbol Characters (copyright, trademark, etc.)?

In article , R!ch writes:
> On Wed, 18 Mar 1998, Akira Hangai wrote:
>
>> How could I type a special symbol character such as copyright,
>> trademark, registered, etc., in a program like Text Editor, Netscape
>> Mail, or even ShellTool/Dt Terminal?
>
>
> There's a section in one of the manuals that shows the Compose key
> sequences for most of these (stuff like £, ©, æ, etc) - but irritatingly,
> I've forgotten which one. IIRC, you have to be using an 8 bit locale
> to display the characters.
>

Or just look at the compose_table[] initializer in
/usr/openwin/include/X11/Suncompose.h

You don't even have to understand C to figure that one out.
> -- > R!ch (Email is flakey at present: use richardt@keaton.uk.sun.com) > | Richard Teer richard.teer@uk.sun.com | > | WWW: www.rkdltd.demon.co.uk | ftp> get |fortune 377 I/O error: smart remark generator failed Bogonics: the primary language inside the Beltway mailto:rlhamil@mindwarp.smart.net http://www.smart.net/~rlhamil ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Date: 27 Mar 1998 09:12:37 GMT From: vjp2@dorsai.org @smtp.dorsai.org (Vasos Panagiotopoulos +1-917-287-8087 Bioengineer-Financier) Subject: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts I have successfully read Greek from Lynx on Windows using the Courier New Greek font from www.hri.org/fonts about two years ago - (yet it did not work with the www.goarch.org Greek pages - at least not back then - and they told me that since it worked with Windows, they weren't going to bother). But I have never succeeded in doing this with DOS and MSKermit. I have only tried this using the abcgrl program (I thought it should work because kdp works the same way for Japanese). Perhaps I have been setting the character set wrong in Lynx (heck if I even remember what I used in Windows). HRI had a codepage which doesn't look like the typical IBM codepages in name (I forget right now) so I was confused if it would work and if indeed IBM offers a different codepage inside Greece. Where would I be able to find the standard IBM codepage in the USA (preferably on the web)? Is it possible that the problem with the IBM codepage is that it uses an older format and not the (ISO) ELOT 928? (I saw the Greek fonts for Win 3.11 and I don't recall seeing it there, but I might be wrong.) Please excuse my confusion. 
- = - Vasos-Peter John Panagiotopoulos II, Columbia'81+, Bioengineer-Financier, NYC BachMozart ReaganQuayle EvrytanoKastorian http://WWW.Dorsai.Org/~vjp2 vjp2@{MCIMail.Com|CompuServe.Com|Dorsai.Org} ---{Nothing herein constitutes advice. Everything fully disclaimed.}--- .............................................................................. Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Date: 27 Mar 1998 15:33:44 GMT From: Frank da Cruz Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts MS-DOS Kermit has no built-in support for Greek. You would need to find and load a Greek code page that agrees with the host encoding, and then use "set terminal character set transparent". : Where would I be able to find the : standard IBM codepage in the USA (preferably on the web)? : Good question. If you find an answer, please be sure to post it. By the way, Kermit 95 does support Greek: both ELOT 927 and 928. - Frank .............................................................................. Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Followup-To: comp.protocols.kermit.misc,grk.forthnet.users,soc.culture.greek Date: 30 Mar 1998 10:13:53 GMT From: vjp2@dorsai.org @smtp.dorsai.org (Vasos Panagiotopoulos +1-917-287-8087 Bioengineer-Financier) Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts I got gauss.cpi from www.hri.org/fonts but don't know if there's a way to make MS-Kermit alone use it. Since it doesn't work the way the MS-DOS help files say it shoul (but I don't have the Greek files the MS-DOS help files say I need - I'm told only NT 4.0 DOS is country-blind and has them all on one version - I was wondering if there is a web page to find them at? Or do I have to go out and buy "Greek MS-DOS"?). KDP is a Japanese-font utility which allows Kermit to read Japanese (You run Kermit with the DOS command "KDP MSKERMIT") I have used the ABCGRL.COM utility (TSR?) 
to use ELOT fonts in Emacs, VEdit and other DOS programs, but it doesn't seem to work with Kermit (linking to Lynx). In Windows, connecting to Lynx with a terminal emulator, works ok if I set IBM CODEPAGE and RAW MODE. But in MSKermit, this just beeps alot and splatters all over the page when Greek text is encountered. I am also confused, so sorry in advance. - = - Vasos-Peter John Panagiotopoulos II, Columbia'81+, Bioengineer-Financier, NYC .............................................................................. Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Date: 30 Mar 1998 15:36:09 GMT From: Frank da Cruz Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts In article <6fh6l5$1du@news3.euro.net>, Denis Liigeois wrote: : : Just a question: ELOT 928 is ISO-8859-7. What is ELOT 927 ? It is a 7-bit set (like ASCII) in which the lowercase Roman letters are replaced by uppercase Greek letters. - Frank ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals Message-ID: <567ce$a2a1b.32c@news.kea.bc.ca> X-Newsreader: Microsoft Outlook Express 4.72.2106.4 X-Mimeole: Produced By Microsoft MimeOLE V4.72.2106.4 Date: Wed, 6 May 1998 10:42:03 -0700 From: "Michael Simms" Subject: New Euro Currency symbol and DEC terminals Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will support the new Euro Currency symbol. Such as which position in the character sets. Any information would be appreciated. .............................................................................. Date: 6 May 1998 23:34:42 GMT From: "T.E.Dickey" Newsgroups: comp.terminals Subject: Re: New Euro Currency symbol and DEC terminals Michael Simms wrote: : Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will : support the new Euro Currency symbol. Such as which position in the : character sets. Any information would be appreciated. 
Without a hardware change, they won't (it's not an ISO-8859-1 character). -- Thomas E. Dickey dickey@clark.net http://www.clark.net/pub/dickey .............................................................................. Date: 7 May 1998 01:55:26 GMT From: Jeffrey Altman Newsgroups: comp.terminals Subject: Re: New Euro Currency symbol and DEC terminals In article <6iqs2i$agh$2@clarknet.clark.net>, T.E.Dickey wrote: : : Michael Simms wrote: : : : : Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will : : support the new Euro Currency symbol. Such as which position in the : : character sets. Any information would be appreciated. : : without a hardware change, they won't (it's not an ISO-8859-1 character). It will have to be supported as a soft character set or by the addition of additional character sets such as ISO-8859-15 which do include the Euro. FYI, Kermit 95 1.1.17 will support all of the new ISO and IBM Code Page character sets, which include the "Euro". -- Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals Organization: Flashnet Communications, http://www.flash.net Sender: sturgeon@199.165.143.240 Message-ID: <35a101b0.1820019719@news.flash.net> Date: Mon, 06 Jul 1998 17:02:13 GMT From: JonS@futuresoft.com (Jon Stugeon) Subject: Euro currency symbol & dumb terminals/emulators All, I've recently been reading about support for the new Euro currency symbol in the Windows 95/98 & NT O/Ss. This got me thinking if there will be any kind of standard for how legacy host applications will represent the Euro symbol. 
Obviously if the final display device is a physical dumb terminal (eg VT-220) then it won't know anything about the Euro symbol, but if an emulator is being used then it could be configured to display the Euro symbol in place of an existing character. So, which character would be replaced? My guess is that this would be done on an ad-hoc, host-to-host basis, but I'd be glad to be put right. Regards, Jon Sturgeon JonS@futuresoft.com .............................................................................. Newsgroups: comp.terminals Organization: Columbia University Message-ID: <6nr10p$sf$1@apakabar.cc.columbia.edu> References: <35a101b0.1820019719@news.flash.net> NNTP-Posting-Host: watsun.cc.columbia.edu Date: 6 Jul 1998 17:20:25 GMT From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman) Subject: Re: Euro currency symbol & dumb terminals/emulators In article <35a101b0.1820019719@news.flash.net>, Jon Stugeon wrote: : All, : : I've recently been reading about support for the new Euro currency : symbol in the Windows 95/98 & NT O/Ss. This got me thinking if there : will be any kind of standard for how legacy host applications will : represent the Euro symbol. : : Obviously if the final display device is a physical dumb terminal (eg : VT-220) then it won't know anything about the Euro symbol, but if an : emulator is being used then it could be configured to display the Euro : symbol in place of an existing character. : : So, which character would be replaced? My guess is that this would be : done on an ad-hoc, host-to-host basis, but I'd be glad to be put : right. : : Regards, : Jon Sturgeon : JonS@futuresoft.com : Character-sets (including those with Euro support) are defined by the ISO as part of standard 8859. These are to be used by the host. IBM has defined new code pages for the inclusion of the Euro and Microsoft has added the Euro to its existing code pages. Emulators should not make up their own. 
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org .............................................................................. Newsgroups: comp.terminals Organization: Columbia University Lines: 62 Message-ID: <6o0jcc$pqd$1@apakabar.cc.columbia.edu> References: <35a101b0.1820019719@news.flash.net> <35a28fdd.17881822@news.flash.net> <6nuogi$3n2$1@apakabar.cc.columbia.edu> <35a3a03e.87610587@news.flash.net> NNTP-Posting-Host: watsun.cc.columbia.edu Date: 8 Jul 1998 20:04:28 GMT From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman) Subject: Re: Euro currency symbol & dumb terminals/emulators In article <35a3a03e.87610587@news.flash.net>, Jon Stugeon wrote: : : So you're expecting users of host applications to use character : translation features built-into terminal emulation software to choose : to map an arbitrary character in the *host* character set to the : appropriate character representing the Euro in the code page they are : using in their display font? Then if the host application needed to : display the Euro symbol *in addition* to an existing currency symbol : it would need to be modified to be aware of the configuration of the : user's emulation package? : : Surely if there was some kind of standard agreed upon for which : character the host applications will use to represent the Euro then we : wouldn't have another of those cases where the user doesn't get the : correct symbol just because his emulation software isn't configured : correctly. : : Or have I got hold of the wrong end of the stick here? The way that terminals (and emulation software) is supposed to work is that the host application instructs the terminal as to which character-set(s) should be loaded into the G0,G1,G2, and G3 character-set tables. 
These tables are then used to map a byte from the host to a particular
character for display. If the local system does not support the
character-sets used by the application, it must perform translation
to a character-set that it does support.

There are international standards for all of this. The ISO defined
ISO 2022 more than 20 years ago to address the host to terminal
assignment of character-sets and the mechanisms for switching between
them. ISO 8859 defines the agreed upon International character-sets.
Part 15 declares the newly formed Western European character set which
includes the Euro.

IBM maintains the Code Page Registry. As such they introduced new
code pages for both ASCII and EBCDIC systems that include the Euro
for use in their operating systems (DOS and OS/2 on the PC; OS/400;
OS/390, ...).

Microsoft maintains its own Code Pages for Windows which are registered
with IBM as Code Pages 1250-1258. These are based on the ISO 8859
character-sets but include printable characters in the C1 range.

And then, of course, Unicode has defined a position for the Euro in
version 2.1 of that standard (0x20AC).

ISO 2022 was used as the basis for the character-set handling in
ANSI X3.64-1979 (since withdrawn), which was the basis for most Unix
consoles and the DEC VT terminal line. It is also the basis of ISO-6429,
which is the international standard that replaced ANSI X3.64-1979.

Since FutureSoft is a manufacturer of terminal emulation software
I would have expected you to know all this. How can Dynacomm
emulate a VT terminal if it doesn't support this functionality?

--
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
The Kermit Project * Columbia University
612 West 115th St #716 * New York, NY * 10025
http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org

..............................................................................
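
[Editor's sketch, not part of the thread. The code positions discussed
in the post above can be checked against the codec tables that ship
with Python. This assumes Python 3 and its standard "iso8859-1",
"iso8859-15" and "cp1252" codecs; the helper names euro_byte and
glyph_at are invented for this illustration and come from no terminal
or emulator API.]

```python
# Where does the Euro sign live in a few 8-bit character sets?
# Hypothetical helpers for illustration only.

EURO = "\u20ac"  # the Unicode position quoted in the post (0x20AC)


def euro_byte(codec):
    """Return the single byte encoding the Euro in `codec`, or None."""
    try:
        encoded = EURO.encode(codec)
    except UnicodeEncodeError:
        return None  # this character set has no Euro at all
    return encoded[0] if len(encoded) == 1 else None


def glyph_at(byte_value, codec):
    """Return the character that `codec` assigns to one byte value."""
    return bytes([byte_value]).decode(codec)


if __name__ == "__main__":
    for codec in ("iso8859-1", "iso8859-15", "cp1252"):
        position = euro_byte(codec)
        where = "no Euro" if position is None else hex(position)
        print(f"{codec:12s} {where}")
    # The same byte means different things in Latin-1 and Latin-9.
    for codec in ("iso8859-1", "iso8859-15"):
        print(f"0xA4 in {codec}: U+{ord(glyph_at(0xA4, codec)):04X}")
```

Run as a script, this reports no Euro in ISO 8859-1, byte 0xa4 in
ISO 8859-15, and byte 0x80 in Windows code page 1252. Byte 0xA4 is
the generic currency sign (U+00A4) in ISO 8859-1, which is why host
and terminal have to agree on the character set rather than on an
ad-hoc substitute.

..............................................................................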
Newsgroups: comp.terminals
Sender: sturgeon@199.165.143.240
Message-ID: <35a4fe63.111711372@news.flash.net>
References: <35a101b0.1820019719@news.flash.net>
    <35a28fdd.17881822@news.flash.net>
    <6nuogi$3n2$1@apakabar.cc.columbia.edu>
    <35a3a03e.87610587@news.flash.net>
    <6o0jcc$pqd$1@apakabar.cc.columbia.edu>
NNTP-Posting-Host: 199.165.143.240
Date: Wed, 08 Jul 1998 23:23:26 GMT
From: JonS@futuresoft.com (Jon Stugeon)
Subject: Re: Euro currency symbol & dumb terminals/emulators

On 8 Jul 1998 20:04:28 GMT, jaltman@watsun.cc.columbia.edu (Jeffrey
Altman) wrote:

>Since FutureSoft is a manufacturer of terminal emulation software
>I would have expected you to know all this. How can Dynacomm
>emulate a VT terminal if it doesn't support this functionality?

Thanks for the comprehensive reply, Jeffrey.

DynaComm indeed emulates a VT terminal, including support for NRCs,
DEC Supplemental/Graphics etc etc, but that does not necessarily mean
that everybody that works for the manufacturer has the benefit of and
understands the years of history behind character set development.
Furthermore, not everybody that works for FutureSoft works in terminal
emulation.

I am trying to understand what, if any, modifications would be
necessary to provide "support for the Euro"; that is the reason for my
original enquiry.

Regards,
Jon Sturgeon
JonS@futuresoft.com

..............................................................................
Newsgroups: comp.terminals
Date: 9 Jul 1998 04:58:49 GMT
Organization: Columbia University
Message-ID: <6o1im9$cgj$1@apakabar.cc.columbia.edu>
References: <35a101b0.1820019719@news.flash.net>
    <35a3a03e.87610587@news.flash.net>
    <6o0jcc$pqd$1@apakabar.cc.columbia.edu>
    <35a4fe63.111711372@news.flash.net>
NNTP-Posting-Host: watsun.cc.columbia.edu
From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman)
Subject: Re: Euro currency symbol & dumb terminals/emulators

In article <35a4fe63.111711372@news.flash.net>,
Jon Stugeon wrote:
:
: DynaComm indeed emulates a VT terminal, including support for NRCs,
: DEC Supplemental/Graphics etc etc, but that does not necessarily mean
: that everybody that works for the manufacturer has the benefit of and
: understands the years of history behind character set development.
: Furthermore, not everybody that works for FutureSoft works in terminal
: emulation.
:
: I am trying to understand what, if any, modifications would be
: necessary to provide "support for the Euro"; that is the reason for my
: original enquiry.

I apologize for assuming more knowledge than you have at your disposal.
I assumed (obviously incorrectly) that either you would be asking
this question because you are somehow involved in your company's
terminal emulation development; or that you have spoken to your
own developers before asking this query on the Net.

While I am a strong believer in the open sharing of knowledge, I must
admit that I am a bit hesitant to provide a direct competitor with
information that will help it take sales away from my product. On the
other hand, I couldn't let someone come up with yet another hack
solution (that I would end up needing to support for a customer in five
years) because of ignorance.

Hope this thread has been useful.
--
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
The Kermit Project * Columbia University
612 West 115th St #716 * New York, NY * 10025
http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
References: <8kc2os$fkm$1@as102.tel.hr>
Organization: @Home Network
Message-ID:
Date: Mon, 10 Jul 2000 18:43:50 GMT
From: dls2
Subject: Re: Setting keyboard over Esc sequences on VT510/520

"IdrEASY" wrote:
> Hi!
>
> I need to set the Croatian keyboard with an esc sequence
> (SCS=Croatian/Slovenian Latin).
> Also I know that "ESC(&3" is the sequence for Russian Cyrillic.
>
> I wrote a little program with a double loop to generate the esc
> sequences, but only had success with the Russian keyboard.
>
> Please, help me.
>
> Bye!

The "(" represents the (94-character) G0 character set.
The ")" represents the (94-character) G1 character set.
The "*" represents the (94-character) G2 character set.
The "+" represents the (94-character) G3 character set.
The "-" represents the (96-character) G1 character set.
The "." represents the (96-character) G2 character set.
The "/" represents the (96-character) G3 character set.

Russian NRCS is "&5", not "&3".
SCS NRCS is "%3".

So you should be using "ESC(%3".

--
Derrick Shearer

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
References: <8kc2os$fkm$1@as102.tel.hr> <8kho8g$2lr$1@as102.tel.hr>
    <8khpnp$im8$3@news1.Radix.Net>
Date: Wed, 12 Jul 2000 14:04:44 +0100
Organization: RDEL
Newsgroups: comp.terminals
Message-ID: <396C6CEC.ABE87C5E@rdel.co.uk>
From: Paul Williams
Subject: Re: Setting keyboard over esc sequences on VT510/520

Thomas Dickey wrote:
>
> IdrEASY wrote:
> >>
> >> Russian NRCS is "&5", not "&3".
> >> SCS NRCS is "%3".
> > > Sorry, on my terminal (&4 is Russian, but (%3 does nothing. > > what type of terminal is that? > (I'm assuming vt220) It says VT510/520 on the subject line, Tom. (Yes, I hate it when vital information is only mentioned on the subject line!) ////////////////////////////////////////////////////////////////////////////// >.Newsgroups: comp.unix.solaris >.Message-ID: <3B8E4842.D1522C78@lucent.com> >.Organization: Lucent Technologies NL >.NNTP-Posting-Host: hvstsg1.nl.lucent.com >.Date: Thu, 30 Aug 2001 16:05:54 +0200 >.From: Remco >.Subject: Euro sign >. >.Hi, >. >.I am trying to use the Euro sign located on the '4' key on my Sun Type 6 >.USB keyboard. I can't seem to figure out the right key combination. >. >.I have the right locale installed - en_US.ISO8859-15-USA (euro) >. >.Thanks, >. >.Remco. Assuming that you have a recent release of Solaris, containing the fix for the bug 4242046, the keys AltGraph-4, AltGraph-5, and AltGraph-e should generate a "currency" symbol, which with an ISO 8859-15 font should look like a Euro. (In an ISO 8859-1 font, you'd see the generic currency glyph, a circle with four spikes sticking out.) ...Richard S. Shuford ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals References: <9o4io2$pog$2@news1.Radix.Net> <828D1D75F69100C5.30B6032E57AF77F7.41333B6BE47D59DE@lp.airnews.net> Message-ID: Organization: Forest Field OmniMedia Date: 2 Nov 2001 23:32:34 GMT From: J.B. Nicholson-Owens Subject: Re: Line graphics on PuTTY Thomas Dickey wrote: > > I don't believe it [line graphics] works, either (short of tinkering > with its source - I've tried various font settings as well). It somewhat works for me; I think I'm facing a font problem, not something wrong with PuTTY. I'm trying version 0.51 right now and with my TERM set to xterm I can see some line graphics in slrn (a free newsreader). My font is set to Courier, 24-pixel. I don't have the XTerm.fon installed. 
Not all the line graphics work (according to the smgtest in S-Lang, a free library one must compile prior to compiling slrn). When I get around to installing XTerm.fon, I plan to recheck the output from smgtest to see if all the graphics character glyphs are drawn as I expect. ////////////////////////////////////////////////////////////////////////////// X-Sender: -@workstation1.swip.net Newsgroups: comp.unix.bsd.freebsd.misc References: Message-ID: Organization: Tele2/Swipnet Date: Tue, 01 Oct 2002 20:25:55 GMT From: Erik Nygren Subject: Re: keyboard definition, X and bash In article , Keve Nagy wrote: > > Hi Everybody, > > There are two things causing some trouble for a long time. > I would like to type (and read) Hungarian accented letters, both under X > and under bash. > > The more serious is under X. > I would like something which is available under Windows (sorry), to be > able to switch between the standard US 101 layout, and the hungarian 101 > layout, preferrably by pressing Ctrl+Shift to switch from one to the other. > > I use KDE 3 on FreeBSD 4.6R. I used the Control > Center/Peripherals/Keyboard, and enabled a hungarian layout in addition > to the default US English. I can switch between them by clicking the > flag on the taskbar, but I still can not type hungarian letters. > > Similar experience under bash on a VTY. > If I load the hungarian keyboard layout, I am still unable to type > hungarian letters. I enabled LC_CTYPE and LANG. Create a .login_conf in your homedir that looks like this: me:\ :charset=ISO8859-2:\ :lang=hu_HU.ISO8859-2: That should take care of any resistance to accept your characters. Should work in bash as well as vi and others. I guess that should help X as well, but KDE is a little magic to me, so there might be other gotchas... For keyboard-layout in VTY i guess setting keymap="hu.iso2.101keys" in /etc/rc.conf should do the trick, but you probably knew that. 
-- Erik Nygren e r i k { a t } s w i p { d o t } n e t Linux--If you hate Microsoft, FreeBSD--If you love Unix! ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.protocols.kermit.misc, comp.unix.sco.misc, comp.terminals References: <8ce22d01.0211051340.4391f5cb@posting.google.com> Message-ID: Organization: Columbia University Date: 5 Nov 2002 16:49:13 -0500 From: Frank da Cruz Subject: Re: Kermit (K95), SCO Openserver, Progress, and Linux. In article <8ce22d01.0211051340.4391f5cb@posting.google.com>, Dan Skinner wrote: : : I manage an application done in Progress which has traditionally run on : SCO OS3 and OS5 using several terminal emulators, currently Kermit-95. : We use "scoansi" emulation and all is well. : : We are porting to Linux. Have a system running and we are down to : the nagging details. Linux emulation is not good because the : progress graphics are not supported (box drawing), scoansi works well : in Progress but Linux complains about not being fully functional : (ie: pg). and the silly "ls" color stuff is broken. Any suggestions : appreciated. : K95's Linux terminal emulation is fine; it supports box drawings and color; I assume you have K95's terminal type set to Linux. So the question is whether your application sending the right stuff. Does it rely on termcap/terminfo/curses? If so, maybe there is some confusion over the names of the fields or the syntax of their values. It's also possible that you've chosen a terminal character set in K95 that does not agree with what the application thinks you have. I've copied the SCO newsgroup on this reply -- I expect others there have done similar conversions and can offer some hints. Also the terminals newsgroup, where people who know termcap/terminfo/[n]curses hang out. - Frank .............................................................................. 
Newsgroups: comp.protocols.kermit.misc, comp.unix.sco.misc, comp.terminals
References: <8ce22d01.0211051340.4391f5cb@posting.google.com>
Message-ID: <8ce22d01.0211060948.78805fbe@posting.google.com>
Date: 6 Nov 2002 09:48:32 -0800
From: Dan Skinner
Subject: Re: Kermit (K95), SCO Openserver, Progress, and Linux.

Frank da Cruz wrote:
>
> K95's Linux terminal emulation is fine; it supports box drawings and
> color; I assume you have K95's terminal type set to Linux. So the
> question is whether your application is sending the right stuff. Does it
> rely on termcap/terminfo/curses? If so, maybe there is some confusion
> over the names of the fields or the syntax of their values.
>
> It's also possible that you've chosen a terminal character set in K95
> that does not agree with what the application thinks you have.
>
> I've copied the SCO newsgroup on this reply -- I expect others there
> have done similar conversions and can offer some hints. Also the
> terminals newsgroup, where people who know termcap/terminfo/[n]curses
> hang out.
>
> - Frank

Thanks Frank;

I've been doing some experimental research and have found the following.

It seems that all functions of the K95 Linux terminal emulation work
except the escape to and from graphics mode (GS and GE). The termcap
(protermcap in Progress) is set to GS=^N and GE=^O, and these work on the
Linux console. I understand that Linux display codes are like vt100, and
when I check the vt100 termcap entries I find GS=^N and GE=^O, and this
works if I set K95 to vt100. Lots of other stuff is broken, but the box
drawing works with Linux TERM=linux and K95 emulation set to vt100.

When K95 emulation is set to 'linux', the box drawing characters are the
un-escaped values of G1 through GV.

When I null the escape codes (GS=\000 and GE=\000) and put in corners of +
and lines of | & -, both the linux console and k95 in linux emulation give
the same result.
For your information show char yields:

 Transfer Translation: on
 File Character-Set: latin1-iso (ISO 8859-1 Latin-1), 8-bit
 File Scan: on
 Default 7bit-Character-Set: ascii
 Default 8bit-Character-Set: cp437
 Transfer Character-Set: Transparent
 SEND character-set-selection: automatic
 RECEIVE character-set-selection: manual
 (Use SHOW ASSOCIATIONS to list automatic character-set selections.)
 Unknown-Char-Set: Keep

 Terminal character-sets:
   Mode:   8-bit Multinational Mode
   Local:  Unicode display / Windows Code Page 1252 input
   Remote: GL->G0: US ASCII (94 chars)
               G1: US ASCII (96 chars)
           GR->G2: ISO Latin-1 (94 chars)   <<<<
               G3: DEC Special Graphics (94 chars)

 Keyboard character-sets:
   Multinational: PC Code Page 437
   National:      US ASCII

 Code Pages:
   Active: 1252

Are you sure this is not a K95 issue? Again, any help appreciated.

Regards...Dan.

..............................................................................

Newsgroups: comp.protocols.kermit.misc, comp.unix.sco.misc, comp.terminals
References: <8ce22d01.0211051340.4391f5cb@posting.google.com>
 <8ce22d01.0211060948.78805fbe@posting.google.com>
Message-ID:
Organization: Columbia University
Date: 6 Nov 2002 18:11:16 GMT
From: Jeffrey Altman
Subject: Re: Kermit (K95), SCO Openserver, Progress, and Linux.

You have your remote character set configured for ISO Latin 1, which does
not contain graphics characters. Set your remote character set to CP437
and you will experience the desired behavior.

--
 Jeffrey Altman * Sr.Software Designer     Kermit 95 2.0 GUI available now!!!
 The Kermit Project @ Columbia University  SSH, Secure Telnet, Secure FTP, HTTP
 http://www.kermit-project.org/            Secured with MIT Kerberos, SRP, and
 kermit-support@columbia.edu               OpenSSL.

..............................................................................
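[ Archiver's Note: The mismatch Jeffrey describes can be seen by decoding
  the same bytes under both character sets. The sketch below was added to
  this archive for illustration and is not part of the thread. It assumes
  Python 3 with its standard "cp437" and "latin-1" codecs; the byte values
  are the octal codes from the G1-G4, GH and GV fields of the revised
  protermcap entry that appears later in this thread. The names BOX_BYTES,
  decode_both and table are made up for the example. ]

```python
import unicodedata

# Octal codes copied from the G1..G4, GH and GV fields of the termcap entry.
BOX_BYTES = {
    "G1": 0o277,
    "G2": 0o332,
    "G3": 0o300,
    "G4": 0o331,
    "GH": 0o304,
    "GV": 0o263,
}


def decode_both(value):
    """Return (cp437 character, latin-1 character) for one byte value."""
    raw = bytes([value])
    return raw.decode("cp437"), raw.decode("latin-1")


table = {name: decode_both(value) for name, value in BOX_BYTES.items()}

# Print Unicode character names rather than the glyphs themselves, so the
# output is readable even on a terminal that cannot draw the characters.
for name, (pc_char, latin_char) in table.items():
    print(f"{name} \\{BOX_BYTES[name]:03o}")
    print(f"    cp437  : {unicodedata.name(pc_char)}")
    print(f"    latin-1: {unicodedata.name(latin_char)}")
```

  Under cp437 each of these bytes decodes to a BOX DRAWINGS character;
  under Latin-1 the same bytes decode to accented letters and symbols
  (\332, for instance, is LATIN CAPITAL LETTER U WITH ACUTE), which is
  consistent with a Latin-1 remote character set showing no line graphics.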
Newsgroups: comp.protocols.kermit.misc, comp.unix.sco.misc, comp.terminals References: <8ce22d01.0211051340.4391f5cb@posting.google.com> <8ce22d01.0211060948.78805fbe@posting.google.com> Message-ID: <8ce22d01.0211070925.114fe5fe@posting.google.com> Date: 7 Nov 2002 09:25:49 -0800 From: Dan Skinner Subject: Re: Kermit (K95), SCO Openserver, Progress, and Linux. JDanSkinner@JDanSkinner.com (Dan Skinner) wrote: > > > : I manage an application done in Progress which has traditionally run > > : on SCO OS3 and OS5 using several terminal emulators, currently K95. > > : We use scoansi emulation and all is well. > > : We are porting to Linux. Have a system running and we are down to > > : the nagging details. Linux emulation is not good because the > > : progress graphics are not supported (box drawing), scoansi works well > > : in Progress but Linux complains about not being fully functional > > : (ie: pg). and the silly ls color stuff is broken. Any suggestions > > : appreciated. Jeffrey at Kermit Support has provided the information and background information which made it possible for me to solve my problem. In a nutshell the problem was in the private termcap for Progress. The is= string in the Progress Linux termcap string sends the escape sequence (B setting the G2 character-set to ISO Latin1 as opposed to (U which sets G2 character-set to cp437 as required by my particular application environment. This solved the problem I was having with character drawn screen logos. With this change the Linux console works pretty well. The linux emulation Kermit box drawing still presented the un-shifted Gx values of "jklmqx". This I solved with a technique Progress used in previous SCO Unix ansi termcap's. This is to null out GS and GE and to replace G1 - GV with the octal values of the line drawing characters. I'll take this opportunity to praise Kermit support. They live up to the quote "AND . . . 
Super-responsive technical support: we stand behind our products and support them vigorously." The session log Jeffery suggested presented the (B like a slap in the face, (as soon as I took the time to record it!) With their help we have successfully married Progress, SCO Open Server, Linux (Mandrake), and Kermit (K95). Regards, Dan. .............................. The following is the revised Linux termcap for Progress protermcap: #linux linux|linux-lat|linux console:\ :START-RESIZE(ESC-1)=\E1:\ :GO(F1)=\E[[A:\ :GO(CTRL-X)=^x:\ :HELP(F2)=\E[[B:\ :ENTER-MENUBAR(F3)=\E[[C:\ :END-ERROR(F4)=\E[[D:\ :GET(F5)=\E[[E:\ :PUT(F6)=\E[17~:\ :RECALL(F7)=\E[18~:\ :CLEAR(F8)=\E[19~:\ :CLEAR(CTRL-Z)=^z:\ :INSERT-MODE(CTRL-T)=^t:\ :CUT(F10)=\E[21~:\ :COPY(F11)=\E[23~:\ :PASTE(F12)=\E[24~:\ :BACKSPACE(BACKSPACE)=^?:\ :HOME(HOME)=\E[1~:\ :DELETE(DELETE)=\E[3~:\ :END(END)=\E[4~:\ :PAGE-UP(PAGE-UP)=\E[5~:\ :PAGE-DOWN(PAGE-DOWN)=\E[6~:\ :BLOCK(CTRL-V)=^v:\ :HOME(ESC-<)=\E<:\ :END(ESC->)=\E>:\ :is=\E>\E[?3l\E[?4l\E[m\E[?7h\E[?8h\E(U\E)0:\ :nd=2\E[C:\ :do=\E[B:\ :cl=50\E[;H\E[2J:\ :cm=5\E[%i%d;%dH:\ :so=2\E[7m:\ :DELETE-COLUMN(ESC-CTRL-Z)=\E[4:\ :se=2\E[m:\ :us=2\E[4m:\ :ue=2\E[m:\ :GS=\000:\ :GE=\000:\ :G1=\277:\ :G2=\332:\ :G3=\300:\ :G4=\331:\ :GC=n:\ :GD=w:\ :GH=\304:\ :GL=u:\ :GR=t:\ :GU=v:\ :GV=\263:\ :HS=2\E[1m:\ :HR=2\E[m:\ :BB=2\E[5m:\ :BR=2\E[m:\ :ks=\E[?1h\E=:\ :ke=\E[?1l\E>:\ :cd=10\E[J:\ :ce=10\E[K:\ :co#80:\ :kd=\E[B:\ :kl=\E[D:\ :kr=\E[C:\ :ku=\E[A:\ :li#24:\ :up=\E[A:\ :xi:\ :cs=\E[%i%d;%dr:\ :sr=\EM:\ :sf=\n:\ :GO(PF1)=\EOP:\ :HELP(PF2)=\EOQ:\ :ENTER-MENUBAR(PF3)=\EOR:\ :END-ERROR(PF4)=\EOS:\ :PAGE-UP(ESC-UP-ARROW)=\E\E[A:\ :PAGE-DOWN(ESC-DOWN-ARROW)=\E\E[B:\ :LEFT-END(ESC-LEFT-ARROW)=\E\E[D:\ :RIGHT-END(ESC-RIGHT-ARROW)=\E\E[C:\ :ku=\E[A: :L_ku=:\ :kd=\E[B: :L_kd=:\ :kr=\E[C: :L_kr=:\ :kl=\E[D: :L_kl=:\ :bc=\177: :.L_bc:\ :kh=\Eh: :L_kh= h:\ :EN=\Ee: :L_EN= e:\ :PU=^U: :L_PU=:\ :PD=^K: :L_PD=:\ :ki=\Ei: :L_ki= i:\ :DL=^X: :L_DL=:\ :ESC=\E\E: :L_ESC= :\ :bt=\Eb: :L_bt= 
b:\ :fk4=\EOP: :L_fk4=:\ :fk1=\EOQ: :L_fk1=:\ :fk2=\EOR: :L_fk2=:\ :fk3=\EOS: :L_fk3=:\ :fk5=\E6: :L_fk5= 6:\ :fk6=\E7: :L_fk6= 7:\ :fk7=\E8: :L_fk7= 8:\ :Aka=^k: :L_Aka=Ctrl-K:\ :Akd=^z: :L_Akd=Ctrl-Z:\ :Akp=^r: :L_Akp=Ctrl-R:\ :Aks=^l: :L_Aks=Ctrl-L:\ :Aku=\Em: :L_Aku=Esc-M:\ :Akw=^g: :L_Akw=Ctrl-G:\ :Aki=^e: :L_Aki=Ctrl-E:\ :tc=v7kf: ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals Message-ID: Date: 17 Nov 2002 16:28:56 -0800 From: Cyril Chevrot Subject: Can't change the character set from UTF-8 to iso-8859-1 I don't arrive to change the character set of my terminal emulators (xterm, konsole 1.1.3, gnome 2.0.1). All are set to UTF-8 (Unicode) or I would like to use an iso-8859-1 (latin1) charset. I have tried to change the value of the LANG environment variable from fr_FR.UTF-8 to fr_FR.iso-8859-1. But it doesn't work ! :(( Apparently, all my terminal emulators support only Unicode. Indeed the SUPPORTED environment variable is set to fr_FR.UTF-8:fr_FR:fr. How can I change that ? Thanks in advance -- Cyril .............................................................................. Newsgroups: comp.terminals References: Message-ID: Organization: http://groups.google.com/ Date: 20 Nov 2002 05:33:05 -0800 From: Cyril Chevrot Subject: Re: Can't change the character set from UTF-8 to iso-8859-1 Thomas Dickey wrote in message news:... > Cyril Chevrot wrote: > > Thomas Dickey wrote in message news:... > >> Cyril Chevrot wrote: > >> > I don't arrive to change the character set of my terminal emulators > >> > (xterm, konsole 1.1.3, gnome 2.0.1). > >> > All are set to UTF-8 (Unicode) or I would like to use an iso-8859-1 > >> > (latin1) charset. > > >> > I have tried to change the value of the LANG environment variable from > >> > fr_FR.UTF-8 to fr_FR.iso-8859-1. > >> > >> fr_FR.UTF-8 to fr_FR > > > > It doesn't work :( > > Redhat sets several environment variables - did you change all of them? 
> (Actually only LANG is needed, of course). I have found the answer to my question. And the answer is: modify the /etc/sysconfig/i18n file. -- Cyril .............................................................................. Newsgroups: comp.terminals References: Message-ID: Date: 30 Jan 2003 10:43:25 -0800 From: Yves St-Arnaud Subject: Re: TeraTerm. yves_st_arnaud@hotmail.com (Yves St-Arnaud) wrote in message news:... > Hi everybody, > > I want my printer, a POS Epson TM-T88III, to print french characters (éèê...) > from teraterm. > > Thanks. > > Yves St-Arnaud I found the solution. In teraterm.ini, with this line PrnFont=Terminal,0,-10,0 Thanks. .............................................................................. Newsgroups: comp.terminals References: Message-ID: Date: 31 Jan 2003 06:14:30 -0800 From: Yves St-Arnaud Subject: Re: TeraTerm. yves_st_arnaud@hotmail.com (Yves St-Arnaud) wrote: > > I found the solution. In teraterm.ini, with this line > PrnFont=Terminal,0,-10,0 > > Thanks. Oops, problem is coming back. It's OK if I use the item in the menu of Teraterm, but not when the Unix application prints a report !!!! ////////////////////////////////////////////////////////////////////////////// Organization: Columbia University Newsgroups: comp.protocols.kermit.misc Message-ID: References: <6d5o4vgob49hjtibic9oj7rplgf0rp03qd@4ax.com> Date: 13 Feb 2003 17:26:26 -0500 From: Frank da Cruz Subject: Re: RedHat 8.0 Linux and K-95 Character Set In article <6d5o4vgob49hjtibic9oj7rplgf0rp03qd@4ax.com>, Ron Heiby wrote: : fdc@columbia.edu (Frank da Cruz) wrote: : > : > Andale Mono WT J is an excellent Unicode fixed-pitch font, so naturally it : > does not come free with Windows; you have to buy it from Agfa Monotype. : : I went to the Agfa Monotype web site to find out pricing for this font. : I found "Andale Mono", but not "Andale Mono WT J". Any difference? 
A world of difference, quite literally :-) : Also, does K-95 care whether I have it in TrueType or Postscript? TrueType. We discuss the font issue somewhere in the documentation, although perhaps not with as much candor as the following: Andale Mono WT J is exactly the font every Kermit user ever wanted, but the vendor currently doesn't sell it retail. It's an OEM font. They wanted to license it to us for inclusion in K95, but it would have more than doubled the price of the shrinkwrap, and could not even have been included in bulk or academic site licenses without adding several 0's to the price. I told them they should start selling it retail. One way to convince them is for people to ask. Send e-mail to support@monotype.com. If they get a lot of these, maybe they'll put it on the market. It won't be cheap, but it might be affordable, and once it starts moving the price might come down. Meanwhile EMT is our attempt to fill the gap, but it needs a lot of work. We simply don't have the font expertise, tools, and [wo]manpower that Monotype does. Still EMT is not bad at all at certain sizes in certain color combinations (such as black on white) for text applications. The main problems with EMT are what many people characterize as its thinness or lightness, the loose line spacing, and the failure of box- and line-drawing characters to align. (This stuff is not hard to do in a bitmap font like Terminal, but it's REALLY hard to get right with a well-populated Unicode font). For the record, EMT was contributed by volunteers who enjoy working on fonts and promoting Unicode, and we're grateful for it. We hope it will be improved in future releases. 
EMT includes Latin, Cyrillic, Greek, Coptic, Arabic, Hebrew, Armenian, Georgian, Runes, Ogham, Canadian Syllabics, Cherokee, Math, Symbols, Line and Box Drawing Characters (including the new Unicode 3.1 Terminal Emulation Characters proposed by us), Dingbats, and APL, which is way more than any monospace font that comes with Windows or, to my knowledge, any that you can download for free. (If I'm wrong, I'd like to find out!) Andale Mono WT J includes all of that (except the new terminal emulation characters), plus Chinese, Japanese, Korean (Hangul), Indic, Ethiopic, Syriac, Thai, Lao, Braille, and some other stuff, and its line/box characters mostly line up and join correctly. - Frank ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.protocols.kermit.misc Message-ID: References: <6d5o4vgob49hjtibic9oj7rplgf0rp03qd@4ax.com> Date: 16 Feb 2003 00:15:51 GMT From: those who know me have no need of my name Subject: Re: RedHat 8.0 Linux and K-95 Character Set in comp.protocols.kermit.misc i read: > Andale Mono WT J is exactly the font every Kermit user ever wanted, but the > vendor currently doesn't sell it retail. > Meanwhile EMT is our attempt to fill the gap, but it needs a lot of work. Microsoft has some better fonts available for download too (though they seem to have pulled some lately) at http://www.microsoft.com/typography/fonts/ E.g., Trebuchet MS is supposed to be fully WGL4 including greek and cyrillic, and Arial Unicode MS is positively huge at 24 megabytes and boasts that it has glyphs for every unicode 2.0 code point. 
--
bringing you boring signatures for 17 years

//////////////////////////////////////////////////////////////////////////////

Organization: Columbia University
Newsgroups: comp.protocols.kermit.misc
Message-ID:
References: <6d5o4vgob49hjtibic9oj7rplgf0rp03qd@4ax.com>
Date: 16 Feb 2003 13:48:16 -0500
From: Frank da Cruz
Subject: Re: RedHat 8.0 Linux and K-95 Character Set

In article ,
those who know me have no need of my name wrote:
: in comp.protocols.kermit.misc i read:
:
: >Andale Mono WT J is exactly the font every Kermit user ever wanted, but the
: >vendor currently doesn't sell it retail.
:
: >Meanwhile EMT is our attempt to fill the gap, but it needs a lot of work.
:
: microsoft has some better fonts available for download too (though they
: seem to have pulled some lately) at
: http://www.microsoft.com/typography/fonts/, e.g., Trebuchet MS is
: supposed to be fully wgl4 including greek and cyrillic...

It's not a monospace font, which is what you need in a terminal emulator.

: and Arial Unicode MS is positively huge at 24 megabytes and boasts that it
: has glyphs for every unicode 2.0 code point.

But isn't monospace.

- Frank

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
Message-ID:
References: <9p9r1h$3hn$2@pita.alt.net> <9pb4e1$dkf$0@pita.alt.net>
Date: Tue, 2 Oct 2001 16:40:07 +0200
From: Malik Cherif
Subject: Re: KEYBOARD HOW TO

cypher@punk.net wrote in message news:9pb4e1$dkf$0@pita.alt.net...
> #
> # wrote in message news:9p9r1h$3hn$2@pita.alt.net...
> # > Malik Cherif wrote:
> # > # Hi all ,
> # > # How can I change my keyboard layout into Arabic?
> # >
> # > For a practical joke on a fellow office worker?
>
> Malik Cherif wrote:
> # I'm serious, it's for a project, and I need this, I'm not joking, trust
> # me. Because I need this function on a Solaris Oracle-based environment.
> First, locate the language packages on the install CD.
>
> Then, you need to change the LOCALE.
(pronounced "low-cal") > > > # LC_MESSAGES=C > # LC_TIME=en_US.ISO8859-1 > # LC_NUMERIC=en_US.ISO8859-1 > # LC_CTYPE=en_US.ISO8859-1 > # LC_MONETARY=en_US.ISO8859-1 > # LC_COLLATE=en_US.ISO8859-1 > > I'm not sure where. [ Archiver's Note: try /etc/default/init ] ////////////////////////////////////////////////////////////////////////////// Newsgroups: alt.solaris.x86 Organization: University of Hannover (RRZN) Message-ID: References: <3C446017.6364F12B@company.com> Date: 16 Jan 2002 13:30:43 GMT To: John Smith From: Gerd Marquardt Subject: Re: missing keys in solaris 8 - german keyboard In article <3C446017.6364F12B@company.com>, John Smith wrote: |> Hello, |> |> I am running Solaris 8 with a German keyboard and the latest recommended |> patch cluster. The German keys work fine, but unfortunately some other |> keys are dead (i.e. "|" or ">"). Before I upgraded to Solaris 8, these |> keys worked fine with Solaris 7. |> |> Any ideas? |> |> Thanks You can define the missing keys with: xmodmap -e "keycode 131 = less greater bar" xmodmap -e "keycode 48 = SunFA_Acute SunFA_Grave" xmodmap -e "keycode 49 = asciicircum degree" xmodmap -e "keycode 71 = Udiaeresis" xmodmap -e "keycode 72 = plus asterisk asciitilde" xmodmap -e "keycode 93 = Odiaeresis" xmodmap -e "keycode 94 = Adiaeresis" xmodmap -e "keycode 95 = numbersign apostrophe grave" Put these commands in one of the xinit scripts. -- Gerd Marquardt RRZN / Universitaet Hannover marquardt@rrzn.uni-hannover.de Schlosswender Str. 5 Tel. +49-511-762-4727 D-30159 Hannover fax: +49-511-762-3003 ////////////////////////////////////////////////////////////////////////////// \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ Newsgroups: comp.unix.misc Message-ID: Date: 8 Apr 2003 02:17:43 -0700 From: Erudam Subject: C of Lang=C Hello. I'm studying language used in localization field, and can't find out what C stands for when declaring 'Lang=C'. Is C for Character, Country or C language? 
..............................................................................

Newsgroups: comp.unix.misc
Message-ID: <3e92e9ca$0$29716$4d4ebb8e@read.news.nl.uu.net>
References:
Organization: Hiscom b.v.
Date: Tue, 08 Apr 2003 17:24:58 +0200
From: Corni Beerse
Subject: Re: C of Lang=C

Erudam wrote:
>
> Hello.
>
> I'm studying language used in localization field,
> and can't find out what C stands for when declaring 'Lang=C'.
> Is C for Character, Country or C language?

The language C is the computer language as accepted by the C compiler `cc`.
There are two variants: the original, by Kernighan and Ritchie, hence K&R C,
and the ANSI-standardized one, ANSI C.

Since Unix and C are somehow written in each other, C is the standard
language on Unix. Once Unix got localized and translated, somehow, somewhere,
the default was called C to avoid the battle over whether it should be called
Australian-English or Canadian-English (or some other English variant).

Hence, the default language was called C and no one argues about it being
standardized or standardised; in C it is right both ways.

CBee

..............................................................................

Date: 8 Apr 2003 15:12:13 +0200
Newsgroups: comp.unix.misc
Message-ID: <3e92caad@news.uni-ulm.de>
References:
From: Sven Mascheck
Subject: Re: C of Lang=C

Thomas Dickey wrote:
>
> my understanding of "LANG=C" is that it corresponds to POSIX C (US-ASCII
> with no locale modifications, i.e., 7-bit ASCII).

I guess you meant ISO C, formerly ANSI C.
(ASCII due to emphasis on a _portable_ character set.)

"POSIX" is just an alias for the value "C", and is used in the POSIX
standard from the IEEE as well as in the SUSvX from The Open Group.
(Meanwhile, POSIX is approved jointly.)

(And to avoid ambiguity: there's only one ASCII, 7-bit.)

Sven

..............................................................................
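[ Archiver's Note: One visible effect of the "C" locale discussed above is
  that strings collate by plain character code, with no language-specific
  rules. The sketch below was added to this archive for illustration and is
  not part of the thread. It assumes Python 3 and its standard locale
  module; the word list and variable names are made up for the example. ]

```python
import locale

# Select the minimal "C" locale for every category, much as running a
# program with LANG=C (and no LC_* overrides) would.
selected = locale.setlocale(locale.LC_ALL, "C")

words = ["banana", "Cherry", "apple", "Banana"]

# strxfrm() transforms a string so that ordinary comparison follows the
# current locale's collation rules. In the "C" locale that is simply
# character-code order, so every capital letter sorts before any
# lower-case letter.
c_order = sorted(words, key=locale.strxfrm)

print("locale in effect:", selected)
print("collation order :", c_order)
```

  With the "C" locale in effect the capitalised words come first ("Banana",
  "Cherry", then "apple", "banana"). A language locale such as en_US would
  typically interleave upper and lower case instead, but that result depends
  on which locales the system has installed.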
Newsgroups: comp.unix.misc Message-ID: References: <3e92caad@news.uni-ulm.de> Date: 8 Apr 2003 20:43:56 GMT From: Thomas Dickey Subject: Re: C of Lang=C Sven Mascheck wrote: > Thomas Dickey wrote: >> Erudam wrote: >>> [...] when declaring 'Lang=C'. Is C for Character, Country or C language? >> >> my understanding of "LANG=C" is that it corresponds to POSIX C (US-ASCII >> with no locale modifications, i.e., 7-bit ASCII). > I guess you meant ISO C, formerly ANSI C. (ASCII due to emphasis on a I was thinking of the POSIX locale actually (the language of course is ISO, but the context for comp.unix.misc would be POSIX, anyway). Your clarification is useful (I tend to leave out details). > (And to avoid ambiguity: there's only one ASCII, 7-bit.) ...true (there's no such thing as "extended-ASCII" ;-) -- Thomas E. Dickey ////////////////////////////////////////////////////////////////////////////// From alanc@alum.calberkeley.org Mon May 19 15:49:30 2003 Newsgroups: comp.unix.admin, comp.unix.solaris References: <3EC8BDEE.1070202@dot.com> Message-ID: Organization: University of California, Berkeley Date: Mon, 19 May 2003 14:18:58 +0000 (UTC) From: Alan Coopersmith Subject: Re: Character set A writes in comp.unix.solaris: |Hi, | I use solaris 2.6 and it has /usr/pub/ascii that has all the list of |ascii characters. But when I use other characters like | |echo "\0255" or "\0254" or "\0253" etc. they give some characters too. |My ascii file in /usr/pub has only entries till 177 , so where does |these extended characters get picked from? Which file is it? ASCII only defines characters in the lower half of the 8-bit range (i.e. 0-127 decimal or 0-0177 octal). Characters beyond that are defined differently depending on which character set you're using to extend ascii - ISO-8859-1, ISO-8859-15, & UTF-8 are most common in English/European locales. Other ISO-8859-* are defined for other character sets as well. (I don't think Solaris 2.6 supported UTF-8 though.) 
/usr/pub/iso & /usr/pub/utf-8 contain the charts with these values.

What the characters appear as depends on what locale you are using.

--
________________________________________________________________________
Alan Coopersmith                           alanc@alum.calberkeley.org
http://www.CSUA.Berkeley.EDU/~alanc/       aka: Alan.Coopersmith@Sun.COM
Working for, but definitely not speaking for, Sun Microsystems, Inc.

..............................................................................

Newsgroups: comp.unix.admin, comp.unix.solaris
References: <3EC8BE52.3050906@dot.com>
Message-ID:
Organization: Timetravellers Anonymous
Date: Mon, 19 May 2003 14:13:43 -0000
From: Richard L. Hamilton
Subject: Re: Character set

In article <3EC8BE52.3050906@dot.com>, A writes:
>
> Hi,
> I use Solaris 2.6 and am unable to find some chars in the
> /usr/pub/ascii file where ascii chars are found.
>
> For example ' echo "\0255" gives "-"
> ' echo "\0254" gives "¬"
> ' echo "\0253" gives "«"
> etc..
>
> While the /usr/pub/ascii file shows only till 177, where do I get the
> 255, 254,253,etc.. Are these extended characters? Is there any other
> file it picks these characters from?

ASCII proper is a 7-bit character set, with character codes running only
up to 127 decimal (177 octal). Character sets such as ISO8859-1 (or
ISO8859-15) are supersets of ASCII, with up to twice as many characters.
(And then there's Unicode, which has thousands of characters, but it's a
multi-byte code.)

Here's /usr/pub/iso off of Solaris 8, with all the 8-bit characters in it
(although the codes are in hexadecimal rather than octal).

| 00 nul| 01 soh| 02 stx| 03 etx| 04 eot| 05 enq| 06 ack| 07 bel|
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle| 11 dc1| 12 dc2| 13 dc3| 14 dc4| 15 nak| 16 syn| 17 etb|
| 18 can| 19 em | 1a sub| 1b esc| 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21  ! | 22  " | 23  # | 24  $ | 25  % | 26  & | 27  ' |
| 28  ( | 29  ) | 2a  * | 2b  + | 2c  , | 2d  - | 2e  . | 2f  / |
| 30  0 | 31  1 | 32  2 | 33  3 | 34  4 | 35  5 | 36  6 | 37  7 |
| 38  8 | 39  9 | 3a  : | 3b  ; | 3c  < | 3d  = | 3e  > | 3f  ? |
| 40  @ | 41  A | 42  B | 43  C | 44  D | 45  E | 46  F | 47  G |
| 48  H | 49  I | 4a  J | 4b  K | 4c  L | 4d  M | 4e  N | 4f  O |
| 50  P | 51  Q | 52  R | 53  S | 54  T | 55  U | 56  V | 57  W |
| 58  X | 59  Y | 5a  Z | 5b  [ | 5c  \ | 5d  ] | 5e  ^ | 5f  _ |
| 60  ` | 61  a | 62  b | 63  c | 64  d | 65  e | 66  f | 67  g |
| 68  h | 69  i | 6a  j | 6b  k | 6c  l | 6d  m | 6e  n | 6f  o |
| 70  p | 71  q | 72  r | 73  s | 74  t | 75  u | 76  v | 77  w |
| 78  x | 79  y | 7a  z | 7b  { | 7c  | | 7d  } | 7e  ~ | 7f del|
| a0 nbs| a1  ¡ | a2  ¢ | a3  £ | a4  ¤ | a5  ¥ | a6  ¦ | a7  § |
| a8  ¨ | a9  © | aa  ª | ab  « | ac  ¬ | ad  ­ | ae  ® | af  ¯ |
| b0  ° | b1  ± | b2  ² | b3  ³ | b4  ´ | b5  µ | b6  ¶ | b7  · |
| b8  ¸ | b9  ¹ | ba  º | bb  » | bc  ¼ | bd  ½ | be  ¾ | bf  ¿ |
| c0  À | c1  Á | c2  Â | c3  Ã | c4  Ä | c5  Å | c6  Æ | c7  Ç |
| c8  È | c9  É | ca  Ê | cb  Ë | cc  Ì | cd  Í | ce  Î | cf  Ï |
| d0  Ð | d1  Ñ | d2  Ò | d3  Ó | d4  Ô | d5  Õ | d6  Ö | d7  × |
| d8  Ø | d9  Ù | da  Ú | db  Û | dc  Ü | dd  Ý | de  Þ | df  ß |
| e0  à | e1  á | e2  â | e3  ã | e4  ä | e5  å | e6  æ | e7  ç |
| e8  è | e9  é | ea  ê | eb  ë | ec  ì | ed  í | ee  î | ef  ï |
| f0  ð | f1  ñ | f2  ò | f3  ó | f4  ô | f5  õ | f6  ö | f7  ÷ |
| f8  ø | f9  ù | fa  ú | fb  û | fc  ü | fd  ý | fe  þ | ff ^? |

--
mailto:rlhamil@mindwarp.smart.net
http://www.smart.net/~rlhamil

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
NNTP-Posting-Host: 194.106.105.145
References: <66630471.0501030604.7fc43818@posting.google.com>
Message-ID: <41d9bba2@news.infonet.ee>
Organization: Microlink Eesti AS
Date: 3 Jan 2005 23:39:46 +0200
From: Toomas Soome
Subject: Re: National (Latvian) fonts in X, Firefox, works with Linux

Normunds Jekabsons wrote:
>
> I tried to install font server on Linux side + checked it
> from another Linux box (by replacing
> font paths to my server in XFree86 config file).
> Everything seems to be OK for 2 Linux boxes.
> However, the command > > xset +fp tcp/192.168.1.1 > > in Solaris xterm (Solaris X server) > gave no success... > > May be I have to find a configuration > script for Solaris Xserver in order to replace native font > support to my Linux font server at boot? a. use lt_LT.ISO8859-13 locale b. set up your font path or configure font server. fonts are in /usr/openwin/lib/locale/iso_8859_13/ and font server config file is: /usr/openwin/lib/X11/fontserver.cfg update it and pkill fs.auto toomas ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.dcom.telecom Path: cs.utk.edu!darwin.sura.net!spool.mu.edu!telecom-request Reply-To: jjmhome!pig!die@transfer.stratus.com Message-ID: Organization: Opinion Mongers Incorperated... Sender: telecom@eecs.nwu.edu Approved: telecom@eecs.nwu.edu X-Submissions-To: telecom@eecs.nwu.edu X-Administrivia-To: telecom-request@eecs.nwu.edu X-Telecom-Digest: Volume 13, Issue 665, Message 2 of 12 Lines: 69 Date: 26 Sep 1993 16:03:52 GMT From: jjmhome!pig!die@transfer.stratus.com (Dave Emery) Subject: Re: Information Wanted on Six-bit Code In article johan@tts.lth.se (Johan M Karlsson) writes: > I just wonder if anybody know anything about the Six-bit code called > TTS, that was used by many newspapers in the 70's to receive stories > from the wire services. Like what does the letters TTS stand for? TTS standards for TeleTypeSetter. Indeed it is a 6-bit code which was developed by AT&T's now defunct Teletype subsidiary in the early 50s as a means of inputing news stories direct to Linotype machines. As such it incorporates the special control characters that operate Linotype machines such as upper rail and lower rail shifts and em space and en space. Originally in the days long before computers in the pockets of every reporter, the wire services had computerized systems that ran on mainframes for creating formated stock tables, sports box scores, racing information and other highly structured text. 
Sending this material in TTS code ready for direct input into a type
casting machine saved local newspapers the services of several compositors
and made it possible for them to publish reams of this sort of material at
low cost.

Later, in the 60's and early 70s the wire services developed computer
programs to format (perform hyphenation and justification) their regular
news feed into standard newspaper columns using Linotype control
characters. Many of the newspaper oriented wire service wires (particularly
the AP A wire) were transmitted in TTS code in this era and could be
directly input to a Linotype typesetting machine.

TTS code was popular for wire service distribution for another reason: it
supported upper and lower case. The earlier Baudot alphabet only supported
upper case, which meant that a human being had to worry about getting the
case correct in transcribing stories into type -- but TTS had the correct
case already.

TTS format paper tape in fact became a standard in the printing industry
for input to composition equipment of later generations than Linotype
machines. TTS represented an alphabet for encoding text formatted for
printing, and may still see some use for this purpose today.

Teletype developed a modification of their model 15 workhorse wire service
teleprinter to print TTS in upper and lower case on rolls of Teletype
paper; this machine was called the model 20 monitor printer. Many
newspapers which did not actually use TTS input to their typesetting
machines for news stories used these machines to print out stories in upper
and lower case for later entry by human compositors.

Newspapers which used TTS input directly usually punched the TTS into
6 level paper tape for off line entry into Linotype machines. So a typical
newspaper would have a monitor printer and a tape punch on each of their
TTS wires.

TTS wire transmissions were usually low speed (66 or 75 wpm) at baud rates
adjusted for the 8.42 element code.
This resulted in some strange low baud rates that gave the designers of serial port boards for early minicomputers fits. TTS was largely replaced in the mid 70s by the high speed ASCII wire transmissions and by newspaper computerized composition systems which could do hyphenation and justification automatically and output text direct to optical typesetters. Remnents of it survive, however, in the standard ASCII format for transmitting wire service news stories which incorperates ASCII versions of some of the special typesetter control characters. ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.dcom.telecom Path: cs.utk.edu!gatech!howland.reston.ans.net!spool.mu.edu!telecom-request Message-ID: Organization: TELECOM Digest Sender: telecom@eecs.nwu.edu Approved: telecom@eecs.nwu.edu X-Submissions-To: telecom@eecs.nwu.edu X-Administrivia-To: telecom-request@eecs.nwu.edu X-Telecom-Digest: Volume 13, Issue 664, Message 15 of 15 Date: Sun, 26 Sep 1993 14:10:51 -0500 (cdt) From: Brian D McMahon Subject: Re: kUPL@ TELEGRAFNYJ MODEM (095) 212-3937 > [Moderator's Note: This message came to me from Russia. I have no idea > at all what he is saying, except I think it has to do with a BBS or > public access site in Moscow. This was the entire text. Can someone > read it to me? PAT] > sRO^NO KUPL@ TELEGRAFNYJ MODEM > tEL: (095) 212-39-37 sIDORENKO sERGEJ. Hi, Pat. That would be "srochno kuplyu telegrafnyj modem," or "urgently (want to) buy a telegraphic modem." Signed by Sergej Sidorenko. I have no idea what a "telegraphic" modem is; I'm not up on the technical terminology. At a guess, the gentleman wants to buy a FAX modem. The message text, BTW, is in a format known as KOI-7, one of several mutually incompatible (sigh) methods of transmitting Russian Cyrillic text over the net. Upper and lower case are reversed, as you probably guessed. Brian McMahon Postmaster / Acad. 
Software Support
Grinnell College Computer Services
Grinnell, Iowa 50112 USA
Voice: +1 515 269 4901  Fax: +1 515 269 4936

[Telecom Moderator's Note: You think then a 'telegraphic modem' would be a
fax modem? My thanks to the 27 other responses I received to this query.
I selected a few to use here which make a good representative sample of
the lot. PAT]

..............................................................................

Article 8992 of comp.dcom.telecom:
Path: cs.utk.edu!gatech!howland.reston.ans.net!spool.mu.edu!telecom-request
Date: Sun, 26 Sep 1993 02:14:29 -0400
From: anarres!gaarder@TC.Cornell.EDU
Newsgroups: comp.dcom.telecom
Subject: kUPL@ TELEGRAFNYJ MODEM (095) 212-3937
Message-ID:
Organization: TELECOM Digest
X-Telecom-Digest: Volume 13, Issue 664, Message 14 of 15

Passing that through a little transliteration program I wrote back during
the coup in the Soviet Union (remember then? I was glued to my Usenet
feed!) produces:

    Srochno kuplyu telegrafnyy modem
    Tel: (095) 212-39-37 Sidorenko Sergey.

Which I read as offering to buy a modem. I'm not sure just what "srochno"
means in this context; my dictionary defines it as "of term; to be paid at
a fixed date; due; payable". "Kuplyu" means "I buy"; I don't know whether
a "telegrafnyy modem" is a special kind of modem or just a modem in
general.

Why this is here is a puzzle; probably it was sent to the wrong address.

Steve Gaarder
gaarder@anarres.ithaca.ny.us

[Moderator's Note: Well no, it was not sent to the wrong address. He wrote
'telecom-request@mintaka.lcs.mit.edu' which is just an alias that points
at me. That is, he did not post to a newsgroup where it found its way to
comp.dcom.telecom; some news program found it lacking authorization and
shoved it to me. He mailed it direct, albeit to an alias I had forgotten
existed, going back to the days of jsol. So he must think we can do
something for him.
Fancy that; he wants to buy a modem, and here I thought he was looking for
publicity for his BBS or similar and decided to give it to him. PAT]

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
NNTP-Posting-Host: ce38fb66.newsreader.visi.com
References: <20040728211836.12901.00000613@mb-m06.aol.com>
    <358db3cc.0407290649.47ce248f@posting.google.com>
    <358db3cc.0407300556.3c54aaea@posting.google.com>
Message-ID:
Organization: VISI.com
Date: 30 Jul 2004 09:51:28 -0500
To: Tonij
From: Anton Rang
Subject: Re: How to use the UNIX command: tr

tonij67@hotmail.com (Tonij) writes:

> Dan Espen wrote in message news:...
> > I think for a UCB tr, it is working fine:
> >
> > xxx> echo "lower" | /usr/ucb/tr '[:lower:]' '[:upper:]'
> > upper
> >
> How is that working fine? It looks to me like you translated the word
> lower to the word upper and didn't do anything to the case.

That's exactly what should have happened. Run 'man -s1b tr' and you'll see
no mention of the POSIX character class elements. The command above says
to translate l=>u, o=>p, w=>p (and leave the rest the same).

It's only the System V / POSIX 'tr' commands which support character
classes. Hence:

    % echo "lower" | /usr/ucb/tr '[:lower:]' '[:upper:]'
    upper
    % echo "lower" | /usr/bin/tr '[:lower:]' '[:upper:]'
    LOWER
    % echo "lower" | /usr/xpg4/bin/tr '[:lower:]' '[:upper:]'
    LOWER

-- Anton

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
NNTP-Posting-Host: celano.cc.umanitoba.ca
NNTP-Posting-Date: 13 Apr 2005 19:22:52 GMT
References: <425d6690$0$144$e4fe514c@news.xs4all.nl>
Message-ID:
Organization: The University of Manitoba
Date: 13 Apr 2005 19:22:52 GMT
From: Gary Mills
Subject: Re: S10: monospace fonts for Java Desktop

In <425d6690$0$144$e4fe514c@news.xs4all.nl> Casper H.S. Dik writes:
>
>J.D.
Baldwin writes:
>>
>>One step down I've noticed with S10/JDS relative to S9/Gnome is that
>>the xterm ("gnome-terminal") can't display multinational characters
>>(such as é and ô) in any of the monospaced fonts I have tried (and I
>>think I have tried them all). I get either a little cursor-shaped box
>>or a question-mark. It used to work fine for xterms in S9.

>You need to add the "Western" encodings in your terminal type;
>I tried picking the US+euro locale but somehow something set
>"LC_ALL=C" which breaks gnome-terminal.

I'm using the System Terminal Font in gnome-terminal under JDS3, which is
10 point Monospace. It has no problem displaying accented characters. I'm
using the en_CA.ISO8859-1 locale. Other than that, I didn't have to do
anything special to get 8-bit characters to work.

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
NNTP-Posting-Host: c-24-6-155-172.hsd1.ca.comcast.net [24.6.155.172]
NNTP-Posting-Date: Thu, 19 May 2005 09:29:43 -0500
References:
Message-ID:
Organization: DexLabs, Inc.
Date: Thu, 19 May 2005 07:29:43 -0700
From: Michael Vilain
Subject: Re: wcwidth on arrow parts, bug in mutt

In article , Marc wrote:
>
> Marc wrote:
> >
> > The cause of this problem seems to be that wcwidth says the size of
> > characters like: \342\224\224 (this is a top-right corner) is 2, whereas
> > it really occupies width only 1
>
> Could someone tell me if this is corrected in recent versions of solaris
> (it fails on a solaris 9 that may not have the latest patches), most of
> sun's website is inaccessible without paying...
> A test program is:
>
> #include <assert.h>
> #include <locale.h>
> #include <stdio.h>
> #include <wchar.h>
> int main(void){
>   int b;
>   mbstate_t mbstate;
>   wchar_t c;
>   char* s="\342\224\224";
>   setlocale(LC_CTYPE,"en_US.UTF-8");
>   assert(mbrtowc(&c,s,3,&mbstate)==3);
>   b=wcwidth(c);
>   printf("%d\n",b);
>   return 0;
> }
>
> It should answer 1, but for me it answers 2.

This returns 1 for me on my system, using gcc 3.3

--
DeeDee, don't press that button! DeeDee! NO! Dee...

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.os.linux.misc
NNTP-Posting-Host: 24.20.116.48
NNTP-Posting-Date: Thu, 2 Aug 2007 23:37:43 +0000 (UTC)
Message-ID: <1186097862.139456.220350@i13g2000prf.googlegroups.com>
Organization: http://groups.google.com
Date: Thu, 02 Aug 2007 23:37:42 -0000
From: Scott
Subject: high-ascii characters in linux terminal via ssh
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"      << notice clue here!
Content-Transfer-Encoding: quoted-printable

[ The following text is in the "utf-8" character set. ]
[ Your display is set for the "ISO-8859-1" character set. ]
[ Some characters may be displayed incorrectly. ]

When I SSH into most of my newer Linux machines from my Windows computer,
I get some funny upper-ascii characters that appear from time to time,
particularly in manpages and gcc output. Here is a randomly chosen snippet
from a manpage:

    Use --progress=dot to switch to the
    ââ^¬Ë^Üââ^¬Ë^Üdotââ^¬â^Ģââ^¬â^Ä¢ display.

The character sequences look like an 'a' with a hat over it and a cursive
upper case 'E'. It's very annoying particularly with gcc output as these
characters end up around every identifier that appears in a gcc warning.

I've tried numerous different terminal emulation settings in my ssh
program, to no avail. [surprise, surprise] I'm sure I used to know how to
turn this off (it seems like there was an environment variable to set),
but I forgot... Can anyone remind me?

Thanks,
Scott

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . .

Newsgroups: comp.os.linux.misc
NNTP-Posting-Host: eJSSGroJa5Qh6TM459JBWw.user.aioe.org
References: <1186097862.139456.220350@i13g2000prf.googlegroups.com>
Message-ID:
Organization: Aioe.org NNTP Server
Date: Fri, 3 Aug 2007 03:46:11 +0200 (CEST)
From: Kenan Kalajdzic
Subject: Re: high-ascii characters in linux terminal via ssh

You need to set the TERM environment variable in your login shell. If you
use putty, setting TERM to either "linux", "ansi" or "xterm" should work
fine in your case.

--
Kenan Kalajdzic

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Newsgroups: comp.os.linux.misc
NNTP-Posting-Host: 24.20.116.48
NNTP-Posting-Date: Fri, 3 Aug 2007 02:29:13 +0000 (UTC)
References: <1186097862.139456.220350@i13g2000prf.googlegroups.com>
Message-ID: <1186108152.857363.135060@q3g2000prf.googlegroups.com>
Date: Fri, 03 Aug 2007 02:29:12 -0000
From: Scott
Subject: Re: high-ascii characters in linux terminal via ssh

> You need to set the TERM environment variable in your login shell. If
> you use putty, setting TERM to either "linux", "ansi" or "xterm" should
> work fine in your case.

No luck there, it doesn't seem to make any difference. The default is
vt100, which is what my ssh client is set to. I tried changing it (both
the term variable and the ssh client) to linux, ansi, and xterm to no
avail.

I did manage to find the previous "fix" for this issue, which was to put:

    export LANG="POSIX"

in my .bashrc file. Strangely enough, this works for RHEL4, but on RHEL5,
it changes the funny characters to a string <80><99>, which I'm assuming
are the hex values of the funny characters it was printing.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
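The <80><99> reported above is consistent with UTF-8 bytes reaching a
display that does not decode them as UTF-8. What follows is only a minimal
sketch in Python, not anything taken from the thread. It assumes the
manpage used the typographic quotes U+2018/U+2019 (the posts never say
which characters they were), and it uses Windows-1252 as a stand-in for
whatever the SSH client's 'default' character set was. The function name
misread_as_single_byte is made up for illustration.

```python
def misread_as_single_byte(text, wrong_charset="cp1252"):
    """Encode text as UTF-8, then decode those bytes with the wrong charset.

    This imitates a terminal that receives UTF-8 but assumes a single-byte
    character set (cp1252 here is an assumption, not a fact from the thread).
    """
    return text.encode("utf-8").decode(wrong_charset)


right_quote = "\u2019"                     # RIGHT SINGLE QUOTATION MARK
utf8_bytes = right_quote.encode("utf-8")   # one character becomes three bytes
garbled = misread_as_single_byte(right_quote)

# Show the raw bytes in hex; the last two are 80 and 99.
print(" ".join("%02x" % b for b in utf8_bytes))

# Show the three characters the wrong decoding produces, as code points,
# so the output is safe on any terminal. U+00E2 is 'a' with a circumflex.
print(" ".join("U+%04X" % ord(ch) for ch in garbled))
```

If that assumption holds, one quote character arrives as three bytes, and
a display expecting one byte per character shows three unrelated glyphs,
the first being the 'a' with a hat described in the original post.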
Newsgroups: comp.os.linux.misc
References: <1186097862.139456.220350@i13g2000prf.googlegroups.com>
Message-ID: <13b622ck8n9rcdc@corp.supernews.com>
Date: Fri, 03 Aug 2007 10:49:48 -0000
From: Thomas Dickey
Subject: Re: high-ascii characters in linux terminal via ssh

Kenan Kalajdzic wrote:
>>
>> The character sequences look like an 'a' with a hat over it and a
>> cursive upper case 'E'.

... UTF-8

>> I've tried numerous different terminal emulation settings in my ssh
>> program to no avail. I'm sure I used to know how to turn this off (it
>> seems like there was an environment variable to set), but I forgot...

> You need to set the TERM environment variable in your login shell. If
> you use putty, setting TERM to either "linux", "ansi" or "xterm" should
> work fine in your case.

The $TERM variable is unrelated. It's the locale settings (man locale).

--
Thomas E. Dickey
http://invisible-island.net/
ftp://invisible-island.net/

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Newsgroups: comp.os.linux.misc
NNTP-Posting-Host: gw.roaima.co.uk
NNTP-Posting-Date: Fri, 3 Aug 2007 12:07:05 +0000 (UTC)
References: <1186097862.139456.220350@i13g2000prf.googlegroups.com>
Message-ID: <58fbo4-r4o.ln1@news.roaima.co.uk>
Organization: Roaima. Harrogate, North Yorkshire, UK
Date: Fri, 3 Aug 2007 12:13:41 +0100
From: Chris Davies
Subject: Re: high-ascii characters in linux terminal via ssh

> Use --progress=dot to switch to the
> ââ^¬Ë^Üââ^¬Ë^Üdotââ^¬â^Ģââ^¬â^Ä¢ display.

> I've tried numerous different terminal emulation settings in my ssh
> program to no avail [...]

This is a consequence of a mismatched locale setting. The newer box is
(probably) configured to use UTF8 but for some reason your pager doesn't
know it.

For other people reading this post, you can probably reproduce it like
this (replacing en_GB.UTF8 with an appropriate locale):

    LANG=en_GB.UTF8 man ls | LANG= less

To avoid it, you need to ensure that everything runs in the same locale.
So either remove LANG entirely, or ensure that it's set consistently
everywhere:

    unset LANG      # Maybe in your .profile / .bash_profile
    man ls          # Etc...

If you're using xterm windows anywhere, start using uxterm (or better,
lxterm) instead.

Chris

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Newsgroups: comp.os.linux.misc
NNTP-Posting-Host: 24.20.116.48
NNTP-Posting-Date: Fri, 3 Aug 2007 18:54:46 +0000 (UTC)
References: <1186097862.139456.220350@i13g2000prf.googlegroups.com>
    <58fbo4-r4o.ln1@news.roaima.co.uk>
Message-ID: <1186167285.541896.202950@i38g2000prf.googlegroups.com>
Date: Fri, 03 Aug 2007 11:54:45 -0700
From: Scott
Subject: Re: high-ascii characters in linux terminal via ssh

> This is a consequence of a mismatched locale setting. The newer box is
> (probably) configured to use UTF8 but for some reason your pager doesn't
> know it.

Thanks for the info. Now that I know what is causing it, I think I've
fixed it by telling my SSH client to use UTF-8 instead of 'default' which
was what it was configured to use.

Scott

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
NNTP-Posting-Date: Tue, 27 Oct 2009 13:41:51 -0500
References:
Message-ID:
Date: Tue, 27 Oct 2009 18:41:44 +0000
From: Jonathan Casiot
Subject: Re: VT terminals in a UTF-8 world

Vebjorn Ljosa wrote:
>
> I finally have a VT320 connected to my Linux box again after many
> years. One thing that has changed in those years is that UTF-8 has
> become common, both for filenames and for the contents of text files.
> And the terminal doesn't understand UTF-8, of course.
>
> The best solution I have found is to have screen translate between
> UTF-8 and ISO 8859-1. What do others do?
>
> Vebjorn

I have LANG="en_GB.iso88591" in /etc/sysconfig/i18n for my VT420s.

--
Jonathan

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
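The translation step asked about above (UTF-8 in, ISO 8859-1 out to the
terminal) can be illustrated with a few lines of Python. This is only a
sketch of the recoding idea. It is not how screen does it, and the
function name utf8_to_latin1 and the choice of '?' as the substitute
character are assumptions made for the example.

```python
def utf8_to_latin1(data):
    """Recode UTF-8 bytes to ISO 8859-1 bytes for a Latin-1-only terminal.

    Characters that Latin-1 cannot represent, and malformed UTF-8 input,
    come out as '?' (an arbitrary choice for this sketch).
    """
    text = data.decode("utf-8", errors="replace")
    return text.encode("latin-1", errors="replace")


# "cafe" with an acute accent, then two Greek letters that Latin-1 lacks.
sample = "caf\u00e9 \u03b1\u03b2".encode("utf-8")
print(utf8_to_latin1(sample))   # prints b'caf\xe9 ??'
```

The function recodes a complete byte string in one go. A filter sitting
between a program and a serial line would see the data in arbitrary
chunks, so a multi-byte character could be split across two reads; the
standard codecs module offers incremental decoders
(codecs.getincrementaldecoder) for that case.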
Newsgroups: comp.terminals
NNTP-Posting-Host: 69.173.104.170
NNTP-Posting-Date: Thu, 29 Oct 2009 12:38:38 +0000 (UTC)
References:
Message-ID:
Date: Thu, 29 Oct 2009 05:36:55 -0700 (PDT)
From: Vebjorn Ljosa
Subject: Re: VT terminals in a UTF-8 world

On Oct 27, 2:41 pm, Jonathan Casiot wrote:
>
> I have LANG="en_GB.iso88591" in /etc/sysconfig/i18n for my VT420s.

That will obviously help by telling emacs and many other programs to
generate Latin-1 output. The problem I meant to highlight occurs when
working with UTF-8 text files and various unix tools.

Say that you have a list of names in a UTF-8-encoded text file. If you
just cat the file to view it, cat will not translate it from UTF-8 to
Latin-1, regardless of the locale settings. And I'm not arguing that it
should. Next...

...say that you "grep" the file. Now, LANG/LC_CTYPE should indicate UTF-8
so that grep will evaluate the regular expression correctly (e.g., making
'.' match a single character, even when that character is encoded as
several bytes in the UTF-8 file). But grep's output will also be UTF-8. It
therefore needs to be translated to Latin-1 somehow, before being sent to
the terminal.

I hope I managed to be more clear this time. As mentioned, screen is a
solution, at least if you like to work in screen. I was just curious if
there are other alternatives. Would recoding from UTF-8 to Latin-1 be an
appropriate thing to do in a tty/serial port driver?

Vebjorn

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
NNTP-Posting-Host: admin.sibptus.tomsk.ru
NNTP-Posting-Date: Thu, 16 Oct 2008 02:37:42 +0000 (UTC)
References:
Message-ID:
Organization: AO "Svyaztransneft", SibPTUS
Date: Thu, 16 Oct 2008 02:37:42 +0000 (UTC)
From: Victor Sudakov
Subject: Re: custom XLT for PuTTY

Victor Sudakov wrote:
>
> Is there a way to create a custom translation table for PuTTY (win32)?

The problem was solved with IrLex. It supports custom translation tables.
http://sourceforge.net/projects/irlex

--
Victor Sudakov, VAS4-RIPE, VAS47-RIPN
2:5005/49@fidonet http://vas.tomsk.ru/

//////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.unix.solaris
References:
Message-ID:
Organization: http://www.andreas-borchert.de/
Date: Thu, 4 Feb 2010 11:55:42 +0100 (CET)
From: Andreas F. Borchert
Subject: Re: Trying to get UTF-8 to work

On 2010-02-03, Charles Lindsey wrote:
>
> But if I view that in a supposedly en_US.UTF-8 window, it fails to display
> those Greek characters (they are rendered as spaces), and yet if I attempt
> to display characters such as the Danish æøâ (shown in iso-8859-1 for this
> message)

Apparently, Greek characters are missing in the font you are using. For my
xterms, I use some of the iso10646 character sets under Solaris, e.g.,

    -misc-fixed-medium-r-normal--20-200-75-75-c-100-iso10646-1

which can be passed to the "-fn" option of xterm. I have never used dtpad
nor do I know how its font can be specified but there exists surely a
similar command line argument for dtpad.

Andreas.

//////////////////////////////////////////////////////////////////////////////