IPA-ASCII

From spssig.spss.com!news.oc.com!lgc.com!cs.utexas.edu!sdd.hp.com!hpscit.sc.hp.com!hplextra!rigel!evan Wed Dec 30 12:03:43 CST 1992
Article: 9381 of sci.lang
Newsgroups: sci.lang,alt.usag.english
Path: spssig.spss.com!news.oc.com!lgc.com!cs.utexas.edu!sdd.hp.com!hpscit.sc.hp.com!hplextra!rigel!evan
From: evan@hpl.hp.com (Evan Kirshenbaum)
Subject: Final(?) Draft of ASCII/IPA Representation
Sender: news@hplabsz.hpl.hp.com (News Subsystem (Rigel))
Message-ID: <1992Dec23.195025.27611@hplabsz.hpl.hp.com>
Date: Wed, 23 Dec 1992 19:50:25 GMT
Reply-To: kirshenbaum@hpl.hp.com
References: <1992Dec23.194326.27000@hplabsz.hpl.hp.com>
Nntp-Posting-Host: hplerk.hpl.hp.com
Organization: Hewlett-Packard Laboratories
Lines: 498

This article describes a standard scheme for representing IPA
transcriptions in ASCII for use in Usenet articles and email.  The
following guidelines were kept in mind:

        o It should be usable for both phonemic and narrow phonetic
          transcription.

        o It should be possible to represent *all* symbols and
          diacritics in the IPA.

        o The previous guideline notwithstanding, it is expected that
          (as in the past) most use will be in transcribing English,
          so where tradeoffs are necessary, decisions should be made
          in favor of ease of representation of phonemes which are
          common in English.

        o The representation should be readable.

        o It should be possible to mechanically translate from the
          representation to a character set which includes IPA.  The
          reverse would also be nice.

In order to be able to represent a wide range of segments while making
common segments easy to type, we allow more than one representation
for a given segment.  Each segment has an "explicit" representation,
which is a set of features between curly braces ("{" and "}").  Each
feature is represented as a three letter abbreviation taken from a
standardized set.  The phoneme /b/ (a voiced, bilabial stop) could be
represented as /{vcd,blb,stp}/.  A first cut at the feature set
appears in appendix A below.

The word "tag" could thus be represented phonemically as
        /{vls,alv,stp}{low,fnt,unr,vwl}{vcd,vel,stp}/
and phonetically as
        [{vls,asp,alv,stp}{low,fnt,lng,unr,vwl}{unx,vcd,vel,stp}]

This works, but it's a bit of a pain.  To simplify transcription, we
allow an "implicit" representation for a segment which consists of a
(generally alphabetic) symbol followed by diacritics.  Thus /b/ stands
for /{vcd,blb,stp}/.  Case is significant (/n/ and /N/ are different
segments).  The segment symbols are given in appendix B below.

The word "tag" can thus be represented phonemically as
        /t&g/

The diacritics for a segment are represented between angle brackets
("<" and ">") and consist of symbols or features.  (In the common case
where the diacritic symbol is a single character which does not encode
a segment, the brackets may be removed.)  The features which the
diacritics map to override those of the segment.

The word "tag" thus becomes narrowly
        [t&g]
or
        [t&<:>g]
or
        [t&:g]

Some diacritic symbols encode more than one feature set.  Which one is
meant should be apparent from context.  For example, "." stands for
"{rnd}" when attached to a vowel, but "{rfx}" when attached to a
consonant. 

Clicks are common to many languages (especially in Africa), but there
is no IPA diacritic that means "click".  Rather than use up several
characters for clicks (which are infrequent in the languages most
often discussed), we instead use the diacritic "!" after the
homorganic unvoiced stop.  Thus /t!/ (= /t/ = /{alv,clk}/) is the
sound commonly written "tsk" and used in English to show disapproval.

The complete set of diacritic symbols appears in appendix C below.
Appendices D and E contain representations of segments more or less
ordered by feature (appendix D in tabular form, appendix E as a list).
Appendix F contains a list of all of the ASCII characters and the uses
they have been pressed to.

For transcription of any specific language a group can by convention
alter the character mappings (as an example, for Spanish /R/ may be
better used to represent /{alv,trl}/ than /{mid,cnt,rzd,vwl}/).  An
author may also press a little used symbol (for the language under
consideration) into service to highlight a distinction.  Such an
alteration should be made explicitly to avoid confusion.

The diacritics "+" and "=" and the segment symbols "$" and "%" are
explicitly left unspecified so that they can be used to mark
language-specific features (that are otherwise cumbersome to mark).
Such symbols can be assigned either by convention for a specific
language or in an ad-hoc manner by an individual author.

Stress marks are prepended to the syllable they attach to.  "'"
signals primary stress, "," signals secondary stress.  Spaces should
be employed to separate words (cliticized words may be written
unseparated).  When discussing single words, it may be helpful to
insert a space before each syllable that doesn't carry a
suprasegmental marker.

The "I hear the secretary" for an American might be something like
        /aI hir D@ 'sEkrI,t&ri/
while to an Englishman it might be more like
        /aI hi@ DI 'sEkr^tri/
  
Transcribing tone is harder.  Here's an attempt. For register tone
languages (e.g., Hausa, Navajo), numbers should be used with one being
the lowest.  Thus in Navajo, "1" is low tone and "2" is high.  In
Yoruba "1" is low, "2" is mid, and "3" is high.  The language's
"default" tone need not be specified.  For contour tone languages
(e.g., Mandarin, Thai), there is generally a numeric system in place
(Mandarin: "1" is high, "2" is rising, "3" is falling rising, "4" is
falling).  The tone indication should follow the syllable (vowel?).

The symbol "#" is used to represent a syllable or word boundary.

Appendix A.  Feature Abbreviations
----------------------------------

vcd     voiced              nas     nasal            fnt     front
vls     voiceless           orl     oral             cnt     center
                            apr     approximant      bck     back
blb     bilabial            vwl     vowel
lbd     labio-dental        lat     lateral          unr     unrounded
dnt     dental              ctl     central          rnd     rounded
alv     alveolar            trl     trill
rfx     retroflex           flp     flap             asp     aspirated
pla     palato-alveolar     clk     click            unx     unexploded
pal     palatal             ejc     ejective         syl     syllabic
vel     velar               imp     implosive        mrm     murmured
lbv     labio-velar                                  lng     long
uvl     uvular              hgh     high             vzd     velarized
phr     pharyngeal          smh     semi-high        lzd     labialized
glt     Glottal             umd     upper-mid        pzd     palatalized
                            mid     mid              rzd     rhoticized
stp     stop                lmd     lower-mid        nzd     nasalized
frc     fricative           low     low              fzd     pharyngealized


Appendix B.  Segment Symbols
----------------------------

This table lists the symbol, the associated feature set, and the
Unicode character code and name for the corresponding IPA character.
In some cases (e.g., /I/) there are multiple IPA characters in use for
the segment.  I have listed both.  In some cases (e.g. /j/), the IPA
symbol seems to be ambiguous (generally between an approximant and the
homorganic voiced fricative).

The entries marked with "??" are those that I am least sure of.  When
I have listed more than one possibility for a symbol, the first is my
current preference.

a {low,cnt,unr,vwl}     U+0061 LATIN SMALL LETTER A
b {vcd,blb,stp}         U+0062 LATIN SMALL LETTER B
c {vls,pal,stp}         U+0063 LATIN SMALL LETTER C
d {vcd,alv,stp}         U+0064 LATIN SMALL LETTER D
e {umd,fnt,urd,vwl}     U+0065 LATIN SMALL LETTER E
f {vls,lbd,frc}         U+0066 LATIN SMALL LETTER F
g {vcd,vel,stp}         U+0067 LATIN SMALL LETTER G
                        U+0261 LATIN SMALL LETTER SCRIPT G
h {glt,apr}             U+0068 LATIN SMALL LETTER H
i {hgh,fnt,unr,vwl}     U+0069 LATIN SMALL LETTER I
j {pal,apr}/{vcd,pal,frc} U+006A LATIN SMALL LETTER J
k {vls,vel,stp}         U+006B LATIN SMALL LETTER K
l {vcd,alv,lat}         U+006C LATIN SMALL LETTER L
m {blb,nas}             U+006D LATIN SMALL LETTER M
n {alv,nas}             U+006E LATIN SMALL LETTER N
o {umd,bck,rnd,vwl}     U+006F LATIN SMALL LETTER O
p {vls,blb,stp}         U+0070 LATIN SMALL LETTER P
q {vls,uvl,stp}         U+0071 LATIN SMALL LETTER Q
r {alv,apr}             U+0279 LATIN SMALL LETTER TURNED R
s {vls,alv,frc}         U+0073 LATIN SMALL LETTER S
t {vls,alv,stp}         U+0074 LATIN SMALL LETTER T
u {hgh,bck,rnd,vwl}     U+0075 LATIN SMALL LETTER U
v {vcd,lbd,frc}         U+0076 LATIN SMALL LETTER V
w {lbv,apr}/{vcd,lbv,frc} U+0077 LATIN SMALL LETTER W
x {vls,vel,frc}         U+0078 LATIN SMALL LETTER X
y {hgh,fnt,rnd,vwl}     U+0079 LATIN SMALL LETTER Y
z {vcd,alv,frc}         U+007A LATIN SMALL LETTER Z

A {low,bck,unr,vwl}     U+0251 LATIN SMALL LETTER SCRIPT A
B {vcd,blb,frc}         U+03B2 GREEK SMALL LETTER BETA
C {vls,pal,frc}         U+00E7 LATIN SMALL LETTER C CEDILLA
D {vcd,dnt,frc}         U+00F0 LATIN SMALL LETTER ETH
E {lmd,fnt,unr,vwl}     U+025B LATIN SMALL LETTER EPSILON
F   -- Unused --
G {vcd,uvl,stp}         U+0262 LATIN LETTER SMALL CAPITAL G
H {vls,phr,frc}         U+0127 LATIN SMALL LETTER H BAR
I {smh,fnt,unr,vwl}     U+026A LATIN LETTER SMALL CAPITAL I
                        U+0269 LATIN SMALL LETTER IOTA
J {vcd,pal,stp}         U+025F LATIN SMALL LETTER DOTLESS J BAR
K   -- Unused --
L {vcd,vel,lat}     ??  U+026B LATIN SMALL LETTER L WITH MIDDLE TILDE
                        U+029F LATIN LETTER SMALL CAPITAL L
  {vls,alv,lat,frc} ??  U+026C LATIN SMALL LETTER L BELT
M {lbd,nas}		U+0271 LATIN SMALL LETTER M HOOK
N {vel,nas}             U+014B LATIN SMALL LETTER ENG
O {lmd,bck,rnd,vwl}     U+0254 LATIN SMALL LETTER OPEN O
P {vls,blb,frc}         U+03A6 GREEK CAPITAL LETTER PHI
Q {vcd,vel,frc}         U+0263 LATIN SMALL LETTER GAMMA
R {mid,cnt,rzd,vwl} ??  U+025A LATIN SMALL LETTER SCHWA HOOK
  {alv,trl}         ??  U+0280 LATIN LETTER SMALL CAPITAL R
S {vls,pla,frc}         U+0283 LATIN SMALL LETTER ESH
T {vls,dnt,frc}         U+03B8 GREEK SMALL LETTER THETA
U {smh,bck,rnd,vwl}     U+028A LATIN SMALL LETTER UPSILON
                        U+0277 LATIN SMALL LETTER CLOSED OMEGA
V {lmd,bck,unr,vwl}     U+028C LATIN SMALL LETTER TURNED V
W {lmd,fnt,rnd,vwl} ??  U+0153 LATIN SMALL LETTER O E
X {vls,uvl,frc}         U+03C7 GREEK SMALL LETTER CHI
Y {umd,fnt,rnd,vwl} ??  U+00F8 LATIN SMALL LETTER O SLASH
  {lmd,fnt,rnd,vwl} ??  U+0153 LATIN SMALL LETTER O E
Z {vcd,pla,frc}         U+0292 LATIN SMALL LETTER YOGH

? {glt,stp}             U+0294 LATIN SMALL LETTER GLOTTAL STOP
@ {mid,cnt,unr,vwl}     U+0259 LATIN SMALL LETTER SCHWA
& {low,fnt,unr,vwl}     U+00E6 LATIN SMALL LETTER A E
* {vcd,alv,flp}         U+027E LATIN SMALL LETTER FISHHOOK R
%   -- Ad Hoc Segment --
$   -- Ad Hoc Segment --

Appendix C.  Diacritics
-----------------------

  Consonants: {vzd}     U+0334 NON-SPACING TILDE OVERLAY
: {lng}                 U+02D0 MODIFIER LETTER TRIANGULAR COLON
- Vowels: {unr}           -- No equivalent --
  Consonants: {syl}     U+0329 NON-SPACING VERTICAL LINE BELOW
! {clk}                   -- No equivalent --
. Vowels: {rnd}           -- No equivalent --
  Consonants: {rfx}     U+0322 NON-SPACING RETROFLEX HOOK BELOW
                        U+0323 NON-SPACING DOT BELOW
` Voicless: {ejc}       U+02BC MODIFIER LETTER APOSTROPHE
  Voiced: {imp}           -- No equivalent --
[ {dnt}                 U+032A NON-SPACING BRIDGE BELOW
; {pzd}                 U+02B2 MODIFIER LETTER SMALL J
                        U+0321 NON-SPACING PALATALIZED HOOK BELOW
" Vowels: {cnt}           -- No equivalent -- 
  Consonants: {uvl}       -- No equivalent --
^ {pal}                   -- No equivalent --
+   -- Add Hoc Diacritic --
=   -- Add Hoc Diacritic --
 {fzd}               U+0334 NON-SPACING TILDE OVERLAY
 {asp}               U+02B0 MODIFIER LETTER SMALL H
 {unx}           ??  U+02DA SPACING RING ABOVE
    {vls}           ??  U+0325 NON-SPACING RING BELOW
 {rzd}               U+02B3 MODIFIER LETTER SMALL R
 {lzd}               U+02B7 MODIFIER LETTER SMALL W
                        U+032B NON-SPACING INVERTED DOUBLE ARCH BELOW
 {mrm}               U+02B1 MODIFIER LETTER SMALL H HOOK
                        U+0324 NON-SPACING DOUBLE DOT BELOW

Appendix D.  Segment Table
--------------------------

     blb-- -lbd-- --dnt-- --alv-- -rfx- -pla-- --pal--- --vel-- -----uvl-----

nas    m      M       n[      n      n.            n^      N           n"
stp  p b           t[ d[    t d   t. d.        c   J     k g      q    G
frc  F V    f v    T  D     s z   s. z.  S Z   C C  x Q      X    g"
apr        r     r[      r      r.            j       j      g"
lat                   l[      l      l.            l^      L
trl  b                r                                      r"
flp                           *      *. 
ejc  p`            t[`      t`                 c`        k'
imp    b`             d`      d`                   J`      g`     q`   G`
clk  p!            t!       c!                   c!      k!


     ---- lbv ----   --phr--  ---glt---

nas         n                               alv lat frc: s z
stp  t d              ?                    lat flp: *
frc  w   w      H H   h                 lat clk: l!
apr           w                 h


    ----- unr -----     unr     ----- rnd -----
    fnt   cnt   bck     cnt     fnt   cnt   bck
                        rzd
hgh  i     i"    u-              y     u"    u
smh  I                           I.          U
umd  e   @  o-    R    Y           o
mid        @             R             @.
lmd  E     V"    V               W     O"    O
low  &     a     A               &.    a.    A.


Appendix E.  Segment List
-------------------------

Where a segment requires more than one character to represent, and
there is a single IPA character, the Unicode code and name is noted.
If I couldn't find an IPA symbol for a segment, I left it out.

{blb,nas}       /m/
{vls,blb,stp}   /p/
{vcd,blb,stp}   /b/
{vls,blb,frc}   /F/
{vcd,blb,frc}   /V/
{blb,trl}       /b/  U+0299 LATIN LETTER SMALL CAPITAL B
{blb,imp}       /b`/      U+0253 LATIN SMALL LETTER B HOOK
{blb,ejc}       /p`/
{blb,clk}       /p!/      U+0298 LATIN LETTER BULSEYE

{lbd,nas}       /M/
{vls,lbd,frc}   /f/
{vcd,lbd,frc}   /v/
{lbd,apr}       /r/  U+028B LATIN SMALL LETTER SCRIPT V

{dnt,nas}       /n[/
{vls,dnt,stp}   /t[/
{vcd,dnt,stp}   /d[/
{vls,dnt,frc}   /T/
{vcd,dnt,frc}   /D/
{dnt,apr}       /r[/
{dnt,lat}       /l[/
{dnt,imp}       /d`/ (same as {alv,imp})
{dnt,ejc}       /t[`/
{dnt,clk}       /t!/      U+0287 LATIN SMALL LETTER TURNED T
                  (by rights this should be alveolar, but the alveolar
                   and palatal clicks use the same symbol (/c!/))

{alv,nas}       /n/
{vls,alv,stp}   /t/
{vcd,alv,stp}   /d/
{vls,alv,frc}   /s/
{vcd,alv,frc}   /z/
{alv,apr}       /r/
{alv,lat}       /l/
{alv,trl}       /r/  U+0072 LATIN SMALL LETTER R
                (perhaps /R/)
{alv,flp}       /*/
{vls,alv,lat,frc} /s/  U+026C LAIN SMALL LETTER L BELT
{vcd,alv,lat,frc} /z/  U+026E LATIN SMALL LETTER L YOGH
{alv,imp}       /d`/      U+0257 LATIN SMALL LETTER D HOOK
{alv,ejc}       /t`/
{alv,clk}       /c!/ (same as {pal,clk})

{rfx,nas}       /n./      U+0273 LATIN SMALL LETTER N RETROFLEX HOOK
{vls,rfx,stp}   /t./      U+0288 LATIN SMALL LETTER T RETROFLEX HOOK
{vcd,rfx,stp}   /d./      U+0256 LATIN SMALL LETTER D RETROFLEX HOOK
{vls,rfx,frc}   /s./      U+0282 LATIN SMALL LETTER S HOOK
{vcd,rfx,frc}   /z./      U+0290 LATIN SMALL LETTER Z RETROFLEX HOOK
{rfx,apr}       /r./      U+027B LATIN SMALL LETTER TURNED R HOOK
{rfx,lat}       /l./      U+026D LATIN SMALL LETTER L RETROFLEX HOOK
{rfx,flp}       /*./      U+027D LATIN SMALL LETTER R HOOK

{vls,pla,frc}   /S/
{vcd,pla,frc}   /Z/

{pal,nas}       /n^/
{vls,pal,stp}   /c/
{vcd,pal,stp}   /J/
{vls,pal,frc}   /C/
{vcd,pal,frc}   /C/  U+029D LATIN SMALL LETTER CROSSED-TAIL J
                (perhaps /j/ (same as {pal,apr}))

{pal,apr}       /j/
{rnd,pal,apr}   /j/  U+0265 LATIN SMALL LETTER TURNED H
{pal,lat}       /l^/      U+028E LATIN SMALL LETTER TURNED Y
{pal,imp}       /J`/      U+0284 LATIN SMALL LETTER DOTLESS J BAR HOOK
{pal,clk}       /c!/      U+0297 LATIN LETTER STRETCHED C

{vel,nas}       /N/
{vls,vel,stp}   /k/
{vcd,vel,stp}   /g/
{vls,vel,frc}   /x/
{vcd,vel,frc}   /Q/
{vel,apr}       /j/  U+0270 LATIN SMALL LETTER TURNED M WITH LONG
                                 LEG
{vel,lat}       /L/
{vel,imp}       /g`/      U+0260 LATIN SMALL LETTER G HOOK
{vel,ejc}       /k'/
{vel,clk}       /k!/      U+029E LATIN SMALL LETTER TURNED K

{lbv,nas}       /n/  Written as "ng" with tie above
{vls,lbv,stp}   /t/  Written as "kp" with tie above
{vcd,lbv,stp}   /d/  Written as "gb" with tie above
{vls,lbv,frc}   /w/  U+028D LATIN SMALL LETTER TURNED W
{vcd,lbv,frc}   /w/  (same as {lbv,apr})
{lbv,apr}       /w/

{uvl,nas}       /n"/      U+0274 LATIN LETTER SMALL CAPITAL N
{vls,uvl,stp}   /q/
{vcd,uvl,stp}   /G/
{vls,uvl,frc}   /X/       U+03C7 GREEK SMALL LETTER CHI
{vcd,uvl,frc}   /g"/ (same as {uvl,apr})
{uvl,apr}       /g"/      U+0281 LATIN LETTER SMALL CAPITAL INVERTED R
{uvl,trl}       /r"/      U+0280 LATIN LETTER SMALL CAPITAL R
{vls,uvl,imp}   /q`/      U+02A0 LATIN SMALL LETTER Q HOOK
{vcd,uvl,imp}   /G`/      U+029B LATIN LETTER SMALL CAPITAL G HOOK

{vls,phr,frc}   /H/
{vcd,phr,frc}   /H/  U+0295 LATIN LETTER REVERSED GLOTTAL STOP

{glt,stp}       /?/
{glt,apr}       /h/
{mrm,glt,frc}   /h/    U+0266 LATIN SMALL LETTER H HOOK

{vcd,lat,flp}   /*/  U+027A LATIN SMALL LETTER TURNED R WITH LONG
                                 LEG 
{lat,clk}       /l!/      U+0296 LATIN LETTER INVERTED GLOTTAL STOP

{hgh,fnt,unr,vwl} /i/
{hgh,fnt,rnd,vwl} /y/
{smh,fnt,unr,vwl} /I/
{smh,fnt,rnd,vwl} /I./    U+028F LATIN LETTER SMALL CAPITAL Y
{umd,fnt,unr,vwl} /e/
{umd,fnt,rnd,vwl} /Y/
{lmd,fnt,unr,vwl} /E/
{lmd,fnt,rnd,vwl} /W/
{low,fnt,unr,vwl} /&/
{low,fnt,rnd,vwl} /&./    U+0276 LATIN LETTER SMALL CAPITAL O E

{hgh,cnt,unr,vwl} /i"/    U+0268 LATIN SMALL LETTER BARRED I
{hgh,cnt,rnd,vwl} /u"/    U+0289 LATIN SMALL LETTER U BAR
{umd,cnt,unr,vwl} /@/  U+0258 LATIN SMALL LETTER REVERSED E
{umd,cnt,unr,rzd,vwl} /R/ U+025D LATIN SMALL LETTER REVERSED EPSILON
                                      HOOK 
{mid,cnt,unr,vwl} /@/
{mid,cnt,unr,rzd,vwl} /R/
{mid,cnt,rnd,vwl} /@./    U+0275 LATIN SMALL LETTER BARRED O
{lmd,cnt,unr,vwl} /V"/    U+025C LATIN SMALL LETTER REVERSED EPSILON
{lmd,cnt,rnd,vwl} /O"/    U+025E LATIN SMALL LETTER CLOSED REVERSED
                                 EPSILON
{low,cnt,unr,vwl} /a/     U+0061 LATIN SMALL LETTER A

{hgh,bck,unr,vwl} /u-/    U+026F LATIN SMALL LETTER TURNED M
{hgh,bck,rnd,vwl} /u/
{umd,bck,unr,vwl} /o-/    U+0264 LATIN SMALL LETTER BABY GAMMA
{umd,bck,rnd,vwl} /o/
{lmd,bck,unr,vwl} /V/
{lmd,bck,rnd,vwl} /O/
{low,bck,unr,vwl} /A/
{low,bck,rnd,vwl} /A./    U+0252 LATIN SMALL LETTER TURNED SCRIPT A

Appendix F.  ASCII Table
------------------------
In the following table, the following abbreviations are used:
        P: punctuation
        S: segment
        D: diacritic
      : diacritic that must be delimited


sp P: Separate words and segments Q  S: {vcd,vel,frc}           
!  D: {clk}                       R  S: {mid,cnt,rzd,vwl}       
"  D Vowel: {cnt}                 S  S: {vls,pla,frc}           
     Cons:  {uvl}                 T  S: {vls,dnt,frc}           
#  Unused                         U  S: {smh,bck,rnd,vwl}       
$  S: Ad Hoc                      V  S: {lmd,bck,unr,vwl}       
%  S: Ad Hoc                      W  S: {lmd,fnt,rnd,vwl}       
&  S: {low,fnt,unr,vwl}           X  S: {vls,uvl,frc}           
'  P: Primary stress              Y  S: {umd,fnt,rnd,vwl}       
(  Unused                         Z  S: {vcd,pla,frc}           
)  Unused                         [  P: Phonetic delimiter      
*  S: {vcd,lav,flp}                  D: {dnt}                   
+  D: Ad Hoc                      \  Unused                     
,  P: Secondary stress            ]  P: Phonetic delimiter      
-  D: Vowel: {unr}                ^  D: {pal}                   
      Cons:  {syl}                _  Unused                     
.  D: Vowel: {rnd}                `  D Voiced: {imp}            
      Cons:  {rfx}                     Voiceless: {ejc}         
/  P: Phonemic delimiter          a  S: {low,cnt,unr,vwl}       
0  Unused                         b  S: {vcd,blb,stp}           
1  P: Tone 1                      c  S: {vls,pal,stp}           
2  P: Tone 2                      d  S: {vcd,alv,stp}           
3  P: Tone 3                      e  S: {umd,fnt,urd,vwl}       
4  P: Tone 4                      f  S: {vls,lbd,frc}           
5  Unused                         g  S: {vcd,vel,stp}           
6  Unused                         h  S: {glt,apr}               
7  Unused                            : {asp}                 
8  Unused                         i  S: {hgh,fnt,unr,vwl}       
9  Unused                         j  S: {pal,apr}/{vcd,pal,frc} 
:  D: {lng}                          : {pzd}                 
;  D: {pzd}                       k  S: {vls,vel,stp}           
<  P: Diacritic delimiter         l  S: {vcd,alv,lat}           
=  D: Ad Hoc                      m  S: {blb,nas}               
>  P: Diacritic delimiter         n  S: {alv,nas}               
?  S: {glt,stp}                   o  S: {umd,bck,rnd,vwl}       
   : {mrm}
@  S: {mid,cnt,unr,vwl}              : {unx}                 
A  S: {low,bck,unr,vwl}           p  S: {vls,blb,stp}           
B  S: {vcd,blb,frc}               q  S: {vls,uvl,stp}           
C  S: {vls,pal,frc}               r  S: {alv,apr}               
D  S: {vcd,dnt,frc}                  : {rzd}                 
E  S: {lmd,fnt,unr,vwl}           s  S: {vls,alv,frc}           
F  Unused                         t  S: {vls,alv,stp}           
G  S: {vcd,uvl,stp}               u  S: {hgh,bck,rnd,vwl}       
H  S: {vls,phr,frc}               v  S: {vcd,lbd,frc}           
   : {fzd}                     w  S: {lbv,apr}/{vcd,lbv,frc} 
I  S: {smh,fnt,unr,vwl}              : {lzd}                 
J  S: {vcd,pal,stp}               x  S: {vls,vel,frc}           
K  Unused                         y  S: {hgh,fnt,rnd,vwl}       
L  S: {vcd,vel,lat}               z  S: {vcd,alv,frc}           
M  S: {lbd,nas}                   {  P: Feature set delimiter   
N  S: {vel,nas}                   |  Unused                     
O  S: {lmd,bck,rnd,vwl}           }  P: Feature set delimiter   
P  S: {vls,blb,frc}               ~  D: Cons:  {vzd}
                                        Vowel: {nzd}

>From spssig.spss.com!news.oc.com!lgc.com!cs.utexas.edu!sdd.hp.com!hpscit.sc.hp.com!hplextra!rigel!evan Wed Dec 30 12:08:41 CST 1992
Article: 9382 of sci.lang
Newsgroups: sci.lang,alt.usag.english
Path: spssig.spss.com!news.oc.com!lgc.com!cs.utexas.edu!sdd.hp.com!hpscit.sc.hp.com!hplextra!rigel!evan
From: evan@hpl.hp.com (Evan Kirshenbaum)
Subject: Summary of IPA/ASCII transcription for English
Sender: news@hplabsz.hpl.hp.com (News Subsystem (Rigel))
Message-ID: <1992Dec23.195332.27936@hplabsz.hpl.hp.com>
Date: Wed, 23 Dec 1992 19:53:32 GMT
Reply-To: kirshenbaum@hpl.hp.com
References: <1992Dec23.194326.27000@hplabsz.hpl.hp.com>
Nntp-Posting-Host: hplerk.hpl.hp.com
Organization: Hewlett-Packard Laboratories
Lines: 217

To aid English speakers in using the phonetic transcription, this
document describes the mapping onto a standard American dictionary
transcription system for sounds that commonly occur in the English
language.  When it differs from the symbol used, I've also included a
description of the IPA symbol for the benefit of non-Americans.

The table is taken from	the 'Pronunciation Symbols' page of
Merriam-Webster's New Collegiate Dictionary.  In the examples, the
letters which spell the sound are bracketed by '<...>'.

Note that this only describes a small subset of the transcription
system.  There are far more sounds (used in other languages) and
nuances of sound that can be captured.  See the document describing
the full standard for complete details.


Phonemic (broad) transcriptions are bracketed by '/.../'.  Phonetic
(narrow) transcriptions are bracketed by '[...]'.  Syllables that
carry primary stress are preceded by "'".  Syllables that carry
secondary stress are preceded by ",".  When giving the transcription
of a single word, spaces are generally inserted between syllables
(often omitted before syllables that have stress marks).  When giving
the transcription of a multi-word utterance, it is common to put
spaces between words and omit them between syllables.


/@/: schwa (upside-down 'e').
     Used in both unaccented ('b(a)nan(a)', 'c(o)llide', '(a)but'),
     and accented ('h(u)mdr(u)m', 'ab(u)t') contexts.

     The IPA symbol is a schwa.

     [British speakers often have different vowels in these two
     contexts.  The accented one is further back and is written /V/.
     Its IPA symbol is a 'wedge' or upside-down 'v'.]

/l-/, /n-/, /m-/, /N-/:
     Superscript schwa preceding consonant.
     As in 'batt(le)', 'mitt(en)', 'eat(en)'.  Signifies that the
     consonant is pronounced as a syllable by itself.

     The IPA symbol is a vertical bar below the consonant.

/R/: shwa followed by 'r'.
     'op(er)ation', 'f(ur)th(er)', '(ur)g(er)'.

     The IPA symbol is a schwa with a hook.

/&/: short a.
     'm(a)t', 'm(a)p', 'm(a)d', 'g(a)g, 'sn(a)p', 'p(a)tch'.

     The IPA symbol is an 'a-e' digraph.

/eI/: long a ('a' with bar above).
     'd(ay)', 'f(a)de', 'd(a)te', '(a)orta', 'dr(a)pe', 'c(a)pe'.
     
/A/: a with diaeresis (two dots) above.
     'b(o)ther', 'c(o)t', and, with most American speakers,
     'f(a)ther', 'c(a)rt'.

     The IPA symbol is a script 'a'.

/a/: a with dot above.
     'f(a)ther' as pronounced by speakers who do not rhyme it with
     bother. 

/AU/: a followed by u with dot.
     'n(ow)', 'l(ou)d', '(ou)t'.

/b/: '(b)a(b)y', 'ri(b)'.

/tS/: ch.  The dictionary notes "(actually, this sound is \t\ + \sh\)"
     '(ch)in', 'na(tu)re' (/'neI tSR/).

     In IPA transcription, this is sometimes spelled as 'c with hacek'.

/d/: '(d)i(d)', 'a(dd)er'.

/E/: short e.
     'b(e)t', 'b(e)d', 'p(e)ck'.

     The IPA symbol is a lower-case epsilon.  It is sometimes spelled
     with a small capital E.

/i/: long e ('e' with bar above).
     'b(ea)t', 'nosebl(ee)d', '(e)venl(y)', '(ea)s(y)'.

/f/: '(f)i(f)ty', 'cu(ff)'

/g/: '(g)o', 'bi(g)', '(g)ift'.

/h/: '(h)at', 'a(h)ead'.

/hw/: '(wh)ale' as pronounced by those who do not have the same
     pronunciation for both 'whale' and 'wail'.

/I/: short i.
     't(i)p', 'ban(i)sh', 'act(i)ve'.

     The IPA symbol is a small capital I or a lower-case iota.

/aI/: long i ('i' with bar above).
     's(i)te', 's(i)de', 'b(uy)', 'tr(i)pe'.

/dZ/: j. The dictionary notes "(actually, this sound is \d\ + \zh\)"
     '(j)ob', '(g)em', 'e', '(j)oin', '(j)u(dge)'.

/k/: '(k)in', '(c)oo(k)', 'a(che)'.

/x/: k with bar below.
     Geman 'i(ch)', 'bu(ch)'.

/l/: '(l)i(l)y', 'poo(l)'.

/m/: '(m)ur(m)ur', 'di(m)', 'ny(m)ph'.

/n/: '(n)o', 'ow(n)'.

/~/: superscript 'n'.
     "indicates that a preceeding vowel or diphthong is pronounced
     with the nasal passages open as in French 'un bon vin blanc' /W~
     bo~ va~ blA~/"

     The IPA diacritic is a tilde above the vowel.

/N/: eng ('n' with a tail).
     'si(ng)' /sIN/, 'si(ng)er' /'sIN R/, 'fi(ng)er' /'fIN gR/,
     'i(n)k' /iNk/

     The IPA symbol is an eng.

/oU/: long o ('o' with bar above).
     'b(o)ne', 'kn(ow)', 'b(eau)'.

/O/: 'o' with dot above.
     's(aw)', '(a)ll', 'gn(aw)'.

     The IPA symbol is a small open 'o' or upside-down 'c'.

/W/: o-e digraph
     French 'b(oeu)f', german 'H(o:)lle.

     The IPA symbol is an o-e digraph.

/Oi/: 'o' with dot above followed by 'i'.
     'c(oi)n', 'destr(oy)'.  [The dictionary also lists 's(awi)ng',
     but I pronounce that as two separate syllables /'sO IN/.]

/p/: '(p)e(pp)er', 'li(p)'.

/r/: '(r)ed', 'ca(r)', '(r)a(r)ity'.

/s/: '(s)our(ce)', 'le(ss)'.

/S/: sh.
     '(sh)y', 'mi(ssi)on', 'ma(ch)ine', 'spe(ci)al'.

     The IPA symbol is an esh: a tall, pulled 's' or long, barless 'f'.

/t/: '(t)ie', 'a(tt)ack'.

/T/: th.
     '(th)in'. 'e(th)er'.

     The IPA symbol as a lower-case theta.

/D/: 'th' with bar below.
     '(th)en', 'ei(th)er', '(th)is'.

     The IPA symbol is an eth, sort of a script 'd' with the bar crossed.

/u/: 'u' with diaeresis (two dots) above.
     'r(u)le', 'y(ou)th', 'union' /'jun j@n/, 'few' /fju/.

/U/: 'u' with dot above.
     'p(u)ll', 'w(oo)d', 'b(oo)k', 'curable' /'kjUr @ b@l/.

     The IPA symbol is a small letter upsilon.  A small capital U or
     closed lower-case omega is also used.

/y/: u-e digraph.
     German 'f(u:)llen', 'h(u:)bsch', French 'r(ue)'.

/v/: '(v)i(v)id', 'gi(ve)'.

/w/: '(w)e', 'a(w)ay'.

/j/: '(y)ard', '(y)oung', 'cue' /kju/, 'union' /'jun y@n/;

/(cons);/: superscript 'y' following consonant;
     "indicates that during the articulation of the sound represented
     by the preceding character, the front of the tongue has
     substantially the position it has for the articulation of the
     first sound of 'yard', as in French 'digne' /din;/."

     The IPA diacritic is a superscript 'j' following or hook below
     the consonant.

/ju/: '(you)th', '(u)nion', 'c(ue0', 'f(ew)', 'm(u)te'.

/jU/: 'c(u)rable', 'f(u)ry'.

/z/: '(z)one', 'rai(se)'.

/Z/: zh.
     'vi(si)on', 'azure' /'aZ R/.

     The IPA symbol is a yogh: like a flat-topped '3' lowered so that
     the top is the height of that of a 'z'.
[Return to Top]