This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
mktables: Accommodate new @missings in Unicode 6.1
[perl5.git] / lib / unicore / ArabicShaping.txt
# ArabicShaping-6.0.0.txt
# Date: 2010-04-30, 13:47:00 PDT [KW]
#
# This file is a normative contributory data file in the
# Unicode Character Database.
#
# Copyright (c) 1991-2010 Unicode, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# This file defines the shaping classes for Arabic, Syriac, and N'Ko
# positional shaping, repeating in machine readable form the
# information exemplified in Tables 8-3, 8-7, 8-8, 8-11, 8-12,
# 8-13, and 13-5 of The Unicode Standard, Version 6.0.
#
# See sections 8.2, 8.3, and 13.5 of The Unicode Standard, Version 6.0
# for more information.
#
# Each line contains four fields, separated by a semicolon.
#
# Field 0: the code point, in 4-digit hexadecimal
#   form, of an Arabic, Syriac, or N'Ko character.
#
# Field 1: gives a short schematic name for that character,
#   abbreviated from the normative Unicode character name.
#   Note that this schematic name is considered a comment,
#   and does not constitute a formal property value.
#
# Field 2: defines the joining type (property name: Joining_Type)
#   R Right_Joining
#   L Left_Joining
#   D Dual_Joining
#   C Join_Causing
#   U Non_Joining
#   T Transparent
#   See Section 8.2, Arabic for more information on these types.
#
# Field 3: defines the joining group (property name: Joining_Group)
#
# The values of the joining group are based schematically on character
# names. Where a schematic character name consists of two or more parts separated
# by spaces, the formal Joining_Group property value, as specified in
# PropertyValueAliases.txt, consists of the same name parts joined by
# underscores. Hence, the entry:
#
# 0629; TEH MARBUTA; R; TEH MARBUTA
#
# corresponds to [Joining_Group = Teh_Marbuta].
#
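# The naming rule above can be sketched in Python. This is only an
# illustration: the function name joining_group_value is hypothetical, and
# the title-casing is an assumption made to match the spelling
# "Teh_Marbuta" shown in the example. The authoritative spellings are the
# ones listed in PropertyValueAliases.txt.

```python
def joining_group_value(schematic_name: str) -> str:
    """Join the space-separated parts of a field 3 value with underscores."""
    parts = schematic_name.split()
    # Assumption: all-uppercase parts are title-cased ("TEH" -> "Teh").
    # Values that are already formal, such as No_Joining_Group, contain
    # lowercase letters and are passed through unchanged.
    return "_".join(p.capitalize() if p.isupper() else p for p in parts)


print(joining_group_value("TEH MARBUTA"))
print(joining_group_value("No_Joining_Group"))
```

# The two calls print Teh_Marbuta and No_Joining_Group respectively.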
# Note: The property value now designated [Joining_Group = Teh_Marbuta_Goal]
#   used to apply to both of the following characters
#   in earlier versions of the standard:
#
#     U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
#     U+06C3 ARABIC LETTER TEH MARBUTA GOAL
#
#   However, it currently applies only to U+06C3, and *not* to U+06C2.
#   To avoid destabilizing existing Joining_Group property aliases, the
#   prior Joining_Group value for U+06C3 (Hamza_On_Heh_Goal) has been
#   retained as a property value alias, despite the fact that it
#   no longer applies to its namesake character, U+06C2.
#   See PropertyValueAliases.txt.
#
# When other cursive scripts are added to the Unicode Standard in
# the future, the joining group value of all their letters will default
# to jg=No_Joining_Group in this data file. Other, more specific
# joining group values will be defined only if an explicit proposal
# to define those values exactly has been approved by the UTC. This
# is the convention exemplified by the N'Ko script. Only the Arabic
# and Syriac scripts currently have explicit joining group values defined.
#
# Note: Code points that are not explicitly listed in this file are
# either of joining type T or U:
#
# - Those not explicitly listed that are of General Category Mn, Me, or Cf
#   have joining type T.
# - All others not explicitly listed have joining type U.
#
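# The default rule for unlisted code points can be sketched as follows.
# This is a minimal illustration, not part of the data file: the function
# name default_joining_type is hypothetical, and the General Category comes
# from Python's unicodedata module, whose Unicode version depends on the
# Python build and is generally newer than 6.0. The function must only be
# applied to code points that have no explicit entry in this file.

```python
import unicodedata


def default_joining_type(code_point: int) -> str:
    """Return the joining type of a code point NOT listed in this file."""
    category = unicodedata.category(chr(code_point))
    # Mn, Me and Cf default to Transparent; everything else is Non_Joining.
    return "T" if category in ("Mn", "Me", "Cf") else "U"


print(default_joining_type(0x0301))  # COMBINING ACUTE ACCENT, category Mn
print(default_joining_type(0x0041))  # LATIN CAPITAL LETTER A, category Lu
```

# The first call yields T and the second yields U.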
# For an explicit listing of characters of joining type T, see
# the derived property file DerivedJoiningType.txt.
#
# There are currently no characters of joining type L defined in Unicode.
#
# #############################################################

# Unicode; Schematic Name; Joining Type; Joining Group

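# A reader for the four-field records described in the header might look
# like the sketch below. The names ShapingEntry, JOINING_TYPES and
# parse_arabic_shaping are hypothetical. The sketch assumes that "#" starts
# a comment running to the end of the line and that blank lines are ignored.

```python
from typing import Iterable, Iterator, NamedTuple

# Short joining type codes (field 2) and their long property value names.
JOINING_TYPES = {
    "R": "Right_Joining",
    "L": "Left_Joining",
    "D": "Dual_Joining",
    "C": "Join_Causing",
    "U": "Non_Joining",
    "T": "Transparent",
}


class ShapingEntry(NamedTuple):
    code_point: int      # field 0, parsed from hexadecimal
    schematic_name: str  # field 1, informative only
    joining_type: str    # field 2, one of the JOINING_TYPES keys
    joining_group: str   # field 3, schematic joining group name


def parse_arabic_shaping(lines: Iterable[str]) -> Iterator[ShapingEntry]:
    """Yield one ShapingEntry per data line, skipping comments and blanks."""
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {raw!r}")
        code_point, name, joining_type, joining_group = fields
        if joining_type not in JOINING_TYPES:
            raise ValueError(f"unknown joining type {joining_type!r}: {raw!r}")
        yield ShapingEntry(int(code_point, 16), name, joining_type, joining_group)


sample = [
    "# Arabic characters",
    "",
    "0629; TEH MARBUTA; R; TEH MARBUTA",
    "0640; TATWEEL; C; No_Joining_Group",
]
for entry in parse_arabic_shaping(sample):
    print(f"U+{entry.code_point:04X}", JOINING_TYPES[entry.joining_type], entry.joining_group)
```

# Rejecting lines with an unexpected field count or joining type is a
# design choice of this sketch, so that a damaged file fails loudly rather
# than producing partial data.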
# Arabic characters

0600; ARABIC NUMBER SIGN; U; No_Joining_Group
0601; ARABIC SIGN SANAH; U; No_Joining_Group
0602; ARABIC FOOTNOTE MARKER; U; No_Joining_Group
0603; ARABIC SIGN SAFHA; U; No_Joining_Group
0608; ARABIC RAY; U; No_Joining_Group
060B; AFGHANI SIGN; U; No_Joining_Group
0620; YEH WITH RING; D; YEH
0621; HAMZA; U; No_Joining_Group
0622; MADDA ON ALEF; R; ALEF
0623; HAMZA ON ALEF; R; ALEF
0624; HAMZA ON WAW; R; WAW
0625; HAMZA UNDER ALEF; R; ALEF
0626; HAMZA ON YEH; D; YEH
0627; ALEF; R; ALEF
0628; BEH; D; BEH
0629; TEH MARBUTA; R; TEH MARBUTA
062A; TEH; D; BEH
062B; THEH; D; BEH
062C; JEEM; D; HAH
062D; HAH; D; HAH
062E; KHAH; D; HAH
062F; DAL; R; DAL
0630; THAL; R; DAL
0631; REH; R; REH
0632; ZAIN; R; REH
0633; SEEN; D; SEEN
0634; SHEEN; D; SEEN
0635; SAD; D; SAD
0636; DAD; D; SAD
0637; TAH; D; TAH
0638; ZAH; D; TAH
0639; AIN; D; AIN
063A; GHAIN; D; AIN
063B; KEHEH WITH 2 DOTS ABOVE; D; GAF
063C; KEHEH WITH 3 DOTS BELOW; D; GAF
063D; FARSI YEH WITH INVERTED V; D; FARSI YEH
063E; FARSI YEH WITH 2 DOTS ABOVE; D; FARSI YEH
063F; FARSI YEH WITH 3 DOTS ABOVE; D; FARSI YEH
0640; TATWEEL; C; No_Joining_Group
0641; FEH; D; FEH
0642; QAF; D; QAF
0643; KAF; D; KAF
0644; LAM; D; LAM
0645; MEEM; D; MEEM
0646; NOON; D; NOON
0647; HEH; D; HEH
0648; WAW; R; WAW
0649; ALEF MAKSURA; D; YEH
064A; YEH; D; YEH
066E; DOTLESS BEH; D; BEH
066F; DOTLESS QAF; D; QAF
0671; HAMZAT WASL ON ALEF; R; ALEF
0672; WAVY HAMZA ON ALEF; R; ALEF
0673; WAVY HAMZA UNDER ALEF; R; ALEF
0674; HIGH HAMZA; U; No_Joining_Group
0675; HIGH HAMZA ALEF; R; ALEF
0676; HIGH HAMZA WAW; R; WAW
0677; HIGH HAMZA WAW WITH DAMMA; R; WAW
0678; HIGH HAMZA YEH; D; YEH
0679; TEH WITH SMALL TAH; D; BEH
067A; TEH WITH 2 DOTS VERTICAL ABOVE; D; BEH
067B; BEH WITH 2 DOTS VERTICAL BELOW; D; BEH
067C; TEH WITH RING; D; BEH
067D; TEH WITH 3 DOTS ABOVE DOWNWARD; D; BEH
067E; TEH WITH 3 DOTS BELOW; D; BEH
067F; TEH WITH 4 DOTS ABOVE; D; BEH
0680; BEH WITH 4 DOTS BELOW; D; BEH
0681; HAMZA ON HAH; D; HAH
0682; HAH WITH 2 DOTS VERTICAL ABOVE; D; HAH
0683; HAH WITH MIDDLE 2 DOTS; D; HAH
0684; HAH WITH MIDDLE 2 DOTS VERTICAL; D; HAH
0685; HAH WITH 3 DOTS ABOVE; D; HAH
0686; HAH WITH MIDDLE 3 DOTS DOWNWARD; D; HAH
0687; HAH WITH MIDDLE 4 DOTS; D; HAH
0688; DAL WITH SMALL TAH; R; DAL
0689; DAL WITH RING; R; DAL
068A; DAL WITH DOT BELOW; R; DAL
068B; DAL WITH DOT BELOW AND SMALL TAH; R; DAL
068C; DAL WITH 2 DOTS ABOVE; R; DAL
068D; DAL WITH 2 DOTS BELOW; R; DAL
068E; DAL WITH 3 DOTS ABOVE; R; DAL
068F; DAL WITH 3 DOTS ABOVE DOWNWARD; R; DAL
0690; DAL WITH 4 DOTS ABOVE; R; DAL
0691; REH WITH SMALL TAH; R; REH
0692; REH WITH SMALL V; R; REH
0693; REH WITH RING; R; REH
0694; REH WITH DOT BELOW; R; REH
0695; REH WITH SMALL V BELOW; R; REH
0696; REH WITH DOT BELOW AND DOT ABOVE; R; REH
0697; REH WITH 2 DOTS ABOVE; R; REH
0698; REH WITH 3 DOTS ABOVE; R; REH
0699; REH WITH 4 DOTS ABOVE; R; REH
069A; SEEN WITH DOT BELOW AND DOT ABOVE; D; SEEN
069B; SEEN WITH 3 DOTS BELOW; D; SEEN
069C; SEEN WITH 3 DOTS BELOW AND 3 DOTS ABOVE; D; SEEN
069D; SAD WITH 2 DOTS BELOW; D; SAD
069E; SAD WITH 3 DOTS ABOVE; D; SAD
069F; TAH WITH 3 DOTS ABOVE; D; TAH
06A0; AIN WITH 3 DOTS ABOVE; D; AIN
06A1; DOTLESS FEH; D; FEH
06A2; FEH WITH DOT MOVED BELOW; D; FEH
06A3; FEH WITH DOT BELOW; D; FEH
06A4; FEH WITH 3 DOTS ABOVE; D; FEH
06A5; FEH WITH 3 DOTS BELOW; D; FEH
06A6; FEH WITH 4 DOTS ABOVE; D; FEH
06A7; QAF WITH DOT ABOVE; D; QAF
06A8; QAF WITH 3 DOTS ABOVE; D; QAF
06A9; KEHEH; D; GAF
06AA; SWASH KAF; D; SWASH KAF
06AB; KAF WITH RING; D; GAF
06AC; KAF WITH DOT ABOVE; D; KAF
06AD; KAF WITH 3 DOTS ABOVE; D; KAF
06AE; KAF WITH 3 DOTS BELOW; D; KAF
06AF; GAF; D; GAF
06B0; GAF WITH RING; D; GAF
06B1; GAF WITH 2 DOTS ABOVE; D; GAF
06B2; GAF WITH 2 DOTS BELOW; D; GAF
06B3; GAF WITH 2 DOTS VERTICAL BELOW; D; GAF
06B4; GAF WITH 3 DOTS ABOVE; D; GAF
06B5; LAM WITH SMALL V; D; LAM
06B6; LAM WITH DOT ABOVE; D; LAM
06B7; LAM WITH 3 DOTS ABOVE; D; LAM
06B8; LAM WITH 3 DOTS BELOW; D; LAM
06B9; NOON WITH DOT BELOW; D; NOON
06BA; DOTLESS NOON; D; NOON
06BB; DOTLESS NOON WITH SMALL TAH; D; NOON
06BC; NOON WITH RING; D; NOON
06BD; NYA; D; NYA
06BE; KNOTTED HEH; D; KNOTTED HEH
06BF; HAH WITH MIDDLE 3 DOTS DOWNWARD AND DOT ABOVE; D; HAH
06C0; HAMZA ON HEH; R; TEH MARBUTA
06C1; HEH GOAL; D; HEH GOAL
06C2; HAMZA ON HEH GOAL; D; HEH GOAL
06C3; TEH MARBUTA GOAL; R; TEH MARBUTA GOAL
06C4; WAW WITH RING; R; WAW
06C5; WAW WITH BAR; R; WAW
06C6; WAW WITH SMALL V; R; WAW
06C7; WAW WITH DAMMA; R; WAW
06C8; WAW WITH ALEF ABOVE; R; WAW
06C9; WAW WITH INVERTED SMALL V; R; WAW
06CA; WAW WITH 2 DOTS ABOVE; R; WAW
06CB; WAW WITH 3 DOTS ABOVE; R; WAW
06CC; FARSI YEH; D; FARSI YEH
06CD; YEH WITH TAIL; R; YEH WITH TAIL
06CE; FARSI YEH WITH SMALL V; D; FARSI YEH
06CF; WAW WITH DOT ABOVE; R; WAW
06D0; YEH WITH 2 DOTS VERTICAL BELOW; D; YEH
06D1; YEH WITH 3 DOTS BELOW; D; YEH
06D2; YEH BARREE; R; YEH BARREE
06D3; HAMZA ON YEH BARREE; R; YEH BARREE
06D5; AE; R; TEH MARBUTA
06DD; ARABIC END OF AYAH; U; No_Joining_Group
06EE; DAL WITH INVERTED V; R; DAL
06EF; REH WITH INVERTED V; R; REH
06FA; SEEN WITH DOT BELOW AND 3 DOTS ABOVE; D; SEEN
06FB; DAD WITH DOT BELOW; D; SAD
06FC; GHAIN WITH DOT BELOW; D; AIN
06FF; HEH WITH INVERTED V; D; KNOTTED HEH

# Syriac characters

0710; ALAPH; R; ALAPH
0712; BETH; D; BETH
0713; GAMAL; D; GAMAL
0714; GAMAL GARSHUNI; D; GAMAL
0715; DALATH; R; DALATH RISH
0716; DOTLESS DALATH RISH; R; DALATH RISH
0717; HE; R; HE
0718; WAW; R; SYRIAC WAW
0719; ZAIN; R; ZAIN
071A; HETH; D; HETH
071B; TETH; D; TETH
071C; TETH GARSHUNI; D; TETH
071D; YUDH; D; YUDH
071E; YUDH HE; R; YUDH HE
071F; KAPH; D; KAPH
0720; LAMADH; D; LAMADH
0721; MIM; D; MIM
0722; NUN; D; NUN
0723; SEMKATH; D; SEMKATH
0724; FINAL SEMKATH; D; FINAL SEMKATH
0725; E; D; E
0726; PE; D; PE
0727; REVERSED PE; D; REVERSED PE
0728; SADHE; R; SADHE
0729; QAPH; D; QAPH
072A; RISH; R; DALATH RISH
072B; SHIN; D; SHIN
072C; TAW; R; TAW
072D; PERSIAN BHETH; D; BETH
072E; PERSIAN GHAMAL; D; GAMAL
072F; PERSIAN DHALATH; R; DALATH RISH
074D; SOGDIAN ZHAIN; R; ZHAIN
074E; SOGDIAN KHAPH; D; KHAPH
074F; SOGDIAN FE; D; FE

# Arabic supplement characters

0750; BEH WITH 3 DOTS HORIZONTALLY BELOW; D; BEH
0751; BEH WITH DOT BELOW AND 3 DOTS ABOVE; D; BEH
0752; BEH WITH 3 DOTS POINTING UPWARDS BELOW; D; BEH
0753; BEH WITH 3 DOTS POINTING UPWARDS BELOW AND 2 DOTS ABOVE; D; BEH
0754; BEH WITH 2 DOTS BELOW AND DOT ABOVE; D; BEH
0755; BEH WITH INVERTED SMALL V BELOW; D; BEH
0756; BEH WITH SMALL V; D; BEH
0757; HAH WITH 2 DOTS ABOVE; D; HAH
0758; HAH WITH 3 DOTS POINTING UPWARDS BELOW; D; HAH
0759; DAL WITH 2 DOTS VERTICALLY BELOW AND SMALL TAH; R; DAL
075A; DAL WITH INVERTED SMALL V BELOW; R; DAL
075B; REH WITH STROKE; R; REH
075C; SEEN WITH 4 DOTS ABOVE; D; SEEN
075D; AIN WITH 2 DOTS ABOVE; D; AIN
075E; AIN WITH 3 DOTS POINTING DOWNWARDS ABOVE; D; AIN
075F; AIN WITH 2 DOTS VERTICALLY ABOVE; D; AIN
0760; FEH WITH 2 DOTS BELOW; D; FEH
0761; FEH WITH 3 DOTS POINTING UPWARDS BELOW; D; FEH
0762; KEHEH WITH DOT ABOVE; D; GAF
0763; KEHEH WITH 3 DOTS ABOVE; D; GAF
0764; KEHEH WITH 3 DOTS POINTING UPWARDS BELOW; D; GAF
0765; MEEM WITH DOT ABOVE; D; MEEM
0766; MEEM WITH DOT BELOW; D; MEEM
0767; NOON WITH 2 DOTS BELOW; D; NOON
0768; NOON WITH SMALL TAH; D; NOON
0769; NOON WITH SMALL V; D; NOON
076A; LAM WITH BAR; D; LAM
076B; REH WITH 2 DOTS VERTICALLY ABOVE; R; REH
076C; REH WITH HAMZA ABOVE; R; REH
076D; SEEN WITH 2 DOTS VERTICALLY ABOVE; D; SEEN
076E; HAH WITH SMALL TAH BELOW; D; HAH
076F; HAH WITH SMALL TAH AND 2 DOTS; D; HAH
0770; SEEN WITH SMALL TAH AND 2 DOTS; D; SEEN
0771; REH WITH SMALL TAH AND 2 DOTS; R; REH
0772; HAH WITH SMALL TAH ABOVE; D; HAH
0773; ALEF WITH DIGIT TWO ABOVE; R; ALEF
0774; ALEF WITH DIGIT THREE ABOVE; R; ALEF
0775; FARSI YEH WITH DIGIT TWO ABOVE; D; FARSI YEH
0776; FARSI YEH WITH DIGIT THREE ABOVE; D; FARSI YEH
0777; YEH WITH DIGIT FOUR BELOW; D; YEH
0778; WAW WITH DIGIT TWO ABOVE; R; WAW
0779; WAW WITH DIGIT THREE ABOVE; R; WAW
077A; YEH BARREE WITH DIGIT TWO ABOVE; D; BURUSHASKI YEH BARREE
077B; YEH BARREE WITH DIGIT THREE ABOVE; D; BURUSHASKI YEH BARREE
077C; HAH WITH DIGIT FOUR BELOW; D; HAH
077D; SEEN WITH DIGIT FOUR ABOVE; D; SEEN
077E; SEEN WITH INVERTED V; D; SEEN
077F; KAF WITH 2 DOTS ABOVE; D; KAF

# N'Ko Characters

07CA; NKO A; D; No_Joining_Group
07CB; NKO EE; D; No_Joining_Group
07CC; NKO I; D; No_Joining_Group
07CD; NKO E; D; No_Joining_Group
07CE; NKO U; D; No_Joining_Group
07CF; NKO OO; D; No_Joining_Group
07D0; NKO O; D; No_Joining_Group
07D1; NKO DAGBASINNA; D; No_Joining_Group
07D2; NKO N; D; No_Joining_Group
07D3; NKO BA; D; No_Joining_Group
07D4; NKO PA; D; No_Joining_Group
07D5; NKO TA; D; No_Joining_Group
07D6; NKO JA; D; No_Joining_Group
07D7; NKO CHA; D; No_Joining_Group
07D8; NKO DA; D; No_Joining_Group
07D9; NKO RA; D; No_Joining_Group
07DA; NKO RRA; D; No_Joining_Group
07DB; NKO SA; D; No_Joining_Group
07DC; NKO GBA; D; No_Joining_Group
07DD; NKO FA; D; No_Joining_Group
07DE; NKO KA; D; No_Joining_Group
07DF; NKO LA; D; No_Joining_Group
07E0; NKO NA WOLOSO; D; No_Joining_Group
07E1; NKO MA; D; No_Joining_Group
07E2; NKO NYA; D; No_Joining_Group
07E3; NKO NA; D; No_Joining_Group
07E4; NKO HA; D; No_Joining_Group
07E5; NKO WA; D; No_Joining_Group
07E6; NKO YA; D; No_Joining_Group
07E7; NKO NYA WOLOSO; D; No_Joining_Group
07E8; NKO JONA JA; D; No_Joining_Group
07E9; NKO JONA CHA; D; No_Joining_Group
07EA; NKO JONA RA; D; No_Joining_Group
07FA; NKO LAJANYALAN; C; No_Joining_Group

# Other

200C; ZERO WIDTH NON-JOINER; U; No_Joining_Group
200D; ZERO WIDTH JOINER; C; No_Joining_Group

# EOF