This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
[perl5.git] / lib / unicore / ArabicShaping.txt
# ArabicShaping-5.1.0.txt
# Date: 2008-03-20, 17:39:00 PDT [KW]
#
# This file is a normative contributory data file in the
# Unicode Character Database.
#
# Copyright (c) 1991-2008 Unicode, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# This file defines the shaping classes for Arabic and Syriac
# positional shaping, repeating in machine readable form the
# information printed in Tables 8-3, 8-7, 8-8, 8-11, 8-12, and
# 8-13 of The Unicode Standard, Version 5.0.
#
# See sections 8.2 and 8.3 of The Unicode Standard, Version 5.0
# for more information.
#
# Each line contains four fields, separated by a semicolon.
#
# Field 0: the code point, in 4-digit hexadecimal
#   form, of an Arabic or Syriac character.
# Field 1: gives a short schematic name for that character,
#   abbreviated from the normative Unicode character name.
# Field 2: defines the joining type (property name: Joining_Type)
#   R Right_Joining
#   L Left_Joining
#   D Dual_Joining
#   C Join_Causing
#   U Non_Joining
#   T Transparent
#   See the Arabic block description for more information on these types.
# Field 3: defines the joining group (property name: Joining_Group)
#
# The values of the joining group are based schematically on character
# names. Where a schematic character name consists of two or more parts separated
# by spaces, the formal Joining_Group property value, as specified in
# PropertyValueAliases.txt, consists of the same name parts joined by
# underscores. Hence, the entry:
#
#   0629; TEH MARBUTA; R; TEH MARBUTA
#
# corresponds to [Joining_Group = Teh_Marbuta].
#
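# The field layout and the Joining_Group naming rule described above can be
# sketched in Python as follows. This is a minimal illustration and not part
# of the Unicode Character Database: the names ShapingEntry, parse_line, and
# joining_group_alias are hypothetical, and title-casing each name part is an
# assumption inferred from the single Teh_Marbuta example.

```python
from typing import NamedTuple, Optional

# Joining type codes as listed under Field 2.
JOINING_TYPES = {
    "R": "Right_Joining",
    "L": "Left_Joining",
    "D": "Dual_Joining",
    "C": "Join_Causing",
    "U": "Non_Joining",
    "T": "Transparent",
}


class ShapingEntry(NamedTuple):
    """One data line of the file (hypothetical container type)."""
    code_point: int       # Field 0, parsed from hexadecimal
    schematic_name: str   # Field 1
    joining_type: str     # Field 2, one of the keys of JOINING_TYPES
    joining_group: str    # Field 3, schematic form (may contain spaces)


def parse_line(line: str) -> Optional[ShapingEntry]:
    """Parse one line; return None for comment-only or blank lines."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    code, name, joining_type, group = fields
    if joining_type not in JOINING_TYPES:
        raise ValueError(f"unknown joining type {joining_type!r}")
    return ShapingEntry(int(code, 16), name, joining_type, group)


def joining_group_alias(group: str) -> str:
    """Join the space-separated name parts with underscores.

    Values that already contain an underscore (such as No_Joining_Group)
    are returned unchanged. Capitalizing each part is an assumption.
    """
    if "_" in group:
        return group
    return "_".join(part.capitalize() for part in group.split())
```

# For example, parse_line("0629; TEH MARBUTA; R; TEH MARBUTA") yields an entry
# whose joining_group "TEH MARBUTA" is mapped by joining_group_alias to
# "Teh_Marbuta".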
# Note: For historical reasons, the property value [Joining_Group = Hamza_On_Heh_Goal]
# is anachronistically named. It used to apply to both of the following characters
# in earlier versions of the standard:
#
#   U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
#   U+06C3 ARABIC LETTER TEH MARBUTA GOAL
#
# However, it currently applies only to U+06C3, and *not* to U+06C2.
# To avoid destabilizing existing Joining_Group property aliases, the
# value Hamza_On_Heh_Goal has not been changed, despite the fact that it
# no longer applies to Hamza On Heh Goal, but only to Teh Marbuta Goal.
#
# Note: Code points that are not explicitly listed in this file are
# either of joining type T or U:
#
# - Those not explicitly listed that are of General Category Mn, Me, or Cf
#   have joining type T.
# - All others not explicitly listed have type U.
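#
# The default rule above can be sketched in Python. This is an illustration
# under stated assumptions: default_joining_type and joining_type are
# hypothetical names, and Python's unicodedata module reports the General
# Category from whichever Unicode version ships with the interpreter, which
# is generally not 5.1.0.

```python
import unicodedata

# General Categories whose unlisted members default to Transparent (T).
TRANSPARENT_CATEGORIES = {"Mn", "Me", "Cf"}


def default_joining_type(code_point: int) -> str:
    """Joining type for a code point that is NOT listed in this file."""
    category = unicodedata.category(chr(code_point))
    return "T" if category in TRANSPARENT_CATEGORIES else "U"


def joining_type(code_point: int, explicit: dict) -> str:
    """Look up a joining type, preferring explicitly listed entries.

    `explicit` maps code points to the Field 2 value read from the file.
    """
    if code_point in explicit:
        return explicit[code_point]
    return default_joining_type(code_point)
```

# Explicit entries take precedence over the default: U+200D ZERO WIDTH JOINER
# has General Category Cf, but it is listed in this file with type C.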
#
# For an explicit listing of characters of joining type T, see
# the derived property file DerivedJoiningType.txt.
#
# There are currently no characters of type L defined in Unicode.
#
# #############################################################

# Unicode; Schematic Name; Joining Type; Joining Group

# Arabic characters

0600; ARABIC NUMBER SIGN; U; No_Joining_Group
0601; ARABIC SIGN SANAH; U; No_Joining_Group
0602; ARABIC FOOTNOTE MARKER; U; No_Joining_Group
0603; ARABIC SIGN SAFHA; U; No_Joining_Group
0608; ARABIC RAY; U; No_Joining_Group
060B; AFGHANI SIGN; U; No_Joining_Group
0621; HAMZA; U; No_Joining_Group
0622; MADDA ON ALEF; R; ALEF
0623; HAMZA ON ALEF; R; ALEF
0624; HAMZA ON WAW; R; WAW
0625; HAMZA UNDER ALEF; R; ALEF
0626; HAMZA ON YEH; D; YEH
0627; ALEF; R; ALEF
0628; BEH; D; BEH
0629; TEH MARBUTA; R; TEH MARBUTA
062A; TEH; D; BEH
062B; THEH; D; BEH
062C; JEEM; D; HAH
062D; HAH; D; HAH
062E; KHAH; D; HAH
062F; DAL; R; DAL
0630; THAL; R; DAL
0631; REH; R; REH
0632; ZAIN; R; REH
0633; SEEN; D; SEEN
0634; SHEEN; D; SEEN
0635; SAD; D; SAD
0636; DAD; D; SAD
0637; TAH; D; TAH
0638; ZAH; D; TAH
0639; AIN; D; AIN
063A; GHAIN; D; AIN
063B; KEHEH WITH 2 DOTS ABOVE; D; GAF
063C; KEHEH WITH 3 DOTS BELOW; D; GAF
063D; FARSI YEH WITH INVERTED V; D; YEH
063E; FARSI YEH WITH 2 DOTS ABOVE; D; YEH
063F; FARSI YEH WITH 3 DOTS ABOVE; D; YEH
0640; TATWEEL; C; No_Joining_Group
0641; FEH; D; FEH
0642; QAF; D; QAF
0643; KAF; D; KAF
0644; LAM; D; LAM
0645; MEEM; D; MEEM
0646; NOON; D; NOON
0647; HEH; D; HEH
0648; WAW; R; WAW
0649; ALEF MAKSURA; D; YEH
064A; YEH; D; YEH
066E; DOTLESS BEH; D; BEH
066F; DOTLESS QAF; D; QAF
0671; HAMZAT WASL ON ALEF; R; ALEF
0672; WAVY HAMZA ON ALEF; R; ALEF
0673; WAVY HAMZA UNDER ALEF; R; ALEF
0674; HIGH HAMZA; U; No_Joining_Group
0675; HIGH HAMZA ALEF; R; ALEF
0676; HIGH HAMZA WAW; R; WAW
0677; HIGH HAMZA WAW WITH DAMMA; R; WAW
0678; HIGH HAMZA YEH; D; YEH
0679; TEH WITH SMALL TAH; D; BEH
067A; TEH WITH 2 DOTS VERTICAL ABOVE; D; BEH
067B; BEH WITH 2 DOTS VERTICAL BELOW; D; BEH
067C; TEH WITH RING; D; BEH
067D; TEH WITH 3 DOTS ABOVE DOWNWARD; D; BEH
067E; TEH WITH 3 DOTS BELOW; D; BEH
067F; TEH WITH 4 DOTS ABOVE; D; BEH
0680; BEH WITH 4 DOTS BELOW; D; BEH
0681; HAMZA ON HAH; D; HAH
0682; HAH WITH 2 DOTS VERTICAL ABOVE; D; HAH
0683; HAH WITH MIDDLE 2 DOTS; D; HAH
0684; HAH WITH MIDDLE 2 DOTS VERTICAL; D; HAH
0685; HAH WITH 3 DOTS ABOVE; D; HAH
0686; HAH WITH MIDDLE 3 DOTS DOWNWARD; D; HAH
0687; HAH WITH MIDDLE 4 DOTS; D; HAH
0688; DAL WITH SMALL TAH; R; DAL
0689; DAL WITH RING; R; DAL
068A; DAL WITH DOT BELOW; R; DAL
068B; DAL WITH DOT BELOW AND SMALL TAH; R; DAL
068C; DAL WITH 2 DOTS ABOVE; R; DAL
068D; DAL WITH 2 DOTS BELOW; R; DAL
068E; DAL WITH 3 DOTS ABOVE; R; DAL
068F; DAL WITH 3 DOTS ABOVE DOWNWARD; R; DAL
0690; DAL WITH 4 DOTS ABOVE; R; DAL
0691; REH WITH SMALL TAH; R; REH
0692; REH WITH SMALL V; R; REH
0693; REH WITH RING; R; REH
0694; REH WITH DOT BELOW; R; REH
0695; REH WITH SMALL V BELOW; R; REH
0696; REH WITH DOT BELOW AND DOT ABOVE; R; REH
0697; REH WITH 2 DOTS ABOVE; R; REH
0698; REH WITH 3 DOTS ABOVE; R; REH
0699; REH WITH 4 DOTS ABOVE; R; REH
069A; SEEN WITH DOT BELOW AND DOT ABOVE; D; SEEN
069B; SEEN WITH 3 DOTS BELOW; D; SEEN
069C; SEEN WITH 3 DOTS BELOW AND 3 DOTS ABOVE; D; SEEN
069D; SAD WITH 2 DOTS BELOW; D; SAD
069E; SAD WITH 3 DOTS ABOVE; D; SAD
069F; TAH WITH 3 DOTS ABOVE; D; TAH
06A0; AIN WITH 3 DOTS ABOVE; D; AIN
06A1; DOTLESS FEH; D; FEH
06A2; FEH WITH DOT MOVED BELOW; D; FEH
06A3; FEH WITH DOT BELOW; D; FEH
06A4; FEH WITH 3 DOTS ABOVE; D; FEH
06A5; FEH WITH 3 DOTS BELOW; D; FEH
06A6; FEH WITH 4 DOTS ABOVE; D; FEH
06A7; QAF WITH DOT ABOVE; D; QAF
06A8; QAF WITH 3 DOTS ABOVE; D; QAF
06A9; KEHEH; D; GAF
06AA; SWASH KAF; D; SWASH KAF
06AB; KAF WITH RING; D; GAF
06AC; KAF WITH DOT ABOVE; D; KAF
06AD; KAF WITH 3 DOTS ABOVE; D; KAF
06AE; KAF WITH 3 DOTS BELOW; D; KAF
06AF; GAF; D; GAF
06B0; GAF WITH RING; D; GAF
06B1; GAF WITH 2 DOTS ABOVE; D; GAF
06B2; GAF WITH 2 DOTS BELOW; D; GAF
06B3; GAF WITH 2 DOTS VERTICAL BELOW; D; GAF
06B4; GAF WITH 3 DOTS ABOVE; D; GAF
06B5; LAM WITH SMALL V; D; LAM
06B6; LAM WITH DOT ABOVE; D; LAM
06B7; LAM WITH 3 DOTS ABOVE; D; LAM
06B8; LAM WITH 3 DOTS BELOW; D; LAM
06B9; NOON WITH DOT BELOW; D; NOON
06BA; DOTLESS NOON; D; NOON
06BB; DOTLESS NOON WITH SMALL TAH; D; NOON
06BC; NOON WITH RING; D; NOON
06BD; NOON WITH 3 DOTS ABOVE; D; NOON
06BE; KNOTTED HEH; D; KNOTTED HEH
06BF; HAH WITH MIDDLE 3 DOTS DOWNWARD AND DOT ABOVE; D; HAH
06C0; HAMZA ON HEH; R; TEH MARBUTA
06C1; HEH GOAL; D; HEH GOAL
06C2; HAMZA ON HEH GOAL; D; HEH GOAL
06C3; TEH MARBUTA GOAL; R; HAMZA ON HEH GOAL
06C4; WAW WITH RING; R; WAW
06C5; WAW WITH BAR; R; WAW
06C6; WAW WITH SMALL V; R; WAW
06C7; WAW WITH DAMMA; R; WAW
06C8; WAW WITH ALEF ABOVE; R; WAW
06C9; WAW WITH INVERTED SMALL V; R; WAW
06CA; WAW WITH 2 DOTS ABOVE; R; WAW
06CB; WAW WITH 3 DOTS ABOVE; R; WAW
06CC; DOTLESS YEH; D; YEH
06CD; YEH WITH TAIL; R; YEH WITH TAIL
06CE; YEH WITH SMALL V; D; YEH
06CF; WAW WITH DOT ABOVE; R; WAW
06D0; YEH WITH 2 DOTS VERTICAL BELOW; D; YEH
06D1; YEH WITH 3 DOTS BELOW; D; YEH
06D2; YEH BARREE; R; YEH BARREE
06D3; HAMZA ON YEH BARREE; R; YEH BARREE
06D5; AE; R; TEH MARBUTA
06DD; ARABIC END OF AYAH; U; No_Joining_Group
06EE; DAL WITH INVERTED V; R; DAL
06EF; REH WITH INVERTED V; R; REH
06FA; SEEN WITH DOT BELOW AND 3 DOTS ABOVE; D; SEEN
06FB; DAD WITH DOT BELOW; D; SAD
06FC; GHAIN WITH DOT BELOW; D; AIN
06FF; HEH WITH INVERTED V; D; KNOTTED HEH

# Syriac characters

0710; ALAPH; R; ALAPH
0712; BETH; D; BETH
0713; GAMAL; D; GAMAL
0714; GAMAL GARSHUNI; D; GAMAL
0715; DALATH; R; DALATH RISH
0716; DOTLESS DALATH RISH; R; DALATH RISH
0717; HE; R; HE
0718; WAW; R; SYRIAC WAW
0719; ZAIN; R; ZAIN
071A; HETH; D; HETH
071B; TETH; D; TETH
071C; TETH GARSHUNI; D; TETH
071D; YUDH; D; YUDH
071E; YUDH HE; R; YUDH HE
071F; KAPH; D; KAPH
0720; LAMADH; D; LAMADH
0721; MIM; D; MIM
0722; NUN; D; NUN
0723; SEMKATH; D; SEMKATH
0724; FINAL SEMKATH; D; FINAL SEMKATH
0725; E; D; E
0726; PE; D; PE
0727; REVERSED PE; D; REVERSED PE
0728; SADHE; R; SADHE
0729; QAPH; D; QAPH
072A; RISH; R; DALATH RISH
072B; SHIN; D; SHIN
072C; TAW; R; TAW
072D; PERSIAN BHETH; D; BETH
072E; PERSIAN GHAMAL; D; GAMAL
072F; PERSIAN DHALATH; R; DALATH RISH
074D; SOGDIAN ZHAIN; R; ZHAIN
074E; SOGDIAN KHAPH; D; KHAPH
074F; SOGDIAN FE; D; FE

# Arabic supplement characters

0750; BEH WITH 3 DOTS HORIZONTALLY BELOW; D; BEH
0751; BEH WITH DOT BELOW AND 3 DOTS ABOVE; D; BEH
0752; BEH WITH 3 DOTS POINTING UPWARDS BELOW; D; BEH
0753; BEH WITH 3 DOTS POINTING UPWARDS BELOW AND 2 DOTS ABOVE; D; BEH
0754; BEH WITH 2 DOTS BELOW AND DOT ABOVE; D; BEH
0755; BEH WITH INVERTED SMALL V BELOW; D; BEH
0756; BEH WITH SMALL V; D; BEH
0757; HAH WITH 2 DOTS ABOVE; D; HAH
0758; HAH WITH 3 DOTS POINTING UPWARDS BELOW; D; HAH
0759; DAL WITH 2 DOTS VERTICALLY BELOW AND SMALL TAH; R; DAL
075A; DAL WITH INVERTED SMALL V BELOW; R; DAL
075B; REH WITH STROKE; R; REH
075C; SEEN WITH 4 DOTS ABOVE; D; SEEN
075D; AIN WITH 2 DOTS ABOVE; D; AIN
075E; AIN WITH 3 DOTS POINTING DOWNWARDS ABOVE; D; AIN
075F; AIN WITH 2 DOTS VERTICALLY ABOVE; D; AIN
0760; FEH WITH 2 DOTS BELOW; D; FEH
0761; FEH WITH 3 DOTS POINTING UPWARDS BELOW; D; FEH
0762; KEHEH WITH DOT ABOVE; D; GAF
0763; KEHEH WITH 3 DOTS ABOVE; D; GAF
0764; KEHEH WITH 3 DOTS POINTING UPWARDS BELOW; D; GAF
0765; MEEM WITH DOT ABOVE; D; MEEM
0766; MEEM WITH DOT BELOW; D; MEEM
0767; NOON WITH 2 DOTS BELOW; D; NOON
0768; NOON WITH SMALL TAH; D; NOON
0769; NOON WITH SMALL V; D; NOON
076A; LAM WITH BAR; D; LAM
076B; REH WITH 2 DOTS VERTICALLY ABOVE; R; REH
076C; REH WITH HAMZA ABOVE; R; REH
076D; SEEN WITH 2 DOTS VERTICALLY ABOVE; D; SEEN
076E; HAH WITH SMALL TAH BELOW; D; HAH
076F; HAH WITH SMALL TAH AND 2 DOTS; D; HAH
0770; SEEN WITH SMALL TAH AND 2 DOTS; D; SEEN
0771; REH WITH SMALL TAH AND 2 DOTS; R; REH
0772; HAH WITH SMALL TAH ABOVE; D; HAH
0773; ALEF WITH DIGIT TWO ABOVE; R; ALEF
0774; ALEF WITH DIGIT THREE ABOVE; R; ALEF
0775; DOTLESS YEH WITH DIGIT TWO ABOVE; D; YEH
0776; DOTLESS YEH WITH DIGIT THREE ABOVE; D; YEH
0777; DOTLESS YEH WITH DIGIT FOUR BELOW; D; YEH
0778; WAW WITH DIGIT TWO ABOVE; R; WAW
0779; WAW WITH DIGIT THREE ABOVE; R; WAW
077A; YEH BARREE WITH DIGIT TWO ABOVE; D; BURUSHASKI YEH BARREE
077B; YEH BARREE WITH DIGIT THREE ABOVE; D; BURUSHASKI YEH BARREE
077C; HAH WITH DIGIT FOUR BELOW; D; HAH
077D; SEEN WITH DIGIT FOUR ABOVE; D; SEEN
077E; SEEN WITH INVERTED V; D; SEEN
077F; KAF WITH 2 DOTS ABOVE; D; KAF

# N'Ko Characters

07CA; NKO A; D; No_Joining_Group
07CB; NKO EE; D; No_Joining_Group
07CC; NKO I; D; No_Joining_Group
07CD; NKO E; D; No_Joining_Group
07CE; NKO U; D; No_Joining_Group
07CF; NKO OO; D; No_Joining_Group
07D0; NKO O; D; No_Joining_Group
07D1; NKO DAGBASINNA; D; No_Joining_Group
07D2; NKO N; D; No_Joining_Group
07D3; NKO BA; D; No_Joining_Group
07D4; NKO PA; D; No_Joining_Group
07D5; NKO TA; D; No_Joining_Group
07D6; NKO JA; D; No_Joining_Group
07D7; NKO CHA; D; No_Joining_Group
07D8; NKO DA; D; No_Joining_Group
07D9; NKO RA; D; No_Joining_Group
07DA; NKO RRA; D; No_Joining_Group
07DB; NKO SA; D; No_Joining_Group
07DC; NKO GBA; D; No_Joining_Group
07DD; NKO FA; D; No_Joining_Group
07DE; NKO KA; D; No_Joining_Group
07DF; NKO LA; D; No_Joining_Group
07E0; NKO NA WOLOSO; D; No_Joining_Group
07E1; NKO MA; D; No_Joining_Group
07E2; NKO NYA; D; No_Joining_Group
07E3; NKO NA; D; No_Joining_Group
07E4; NKO HA; D; No_Joining_Group
07E5; NKO WA; D; No_Joining_Group
07E6; NKO YA; D; No_Joining_Group
07E7; NKO NYA WOLOSO; D; No_Joining_Group
07E8; NKO JONA JA; D; No_Joining_Group
07E9; NKO JONA CHA; D; No_Joining_Group
07EA; NKO JONA RA; D; No_Joining_Group
07FA; NKO LAJANYALAN; C; No_Joining_Group

# Other

200D; ZERO WIDTH JOINER; C; No_Joining_Group
200C; ZERO WIDTH NON-JOINER; U; No_Joining_Group

# EOF