Revision history for Perl module Unicode::Collate.

0.66 Sun Nov 7 10:47:30 2010
  - U::C::Locale newly supports locale: ko.
  - added Unicode::Collate::CJK::Korean for ko.
  - added t/loc_ko.t.
  - 12 compat. ideographs (e.g. U+FA0E) are treated as unified ideographs.
    (though DUCET also does it, now Unicode::Collate does it without DUCET.)
  - added t/compatui.t.
  ! Ideographs Ext.B (U+20000..U+2A6D6) can be overridden with UCA_Version 8.
    This is a long-standing behavior from Unicode::Collate 0.11 to 0.63.
    A wrong fix at 0.64 should be abandoned.

0.65 Wed Nov 3 13:10:20 2010
  - U::C::Locale newly supports locale: zh and some of its variants.
    (zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke)
  - added Unicode::Collate::CJK::Big5 for zh__big5han.
  - added Unicode::Collate::CJK::GB2312 for zh__gb2312han.
  - added Unicode::Collate::CJK::Pinyin for zh__pinyin.
  - added Unicode::Collate::CJK::Stroke for zh__stroke.
  - added loc_zh.t, loc_zhb5.t, loc_zhgb.t, loc_zhpy.t, loc_zhst.t in t.

0.64 Sun Oct 31 14:17:29 2010
  - U::C::Locale newly supports locale: ja.
  - added Unicode::Collate::CJK::JISX0208 for ja.
  - added loc_ja.t, loc_jait.t, loc_japr.t in t.
  - a subroutine specified in 'overrideCJK' or 'overrideHangul' is allowed
    to return an integer or undef value.
  - fix: Ideographs Ext.B (U+20000..U+2A6D6) are assigned in Unicode 3.1,
    then 'overrideCJK' should not override them with UCA_Version 8.
    !! sorry, this fix is based on a wrong idea. reverted at 0.66. !!
  - separated t/overcjk0.t and t/overcjk1.t from t/override.t.

0.63 Sun Oct 10 22:13:21 2010
  - supported suppress contractions (see 'suppress' in POD).
  - internal for 'hangul_terminator' in getSortKey().
  - U::C::Locale newly supports locales: be, bg, kk, mk, ru, sr.
  - added loc_be.t, loc_bg.t, loc_cyrl.t, loc_kk.t, loc_mk.t, loc_ru.t,
    loc_sr.t in t.
  - added tailoring with U+0340 or U+0341 instead of U+0300 or U+0301.
    (affected locales: hr, is, pl, se, to, wo)

0.62 Wed Oct 6 21:35:54 2010
  - U::C::Locale newly supports locales: ar, hu, hy, se, to, uk.
  - added loc_ar.t, loc_hu.t, loc_hy.t, loc_se.t, loc_to.t, loc_uk.t in t.
  - Vietnamese (vi): added tailoring for U+0340 and U+0341.

0.61 Sat Oct 2 11:41:29 2010
  - U::C::Locale newly supports locales: hr, ig, sq.
  - added loc_hr.t, loc_ig.t, loc_sq.t in t.
  - precomposites of e-dot-below, o-dot-below, o-tilde are tailored as well.
    (affected locales: et, yo)
  - Vietnamese (vi): added contractions for non-blocked decompositions
    * base + dot-below + mark such as a\x{323}\x{306}, \x{1EA1}\x{306} etc.
    * base + tone + horn such as o\x{309}\x{31B}, \x{1ECF}\x{31B} etc.

0.60 Thu Sep 23 21:37:36 2010
  - bug fix: index() [and its friends including gmatch()] didn't remove
    ignorable characters in the substring correctly.
    Thanks for the bug report:
    http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2010-09/msg00014.html

  - U::C::Locale newly supports locales: de__phonebook, nso, om, tn, vi.
  - added loc_de.t, loc_deph.t, loc_nso.t, loc_om.t, loc_tn.t, loc_vi.t in t.
  - precomposites of a-breve, a-circ, e-circ, o-circ are tailored as well.
    (affected locales: ro, sk, sv)

0.59 Sun Sep 5 17:03:52 2010
  - U::C::Locale newly supports locales: az, fil, ha, lt, mt, tr, wo, yo.
  - added loc_az.t, loc_fil.t, loc_ha.t, loc_lt.t, loc_mt.t, loc_tr.t,
    loc_wo.t, loc_yo.t in t.
  - precomposites of a-uml, o-uml, and u-uml are tailored as well.
    (affected locales: da, et, fi, fo, is, kl, nb, nn, sk, sv)

0.58 Sun Aug 29 19:56:50 2010
  - U::C::Locale newly supports locales: af, cy, da, fo, haw, is, kl, sw.
  - added loc_af.t, loc_cy.t, loc_da.t, loc_fo.t, loc_haw.t, loc_is.t,
    loc_kl.t, loc_sw.t in t.

0.57 Sun Aug 22 22:39:58 2010
  - U::C::Locale newly supports locales: ca, et, fi, lv, sk, sl.
  - added loc_ca.t, loc_et.t, loc_fi.t, loc_lv.t, loc_sk.t, loc_sl.t in t.

0.56 Sun Aug 8 20:24:03 2010
  - Unicode::Collate::Locale newly supports locales: eo, nb, ro, sv.
  - added loc_eo.t, loc_es.t, loc_estr.t, loc_nb.t, loc_ro.t, loc_sv.t in t.
  ! renamed t/locale_{xy}.t to t/loc_{xy}.t (for safer 8.3 names)
    (loc_cs.t, loc_fr.t, loc_nn.t, loc_pl.t, loc_test.t)

0.55 Sun Aug 1 21:21:23 2010
  - incorporated Unicode::Collate::Locale with some changes. see:
    http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2004-03/msg00030.html
  - supported locales: cs, es, es__traditional, fr, nn, pl.
  ! added t/locale*.t that uses DUCET.
    (locale_cs.t, locale_fr.t, locale_nn.t, locale_pl.t, locale_test.t)

0.54 Sun Jul 25 21:37:04 2010
  - Now UCA Revision 20 (based on Unicode 5.2.0).
  - DUCET is also updated (for Unicode 5.2.0) as Collate/allkeys.txt,
    which *is required* to test this module.
  ! Please notice that allkeys.txt will be overwritten if you have had
    other allkeys.txt already.
  - U+9FC4..U+9FCB and U+2A700..U+2B734 are new CJK Unified Ideographs.
  - Many hangul jamo are assigned (affecting hangul_terminator).

  ! DUCET will be compiled when XS is used. Explicitly saying
    <table => 'allkeys.txt'> (or using another table) will prevent
    this module from using the compiled DUCET.

  ! added t/default.t that uses DUCET.

0.53 Sun Feb 14 20:46:27 2010
  - Now UCA Revision 18 (based on Unicode 5.1.0).
  - DUCET is also updated (for Unicode 5.1.0) as Collate/allkeys.txt,
    which is not required to test this module.
  ! Please notice that allkeys.txt will be overwritten if you have had
    other allkeys.txt already.
  - U+9FBC..U+9FC3 are new CJK Unified Ideographs.

0.52 Thu Oct 13 21:51:09 2005
  - The Unicode::Collate->new method does not destroy user's $_ any longer.
    (thanks to Jon Warbrick for bug report)

0.51 Sun May 29 20:21:19 2005
  - Added the latest DUCET (for Unicode 4.1.0) as Collate/allkeys.txt,
    which is not required to test this module.
  ! Please notice that allkeys.txt will be overwritten if you have had
    other allkeys.txt already.
  - Added INSTALL section in POD.

0.50 Sun May 8 20:26:39 2005
  - Now UCA Revision 14 (based on Unicode 4.1.0).
  - Some tests are modified.
  - Added cjkrange.t, ignor.t, override.t in t.
  - Added META.yml.

0.40 Sat Apr 24 06:54:40 2004
  - Now a table file is searched in @INC.

0.33 Sat Dec 13 14:07:27 2003
  - documentation improvement: in "entry", "overrideHangul", etc.

0.32 Wed Dec 3 23:38:18 2003
  - A matching part from index(), match() etc. will include illegal
    code points (as well as ignorable characters) following a grapheme.
  - Contraction with illegal code point will be invalid.
  - Added t/view.t.
  - Added some tests in t/illegal.t.
  - Separated t/altern.t and t/rearrang.t from t/test.t.
  - modified XSUB internals.

0.31 Sun Nov 16 15:40:15 2003
  - Illegal code points (surrogate and noncharacter; they are definitely
    ignorable) will be distinguished from NULL ("\0");
    but porting is not successful in the case of ((Pure Perl) and
    (Perl 5.7.3 or before)). If perl 5.6.X is used, XSUB may help it
    in place of broken CORE::unpack('U*') in older perl.
  - added illegal.t and illegalp.t in t.
  - added XSUB (EXPERIMENTAL!) where some functions are implemented
    in XSUB. Pure Perl is also supported.

0.30 Mon Oct 13 21:26:37 2003
  - fix: Completely ignorable in table should be able to be overridden
    by non-ignorable in entry.
  - fix: Maximum length for contraction must not be shortened
    by a shorter contraction following in table and/or entry.
  - added t/normal.t.
  - some doc fixes

0.29 Mon Oct 13 12:18:23 2003
  - now UCA Version 11 (but no functionality is different from Version 9).
  - supported hangul_terminator.
  - fix: Base_Unicode_Version falsely returns Perl's Unicode version.
    C4 in UTS #10 requires UTS's Unicode version.
  - For variable weighting, 'variable' is recommended
    and 'alternate' is deprecated.
  - added version() method.
  - added hangtype.t, trailwt.t, variable.t, and version.t in t.

0.28 Sat Sep 06 20:16:01 2003
  - Fixed another inconsistency under (normalization => undef):
    Non-contiguous contraction is always neglected.
  - Fixed: according to S2.1 in UTS #10, a blocked combining character
    should not be contracted. One test in t/test.t was wrong, then removed.
  - Added t/contract.t.
  - (normalization => "prenormalized") is able to be used.

0.27 Sun Aug 31 22:23:17 2003
  some improvements:
  - The maximum length of contracted CE was not checked (v0.22 to v0.26).
    Collation of a large string including a first letter of a contraction
    that is not a part of that contraction (say, 'c' of 'ca'
    where 'ch' is defined) was too slow, inefficient.
  - A form name for 'normalization', no longer restricted to
    /^(?:NF)?K?[CD]\z/, will be allowed as long as
    Unicode::Normalize::normalize() accepts it, since Unicode::Normalize
    or UAX #15 may be changed/enhanced in future.
  - When Hangul syllables are decomposed under <normalization => undef>,
    contraction among jamo (LV, VT, LVT) derived from the same
    Hangul syllable is allowed.
  - Added t/hangul.t.

0.26 Sun Aug 03 22:23:17 2003
  - fix: an expansion in which a CE is level 3 ignorable and others are not
    was wrongly made level 3 ignorable as a whole entry.
    (In DUCET, some precomposites in Musical Symbols are so)

0.25 Mon Jun 06 23:20:17 2003
  - fix Makefile.PL.
  - internal tweak (again): pack_U() and unpack_U().

0.24 Thu Apr 02 23:12:54 2003
  - internal tweak for (?un)pack 'U'.

0.23 Wed Sep 04 19:25:20 2002
  - fix: scalar match() no longer returns an lvalue substr ref.
  - fix: "Ignorable after variable" should be made level 3 ignorable
    even if alternate => 'blanked'.
  - Now a grapheme may contain trailing level 2, level 3,
    and completely ignorable characters.

0.22 Mon Sep 02 23:15:14 2002
  - New File: t/index.t.
    (The new t/test.t excludes tests for index.)
  - tweak on index(). POSITION is supported.
  - add match, gmatch, subst, gsubst methods.
  - fix: ignorable after variable in 'shift'-variable weight.

0.21 Sat Aug 03 10:24:00 2002
  - upgrade keys.txt and t/test.t for UCA Version 9.

0.20 Fri Jul 26 02:15:25 2002
  - now UCA Version 9.
  - U+FDD0..U+FDEF are new non-characters.
  - fix: whitespace characters before @backwards etc. in a table file.
  - now values for 'alternate', 'backwards', etc.,
    which are explicitly specified via new(),
    are preferred to those specified in a table file.

0.12 Sun May 05 09:43:10 2002
  - add new methods, ->UCA_Version and ->Base_Unicode_Version.
  - test fix: removed the needless requirement of Unicode::Normalize.
    [reported by David Hand]

0.11 Fri May 03 02:28:10 2002
  - fix: now derived collation elements can be used for Hangul Jamo
    when their weights are not defined.
    [reported by Andreas J. Koenig]
  - fix: rearrangements had not worked.
  - mentioned problem on index() in BUGS.
  - more documents, more tests.
  - tag names for 'alternate' are case-insensitive (i.e. 'SHIFTed' etc.).
  - The <undef> value for the keys "overrideCJK", "overrideHangul",
    "rearrange" has a special behavior (different from default).

0.10 Tue Dec 11 23:26:42 2001
  - now you are allowed to use no table file.
  - fix: fetching CE with two or more combining characters.

0.09 Sun Nov 11 17:02:40:18 2001
  - add the following methods: eq, ne, lt, le, gt, ge.
  - relies on &Unicode::Normalize::getCombinClass()
    in place of %Unicode::Normalize::Combin
    (the hash is not defined in the XS version of Unicode::Normalize).
    then you should install Unicode::Normalize 0.10 or later.
  - now independent of Lingua::KO::Hangul::Util
    (this module does decomposition of Hangul syllables for itself)

0.08 Mon Aug 20 22:40:18 2001
  - add the index method.

0.07 Thu Aug 16 23:42:02 2001
  - rename the module name to Unicode::Collate.

0.06 Thu Aug 16 23:18:36 2001
  - add description of the getSortKey method.

0.05 Mon Aug 13 22:23:11 2001
  - bug fix: on the things of 4.2.1, UTR #10
  - getSortKey returns a string, but not an arrayref.

0.04 Mon Aug 13 22:23:11 2001
  - some bugs are fixed.
  - some tailoring parameters are added.

0.03 Mon Aug 06 06:26:35 2001
  - modify README

0.02 Sun Aug 05 20:20:01 2001
  - some fix

0.01 Sun Jul 29 16:16:15 2001
  - original version; created by h2xs 1.21
    with options -A -X -n Sort::UCA