Commit | Line | Data |
---|---|---|
ae6aa562 JH |
1 | Revision history for Perl module Unicode::Collate. |
2 | ||
211cc501 | 3 | 0.70 Sun Jan 16 20:31:07 2011 |
a509bac8 | 4 | - Now U::C::Locale->new will use the compiled DUCET via XS. |
211cc501 CBW |
5 | |
6 | 0.69 Sat Jan 15 19:41:11 2011 | |
7 | - clarified about XSUB. revised INSTALL in README. | |
8 | - xs: flag passed to utf8n_to_uvuni(). | |
9 | - doc and comments: [perl #81876] Fix typos by Peter J. Acklam. | |
10 | ||
68adb2b0 CBW |
11 | 0.68 Tue Nov 23 20:17:22 2010 |
12 | - doc: clarified about (backwards => [ ]) and (backwards => undef). | |
13 | - separated t/backwds.t from t/test.t. | |
14 | - added cjk_b5.t, cjk_gb.t, cjk_ja.t, cjk_ko.t, cjk_py.t, cjk_st.t in t | |
15 | for CJK/*.pm without Locale.pm. | |
16 | ||
b5d9a953 CBW |
17 | 0.67 Sun Nov 14 11:38:59 2010 |
18 | - supported UCA_Version 22 for Unicode 6.0.0. | |
19 | * 2B740..2B81D are new CJK unified ideographs. | |
20 | * noncharacters (e.g. U+FFFF) should be overridable, not be ignored. | |
21 | ! DUCET is NOT updated, as no maint perl supports Unicode 6.0.0. | |
22 | Thus the default UCA_Version is still 20. | |
23 | - added t/nonchar.t. | |
24 | - improved discontiguous contractions of 3 or more characters. | |
25 | (e.g. 0FB2 0F71 0F80 and 0FB3 0F71 0F80) | |
26 | - auxiliary: now 'mklocale' also copes with Korean.pm according to DUCET. | |
27 | ||
584e761d CBW |
28 | 0.66 Sun Nov 7 10:47:30 2010 |
29 | - U::C::Locale newly supports locale: ko. | |
30 | - added Unicode::Collate::CJK::Korean for ko. | |
31 | - added t/loc_ko.t. | |
32 | - 12 compat. ideographs (e.g. U+FA0E) are treated as unified ideographs. | |
33 | (though DUCET also does it, now Unicode::Collate does it without DUCET.) | |
34 | - added t/compatui.t. | |
211cc501 | 35 | ! Ideographs Ext.B (U+20000..U+2A6D6) can be overridden with UCA_Version 8. |
584e761d CBW |
36 | This is a long-standing behavior from Unicode::Collate 0.11 to 0.63. |
37 | A wrong fix at 0.64 should be abandoned. | |
38 | ||
028d3bfa CBW |
39 | 0.65 Wed Nov 3 13:10:20 2010 |
40 | - U::C::Locale newly supports locale: zh and some of its variants. | |
584e761d | 41 | (zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke) |
028d3bfa CBW |
42 | - added Unicode::Collate::CJK::Big5 for zh__big5han. |
43 | - added Unicode::Collate::CJK::GB2312 for zh__gb2312han. | |
44 | - added Unicode::Collate::CJK::Pinyin for zh__pinyin. | |
45 | - added Unicode::Collate::CJK::Stroke for zh__stroke. | |
584e761d | 46 | - added loc_zh.t, loc_zhb5.t, loc_zhgb.t, loc_zhpy.t, loc_zhst.t in t. |
028d3bfa | 47 | |
539ce3d8 CBW |
48 | 0.64 Sun Oct 31 14:17:29 2010 |
49 | - U::C::Locale newly supports locale: ja. | |
50 | - added Unicode::Collate::CJK::JISX0208 for ja. | |
584e761d | 51 | - added loc_ja.t, loc_jait.t, loc_japr.t in t. |
539ce3d8 CBW |
52 | - a subroutine specified in 'overrideCJK' or 'overrideHangul' is allowed |
53 | to return an integer or undef value. | |
584e761d CBW |
54 | - fix: Ideographs Ext.B (U+20000..U+2A6D6) are assigned in Unicode 3.1, |
55 | then 'overrideCJK' should not override them with UCA_Version 8. | |
56 | !! sorry, this fix is based on a wrong idea. reverted at 0.66. !! | |
57 | - separated t/overcjk0.t and t/overcjk1.t from t/override.t. | |
539ce3d8 | 58 | |
aa7758f7 CBW |
59 | 0.63 Sun Oct 10 22:13:21 2010 |
60 | - supported suppress contractions (see 'suppress' in POD). | |
028d3bfa | 61 | - internal for 'hangul_terminator' in getSortKey(). |
aa7758f7 | 62 | - U::C::Locale newly supports locales: be, bg, kk, mk, ru, sr. |
584e761d CBW |
63 | - added loc_be.t, loc_bg.t, loc_cyrl.t, loc_kk.t, loc_mk.t, loc_ru.t, |
64 | loc_sr.t in t. | |
aa7758f7 CBW |
65 | - added tailoring with U+0340 or U+0341 instead of U+0300 or U+0301. |
66 | (affected locales: hr, is, pl, se, to, wo) | |
67 | ||
6709de88 CBW |
68 | 0.62 Wed Oct 6 21:35:54 2010 |
69 | - U::C::Locale newly supports locales: ar, hu, hy, se, to, uk. | |
584e761d | 70 | - added loc_ar.t, loc_hu.t, loc_hy.t, loc_se.t, loc_to.t, loc_uk.t in t. |
6709de88 CBW |
71 | - Vietnamese (vi): added tailoring for U+0340 and U+0341. |
72 | ||
c02ee425 CBW |
73 | 0.61 Sat Oct 2 11:41:29 2010 |
74 | - U::C::Locale newly supports locales: hr, ig, sq. | |
584e761d | 75 | - added loc_hr.t, loc_ig.t, loc_sq.t in t. |
c02ee425 CBW |
76 | - precomposites of e-dot-below, o-dot-below, o-tilde are tailored as well. |
77 | (affected locales: et, yo) | |
78 | - Vietnamese (vi): added contractions for non-blocked decompositions | |
aa7758f7 | 79 | * base + dot-below + mark such as a\x{323}\x{306}, \x{1EA1}\x{306} etc. |
6709de88 | 80 | * base + tone + horn such as o\x{309}\x{31B}, \x{1ECF}\x{31B} etc. |
c02ee425 | 81 | |
1393fe00 CBW |
82 | 0.60 Thu Sep 23 21:37:36 2010 |
83 | - bug fix: index() [and its friends including gmatch()] didn't remove | |
84 | ignorable characters in the substring correctly. | |
85 | Thanks for the bug report: | |
aa7758f7 | 86 | http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2010-09/msg00014.html |
1393fe00 CBW |
87 | |
88 | - U::C::Locale newly supports locales: de__phonebook, nso, om, tn, vi. | |
584e761d | 89 | - added loc_de.t, loc_deph.t, loc_nso.t, loc_om.t, loc_tn.t, loc_vi.t in t. |
1393fe00 CBW |
90 | - precomposites of a-breve, a-circ, e-circ, o-circ are tailored as well. |
91 | (affected locales: ro, sk, sv) | |
92 | ||
f1a7422f CBW |
93 | 0.59 Sun Sep 5 17:03:52 2010 |
94 | - U::C::Locale newly supports locales: az, fil, ha, lt, mt, tr, wo, yo. | |
584e761d CBW |
95 | - added loc_az.t, loc_fil.t, loc_ha.t, loc_lt.t, loc_mt.t, loc_tr.t, |
96 | loc_wo.t, loc_yo.t in t. | |
f1a7422f CBW |
97 | - precomposites of a-uml, o-uml, and u-uml are tailored as well. |
98 | (affected locales: da, et, fi, fo, is, kl, nb, nn, sk, sv) | |
99 | ||
6484f676 CBW |
100 | 0.58 Sun Aug 29 19:56:50 2010 |
101 | - U::C::Locale newly supports locales: af, cy, da, fo, haw, is, kl, sw. | |
584e761d CBW |
102 | - added loc_af.t, loc_cy.t, loc_da.t, loc_fo.t, loc_haw.t, loc_is.t, |
103 | loc_kl.t, loc_sw.t in t. | |
6484f676 | 104 | |
64dc7822 | 105 | 0.57 Sun Aug 22 22:39:58 2010 |
6484f676 | 106 | - U::C::Locale newly supports locales: ca, et, fi, lv, sk, sl. |
584e761d | 107 | - added loc_ca.t, loc_et.t, loc_fi.t, loc_lv.t, loc_sk.t, loc_sl.t in t. |
64dc7822 | 108 | |
456a1446 CBW |
109 | 0.56 Sun Aug 8 20:24:03 2010 |
110 | - Unicode::Collate::Locale newly supports locales: eo, nb, ro, sv. | |
584e761d | 111 | - added loc_eo.t, loc_es.t, loc_estr.t, loc_nb.t, loc_ro.t, loc_sv.t in t. |
456a1446 | 112 | ! renamed t/locale_{xy}.t to t/loc_{xy}.t (for safer 8.3 names) |
584e761d | 113 | (loc_cs.t, loc_fr.t, loc_nn.t, loc_pl.t, loc_test.t) |
456a1446 | 114 | |
00e00351 | 115 | 0.55 Sun Aug 1 21:21:23 2010 |
aa7758f7 CBW |
116 | - incorporated Unicode::Collate::Locale with some changes. see: |
117 | http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2004-03/msg00030.html | |
456a1446 | 118 | - supported locales: cs, es, es__traditional, fr, nn, pl. |
00e00351 | 119 | ! added t/locale*.t that uses DUCET. |
584e761d | 120 | (locale_cs.t, locale_fr.t, locale_nn.t, locale_pl.t, locale_test.t) |
b5d9a953 | 121 | - data/*.txt and mklocale for preparation of Locale/*.pl from DUCET. |
00e00351 CBW |
122 | |
123 | 0.54 Sun Jul 25 21:37:04 2010 | |
124 | - Now UCA Revision 20 (based on Unicode 5.2.0). | |
125 | - DUCET is also updated (for Unicode 5.2.0) as Collate/allkeys.txt, | |
126 | which *is required* to test this module. | |
127 | ! Please note that allkeys.txt will be overwritten if you already have | |
128 | another allkeys.txt. | |
b5d9a953 | 129 | - U+9FC4..U+9FCB and U+2A700..U+2B734 are new CJK unified ideographs. |
00e00351 CBW |
130 | - Many hangul jamo are assigned (affecting hangul_terminator). |
131 | ||
211cc501 CBW |
132 | ! Now XSUB will be built by default. (XSUB needs a C compiler.) |
133 | To build pure perl, run disableXS before Makefile.PL. | |
00e00351 CBW |
134 | ! DUCET will be compiled when XS is used. Explicitly saying |
135 | <table => 'allkeys.txt'> (or using another table) will prevent | |
1393fe00 | 136 | this module from using the compiled DUCET. |
00e00351 CBW |
137 | |
138 | ! added t/default.t that uses DUCET. | |
139 | ||
74b94a79 CBW |
140 | 0.53 Sun Feb 14 20:46:27 2010 |
141 | - Now UCA Revision 18 (based on Unicode 5.1.0). | |
00e00351 | 142 | - DUCET is also updated (for Unicode 5.1.0) as Collate/allkeys.txt, |
74b94a79 CBW |
143 | which is not required to test this module. |
144 | ! Please note that allkeys.txt will be overwritten if you already have | |
145 | another allkeys.txt. | |
b5d9a953 | 146 | - U+9FBC..U+9FC3 are new CJK unified ideographs. |
74b94a79 | 147 | |
6d24ed10 SP |
148 | 0.52 Thu Oct 13 21:51:09 2005 |
149 | - The Unicode::Collate->new method does not destroy user's $_ any longer. | |
150 | (thanks to Jon Warbrick for bug report) | |
151 | ||
0d50d293 RGS |
152 | 0.51 Sun May 29 20:21:19 2005 |
153 | - Added the latest DUCET (for Unicode 4.1.0) as Collate/allkeys.txt, | |
154 | which is not required to test this module. | |
74b94a79 | 155 | ! Please note that allkeys.txt will be overwritten if you already have |
0d50d293 RGS |
156 | another allkeys.txt. |
157 | - Added INSTALL section in POD. | |
158 | ||
3756e7ca RGS |
159 | 0.50 Sun May 8 20:26:39 2005 |
160 | - Now UCA Revision 14 (based on Unicode 4.1.0). | |
161 | - Some tests are modified. | |
584e761d | 162 | - Added cjkrange.t, ignor.t, override.t in t. |
3756e7ca RGS |
163 | - Added META.yml. |
164 | ||
165 | 0.40 Sat Apr 24 06:54:40 2004 | |
166 | - Now a table file is searched in @INC. | |
167 | ||
abd1ec54 NC |
168 | 0.33 Sat Dec 13 14:07:27 2003 |
169 | - documentation improvement: in "entry", "overrideHangul", etc. | |
170 | ||
171 | 0.32 Wed Dec 3 23:38:18 2003 | |
172 | - A matching part from index(), match() etc. will include illegal | |
173 | code points (as well as ignorable characters) following a grapheme. | |
174 | - Contraction with illegal code point will be invalid. | |
584e761d CBW |
175 | - Added t/view.t. |
176 | - Added some tests in t/illegal.t. | |
177 | - Separated t/altern.t and t/rearrang.t from t/test.t. | |
abd1ec54 NC |
178 | - modified XSUB internals. |
179 | ||
10d7ec48 NC |
180 | 0.31 Sun Nov 16 15:40:15 2003 |
181 | - Illegal code points (surrogate and noncharacter; they are definitely | |
182 | ignorable) will be distinguished from NULL ("\0"); | |
183 | but porting is not successful in the case of ((Pure Perl) and | |
184 | (Perl 5.7.3 or before)). If perl 5.6.X is used, XSUB may help it | |
185 | in place of broken CORE::unpack('U*') in older perl. | |
584e761d | 186 | - added illegal.t and illegalp.t in t. |
211cc501 CBW |
187 | - added XSUB where some functions are implemented in XSUB. |
188 | Pure Perl is also supported. | |
10d7ec48 | 189 | |
91ae00cb | 190 | 0.30 Mon Oct 13 21:26:37 2003 |
211cc501 | 191 | - fix: Completely ignorable in table should be able to be overridden |
91ae00cb NC |
192 | by non-ignorable in entry. |
193 | - fix: Maximum length for contraction must not be shortened | |
10d7ec48 | 194 | by a shorter contraction following in table and/or entry. |
584e761d | 195 | - added t/normal.t. |
91ae00cb NC |
196 | - some doc fixes |
197 | ||
198 | 0.29 Mon Oct 13 12:18:23 2003 | |
abd1ec54 | 199 | - now UCA Version 11 (but no functionality is different from Version 9). |
91ae00cb NC |
200 | - supported hangul_terminator. |
201 | - fix: Base_Unicode_Version falsely returns Perl's Unicode version. | |
202 | C4 in UTS #10 requires UTS's Unicode version. | |
203 | - For variable weighting, 'variable' is recommended | |
204 | and 'alternate' is deprecated. | |
205 | - added version() method. | |
584e761d | 206 | - added hangtype.t, trailwt.t, variable.t, and version.t in t. |
91ae00cb | 207 | |
06c8fc8f RGS |
208 | 0.28 Sat Sep 06 20:16:01 2003 |
209 | - Fixed another inconsistency under (normalization => undef): | |
210 | Non-contiguous contraction is always neglected. | |
211 | - Fixed: according to S2.1 in UTS #10, a blocked combining character | |
584e761d CBW |
212 | should not be contracted. One test in t/test.t was wrong and has been removed. |
213 | - Added t/contract.t. | |
06c8fc8f RGS |
214 | - (normalization => "prenormalized") is able to be used. |
215 | ||
1d2654e1 JH |
216 | 0.27 Sun Aug 31 22:23:17 2003 |
217 | some improvements: | |
06c8fc8f | 218 | - The maximum length of contracted CE was not checked (v0.22 to v0.26). |
1d2654e1 JH |
219 | Collation of a large string including a first letter of a contraction |
220 | that is not a part of that contraction (say, 'c' of 'ca' | |
221 | where 'ch' is defined) was too slow and inefficient. | |
91ae00cb NC |
222 | - A form name for 'normalization', no longer restricted to |
223 | /^(?:NF)?K?[CD]\z/, will be allowed as long as | |
224 | Unicode::Normalize::normalize() accepts it, since Unicode::Normalize | |
225 | or UAX #15 may be changed/enhanced in future. | |
1d2654e1 JH |
226 | - When Hangul syllables are decomposed under <normalization => undef>, |
227 | contraction among jamo (LV, VT, LVT) derived from the same | |
584e761d CBW |
228 | Hangul syllable is allowed. |
229 | - Added t/hangul.t. | |
1d2654e1 | 230 | |
4c843366 JH |
231 | 0.26 Sun Aug 03 22:23:17 2003 |
232 | - fix: an expansion in which a CE is level 3 ignorable and others are not | |
1d2654e1 | 233 | was wrongly made level 3 ignorable as a whole entry. |
4c843366 JH |
234 | (In DUCET, some precomposites in Musical Symbols are so) |
235 | ||
ae6aa562 JH |
236 | 0.25 Mon Jun 06 23:20:17 2003 |
237 | - fix Makefile.PL. | |
238 | - internal tweak (again): pack_U() and unpack_U(). | |
45394607 | 239 | |
9f1f04a1 RGS |
240 | 0.24 Thu Apr 02 23:12:54 2003 |
241 | - internal tweak for (?un)pack 'U'. | |
242 | ||
4d36a948 TS |
243 | 0.23 Wed Sep 04 19:25:20 2002 |
244 | - fix: scalar match() no longer returns an lvalue substr ref. | |
245 | - fix: "Ignorable after variable" should be made level 3 ignorable | |
246 | even if alternate => 'blanked'. | |
247 | - Now a grapheme may contain trailing level 2, level 3, | |
248 | and completely ignorable characters. | |
249 | ||
250 | 0.22 Mon Sep 02 23:15:14 2002 | |
584e761d CBW |
251 | - New File: t/index.t. |
252 | (The new t/test.t excludes tests for index.) | |
4d36a948 TS |
253 | - tweak on index(). POSITION is supported. |
254 | - add match, gmatch, subst, gsubst methods. | |
255 | - fix: ignorable after variable in 'shift'-variable weight. | |
256 | ||
caffd4cf TS |
257 | 0.21 Sat Aug 03 10:24:00 2002 |
258 | - upgrade keys.txt and t/test.t for UCA Version 9. | |
259 | ||
0116f5dc JH |
260 | 0.20 Fri Jul 26 02:15:25 2002 |
261 | - now UCA Version 9. | |
262 | - U+FDD0..U+FDEF are new non-characters. | |
263 | - fix: whitespace characters before @backwards etc. in a table file. | |
264 | - now values for 'alternate', 'backwards', etc., | |
265 | which are explicitly specified via new(), | |
266 | are preferred to those specified in a table file. | |
267 | ||
327745dc TS |
268 | 0.12 Sun May 05 09:43:10 2002 |
269 | - add new methods, ->UCA_Version and ->Base_Unicode_Version. | |
270 | - test fix: removed the needless requirement of Unicode::Normalize. | |
271 | [reported by David Hand] | |
272 | ||
809c7673 TS |
273 | 0.11 Fri May 03 02:28:10 2002 |
274 | - fix: now derived collation elements can be used for Hangul Jamo | |
275 | when their weights are not defined. | |
327745dc | 276 | [reported by Andreas J. Koenig] |
809c7673 TS |
277 | - fix: rearrangements had not worked. |
278 | - mentioned problem on index() in BUGS. | |
279 | - more documents, more tests. | |
280 | - tag names for 'alternate' are case-insensitive (i.e. 'SHIFTed' etc.). | |
281 | - The <undef> value for the keys "overrideCJK", "overrideHangul", | |
282 | "rearrange" has a special behavior (different from default). | |
283 | ||
905aa9f0 TS |
284 | 0.10 Tue Dec 11 23:26:42 2001 |
285 | - now you are allowed to use no table file. | |
286 | - fix: fetching CE with two or more combining characters. | |
287 | ||
5398038e TS |
288 | 0.09 Sun Nov 11 17:02:40:18 2001 |
289 | - add the following methods: eq, ne, lt, le, gt, ge. | |
290 | - relies on &Unicode::Normalize::getCombinClass() | |
291 | in place of %Unicode::Normalize::Combin | |
292 | (the hash is not defined in the XS version of Unicode::Normalize). | |
293 | then you should install Unicode::Normalize 0.10 or later. | |
294 | - now independent of Lingua::KO::Hangul::Util | |
295 | (this module does decomposition of Hangul syllables for itself) | |
296 | ||
d16e9e3d JH |
297 | 0.08 Mon Aug 20 22:40:18 2001 |
298 | - add the index method. | |
299 | ||
45394607 JH |
300 | 0.07 Thu Aug 16 23:42:02 2001 |
301 | - rename the module to Unicode::Collate. | |
302 | ||
303 | 0.06 Thu Aug 16 23:18:36 2001 | |
304 | - add description of the getSortKey method. | |
305 | ||
306 | 0.05 Mon Aug 13 22:23:11 2001 | |
307 | - bug fix: on the things of 4.2.1, UTR #10 | |
308 | - getSortKey returns a string, but not an arrayref. | |
309 | ||
310 | 0.04 Mon Aug 13 22:23:11 2001 | |
311 | - some bugs are fixed. | |
312 | - some tailoring parameters are added. | |
313 | ||
314 | 0.03 Mon Aug 06 06:26:35 2001 | |
315 | - modify README | |
316 | ||
317 | 0.02 Sun Aug 05 20:20:01 2001 | |
318 | - some fixes | |
319 | ||
320 | 0.01 Sun Jul 29 16:16:15 2001 | |
321 | - original version; created by h2xs 1.21 | |
322 | with options -A -X -n Sort::UCA |