Revision history for Perl module Unicode::Collate.

0.89 Sat Mar 10 20:19:11 2012

0.88 Mon Mar 5 21:56:13 2012
    - DUCET is updated (for Unicode 6.1.0) as Collate/allkeys.txt.
    ! Please note that any existing allkeys.txt will be overwritten.
    - U+9FCC is a new CJK unified ideograph.
    - The default UCA_Version is 24.
    - Locale/*.pl (except fr.pl) and CJK/Korean.pm are updated.
    - modified tests: cjkrange.t, compatui.t, hangtype.t, loc_cjkc.t,
      loc_es.t, loc_estr.t, overcjk0.t, overcjk1.t, version.t in t.

0.87 Sat Nov 26 17:01:42 2011
    - Now Locale/*.pl files are searched in @INC. (see [rt.cpan.org #72666])
    - added locale_version method to access the version number of Locale/*.pl.

0.86 Wed Nov 23 17:16:00 2011
    - tailored compatibility ideographs as well as unified ideographs for
      the locales: ja, ko, zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke.
    - added loc_cjkc.t in t.

0.85 Sat Nov 19 20:01:57 2011
    - U::C::Locale newly supports locales: bn, sa.
    - updated some locales to CLDR 2.0 : zh__pinyin, zh__stroke.
      * supported compatibility decomposable characters and U+FDD0 indexes.
      * updated CJK/Pinyin.pm and CJK/Stroke.pm.
    - added loc_bn.t, loc_cjk.t, loc_sa.t in t.

0.84 Sun Nov 6 14:44:51 2011
    - U::C::Locale supports script codes.
    - U::C::Locale newly supports locales: fa, sr_Latn, ur.
    - added loc_fa.t, loc_srla.t, loc_ur.t in t.

0.83 Sun Oct 30 20:22:04 2011
    - mklocale: auto-generate equivalents for suppressed contractions.
      * be.txt, bg.txt, kk.txt, mk.txt, ru.txt, sr.txt, uk.txt in data
      * but no Locale/*.pl will be modified.

0.82 Sun Oct 30 10:03:48 2011
    - U::C::Locale newly supports locales: si, si__dictionary,
      sv__reformed, ta, te, th, wae.
    - added loc_si.t, loc_sidt.t, loc_svrf.t, loc_ta.t, loc_te.t,
      loc_th.t, loc_wae.t in t.
    - updated some locales to CLDR 2.0 : sk, sr, sv, uk.
    - updated CJK/Pinyin.pm according to CLDR 2.0.

0.81 Sun Oct 23 21:32:36 2011
    - U::C::Locale newly supports locales: ml, mr, or, pa.
    - added loc_ml.t, loc_mr.t, loc_or.t, loc_pa.t in t.
    - updated some locales to CLDR 2.0 : mk, mt, nb, nn, ro, ru.

0.80 Sun Oct 9 21:00:21 2011
    - U::C::Locale newly supports locales: bs, hi, kn, kok, ln.
    - added loc_bs.t, loc_hi.t, loc_kn.t, loc_kok.t, loc_ln.t in t.
    - updated some locales to CLDR 2.0 : ha, hr, kk, lt.

0.79 Sun Oct 2 20:31:01 2011
    - pod: [rt.cpan.org #70241] Fix minor grammar error in manpage
      by Harlan Lieberman-Berg.
    - 'suppress' no longer affects contractions via 'entry'.
    - U::C::Locale newly supports locales: as, fi__phonebook, gu.
    - added loc_as.t, loc_fiph.t, loc_gu.t in t.
    - updated some locales to CLDR 2.0 : ar, be, bg.

0.78 Mon Jul 25 21:29:50 2011
    - tried fixing the tarball that contained world-writable files.
      ( http://www.perlmonks.org/?node_id=731935 )

0.77 Sun Jul 3 21:15:08 2011
    - xs: [perl #93470] [PATCH] consting in Collate.xs by Robin Barker.

0.76 Sun May 15 10:06:59 2011
    - updated CJK/Pinyin.pm and CJK/Stroke.pm according to CLDR 1.9.1.
      (type='pinyin' alt='short' and type='stroke' alt='short' respectively)

0.75 Sat May 7 21:07:38 2011
    - supported ignore_level2 and rewrite.
    - added iglevel2.t and rewrite.t in t.

0.74 Mon Mar 21 19:07:38 2011
    - removed sw (Swahili) collation according to CLDR 1.9.
      (removed files: Collate/Locale/sw.pl and data/sw.txt)
    - shifted primary weights of letters > Z for some languages.
      (affected locales: da, fi, fo, kl, nb, nn, sv)

0.73 Sun Mar 6 13:24:22 2011
    - DUCET is updated (for Unicode 6.0.0) as Collate/allkeys.txt.
    ! However, no maint perl supports Unicode 6.0.0 yet;
      wait for 5.14, or try a development release (5.13.7 or later).
    ! Please note that any existing allkeys.txt will be overwritten.
    - The default UCA_Version is 22.
    - Locale/*.pl (except fr.pl and ko.pl) and CJK/Korean.pm are updated.
    - test: compare allkeys.txt's version with Base_Unicode_Version

0.72 Sat Jan 22 17:28:32 2011
    - xs: fix mixing char* and U8*.

0.71 Tue Jan 18 22:29:44 2011
    - t/loc_test.t should not fail without Unicode::Normalize.

0.70 Sun Jan 16 20:31:07 2011
    - Now U::C::Locale->new will use the compiled DUCET via XS if available.
      added some tests in t/loc_test.t.

0.69 Sat Jan 15 19:41:11 2011
    - clarified the description of XSUB. revised INSTALL in README.
    - xs: flag passed to utf8n_to_uvuni().
    - doc and comments: [perl #81876] Fix typos by Peter J. Acklam.

0.68 Tue Nov 23 20:17:22 2010
    - doc: clarified the descriptions of (backwards => [ ]) and
      (backwards => undef).
    - separated t/backwds.t from t/test.t.
    - added cjk_b5.t, cjk_gb.t, cjk_ja.t, cjk_ko.t, cjk_py.t, cjk_st.t in t
      for CJK/*.pm without Locale.pm.

0.67 Sun Nov 14 11:38:59 2010
    - supported UCA_Version 22 for Unicode 6.0.0.
      * 2B740..2B81D are new CJK unified ideographs.
      * noncharacters (e.g. U+FFFF) should be overridable, not ignored.
    ! DUCET is NOT updated, as no maint perl supports Unicode 6.0.0.
      Thus the default UCA_Version is still 20.
    - improved discontiguous contractions of 3 or more characters.
      (e.g. 0FB2 0F71 0F80 and 0FB3 0F71 0F80)
    - auxiliary: now 'mklocale' also copes with Korean.pm according to DUCET.

0.66 Sun Nov 7 10:47:30 2010
    - U::C::Locale newly supports locale: ko.
    - added Unicode::Collate::CJK::Korean for ko.
    - 12 compat. ideographs (e.g. U+FA0E) are treated as unified ideographs.
      (though DUCET also does it, now Unicode::Collate does it without DUCET.)
    - added t/compatui.t.
    ! Ideographs Ext.B (U+20000..U+2A6D6) can be overridden with UCA_Version 8.
      This is a long-standing behavior from Unicode::Collate 0.11 to 0.63.
      The wrong fix made in 0.64 has been reverted.

0.65 Wed Nov 3 13:10:20 2010
    - U::C::Locale newly supports locale: zh and some of its variants.
      (zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke)
    - added Unicode::Collate::CJK::Big5 for zh__big5han.
    - added Unicode::Collate::CJK::GB2312 for zh__gb2312han.
    - added Unicode::Collate::CJK::Pinyin for zh__pinyin.
    - added Unicode::Collate::CJK::Stroke for zh__stroke.
    - added loc_zh.t, loc_zhb5.t, loc_zhgb.t, loc_zhpy.t, loc_zhst.t in t.

0.64 Sun Oct 31 14:17:29 2010
    - U::C::Locale newly supports locale: ja.
    - added Unicode::Collate::CJK::JISX0208 for ja.
    - added loc_ja.t, loc_jait.t, loc_japr.t in t.
    - a subroutine specified in 'overrideCJK' or 'overrideHangul' is allowed
      to return an integer or undef value.
    - fix: Ideographs Ext.B (U+20000..U+2A6D6) were assigned in Unicode 3.1,
      so 'overrideCJK' should not override them with UCA_Version 8.
      !! Sorry, this fix was based on a wrong idea; it was reverted in 0.66. !!
    - separated t/overcjk0.t and t/overcjk1.t from t/override.t.

0.63 Sun Oct 10 22:13:21 2010
    - supported suppress contractions (see 'suppress' in POD).
    - internal for 'hangul_terminator' in getSortKey().
    - U::C::Locale newly supports locales: be, bg, kk, mk, ru, sr.
    - added loc_be.t, loc_bg.t, loc_cyrl.t, loc_kk.t, loc_mk.t, loc_ru.t,
    - added tailoring with U+0340 or U+0341 instead of U+0300 or U+0301.
      (affected locales: hr, is, pl, se, to, wo)

0.62 Wed Oct 6 21:35:54 2010
    - U::C::Locale newly supports locales: ar, hu, hy, se, to, uk.
    - added loc_ar.t, loc_hu.t, loc_hy.t, loc_se.t, loc_to.t, loc_uk.t in t.
    - Vietnamese (vi): added tailoring for U+0340 and U+0341.

0.61 Sat Oct 2 11:41:29 2010
    - U::C::Locale newly supports locales: hr, ig, sq.
    - added loc_hr.t, loc_ig.t, loc_sq.t in t.
    - precomposed e-dot-below, o-dot-below, o-tilde are tailored as well.
      (affected locales: et, yo)
    - Vietnamese (vi): added contractions for non-blocked decompositions
      * base + dot-below + mark such as a\x{323}\x{306}, \x{1EA1}\x{306} etc.
      * base + tone + horn such as o\x{309}\x{31B}, \x{1ECF}\x{31B} etc.

0.60 Thu Sep 23 21:37:36 2010
    - bug fix: index() [and its friends including gmatch()] didn't remove
      ignorable characters in the substring correctly.
      Thanks for the bug report:
      http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2010-09/msg00014.html
    - U::C::Locale newly supports locales: de__phonebook, nso, om, tn, vi.
    - added loc_de.t, loc_deph.t, loc_nso.t, loc_om.t, loc_tn.t, loc_vi.t in t.
    - precomposed a-breve, a-circ, e-circ, o-circ are tailored as well.
      (affected locales: ro, sk, sv)

0.59 Sun Sep 5 17:03:52 2010
    - U::C::Locale newly supports locales: az, fil, ha, lt, mt, tr, wo, yo.
    - added loc_az.t, loc_fil.t, loc_ha.t, loc_lt.t, loc_mt.t, loc_tr.t,
      loc_wo.t, loc_yo.t in t.
    - precomposed a-uml, o-uml, and u-uml are tailored as well.
      (affected locales: da, et, fi, fo, is, kl, nb, nn, sk, sv)

0.58 Sun Aug 29 19:56:50 2010
    - U::C::Locale newly supports locales: af, cy, da, fo, haw, is, kl, sw.
    - added loc_af.t, loc_cy.t, loc_da.t, loc_fo.t, loc_haw.t, loc_is.t,
      loc_kl.t, loc_sw.t in t.

0.57 Sun Aug 22 22:39:58 2010
    - U::C::Locale newly supports locales: ca, et, fi, lv, sk, sl.
    - added loc_ca.t, loc_et.t, loc_fi.t, loc_lv.t, loc_sk.t, loc_sl.t in t.

0.56 Sun Aug 8 20:24:03 2010
    - Unicode::Collate::Locale newly supports locales: eo, nb, ro, sv.
    - added loc_eo.t, loc_es.t, loc_estr.t, loc_nb.t, loc_ro.t, loc_sv.t in t.
    ! renamed t/locale_{xy}.t to t/loc_{xy}.t (for safer 8.3 names)
      (loc_cs.t, loc_fr.t, loc_nn.t, loc_pl.t, loc_test.t)

0.55 Sun Aug 1 21:21:23 2010
    - incorporated Unicode::Collate::Locale with some changes. see:
      http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2004-03/msg00030.html
    - supported locales: cs, es, es__traditional, fr, nn, pl.
    ! added t/locale*.t that uses DUCET.
      (locale_cs.t, locale_fr.t, locale_nn.t, locale_pl.t, locale_test.t)
    - data/*.txt and mklocale for preparation of Locale/*.pl from DUCET.

0.54 Sun Jul 25 21:37:04 2010
    - Now UCA Revision 20 (based on Unicode 5.2.0).
    - DUCET is also updated (for Unicode 5.2.0) as Collate/allkeys.txt,
      which *is required* to test this module.
    ! Please note that any existing allkeys.txt will be overwritten.
    - U+9FC4..U+9FCB and U+2A700..U+2B734 are new CJK unified ideographs.
    - Many hangul jamo are assigned (affecting hangul_terminator).
    ! Now XSUB will be built by default. (XSUB needs a C compiler.)
      To build pure perl, run disableXS before Makefile.PL.
    ! DUCET will be compiled when XS is used. Explicitly specifying
      <table => 'allkeys.txt'> (or using another table) will prevent
      this module from using the compiled DUCET.
    ! added t/default.t that uses DUCET.

0.53 Sun Feb 14 20:46:27 2010
    - Now UCA Revision 18 (based on Unicode 5.1.0).
    - DUCET is also updated (for Unicode 5.1.0) as Collate/allkeys.txt,
      which is not required to test this module.
    ! Please note that any existing allkeys.txt will be overwritten.
    - U+9FBC..U+9FC3 are new CJK unified ideographs.

0.52 Thu Oct 13 21:51:09 2005
    - The Unicode::Collate->new method no longer clobbers the user's $_.
      (thanks to Jon Warbrick for the bug report)

0.51 Sun May 29 20:21:19 2005
    - Added the latest DUCET (for Unicode 4.1.0) as Collate/allkeys.txt,
      which is not required to test this module.
    ! Please note that any existing allkeys.txt will be overwritten.
    - Added INSTALL section in POD.

0.50 Sun May 8 20:26:39 2005
    - Now UCA Revision 14 (based on Unicode 4.1.0).
    - Some tests are modified.
    - Added cjkrange.t, ignor.t, override.t in t.

0.40 Sat Apr 24 06:54:40 2004
    - Now a table file is searched for in @INC.

0.33 Sat Dec 13 14:07:27 2003
    - documentation improvement: in "entry", "overrideHangul", etc.

0.32 Wed Dec 3 23:38:18 2003
    - A matching part from index(), match() etc. will include illegal
      code points (as well as ignorable characters) following a grapheme.
    - Contraction with illegal code point will be invalid.
    - Added some tests in t/illegal.t.
    - Separated t/altern.t and t/rearrang.t from t/test.t.
    - modified XSUB internals.

0.31 Sun Nov 16 15:40:15 2003
    - Illegal code points (surrogate and noncharacter; they are definitely
      ignorable) will be distinguished from NULL ("\0");
      but porting is not successful in the case of ((Pure Perl) and
      (Perl 5.7.3 or before)). If perl 5.6.X is used, XSUB may help it
      in place of broken CORE::unpack('U*') in older perl.
    - added illegal.t and illegalp.t in t.
    - added XSUB where some functions are implemented in XSUB.
      Pure Perl is also supported.

0.30 Mon Oct 13 21:26:37 2003
    - fix: Completely ignorable in table should be able to be overridden
      by non-ignorable in entry.
    - fix: Maximum length for contraction must not be shortened
      by a shorter contraction following in table and/or entry.

0.29 Mon Oct 13 12:18:23 2003
    - now UCA Version 11 (but no functionality is different from Version 9).
    - supported hangul_terminator.
    - fix: Base_Unicode_Version wrongly returned Perl's Unicode version.
      C4 in UTS #10 requires the UTS's Unicode version.
    - For variable weighting, 'variable' is recommended
      and 'alternate' is deprecated.
    - added version() method.
    - added hangtype.t, trailwt.t, variable.t, and version.t in t.

0.28 Sat Sep 06 20:16:01 2003
    - Fixed another inconsistency under (normalization => undef):
      Non-contiguous contraction is always neglected.
    - Fixed: according to S2.1 in UTS #10, a blocked combining character
      should not be contracted. One test in t/test.t was wrong and has
      been removed.
    - Added t/contract.t.
    - (normalization => "prenormalized") can now be used.

0.27 Sun Aug 31 22:23:17 2003
    - The maximum length of contracted CE was not checked (v0.22 to v0.26).
      Collation of a large string including a first letter of a contraction
      that is not a part of that contraction (say, 'c' of 'ca'
      where 'ch' is defined) was too slow.
    - A form name for 'normalization', no longer restricted to
      /^(?:NF)?K?[CD]\z/, will be allowed as long as
      Unicode::Normalize::normalize() accepts it, since Unicode::Normalize
      or UAX #15 may be changed/enhanced in future.
    - When Hangul syllables are decomposed under <normalization => undef>,
      contraction among jamo (LV, VT, LVT) derived from the same
      Hangul syllable is allowed.

0.26 Sun Aug 03 22:23:17 2003
    - fix: an expansion in which a CE is level 3 ignorable and others are not
      was wrongly made level 3 ignorable as a whole entry.
      (In DUCET, some precomposed characters in Musical Symbols are
      such entries.)

0.25 Mon Jun 06 23:20:17 2003
    - internal tweak (again): pack_U() and unpack_U().

0.24 Thu Apr 02 23:12:54 2003
    - internal tweak for (?un)pack 'U'.

0.23 Wed Sep 04 19:25:20 2002
    - fix: scalar match() no longer returns an lvalue substr ref.
    - fix: "Ignorable after variable" should be made level 3 ignorable
      even if alternate => 'blanked'.
    - Now a grapheme may contain trailing level 2, level 3,
      and completely ignorable characters.

0.22 Mon Sep 02 23:15:14 2002
    - New File: t/index.t.
      (The new t/test.t excludes tests for index.)
    - tweak on index(). POSITION is supported.
    - add match, gmatch, subst, gsubst methods.
    - fix: ignorable after variable in 'shift'-variable weight.

0.21 Sat Aug 03 10:24:00 2002
    - upgrade keys.txt and t/test.t for UCA Version 9.

0.20 Fri Jul 26 02:15:25 2002
    - U+FDD0..U+FDEF are new non-characters.
    - fix: whitespace characters before @backwards etc. in a table file.
    - now values for 'alternate', 'backwards', etc.,
      which are explicitly specified via new(),
      are preferred to those specified in a table file.

0.12 Sun May 05 09:43:10 2002
    - add new methods, ->UCA_Version and ->Base_Unicode_Version.
    - test fix: removed the needless requirement of Unicode::Normalize.
      [reported by David Hand]

0.11 Fri May 03 02:28:10 2002
    - fix: now derived collation elements can be used for Hangul Jamo
      when their weights are not defined.
      [reported by Andreas J. Koenig]
    - fix: rearrangement did not work.
    - mentioned a problem with index() in BUGS.
    - more documents, more tests.
    - tag names for 'alternate' are case-insensitive (i.e. 'SHIFTed' etc.).
    - The <undef> value for the keys "overrideCJK", "overrideHangul",
      "rearrange" has a special behavior (different from default).

0.10 Tue Dec 11 23:26:42 2001
    - a table file is no longer required.
    - fix: fetching CE with two or more combining characters.

0.09 Sun Nov 11 17:02:40:18 2001
    - add the following methods: eq, ne, lt, le, gt, ge.
    - relies on &Unicode::Normalize::getCombinClass()
      in place of %Unicode::Normalize::Combin
      (the hash is not defined in the XS version of Unicode::Normalize),
      so you should install Unicode::Normalize 0.10 or later.
    - now independent of Lingua::KO::Hangul::Util
      (this module decomposes Hangul syllables by itself)

0.08 Mon Aug 20 22:40:18 2001
    - add the index method.

0.07 Thu Aug 16 23:42:02 2001
    - renamed the module to Unicode::Collate.

0.06 Thu Aug 16 23:18:36 2001
    - add description of the getSortKey method.

0.05 Mon Aug 13 22:23:11 2001
    - bug fix regarding section 4.2.1 of UTR #10.
    - getSortKey now returns a string, not an arrayref.

0.04 Mon Aug 13 22:23:11 2001
    - some bugs are fixed.
    - some tailoring parameters are added.

0.03 Mon Aug 06 06:26:35 2001

0.02 Sun Aug 05 20:20:01 2001

0.01 Sun Jul 29 16:16:15 2001
    - original version; created by h2xs 1.21
      with options -A -X -n Sort::UCA