All but ../lib/Unicode/UCD.t pass.
p4raw-id: //depot/perlio@14412
Version v5.7.2 Development release working toward v5.8
--------------
____________________________________________________________________________
+[ 14388] By: jhi on 2002/01/23 15:04:06
+ Log: Small update on todo. Could use a lot more.
+ Branch: perl
+ ! pod/perltodo.pod
+____________________________________________________________________________
+[ 14386] By: jhi on 2002/01/23 14:19:01
+ Log: Subject: [ID 20020121.003] perldata doco bug
+ From: John Stumbles <jstumbles@bluearc.com>
+ Date: Mon, 21 Jan 2002 14:19:56 -0000
+ Message-Id: <4586CA8FDDC2D411A1C700508BB4AC33016765D1@ukexchange.synaxia.com>
+ Branch: perl
+ ! pod/perldata.pod
+____________________________________________________________________________
+[ 14385] By: jhi on 2002/01/23 14:17:52
+ Log: Subject: [PATCH] Re: bless() bug ? Why fails reblessing of 'main::Object' to 'Object' ?
+ From: Michael G Schwern <schwern@pobox.com>
+ Date: Mon, 21 Jan 2002 15:16:42 -0500
+ Message-ID: <20020121201642.GA6659@blackrider>
+ Branch: perl
+ ! embed.fnc embed.h proto.h t/op/universal.t universal.c
+____________________________________________________________________________
+[ 14384] By: jhi on 2002/01/23 13:45:30
+ Log: Subject: [PATCH] Re: eval not catching warnings?
+ From: abigail@foad.org
+ Date: Wed, 23 Jan 2002 15:35:47 +0100
+ Message-ID: <20020123143547.24798.qmail@foad.org>
+ Branch: perl
+ ! pod/perlfunc.pod
+____________________________________________________________________________
+[ 14383] By: jhi on 2002/01/23 03:57:58
+ Log: Subject: [ID 20020122.012] Not OK: perl v5.7.2 +DEVEL14368 on cygwin-multi-64int 1.3.6(0.4732) (UNINSTALLED)
+ From: sthoenna@efn.org (Yitzchak Scott-Thoennes)
+ Date: Tue, 22 Jan 2002 17:00:54 -0800
+ Message-Id: <GtgT8gzkg+CG092yn@efn.org>
+ Branch: perl
+ ! t/run/fresh_perl.t
+____________________________________________________________________________
+[ 14381] By: jhi on 2002/01/23 03:47:05
+ Log: HP-UX 10.20 cc pacifying from Merijn.
+ Branch: perl
+ ! Configure
+____________________________________________________________________________
+[ 14380] By: jhi on 2002/01/23 03:41:26
+ Log: Subject: [PATCH] test.com shebang handling (was Re: VMS@14369)
+ From: "Craig A. Berry" <craigberry@mac.com>
+ Date: Tue, 22 Jan 2002 22:14:26 -0600
+ Message-Id: <a05101000b873de9cf801@[172.16.52.1]>
+ Branch: perl
+ ! vms/test.com
+____________________________________________________________________________
+[ 14379] By: jhi on 2002/01/23 03:11:31
+ Log: AIX cpp bug: having macro arguments and character constants
+ "the same" means trouble (here s and 's')
+ What broke now was 841 and 842 of t/op/pat.t, because of the
+ ANYOF_UNICODE_FOLD_SHARP_S() in utf8.h, ccversion 5.0.1.0
+ (note that breakage happened only under cc_r and usethreads+
+ useithreads)
+ Branch: perl
+ ! utf8.h
+____________________________________________________________________________
+[ 14376] By: jhi on 2002/01/22 16:46:48
+ Log: Subject: [PATCH] Support truncate() in VOS port
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Mon, 21 Jan 02 23:27 est
+ Message-Id: <200201220428.XAA15304@mailhub1.stratus.com>
+ Branch: perl
+ + vos/vos.c
+ ! MANIFEST hints/vos.sh vos/vosish.h
+____________________________________________________________________________
+[ 14374] By: jhi on 2002/01/22 16:36:41
+ Log: Get rid of the _() macro since (1) we require ANSI anyway
+ (2) Other software (NU) seems to use it and we don't need it,
+ so let's give it up.
+ Branch: perl
+ ! Configure Porting/Glossary Porting/config.sh Porting/config_H
+ ! config_h.SH ext/Devel/PPPort/PPPort.pm malloc.c
+ ! plan9/config.plan9 pod/perltoc.pod uconfig.h
+ ! vos/config.alpha.h vos/config.ga.h win32/config_H.bc
+ ! win32/config_H.gc win32/config_H.vc win32/config_H.win64
+ ! wince/config.h wince/config_H.ce
+____________________________________________________________________________
+[ 14371] By: jhi on 2002/01/21 19:36:04
+ Log: Subject: [BUG] /\_/ an unrecognized escape?
+ From: Michael G Schwern <schwern@pobox.com>
+ Date: Mon, 21 Jan 2002 15:22:54 -0500
+ Message-ID: <20020121202254.GA6731@blackrider>
+ Branch: perl
+ ! t/lib/warnings/toke toke.c
+____________________________________________________________________________
+[ 14370] By: jhi on 2002/01/21 19:16:00
+ Log: Undo the renaming of the Unicode data files; the simple
+ solution being not including the *.html files. This means
+ that in the future there is no need for any renamings
+ (well, assuming that the Consortium doesn't cause any),
+ and the files are named like they are in the Consortium
+ website, thus alleviating confusion.
+ Branch: perl
+ + lib/unicore/ArabicShaping.txt lib/unicore/BidiMirroring.txt
+ + lib/unicore/CaseFolding.txt
+ + lib/unicore/CompositionExclusions.txt
+ + lib/unicore/EastAsianWidth.txt lib/unicore/LineBreak.txt
+ + lib/unicore/SpecialCasing.txt lib/unicore/UnicodeData.txt
+ - lib/unicore/ArabShap.txt lib/unicore/BidiMirr.txt
+ - lib/unicore/CaseFold.txt lib/unicore/CompExcl.txt
+ - lib/unicore/EAWidth.txt lib/unicore/LineBrk.txt
+ - lib/unicore/NamesList.html lib/unicore/PropList.html
+ - lib/unicore/SpecCase.txt lib/unicore/UCD.html
+ - lib/unicore/Unicode.html lib/unicore/Unicode.txt
+ - lib/unicore/rename
+ ! (edit 285 files)
+____________________________________________________________________________
+[ 14369] By: jhi on 2002/01/21 15:10:59
+ Log: Update Changes.
+ Branch: perl
+ ! Changes patchlevel.h
+____________________________________________________________________________
+[ 14368] By: jhi on 2002/01/21 14:56:42
+ Log: Subject: RE: BCC for Win32 is unhappy @14331
+ From: "Konovalov, Vadim" <vkonovalov@spb.Lucent.com>
+ Date: Mon, 21 Jan 2002 09:48:40 +0300
+ Message-ID: <80C37C8B4041FB4F9135D70A0AAD71B30F00E5@ru0028exch01.spb.lucent.com>
+
+ (using IV instead of int)
+ Branch: perl
+ ! hv.h
+____________________________________________________________________________
+[ 14367] By: jhi on 2002/01/21 14:29:05
+ Log: path_is_absolute() needs a prototype (and maybe
+ a better name, but that's another story).
+ Branch: perl
+ ! embed.fnc embed.h pp_ctl.c proto.h
+____________________________________________________________________________
+[ 14366] By: jhi on 2002/01/21 14:07:35
+ Log: Subject: Re: coderefs in @INC
+ From: Chris Nandor <pudge@pobox.com>
+ Date: Thu, 17 Jan 2002 11:26:02 -0500
+ Message-Id: <p0510030eb86ca7bc03e0@[10.0.1.177]>
+ Branch: perl
+ ! pp_ctl.c
+____________________________________________________________________________
+[ 14365] By: jhi on 2002/01/21 14:06:11
+ Log: Subject: Re: [PATCH warnings, perldiag] document diagnostics
+ From: Rafael Garcia-Suarez <rgarciasuarez@free.fr>
+ Date: Sun, 20 Jan 2002 22:53:56 +0100
+ Message-ID: <20020120225356.A12093@rafael>
+ Branch: perl
+ ! lib/warnings.pm pod/perldiag.pod t/lib/warnings/2use
+ ! t/lib/warnings/9enabled warnings.h warnings.pl
+____________________________________________________________________________
+[ 14364] By: jhi on 2002/01/21 03:53:08
+ Log: Subject: [PATCH] MakeMaker.pm sub-Makefile.PL tweak for VMS
+ From: "Craig A. Berry" <craigberry@mac.com>
+ Date: Sun, 20 Jan 2002 22:49:42 -0600
+ Message-Id: <a05101004b8714273d60c@[172.16.52.1]>
+ Branch: perl
+ ! lib/ExtUtils/MakeMaker.pm
+____________________________________________________________________________
+[ 14360] By: jhi on 2002/01/20 17:11:12
+ Log: Subject: [PATCH] lib/ExtUtils/t/Command.t -- VOS Fixes
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Sat, 19 Jan 02 17:17 est
+ Message-Id: <200201192218.RAA07911@mailhub1.stratus.com>
+
+ Subject: [PATCH] lib/ExtUtils/t/Command.t -- add more tests
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Sat, 19 Jan 02 17:19 est
+ Message-Id: <200201192218.RAA12331@mailhub2.stratus.com>
+ Branch: perl
+ ! lib/ExtUtils/t/Command.t
+____________________________________________________________________________
+[ 14359] By: jhi on 2002/01/20 17:09:40
+ Log: Subject: [PATCH] lib/AnyDBM_File.t -- VOS fix
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Sat, 19 Jan 02 14:30 est
+ Message-Id: <200201191932.OAA05601@mailhub1.stratus.com>
+ Branch: perl
+ ! lib/AnyDBM_File.t
+____________________________________________________________________________
+[ 14358] By: jhi on 2002/01/20 17:01:38
+ Log: Subject: [REPATCH] Re: [PATCH] ext/File/Glob/t/basic.t - VOS patch
+ From: Rafael Garcia-Suarez <rgarciasuarez@free.fr>
+ Date: Sun, 20 Jan 2002 13:51:36 +0100
+ Message-ID: <20020120135136.A710@rafael>
+ Branch: perl
+ ! ext/File/Glob/t/basic.t
+____________________________________________________________________________
+[ 14354] By: jhi on 2002/01/20 06:35:54
+ Log: Also make hex() and oct() croak if their arguments
+ cannot be downgraded (that is, if they contain wide
+ characters), just like crypt() does (and use the croak
+ feature of sv_utf8_downgrade()).
+ Branch: perl
+ ! pp.c t/op/oct.t
+____________________________________________________________________________
+[ 14351] By: jhi on 2002/01/19 21:06:58
+ Log: Regen toc.
+ Branch: perl
+ ! pod/perltoc.pod
+____________________________________________________________________________
+[ 14350] By: jhi on 2002/01/19 21:03:07
+ Log: Subject: Re: some file names inside pod/perl*delta files don't match .pod names
+ From: David Dyck <dcd@tc.fluke.com>
+ Date: Sat, 19 Jan 2002 13:31:36 -0800 (PST)
+ Message-ID: <Pine.LNX.4.33.0201191329420.21630-100000@dd.tc.fluke.com>
+ Branch: perl
+ ! pod/perl5004delta.pod pod/perl5005delta.pod
+ ! pod/perl561delta.pod pod/perl56delta.pod
+____________________________________________________________________________
+[ 14349] By: jhi on 2002/01/19 20:07:17
+ Log: FAQ sync.
+ Branch: perl
+ ! pod/perlfaq1.pod pod/perlfaq2.pod
+____________________________________________________________________________
+[ 14348] By: jhi on 2002/01/19 17:58:34
+ Log: Subject: [PATCH lib/Pod/t/Usage.t]
+ From: Abe Timmerman <abe@ztreet.demon.nl>
+ Date: Sat, 19 Jan 2002 19:57:57 +0100
+ Message-ID: <f1gj4usu5m76bv88a3ldptnmo6ld7d44ri@4ax.com>
+ Branch: perl
+ + lib/Pod/t/Usage.t
+ ! MANIFEST
+____________________________________________________________________________
+[ 14347] By: jhi on 2002/01/19 17:57:08
+ Log: Subject: [PATCH] ext/SDBM_File/sdbm.t -- VOS fix
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Sat, 19 Jan 02 13:15 est
+ Message-Id: <200201191814.NAA09367@mailhub2.stratus.com>
+ Branch: perl
+ ! ext/SDBM_File/sdbm.t
+____________________________________________________________________________
+[ 14346] By: jhi on 2002/01/19 17:55:55
+ Log: Integrate perlio; Split out core of sv_magic() into sv_magicext().
+ Branch: perl
+ !> embed.fnc embed.h embedvar.h global.sym perlapi.c perlapi.h
+ !> pod/perlapi.pod pod/perlintern.pod proto.h sv.c util.c
+____________________________________________________________________________
+[ 14345] By: gbarr on 2002/01/19 16:51:23
+ Log: Alternative Time::Local algorithm that uses a mathematical formula
+ for timegm instead of progressive guessing
+ Branch: perl
+ ! lib/Time/Local.pm
+____________________________________________________________________________
+[ 14344] By: jhi on 2002/01/19 16:11:09
+ Log: Subject: [PATCH] ext/File/Glob/t/basic.t - VOS patch
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Sat, 19 Jan 02 11:20 est
+ Message-Id: <200201191619.LAA07751@mailhub2.stratus.com>
+ Branch: perl
+ ! ext/File/Glob/t/basic.t
+____________________________________________________________________________
+[ 14343] By: jhi on 2002/01/19 16:09:52
+ Log: Subject: [PATCH] ext/Cwd/t/cwd.t -- for VOS
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Fri, 18 Jan 02 22:56 est
+ Message-Id: <200201190400.XAA16899@mailhub1.stratus.com>
+ Branch: perl
+ ! ext/Cwd/t/cwd.t
+____________________________________________________________________________
+[ 14342] By: jhi on 2002/01/19 16:08:19
+ Log: Subject: [PATCH] lib/Time/Local.t patch w/o 2038 check
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Fri, 18 Jan 02 22:48 est
+ Message-Id: <200201190349.WAA28294@mailhub2.stratus.com>
+ Branch: perl
+ ! lib/Time/Local.t
+____________________________________________________________________________
+[ 14341] By: jhi on 2002/01/19 16:06:56
+ Log: Subject: [PATCH] ext/Devel/DProf/DProf.t - different way
+ From: Paul_GreenVOS@vos.stratus.com
+ Date: Fri, 18 Jan 02 16:22 est
+ Message-Id: <200201182125.QAA08563@mailhub1.stratus.com>
+ Branch: perl
+ ! ext/Devel/DProf/DProf.t
+____________________________________________________________________________
+[ 14340] By: jhi on 2002/01/19 03:00:26
+ Log: Subject: [PATCH] Re: ext/Devel/DProf/DProf.t -- use exe_ext
+ From: Tels <perl_dummy@bloodgate.com>
+ Date: Fri, 18 Jan 2002 22:07:14 +0100 (CET)
+ Message-Id: <200201182106.XAA19133@taku.hut.fi>
+ Branch: perl
+ ! t/comp/script.t t/op/stat.t t/run/fresh_perl.t
+____________________________________________________________________________
+[ 14339] By: jhi on 2002/01/19 02:56:26
+ Log: Don't do socketpair udp unless you've got all it takes.
+ Branch: perl
+ ! util.c
+____________________________________________________________________________
+[ 14338] By: jhi on 2002/01/19 02:51:34
+ Log: Subject: [PATCH] Re: perl@14331 - BeOS now quite happy
+ From: Tels <perl_dummy@bloodgate.com>
+ Date: Fri, 18 Jan 2002 23:07:01 +0100 (CET)
+ Message-Id: <200201182206.AAA15310@taku.hut.fi>
+ Branch: perl
+ ! lib/ExtUtils/MM_BeOS.pm
+____________________________________________________________________________
+[ 14336] By: jhi on 2002/01/19 02:47:50
+ Log: Avoid bare "set", and kiss .uucp goodbye.
+ Branch: perl
+ ! Configure
+____________________________________________________________________________
+[ 14334] By: jhi on 2002/01/18 21:16:08
+ Log: Retract #14327 for now, going to the limit seems
+ to be too much for many platforms.
+ Branch: perl
+ ! lib/Time/Local.t
+____________________________________________________________________________
+[ 14331] By: jhi on 2002/01/18 16:09:38
+ Log: Update Changes.
+ Branch: perl
+ ! Changes patchlevel.h
+____________________________________________________________________________
[ 14330] By: jhi on 2002/01/18 15:07:49
Log: Try to make the connect/read/write timeouting.
Branch: perl
# $Id: Head.U,v 3.0.1.9 1997/02/28 15:02:09 ram Exp $
#
-# Generated on Sat Jan 19 05:47:21 EET 2002 [metaconfig 3.0 PL70]
+# Generated on Thu Jan 24 16:12:51 EET 2002 [metaconfig 3.0 PL70]
# (with additional metaconfig patches by perlbug@perl.org)
cat >c1$$ <<EOF
#endif
int main() {
#if BYTEORDER == 0x1234 || BYTEORDER == 0x4321
- U8 buf[] = "\0\0\0\1\0\0\0\0";
+ U8 *buf = (U8*)"\0\0\0\1\0\0\0\0";
U32 *up;
int i;
/* EXTERN.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* INTERN.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
lib/Unicode/UCD.t See if Unicode character database works
lib/unicore/ArabLink.pl Unicode character database
lib/unicore/ArabLnkGrp.pl Unicode character database
-lib/unicore/ArabShap.txt Unicode character database
-lib/unicore/BidiMirr.txt Unicode character database
+lib/unicore/ArabicShaping.txt Unicode character database
+lib/unicore/BidiMirroring.txt Unicode character database
lib/unicore/Bidirectional.pl Unicode character database
lib/unicore/Blocks.pl Unicode character database
lib/unicore/Blocks.txt Unicode character database
lib/unicore/Canonical.pl Unicode character database
-lib/unicore/CaseFold.txt Unicode character database
+lib/unicore/CaseFolding.txt Unicode character database
lib/unicore/Category.pl Unicode character database
lib/unicore/CombiningClass.pl Unicode character database
-lib/unicore/CompExcl.txt Unicode character database
+lib/unicore/CompositionExclusions.txt Unicode character database
lib/unicore/Decomposition.pl Unicode character database
-lib/unicore/EAWidth.txt Unicode character database
+lib/unicore/EastAsianWidth.txt Unicode character database
lib/unicore/Exact.pl Unicode character database
lib/unicore/Index.txt Unicode character database
lib/unicore/Jamo.txt Unicode character database
lib/unicore/lib/_CanonDC.pl Unicode character database
lib/unicore/lib/_CaseIgn.pl Unicode character database
lib/unicore/lib/_CombAbo.pl Unicode character database
-lib/unicore/LineBrk.txt Unicode character database
+lib/unicore/LineBreak.txt Unicode character database
lib/unicore/Makefile Unicode character database
lib/unicore/mktables Unicode character database generator
lib/unicore/Name.pl Unicode character database
-lib/unicore/NamesList.html Unicode character database
lib/unicore/NamesList.txt Unicode character database
lib/unicore/Number.pl Unicode character database
lib/unicore/Properties Built-in \p{...} / \P{...} property list
-lib/unicore/PropList.html Unicode character database
lib/unicore/PropList.txt Unicode character database
lib/unicore/README.perl Unicode character database
lib/unicore/ReadMe.txt Unicode character database info
-lib/unicore/rename Filename mappings used
lib/unicore/Scripts.pl Unicode character database
lib/unicore/Scripts.txt Unicode character database
-lib/unicore/SpecCase.txt Unicode character database
+lib/unicore/SpecialCasing.txt Unicode character database
lib/unicore/To/Digit.pl Unicode character database
lib/unicore/To/Fold.pl Unicode character database
lib/unicore/To/Lower.pl Unicode character database
lib/unicore/To/Title.pl Unicode character database
lib/unicore/To/Upper.pl Unicode character database
-lib/unicore/UCD.html Unicode character database
-lib/unicore/Unicode.html Unicode character database
-lib/unicore/Unicode.txt Unicode character database
+lib/unicore/UnicodeData.txt Unicode character database
lib/unicore/version The version of the Unicode
lib/UNIVERSAL.pm Base class for ALL classes
lib/User/grent.pm By-name interface to Perl's builtin getgr*
vos/Makefile A helper for maintaining the config.*.* in UNIX
vos/perl.bind VOS bind control file
vos/test_vos_dummies.c Test program for "vos_dummies.c"
+vos/vos.c VOS emulations for missing POSIX functions
vos/vosish.h VOS-specific header file
vos/vos_dummies.c Wrappers to soak up undefined functions
warnings.h The warning numbers
find libraries. It may contain directories that do not exist on
this platform, libpth is the cleaned-up version.
+gmake (Loc.U):
+ This variable is used internally by Configure to determine the
+ full pathname (if any) of the gmake program. After Configure runs,
+ the value is reset to a plain "gmake" and is not useful.
+
grep (Loc.U):
This variable is used internally by Configure to determine the
full pathname (if any) of the grep program. After Configure runs,
shrpenv=''
See the main perl Makefile.SH for actual working usage.
Alternatively, we might be able to use a command line option such
- as -R $archlibexp/CORE (Solaris, NetBSD) or -Wl,-rpath
+ as -R $archlibexp/CORE (Solaris) or -Wl,-rpath
$archlibexp/CORE (Linux).
shsharp (spitshell.U):
# Package name : perl5
# Source directory : .
-# Configuration time: Fri Nov 23 21:51:58 EET 2001
+# Configuration time: Tue Jan 22 18:37:28 EET 2002
# Configured by : jhi
# Target system : osf1 alpha.hut.fi v4.0 878 alpha
ccversion='V5.6-082'
cf_by='jhi'
cf_email='yourname@yourhost.yourplace.com'
-cf_time='Fri Nov 23 21:51:58 EET 2001'
+cf_time='Tue Jan 22 18:37:28 EET 2002'
charsize='1'
chgrp=''
chmod='chmod'
dlsrc='dl_dlopen.xs'
doublesize='8'
drand01='drand48()'
-dynamic_ext='B ByteLoader Cwd DB_File Data/Dumper Devel/DProf Devel/Peek Digest/MD5 Encode Fcntl File/Glob Filter/Util/Call I18N/Langinfo IO IPC/SysV List/Util MIME/Base64 NDBM_File ODBM_File Opcode POSIX PerlIO/Scalar PerlIO/Via SDBM_File Socket Storable Sys/Hostname Sys/Syslog Time/HiRes Unicode/Normalize XS/Typemap attrs re'
+dynamic_ext='B ByteLoader Cwd DB_File Data/Dumper Devel/DProf Devel/PPPort Devel/Peek Digest/MD5 Encode Fcntl File/Glob Filter/Util/Call I18N/Langinfo IO IPC/SysV List/Util MIME/Base64 NDBM_File ODBM_File Opcode POSIX PerlIO/Scalar PerlIO/Via SDBM_File Socket Storable Sys/Hostname Sys/Syslog Time/HiRes Unicode/Normalize XS/Typemap attrs re'
eagain='EAGAIN'
ebcdic='undef'
echo='echo'
eunicefix=':'
exe_ext=''
expr='expr'
-extensions='B ByteLoader Cwd DB_File Data/Dumper Devel/DProf Devel/Peek Digest/MD5 Encode Fcntl File/Glob Filter/Util/Call I18N/Langinfo IO IPC/SysV List/Util MIME/Base64 NDBM_File ODBM_File Opcode POSIX PerlIO/Scalar PerlIO/Via SDBM_File Socket Storable Sys/Hostname Sys/Syslog Time/HiRes Unicode/Normalize XS/Typemap attrs re Devel/PPPort Errno'
+extensions='B ByteLoader Cwd DB_File Data/Dumper Devel/DProf Devel/PPPort Devel/Peek Digest/MD5 Encode Fcntl File/Glob Filter/Util/Call I18N/Langinfo IO IPC/SysV List/Util MIME/Base64 NDBM_File ODBM_File Opcode POSIX PerlIO/Scalar PerlIO/Via SDBM_File Socket Storable Sys/Hostname Sys/Syslog Time/HiRes Unicode/Normalize XS/Typemap attrs re Errno'
extras=''
fflushNULL='define'
fflushall='undef'
gidsize='4'
gidtype='gid_t'
glibpth='/usr/shlib /usr/ccs/lib /usr/lib/cmplrs/cc /usr/lib /usr/local/lib /var/shlib'
+gmake='gmake'
grep='grep'
groupcat='cat /etc/group'
groupstype='gid_t'
ivdformat='"ld"'
ivsize='8'
ivtype='long'
-known_extensions='B ByteLoader Cwd DB_File Data/Dumper Devel/DProf Devel/Peek Digest/MD5 Encode Fcntl File/Glob Filter/Util/Call GDBM_File I18N/Langinfo IO IPC/SysV List/Util MIME/Base64 NDBM_File ODBM_File Opcode POSIX PerlIO/Scalar PerlIO/Via SDBM_File Socket Storable Sys/Hostname Sys/Syslog Thread Time/HiRes Unicode/Normalize XS/Typemap attrs re threads threads/shared'
+known_extensions='B ByteLoader Cwd DB_File Data/Dumper Devel/DProf Devel/PPPort Devel/Peek Digest/MD5 Encode Fcntl File/Glob Filter/Util/Call GDBM_File I18N/Langinfo IO IPC/SysV List/Util MIME/Base64 NDBM_File ODBM_File Opcode POSIX PerlIO/Scalar PerlIO/Via SDBM_File Socket Storable Sys/Hostname Sys/Syslog Thread Time/HiRes Unicode/Normalize XS/Typemap attrs re threads threads/shared'
ksh=''
ld='ld'
lddlflags='-shared -expect_unresolved "*" -msym -std -s'
nm='nm'
nm_opt='-p'
nm_so_opt=''
-nonxs_ext='Devel/PPPort Errno'
+nonxs_ext='Errno'
nroff='nroff'
nvEUformat='"E"'
nvFUformat='"F"'
path_sep=':'
perl5='perl'
perl=''
-perl_patchlevel='13165'
+perl_patchlevel='14368'
perladmin='yourname@yourhost.yourplace.com'
perllibs='-lm -lutil'
-perlpath='/opt/perl/bin/perl'
+perlpath='/opt/perl/bin/perl5.7.2'
pg='pg'
phostname=''
pidtype='pid_t'
vendorprefix=''
vendorprefixexp=''
version='5.7.2'
-version_patchlevel_string='version 7 subversion 2 patch 13165'
+version_patchlevel_string='version 7 subversion 2 patch 14368'
versiononly='define'
vi=''
voidflags='15'
PERL_API_REVISION=5
PERL_API_VERSION=5
PERL_API_SUBVERSION=0
-PERL_PATCHLEVEL=13165
+PERL_PATCHLEVEL=14368
PERL_CONFIG_SH=true
# Variables propagated from previous config.sh file.
pp_sys_cflags='ccflags="$ccflags -DNO_EFF_ONLY_OK"'
/*
* Package name : perl5
* Source directory : .
- * Configuration time: Fri Nov 23 21:51:58 EET 2001
+ * Configuration time: Tue Jan 22 18:37:28 EET 2002
* Configured by : jhi
* Target system : osf1 alpha.hut.fi v4.0 878 alpha
*/
#define const
#endif
-/* HAS_CRYPT:
- * This symbol, if defined, indicates that the crypt routine is available
- * to encrypt passwords and the like.
- */
-#define HAS_CRYPT /**/
-
/* HAS_CUSERID:
* This symbol, if defined, indicates that the cuserid routine is
* available to get character login names.
*/
#define HAS_SETSID /**/
-/* Shmat_t:
- * This symbol holds the return type of the shmat() system call.
- * Usually set to 'void *' or 'char *'.
- */
-/* HAS_SHMAT_PROTOTYPE:
- * This symbol, if defined, indicates that the sys/shm.h includes
- * a prototype for shmat(). Otherwise, it is up to the program to
- * guess one. Shmat_t shmat _((int, Shmat_t, int)) is a good guess,
- * but not always right so it should be emitted by the program only
- * when HAS_SHMAT_PROTOTYPE is not defined to avoid conflicting defs.
- */
-#define Shmat_t void * /**/
-#define HAS_SHMAT_PROTOTYPE /**/
-
/* HAS_STRCHR:
* This symbol is defined to indicate that the strchr()/strrchr()
* functions are available for string searching. If not, try the
*/
/*#define I_MEMORY / **/
-/* I_NDBM:
- * This symbol, if defined, indicates that <ndbm.h> exists and should
- * be included.
- */
-#define I_NDBM /**/
-
/* I_NET_ERRNO:
* This symbol, if defined, indicates that <net/errno.h> exists and
* should be included.
*/
/*#define I_VFORK / **/
-/* CAN_PROTOTYPE:
- * If defined, this macro indicates that the C compiler can handle
- * function prototypes.
- */
-/* _:
- * This macro is used to declare function parameters for folks who want
- * to make declarations with prototypes using a different style than
- * the above macros. Use double parentheses. For example:
- *
- * int main _((int argc, char *argv[]));
- */
-#define CAN_PROTOTYPE /**/
-#ifdef CAN_PROTOTYPE
-#define _(args) args
-#else
-#define _(args) ()
-#endif
-
/* INTSIZE:
* This symbol contains the value of sizeof(int) so that the C
* preprocessor can make decisions based on it.
* This symbol, if defined, indicates that the system provides
* a prototype for the drand48() function. Otherwise, it is up
* to the program to supply one. A good guess is
- * extern double drand48 _((void));
+ * extern double drand48(void);
*/
#define HAS_DRAND48_PROTO /**/
* This symbol, if defined, indicates that the system provides
* a prototype for the sbrk() function. Otherwise, it is up
* to the program to supply one. Good guesses are
- * extern void* sbrk _((int));
- * extern void* sbrk _((size_t));
+ * extern void* sbrk(int);
+ * extern void* sbrk(size_t);
*/
#define HAS_SBRK_PROTO /**/
* This symbol, if defined, indicates that the system provides
* a prototype for the telldir() function. Otherwise, it is up
* to the program to supply one. A good guess is
- * extern long telldir _((DIR*));
+ * extern long telldir(DIR*);
*/
#define HAS_TELLDIR_PROTO /**/
#define PERL_XS_APIVERSION "5.005"
#define PERL_PM_APIVERSION "5.005"
+/* HAS_CRYPT:
+ * This symbol, if defined, indicates that the crypt routine is available
+ * to encrypt passwords and the like.
+ */
+#define HAS_CRYPT /**/
+
/* SETUID_SCRIPTS_ARE_SECURE_NOW:
* This symbol, if defined, indicates that the bug that prevents
* setuid scripts from being secure is not present in this kernel.
/*#define SETUID_SCRIPTS_ARE_SECURE_NOW / **/
/*#define DOSUID / **/
+/* Shmat_t:
+ * This symbol holds the return type of the shmat() system call.
+ * Usually set to 'void *' or 'char *'.
+ */
+/* HAS_SHMAT_PROTOTYPE:
+ * This symbol, if defined, indicates that the sys/shm.h includes
+ * a prototype for shmat(). Otherwise, it is up to the program to
+ * guess one. Shmat_t shmat(int, Shmat_t, int) is a good guess,
+ * but not always right so it should be emitted by the program only
+ * when HAS_SHMAT_PROTOTYPE is not defined to avoid conflicting defs.
+ */
+#define Shmat_t void * /**/
+#define HAS_SHMAT_PROTOTYPE /**/
+
+/* I_NDBM:
+ * This symbol, if defined, indicates that <ndbm.h> exists and should
+ * be included.
+ */
+#define I_NDBM /**/
+
/* I_STDARG:
* This symbol, if defined, indicates that <stdarg.h> exists and should
* be included.
#define I_STDARG /**/
/*#define I_VARARGS / **/
+/* CAN_PROTOTYPE:
+ * If defined, this macro indicates that the C compiler can handle
+ * function prototypes.
+ */
+/* PROTO_:
+ * This macro is used to declare function parameters for folks who want
+ * to make declarations with prototypes using a different style than
+ * the above macros. Use double parentheses. For example:
+ *
+ * int main PROTO_((int argc, char *argv[]));
+ */
+#define CAN_PROTOTYPE /**/
+#ifdef CAN_PROTOTYPE
+#define PROTO_(args) args
+#else
+#define PROTO_(args) ()
+#endif
+
/* SH_PATH:
* This symbol contains the full pathname to the shell used on this
* on this system to execute Bourne shell scripts. Usually, this will be
* This symbol, if defined, indicates that the system provides
* a prototype for the sockatmark() function. Otherwise, it is up
* to the program to supply one. A good guess is
- * extern int sockatmark _((int));
+ * extern int sockatmark(int);
*/
/*#define HAS_SOCKATMARK_PROTO / **/
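The PROTO_() comment earlier in this patch describes the double-parentheses idiom only briefly. The following is a minimal C sketch of how such a macro can be used, not code from the Perl sources: it assumes CAN_PROTOTYPE is defined by hand rather than by Configure, and add_ints is a hypothetical function introduced only for illustration.

```c
/* Assumption: defined by hand here; in the patched headers the value
 * is decided by Configure. */
#define CAN_PROTOTYPE

#ifdef CAN_PROTOTYPE
#define PROTO_(args) args   /* ANSI compiler: keep the parameter list */
#else
#define PROTO_(args) ()     /* pre-ANSI compiler: drop the parameter list */
#endif

/* Hypothetical function, not part of Perl. The inner parentheses make the
 * whole parameter list a single macro argument, so the commas inside it
 * are not treated as macro argument separators. */
int add_ints PROTO_((int a, int b));

int add_ints(int a, int b)
{
    return a + b;
}
```

With CAN_PROTOTYPE defined, the declaration expands to an ordinary ANSI prototype. Without it, the declaration would expand to an empty parameter list, and the definition would then have to be written in the old K&R style as well.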
Perl Kit, Version 5.0
- Copyright 1989-2001, Larry Wall
+ Copyright 1989-2002, Larry Wall
All rights reserved.
This program is free software; you can redistribute it and/or modify
+/* XSUB.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+
#ifndef _INC_PERL_XSUB_H
#define _INC_PERL_XSUB_H 1
/* av.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* av.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
+/* cc_runtime.h
+ *
+ * Copyright (c) 1998-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+
#define DOOP(ppname) PUTBACK; PL_op = ppname(aTHX); SPAGAIN
#define CCPP(s) OP * s(pTHX)
*/
#$d_setsid HAS_SETSID /**/
-/* Shmat_t:
- * This symbol holds the return type of the shmat() system call.
- * Usually set to 'void *' or 'char *'.
- */
-/* HAS_SHMAT_PROTOTYPE:
- * This symbol, if defined, indicates that the sys/shm.h includes
- * a prototype for shmat(). Otherwise, it is up to the program to
- * guess one. Shmat_t shmat _((int, Shmat_t, int)) is a good guess,
- * but not always right so it should be emitted by the program only
- * when HAS_SHMAT_PROTOTYPE is not defined to avoid conflicting defs.
- */
-#define Shmat_t $shmattype /**/
-#$d_shmatprototype HAS_SHMAT_PROTOTYPE /**/
-
/* HAS_STRCHR:
* This symbol is defined to indicate that the strchr()/strrchr()
* functions are available for string searching. If not, try the
*/
#$i_vfork I_VFORK /**/
-/* CAN_PROTOTYPE:
- * If defined, this macro indicates that the C compiler can handle
- * function prototypes.
- */
-/* _:
- * This macro is used to declare function parameters for folks who want
- * to make declarations with prototypes using a different style than
- * the above macros. Use double parentheses. For example:
- *
- * int main _((int argc, char *argv[]));
- */
-#$prototype CAN_PROTOTYPE /**/
-#ifdef CAN_PROTOTYPE
-#define _(args) args
-#else
-#define _(args) ()
-#endif
-
/* INTSIZE:
* This symbol contains the value of sizeof(int) so that the C
* preprocessor can make decisions based on it.
* This symbol, if defined, indicates that the system provides
* a prototype for the drand48() function. Otherwise, it is up
* to the program to supply one. A good guess is
- * extern double drand48 _((void));
+ * extern double drand48(void);
*/
#$d_drand48proto HAS_DRAND48_PROTO /**/
* This symbol, if defined, indicates that the system provides
* a prototype for the sbrk() function. Otherwise, it is up
* to the program to supply one. Good guesses are
- * extern void* sbrk _((int));
- * extern void* sbrk _((size_t));
+ * extern void* sbrk(int);
+ * extern void* sbrk(size_t);
*/
#$d_sbrkproto HAS_SBRK_PROTO /**/
* This symbol, if defined, indicates that the system provides
* a prototype for the telldir() function. Otherwise, it is up
* to the program to supply one. A good guess is
- * extern long telldir _((DIR*));
+ * extern long telldir(DIR*);
*/
#$d_telldirproto HAS_TELLDIR_PROTO /**/
#$d_suidsafe SETUID_SCRIPTS_ARE_SECURE_NOW /**/
#$d_dosuid DOSUID /**/
+/* Shmat_t:
+ * This symbol holds the return type of the shmat() system call.
+ * Usually set to 'void *' or 'char *'.
+ */
+/* HAS_SHMAT_PROTOTYPE:
+ * This symbol, if defined, indicates that the sys/shm.h includes
+ * a prototype for shmat(). Otherwise, it is up to the program to
+ * guess one. Shmat_t shmat(int, Shmat_t, int) is a good guess,
+ * but not always right so it should be emitted by the program only
+ * when HAS_SHMAT_PROTOTYPE is not defined to avoid conflicting defs.
+ */
+#define Shmat_t $shmattype /**/
+#$d_shmatprototype HAS_SHMAT_PROTOTYPE /**/
+
/* I_NDBM:
* This symbol, if defined, indicates that <ndbm.h> exists and should
* be included.
#$i_stdarg I_STDARG /**/
#$i_varargs I_VARARGS /**/
+/* CAN_PROTOTYPE:
+ * If defined, this macro indicates that the C compiler can handle
+ * function prototypes.
+ */
+/* PROTO_:
+ * This macro is used to declare function parameters for folks who want
+ * to make declarations with prototypes using a different style than
+ * the above macros. Use double parentheses. For example:
+ *
+ * int main PROTO_((int argc, char *argv[]));
+ */
+#$prototype CAN_PROTOTYPE /**/
+#ifdef CAN_PROTOTYPE
+#define PROTO_(args) args
+#else
+#define PROTO_(args) ()
+#endif
+
/* SH_PATH:
* This symbol contains the full pathname to the shell used
* on this system to execute Bourne shell scripts. Usually, this will be
* This symbol, if defined, indicates that the system provides
* a prototype for the sockatmark() function. Otherwise, it is up
* to the program to supply one. A good guess is
- * extern int sockatmark _((int));
+ * extern int sockatmark(int);
*/
#$d_sockatmarkproto HAS_SOCKATMARK_PROTO /**/
/* cop.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* cv.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* deb.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* doio.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* doop.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
+/* dosish.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+
#define ABORT() abort();
#ifndef SH_PATH
/* dump.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
s |void |save_lines |AV *array|SV *sv
s |OP* |doeval |int gimme|OP** startop
s |PerlIO *|doopen_pmc |const char *name|const char *mode
+s |bool |path_is_absolute|char *name
#endif
#if defined(PERL_IN_PP_HOT_C) || defined(PERL_DECL_PROT)
#endif
#if defined(PERL_IN_UNIVERSAL_C) || defined(PERL_DECL_PROT)
-s |SV*|isa_lookup |HV *stash|const char *name|int len|int level
+s |SV*|isa_lookup |HV *stash|const char *name|HV *name_stash|int len|int level
#endif
#if defined(PERL_IN_LOCALE_C) || defined(PERL_DECL_PROT)
-/* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
- This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
- perlvars.h and thrdvar.h. Any changes made here will be lost!
-*/
+/*
+ * embed.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ * !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
+ * This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
+ * perlvars.h and thrdvar.h. Any changes made here will be lost!
+ */
/* (Doing namespace management portably in C is really gross.) */
#define save_lines S_save_lines
#define doeval S_doeval
#define doopen_pmc S_doopen_pmc
+#define path_is_absolute S_path_is_absolute
#endif
#if defined(PERL_IN_PP_HOT_C) || defined(PERL_DECL_PROT)
#define do_maybe_phash S_do_maybe_phash
#define save_lines(a,b) S_save_lines(aTHX_ a,b)
#define doeval(a,b) S_doeval(aTHX_ a,b)
#define doopen_pmc(a,b) S_doopen_pmc(aTHX_ a,b)
+#define path_is_absolute(a) S_path_is_absolute(aTHX_ a)
#endif
#if defined(PERL_IN_PP_HOT_C) || defined(PERL_DECL_PROT)
#define do_maybe_phash(a,b,c,d,e) S_do_maybe_phash(aTHX_ a,b,c,d,e)
# endif
#endif
#if defined(PERL_IN_UNIVERSAL_C) || defined(PERL_DECL_PROT)
-#define isa_lookup(a,b,c,d) S_isa_lookup(aTHX_ a,b,c,d)
+#define isa_lookup(a,b,c,d,e) S_isa_lookup(aTHX_ a,b,c,d,e)
#endif
#if defined(PERL_IN_LOCALE_C) || defined(PERL_DECL_PROT)
#define stdize_locale(a) S_stdize_locale(aTHX_ a)
walk_table(\&write_protos, 'proto.h', <<'EOT');
/*
+ * proto.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
* This file is autogenerated from data in embed.pl. Edit that file
* and run 'make regen_headers' to effect changes.
walk_table(\&write_global_sym, 'global.sym', <<'EOT');
#
+# global.sym
+#
+# Copyright (c) 1997-2002, Larry Wall
+#
+# You may distribute under the terms of either the GNU General Public
+# License or the Artistic License, as specified in the README file.
+#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
# This file is autogenerated from data in embed.pl. Edit that file
# and run 'make regen_headers' to effect changes.
open(EM, '> embed.h') or die "Can't create embed.h: $!\n";
print EM <<'END';
-/* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
- This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
- perlvars.h and thrdvar.h. Any changes made here will be lost!
-*/
+/*
+ * embed.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ * !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
+ * This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
+ * perlvars.h and thrdvar.h. Any changes made here will be lost!
+ */
/* (Doing namespace management portably in C is really gross.) */
or die "Can't create embedvar.h: $!\n";
print EM <<'END';
-/* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
- This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
- perlvars.h and thrdvar.h. Any changes made here will be lost!
-*/
+/*
+ * embedvar.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ *
+ * !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
+ * This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
+ * perlvars.h and thrdvar.h. Any changes made here will be lost!
+ */
/* (Doing namespace management portably in C is really gross.) */
open(CAPIH, '> perlapi.h') or die "Can't create perlapi.h: $!\n";
print CAPIH <<'EOT';
-/* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
- This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
- perlvars.h and thrdvar.h. Any changes made here will be lost!
-*/
+/*
+ * perlapi.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ *
+ * !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
+ * This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
+ * perlvars.h and thrdvar.h. Any changes made here will be lost!
+ */
/* declare accessor functions for Perl variables */
#ifndef __perlapi_h__
close CAPIH;
print CAPI <<'EOT';
-/* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
- This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
- perlvars.h and thrdvar.h. Any changes made here will be lost!
-*/
+/*
+ * perlapi.c
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ *
+ * !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
+ * This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
+ * perlvars.h and thrdvar.h. Any changes made here will be lost!
+ */
#include "EXTERN.h"
#include "perl.h"
-/* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
- This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
- perlvars.h and thrdvar.h. Any changes made here will be lost!
-*/
+/*
+ * embedvar.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ *
+ * !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
+ * This file is built by embed.pl from data in embed.pl, pp.sym, intrpvar.h,
+ * perlvars.h and thrdvar.h. Any changes made here will be lost!
+ */
/* (Doing namespace management portably in C is really gross.) */
#if defined(NEED_newCONSTSUB)
static
#else
-extern void newCONSTSUB _((HV * stash, char * name, SV *sv));
+extern void newCONSTSUB(HV * stash, char * name, SV *sv);
#endif
#if defined(NEED_newCONSTSUB) || defined(NEED_newCONSTSUB_GLOBAL)
rmdir $dir;
if (scalar(@a) != 0 || GLOB_ERROR == 0) {
if ($^O eq 'vos') {
- print "not ok 6 -- hit VOS bug posix-956\n";
+ print "not ok 6 # TODO hit VOS bug posix-956\n";
} else {
print "not ok 6\n";
}
Exporter
File::Copy
File::Spec
-unicore/CombiningClass.pl or unicode/CombiningClass.pl
-unicore/Decomposition.pl or unicode/Decomposition.pl
-unicore/CompExcl.txt or unicode/CompExcl.txt
+unicore/CombiningClass.pl or unicode/CombiningClass.pl
+unicore/Decomposition.pl or unicode/Decomposition.pl
+unicore/CompositionExclusions.txt or unicode/CompExcl.txt
and for the Non-XS version, in addition to the above,
Lingua::KO::Hangul::Util 0.06
my($f, $fh);
foreach my $d (@INC) {
use File::Spec;
- $f = File::Spec->catfile($d, "unicore", "CompExcl.txt");
+ $f = File::Spec->catfile($d, "unicore", "CompositionExclusions.txt");
last if open($fh, $f);
$f = File::Spec->catfile($d, "unicode", "CompExcl.txt");
last if open($fh, $f);
+/* fakestdio.h
+ *
+ * Copyright (c) 2000-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+
/*
* This is "source level" stdio compatibility mode.
* We try and #define stdio functions in terms of PerlIO.
+/* fakethr.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+
typedef int perl_mutex;
typedef int perl_key;
/* form.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
#
+# global.sym
+#
+# Copyright (c) 1997-2002, Larry Wall
+#
+# You may distribute under the terms of either the GNU General Public
+# License or the Artistic License, as specified in the README file.
+#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
# This file is autogenerated from data in embed.pl. Edit that file
# and run 'make regen_headers' to effect changes.
+/* globals.c
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+
#include "INTERN.h"
#define PERL_IN_GLOBALS_C
#include "perl.h"
/* gv.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* gv.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* handy.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
# Make command.
make="/system/gnu_library/bin/gmake"
-_make="/system/gnu_library/bin/gmake"
+# indented to not put it into config.sh
+ _make="/system/gnu_library/bin/gmake"
# Architecture name
archname="hppa1.1"
# VOS has a link() function but it is a dummy.
d_link="undef"
+
+# VOS does not have truncate() but we supply one in vos.c
+d_truncate="define"
+archobjs="vos.o"
+
+# Help gmake find vos.c
+test -h vos.c || ln -s vos/vos.c vos.c
/* hv.c
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* hv.h
*
- * Copyright (c) 1991-2001, Larry Wall
+ * Copyright (c) 1991-2002, Larry Wall
*
* You may distribute under the terms of either the GNU General Public
* License or the Artistic License, as specified in the README file.
/* The number of placeholders in the enumerated-keys hash */
#define XHvPLACEHOLDERS(xhv) ((xhv)->xhv_placeholders)
-/* the number of keys that exist() (i.e. excluding placeholers) */
-#define XHvUSEDKEYS(xhv) (XHvTOTALKEYS(xhv) - XHvPLACEHOLDERS(xhv))
+/* the number of keys that exist() (i.e. excluding placeholders) */
+#define XHvUSEDKEYS(xhv) (XHvTOTALKEYS(xhv) - (IV)XHvPLACEHOLDERS(xhv))
/*
* HvKEYS gets the number of keys that actually exist(), and is provided
+/*
+ * keywords.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
#define KEY_NULL 0
#define KEY___FILE__ 1
#define KEY___LINE__ 2
open(KW, ">keywords.h") || die "Can't create keywords.h: $!\n";
select KW;
+print <<EOM;
+/*
+ * keywords.h
+ *
+ * Copyright (c) 1997-2002, Larry Wall
+ *
+ * You may distribute under the terms of either the GNU General Public
+ * License or the Artistic License, as specified in the README file.
+ *
+ */
+EOM
+
# Read & print data.
$keynum = 0;
if ($Is_Dosish || $^O eq 'MacOS') ;
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,
$blksize,$blocks) = stat($Dfile);
- ok(($mode & 0777) == 0640 , "File permissions");
+ ok(($mode & 0777) == ($^O eq 'vos' ? 0750 : 0640) , "File permissions");
}
while (($key,$value) = each(%h)) {
for $key (keys %Prepend_dot_dot) {
next unless defined $self->{PARENT}{$key};
$self->{$key} = $self->{PARENT}{$key};
- # PERL and FULLPERL may be command verbs instead of full
- # file specifications under VMS. If so, don't turn them
- # into a filespec.
- $self->{$key} = $self->catdir("..",$self->{$key})
- unless $self->file_name_is_absolute($self->{$key})
- || ($^O eq 'VMS' and ($key =~ /PERL$/ && $self->{$key} =~ /^[\w\-\$]+$/));
+ unless ($^O eq 'VMS' && $key =~ /PERL$/) {
+ $self->{$key} = $self->catdir("..",$self->{$key})
+ unless $self->file_name_is_absolute($self->{$key});
+ } else {
+ # PERL or FULLPERL will be a command verb or even a command with
+ # an argument instead of a full file specification under VMS. So,
+ # don't turn the command into a filespec, but do add a level to the
+ # path of the argument if not already absolute.
+
+ my @cmd = split /\s+/, $self->{$key};
+ $cmd[1] = $self->catfile('[-]',$cmd[1])
+ unless (scalar(@cmd) < 2 || $self->file_name_is_absolute($cmd[1]));
+ $self->{$key} = join(' ', @cmd);
+ }
}
if ($self->{PARENT}) {
$self->{PARENT}->{CHILDREN}->{$newclass} = $self;
}
BEGIN {
- use Test::More tests => 21;
+ use Test::More tests => 24;
use File::Spec;
}
ok( abs($now - $stamp) <= 1, 'checking modify time stamp' ) ||
print "# mtime == $stamp, should be $now\n";
+SKIP: {
+ if ($^O eq 'amigaos' || $^O eq 'os2' || $^O eq 'MSWin32' ||
+ $^O eq 'NetWare' || $^O eq 'dos' || $^O eq 'cygwin') {
+ skip( "different file permission semantics on $^O\n", 3);
+ }
+
+ # change a file to execute-only
+ @ARGV = ( 0100, 'ecmdfile' );
+ ExtUtils::Command::chmod();
+
+ is( ((stat('ecmdfile'))[2] & 07777) & 0700,
+ 0100, 'change a file to execute-only' );
+
# change a file to read-only
+ @ARGV = ( 0400, 'ecmdfile' );
+ ExtUtils::Command::chmod();
+
+ is( ((stat('ecmdfile'))[2] & 07777) & 0700,
+ ($^O eq 'vos' ? 0500 : 0400), 'change a file to read-only' );
+
+ # change a file to write-only
+ @ARGV = ( 0200, 'ecmdfile' );
+ ExtUtils::Command::chmod();
+
+ is( ((stat('ecmdfile'))[2] & 07777) & 0700,
+ ($^O eq 'vos' ? 0700 : 0200), 'change a file to write-only' );
+ }
+
+ # change a file to read-write
@ARGV = ( 0600, 'ecmdfile' );
ExtUtils::Command::chmod();
- is( ((stat('ecmdfile'))[2] & 07777) & 0700, 0600, 'change a file to read-only' );
+ is( ((stat('ecmdfile'))[2] & 07777) & 0700,
+ ($^O eq 'vos' ? 0700 : 0600), 'change a file to read-write' );
# mkpath
@ARGV = ( File::Spec->join( 'ecmddir', 'temp2' ) );
last;
}
}
- openunicode(\$UNICODEFH, "Unicode.txt");
+ openunicode(\$UNICODEFH, "UnicodeData.txt");
if (defined $UNICODEFH) {
use Search::Dict 1.02;
if (look($UNICODEFH, "$hexk;", { xfrm => sub { $_[0] =~ /^([^;]+);(.+)/; sprintf "%06X;$2", hex($1) } } ) >= 0) {
sub _compexcl {
unless (%COMPEXCL) {
- if (openunicode(\$COMPEXCLFH, "CompExcl.txt")) {
+ if (openunicode(\$COMPEXCLFH, "CompositionExclusions.txt")) {
while (<$COMPEXCLFH>) {
if (/^([0-9A-F]+) \# /) {
my $code = hex($1);
sub _casefold {
unless (%CASEFOLD) {
- if (openunicode(\$CASEFOLDFH, "CaseFold.txt")) {
+ if (openunicode(\$CASEFOLDFH, "CaseFolding.txt")) {
while (<$CASEFOLDFH>) {
if (/^([0-9A-F]+); ([CFSI]); ([0-9A-F]+(?: [0-9A-F]+)*);/) {
my $code = hex($1);
sub _casespec {
unless (%CASESPEC) {
- if (openunicode(\$CASESPECFH, "SpecCase.txt")) {
+ if (openunicode(\$CASESPECFH, "SpecialCasing.txt")) {
while (<$CASESPECFH>) {
if (/^([0-9A-F]+); ([0-9A-F]+(?: [0-9A-F]+)*)?; ([0-9A-F]+(?: [0-9A-F]+)*)?; ([0-9A-F]+(?: [0-9A-F]+)*)?; (\w+(?: \w+)*)?/) {
my ($hexcode, $lower, $title, $upper, $condition) =
upper => $oldupper,
condition => $oldcondition };
} else {
- warn __PACKAGE__, ": SpecCase.txt:", $., ": No oldlocale for 0x$hexcode\n"
+ warn __PACKAGE__, ": SpecialCasing.txt:", $., ": No oldlocale for 0x$hexcode\n"
}
}
my ($locale) =
$_;
}
+sub uniescape {
+ join("",
+ map { $_ > 255 ? sprintf("\\x{%04X}", $_) : chr($_) }
+ unpack("U*", $_[0]));
+}
+
sub stringify {
local($_,$noticks) = @_;
local($v) ;
} elsif ($unctrl eq 'unctrl') {
s/([\"\\])/\\$1/g ;
s/([\000-\037\177])/'^'.pack('c',ord($1)^64)/eg;
+ # uniescape?
s/([\200-\377])/'\\0x'.sprintf('%2X',ord($1))/eg
if $quoteHighBit;
} elsif ($unctrl eq 'quote') {
s/\033/\\e/g;
s/([\000-\037\177])/'\\c'.chr(ord($1)^64)/eg;
}
+ $_ = uniescape($_);
s/([\200-\377])/'\\'.sprintf('%3o',ord($1))/eg if $quoteHighBit;
($noticks || /^\d+(\.\d*)?\Z/)
? $_
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
##
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
##
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
-
- "http://www.w3.org/TR/REC-html40/loose.dtd">
-
-<html>
-
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
-<meta http-equiv="Content-Language" content="en-us">
-<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
-<meta name="ProgId" content="FrontPage.Editor.Document">
-<meta name="keywords"
-content="unicode, normalization, composition, decomposition">
-<meta name="description" content="Specifies the Unicode Normalization Formats">
-<title>UCD: Unicode NamesList File Format</title>
-<link rel="stylesheet" type="text/css" href="http://www.unicode.org/unicode.css">
-<style type="text/css">
-
-<!--
-
-.foo { }
--->
-
-</style>
-</head>
-
-<body bgcolor="#ffffff">
-
-<table width="100%" cellpadding="0" cellspacing="0" border="0">
- <tr>
- <td>
- <table width="100%" border="0" cellpadding="0" cellspacing="0">
- <tr>
- <td class="icon"><a href="http://www.unicode.org"><img border="0"
- src="http://www.unicode.org/webscripts/logo60s2.gif" align="middle"
- alt="[Unicode]" width="34" height="33"></a> <a
- class="bar" href="UnicodeCharacterDatabase-3.1.0.html">Unicode Character
- Database</a></td>
- </tr>
- </table>
- </td>
- </tr>
- <tr>
- <td class="gray"> </td>
- </tr>
-</table>
- <h1>Unicode NamesList File Format</h1>
-<table height="87" cellSpacing="2" cellPadding="0" width="100%" border="1">
- <tbody>
- <tr>
- <td vAlign="top" width="144">Revision</td>
- <td vAlign="top">3.1</td>
- </tr>
- <tr>
- <td vAlign="top" width="144">Authors</td>
- <td vAlign="top">Asmus Freytag</td>
- </tr>
- <tr>
- <td vAlign="top" width="144">Date</td>
- <td vAlign="top">2001-02-26</td>
- </tr>
- <tr>
- <td vAlign="top" width="144">This Version</td>
- <td vAlign="top"><a href="http://www.unicode.org/Public/3.1-Update/NamesList-2.html">http://www.unicode.org/Public/3.1-Update/NamesList-2.html</a></td>
- </tr>
- <tr>
- <td vAlign="top" width="144">Previous Version</td>
- <td vAlign="top"><a href="http://www.unicode.org/Public/3.0-Update/NamesList-1.html">http://www.unicode.org/Public/3.0-Update/NamesList-1.html</a></td>
- </tr>
- <tr>
- <td vAlign="top" width="144">Latest Version</td>
- <td vAlign="top"><a href="http://www.unicode.org/Public/UNIDATA/NamesList.html">http://www.unicode.org/Public/UNIDATA/NamesList.html</a></td>
- </tr>
- </tbody>
-</table>
-<h3>
-<br>
-<i>Summary</i></h3>
-<blockquote>
- <p>This file describes the format and contents of NamesList.txt</p>
-</blockquote>
-<h3><i>Status</i></h3>
-<blockquote>
-<p>
-<i>The file and the files described herein are part of the <a href="UnicodeCharacterDatabase-3.1.0.html"> Unicode Character Database</a>
-(UCD)
-and are governed by the <a href="#Terms of Use">UCD Terms of Use</a> stated at the end.</i></p>
-</blockquote>
- <hr width="50%">
-
-<h2>1.0 Introduction</h2>
-
-<p>The Unicode name list file NamesList.txt (also NamesList.lst) is a plain text file used
-to drive the layout of the character code charts in the Unicode Standard. The information
-in this file is a combination of several fields from the UnicodeData.txt and Blocks.txt files,
-together with additional annotations for many characters. This document describes the
-syntax rules for the file format, but also gives brief information on how each construct
-is rendered when laid out for the book. Some of the syntax elements were used in
-preparation of the drafts of the book and may not be present in the final, released form
-of the NamesList.txt file.</p>
-
-<p>The same input file can be used to do the draft preparation for ISO/IEC 10646 (referred to
-below as ISO-style). This necessitates the presence of some information in the name list
-file that is not needed (and in fact removed during parsing) for the Unicode book.</p>
-
-<p>With access to the layout program (unibook.exe) it is a simple matter of creating
-name lists for the purpose of formatting working drafts containing proposed characters.</p>
-
-<h3>1.1 NamesList File Overview</h3>
-
-<p>The *.lst files are plain text files which in their most simple form look like this</p>
-
-<p>@@&lt;tab&gt;0020&lt;tab&gt;BASIC LATIN&lt;tab&gt;007F<br>
-; this is a file comment (ignored)<br>
-0020&lt;tab&gt;SPACE<br>
-0021&lt;tab&gt;EXCLAMATION MARK<br>
-0022&lt;tab&gt;QUOTATION MARK<br>
-. . . <br>
-007F&lt;tab&gt;DELETE</p>
-
-<p>The semicolon (as first character), @ and &lt;tab&gt; characters are used by the file
-syntax and must be provided as shown. Hexadecimal digits must be in UPPER CASE. A double
-@@ introduces a block header, with the title, and start and ending code of the block
-provided as shown.</p>
-
-<p>For an ISO-style, minimal name list, only the NAME_LINE and BLOCKHEADER and their
-constituent syntax elements are needed.</p>
-
-<p>The full syntax with all the options is provided in the following sections.</p>
-
-<h3>1.2 NamesList File Structure</h3>
-
-<p>This section defines the overall file structure.</p>
-
-<pre><strong>NAMELIST: TITLE_PAGE* BLOCK*
-</strong>
-<strong>TITLE_PAGE: TITLE
- | TITLE_PAGE SUBTITLE
- | TITLE_PAGE SUBHEADER
- | TITLE_PAGE IGNORED_LINE
- | TITLE_PAGE EMPTY_LINE
- | TITLE_PAGE COMMENT_LINE
- | TITLE_PAGE NOTICE
- | TITLE_PAGE PAGEBREAK
-</strong>
-<strong>BLOCK: BLOCKHEADER
- | BLOCK CHAR_ENTRY
- | BLOCK SUBHEADER
- | BLOCK NOTICE
- | BLOCK EMPTY_LINE
- | BLOCK IGNORED_LINE
- | BLOCK PAGEBREAK
-
-CHAR_ENTRY: NAME_LINE | RESERVED_LINE
- | CHAR_ENTRY ALIAS_LINE
- | CHAR_ENTRY COMMENT_LINE
- | CHAR_ENTRY CROSS_REF
- | CHAR_ENTRY DECOMPOSITION
- | CHAR_ENTRY COMPAT_MAPPING
- | CHAR_ENTRY IGNORED_LINE
- | CHAR_ENTRY EMPTY_LINE
- | CHAR_ENTRY NOTICE
-</strong></pre>
-
-<p>In other words:<br>
-<br>
-Neither TITLE nor SUBTITLE may occur after the first BLOCKHEADER. </p>
-
-<p>Only TITLE, SUBTITLE, SUBHEADER, PAGEBREAK, COMMENT_LINE, and IGNORED_LINE may
-occur before the first BLOCKHEADER.</p>
-
-<p>Directly following either a NAME_LINE or a RESERVED_LINE an uninterrupted sequence of
-the following lines may occur (in any order and repeated as often as needed): ALIAS_LINE,
-CROSS_REF, DECOMPOSITION, COMPAT_MAPPING, NOTICE, EMPTY_LINE and IGNORED_LINE.</p>
-
-<p>Except for EMPTY_LINE, NOTICE and IGNORED_LINE, none of these lines may occur in any other
-place. </p>
-
-<p>Note: A NOTICE displays differently depending on whether it follows a header or title
-or is part of a CHAR_ENTRY.</p>
-
-<h3>1.3 NamesList File Elements</h3>
-
-<p>This section provides the details of the syntax for the individual elements.</p>
-
-<pre><small><strong>ELEMENT SYNTAX</strong> // How rendered</small></pre>
-
-<pre><small><strong>NAME_LINE: CHAR <tab> LINE
-</strong> // the CHAR and the corresponding image are echoed,
- // followed by the name as given in LINE
-
-<strong> CHAR TAB NAME COMMENT LF
-</strong> // Names may have a comment, which is stripped off
- // unless the file is parsed for an ISO style list
-
-<strong>RESERVED_LINE: CHAR TAB &lt;reserved&gt;
-</strong> // the CHAR is echoed followed by an icon for the
- // reserved character and a fixed string e.g. &lt;reserved&gt;
-
-<strong>COMMENT_LINE: &lt;tab&gt; "*" SP EXPAND_LINE
-</strong> // * is replaced by BULLET, output line as comment
- <strong>&lt;tab&gt; EXPAND_LINE</strong>
- // output line as comment
-
-<strong>ALIAS_LINE: &lt;tab&gt; "=" SP LINE
-</strong> // replace = by itself, output line as alias
-
-<strong>CROSS_REF: &lt;tab&gt; "X" SP EXPAND_LINE
-</strong> // X is replaced by a right arrow
-<strong> &lt;tab&gt; "X" SP "(" STRING SP "-" SP CHAR ")"
-</strong> // X is replaced by a right arrow
- // the "(", "-", ")" are removed, the
- // order of CHAR and STRING is reversed
- // i.e. both inputs result in the same output
-
-<strong>IGNORED_LINE: &lt;tab&gt; ";" EXPAND_LINE
-EMPTY_LINE: LF
-</strong> // empty lines and file comments are ignored
-
-<strong>DECOMPOSITION: &lt;tab&gt; ":" EXPAND_LINE
-</strong> // replace ':' by EQUIV, expand line into
- // decomposition
-
-<strong>COMPAT_MAPPING: &lt;tab&gt; "#" SP EXPAND_LINE
-</strong> // replace '#' by APPROX, output line as mapping
-
-<strong>NOTICE: "@+" &lt;tab&gt; LINE
-</strong> // skip '@+', output text as notice
-<strong> "@+" TAB * SP LINE
-</strong> // skip '@+', output text as notice
- // "*" expands to a bullet character
- // Notices following a character code apply to the
- // character and are indented. Notices not following
- // a character code apply to the page/block/column
- // and are italicized, but not indented
-
-<strong>SUBTITLE: "@@@+" &lt;tab&gt; LINE
-</strong> // skip "@@@+", output text as subtitle
-
-<strong>SUBHEADER: "@" &lt;tab&gt; LINE
-</strong> // skip '@', output line as column header
-
-<strong>BLOCKHEADER: "@@" &lt;tab&gt; BLOCKSTART &lt;tab&gt; BLOCKNAME &lt;tab&gt; BLOCKEND
-</strong> // skip "@@", cause a page break and optional
- // blank page, then output one or more charts
- // followed by the list of character names.
- // use BLOCKSTART and BLOCKEND to define the
- // characters belonging to a block
- // use blockname in page and table headers
- <strong> "@@" &lt;tab&gt; BLOCKSTART &lt;tab&gt; BLOCKNAME COMMENT &lt;tab&gt; BLOCKEND
- </strong>// if a comment is present it replaces the blockname
- // when an ISO-style namelist is laid out
-
-<strong>BLOCKSTART: CHAR</strong> // first character position in block
-<strong>BLOCKEND: CHAR</strong> // last character position in block
-<strong>PAGEBREAK: "@@"</strong> // insert a (column) break
-
-<strong>TITLE: "@@@" &lt;tab&gt; LINE</strong>
- // skip "@@@", output line as text
- // Title is used in page headers
-
-<strong>EXPAND_LINE: {CHAR | STRING}+ LF </strong>
- // all instances of CHAR *) are replaced by
- // CHAR NBSP x NBSP where x is the single Unicode
- // character corresponding to char
- // If character is combining, it is replaced with
- // CHAR NBSP &lt;circ&gt; x NBSP where &lt;circ&gt; is the
- // dotted circle</small></pre>
-
-<p><strong>Notes:</strong>
-
-</p>
-
-<ul>
- <li>Blocks must be aligned on a 16-code-point boundary and contain an integer
- multiple of 16 code points. The exception to that rule is for blocks of
- ideographs etc. for which no names are listed in the file. Such blocks must
- end on the actual last character.</li>
- <li>Blocks must be non-overlapping and in ascending order. Namelines
- must be in ascending order and following the block header for the block to
- which they belong.</li>
- <li>Reserved entries are optional, and will be supplied automatically. They
- are required whenever followed by ALIAS_LINE, COMMENT_LINE or CROSS_REF</li>
-</ul>
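The block constraints in the notes above could be checked with a small validator along these lines. It is a sketch under assumptions: the function name `block_problems` and the tuple layout are hypothetical, "an integer multiple of code points" is read as a multiple of 16, and the exception for name-less blocks (ideographs etc.) is assumed to relax only the length rule, not the start alignment.

```python
def block_problems(blocks):
    """Check a list of (start, end, names_listed) tuples given in file order.

    Returns a list of human-readable problem descriptions (empty if none).
    """
    problems = []
    prev_end = -1
    for start, end, names_listed in blocks:
        # Blocks must be non-overlapping and in ascending order.
        if start <= prev_end:
            problems.append(f"{start:04X}: overlaps or is out of order")
        # Blocks must start on a 16-code-point boundary.
        if start % 16 != 0:
            problems.append(f"{start:04X}: not on a 16-code-point boundary")
        # Blocks with listed names must span a multiple of 16 code points;
        # name-less blocks end on their actual last character instead.
        if names_listed and (end - start + 1) % 16 != 0:
            problems.append(f"{start:04X}: length is not a multiple of 16")
        prev_end = end
    return problems
```

For example, `block_problems([(0x0000, 0x007F, True), (0x0080, 0x00FF, True)])` reports nothing, while a block ending at 0x007E would be flagged for its length.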
-
-<h3><strong>1.4 NamesList File Primitives</strong></h3>
-
-<p>The following are the primitives and terminals for the NamesList syntax.</p>
-
-<pre><strong><small>LINE: STRING LF
-COMMENT: "(" NAME ")"
- "(" NAME ")" "*" </small></strong><small>
-<strong>BLOCKNAME:</strong> <sequence of Latin-1 characters, except "(" and ")">
-<strong>NAME</strong>: <sequence of uppercase ASCII letters, digit and hyphen>
-<strong>STRING</strong>: <sequence of Latin-1 characters>
-<strong>CHAR</strong>: <strong>X X X X</strong>
- <strong>| X X X X X</strong>
- <strong>| X X X X X X</strong></small>
-<small><strong>X: "0"|"1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9"|"A"|"B"|"C"|"D"|"E"|"F"
-<tab>:</strong> <sequence of one or more ASCII tab characters 0x09>
-<strong>SP</strong>: <ASCII 0x20>
-<strong>LF</strong>: <any sequence of ASCII 0x0A and 0x0D>
-</small></pre>
-
-<p><strong>Notes:</strong>
-
-<ul>
- <li>Special lookahead logic prevents a mention of a 4-digit standard, such as ISO 9999, from
- being misinterpreted as ISO CHAR. The - in a character range CHAR-CHAR is
- replaced by an EN DASH.</li>
- <li>Use of Latin-1 is supported in unibook.exe, but not portably, unless the file is encoded as
- UTF-16LE.</li>
- <li>The final LF in the file must be present.</li>
- <li>A CHAR inside ' or " is expanded, but only its glyph image is printed;
- the code value is not echoed.</li>
- <li>Straight quotes in an EXPAND_LINE are replaced by curly quotes using English rules.
- Apostrophes are supported, but nested quotes are not.</li>
-</ul>
-<h2>Modifications</h2>
-<p>Use of 4-6 digit hex notation is now supported.</p>
- <hr width="50%">
-<h2>
-UCD <a name="Terms of Use">Terms of Use</a></h2>
-<h3>
-<i>Disclaimer</i></h3>
-<blockquote>
- <p><i>The Unicode Character Database is provided as is by Unicode, Inc. No
- claims are made as to fitness for any particular purpose. No warranties of any
- kind are expressed or implied. The recipient agrees to determine applicability
- of information provided. If this file has been purchased on magnetic or
- optical media from Unicode, Inc., the sole remedy for any claim will be
- exchange of defective media within 90 days of receipt.</i></p>
- <p><i>This disclaimer is applicable for all other data files accompanying the
- Unicode Character Database, some of which have been compiled by the Unicode
- Consortium, and some of which have been supplied by other sources.</i></p>
-</blockquote>
-<h3><i>Limitations on Rights to Redistribute This Data</i></h3>
-<blockquote>
- <p><i>Recipient is granted the right to make copies in any form for internal
- distribution and to freely use the information supplied in the creation of
- products supporting the Unicode<sup>TM</sup> Standard. The files in the
- Unicode Character Database can be redistributed to third parties or other
- organizations (whether for profit or not) as long as this notice and the
- disclaimer notice are retained. Information can be extracted from these files
- and used in documentation or programs, as long as there is an accompanying
- notice indicating the source.</i></p>
-</blockquote>
- <hr width="50%">
- <div align="center">
- <center>
- <table cellspacing="0" cellpadding="0" border="0">
- <tr>
- <td><a href="../../../../../../index.html"><img
- src="http://www.unicode.org/img/hb_home.gif" border="0"
- alt="Home" width="40" height="49"></a><a
- href="../copyright.html"><img
- src="http://www.unicode.org/img/hb_mid.gif" border="0"
- alt="Terms of Use" width="152" height="49"></a><a
- href="mailto:info@unicode.org"><img
- src="http://www.unicode.org/img/hb_mail.gif" border="0"
- alt="E-mail" width="46" height="49"></a></td>
- </tr>
- </table>
- <script language="Javascript" src="http://www.unicode.org/webscripts/lastModified.js"></script>
- </center>
- </div>
-</form>
-
-</body>
-
-</html>
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
-<html>
-
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
-<meta http-equiv="Content-Language" content="en-us">
-<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
-<meta name="ProgId" content="FrontPage.Editor.Document">
-<meta name="keywords"
-content="unicode, normalization, composition, decomposition">
-<meta name="description" content="Describes PropList.html">
-<title>UCD: Extended Character Properties</title>
-<link rel="stylesheet" type="text/css" href="http://www.unicode.org/unicode.css">
-</head>
-
-<body bgcolor="#ffffff">
-
-<table width="100%" cellpadding="0" cellspacing="0" border="0">
- <tr>
- <td>
- <table width="100%" border="0" cellpadding="0" cellspacing="0">
- <tr>
- <td class="icon"><a href="http://www.unicode.org"><img border="0"
- src="http://www.unicode.org/webscripts/logo60s2.gif" align="middle"
- alt="[Unicode]" width="34" height="33"></a> <a
- class="bar" href="UnicodeCharacterDatabase.html">Unicode Character
- Database</a></td>
- </tr>
- </table>
- </td>
- </tr>
- <tr>
- <td class="gray"> </td>
- </tr>
-</table>
-<h1>Extended Character Properties</h1>
-<table height="87" cellspacing="2" cellpadding="0" width="100%" border="1">
- <tbody>
- <tr>
- <td valign="top" width="144">Revision</td>
- <td valign="top">3.1.1</td>
- </tr>
- <tr>
- <td valign="top" width="144">Authors</td>
- <td valign="top">Mark Davis</td>
- </tr>
- <tr>
- <td valign="top" width="144">Date</td>
- <td valign="top">2001-07-12</td>
- </tr>
- <tr>
- <td valign="top" width="144">This Version</td>
- <td valign="top"><a
- href="http://www.unicode.org/Public/3.1-Update1/PropList-3.1.1.html">http://www.unicode.org/Public/3.1-Update1/PropList-3.1.1.html</a></td>
- </tr>
- <tr>
- <td valign="top" width="144">Previous Version</td>
- <td valign="top">n/a</td>
- </tr>
- <tr>
- <td valign="top" width="144">Latest Version</td>
- <td valign="top"><a
- href="http://www.unicode.org/Public/UNIDATA/PropList.html">http://www.unicode.org/Public/UNIDATA/PropList.html</a></td>
- </tr>
- </tbody>
-</table>
-<h3><i><br>
-Summary</i></h3>
-<blockquote>
- <p><i>This document describes the format and content of the PropList.txt data
- file in the Unicode Character Database (UCD).</i></p>
-</blockquote>
-<h3><i>Status</i></h3>
-<blockquote>
- <p><i>The file and the files described herein are part of the Unicode
- Character Database and governed by the <a href="#UCD_Terms">UCD Terms of Use</a>
- given below.</i></p>
- <p><i>For general information on file formats and table formats, and the
- implications of normative vs informative properties, see
- UnicodeCharacterDatabase.html.</i></p>
- <p><i><b>Warning: </b>the information in this file does not completely
- describe the use and interpretation of Unicode character properties and
- behavior. It must be used in conjunction with the data in the other files in
- the UCD, and relies on the notation and definitions supplied in <a
- href="http://www.unicode.org/unicode/standard/versions/Unicode3.0.html">The
- Unicode Standard</a>. All chapter references are to Version 3.1.0 of the
- standard.</i></p>
-</blockquote>
-<hr width="50%">
-<h2>Introduction</h2>
-<p align="left">PropList.txt contains extended properties that supplement the
-General Category property described in UnicodeData.html. Unlike the derived
-properties, the properties in PropList.txt cannot be derived directly from
-UnicodeData.txt or other data files of the UCD. These properties are listed in
-the following table.</p>
-<div align="center">
- <center>
- <table border="1" cellspacing="0" cellpadding="3" class="smallText">
- <tr>
- <th>Property Value</th>
- <th>N/I</th>
- <th>Definition and Usage</th>
- </tr>
- <tr>
- <th valign="top">White_space</th>
- <th valign="top">N</th>
- <td valign="top">Space characters and those format control characters
- (such as TAB, CR and LF) which should be treated by programming
- languages as "white space" for the purpose of parsing
- elements.
- <p><b>Note:</b> ZERO WIDTH SPACE and ZERO WIDTH NO-BREAK SPACE are not
- included, since their functions are restricted to line-break control.
- Their names are unfortunately misleading in this respect.</p>
- <p><b>Note: </b>There are other senses of "whitespace" that
- encompass a different set of characters.</p>
- </td>
- </tr>
- <tr>
- <th valign="top">Bidi_Control</th>
- <th valign="top">N</th>
- <td valign="top">Those format control characters which have specific
- functions in the Bidirectional Algorithm.</td>
- </tr>
- <tr>
- <th valign="top">Join_Control</th>
- <th valign="top">N</th>
- <td valign="top">Those format control characters which have specific
- functions for control of cursive joining and ligation.</td>
- </tr>
- <tr>
- <th valign="top">ASCII_Hex_Digit</th>
- <th valign="top">N</th>
- <td valign="top">ASCII characters commonly used for the representation of
- hexadecimal numbers.</td>
- </tr>
- <tr>
- <th valign="top">Dash</th>
- <th valign="top">I</th>
- <td valign="top">Those punctuation characters explicitly called out as
- dashes in the Unicode Standard, plus compatibility equivalents to those.
- Most of these have the Pd General Category, but some have the Sm General
- Category because of their use in mathematics.</td>
- </tr>
- <tr>
- <th valign="top">Hyphen</th>
- <th valign="top">I</th>
- <td valign="top">Those dashes used to mark connections between pieces of
- words, plus the Katakana middle dot. The Katakana middle dot functions
- like a hyphen, but is shaped like a dot rather than a dash.</td>
- </tr>
- <tr>
- <th valign="top">Quotation_Mark</th>
- <th valign="top">I</th>
- <td valign="top">Those punctuation characters that function as quotation
- marks.</td>
- </tr>
- <tr>
- <th valign="top">Terminal_Punctuation</th>
- <th valign="top">I</th>
- <td valign="top">Those punctuation characters that generally mark the end
- of textual units.</td>
- </tr>
- <tr>
- <th valign="top">Other_Math</th>
- <th valign="top">I</th>
- <td valign="top">Math characters that do not have the Sm General Category.</td>
- </tr>
- <tr>
- <th valign="top">Hex_Digit</th>
- <th valign="top">I</th>
- <td valign="top">Characters commonly used for the representation of
- hexadecimal numbers, plus their compatibility equivalents.</td>
- </tr>
- <tr>
- <th valign="top">Other_Alphabetic</th>
- <th valign="top">I</th>
- <td valign="top">Alphabetic characters that do not have L as their major
- class for the General Category (Lu, Ll, Lt, Lm, Lo).</td>
- </tr>
- <tr>
- <th valign="top">Ideographic</th>
- <th valign="top">I</th>
- <td valign="top">Characters considered to be CJKV (Chinese, Japanese,
- Korean, and Vietnamese) ideographs.</td>
- </tr>
- <tr>
- <th valign="top">Diacritic</th>
- <th valign="top">I</th>
- <td valign="top">Characters that linguistically modify the meaning of
- another character to which they apply. Some diacritics are not combining
- characters, and some combining characters are not diacritics.</td>
- </tr>
- <tr>
- <th valign="top">Extender</th>
- <th valign="top">I</th>
- <td valign="top">Characters whose principal function is to extend the
- value or shape of a preceding alphabetic character. Typical of these are
- length and iteration marks.</td>
- </tr>
- <tr>
- <th valign="top">Other_Lowercase</th>
- <th valign="top">I</th>
- <td valign="top">Lowercase characters that do not have the Ll General
- Category.</td>
- </tr>
- <tr>
- <th valign="top">Other_Uppercase</th>
- <th valign="top">I</th>
- <td valign="top">Uppercase characters that do not have the Lu General
- Category.</td>
- </tr>
- <tr>
- <th valign="top">Noncharacter_Code_Point</th>
- <th valign="top">N</th>
- <td valign="top">Code points that are explicitly defined as illegal for
- the encoding of characters. See <a
- href="http://www.unicode.org/unicode/reports/tr27/">Unicode 3.1</a> for
- more information.</td>
- </tr>
- </table>
- </center>
-</div>
-<h2><i><a name="UCD_Terms"><br>
-UCD Terms of Use</a></i></h2>
-<h3><i>Disclaimer</i></h3>
-<blockquote>
- <p><i>The Unicode Character Database is provided as is by Unicode, Inc. No
- claims are made as to fitness for any particular purpose. No warranties of any
- kind are expressed or implied. The recipient agrees to determine applicability
- of information provided. If this file has been purchased on magnetic or
- optical media from Unicode, Inc., the sole remedy for any claim will be
- exchange of defective media within 90 days of receipt.</i></p>
- <p><i>This disclaimer is applicable for all other data files accompanying the
- Unicode Character Database, some of which have been compiled by the Unicode
- Consortium, and some of which have been supplied by other sources.</i></p>
-</blockquote>
-<h3><i>Limitations on Rights to Redistribute This Data</i></h3>
-<blockquote>
- <p><i>Recipient is granted the right to make copies in any form for internal
- distribution and to freely use the information supplied in the creation of
- products supporting the Unicode<sup>TM</sup> Standard. The files in the
- Unicode Character Database can be redistributed to third parties or other
- organizations (whether for profit or not) as long as this notice and the
- disclaimer notice are retained. Information can be extracted from these files
- and used in documentation or programs, as long as there is an accompanying
- notice indicating the source.</i></p>
-</blockquote>
-<hr width="50%">
-<p align="center"><a href="http://www.unicode.org/unicode/copyright.html"><img
-src="http://www.unicode.org/img/hb_home.gif" border="0" alt="Home" width="40"
-height="49"><img src="http://www.unicode.org/img/hb_mid.gif" border="0"
-alt="Terms of Use" width="152" height="49"><img
-src="http://www.unicode.org/img/hb_mail.gif" border="0" alt="E-mail" width="46"
-height="49"></a>
-
-</body>
-
-</html>
http://www.unicode.org/Public/3.1-Update/
-and most of them were renamed to better fit 8.3 filename limitations,
-by which the Perl distribution tries to live. The renamings are listed
-in the file 'rename'.
-
The two big files, NormalizationTest.txt (2.0MB) and Unihan.txt (15.8MB)
-were not copied due to space considerations. Also not included are the
-derived files:
+were not copied due to space considerations. Also not included are any
+*.html files and the derived files:
DerivedBidiClass.txt
DerivedBinaryProperties.txt
DerivedNumericValues.txt
DerivedProperties.html
-The *.pl files are generated from these files by the 'mktables.PL' script.
-
-While the files have been renamed the links in the html files haven't.
+The *.pl files are generated from these files by the mktables script.
--
jhi@iki.fi
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
return <<'END';
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
-
- "http://www.w3.org/TR/REC-html40/loose.dtd">
-
-<html>
-
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
-<meta http-equiv="Content-Language" content="en-us">
-<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
-<meta name="ProgId" content="FrontPage.Editor.Document">
-<link rel="stylesheet" href="http://www.unicode.org/unicode.css" type="text/css">
-<title>Unicode Character Database</title>
-</head>
-
-<body>
-
-<table width="100%" cellpadding="0" cellspacing="0" border="0">
- <tr>
- <td>
- <table width="100%" border="0" cellpadding="0" cellspacing="0">
- <tr>
- <td class="icon"><a href="http://www.unicode.org"><img border="0"
- src="http://www.unicode.org/webscripts/logo60s2.gif" align="middle"
- alt="[Unicode]" width="34" height="33"></a> <a
- class="bar" href="UnicodeCharacterDatabase.html">Unicode Character
- Database</a></td>
- </tr>
- </table>
- </td>
- </tr>
- <tr>
- <td class="gray"> </td>
- </tr>
-</table>
-<h1>UNICODE CHARACTER DATABASE</h1>
-<table border="1" cellspacing="2" cellpadding="0" height="87" width="100%">
- <tr>
- <td valign="TOP" width="144">Revision</td>
- <td valign="TOP">3.1.0</td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Authors</td>
- <td valign="TOP">Mark Davis and Ken Whistler</td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Date</td>
- <td valign="TOP">2001-02-28</td>
- </tr>
- <tr>
- <td valign="TOP" width="144">This Version</td>
- <td valign="TOP"><a
- href="http://www.unicode.org/Public/3.1-Update/UnicodeCharacterDatabase-3.1.0.html">http://www.unicode.org/Public/3.1-Update/UnicodeCharacterDatabase-3.1.0.html</a></td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Previous Version</td>
- <td valign="TOP"><a
- href="http://www.unicode.org/Public/3.0-Update1/UnicodeCharacterDatabase-3.0.1.html">http://www.unicode.org/Public/3.0-Update1/UnicodeCharacterDatabase-3.0.1.html</a></td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Latest Version</td>
- <td valign="TOP"><a
- href="http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html">http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html</a></td>
- </tr>
-</table>
-<h3><br>
-<i>Summary</i></h3>
-<blockquote>
- <p><i>This document describes the format and content of the Unicode Character
- Database (UCD)</i></p>
-</blockquote>
-<h3><i>Status</i></h3>
-<blockquote>
- <p><i>The file and the files described herein are part of the Unicode
- Character Database and are governed by the <a href="#UCD_Terms">UCD Terms of
- Use</a> given below.</i></p>
- <p><i>The <a href="#References">References</a> provide related information
- that is useful in understanding this document.</i></p>
- <p><i><b>Warning: </b>the information in this file does not completely
- describe the use and interpretation of Unicode character properties and
- behavior. It must be used in conjunction with the data in the other files in
- the Unicode Character Database, and relies on the notation and definitions
- supplied in <a
- href="http://www.unicode.org/unicode/standard/versions/Unicode3.0.html">The
- Unicode Standard</a>. All chapter references are to Version 3.1.0 of the
- standard.</i></p>
-</blockquote>
-<h2>Introduction</h2>
-<p>The Unicode Character Database (UCD) is a set of files that define the
-Unicode character properties and internal mappings. This document describes the
-files that are part of <a href="http://www.unicode.org/unicode/reports/tr27/">The
-Unicode Standard, Version 3.1</a> [<a href="#U3.1">U3.1</a>]. The main changes
-in this version are:</p>
-<ul>
- <li>All of the data files have been updated to account for the large number of
- additional characters in Unicode 3.1.</li>
- <li>PropList.txt has been extensively reorganized and reformatted.</li>
- <li>Scripts.txt has been added to the UCD.</li>
- <li>A large number of informative derived property files have been added to
- the UCD.</li>
-</ul>
-<p><i>Files in the UCD use a common format unless otherwise specified. For
-details, see <a href="#UCD_File_Format">UCD File Format</a>.</i></p>
-<h2><a name="Conformance">Conformance</a></h2>
-<p>For information on the meaning and application of the terms normative and
-informative, see "Chapter 4, Character Properties (revision)" in <a
-href="http://www.unicode.org/unicode/reports/tr27/#conformance">UAX #27, Unicode
-3.1</a>.</p>
-<p>Some informative data files contain derived properties, properties that can
-be derived from other properties in the UCD. The derived properties that are
-computed from solely normative properties are themselves normative, while the
-others are informative.</p>
-<h2>UCD Files</h2>
-<p>The following table summarizes the files in the Unicode Character Database.
- For more information about these files, see the referenced technical
-report(s), files, or section of Unicode Standard, Version 3.1.</p>
-<table border="1" cellspacing="0" cellpadding="4">
- <tr>
- <th>".txt" File</th>
- <th>Description</th>
- <th align="center">N/I</th>
- <th>Summary</th>
- </tr>
- <tr>
- <td>ArabicShaping</td>
- <td>Section 8.2</td>
- <td align="center">N</td>
- <td>Basic Arabic and Syriac character shaping properties, such as initial,
- medial and final shapes.</td>
- </tr>
- <tr>
- <td>BidiMirroring</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr9/">UAX #9</a></td>
- <td align="center">I</td>
- <td>Properties for substituting characters in an implementation of
- bidirectional mirroring.</td>
- </tr>
- <tr>
- <td>Blocks</td>
- <td>Chapter 14</td>
- <td align="center">N</td>
- <td>List of block names.</td>
- </tr>
- <tr>
- <td>CaseFolding</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr21/">UTR #21</a></td>
- <td align="center">N</td>
- <td>Mapping from characters to their case-folded forms. This is an
- informative file containing normative derived properties.
- <p><i>Derived from UnicodeData and SpecialCasing.</i></p>
- </td>
- </tr>
- <tr>
- <td>CompositionExclusions</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr15/">UAX #15</a></td>
- <td align="center">N</td>
- <td>Properties for normalization.</td>
- </tr>
- <tr>
- <td><i>DerivedXXX</i></td>
- <td>DerivedProperties.html</td>
- <td align="center">N/I</td>
- <td>Various informative derived files, described in the documentation file.
- Some of the derived properties are normative and some are informative.</td>
- </tr>
- <tr>
- <td>EastAsianWidth</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr11/">UAX #11</a></td>
- <td align="center">I</td>
- <td>Properties for determining the choice of wide vs. narrow glyphs in East
- Asian contexts.</td>
- </tr>
- <tr>
- <td>Index</td>
- <td>Chapter 14</td>
- <td align="center">I</td>
- <td>Index to Unicode characters, as printed in the Unicode Standard. (See <a
- href="#Update_Note">Update Note</a>.)</td>
- </tr>
- <tr>
- <td>Jamo</td>
- <td>Chapter 4</td>
- <td align="center">N</td>
- <td>List of Jamo short names, used in deriving HANGUL SYLLABLE names
- algorithmically.</td>
- </tr>
- <tr>
- <td>LineBreak</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr14/">UAX #14</a></td>
- <td align="center">N/I</td>
- <td>Properties for line breaking.</td>
- </tr>
- <tr>
- <td>NamesList</td>
- <td>Chapter 14</td>
- <td align="center">I</td>
- <td>This file duplicates some of the material in the UnicodeData file, and
- adds annotations used in the character charts.</td>
- </tr>
- <tr>
- <td>NormalizationTest</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr15/">UAX #15</a></td>
- <td align="center">N</td>
- <td>Test file for conformance to Unicode Normalization Forms.</td>
- </tr>
- <tr>
- <td>PropList</td>
- <td>PropList.html</td>
- <td align="center">N/I</td>
- <td>Extended character properties</td>
- </tr>
- <tr>
- <td>Scripts</td>
- <td><a href="http://www.unicode.org/unicode/reports/tr24/">UTR #24</a></td>
- <td align="center">I</td>
- <td>Default script values for use in regular expressions.</td>
- </tr>
- <tr>
- <td>SpecialCasing</td>
- <td>Chapter 4,<br>
- <a href="http://www.unicode.org/unicode/reports/tr21/">UTR #21</a></td>
- <td align="center">N</td>
- <td>List of properties required for full case mapping.</td>
- </tr>
- <tr>
- <td>UnicodeData</td>
- <td>UnicodeData.html,<br>
- Chapter 4,<br>
- <a href="http://www.unicode.org/unicode/reports/tr21/">UTR #21</a>,<br>
- <a href="http://www.unicode.org/unicode/reports/tr15/">UAX #15</a></td>
- <td align="center">N/I</td>
- <td>The main file in the UCD. </td>
- </tr>
- <tr>
- <td>Unihan</td>
- <td>Unihan.txt</td>
- <td align="center">N/I</td>
- <td>Extended properties of Han (CJK) characters. (See <a href="#Format_Note">Format
- Note</a>.)</td>
- </tr>
-</table>
-<blockquote>
- <p><b><a name="Update_Note">Update Note</a>: </b>The information in Index.txt
- files matches the appropriate version of the book. Changes in the Unicode
- Character Database since then may not be reflected in these files, since they
- are primarily of archival interest.</p>
- <p><b><a name="Format_Note">Format Note</a>: </b>The file data format differs
- from the standard format, and is described in the header of the file. The
- header also describes which properties are informative and which are
- normative.</p>
-</blockquote>
-<h2><a name="UCD_File_Format">UCD File Format</a></h2>
-<p>Files in the UCD use the following format, unless otherwise specified.</p>
-<ul>
- <li>Each line of data consists of fields separated by semicolons. The fields
- are numbered starting with zero. Code points are expressed as hexadecimal
- numbers with four to six digits. They are written without "U+".
- Within a sequence of code points, spaces are used for separation. Leading
- and trailing spaces within a field are not significant.</li>
-</ul>
-<ul>
- <li>The first field (0) of each line in the Unicode Character Database files
- represents a code point or range. The remaining fields (1..n) are properties
- associated with that code point.</li>
-</ul>
-<ul>
- <li>A range of code points is specified by the form "X..Y". Each
- code point from X to Y has the associated properties. For example:</li>
-</ul>
-<blockquote>
- <pre>0000..007F; Basic Latin
-0080..00FF; Latin-1 Supplement
-
-1680 ; White_space # Zs OGHAM SPACE MARK
-2000..200A; White_space # Zs [11] EN QUAD..HAIR SPACE</pre>
-</blockquote>
-<ul>
- <li>Hash marks ("#") are used to indicate comments: all characters
- from the hash mark to the end of the line are comments, and disregarded when
- parsing data. In many files, the comments on data lines use a common format.</li>
-</ul>
-<blockquote>
- <pre>00BC..00BE ; numeric # No [3] VULGAR FRACTION ONE QUARTER..VULGAR FRACTION THREE QUARTERS</pre>
-</blockquote>
-<ul>
- <li>The first part of the comment is the UCD general category. The symbol
- "L&" indicates characters of type Lu, Ll, or Lt. The code
- point ranges are calculated so that they all have the same General Category
- (or L&). While this results in more ranges than are strictly necessary,
- it makes the contents of the ranges clearer. The second part of the comment
- (in square brackets) indicates the number of items in a range, if there is
- one. The third part is the name of the character in field zero: if it is a
- range, then the character names for the ends of the range are separated by
- "..".</li>
-</ul>
-<p>However, the comments are purely informational, and may change format or be
-omitted in the future. They should not be parsed for content.</p>
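As an illustration of the common format described above, here is a minimal Python sketch of a line parser. The function name `parse_ucd_line` is hypothetical, and the sketch assumes field 0 holds a single code point or an "X..Y" range; files with their own format, such as Unihan.txt, are out of scope.

```python
def parse_ucd_line(line):
    """Parse one UCD data line into (start, end, properties).

    Returns None for blank or comment-only lines.
    """
    # Everything from '#' to the end of the line is a comment; it is
    # dropped rather than parsed, as the format description advises.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    # Fields are separated by semicolons; leading and trailing spaces
    # within a field are not significant.
    fields = [field.strip() for field in data.split(";")]
    # Field 0 is a code point or a range "X..Y", in hexadecimal
    # without a "U+" prefix.
    first, sep, last = fields[0].partition("..")
    start = int(first, 16)
    end = int(last, 16) if sep else start
    return start, end, fields[1:]
```

Applied to the sample line `2000..200A; White_space # Zs [11] EN QUAD..HAIR SPACE`, the comment is removed before the range is split, so the ".." inside the comment does not interfere.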
-<h2><a name="References">References</a></h2>
-<table cellspacing="12" cellpadding="0" width="100%" border="0">
- <tbody>
- <tr>
- <td valign="top" width="1">[<a name="FAQ">FAQ</a>]</td>
- <td valign="top">Unicode Frequently Asked Questions<br>
- <a href="http://www.unicode.org/unicode/faq/">http://www.unicode.org/unicode/faq/<br>
- </a><i>For answers to common questions on technical issues.</i></td>
- </tr>
- <tr>
- <td valign="top" width="1">[<a name="Glossary">Glossary</a>]</td>
- <td valign="top">Unicode Glossary<a
- href="http://www.unicode.org/glossary/"><br>
- http://www.unicode.org/glossary/<br>
- </a><i>For explanations of terminology used in this and other documents.</i></td>
- </tr>
- <tr>
- <td valign="top" width="1">[<a name="Reports">Reports</a>]</td>
- <td valign="top">Unicode Technical Reports<br>
- <a href="http://www.unicode.org/unicode/reports/">http://www.unicode.org/unicode/reports/<br>
- </a><i>For information on the status and development process for
- technical reports, and for a list of technical reports.</i></td>
- </tr>
- <tr>
- <td valign="top" width="1">[<a name="U3.1">U3.1</a>]</td>
- <td valign="top">Unicode Standard Annex #27: Unicode 3.1<a
- href="http://www.unicode.org/unicode/reports/tr27/"><br>
- http://www.unicode.org/unicode/reports/tr27/</a></td>
- </tr>
- <tr>
- <td valign="top" width="1">[<a name="Versions">Versions</a>]</td>
- <td valign="top">Versions of the Unicode Standard<br>
- <a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions/<br>
- </a><i>For details on the precise contents of each version of the
- Unicode Standard, and how to cite them.</i></td>
- </tr>
- </tbody>
-</table>
-<h2><br>
-<i><a name="UCD_Terms">UCD Terms of Use</a></i></h2>
-<h3><i>Disclaimer</i></h3>
-<blockquote>
- <p><i>The Unicode Character Database is provided as is by Unicode, Inc. No
- claims are made as to fitness for any particular purpose. No warranties of any
- kind are expressed or implied. The recipient agrees to determine applicability
- of information provided. If this file has been purchased on magnetic or
- optical media from Unicode, Inc., the sole remedy for any claim will be
- exchange of defective media within 90 days of receipt.</i></p>
- <p><i>This disclaimer is applicable for all other data files accompanying the
- Unicode Character Database, some of which have been compiled by the Unicode
- Consortium, and some of which have been supplied by other sources.</i></p>
-</blockquote>
-<h3><i>Limitations on Rights to Redistribute This Data</i></h3>
-<blockquote>
- <p><i>Recipient is granted the right to make copies in any form for internal
- distribution and to freely use the information supplied in the creation of
- products supporting the Unicode<sup>TM</sup> Standard. The files in the
- Unicode Character Database can be redistributed to third parties or other
- organizations (whether for profit or not) as long as this notice and the
- disclaimer notice are retained. Information can be extracted from these files
- and used in documentation or programs, as long as there is an accompanying
- notice indicating the source.</i></p>
-</blockquote>
-<hr width="50%">
-<p align="center"><a href="http://www.unicode.org/unicode/copyright.html"><img
-src="http://www.unicode.org/img/hb_home.gif" border="0" alt="Home" width="40"
-height="49"><img src="http://www.unicode.org/img/hb_mid.gif" border="0"
-alt="Terms of Use" width="152" height="49"><img
-src="http://www.unicode.org/img/hb_mail.gif" border="0" alt="E-mail" width="46"
-height="49"></a>
-
-</body>
-
-</html>
+++ /dev/null
-<html>
-
-<head>
-<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
-<meta name="ProgId" content="FrontPage.Editor.Document">
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
-<link rel="stylesheet" href="http://www.unicode.org/unicode.css" type="text/css">
-<title>UnicodeData File Format</title>
-</head>
-
-<body>
-
-<table width="100%" cellpadding="0" cellspacing="0" border="0">
- <tr>
- <td>
- <table width="100%" border="0" cellpadding="0" cellspacing="0">
- <tr>
- <td class="icon"><a href="http://www.unicode.org"><img border="0"
- src="http://www.unicode.org/webscripts/logo60s2.gif" align="middle"
- alt="[Unicode]" width="34" height="33"></a> <a
- class="bar" href="UnicodeCharacterDatabase.html">Unicode Character
- Database</a></td>
- </tr>
- </table>
- </td>
- </tr>
- <tr>
- <td class="gray"> </td>
- </tr>
-</table>
-<h1>Unicode Data File Format</h1>
-<table border="1" cellspacing="2" cellpadding="0" height="87" width="100%">
- <tr>
- <td valign="TOP" width="144">Revision</td>
- <td valign="TOP">3.1.0</td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Authors</td>
- <td valign="TOP">Mark Davis and Ken Whistler</td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Date</td>
- <td valign="TOP">2001-02-28</td>
- </tr>
- <tr>
- <td valign="TOP" width="144">This Version</td>
- <td valign="TOP"><a
- href="http://www.unicode.org/Public/3.1-Update/UnicodeData-3.1.0.html">http://www.unicode.org/Public/3.1-Update/UnicodeData-3.1.0.html</a></td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Previous Version</td>
- <td valign="TOP"><a
- href="http://www.unicode.org/Public/3.0-Update1/UnicodeData-3.0.1.html">http://www.unicode.org/Public/3.0-Update1/UnicodeData-3.0.1.html</a></td>
- </tr>
- <tr>
- <td valign="TOP" width="144">Latest Version</td>
- <td valign="TOP"><a
- href="http://www.unicode.org/Public/UNIDATA/UnicodeData.html">http://www.unicode.org/Public/UNIDATA/UnicodeData.html</a></td>
- </tr>
-</table>
-<h3><br>
-<i>Summary</i></h3>
-<blockquote>
- <p><i>This document describes the format and content of the UnicodeData.txt
- file in the Unicode Character Database (UCD).</i></p>
-</blockquote>
-<h3><i>Status</i></h3>
-<blockquote>
- <p><i>The file and the files described herein are part of the Unicode
- Character Database and governed by the <a href="#UCD_Terms">UCD Terms of
- Use</a> given below.</i></p>
- <p><i>For general information on file formats and table formats, and the
- implications of normative vs informative properties, see
- UnicodeCharacterDatabase.html. </i></p>
- <p><i><b>Warning: </b>the information in this file does not completely
- describe the use and interpretation of Unicode character properties and
- behavior. It must be used in conjunction with the data in the other files in
- the UCD, and relies on the notation and definitions supplied in <a
- href="http://www.unicode.org/unicode/standard/versions/Unicode3.0.html">The
- Unicode Standard</a>. All chapter references are to Version 3.1.0 of the
- standard.</i></p>
-</blockquote>
-<h2>Introduction</h2>
-<p>This document describes the format of the UnicodeData.txt file, which is one
-of the files in the Unicode Character Database. The document is divided into the
-following sections:
-<ul>
- <li><a href="#Field Formats">Field Formats</a>
- <ul>
- <li><a href="#General Category">General Category</a></li>
- <li><a href="#Bidirectional Category">Bidirectional Category</a></li>
- <li><a href="#Character Decomposition">Character Decomposition Mapping</a></li>
- <li><a href="#Canonical Combining Classes">Canonical Combining Classes</a></li>
- <li><a href="#Decompositions and Normalization">Decompositions and
- Normalization</a></li>
- <li><a href="#Case Mappings">Case Mappings</a></li>
- </ul>
- </li>
- <li><a href="#Property Invariants">Property Invariants</a></li>
- <li><a href="#Modification History">Modification History</a></li>
-</ul>
-<h2><a name="Field Formats"></a>Field Formats</h2>
-<p>Each line represents the data for one encoded character in the Unicode
-Standard. (For information on the file format, see UCD File Format in
-UnicodeCharacterDatabase.html).
-<p>Every encoded character has a data entry, with the exception of certain
-special ranges, as detailed below.
-<ul>
- <li>These ranges are represented only by their start and end characters, since the
- properties in the file are uniform, except for code values (which are all
- sequential and assigned).</li>
- <li>The names of CJK ideograph characters and the names and decompositions of
- Hangul syllable characters are algorithmically derivable. (See the Unicode
- Standard and <a href="http://www.unicode.org/unicode/reports/tr15/">Unicode
- Standard Annex #15</a> for more information).</li>
- <li>Surrogate code values and private use characters have no names.</li>
- <li>The supplementary Private Use characters (U+F0000 .. U+FFFFD, U+100000 ..
- U+10FFFD) are listed as distinct ranges. These correspond to surrogate pairs
- where the first surrogate is in the High Surrogate Private Use section.</li>
-</ul>
-<p>The exact ranges represented by start and end characters are:
-<ul>
- <li>CJK Ideographs Extension A (U+3400 .. U+4DB5)</li>
- <li>CJK Ideographs (U+4E00 .. U+9FA5)</li>
- <li>Hangul Syllables (U+AC00 .. U+D7A3)</li>
- <li>Non-Private Use High Surrogates (U+D800 .. U+DB7F)</li>
- <li>Private Use High Surrogates (U+DB80 .. U+DBFF)</li>
- <li>Low Surrogates (U+DC00 .. U+DFFF)</li>
- <li>The Private Use Area (U+E000 .. U+F8FF)</li>
- <li>CJK Ideographs Extension B (U+20000 .. U+2A6D6)</li>
- <li>Plane 15 Private Use Area (U+F0000 .. U+FFFFD)</li>
- <li>Plane 16 Private Use Area (U+100000 .. U+10FFFD)</li>
-</ul>
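The range list above can be turned into a small lookup table. The following is a minimal sketch, not part of the original document: the names RANGES and range_label are hypothetical, and the bounds are copied from the list above (Unicode 3.1.0), so they would need updating for any later version of the data.

```python
# Ranges that the file represents only by their first and last code values,
# copied from the list above. RANGES and range_label are illustrative names.
RANGES = [
    (0x3400, 0x4DB5, "CJK Ideographs Extension A"),
    (0x4E00, 0x9FA5, "CJK Ideographs"),
    (0xAC00, 0xD7A3, "Hangul Syllables"),
    (0xD800, 0xDB7F, "Non-Private Use High Surrogates"),
    (0xDB80, 0xDBFF, "Private Use High Surrogates"),
    (0xDC00, 0xDFFF, "Low Surrogates"),
    (0xE000, 0xF8FF, "The Private Use Area"),
    (0x20000, 0x2A6D6, "CJK Ideographs Extension B"),
    (0xF0000, 0xFFFFD, "Plane 15 Private Use Area"),
    (0x100000, 0x10FFFD, "Plane 16 Private Use Area"),
]


def range_label(code_value):
    """Return the label of the special range containing code_value, or None."""
    for first, last, label in RANGES:
        if first <= code_value <= last:
            return label
    return None
```

A parser could consult such a table for any code value that has no data entry of its own, and treat every member of a range as sharing the properties of the range's endpoints.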
-<p>The following table describes the format and meaning of each field in a data
-entry in the UnicodeData file.</p>
-<table border="1" cellspacing="2" cellpadding="2">
- <tr>
- <th valign="top" align="LEFT">
- <p align="LEFT">Field</th>
- <th valign="top" align="LEFT">
- <p align="LEFT">Name</th>
- <th valign="top" align="center">
- <p align="LEFT">N/I</th>
- <th valign="top" align="LEFT">
- <p align="LEFT">Explanation</th>
- </tr>
- <tr>
- <th valign="top">0</th>
- <td valign="top">Code value</td>
- <td valign="top" align="center">N</td>
- <td valign="top">Code value.</td>
- </tr>
- <tr>
- <th valign="top">1</th>
- <td valign="top">Character name</td>
- <td valign="top" align="center">N</td>
- <td valign="top">These names match exactly the names published in Chapter 14
- of the Unicode Standard, Version 3.0.</td>
- </tr>
- <tr>
- <th valign="top">2</th>
- <td valign="top"><a href="#General Category">General Category</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">This is a useful breakdown into various "character
- types" which can be used as a default categorization in
- implementations. See below for a brief explanation.</td>
- </tr>
- <tr>
- <th valign="top">3</th>
- <td valign="top"><a href="#Canonical Combining Classes">Canonical Combining
- Classes</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">The classes used for the Canonical Ordering Algorithm in
- the Unicode Standard. These classes are also printed in Chapter 4 of the
- Unicode Standard.</td>
- </tr>
- <tr>
- <th valign="top">4</th>
- <td valign="top"><a href="#Bidirectional Category">Bidirectional Category</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">See the list below for an explanation of the abbreviations
- used in this field. These are the categories required by the Bidirectional
- Behavior Algorithm in the Unicode Standard. These categories are
- summarized in Chapter 3 of the Unicode Standard.</td>
- </tr>
- <tr>
- <th valign="top">5</th>
- <td valign="top"><a href="#Character Decomposition">Character Decomposition
- Mapping</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">In the Unicode Standard, not all of the mappings are full
- (maximal) decompositions. Recursive application of look-up for
- decompositions will, in all cases, lead to a maximal decomposition. The
- decomposition mappings match exactly the decomposition mappings published
- with the character names in the Unicode Standard.</td>
- </tr>
- <tr>
- <th valign="top">6</th>
- <td valign="top">Decimal digit value</td>
- <td valign="top" align="center">N</td>
- <td valign="top">This is a numeric field. If the character has the decimal
- digit property, as specified in Chapter 4 of the Unicode Standard, the
- value of that digit is represented with an integer value in this field.</td>
- </tr>
- <tr>
- <th valign="top">7</th>
- <td valign="top">Digit value</td>
- <td valign="top" align="center">N</td>
- <td valign="top">This is a numeric field. If the character represents a
- digit, not necessarily a decimal digit, the value is here. This covers
- digits which do not form decimal radix forms, such as the compatibility
- superscript digits.</td>
- </tr>
- <tr>
- <th valign="top">8</th>
- <td valign="top">Numeric value</td>
- <td valign="top" align="center">N</td>
- <td valign="top">This is a numeric field. If the character has the numeric
- property, as specified in Chapter 4 of the Unicode Standard, the value of
- that character is represented with an integer or rational number in this
- field. This includes fractions as, e.g., "1/5" for U+2155 VULGAR
- FRACTION ONE FIFTH. Also included are numerical values for compatibility
- characters such as circled numbers.</td>
- </tr>
- <tr>
- <th valign="top">9</th>
- <td valign="top">Mirrored</td>
- <td valign="top" align="center">N</td>
- <td valign="top">If the character has been identified as a
- "mirrored" character in bidirectional text, this field has the
- value "Y"; otherwise "N". The list of mirrored
- characters is also printed in Chapter 4 of the Unicode Standard.</td>
- </tr>
- <tr>
- <th valign="top">10</th>
- <td valign="top">Unicode 1.0 Name</td>
- <td valign="top" align="center">I</td>
- <td valign="top">This is the old name as published in Unicode 1.0. This name
- is only provided when it is significantly different from the current name
- for the character.</td>
- </tr>
- <tr>
- <th valign="top">11</th>
- <td valign="top">10646 comment field</td>
- <td valign="top" align="center">I</td>
- <td valign="top">This is the ISO 10646 comment field. It appears in
- parentheses in the 10646 names list, or contains an asterisk to mark an
- Annex P note.</td>
- </tr>
- <tr>
- <th valign="top">12</th>
- <td valign="top"><a href="#Case Mappings">Uppercase Mapping</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">Upper case equivalent mapping. If a character is part of an
- alphabet with case distinctions, and has a simple upper case equivalent,
- then the upper case equivalent is in this field. See the explanation below
- on case distinctions. These mappings are always one-to-one, not
- one-to-many or many-to-one.
- <p><i>For full case mappings, see <a
- href="http://www.unicode.org/unicode/reports/tr21/">UTR #21</a> and
- SpecialCasing.txt.</i></p>
- </td>
- </tr>
- <tr>
- <th valign="top">13</th>
- <td valign="top"><a href="#Case Mappings">Lowercase Mapping</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">Similar to Uppercase mapping</td>
- </tr>
- <tr>
- <th valign="top">14</th>
- <td valign="top"><a href="#Case Mappings">Titlecase Mapping</a></td>
- <td valign="top" align="center">N</td>
- <td valign="top">Similar to Uppercase mapping</td>
- </tr>
-</table>
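The field table above maps directly onto a record parser. This is a sketch under stated assumptions rather than a reference implementation: the field keys, parse_line, and the two sample lines are illustrative (the samples are believed to match the published data but should be checked against the actual UnicodeData.txt), and fields are assumed to be separated by semicolons as described in UnicodeCharacterDatabase.html.

```python
from fractions import Fraction

# One key per field 0..14 in the table above. The key names are hypothetical.
FIELD_NAMES = [
    "code", "name", "general_category", "combining_class", "bidi_category",
    "decomposition", "decimal_digit", "digit", "numeric", "mirrored",
    "unicode_1_name", "comment_10646", "uppercase", "lowercase", "titlecase",
]

# Illustrative sample entries (assumed, not taken from this document).
SAMPLE_A = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
SAMPLE_FIFTH = ("2155;VULGAR FRACTION ONE FIFTH;No;0;ON;"
                "<fraction> 0031 2044 0035;;;1/5;N;FRACTION ONE FIFTH;;;;")


def parse_line(line):
    """Split one data entry into its 15 fields and convert the typed ones."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELD_NAMES), len(fields)))
    record = dict(zip(FIELD_NAMES, fields))
    record["code"] = int(record["code"], 16)
    record["combining_class"] = int(record["combining_class"])
    record["mirrored"] = record["mirrored"] == "Y"
    # Field 8 holds an integer or a rational such as "1/5"; empty means none.
    record["numeric"] = Fraction(record["numeric"]) if record["numeric"] else None
    # Fields 12-14 hold a single code value each, or are empty.
    for key in ("uppercase", "lowercase", "titlecase"):
        record[key] = int(record[key], 16) if record[key] else None
    return record
```

Rejecting lines with an unexpected field count relies on the invariant, stated later in this document, that the number and order of fields in UnicodeData.txt are fixed.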
-<h3><a name="General Category"></a>General Category</h3>
-<p>The values in this field are abbreviations for the following values. For more
-information, see the Unicode Standard.</p>
-<blockquote>
- <p><b>Note:</b> the standard does not assign information to control characters
- (except for certain cases in the Bidirectional Algorithm). Implementations
- will generally also assign categories to certain control characters, notably
- CR and LF, according to platform conventions. See <a
- href="http://www.unicode.org/unicode/reports/tr13/">UAX #13: Unicode Newline
- Guidelines</a> for more information.</p>
-</blockquote>
-<table border="0" cellspacing="0" cellpadding="4">
- <tr>
- <th>
- <p align="LEFT">Abbr.</th>
- <th>
- <p align="LEFT">Description</th>
- </tr>
- <tr>
- <td align="CENTER">Lu</td>
- <td>Letter, Uppercase</td>
- </tr>
- <tr>
- <td align="CENTER">Ll</td>
- <td>Letter, Lowercase</td>
- </tr>
- <tr>
- <td align="CENTER">Lt</td>
- <td>Letter, Titlecase</td>
- </tr>
- <tr>
- <td align="CENTER">Lm</td>
- <td>Letter, Modifier</td>
- </tr>
- <tr>
- <td align="CENTER">Lo</td>
- <td>Letter, Other</td>
- </tr>
- <tr>
- <td align="CENTER">Mn</td>
- <td>Mark, Non-Spacing</td>
- </tr>
- <tr>
- <td align="CENTER">Mc</td>
- <td>Mark, Spacing Combining</td>
- </tr>
- <tr>
- <td align="CENTER">Me</td>
- <td>Mark, Enclosing</td>
- </tr>
- <tr>
- <td align="CENTER">Nd</td>
- <td>Number, Decimal Digit</td>
- </tr>
- <tr>
- <td align="CENTER">Nl</td>
- <td>Number, Letter</td>
- </tr>
- <tr>
- <td align="CENTER">No</td>
- <td>Number, Other</td>
- </tr>
- <tr>
- <td align="CENTER">Pc</td>
- <td>Punctuation, Connector</td>
- </tr>
- <tr>
- <td align="CENTER">Pd</td>
- <td>Punctuation, Dash</td>
- </tr>
- <tr>
- <td align="CENTER">Ps</td>
- <td>Punctuation, Open</td>
- </tr>
- <tr>
- <td align="CENTER">Pe</td>
- <td>Punctuation, Close</td>
- </tr>
- <tr>
- <td align="CENTER">Pi</td>
- <td>Punctuation, Initial quote (may behave like Ps or Pe depending on usage)</td>
- </tr>
- <tr>
- <td align="CENTER">Pf</td>
- <td>Punctuation, Final quote (may behave like Ps or Pe depending on usage)</td>
- </tr>
- <tr>
- <td align="CENTER">Po</td>
- <td>Punctuation, Other</td>
- </tr>
- <tr>
- <td align="CENTER">Sm</td>
- <td>Symbol, Math</td>
- </tr>
- <tr>
- <td align="CENTER">Sc</td>
- <td>Symbol, Currency</td>
- </tr>
- <tr>
- <td align="CENTER">Sk</td>
- <td>Symbol, Modifier</td>
- </tr>
- <tr>
- <td align="CENTER">So</td>
- <td>Symbol, Other</td>
- </tr>
- <tr>
- <td align="CENTER">Zs</td>
- <td>Separator, Space</td>
- </tr>
- <tr>
- <td align="CENTER">Zl</td>
- <td>Separator, Line</td>
- </tr>
- <tr>
- <td align="CENTER">Zp</td>
- <td>Separator, Paragraph</td>
- </tr>
- <tr>
- <td align="CENTER">Cc</td>
- <td>Other, Control</td>
- </tr>
- <tr>
- <td align="CENTER">Cf</td>
- <td>Other, Format</td>
- </tr>
- <tr>
- <td align="CENTER">Cs</td>
- <td>Other, Surrogate</td>
- </tr>
- <tr>
- <td align="CENTER">Co</td>
- <td>Other, Private Use</td>
- </tr>
- <tr>
- <td align="CENTER">Cn</td>
- <td>Other, Not Assigned (no characters in the file have this property)</td>
- </tr>
-</table>
-<blockquote>
- <p><b>Note:</b> The term "L&" is sometimes used to stand for
- Uppercase, Lowercase or Titlecase letters (Lu, Ll, or Lt).</p>
-</blockquote>
-<h3><a name="Bidirectional Category"></a>Bidirectional Category</h3>
-<p>Please refer to Chapter 3 for an explanation of the algorithm for
-Bidirectional Behavior and an explanation of the significance of these
-categories. An up-to-date version can be found on <a
-href="http://www.unicode.org/unicode/reports/tr9/">Unicode Standard Annex #9:
-The Bidirectional Algorithm</a>.</p>
-<table border="0" cellpadding="4" cellspacing="0">
- <tr>
- <th valign="TOP" align="LEFT">
- <p align="LEFT">Type</th>
- <th valign="TOP" align="LEFT">
- <p align="LEFT">Description</th>
- </tr>
- <tr>
- <td valign="TOP"><b>L</b></td>
- <td valign="TOP">Left-to-Right</td>
- </tr>
- <tr>
- <td valign="TOP"><b>LRE</b></td>
- <td valign="TOP">Left-to-Right Embedding</td>
- </tr>
- <tr>
- <td valign="TOP"><b>LRO</b></td>
- <td valign="TOP">Left-to-Right Override</td>
- </tr>
- <tr>
- <td valign="TOP"><b>R</b></td>
- <td valign="TOP">Right-to-Left</td>
- </tr>
- <tr>
- <td valign="TOP"><b>AL</b></td>
- <td valign="TOP">Right-to-Left Arabic</td>
- </tr>
- <tr>
- <td valign="TOP"><b>RLE</b></td>
- <td valign="TOP">Right-to-Left Embedding</td>
- </tr>
- <tr>
- <td valign="TOP"><b>RLO</b></td>
- <td valign="TOP">Right-to-Left Override</td>
- </tr>
- <tr>
- <td valign="TOP"><b>PDF</b></td>
- <td valign="TOP">Pop Directional Format</td>
- </tr>
- <tr>
- <td valign="TOP"><b>EN</b></td>
- <td valign="TOP">European Number</td>
- </tr>
- <tr>
- <td valign="TOP"><b>ES</b></td>
- <td valign="TOP">European Number Separator</td>
- </tr>
- <tr>
- <td valign="TOP"><b>ET</b></td>
- <td valign="TOP">European Number Terminator</td>
- </tr>
- <tr>
- <td valign="TOP"><b>AN</b></td>
- <td valign="TOP">Arabic Number</td>
- </tr>
- <tr>
- <td valign="TOP"><b>CS</b></td>
- <td valign="TOP">Common Number Separator</td>
- </tr>
- <tr>
- <td valign="TOP"><b>NSM</b></td>
- <td valign="TOP">Non-Spacing Mark</td>
- </tr>
- <tr>
- <td valign="TOP"><b>BN</b></td>
- <td valign="TOP">Boundary Neutral</td>
- </tr>
- <tr>
- <td valign="TOP"><b>B</b></td>
- <td valign="TOP">Paragraph Separator</td>
- </tr>
- <tr>
- <td valign="TOP"><b>S</b></td>
- <td valign="TOP">Segment Separator</td>
- </tr>
- <tr>
- <td valign="TOP"><b>WS</b></td>
- <td valign="TOP">Whitespace</td>
- </tr>
- <tr>
- <td valign="TOP"><b>ON</b></td>
- <td valign="TOP">Other Neutrals</td>
- </tr>
-</table>
-<h3><a name="Character Decomposition"></a>Character Decomposition Mapping</h3>
-<p>The tags supplied with certain decomposition mappings generally indicate
-formatting information. Where no such tag is given, the mapping is designated as
-canonical. Conversely, the presence of a formatting tag also indicates that the
-mapping is a compatibility mapping and not a canonical mapping. In the absence
-of other formatting information in a compatibility mapping, the tag is used to
-distinguish it from canonical mappings.</p>
-<p>In some instances a canonical mapping or a compatibility mapping may consist
-of a single character. For a canonical mapping, this indicates that the
-character is a canonical equivalent of another single character. For a
-compatibility mapping, this indicates that the character is a compatibility
-equivalent of another single character. The compatibility formatting tags used
-are:</p>
-<table border="0" cellspacing="0" cellpadding="4">
- <tr>
- <th>Tag</th>
- <th>
- <p align="LEFT">Description</th>
- </tr>
- <tr>
- <td align="CENTER">&lt;font&gt; </td>
- <td>A font variant (e.g. a blackletter form).</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;noBreak&gt; </td>
- <td>A no-break version of a space or hyphen.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;initial&gt; </td>
- <td>An initial presentation form (Arabic).</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;medial&gt; </td>
- <td>A medial presentation form (Arabic).</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;final&gt; </td>
- <td>A final presentation form (Arabic).</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;isolated&gt; </td>
- <td>An isolated presentation form (Arabic).</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;circle&gt; </td>
- <td>An encircled form.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;super&gt; </td>
- <td>A superscript form.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;sub&gt; </td>
- <td>A subscript form.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;vertical&gt; </td>
- <td>A vertical layout presentation form.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;wide&gt; </td>
- <td>A wide (or zenkaku) compatibility character.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;narrow&gt; </td>
- <td>A narrow (or hankaku) compatibility character.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;small&gt; </td>
- <td>A small variant form (CNS compatibility).</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;square&gt; </td>
- <td>A CJK squared font variant.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;fraction&gt; </td>
- <td>A vulgar fraction form.</td>
- </tr>
- <tr>
- <td align="CENTER">&lt;compat&gt; </td>
- <td>Otherwise unspecified compatibility character.</td>
- </tr>
-</table>
-<p><b>Reminder: </b>There is a difference between decomposition and
-decomposition mapping. The decomposition mappings are defined in UnicodeData.txt,
-while the decomposition (also termed "full decomposition") is defined
-in Chapter 3 to use those mappings <i>recursively.</i>
-<ul>
- <li>The canonical decomposition is formed by recursively applying the
- canonical mappings, then applying the canonical reordering algorithm.</li>
- <li>The compatibility decomposition is formed by recursively applying the
- canonical <em>and</em> compatibility mappings, then applying the canonical
- reordering algorithm.</li>
-</ul>
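The two definitions above can be sketched in a few lines. This is an illustration under stated assumptions, not the normative algorithm text: it reads mappings and combining classes from Python's standard unicodedata module (which tracks a later Unicode version than the 3.1.0 data described here), the function names are hypothetical, and Hangul syllables, whose decompositions are derived algorithmically, are left untouched.

```python
import unicodedata


def apply_mappings(ch, compat=False):
    """Recursively apply decomposition mappings to a single character."""
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return ch
    parts = mapping.split()
    if parts[0].startswith("<"):
        # A leading tag marks a compatibility mapping.
        if not compat:
            return ch
        parts = parts[1:]
    return "".join(apply_mappings(chr(int(p, 16)), compat) for p in parts)


def canonical_reorder(text):
    """Stable-sort each run of non-zero combining classes by class value."""
    out, run = [], []
    for ch in text:
        if unicodedata.combining(ch) == 0:
            out.extend(sorted(run, key=unicodedata.combining))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=unicodedata.combining))
    return "".join(out)


def full_decomposition(text, compat=False):
    """Canonical (or, with compat=True, compatibility) decomposition."""
    mapped = "".join(apply_mappings(ch, compat) for ch in text)
    return canonical_reorder(mapped)
```

For text without Hangul syllables, the result is expected to agree with unicodedata.normalize("NFD", text), or with "NFKD" when compat=True. The sort must be stable, because characters with equal combining classes keep their relative order.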
-<h3><a name="Canonical Combining Classes"></a>Canonical Combining Classes</h3>
-<table border="0" cellspacing="0" cellpadding="4">
- <tr>
- <th>
- <p align="LEFT">Value</th>
- <th>
- <p align="LEFT">Description</th>
- </tr>
- <tr>
- <td align="RIGHT">0:</td>
- <td>Spacing, split, enclosing, reordrant, and Tibetan subjoined</td>
- </tr>
- <tr>
- <td align="RIGHT">1:</td>
- <td>Overlays and interior</td>
- </tr>
- <tr>
- <td align="RIGHT">7:</td>
- <td>Nuktas</td>
- </tr>
- <tr>
- <td align="RIGHT">8:</td>
- <td>Hiragana/Katakana voicing marks</td>
- </tr>
- <tr>
- <td align="RIGHT">9:</td>
- <td>Viramas</td>
- </tr>
- <tr>
- <td align="RIGHT">10:</td>
- <td>Start of fixed position classes</td>
- </tr>
- <tr>
- <td align="RIGHT">199:</td>
- <td>End of fixed position classes</td>
- </tr>
- <tr>
- <td align="RIGHT">200:</td>
- <td>Below left attached</td>
- </tr>
- <tr>
- <td align="RIGHT">202:</td>
- <td>Below attached</td>
- </tr>
- <tr>
- <td align="RIGHT">204:</td>
- <td>Below right attached</td>
- </tr>
- <tr>
- <td align="RIGHT">208:</td>
- <td>Left attached (reordrant around single base character)</td>
- </tr>
- <tr>
- <td align="RIGHT">210:</td>
- <td>Right attached</td>
- </tr>
- <tr>
- <td align="RIGHT">212:</td>
- <td>Above left attached</td>
- </tr>
- <tr>
- <td align="RIGHT">214:</td>
- <td>Above attached</td>
- </tr>
- <tr>
- <td align="RIGHT">216:</td>
- <td>Above right attached</td>
- </tr>
- <tr>
- <td align="RIGHT">218:</td>
- <td>Below left</td>
- </tr>
- <tr>
- <td align="RIGHT">220:</td>
- <td>Below</td>
- </tr>
- <tr>
- <td align="RIGHT">222:</td>
- <td>Below right</td>
- </tr>
- <tr>
- <td align="RIGHT">224:</td>
- <td>Left (reordrant around single base character)</td>
- </tr>
- <tr>
- <td align="RIGHT">226:</td>
- <td>Right</td>
- </tr>
- <tr>
- <td align="RIGHT">228:</td>
- <td>Above left</td>
- </tr>
- <tr>
- <td align="RIGHT">230:</td>
- <td>Above</td>
- </tr>
- <tr>
- <td align="RIGHT">232:</td>
- <td>Above right</td>
- </tr>
- <tr>
- <td align="RIGHT">233:</td>
- <td>Double below</td>
- </tr>
- <tr>
- <td align="RIGHT">234:</td>
- <td>Double above</td>
- </tr>
- <tr>
- <td align="RIGHT">240:</td>
- <td>Below (iota subscript)</td>
- </tr>
-</table>
-<p><strong>Note: </strong>some of the combining classes in this list do not
-currently have members but are specified here for completeness.</p>
-<h3><a name="Decompositions and Normalization"></a>Decompositions and
-Normalization</h3>
-<p>Decomposition is specified in Chapter 3. <a
-href="http://www.unicode.org/unicode/reports/tr15/"><i>Unicode Standard Annex
-#15: Unicode Normalization Forms</i></a> specifies the interaction between
-decomposition and normalization. That report specifies how the decompositions
-defined in UnicodeData.txt are used to derive normalized forms of Unicode text.</p>
-<p>Note that as of the 2.1.9 update of the Unicode Character Database, the
-decompositions in the UnicodeData.txt file can be used to recursively derive the
-full decomposition in canonical order, without the need to separately apply
-canonical reordering. However, canonical reordering of combining character
-sequences <b><i>must</i></b> still be applied in decomposition when normalizing
-source text which contains any combining marks.</p>
-<h3><a name="Case Mappings"></a>Case Mappings</h3>
-<p>There are a number of complications to case mappings that occur once the
-repertoire of characters is expanded beyond ASCII. For more information, see <a
-href="http://www.unicode.org/unicode/reports/tr21/">UTR #21: Case Mappings</a>.</p>
-<p>For compatibility with existing parsers, UnicodeData.txt only contains case
-mappings for characters where they are one-to-one mappings; it also omits
-information about context-sensitive case mappings. Information about these
-special cases can be found in a separate data file, SpecialCasing.txt.</p>
-<h2><a name="Property Invariants"></a>Property Invariants</h2>
-<p>Values in UnicodeData.txt are subject to correction as errors are found;
-however, some characteristics of the categories themselves can be considered
-invariants. Applications may wish to take these invariants into account when
-choosing how to implement character properties. For more information, see <a
-href="http://www.unicode.org/unicode/standard/policies.html">Unicode Policies</a>.</p>
-<p>The following is a partial list of known invariants for the Unicode Character
-Database.</p>
-<h4>Database Fields</h4>
-<ul>
- <li>The number of fields in UnicodeData.txt is fixed.</li>
- <li>The order of the fields is also fixed.
- <ul>
- <li>Any additional information about character properties to be added in
- the future will appear in separate data tables, rather than being added
- on to the existing table or by subdivision or reinterpretation of
- existing fields.</li>
- </ul>
- </li>
-</ul>
-<h4>General Category</h4>
-<ul>
- <li>There will never be more than 32 General Category values.
- <ul>
- <li>It is very unlikely that the Unicode Technical Committee will
- subdivide the General Category partition any further, since that can
- cause implementations to misbehave. Because the General Category is
- limited to 32 values, 5 bits can be used to represent the information,
- and a 32-bit integer can be used as a bitmask to represent arbitrary
- sets of categories.</li>
- </ul>
- </li>
-</ul>
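The bitmask representation mentioned above might look like the following sketch. The bit assignment is an arbitrary choice made for illustration (the document does not prescribe one), and CATEGORIES, category_mask, and in_set are hypothetical names; the abbreviations are the 30 values listed in the General Category table.

```python
# The 30 General Category abbreviations from the table earlier in this
# document. Their order here, and hence the bit positions, is arbitrary.
CATEGORIES = [
    "Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd", "Nl",
    "No", "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po", "Sm", "Sc",
    "Sk", "So", "Zs", "Zl", "Zp", "Cc", "Cf", "Cs", "Co", "Cn",
]

# Each category gets one bit; at most 32 values fit in a 32-bit integer.
BIT = {abbr: 1 << index for index, abbr in enumerate(CATEGORIES)}


def category_mask(*abbrs):
    """Combine any number of category abbreviations into one bitmask."""
    mask = 0
    for abbr in abbrs:
        mask |= BIT[abbr]
    return mask


def in_set(abbr, mask):
    """Test whether a category belongs to the set encoded by mask."""
    return bool(BIT[abbr] & mask)


# The set sometimes written "L&": uppercase, lowercase, or titlecase letters.
CASED_LETTERS = category_mask("Lu", "Ll", "Lt")
```

Set membership then costs a single AND, and unions and intersections of category sets reduce to bitwise OR and AND on the masks.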
-<h4>Combining Classes</h4>
-<ul>
- <li>Combining classes are limited to the values 0 to 255.
- <ul>
- <li>In practice, there are far fewer than 256 values used. Implementations
- may take advantage of this fact for compression, since only the ordering
- of the non-zero values matters for the Canonical Reordering Algorithm.
- It is possible for up to 256 values to be used in the future; however,
- UTC decisions in the future may restrict the number of values to 128,
- since this has implementation advantages. [Signed bytes can be used
- without widening to ints in Java, for example.]</li>
- </ul>
- </li>
- <li>All characters other than those of General Category M* have the combining
- class 0.
- <ul>
- <li>Currently, all characters other than those of General Category Mn have
- the value 0. However, some characters of General Category Me or Mc may
- be given non-zero values in the future.</li>
- <li>The precise values above the value 0 are not invariant--only the
- relative ordering is considered normative. For example, it is not
- guaranteed in future versions that the class of U+05B4 will be precisely
- 14.</li>
- </ul>
- </li>
-</ul>
-<h4>Canonical Decomposition</h4>
-<ul>
- <li>Canonical mappings are always in canonical order.</li>
- <li>When a canonical mapping consists of a pair of characters, only the
- first character of the pair may decompose further.</li>
- <li>Canonical decompositions are "transparent" to other character
- data:
- <ul>
- <li><tt>BIDI(a) = BIDI(principal(canonicalDecomposition(a)))</tt></li>
- <li><tt>Category(a) = Category(principal(canonicalDecomposition(a)))</tt></li>
- <li><tt>CombiningClass(a) =
- CombiningClass(principal(canonicalDecomposition(a)))</tt><br>
- where principal(a) is the first character not of type Mn, or the first
- character if all characters are of type Mn.</li>
- </ul>
- </li>
- <li>However, because there are sometimes missing case pairs, and because of
- some legacy characters, it is only generally true that:
- <ul>
- <li><tt>upper(canonicalDecomposition(a)) = canonicalDecomposition(upper(a))</tt></li>
- <li><tt>lower(canonicalDecomposition(a)) = canonicalDecomposition(lower(a))</tt></li>
- <li><tt>title(canonicalDecomposition(a)) = canonicalDecomposition(title(a))</tt></li>
- </ul>
- </li>
-</ul>
-<h2><a name="Modification History"></a>Modification History</h2>
-<p>This section provides a summary of the changes between update versions of the
-Unicode Standard.</p>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 3.1">Unicode
-3.1</a></h3>
-<p>Modifications made for Version 3.1.0 of UnicodeData.txt include:
-<ul>
- <li>Addition of 2237 new entries, to cover new characters and new ranges of
- unified Han characters encoded in Unicode 3.1.</li>
- <li>Changed General Category value of 16EE..16F0 (Runic golden numbers) from
- No to Nl.</li>
-</ul>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 3.0.1">Unicode
-3.0.1</a></h3>
-<p>Modifications made for Version 3.0.1 of UnicodeData.txt include:
-<ul>
- <li>Added 5- and 6-digit representation of code points past U+FFFF.</li>
- <li>Added Private Use range definitions for Planes 15 and 16.</li>
- <li>Minor additions for the 10646 comment field.</li>
-</ul>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 3.0.0">Unicode
-3.0.0</a></h3>
-<p>Modifications made for Version 3.0.0 of UnicodeData.txt include many new
-characters and a number of property changes. These are summarized in Appendix D
-of <em>The Unicode Standard, Version 3.0.</em></p>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 2.1.9">Unicode
-2.1.9</a></h3>
-<p>Modifications made for Version 2.1.9 of UnicodeData.txt include:
-<ul>
- <li>Corrected combining class for U+05AE HEBREW ACCENT ZINOR.</li>
- <li>Corrected combining class for U+20E1 COMBINING LEFT RIGHT ARROW ABOVE.</li>
- <li>Corrected combining class for U+0F35 and U+0F37 to 220.</li>
- <li>Corrected combining class for U+0F71 to 129.</li>
- <li>Added a decomposition for U+0F0C TIBETAN MARK DELIMITER TSHEG BSTAR.</li>
- <li>Added decompositions for several Greek symbol letters:
- U+03D0..U+03D2, U+03D5, U+03D6, U+03F0..U+03F2.</li>
- <li>Removed decompositions from the conjoining jamo block:
- U+1100..U+11F8.</li>
- <li>Changes to decomposition mappings for some Tibetan vowels for consistency
- in normalization. (U+0F71, U+0F73, U+0F77, U+0F79, U+0F81)</li>
- <li>Updated the decomposition mappings for several Vietnamese characters with
- two diacritics (U+1EAC, U+1EAD, U+1EB6, U+1EB7, U+1EC6, U+1EC7, U+1ED8,
- U+1ED9), so that the recursive decomposition can be generated directly in
- canonically reordered form (not a normative change).</li>
- <li>Updated the decomposition mappings for several Arabic compatibility
- characters involving shadda (U+FC5E..U+FC62, U+FCF2..U+FCF4), and two Latin
- characters (U+1E1C, U+1E1D), so that the decompositions are generated
- directly in canonically reordered form (not a normative change).</li>
- <li>Changed BIDI category for: U+00A0 NO-BREAK SPACE, U+2007 FIGURE SPACE,
- U+2028 LINE SEPARATOR.</li>
- <li>Changed BIDI category for extenders of General Category Lm: U+3005,
- U+3021..U+3035, U+FF9E, U+FF9F.</li>
- <li>Changed General Category and BIDI category for the Greek numeral signs:
- U+0374, U+0375.</li>
- <li>Corrected General Category for U+FFE8 HALFWIDTH FORMS LIGHT VERTICAL.</li>
- <li>Added Unicode 1.0 names for many Tibetan characters (informative).</li>
-</ul>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 2.1.8">Unicode
-2.1.8</a></h3>
-<p>Modifications made for Version 2.1.8 of UnicodeData.txt include:
-<ul>
- <li>Added combining class 240 for U+0345 COMBINING GREEK YPOGEGRAMMENI so that
- decompositions involving iota subscript are derivable directly in
- canonically reordered form; this also has a bearing on simplification of
- casing of polytonic Greek.</li>
- <li>Changes in decompositions related to Greek tonos. These result from the
- clarification that monotonic Greek "tonos" should be equated with
- U+0301 COMBINING ACUTE, rather than with U+030D COMBINING VERTICAL LINE
- ABOVE. (All Greek characters in the Greek block involving "tonos";
- some Greek characters in the polytonic Greek in the 1FXX block.)</li>
- <li>Changed decompositions involving dialytika tonos. (U+0390, U+03B0)</li>
- <li>Changed ternary decompositions to binary. (U+0CCB, U+FB2C, U+FB2D) These
- changes simplify normalization.</li>
- <li>Removed canonical decomposition for Latin Candrabindu. (U+0310)</li>
- <li>Corrected error in canonical decomposition for U+1FF4.</li>
- <li>Added compatibility decompositions to clarify collation tables. (U+2100,
- U+2101, U+2105, U+2106, U+1E9A)</li>
- <li>A series of general category changes to assist the convergence of
- Unicode definition of identifier with ISO TR 10176:
- <ul>
- <li>So > Lo: U+0950, U+0AD0, U+0F00, U+0F88..U+0F8B</li>
- <li>Po > Lo: U+0E2F, U+0EAF, U+3006</li>
- <li>Lm > Sk: U+309B, U+309C</li>
- <li>Po > Pc: U+30FB, U+FF65</li>
- <li>Ps/Pe > Mn: U+0F3E, U+0F3F</li>
- </ul>
- </li>
- <li>A series of bidi property changes for consistency.
- <ul>
- <li>L > ET: U+09F2, U+09F3</li>
- <li>ON > L: U+3007</li>
- <li>L > ON: U+0F3A..U+0F3D, U+037E, U+0387</li>
- </ul>
- </li>
- <li>Add case mapping: U+01A6 <-> U+0280</li>
- <li>Updated symmetric swapping value for guillemets: U+00AB, U+00BB, U+2039,
- U+203A.</li>
- <li>Changes to combining class values. Most Indic fixed position class
- non-spacing marks were changed to combining class 0. This fixes some
- inconsistencies in how canonical reordering would apply to Indic scripts,
- including Tibetan. Indic interacting top/bottom fixed position classes were
- merged into single (non-zero) classes as part of this change. Tibetan
- subjoined consonants are changed from combining class 6 to combining class
- 0. Thai pinthu (U+0E3A) moved to combining class 9. Moved two Devanagari
- stress marks into generic above and below combining classes (U+0951,
- U+0952).</li>
- <li>Corrected placement of semicolon near symmetric swapping field. (U+FA0E,
- etc., scattered positions to U+FA29)</li>
-</ul>
-<h3>Version 2.1.7</h3>
-<p><i>This version was for internal change tracking only, and never publicly
-released.</i></p>
-<h3>Version 2.1.6</h3>
-<p><i>This version was for internal change tracking only, and never publicly
-released.</i></p>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 2.1.5">Unicode
-2.1.5</a></h3>
-<p>Modifications made for Version 2.1.5 of UnicodeData.txt include:
-<ul>
- <li>Changed decomposition for U+FF9E and U+FF9F so that correct collation
- weighting will automatically result from the canonical equivalences.</li>
- <li>Removed canonical decompositions for U+04D4, U+04D5, U+04D8, U+04D9,
- U+04E0, U+04E1, U+04E8, U+04E9 (the implication being that no canonical
- equivalence is claimed between these 8 characters and similar Latin
- letters), and updated 4 canonical decompositions for U+04DB, U+04DC, U+04EA,
- U+04EB to reflect the implied difference in the base character.</li>
- <li>Added Pi, and Pf categories and assigned the relevant quotation marks to
- those categories, based on the Unicode Technical Corrigendum on Quotation
- Characters.</li>
- <li>Updating of many bidi properties, following the advice of the ad hoc
- committee on bidi, and to make the bidi properties of compatibility
- characters more consistent.</li>
- <li>Changed category of several Tibetan characters: U+0F3E, U+0F3F,
- U+0F88..U+0F8B to make them non-combining, reflecting the combined opinion
- of Tibetan experts.</li>
- <li>Added case mapping for U+03F2.</li>
- <li>Corrected case mapping for U+0275.</li>
- <li>Added titlecase mappings for U+03D0, U+03D1, U+03D5, U+03D6, U+03F0..
- U+03F2.</li>
- <li>Corrected compatibility label for U+2121.</li>
- <li>Add specific entries for all the CJK compatibility ideographs,
- U+F900..U+FA2D, so the canonical decomposition for each (the URO character
- it is equivalent to) can be carried in the database.</li>
-</ul>
-<h3>Version 2.1.4</h3>
-<p><i>This version was for internal change tracking only, and never publicly
-released.</i></p>
-<h3>Version 2.1.3</h3>
-<p><i>This version was for internal change tracking only, and never publicly
-released.</i></p>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 2.1.2">Unicode
-2.1.2</a></h3>
-<p>Modifications made in updating UnicodeData.txt to Version 2.1.2 for the
-Unicode Standard, Version 2.1 (from Version 2.0) include:
-<ul>
- <li>Added two characters (U+20AC and U+FFFC).</li>
- <li>Amended bidi properties for U+0026, U+002E, U+0040, U+2007.</li>
- <li>Corrected case mappings for U+018E, U+019F, U+01DD, U+0258, U+0275,
- U+03C2, U+1E9B.</li>
- <li>Changed combining order class for U+0F71.</li>
- <li>Corrected canonical decompositions for U+0F73, U+1FBE.</li>
- <li>Changed decomposition for U+FB1F from compatibility to canonical.</li>
- <li>Added compatibility decompositions for U+FBE8, U+FBE9, U+FBF9..U+FBFB.</li>
- <li>Corrected compatibility decompositions for U+2469, U+246A, U+3358.</li>
-</ul>
-<h3>Version 2.1.1</h3>
-<p><i>This version was for internal change tracking only, and never publicly
-released.</i></p>
-<h3><a
-href="http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode 2.0.0">Unicode
-2.0.0</a></h3>
-<p>The modifications made in updating UnicodeData.txt for the Unicode Standard,
-Version 2.0 include:
-<ul>
- <li>Fixed decompositions with TONOS to use correct NSM: 030D.</li>
- <li>Removed old Hangul Syllables; mapping to new characters are in a separate
- table.</li>
- <li>Marked compatibility decompositions with additional tags.</li>
- <li>Changed old tag names for clarity.</li>
- <li>Revision of decompositions to use first-level decomposition, instead of
- maximal decomposition.</li>
- <li>Correction of all known errors in decompositions from earlier versions.</li>
- <li>Added control code names (as old Unicode names).</li>
- <li>Added Hangul Jamo decompositions.</li>
- <li>Added Number category to match properties list in book.</li>
- <li>Fixed categories of Koranic Arabic marks.</li>
- <li>Fixed categories of precomposed characters to match decomposition where
- possible.</li>
- <li>Added Hebrew cantillation marks and the Tibetan script.</li>
- <li>Added place holders for ranges such as CJK Ideographic Area and the
- Private Use Area.</li>
- <li>Added categories Me, Sk, Pc, Nl, Cs, Cf, and rectified a number of
- mistakes in the database.</li>
-</ul>
-<h2><i><a name="UCD_Terms">UCD Terms of Use</a></i></h2>
-<h3><i>Disclaimer</i></h3>
-<blockquote>
- <p><i>The Unicode Character Database is provided as is by Unicode, Inc. No
- claims are made as to fitness for any particular purpose. No warranties of any
- kind are expressed or implied. The recipient agrees to determine applicability
- of information provided. If this file has been purchased on magnetic or
- optical media from Unicode, Inc., the sole remedy for any claim will be
- exchange of defective media within 90 days of receipt.</i></p>
- <p><i>This disclaimer is applicable for all other data files accompanying the
- Unicode Character Database, some of which have been compiled by the Unicode
- Consortium, and some of which have been supplied by other sources.</i></p>
-</blockquote>
-<h3><i>Limitations on Rights to Redistribute This Data</i></h3>
-<blockquote>
- <p><i>Recipient is granted the right to make copies in any form for internal
- distribution and to freely use the information supplied in the creation of
- products supporting the Unicode<sup>TM</sup> Standard. The files in the
- Unicode Character Database can be redistributed to third parties or other
- organizations (whether for profit or not) as long as this notice and the
- disclaimer notice are retained. Information can be extracted from these files
- and used in documentation or programs, as long as there is an accompanying
- notice indicating the source.</i></p>
-</blockquote>
-<hr width="50%">
-<div align="center">
- <center>
- <table cellspacing="0" cellpadding="0" border="0">
- <tr>
- <td><a href="http://www.unicode.org/unicode/copyright.html"><img
- src="http://www.unicode.org/img/hb_home.gif" border="0" alt="Home"
- width="40" height="49"><img src="http://www.unicode.org/img/hb_mid.gif"
- border="0" alt="Terms of Use" width="152" height="49"><img
- src="http://www.unicode.org/img/hb_mail.gif" border="0" alt="E-mail"
- width="46" height="49"></a></td>
- </tr>
- </table>
- </center>
-</div>
-
-</body>
-
-</html>
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!
#
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
-# This file is built by ./mktables from e.g. Unicode.txt.
+# This file is built by ./mktables from e.g. UnicodeData.txt.
# Any changes made here will be lost!