Commit | Line | Data |
---|---|---|
11aea360 LW |
1 | Article 484 of comp.lang.perl: |
2 | Xref: netlabs comp.lang.perl:484 comp.lang.c:983 alt.sources:134 | |
3 | Path: netlabs!psinntp!iggy.GW.Vitalink.COM!lll-winken!sun-barr!cronkite.Central.Sun.COM!spdev!texsun!convex!tchrist | |
4 | From: tchrist@convex.com (Tom Christiansen) | |
5 | Newsgroups: comp.lang.perl,comp.lang.c,alt.sources | |
6 | Subject: pstruct -- a C structure formatter; AKA c2ph, a C to perl header translator | |
7 | Keywords: C perl tranlator | |
8 | Message-ID: <1991Jul25.081021.8104@convex.com> | |
9 | Date: 25 Jul 91 08:10:21 GMT | |
10 | Sender: usenet@convex.com (news access account) | |
11 | Followup-To: comp.lang.perl | |
12 | Organization: CONVEX Computer Corporation, Richardson, Tx., USA | |
13 | Lines: 1208 | |
14 | Nntp-Posting-Host: pixel.convex.com | |
15 | ||
16 | Once upon a time, I wrote a program called pstruct. It was a perl | |
17 | program that tried to parse out C structures and display their member | |
18 | offsets for you. This was especially useful for people looking at | |
19 | binary dumps or poking around the kernel. | |
20 | ||
21 | Pstruct was not a pretty program. Neither was it particularly robust. | |
22 | The problem, you see, was that the C compiler was much better at parsing | |
23 | C than I could ever hope to be. | |
24 | ||
25 | So I got smart: I decided to be lazy and let the C compiler parse the C, | |
26 | which would spit out debugger stabs for me to read. These were much | |
27 | easier to parse. It's still not a pretty program, but at least it's more | |
28 | robust. | |
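
As a rough illustration of why stabs are easier to parse than C, here is a minimal sketch that pulls member offsets out of one structure stab. The stab string is a hand-written sample in the general shape of what GCC emits with -g, not output captured from a real compiler. The function name parse_struct_stab is hypothetical, and the pattern only handles flat structures whose members use plain numeric type references.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-written sample (an assumption, not real compiler output) for
#   struct timeval { long tv_sec; long tv_usec; };
my $stab = '.stabs "timeval:T20=s8tv_sec:3,0,32;tv_usec:3,32,32;;",128,0,0,0';

# Return (tag, byte size, [name, byte offset, byte size], ...) for a
# flat structure stab, or an empty list if the line does not match.
sub parse_struct_stab {
    my ($line) = @_;
    my ($tag, $size, $body) = $line =~ /"(\w+):T\d+=s(\d+)(.*?);;"/
        or return;
    my @members;
    # Each member looks like  name:type,bit_offset,bit_size
    for my $field (split /;/, $body) {
        my ($name, $bit_off, $bit_len) =
            $field =~ /^(\w+):[^,]*,(\d+),(\d+)$/ or next;
        push @members, [$name, $bit_off / 8, $bit_len / 8];
    }
    return ($tag, $size, @members);
}

my ($tag, $size, @members) = parse_struct_stab($stab);
printf "struct %s {   /* %d bytes */\n", $tag, $size;
printf "  %-20s %03x %d\n", "$tag.$_->[0]", $_->[1], $_->[2] for @members;
print "}\n";
```

Nested structures, unions, arrays and pointers all add more syntax to the member's type field, which this sketch deliberately ignores.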
29 | ||
30 | Pstruct takes any .c or .h files, or preferably .s ones, since that's | |
31 | the format it is going to massage them into anyway, and spits out | |
32 | listings like this: | |
33 | ||
34 | struct tty { | |
35 | int tty.t_locker 000 4 | |
36 | int tty.t_mutex_index 004 4 | |
37 | struct tty * tty.t_tp_virt 008 4 | |
38 | struct clist tty.t_rawq 00c 20 | |
39 | int tty.t_rawq.c_cc 00c 4 | |
40 | int tty.t_rawq.c_cmax 010 4 | |
41 | int tty.t_rawq.c_cfx 014 4 | |
42 | int tty.t_rawq.c_clx 018 4 | |
43 | struct tty * tty.t_rawq.c_tp_cpu 01c 4 | |
44 | struct tty * tty.t_rawq.c_tp_iop 020 4 | |
45 | unsigned char * tty.t_rawq.c_buf_cpu 024 4 | |
46 | unsigned char * tty.t_rawq.c_buf_iop 028 4 | |
47 | struct clist tty.t_canq 02c 20 | |
48 | int tty.t_canq.c_cc 02c 4 | |
49 | int tty.t_canq.c_cmax 030 4 | |
50 | int tty.t_canq.c_cfx 034 4 | |
51 | int tty.t_canq.c_clx 038 4 | |
52 | struct tty * tty.t_canq.c_tp_cpu 03c 4 | |
53 | struct tty * tty.t_canq.c_tp_iop 040 4 | |
54 | unsigned char * tty.t_canq.c_buf_cpu 044 4 | |
55 | unsigned char * tty.t_canq.c_buf_iop 048 4 | |
56 | struct clist tty.t_outq 04c 20 | |
57 | int tty.t_outq.c_cc 04c 4 | |
58 | int tty.t_outq.c_cmax 050 4 | |
59 | int tty.t_outq.c_cfx 054 4 | |
60 | int tty.t_outq.c_clx 058 4 | |
61 | struct tty * tty.t_outq.c_tp_cpu 05c 4 | |
62 | struct tty * tty.t_outq.c_tp_iop 060 4 | |
63 | unsigned char * tty.t_outq.c_buf_cpu 064 4 | |
64 | unsigned char * tty.t_outq.c_buf_iop 068 4 | |
65 | (*int)() tty.t_oproc_cpu 06c 4 | |
66 | (*int)() tty.t_oproc_iop 070 4 | |
67 | (*int)() tty.t_stopproc_cpu 074 4 | |
68 | (*int)() tty.t_stopproc_iop 078 4 | |
69 | struct thread * tty.t_rsel 07c 4 | |
70 | ||
71 | etc. | |
72 | ||
73 | ||
74 | Actually, this was generated by a particular set of options. You can control | |
75 | the formatting of each column, whether you prefer wide or fat, hex or decimal, | |
76 | leading zeroes or whatever. | |
77 | ||
78 | All you need to be able to use this is a C compiler that generates | |
79 | BSD/GCC-style stabs. The -g option on native BSD compilers and GCC | |
80 | should get this for you. | |
81 | ||
82 | To learn more, just type a bogus option, like -\?, and a long usage message | |
83 | will be provided. There are a fair number of possibilities. | |
84 | ||
85 | If you're only a C programmer, then this is the end of the message for you. | |
86 | You can quit right now, and if you care to, save off the source and run it | |
87 | when you feel like it. Or not. | |
88 | ||
89 | ||
90 | ||
91 | But if you're a perl programmer, then for you I have something much more | |
92 | wondrous than just a structure offset printer. | |
93 | ||
94 | You see, if you call pstruct by its other incybernation, c2ph, you have a code | |
95 | generator that translates C code into perl code! Well, structure and union | |
96 | declarations at least, but that's quite a bit. | |
97 | ||
98 | Prior to this point, anyone programming in perl who wanted to interact | |
99 | with C programs, like the kernel, was forced to guess the layouts of the C | |
100 | structures, and then hardwire these into his program. Of course, when you | |
101 | took your wonderfully crafted program to a system where the sgtty structure | |
102 | was laid out differently, your program broke. Which is a shame. | |
103 | ||
104 | We've had Larry's h2ph translator, which helped, but that only works on | |
105 | cpp symbols, not real C, which was also very much needed. What I offer | |
106 | you is a symbolic way of getting at all the C structures. I've couched | |
107 | them in terms of packages and functions. Consider the following program: | |
108 | ||
109 | #!/usr/local/bin/perl | |
110 | ||
111 | require 'syscall.ph'; | |
112 | require 'sys/time.ph'; | |
113 | require 'sys/resource.ph'; | |
114 | ||
115 | $ru = "\0" x &rusage'sizeof(); | |
116 | ||
117 | syscall(&SYS_getrusage, &RUSAGE_SELF, $ru) && die "getrusage: $!"; | |
118 | ||
119 | @ru = unpack($t = &rusage'typedef(), $ru); | |
120 | ||
121 | $utime = $ru[ &rusage'ru_utime + &timeval'tv_sec ] | |
122 | + ($ru[ &rusage'ru_utime + &timeval'tv_usec ]) / 1e6; | |
123 | ||
124 | $stime = $ru[ &rusage'ru_stime + &timeval'tv_sec ] | |
125 | + ($ru[ &rusage'ru_stime + &timeval'tv_usec ]) / 1e6; | |
126 | ||
127 | printf "you have used %8.3fu+%8.3fs seconds.\n", $utime, $stime; | |
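
The indexing idea in that program can be tried without any generated headers. What follows is a minimal sketch in which the packages timeval and mini_rusage are hand-written stand-ins for what c2ph would generate. The layout (two timevals of two 32-bit longs each) is an assumption for illustration, not any real system's rusage, and the buffer is faked with pack rather than filled by the kernel. It uses the modern :: package separator in place of the older ' spelling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for generated definitions. The real ones come
# from c2ph and reflect the local compiler's layout.
package timeval {
    sub tv_sec  { 0 }
    sub tv_usec { 1 }
    sub typedef { 'l2' }       # two signed 32-bit longs (assumed)
}
package mini_rusage {
    sub ru_utime { 0 }         # index of the first element of ru_utime
    sub ru_stime { 2 }         # index of the first element of ru_stime
    sub typedef  { timeval::typedef() x 2 }
}

package main;

# Fake a kernel reply: 1.25s of user time, 0.5s of system time.
my $buf = pack(mini_rusage::typedef(), 1, 250_000, 0, 500_000);
my @ru  = unpack(mini_rusage::typedef(), $buf);

# Combine the tv_sec and tv_usec elements of the timeval at $base.
sub seconds {
    my ($base) = @_;
    return $ru[$base + timeval::tv_sec()]
         + $ru[$base + timeval::tv_usec()] / 1e6;
}

printf "%.3fu + %.3fs seconds\n",
    seconds(mini_rusage::ru_utime()), seconds(mini_rusage::ru_stime());
```

Nothing in the main program mentions a numeric offset, so only the two packages would need to change on a system with a different layout.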
128 | ||
129 | ||
130 | As you see, the name of the package is the name of the structure. Regular | |
131 | fields are just their own names. Plus the following accessor functions are | |
132 | provided for your convenience: | |
133 | ||
134 | struct This takes no arguments, and is merely the number of first-level | |
135 | elements in the structure. You would use this for indexing | |
136 | into arrays of structures, perhaps like this | |
137 | ||
138 | ||
139 | $usec = $u[ &user'u_utimer | |
140 | + (&ITIMER_VIRTUAL * &itimerval'struct) | |
141 | + &itimerval'it_value | |
142 | + &timeval'tv_usec | |
143 | ]; | |
144 | ||
145 | sizeof Returns the bytes in the structure, or the member if | |
146 | you pass it an argument, such as | |
147 | ||
148 | &rusage'sizeof(&rusage'ru_utime) | |
149 | ||
150 | typedef This is the perl format definition for passing to pack and | |
151 | unpack. If you ask for the typedef of a nothing, you get | |
152 | the whole structure, otherwise you get that of the member | |
153 | you ask for. Padding is taken care of, as is the magic to | |
154 | guarantee that a union is unpacked into all its aliases. | |
155 | Bitfields are not quite yet supported however. | |
156 | ||
157 | offsetof This function is the byte offset into the array of that | |
158 | member. You may wish to use this for indexing directly | |
159 | into the packed structure with vec() if you're too lazy | |
160 | to unpack it. | |
161 | ||
162 | typeof Not to be confused with the typedef accessor function, this | |
163 | one returns the C type of that field. This would allow | |
164 | you to print out a nice structured pretty print of some | |
165 | structure without knowing anything about it beforehand. | |
166 | No args to this one is a noop. Someday I'll post such | |
167 | a thing to dump out your u structure for you. | |
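
To make the accessors concrete, here is a small sketch of how struct, sizeof, typedef and offsetof might fit together for an array of structures. The package ent is written by hand for a made-up structure and only imitates the interface described above. The real generated code is derived from the compiler's stabs and may look quite different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical accessors for
#   struct ent { short id; short len; unsigned char tag; };
# assuming 2-byte shorts and one trailing pad byte (6 bytes in all).
package ent {
    sub id { 0 }  sub len { 1 }  sub tag { 2 }   # element indices
    my @off  = (0, 2, 4);                        # byte offsets
    my @size = (2, 2, 1);                        # byte sizes
    sub struct   { 3 }                           # first-level elements
    sub typedef  { 's2Cx' }                      # x skips the pad byte
    sub sizeof   { @_ ? $size[$_[0]] : 6 }
    sub offsetof { $off[$_[0]] }
}

package main;

# Build a packed array of three structures.
my @rows = ([1, 10, 65], [2, 20, 66], [3, 30, 67]);
my $buf  = join '', map { pack(ent::typedef(), @$_) } @rows;

# Unpack the whole array, then index the len member of element 2.
my @e   = unpack(ent::typedef() x @rows, $buf);
my $len = $e[2 * ent::struct() + ent::len()];

# Or skip unpack and read one byte of element 1 directly with vec().
my $tag = vec($buf, 1 * ent::sizeof() + ent::offsetof(ent::tag()), 8);

print "len=$len tag=", chr($tag), "\n";
```

The vec() call is shown on a one-byte member on purpose: for wider elements vec() assumes big-endian order, so reading a native short or long that way is only safe on a big-endian machine.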
168 | ||
169 | ||
170 | The way I see this being used is basically like this: | |
171 | ||
172 | % h2ph <some_include_file.h > /usr/lib/perl/tmp.ph | |
173 | % c2ph some_include_file.h >> /usr/lib/perl/tmp.ph | |
174 | % install | |
175 | ||
176 | It's a little trickier with c2ph because you have to get the includes right. | |
177 | I can't know this for your system, but it's not usually too terribly difficult. | |
178 | ||
179 | The code isn't pretty as I mentioned -- I never thought it would be a 1000- | |
180 | line program when I started, or I might not have begun. :-) But I would have | |
181 | been less cavalier in how the parts of the program communicated with each | |
182 | other, etc. It might also have helped if I didn't have to divine the makeup | |
183 | of the stabs on the fly, and then account for micro differences between my | |
184 | compiler and gcc. | |
185 | ||
186 | Anyway, here it is. Should run on perl v4 or greater. Maybe less. | |
187 | ||
188 | ||
189 | --tom | |
190 | ||
191 |