dcpiflow - generate DIGITAL Continuous Profiling Infrastructure basic-block graph for a procedure annotated with samples collected during profiling via dcpid.
dcpiflow [-samesize] [-v] [-V] procedure-name image-file [sample-files...]
- -samesize
- Assign same size to each block in the control flow graph.
- -v
- Print detailed progress information.
- -V
- Print program version number.
Dcpiflow generates a basic block graph for a named procedure that is extracted from a specified image file. The graph is annotated with samples that were collected by dcpid and stored in the named sample files. The graph is printed on standard output and can be converted to PostScript by dcpi2ps(1). The output can also be fed into dcpisource(1) for further annotation.
Instead of a procedure name, dcpiflow also accepts an address. If an image contains multiple procedures with the same name, you can give an address that falls within one of them to select the intended procedure.
Samples that do not belong to the specified procedure are ignored. If no sample files are provided, a sample count of zero is assumed for each instruction. If multiple sample files contribute to the samples for the specified procedure, these samples are merged together in the output.
Each node of the control flow graph corresponds to a basic block. If the -samesize option is specified on the command line, each block is assigned the same size. Otherwise, blocks are scaled according to the number of samples and instructions in them: if the average number of samples per instruction is small, the corresponding block is printed in a tiny font and contains just the start address of the block. The larger the average number of samples per instruction, the larger the font used to print the contents of the block.
All nodes of the control flow graph, except those printed in the smallest font, consist of multiple lines. The first line contains the last 24 bits of the (hexadecimal) start address of the basic block. The remaining lines contain the instructions of the basic block. Each line uses the following format, from left to right:
- the last 24 bits of the instruction's address (hexadecimal),
- the source line number (decimal) or 0 if the source code cannot be found,
- colon,
- the instruction's 32-bit machine code (hexadecimal),
- the instruction in mnemonics,
- the number of PC samples falling at this instruction address (decimal).
Example:
    588584 318:2e4c0000 ldq_u a2, 0(s3) 1558
    588588 318:a79d2d70 ldq at, 11632(gp) 191855
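The instruction-line format above lends itself to mechanical parsing. The following is a minimal Python sketch and not part of dcpiflow. It assumes the fields appear exactly as in the example above, with the sample count as the last whitespace-separated token. The names InstructionLine and parse_instruction_line are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class InstructionLine(NamedTuple):
    """One instruction line of a basic block (hypothetical container)."""
    address: int        # last 24 bits of the instruction's address
    source_line: int    # 0 if the source code could not be found
    machine_code: int   # 32-bit instruction word
    mnemonic: str       # instruction in mnemonics, operands included
    samples: int        # PC samples falling at this address


# address, line number, colon, 8 hex digits of machine code,
# mnemonic text, and a trailing decimal sample count.
_LINE_RE = re.compile(
    r"^\s*([0-9a-fA-F]+)\s+(\d+):([0-9a-fA-F]{8})\s+(.*?)\s+(\d+)\s*$"
)


def parse_instruction_line(text: str) -> Optional[InstructionLine]:
    """Parse one line; return None if it is not an instruction line."""
    match = _LINE_RE.match(text)
    if match is None:
        return None
    address, source_line, code, mnemonic, samples = match.groups()
    return InstructionLine(
        address=int(address, 16),
        source_line=int(source_line),
        machine_code=int(code, 16),
        mnemonic=mnemonic,
        samples=int(samples),
    )
```

Lines that do not match, such as the first line of a node, which holds only the block's start address, yield None and can be skipped by the caller.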
Typically, dcpiflow, dcpisource, and dcpi2ps are used together as follows:
    dcpiflow idle_thread /vmunix vmunix* | \
    dcpisource -f /src/kernel/kern/sched_prim.c | \
    dcpi2ps -o idle_thread.ps
During the construction of the basic block graph, dcpiflow tries to determine the targets of all computed jumps. If it fails to do so for a jump, it prints an error message saying that it could not compute jump table targets. In such cases, you can guide dcpiflow by supplying a lower and an upper bound on the value of the index register used in the jump. dcpiflow then uses these bounds to determine all possible targets of the computed jump.
To supply the bounds for a jump in an image, create a file called ".dcpijumps" in the current directory or in the home directory. This file should contain lines of the form:
    0x<image_id in hex> 0x<jump address in hex> <lower> <upper>
The file should contain one line for each image/computed-jump pair for which dcpiflow could not automatically determine the targets.
Use dcpiscan to determine the image_id for an image.
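To make the line format concrete, here is a minimal Python sketch that validates one ".dcpijumps" line. It is not part of dcpiflow and does not describe how dcpiflow itself reads the file. It assumes that the two bounds are written in decimal, as in the example later on this page. The names JumpBounds and parse_dcpijumps_line are hypothetical.

```python
from typing import NamedTuple


class JumpBounds(NamedTuple):
    """One .dcpijumps entry (hypothetical container)."""
    image_id: int
    jump_address: int
    lower: int
    upper: int


def parse_dcpijumps_line(line: str) -> JumpBounds:
    """Parse '0x<image_id> 0x<jump address> <lower> <upper>'."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    image_id_text, address_text, lower_text, upper_text = fields
    for text in (image_id_text, address_text):
        if not text.lower().startswith("0x"):
            raise ValueError(f"expected a 0x-prefixed hex value: {text!r}")
    entry = JumpBounds(
        image_id=int(image_id_text, 16),
        jump_address=int(address_text, 16),
        lower=int(lower_text, 10),   # assumption: bounds are decimal
        upper=int(upper_text, 10),
    )
    if entry.lower > entry.upper:
        raise ValueError(f"lower bound exceeds upper bound: {line!r}")
    return entry
```

Running such a check before invoking dcpiflow catches malformed lines early, for example a missing 0x prefix or swapped bounds.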
Example:
When we run dcpiflow on a particular procedure, it prints the following message to stderr:
    % dcpiflow ...
    0x12004bb10: could not compute jump table targets
The next step is to examine the disassembled code in the neighborhood of the computed jump at address 0x12004bb10. (The output of either dcpilist or dcpiflow can be used for this purpose.)
    ...
    04baf8 244:41da53b6 cmpult s5, 0xd2, t8 0
    04bafc 244:e6c009d9 beq t8, 0x12004e264 0
    04bb00 244:a79d82c0 ldq at, -32064(gp) 0
    04bb04 244:41dc0459 s4addq s5, at, t11 0
    04bb08 244:a3390000 ldl t11, 0(t11) 0
    04bb0c 244:433d0419 addq t11, gp, t11 0
    04bb10 244:6bf903e5 jmp zero, (t11), 0x12004caa8 0
    ...
The cmpult/beq instruction pair at 0x12004baf8 branches away from the jump instruction if s5 is not in the range [0..0xd1]. The ldq instruction loads the base address of the jump table associated with this computed jump into the register named at. The s4addq multiplies the index register s5 by 4, adds it to the base of the jump table to get a pointer into the jump table, and stores the resulting pointer in register t11. The ldl instruction loads the corresponding jump table entry into t11, and the following addq adds gp to t11, since the jump table entries are offsets from the contents of gp.
Because of the cmpult/beq instruction pair, we know that the jmp instruction is reachable only when the index register s5 has a value in the range [0..0xd1]. Therefore, the following entry should be placed in .dcpijumps:
    0x3249774100393048 0x12004bb10 0 209
(The image-id 0x3249774100393048 was determined by dcpiscan.)
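The reasoning in this example can be double-checked with a short Python sketch. It illustrates the arithmetic only and is not dcpiflow's implementation. All function names are hypothetical, and the gp value and table contents in the usage note are made up. The sketch derives the bounds from the cmpult immediate, lists the jump table slots addressed by s4addq, computes targets as gp plus the loaded offset, and formats the .dcpijumps entry.

```python
from typing import List, Tuple

MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def bounds_from_cmpult(limit: int) -> Tuple[int, int]:
    """cmpult is an unsigned 'index < limit' test, so the jump is
    reached only for index values 0 .. limit - 1."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 0, limit - 1


def jump_table_slot_addresses(base: int, lower: int, upper: int) -> List[int]:
    """s4addq computes base + 4 * index: one 4-byte slot per index."""
    return [base + 4 * index for index in range(lower, upper + 1)]


def jump_targets(offsets: List[int], gp: int) -> List[int]:
    """Each table entry is an offset from gp (the ldl/addq pair)."""
    return [(gp + offset) & MASK64 for offset in offsets]


def format_dcpijumps_entry(image_id: int, jump_address: int,
                           lower: int, upper: int) -> str:
    """Render one line in the .dcpijumps format described above."""
    return f"{image_id:#x} {jump_address:#x} {lower} {upper}"
```

For the example above, bounds_from_cmpult(0xd2) gives (0, 209), and formatting those bounds with image-id 0x3249774100393048 and jump address 0x12004bb10 reproduces the entry shown. With made-up values, jump_targets([-16, 0, 32], gp=0x140000000) gives the addresses 0x13ffffff0, 0x140000000, and 0x140000020.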
dcpi(1), dcpiprof(1), dcpilist(1), dcpidis(1), dcpiscan(1), dcpiepoch(1), dcpiflush(1), dcpicalc(1), dcpilabel(1), dcpi2ps(1), dcpicat(1), dcpiquit(1), dcpidiff(1), dcpitopstalls(1), dcpiwhatcg(1), dcpictl(1), dcpisource(1), dcpicc(1), dcpiversion(1), dcpiuninstall(1), dcpi2pix(1), dcpikdiff(1), dcpix(1), dcpisumxct(1), dcpistats(1), dcpid(1), dcpiformat(4), dcpiloader(5)
For more information, see the DIGITAL Continuous Profiling Infrastructure project home page (http://www.research.digital.com/SRC/dcpi/ from outside DIGITAL).
Sanjay Ghemawat, Monika Henzinger