By using the C6000 profiling tools, you can identify the time-critical sections of your code that need to be rewritten as linear assembly. The source code that you write for the assembly optimizer is similar to assembly source code. However, linear assembly code does not need to be partitioned, scheduled, or register allocated. The intention is for you to let the assembly optimizer determine this information for you. When you are writing linear assembly code, you need to know about these items:
Your linear assembly file can be a combination of linear assembly code segments and regular assembly source. Use the assembly optimizer directives to differentiate the assembly optimizer code from the regular assembly code and to provide the assembly optimizer with additional information about your code. The assembly optimizer directives are described in Section 5.4.
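As a minimal sketch of how the directives mark off a linear assembly region, the fragment below wraps a dot-product loop in `.cproc` and `.endproc`. The routine name `_dotp` and all symbol names are illustrative assumptions, not taken from this section. Section 5.4 remains the authority on each directive's syntax.

```asm
; Hypothetical example: routine and symbol names are assumptions.
_dotp:  .cproc  a_ptr, b_ptr, cnt     ; start of a C-callable linear assembly region
        .reg    a, b, prod, sum       ; symbolic names; register choice left to the optimizer

        ZERO    sum                   ; sum = 0
loop:
        LDH     *a_ptr++, a           ; a = next halfword from the first array
        LDH     *b_ptr++, b           ; b = next halfword from the second array
        MPY     a, b, prod            ; prod = a * b
        ADD     prod, sum, sum        ; sum += prod
        SUB     cnt, 1, cnt           ; cnt--
 [cnt]  B       loop                  ; branch back while cnt != 0

        .return sum                   ; value handed back to the caller
        .endproc                      ; end of the linear assembly region
```

The instructions appear in plain logical order, with no parallel bars, no register numbers, and no functional units. Those details are what the assembly optimizer is expected to supply.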
The compiler options in Table 5-1 affect the behavior of the assembly optimizer.
Option | Effect | See |
---|---|---|
--ap_extension | Changes the default extension for assembly optimizer source files | Section 3.3.10 |
--ap_file | Changes how assembly optimizer source files are identified | Section 3.3.8 |
--disable_software_pipelining | Turns off software pipelining | Section 4.6.1 |
--debug_software_pipeline | Generates verbose software pipelining information | Section 4.6.2 |
--interrupt_threshold=n | Specifies an interrupt threshold value | Section 3.12 |
--keep_asm | Keeps the assembly language (.asm) file | Section 3.3.2 |
--no_bad_aliases | Presumes no memory aliasing | Section 4.12.3 |
--opt_for_space=n | Controls code size on four levels (n=0, 1, 2, or 3) | Section 4.9 |
--opt_level=n | Increases level of optimization (n=0, 1, 2, or 3) | Section 4.1 |
--quiet | Suppresses progress messages | Section 3.3.2 |
--silicon_version=n | Selects target version | Section 3.3.5 |
--skip_assembler | Compiles or assembly optimizes only (does not assemble) | Section 3.3.2 |
--speculate_loads=n | Allows speculative execution of loads with bounded address ranges | Section 4.6.3 |
When you are writing your linear assembly, your code does not need to indicate pipeline latency, register usage, or which functional unit is being used.
As with other code generation tools, you might need to modify your linear assembly code until you are satisfied with its performance. When you do this, you will probably want to add more detail to your linear assembly. For example, you might want to partition or assign some registers.
NOTE
Do Not Use Scheduled Assembly Code as Source
The assembly optimizer assumes that the instructions in the input file are placed in the logical order in which you would like them to occur (that is, linear assembly code). Parallel instructions are illegal.
If the compiler cannot make your instructions linear (non-parallel), it produces an error message. The compiler assumes instructions occur in the order the instructions appear in the file. Scheduled code is illegal (even non-parallel scheduled code). Scheduled code may not be detected by the compiler but the resulting output may not be what you intended.
The linear assembly source programs consist of source statements that can contain assembly optimizer directives, assembly language instructions, and comments. See Section 5.3.1 for more information on the elements of a source statement.
Registers can be assigned explicitly to user symbols. Alternatively, symbols can be assigned to the A-side or B-side leaving the compiler to do the actual register allocation. See Section 5.3.2 for information on specifying registers.
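The two styles of register specification might look like the sketch below. The symbol names are hypothetical, and the fragment is assumed to sit inside a procedure region. The `.map` form shown (symbol/register) is written from memory of the directive descriptions, so treat it as an assumption and check it against Section 5.3.2 before relying on it.

```asm
; Hypothetical symbols; a sketch only, not a complete procedure.
        .reg    tmp               ; no side given: optimizer picks both side and register
        .rega   a_val, sum_a      ; request A-side registers for these symbols
        .regb   b_val, sum_b      ; request B-side registers for these symbols
        .map    cnt/B0            ; assumed syntax: bind the symbol cnt to register B0
```

Assigning only a side leaves the optimizer more freedom than binding an exact register, so it is usually the lighter-weight first step when tuning.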
The functional unit specifier is optional in linear assembly code. Data path information is respected; unit information is ignored.
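A short sketch of the three ways an instruction might be written, using hypothetical symbols `a`, `b`, `prod`, and `sum`. The specifier spellings are assumed to follow the usual C6000 forms.

```asm
; Hypothetical symbols; each line is independent of the others.
        MPY             a, b, prod      ; no specifier: optimizer chooses unit and side
        MPY     .1      a, b, prod      ; side 1 requested; optimizer still picks the unit
        ADD     .L2     prod, sum, sum  ; side 2 is respected; the ".L" unit part is ignored
```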
The assembly optimizer attaches the comments on instructions from the input linear assembly to the output file. It attaches the 2-tuple <x, y> to the comments to specify which iteration and cycle of the loop an instruction is on in the software pipeline. The zero-based number x represents the iteration the instruction is on during the first execution of the kernel. The zero-based number y represents the cycle the instruction is scheduled on within a single iteration of the loop. See Section 5.3.4 for an illustration of the use of source comments and the resulting assembly optimizer output.