Reducing Memory Footprint in L2SRAM

This section details techniques for reducing L2SRAM usage. The examples assume that the application requires 64K of stack space per core.

Stacks in MSMCSRAM

The default configuration places each core’s stack in that core’s L2SRAM and sizes it to 64K. One technique for reducing L2SRAM usage is to place the thread stacks in MSMCSRAM instead. For example, with 8 cores and 64KB of stack per core, the trade-off is to use 512KB of MSMCSRAM for stacks and free up 64KB of each core’s local L2SRAM. Steps:

  1. Update the application configuration file to set up shared region 0 in MSMCSRAM instead of DDR3. This creates the heap in MSMCSRAM.

    // 8 cores x 64K of stack per core + 64K for other mallocs
    var sharedHeapSize = 0x90000;
    var sharedRegionId = 0;   // shared region 0, as described in step 1
    
    var msmcmem = Program.cpu.memoryMap["MSMCSRAM"];
    
    // Configure a Shared Region with a heap in MSMC memory region
    var SharedRegion   = xdc.useModule('ti.sdo.ipc.SharedRegion');
    SharedRegion.setEntryMeta( sharedRegionId,
                               {   base: msmcmem.base,
                                   len:  sharedHeapSize,
                                   ownerProcId: 0,
                                   cacheEnable: true,
                                   createHeap: true,
                                   isValid: true,
                                   name: "MSMC_SR0",
                               });
    
  2. In omp_config.c, update __TI_omp_configure to allocate thread stacks from the heap. Replace

    __TI_omp_config_thread_stack(0, 0);
    

    with

    __TI_omp_config_thread_stack(1, 0x10000);
    
  3. Reduce Program.stack to 4K in the configuration file. This stack is used by the OpenMP runtime only during initialization. The program’s main thread starts execution on the stack configured in steps 1 and 2.

    Program.stack = 0x1000;
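
The sizes used in the steps above can be cross-checked with a small calculation. The following is a minimal sketch in plain JavaScript, not part of the TI OpenMP runtime or IPC APIs: the helper names `sharedHeapBytes` and `fitsInRegion` are hypothetical, and the 8-core count and 64K of extra heap are taken from the example in this section.

```javascript
// Hypothetical helper (not a TI API): heap size needed when every core's
// thread stack is allocated from one shared heap, plus room for other mallocs.
function sharedHeapBytes(numCores, stackBytesPerCore, extraHeapBytes) {
    return numCores * stackBytesPerCore + extraHeapBytes;
}

// Hypothetical helper: true when the heap fits in a memory region of the
// given length in bytes.
function fitsInRegion(heapBytes, regionLenBytes) {
    return heapBytes <= regionLenBytes;
}

var KB = 1024;
var numCores = 8;            // assumption: 8 cores, as in the example above
var stackPerCore = 64 * KB;  // 0x10000, the size passed to __TI_omp_config_thread_stack
var extraHeap = 64 * KB;     // headroom for other mallocs

var heapSize = sharedHeapBytes(numCores, stackPerCore, extraHeap);
console.log("0x" + heapSize.toString(16));  // prints "0x90000"
```

The result matches the `sharedHeapSize` value in step 1. In a configuration script the same arithmetic could presumably be compared against the length of the MSMCSRAM memory map entry before calling `SharedRegion.setEntryMeta`, so that a change in core count or stack size that no longer fits is caught when the configuration is built.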
    

Warning

Placing thread stacks in DDR can significantly degrade performance: register spills inside frequently executed loops in the application then go to the slow DDR-resident stack.