uboot/doc/README.440-DDR-performance
AMCC suggested setting the PMU bit to 0 for best performance on the
PPC440 DDR controller. The common 440 DDR setup files (sdram.c &
spd_sdram.c) have been changed accordingly, so all 440 boards using
these setup routines will automatically receive this performance
increase.

Below are some benchmarks done by AMCC to demonstrate the
performance change:


----------------------------------------
SDRAM0_CFG0[PMU] = 1 (U-Boot default for Bamboo, Yosemite and Yellowstone)
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 112345 microseconds.
   (= 112345 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         256.7683       0.1248       0.1246       0.1250
Scale:        246.0157       0.1302       0.1301       0.1302
Add:          255.0316       0.1883       0.1882       0.1885
Triad:        253.1245       0.1897       0.1896       0.1899


TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000  tcp  ->
localhost
ttcp-t: 16777216 bytes in 0.28 real seconds = 454.29 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.14, calls/sec = 7268.57
ttcp-t: 0.0user 0.1sys 0:00real 60% 0i+0d 0maxrss 0+2pf 3+1506csw

----------------------------------------
SDRAM0_CFG0[PMU] = 0 (Suggested modification)
Setting PMU = 0 provides a noticeable performance improvement:
* 2% to 5% improvement in memory performance.
* Improves the Mbit/sec for TTCP benchmark by almost 77%.
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 120066 microseconds.
   (= 120066 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         262.5167       0.1221       0.1219       0.1223
Scale:        258.4856       0.1238       0.1238       0.1240
Add:          262.5404       0.1829       0.1828       0.1831
Triad:        266.8594       0.1800       0.1799       0.1802

TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000  tcp  ->
localhost
ttcp-t: 16777216 bytes in 0.16 real seconds = 804.06 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.08, calls/sec = 12864.89
ttcp-t: 0.0user 0.0sys 0:00real 46% 0i+0d 0maxrss 0+2pf 120+1csw

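The improvement figures quoted for PMU = 0 can be re-derived from the
two runs above. The following is a minimal Python sketch, not part of
U-Boot or of AMCC's test setup: the names gain_percent and stream_gains
are hypothetical, and the numbers are copied by hand from the Stream
and TTCP output in this file.

```python
# Rates copied from the two benchmark runs in this README.
# Stream rates are in MB/s, TTCP rates in Mbit/sec.
STREAM_PMU1 = {"Copy": 256.7683, "Scale": 246.0157,
               "Add": 255.0316, "Triad": 253.1245}
STREAM_PMU0 = {"Copy": 262.5167, "Scale": 258.4856,
               "Add": 262.5404, "Triad": 266.8594}
TTCP_PMU1 = 454.29
TTCP_PMU0 = 804.06


def gain_percent(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


def stream_gains():
    """Per-function Stream improvement when going from PMU = 1 to PMU = 0."""
    return {name: gain_percent(STREAM_PMU1[name], STREAM_PMU0[name])
            for name in STREAM_PMU1}


if __name__ == "__main__":
    for name, gain in stream_gains().items():
        print(f"{name:<6} {gain:5.1f}%")
    print(f"TTCP   {gain_percent(TTCP_PMU1, TTCP_PMU0):5.1f}%")
```

With these inputs the Stream gains fall between roughly 2% (Copy) and
5.4% (Triad), and the TTCP gain comes out just under 77%.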

2006-07-28, Stefan Roese <sr@denx.de>