uboot/doc/README.memory-test
The most frequent cause of problems when porting U-Boot to new
hardware, or when using a sloppy port on some board, is memory errors.
In most cases these are not caused by failing hardware, but by
incorrect initialization of the memory controller.  It is therefore
a good idea to always test whether the memory is working correctly
before looking for any other potential cause of a problem.

U-Boot implements three different approaches to memory testing:

1. The get_ram_size() function (see "common/memsize.c").

   This function is supposed to be used in each and every U-Boot port
   to determine the presence and actual size of each of the potential
   memory banks on this piece of hardware.  The code is supposed to be
   very fast, so running it on each reboot does not hurt.  It is a
   little known and generally underrated fact that this code will also
   catch 99% of hardware related (i.e. reliably reproducible) memory
   errors.  It is strongly recommended to always use this function, in
   each and every port of U-Boot.

2. The "mtest" command.

   This is probably the best known memory test utility in U-Boot.
   Unfortunately, it is also the most problematic and the least
   useful one.

   There are a number of serious problems with this command:

   - It is terribly slow.  Running "mtest" on the whole system RAM
     takes a _long_ time before there is any significance in the fact
     that no errors have been found so far.

   - It is difficult to configure and to use, and any errors here
     will reliably crash or hang your system.  "mtest" is dumb and has
     no knowledge about memory ranges that may be in use for other
     purposes, like exception code, U-Boot code and data, stack,
     malloc arena, video buffer, log buffer, etc.  If you let it, it
     will happily "test" all such areas, overwriting them and crashing
     or hanging the system.

   - A large number of systems are seriously misconfigured.  The
     original idea was to test basically the whole system RAM,
     exempting only the areas used by U-Boot itself - on most systems
     these are the areas used for the exception vectors (usually at
     the very lower end of system memory) and for U-Boot (code, data,
     etc. - see above; these are usually at the very upper end of
     system memory).  But experience has shown that a very large
     number of ports use pretty much bogus settings of
     CONFIG_SYS_MEMTEST_START and CONFIG_SYS_MEMTEST_END; this results
     in useless tests (because the range is too small and/or badly
     located) or in critical failures (system crashes).

   Because of these issues, the "mtest" command is considered
   deprecated.  It should not be enabled in most normal ports of
   U-Boot, especially not in production.  If you really need a memory
   test, see items 1 (above) and 3 (below).

3. The most thorough memory test facility is available as part of the
   POST (Power-On Self Test) sub-system, see "post/drivers/memory.c".

   If you really need to perform memory tests (for example, because
   it is a mandatory part of your requirement specification), then
   enable this test, which is generic and should work on all
   architectures.
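
To illustrate item 1: the idea behind a size probe in the style of
get_ram_size() is to write a distinct pattern at every power-of-two
offset and then read the patterns back; where an address line is not
connected, the access wraps around to a lower address and the pattern
does not match.  The following is only a host-side sketch under stated
assumptions, not the code from "common/memsize.c": struct sim_ram,
sim_write(), sim_read() and probe_ram_words() are hypothetical names,
the RAM is simulated by an array, and saving and restoring the
original memory contents is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: largest bank size (in words) the controller could map. */
#define SIM_MAX_WORDS 1024

/*
 * Hypothetical model of a memory bank whose upper address lines are
 * not populated: accesses beyond real_words alias to lower addresses.
 */
struct sim_ram {
	unsigned long cells[SIM_MAX_WORDS];
	size_t real_words;	/* power of two, 1..SIM_MAX_WORDS */
};

static void sim_write(struct sim_ram *r, size_t idx, unsigned long val)
{
	r->cells[idx % r->real_words] = val;
}

static unsigned long sim_read(const struct sim_ram *r, size_t idx)
{
	return r->cells[idx % r->real_words];
}

/*
 * Return the number of words that respond without aliasing, or 0 if
 * even word 0 does not hold its value.  max_words must be a power of
 * two.
 */
static size_t probe_ram_words(struct sim_ram *r, size_t max_words)
{
	size_t cnt;

	assert(r->real_words > 0 && r->real_words <= SIM_MAX_WORDS);

	/* Write a unique pattern at each power-of-two offset, top down. */
	for (cnt = max_words >> 1; cnt > 0; cnt >>= 1)
		sim_write(r, cnt, ~(unsigned long)cnt);
	sim_write(r, 0, 0);

	if (sim_read(r, 0) != 0)
		return 0;

	/* Read back bottom up; the first mismatch marks the real size. */
	for (cnt = 1; cnt < max_words; cnt <<= 1)
		if (sim_read(r, cnt) != ~(unsigned long)cnt)
			return cnt;

	return max_words;
}
```

With real_words set to 256 and a maximum of 1024 words,
probe_ram_words() reports 256, because the access to offset 256
aliases to word 0 and reads back the wrong pattern.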
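The configuration problems of "mtest" described in item 2 boil down
to a test range that overlaps memory which is in use.  A minimal
sketch of the kind of check a port could make before settling on
CONFIG_SYS_MEMTEST_START and CONFIG_SYS_MEMTEST_END follows; struct
mem_region and memtest_range_is_safe() are hypothetical names and not
part of U-Boot, and the addresses of the reserved regions have to
come from the actual memory map of the board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor of a region that a memory test must not touch. */
struct mem_region {
	uintptr_t start;	/* first byte of the region */
	uintptr_t end;		/* one past the last byte of the region */
};

/*
 * Return true if [start, end) is non-empty and overlaps none of the
 * n reserved regions.
 */
static bool memtest_range_is_safe(uintptr_t start, uintptr_t end,
				  const struct mem_region *reserved, size_t n)
{
	size_t i;

	assert(reserved != NULL || n == 0);

	if (start >= end)
		return false;

	for (i = 0; i < n; i++)
		if (start < reserved[i].end && reserved[i].start < end)
			return false;

	return true;
}
```

A range that starts above the exception vectors and ends below the
area occupied by U-Boot passes the check; a range covering the whole
RAM does not.
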
WARNING:

It should be pointed out that _all_ these memory tests have one
fundamental, unfixable design flaw:  they are based on the assumption
that memory errors can be found by writing to and reading from memory.
Unfortunately, this is only true for the relatively harmless, usually
static errors like shorts between data or address lines, unconnected
pins, etc.  All the really nasty errors which will first turn your
hair gray, only to make you tear it out later, are dynamic errors,
which usually happen not with simple read or write cycles on the bus,
but when performing back-to-back data transfers in burst mode.  Such
accesses usually happen only for certain DMA operations, or for heavy
cache use (instruction fetching, cache flushing).  So far I am not
aware of any freely available code that implements a generic and
efficient memory test like that.  The best known test case to stress
a system like that is to boot Linux with root file system mounted over
NFS, and then build some larger software package natively (say,
compile a Linux kernel on the system) - this will cause enough context
switches, network traffic (and thus DMA transfers from the network
controller), varying RAM use, etc. to trigger any weak spots in this
area.
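
For comparison, the static errors mentioned above are easy to catch
with a write/read pattern.  The following sketch runs a walking-ones
test over a simulated 32-bit cell whose data bus may have bits stuck
low; struct sim_cell, cell_write(), cell_read() and data_line_test()
are hypothetical names used for illustration only, and this is not
the code from "post/drivers/memory.c".  A test of this kind says
nothing about the burst-mode errors described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one 32-bit memory cell behind a faulty data
 * bus: every bit set in stuck_low is lost on write and reads as 0.
 */
struct sim_cell {
	uint32_t value;
	uint32_t stuck_low;
};

static void cell_write(struct sim_cell *c, uint32_t v)
{
	c->value = v & ~c->stuck_low;
}

static uint32_t cell_read(const struct sim_cell *c)
{
	return c->value;
}

/*
 * Walking-ones data line test: write a single 1 bit in each position
 * and read it back.  Returns 0 if all bits work, otherwise the mask
 * of the first failing bit.
 */
static uint32_t data_line_test(struct sim_cell *c)
{
	uint32_t pattern;

	assert(c != NULL);

	for (pattern = 1; pattern != 0; pattern <<= 1) {
		cell_write(c, pattern);
		if (cell_read(c) != pattern)
			return pattern;
	}

	return 0;
}
```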

Note: An attempt was made once to implement such a test to catch
memory problems on a specific board.  The code is pretty much board
specific (for example, it includes setting specific GPIO signals to
provide triggers for an attached logic analyzer), but you can get an
idea of how it works: see "examples/standalone/test_burst*".

Note 2: Ironically enough, the "test_burst" code did not catch any RAM
errors, not a single one ever.  The problems this code was supposed
to catch did not happen when accessing the RAM, but when reading from
NOR flash.