Recent pii3 trouble

From: Konstantin Olchanski (olchansk@ux1.phy.bnl.gov)
Date: Thu Jul 13 2000 - 16:52:58 EDT

    Dear pii3.brahms.bnl.gov users, apparently pii3 has developed a peculiar
    problem. This problem affects the 2.2 Linux kernel series
    and is likely to stay with us for a while (AFAIK, the coming 2.4 Linux
    kernel series has the same problem). The linux-kernel
    hackers are aware of the problem and hopefully will come up with a
    solution, eventually.
    
    This is what is happening:
    
    When one or more processes consume all available memory (both real and
    swap), Linux will try to free up some memory by killing off processes.
    Unfortunately the algorithm it uses to decide which processes to kill is
    flawed. Often, instead of killing the offending user processes (the
    ones that consumed all the memory), it kills critical system processes,
    such as the name server (named) or the NIS server (ypserv). I also
    see some dead httpd and mysqld processes, but both the Apache web server
    and the MySQL database server seem to survive the processcide.
    
    I have identified two sources of unlimited memory consumption:
    
    - user processes (e.g. netscape and root) and
    - CGI scripts run by the web server.
    
    To at least partially protect the critical services running
    on pii3 (named, NIS, system logger), I have decided to implement
    memory limits for both of the above sources. The current memory limits for
    all users are essentially "unlimited".
    
    For the user processes, I will set the soft memory limit to 200 Mbytes. This
    should have no effect on user programs since pii3 has only 128 Mbytes
    of real memory and running processes that consume more than that is a bug.
    Also note that the soft memory limits can be raised by the user when needed.
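
    As a rough illustration of how a soft limit behaves from inside a
    program, here is a minimal Python sketch. It assumes the limit is applied
    to the process address space (RLIMIT_AS); this message does not say which
    resource limit will actually be used, and the helper name set_soft_limit
    is made up for the example. The hard limit is left untouched, which is
    what allows a user to raise the soft limit again later.

```python
import resource

MB = 1024 * 1024


def set_soft_limit(limit_bytes, which=resource.RLIMIT_AS):
    """Set the soft limit for `which`, leaving the hard limit unchanged.

    Assumption: RLIMIT_AS (address space) stands in for "memory limit".
    The requested value is clamped to the hard limit, because an
    unprivileged process cannot set its soft limit above the hard limit.
    Returns the (soft, hard) pair that is in effect afterwards.
    """
    _old_soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(which, (limit_bytes, hard))
    return resource.getrlimit(which)
```

    Calling set_soft_limit(200 * MB) would apply the 200 Mbyte soft limit to
    the current process and its children; calling it again with a larger
    value raises the limit, up to whatever the hard limit allows.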
    
    For the CGI scripts run from the web server I will try to set the memory
    limit to 64 Mbytes.
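
    The general idea of limiting a child program, as opposed to a login
    session, can be sketched in Python as follows. This is not how the web
    server itself will be configured; it only shows the mechanism of setting
    a limit between fork() and exec(). It again assumes RLIMIT_AS as the
    limited resource, and the names CGI_LIMIT and run_limited are
    hypothetical. Both the soft and the hard limit are lowered here so that
    the child cannot raise the limit itself; that is a choice made for this
    sketch.

```python
import resource
import subprocess

MB = 1024 * 1024
CGI_LIMIT = 64 * MB  # hypothetical constant for the 64 Mbyte limit


def _limit_child_memory():
    # Runs in the child process after fork() and before exec().
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    limit = CGI_LIMIT
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)  # cannot exceed an existing hard limit
    # Lower soft and hard together so the child cannot undo the limit.
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))


def run_limited(argv):
    """Run argv as a child process with its address space capped."""
    return subprocess.run(
        argv,
        preexec_fn=_limit_child_memory,
        capture_output=True,
        text=True,
    )
```

    For example, run_limited(["/bin/sh", "-c", "ulimit -v"]) runs a shell
    that reports its own virtual memory limit in kilobytes, which is a quick
    way to check that the limit reached the child.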
    
    Hopefully this will make pii3.brahms.bnl.gov a little bit more stable.
    
    pii3 will be rebooted and an additional message will go out when the memory
    limits are actually implemented.
    
    -- 
    Konstantin Olchanski
    Physics Department, Brookhaven National Laboratory, Long Island, New York
    olchansk@bnl.gov
    


