Skip to content

Latest commit

 

History

History

contribs

This is the contribs dir for SLURM.  

SOURCE DISTRIBUTION HIERARCHY
-----------------------------

Subdirectories contain the source-code for the various contributations for
SLURM as their documentation. A quick description of the subdirectories
of the SLURM contribs distribution follows:

  arrayrun            [Adds support for array jobs]
     README                - Description of the arrayrun tool and its use
     arrayrun              - Command used to submit job arrays
     arrayrun_worker       - Back-end to the arrayrun command responsible for
                             spawning the jobs in the array

  cray                [Tools for use on Cray systems]
     etc_init_d_munge      - /etc/init.d/munge script for use with Munge
     etc_sysconfig_slurm   - /etc/sysconfig/slurm for Cray XT/XE systems
     libalps_test_programs.tar.gz - set of tools to verify ALPS/BASIL support
                             logic. Note that this currently requires:
                             * hardcoding in libsdb/basil_mysql_routines.c:
                               mysql_real_connect(handle, "localhost", NULL, NULL, "XT5istanbul"
                             * suitable /etc/my.cnf, containing at least the lines
                               [client]
                               user=basic
                               password=basic
                             * setting the APBASIL in the libalps/Makefile, e.g.
                               APBASIL := slurm/alps_simulator/apbasil.sh
                             To use, extract the files then:
                             > cd libasil/
                             > make -C alps_tests all   # runs basil parser tests
                             > make -C sdb_tests  all   # checks if database routines work
                             A tool named tuxadmin is also also included. When
                             executed with the -s or --slurm.conf option, this
                             contact the SDB to generate system-specific information
                             needed in slurm.conf (e.g. "NodeName=nid..." and
                             "PartitionName= Nodes=nid... MaxNodes=...".
     munge_build_script.sh - script to build Munge from sources for Cray system
     opt_modulefiles_slurm - enables use of Munge as soon as built
     slurm-build-script.sh - script to build SLURM from sources for Cray system.
                             set LIBROOT and SLURM_SRC environment variables
                             before use, for example:
                             LIBROOT=/ufs/slurm/build
                             SLURM_SRC=${SLURM_SRC:-${LIBROOT}/slurm-2.3.0-0.pre4}
     srun.pl               - A perl wrapper for the aprun command. Use of this
                             wrapper requires that SLURM's perlapi be installed.
                             Execute configure with the --with-srun2aprun option
                             to build and install this instead of SLURM's normal
                             srun command.

  env_cache_builder.c [C program]
     This program will build an environment variable cache file for specific 
     users or all users on the system. This can be used to prevent the aborting 
     of jobs submitted by Moab using the srun/sbatch --get-user-env option.
     Build with "make -f /dev/null env_cache_builder" and execute as user root
     on the node where the moab daemon runs.

  lua                [ LUA scripts ]
     Example LUA scripts that can serve as SLURM plugins.
     job_submit.lua - job_submit plugin that can set a job's default partition
                      using a very simple algorithm
     job_submit_license.lua - job_submit plugin that can set a job's use of
                      system licenses
     proctrack.lua  - proctrack (process tracking) plugin that implements a
                      very simple job step container using CPUSETs

  make.slurm.patch   [ Patch to "make" command for parallel build ]
     This patch will use SLURM to launch tasks across a job's current resource
     allocation. Depending upon the size of modules to be compiled, this may
     or may not improve performance. If most modules are thousands of lines
     long, the use of additional resources should more than compensate for the
     overhead of SLURM's task launch. Use with make's "-j" option within an
     existing SLURM allocation. Outside of a SLURM allocation, make's behavior
     will be unchanged. Designed for GNU make-3.81.

  mpich1.slurm.patch [ Patch to mpich1/p4 library for SLURM job task launch ]
     For SLURM based job initiations (from srun command), get the parameters 
     from environment variables as needed. This allows for a truly parallel 
     job launch using the existing "execer" mode of operation with slight 
     modification.  

  pam                [ PAM (Pluggable Authentication Module) for SLURM ]
     This PAM module will restrict who can login to a node to users who have
     been allocated resources on the node and user root.

  perlapi/           [ Perl API to SLURM source ]
     API to SLURM using perl.  Making available all SLURM command that exist
     in the SLURM proper API.

  ptrace.patch       [ Linux Kernel patch required to for TotalView use ]
     0. This has been fixed on most recent Linux kernels. Older versions of
     Linux may need this patch support TotalView.
     1. gdb and other tools cannot attach to a stopped process. The wait that 
     follows the PTRACE_ATTACH will block indefinitely.
     2. It is not possible to use PTRACE_DETACH to leave a process stopped, 
     because ptrace ignores SIGSTOPs sent by the tracing process.

  sjobexit/          [ Perl programs ]
     Tools for managing job exit code records

  sjstat             [ Perl program ]
     Lists attributes of jobs under SLURM control

  skilling.c         [ C program ]
     This program can be used to order the hostnames in a 2+ dimensional
     architecture for use in the slurm.conf file. It is used to generate
     the Hilbert number based upon a node's physical location in the 
     computer. Nodes close together in their Hilbert number will also be 
     physically close in 2-D or 3-D space, so we can reduce the 2-D or 3-D
     job placement problem to a 1-D problem that SLURM can easily handle
     by defining the node names in the slurm.conf file in order of their
     Hilbert number. If the computer is not a perfect square or cube with
     power of two size, then collapse the node list maintaining the numeric
     order based upon the Hilbert number.

  slurmdb-direct     [ Perl program ]
     Program that permits writing directly to SlurmDBD (SLURM DataBase Daemon).

  spank_core.c       [ SPANK plugin, C program ]
     A SLURM SPANK plugin that can be used to permit users to generated
     light-weight core files rather than full core files.

  time_login.c       [ C program ]
     This program will report how long a pseudo-login will take for specific
     users or all users on the system. Users identified by this program 
     will not have their environment properly set for jobs submitted through
     Moab. Build with "make -f /dev/null time_login" and execute as user root. 

  torque/            [ Wrapper Scripts for Torque migration to SLURM ]
     Helpful scripts to make transition to SLURM easier from PBS or Torque.  
     These scripts are easily updatable if there is functionality missing.