University of Pretoria
Operational / Internal Site

FEKO simulations with SLURM

A 24 process FEKO Gold license is available for use on the clusters.

As the license restricts the number of parallel processes, simulation parameters must be specified carefully to avoid queued jobs aborting due to license issues. The FEKO Gold license seats are defined per physical CPU, so the FEKO and SLURM parameters must be set to match this mode of operation. Furthermore, extra environment variables must be set because FEKO's default CPU detection code does not work well with SLURM's cgroups-based CPU management.
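The relationship between CPU cores and license seats can be sketched as a small calculation. This is only an illustration under stated assumptions: that one seat is consumed per physical CPU (socket) the job occupies, that the job's cores are packed onto as few sockets as possible, and that the nodes have 6 cores per socket (inferred from the templates, not stated anywhere on this page). The function name feko_licenses_needed is hypothetical.

```shell
#!/bin/bash
# ASSUMPTION: one license seat per physical CPU (socket) the job occupies,
# with the job's cores packed onto as few sockets as possible.
# ASSUMPTION: 6 cores per socket, inferred from the templates on this page.
feko_licenses_needed() {
    local cores="$1" cores_per_socket="$2"
    # Integer ceiling of cores / cores_per_socket.
    echo $(( (cores + cores_per_socket - 1) / cores_per_socket ))
}

feko_licenses_needed 4 6     # 4-core job
feko_licenses_needed 6 6     # 6-core job
feko_licenses_needed 12 6    # 12-core job
```

Under these assumptions the three calls print 1, 1 and 2, which matches the --licenses values used in the templates.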

Below there are SLURM templates for four types of FEKO simulations. The first two are primarily for undergraduate project students. The last two are only usable by certain postgraduate students and staff (the SLURM system limits the number of CPUs and memory available to different classes of users).

Please use the smallest feasible values for the simulation so that our cluster usage can be optimised. Also note that FEKO may switch to out-of-core memory if the memory your simulation requires exceeds the allocated memory. This mode of operation is usually indicated early in the log file. If you intend to run in this mode, use a low --mem-per-cpu setting so that the rest of the system's memory remains available to other users and the system cache.
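Because --mem-per-cpu is a per-CPU figure, the total memory a job asks for is that value multiplied by the number of CPUs requested. The sketch below simply does this arithmetic for a few CPU counts and memory values mentioned on this page; the function name total_job_mem_mb is hypothetical.

```shell
#!/bin/bash
# Total memory a single-task job requests, in MB:
# --cpus-per-task multiplied by --mem-per-cpu.
total_job_mem_mb() {
    local cpus="$1" mem_per_cpu_mb="$2"
    echo $(( cpus * mem_per_cpu_mb ))
}

total_job_mem_mb 1 1000     # single CPU, small simulation
total_job_mem_mb 4 4000     # 4 CPUs at 4G each
total_job_mem_mb 12 2500    # 12 CPUs at 2.5G each
```

The three calls print 1000, 16000 and 30000, so a modest per-CPU value can still add up to a large total on a multi-CPU job.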

Note: In the templates, there are various parameters you should change to reflect your simulation requirements. These parameters are indicated with a <CHANGETHIS> tag. For example:

#SBATCH --mail-user=<CHANGETHIS>
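Before submitting, it can help to confirm that no placeholder was left behind. The following is a minimal sketch, not part of the cluster tooling: the function name check_placeholders is hypothetical, and it assumes only that unfilled parameters still contain the literal text <CHANGETHIS>.

```shell
#!/bin/bash
# Hypothetical helper: fail if a job script still contains <CHANGETHIS>.
check_placeholders() {
    local script="$1"
    # -n prints each offending line with its line number;
    # -F treats the pattern as a fixed string rather than a regex.
    if grep -nF '<CHANGETHIS>' "$script"; then
        echo "Unfilled placeholders remain in $script" >&2
        return 1
    fi
    return 0
}

# Possible use before submission (not executed here):
#   check_placeholders feko_single.slurm && sbatch feko_single.slurm
```

Chaining the check with sbatch means the job is only submitted when every placeholder has been replaced.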

Cancelling FEKO simulations

<note warning> Cancelling a FEKO simulation should only be done as a last resort.

</note>

Due to the way the FEKO Gold license works, if a FEKO simulation is not cancelled using the correct method, the licenses will remain checked out even though all the simulation processes have stopped. This will cause any queued FEKO simulation to abort as the SLURM scheduler's tracking of the licenses will be out of sync with that of the FEKO license manager.

To ensure cancellation works correctly, the runfeko process must be started using the srun command as shown in the templates below. This will ensure that the FEKO simulation runs as a distinct job step and will receive the required SIGTERM signal.
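As an illustration of a cancellation that relies on this behaviour, the sketch below wraps the standard scancel command. It assumes that a plain scancel (without a --signal option) is the intended method on this cluster, which this page does not state explicitly; as far as the SLURM documentation describes, such a cancellation delivers SIGTERM to the job steps before any SIGKILL. The function name cancel_feko_job is hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper around scancel; the caller supplies the job ID.
cancel_feko_job() {
    local jobid="$1"
    # Accept only a plain numeric job ID, to avoid cancelling by accident.
    case "$jobid" in
        ''|*[!0-9]*)
            echo "usage: cancel_feko_job <numeric job id>" >&2
            return 2
            ;;
    esac
    # ASSUMPTION: a default scancel is the correct method here. It lets the
    # runfeko job step started with srun receive SIGTERM.
    scancel "$jobid"
}

# Possible use (not executed here); squeue lists your job IDs:
#   squeue -u "$USER"
#   cancel_feko_job 12345
```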

Single CPU FEKO simulation: 1 process license

This template, for a model with name feko_single, is suitable for many relatively short running simulations, or simulations that require more memory than is available on project lab computers. A single task is defined that will be allocated a single CPU core. The template also specifies the license requirement, in this case a single FEKO license.

Typical values for --mem-per-cpu are 2000 (for 2G) or 4000 (for 4G) for larger simulations. In the case of many small simulations, first try 1000 (1G).

feko_single.slurm
#!/bin/bash
#SBATCH --output=<CHANGETHIS>.log
#SBATCH --job-name=<CHANGETHIS>
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=<CHANGETHIS>
#SBATCH --licenses=feko:1
#SBATCH --time=<CHANGETHIS>
#SBATCH --mail-type=END
#SBATCH --mail-user=<CHANGETHIS>
 
# Load FEKO environment
source /usr/local/feko/bin/initfeko
 
# Run FEKO model simulation
srun runfeko feko_single

Moderate size multi-CPU FEKO simulation: 1 process license

This template, for a model with name feko_multi, is suitable for larger simulations that require more memory than is available on project lab computers and would typically run more than an hour on a lab computer. A single task is defined that will be allocated 4 CPU cores. Additional flags are set to ensure that the correct CPU core allocation is done and that the license use by FEKO is minimised.

Typical values for --mem-per-cpu are 2000 (for 2G) or 4000 (for 4G) for larger simulations.

feko_multi.slurm
#!/bin/bash
#SBATCH --output=<CHANGETHIS>.log
#SBATCH --job-name=<CHANGETHIS>
#SBATCH --cpus-per-task=4
#SBATCH --cores-per-socket=4
#SBATCH --mem-per-cpu=<CHANGETHIS>
#SBATCH --licenses=feko:1
#SBATCH --time=<CHANGETHIS>
#SBATCH --mail-type=END
#SBATCH --mail-user=<CHANGETHIS>
 
# Load FEKO environment
source /usr/local/feko/bin/initfeko
 
# Create a machines file based on the node list allocated
hostlist=$(scontrol show hostname $SLURM_JOB_NODELIST)
rm -f machines.feko
echo -n "Target Nodes: "
for f in $hostlist
do
   echo $f':4' >> machines.feko
   echo $f':4'
done
echo
export FEKO_MACHFILE="machines.feko"
 
# Ensure that CPU detection (license use) is correct for cpuset allocation
export FEKO_SECFEKO_USE_FALLBACK_CPUDETECTION=1
export FEKO_CPU_PINNING=0
 
# Run FEKO model simulation with CPU socket binding to minimise license use
srun --cpu_bind=verbose,socket runfeko feko_multi -np 4
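The multi-CPU templates repeat the process count in the machines file loop and in the -np argument, so the values must be kept in step by hand. One way to avoid this, shown here only as a sketch, is a small helper that writes the machines file from a single count. The function name write_feko_machfile is hypothetical; SLURM_CPUS_PER_TASK is the variable SLURM sets when --cpus-per-task is given.

```shell
#!/bin/bash
# Hypothetical helper: write one "host:procs" line per host to a machines file.
write_feko_machfile() {
    local procs="$1" outfile="$2"
    shift 2
    : > "$outfile"    # create the file, or truncate an existing one
    local host
    for host in "$@"; do
        printf '%s:%s\n' "$host" "$procs" >> "$outfile"
    done
}

# Inside a job script it might be called like this (not executed here):
#   write_feko_machfile "$SLURM_CPUS_PER_TASK" machines.feko \
#       $(scontrol show hostnames "$SLURM_JOB_NODELIST")
#   export FEKO_MACHFILE="machines.feko"
#   srun --cpu_bind=verbose,socket runfeko feko_multi -np "$SLURM_CPUS_PER_TASK"
```

With this arrangement only the --cpus-per-task line needs to change when the core count changes.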

Large size multi-CPU FEKO simulation: 1 process license

This template, for a model with name feko_large, is suitable for larger simulations that would require a day or more to run on a desktop computer. A single task is defined that will be allocated 6 CPU cores. Additional flags are set to ensure that the correct CPU core allocation is done and that the license use by FEKO is minimised.

Typical values for --mem-per-cpu are 2000 (for 2G) or 4000 (for 4G) for larger simulations.

feko_large.slurm
#!/bin/bash
#SBATCH --output=<CHANGETHIS>.log
#SBATCH --job-name=<CHANGETHIS>
#SBATCH --cpus-per-task=6
#SBATCH --cores-per-socket=6
#SBATCH --mem-per-cpu=<CHANGETHIS>
#SBATCH --licenses=feko:1
#SBATCH --time=<CHANGETHIS>
#SBATCH --mail-type=END
#SBATCH --mail-user=<CHANGETHIS>
 
# Load FEKO environment
source /usr/local/feko/bin/initfeko
 
# Create a machines file based on the node list allocated
hostlist=$(scontrol show hostname $SLURM_JOB_NODELIST)
rm -f machines.feko
echo -n "Target Nodes: "
for f in $hostlist
do
   echo $f':6' >> machines.feko
   echo $f':6'
done
echo
export FEKO_MACHFILE="machines.feko"
 
# Ensure that CPU detection (license use) is correct for cpuset allocation
export FEKO_SECFEKO_USE_FALLBACK_CPUDETECTION=1
export FEKO_CPU_PINNING=0
 
# Run FEKO model simulation with CPU socket binding to minimise license use
srun --cpu_bind=verbose,socket runfeko feko_large -np 6

Maximum size multi-CPU FEKO simulation: 2 process licenses

This template, for a model with name feko_max, is suitable for very large simulations that would require a week or more to run on a desktop computer. A single task is defined that will be allocated 12 CPU cores. Additional flags are set to ensure that the correct CPU core allocation is done and that the license use by FEKO is minimised.

Typical values for --mem-per-cpu are 2000 (for 2G) or 2500 (for 2.5G) for larger simulations.

feko_max.slurm
#!/bin/bash
#SBATCH --output=<CHANGETHIS>.log
#SBATCH --job-name=<CHANGETHIS>
#SBATCH --cpus-per-task=12
#SBATCH --mem-per-cpu=2500
#SBATCH --licenses=feko:2
#SBATCH --time=<CHANGETHIS>
#SBATCH --mail-type=END
#SBATCH --mail-user=<CHANGETHIS>
 
# Load FEKO environment
source /usr/local/feko/bin/initfeko
 
# Create a machines file based on the node list allocated
hostlist=$(scontrol show hostname $SLURM_JOB_NODELIST)
rm -f machines.feko
echo -n "Target Nodes: "
for f in $hostlist
do
   echo $f':12' >> machines.feko
   echo $f':12'
done
echo
export FEKO_MACHFILE="machines.feko"
 
# Ensure that CPU detection (license use) is correct for cpuset allocation
export FEKO_SECFEKO_USE_FALLBACK_CPUDETECTION=1
export FEKO_CPU_PINNING=0
 
# Run FEKO model simulation
srun runfeko feko_max -np 12

Interactive FEKO: 1 process license

For the creation of large models it is sometimes necessary to run CADFEKO in interactive mode on the cluster. To do so, simply run cadfeko after logging in on the head node. This will start a single CPU session with 12GB of memory. Note that it will take about 30 to 50 seconds for the session to be allocated on the cluster and for CADFEKO to start. Note furthermore that the session has a hard time limit of 4 hours and will also terminate after 20 minutes of inactivity.

Do not use the interactive session for simulations.

For the X11 graphics to be handled correctly, you must have an X11 server installed on your computer and must have logged in with X11 forwarding enabled.
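A quick way to check the forwarding before starting CADFEKO is to look at the DISPLAY variable, which an SSH login with X11 forwarding normally sets. The sketch below is an illustration only; the function name x11_forwarding_active is hypothetical, and ssh -X is given as one common way to enable forwarding, not as the required login method for this cluster.

```shell
#!/bin/bash
# Hypothetical pre-flight check: X11 forwarding normally sets DISPLAY in the
# remote shell, so an empty DISPLAY suggests forwarding is not active.
x11_forwarding_active() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; log in again with X11 forwarding (e.g. ssh -X)" >&2
        return 1
    fi
    return 0
}

# Possible use on the head node (not executed here):
#   x11_forwarding_active && cadfeko
```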