Difference between revisions of "FAQ"

From Storrs HPC Wiki

Latest revision as of 14:18, 13 June 2019

I get an error saying the module conflicts with other module(s).

If the 'module load' command returns the following errors:

$ module load <Module1>
<Module1>(4):ERROR:150: Module '<Module1>' conflicts with the currently loaded module(s) '<Module2>'
<Module1>(4):ERROR:102: Tcl command execution failed: conflict <Module_Group>

This means that the module you want to load conflicts with the currently loaded module, <Module2>. To fix it, please unload <Module2> and then load <Module1> again:

$ module unload <Module2>
$ module load <Module1>

Or

$ module switch <Module2> <Module1>

Or, if neither of these works, you can purge all loaded modules with

$ module purge

and start fresh.
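When the conflict error appears inside a longer log, it can help to pull out just the name of the module that needs unloading. The following is a minimal sketch, not part of the cluster's tooling: `conflicting_module` is a hypothetical helper name, and it assumes the error text has exactly the ERROR:150 format shown above.

```shell
# Hypothetical helper: print the name of the conflicting module found in an
# ERROR:150 message. Reads the 'module load' error text on stdin and prints
# nothing if no such message is present.
conflicting_module() {
    sed -n "s/.*ERROR:150:.*currently loaded module(s) '\([^']*\)'.*/\1/p"
}
```

Because the module command normally writes its errors to standard error, a typical call would look like `module load <Module1> 2>&1 | conflicting_module`. The printed name can then be passed to `module unload` or `module switch`.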

I get an error saying the module depends on other module(s).

If the 'module load' command returns the following errors:

$ module load <Module1>
<Module1>(9):ERROR:151: Module '<Module1>' depends on one of the module(s) '<Module2>'
<Module1>(9):ERROR:102: Tcl command execution failed: prereq <Module2>

This means that the module you want to load depends on the module <Module2>. To fix it, please load <Module2> prior to <Module1>:

$ module load <Module2> <Module1>

You may encounter these errors several times in a row. Each time, load the required module or unload the conflicting module, and then try again.

I get errors when I try to use the Intel SDK together with an MPI module

If you get the following error while using both the Intel SDK and one of the mpi modules:

/apps/intelics/2013.1.039/composer_xe_2013_sp1.0.080/mpirt/bin/intel64/mpirun: line 96: /apps/intelics/2013.1.039/composer_xe_2013_sp1.0.080/mpirt/bin/intel64/mpivars.sh: No such file or directory

please unload the intelics and mpi modules and reload them in the correct order, with the intelics module loaded before the mpi module. For example:

$ module list
1) modules                                      3) intelics/<version>
2) mpi/<software>/<version>
$ module unload intelics mpi
$ module load intelics/<version> mpi/<software>/<version>
$ module list
1) modules                                      3) mpi/<software>/<version>
2) intelics/<version>
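The ordering rule above can also be checked in a script before calling mpirun. The following is a minimal sketch under stated assumptions: `intelics_before_mpi` is a hypothetical helper name, and it expects a colon-separated list of module names in load order, which is the form Environment Modules keeps in the `LOADEDMODULES` variable.

```shell
# Hypothetical helper: succeed (exit 0) only if an intelics module appears
# before the first mpi module in a colon-separated, load-ordered module list.
# The body runs in a subshell so the IFS and globbing changes stay local.
intelics_before_mpi() (
    IFS=':'
    set -f                      # do not expand glob characters in names
    seen_intel=0
    for name in $1; do
        case "$name" in
            intelics|intelics/*) seen_intel=1 ;;
            mpi/*) [ "$seen_intel" -eq 1 ]; exit ;;   # status of the test
        esac
    done
    exit 0                      # no mpi module loaded, nothing to misorder
)
```

A job script could then run `intelics_before_mpi "$LOADEDMODULES" || echo "reload intelics before mpi"` and fall back to the unload and reload sequence shown above.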

I cannot log in to the HPC when I am off campus.

The HPC cluster accepts SSH connections only from computers on the campus network, for example computers in the campus library, offices, or labs, and computers connected to the UCONN-SECURE WiFi. If you try to connect from off campus, PuTTY will hang at a blank screen, and a command-line SSH client will report an error such as:

$ ssh <NetID>@login.storrs.hpc.uconn.edu
Connection time out.

To connect to the HPC from off campus, first connect to the UConn VPN. After following the steps in the VPN instructions, you can log in to the HPC as usual.

I cannot run my script via sbatch

If a script fails to run via sbatch, the errors usually look like:

sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file

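This is usually caused by the wrong file format: the file is still in Windows format (for example UTF-16 encoding with CRLF line endings) rather than Linux format. You can check the format and convert the file as follows: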

$ file comsol.sh # with wrong format
comsol.sh: Little-endian UTF-16 Unicode English text, with CRLF line terminators
$ iconv -f utf-16 -t ascii comsol.sh -o comsol.tmp && mv comsol.tmp comsol.sh # Convert to ASCII first, writing to a temporary file
$ dos2unix comsol.sh # change CRLF line terminators to Unix format
$ file comsol.sh
comsol.sh: Bourne-Again shell script text executable
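To catch the line-ending problem before submitting a job, a script can be tested for carriage returns. The following is a minimal sketch: `has_crlf` is a hypothetical helper name, and it covers only the CRLF part of the problem, not the UTF-16 encoding, which still needs the iconv step above.

```shell
# Hypothetical helper: succeed (exit 0) if the file contains at least one
# carriage return, i.e. it most likely still has Windows (CRLF) line endings.
has_crlf() {
    cr=$(printf '\r')           # a literal carriage-return character
    grep -q "$cr" "$1"
}
```

For example, `has_crlf comsol.sh && dos2unix comsol.sh` would convert the file only when it needs converting.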

I get errors like: "slurmstepd: task/cgroup: unable to remove step memcg :"

Users may sometimes encounter the following message in their job output:

slurmstepd: task/cgroup: unable to remove step memcg : No such file or directory

This message results from the way SLURM cleans up after the job. The detail at the end of the message (here "No such file or directory") may vary. This is a known issue in SLURM. SLURM support is working on a fix for the next version; for now, it is safe to ignore this warning.
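If the warning clutters the job output while you read it, it can be filtered out at viewing time. The following is a minimal sketch: `filter_memcg_warning` is a hypothetical helper name, and it matches only the fixed leading part of the message, since the trailing detail may vary.

```shell
# Hypothetical helper: print the input without the known, harmless
# "unable to remove step memcg" warning lines. Reads the named files,
# or standard input when no file is given.
filter_memcg_warning() {
    grep -v 'slurmstepd: task/cgroup: unable to remove step memcg' "$@"
}
```

For example, `filter_memcg_warning slurm-<jobid>.out` would show a job's output file without the warning, assuming the job used the default output file name. Note that grep exits with status 1 when every line is filtered out, so the exit status should not be used to judge the job itself.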