Sun Grid Engine, Enterprise Edition 5.3 _x86: maintenance/security patch
Status: RELEASED
Patch Id: 116659-03
***********************************************************************
READ THE TERMS OF THE AGREEMENT ("AGREEMENT") IN THE LEGAL_LICENSE.TXT
FILE CAREFULLY BEFORE USING THIS SOFTWARE. BY USING THE SOFTWARE, YOU
AGREE TO THE TERMS OF THIS AGREEMENT. IF YOU DO NOT AGREE TO ALL OF THE
TERMS, PROMPTLY DESTROY THE UNUSED SOFTWARE.
***********************************************************************
Summary: Sun Grid Engine, Enterprise Edition 5.3 _x86: maintenance/security patch
Date: Jan/20/2006
Installation Requirements:
Additional instructions may be listed below
Solaris Release: 8_x86 9_x86
Sun OS Release: 5.8_x86 5.9_x86
Unbundled Product: Sun Grid Engine Enterprise Edition
Unbundled Release: 5.3
NOTE: This patch is for customers who installed Sun Grid Engine
using the NON-Solaris Package format. If the Sun Grid Engine
product was installed using Solaris Packaging utilities, please
use patch 116658.
Xref:
Topic:
Relevant Architecture: sparc
BugId's fixed with this patch:
4930786 4930789 4930793 4949917 4952236 4952767 4957760 4969825 5018669 5018695 5018726 5018733 5018757 5018884 5019595 5019601 5019624 5019635 5020131 5020134 5020139 5020141 5020143 5020153 5020278 5020371 5021405 5040728 5086193 6185208 6252525 6340741 6366691 6370003 6370481 6370485
Changes incorporated in this version:
5040728 5086193 6185208 6252525 6340741 6366691 6370003 6370481 6370485
Patches accumulated and obsoleted by this patch:
Patches which conflict with this patch:
Required Patches:
Obsoleted by:
Files Included in this Patch:
<install_dir>/bin/solaris86/qacct
<install_dir>/bin/solaris86/qalter
<install_dir>/bin/solaris86/qconf
<install_dir>/bin/solaris86/qdel
<install_dir>/bin/solaris86/qhost
<install_dir>/bin/solaris86/qmake
<install_dir>/bin/solaris86/qmod
<install_dir>/bin/solaris86/qmon
<install_dir>/bin/solaris86/qsh
<install_dir>/bin/solaris86/qstat
<install_dir>/bin/solaris86/qsub
<install_dir>/bin/solaris86/qtcsh
<install_dir>/bin/solaris86/sge_commd
<install_dir>/bin/solaris86/sge_coshepherd
<install_dir>/bin/solaris86/sge_execd
<install_dir>/bin/solaris86/sge_qmaster
<install_dir>/bin/solaris86/sge_schedd
<install_dir>/bin/solaris86/sge_shadowd
<install_dir>/bin/solaris86/sge_shepherd
<install_dir>/bin/solaris86/sgecommdcntl
<install_dir>/lib/solaris86/libXltree.so
<install_dir>/utilbin/solaris86/adminrun
<install_dir>/utilbin/solaris86/checkprog
<install_dir>/utilbin/solaris86/checkuser
<install_dir>/utilbin/solaris86/filestat
<install_dir>/utilbin/solaris86/gethostbyaddr
<install_dir>/utilbin/solaris86/gethostbyname
<install_dir>/utilbin/solaris86/gethostname
<install_dir>/utilbin/solaris86/getservbyname
<install_dir>/utilbin/solaris86/infotext
<install_dir>/utilbin/solaris86/loadcheck
<install_dir>/utilbin/solaris86/now
<install_dir>/utilbin/solaris86/openssl
<install_dir>/utilbin/solaris86/qrsh_starter
<install_dir>/utilbin/solaris86/rlogin
<install_dir>/utilbin/solaris86/rsh
<install_dir>/utilbin/solaris86/rshd
<install_dir>/utilbin/solaris86/testsuidroot
<install_dir>/utilbin/solaris86/uidgid
Problem Description:
6370485 stale resource_unknown_list
6370481 increase PATH size limit of 2048 characters
6370003 long lines in accounting entries break qacct
6366691 utilbin/rsh can be used to gain root access
6340741 scheduler dies when array jobs are submitted with -now y
6252525 qmon: complex attributes not removeable
6185208 qmon and equal job arguments
5086193 Problems with load values if execution daemons run in a solaris zone
5040728 Job error state broken
(from 116659-02)
5021405 CSP reconnect problem of scheduler and execd
5020371 sge_shepherd creates world writable files
5020278 a colon in a job name breaks qacct
5020153 mail bomb upon abort with tightly integrated par jobs
5020143 qdel XXX.YY- will delete the first array task of job XXX
5020141 qsh and qlogin accepted the options -h and -hold_jid and ignored them later
5020139 a stored job template in qmon sets -hold_jid to a wrong value
5020134 qhost output broken for global consumables
5020131 renaming a user deletes the user
5019635 schedd_job_info=true causes large delays with parallel job scheduling
5019624 qselect/qstat -l selection wrongly considers load and utilization
5019601 "vmem" in qstat -j keeps the max value
5019595 Dateformat YYMMDDhhmm was interpreted wrong (qacct, qsub, qalter,...)
5018884 SSL vulnerabilities stated in Sun Alert 57524
5018757 HPCT jobs may fail - add variable to job environment which point to SGE binaries
5018733 Empty parameters crashes qstat and qhost
5018726 qalter lacks -dl option!
5018695 loadsensor doing output to stderr can block
5018669 qrsh/qlogin: "Connection refused" due to race condition in shepherd
4969825 not supported array task dependencies are not rejected
(from 116659-01)
4930786 global load values are ignored
4930789 An overwritten string attribut was ignored in the scheduler
4930793 minor issues with the sgeee ticket update interval
4949917 qmon seg faults with a user hold job from qtcsh qtask file
4952236 Broken mail option with SGE 5.3p4 qrsh
4952767 qrsh -notify doesn't work
4957760 Fix needed for CERT CA-2003-26 Multiple Vulnerabilities in SSL/TLS
Revision History:
116659-01 116659-02
Patch Installation Instructions:
--------------------------------
Special Install Instructions:
-----------------------------
Important note if Sun Grid Engine has been installed with openSSL support
-------------------------------------------------------------------------
If Sun Grid Engine has been installed with openSSL support ("CSP mode")
prior to SGEEE 5.3p3 (which was linked with openSSL 0.9.6.c), the
certificates which have been installed with these versions are
incompatible with certificates installed with SGEEE 5.3p4 or later. All
such certificates will need to be recreated after installing this patch
and before restarting Sun Grid Engine. Please refer to the Sun Grid
Engine Administration and User Manual for how to create new certificates
with the utility script "sge_ca", which comes with the distribution.
The reason for the incompatibility is a changed field name between
openSSL version 0.9.6 and 0.9.7 in the certificates, where
"uniqueIdentifier" has been renamed to "userId".
Note for bug id 5020371 ("sge_shepherd creates world writable files")
---------------------------------------------------------------------
If the execution daemon spool directory is located on NFS and the
execution host machine does not have read/write permissions for user root
(which is often the case for security reasons), the shepherd process
will continue to create some of the files in its job directory with world
writable permissions. If the NFS client does have write permissions, the
fix is effective after patch installation without further changes.
Otherwise, the execution daemon spool directory must be moved to a local
file system for the fix to take effect. A local spool directory is also
recommended for performance reasons.
1. Changing the execution daemon spool directory for all hosts
   simultaneously:
   - no jobs may be running in the cluster
   - shut down qmaster
   - shut down all execution daemons
   - edit the global cluster configuration file
     <sge_root>/<sge_cell>/common/configuration
     and change the path of the configuration value
     execd_spool_dir
   - restart qmaster
   - restart your execution daemons
2. Changing the execution daemon spool directory for each execution host
   individually:
   - no jobs may be running on the execution host where the spool
     directory is going to be changed
   - edit the local configuration for this execution host:
     % qconf -mconf <hostname>
     and add the local spool directory:
     execd_spool_dir <path_to_exec_spool_directory>
   - shut down and restart the execution daemons
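Before changing the spool directory it can help to see which value is
currently configured. The following is a minimal sketch, not part of the
Sun Grid Engine distribution: get_spool_dir is a hypothetical helper name,
and it assumes the configuration file uses the "name value" line format
shown in the steps above.

```shell
#!/bin/sh
# get_spool_dir FILE
# Print the value of execd_spool_dir from an SGE configuration file.
# Hypothetical helper; assumes one "name value" pair per line.
get_spool_dir() {
    awk '$1 == "execd_spool_dir" { print $2; exit }' "$1"
}
```

For example, "get_spool_dir <sge_root>/<sge_cell>/common/configuration"
prints the global setting; running "df -k" on the printed path should then
indicate whether the directory is on a local or an NFS file system.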
In addition to these notes, please read the full "Install Instructions"
section later in this file for the requirements that must be met before
the patch itself can be installed.
tar.gz Patch Installation:
--------------------------
This patch in 'tar.gz' format cannot be installed with 'patchadd' on Solaris
systems. The patch is installed by unpacking the 'tar.gz' file(s) in this
directory in <install_dir>. <install_dir> is usually your <sge_root>
directory. Once installed, this patch is not visible with the
"showrev -p" command on Solaris.
This patch cannot be backed out once installed. You may make a backup copy
of the files which will be overwritten when this patch is installed.
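Since the patch cannot be backed out, one way to make such a backup is
sketched below. This is an illustration only: backup_patch_targets is a
hypothetical helper, not a tool shipped with the patch, and it assumes
that 'gzip', 'tar' and 'mktemp' are available on the host. It lists the
members of the patch archive and copies every file that already exists
under <install_dir> into a separate backup directory.

```shell
#!/bin/sh
# backup_patch_targets PATCH_TARGZ INSTALL_DIR BACKUP_DIR
# Copy each existing regular file that the patch archive would overwrite
# from INSTALL_DIR to the same relative path below BACKUP_DIR.
# Hypothetical helper; returns non-zero if any copy fails.
backup_patch_targets() {
    patch=$1 dir=$2 backup=$3
    list=$(mktemp) || return 1
    # Write the archive's member names to a temporary list file.
    gzip -dc "$patch" | tar tf - > "$list" || { rm -f "$list"; return 1; }
    status=0
    while IFS= read -r f; do
        # Directories and files that do not exist yet are skipped.
        if [ -f "$dir/$f" ]; then
            mkdir -p "$backup/$(dirname "$f")" &&
                cp -p "$dir/$f" "$backup/$f" || status=1
        fi
    done < "$list"
    rm -f "$list"
    return $status
}
```

A call might look like
"backup_patch_targets <patchid>/<targzfile> <install_dir> /var/tmp/sge53.bak",
where the backup path is only an example.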
Please read "Install Instructions" later in this file and carry out
all steps before you unpack the 'tar.gz' file(s) included in this patch.
This patch in 'tar.gz' format must not be installed if the original
package has been installed with 'pkgadd' on Solaris. In this case please
install the available patches for Sun Grid Engine, Enterprise Edition
from http://sunsolve.sun.com in 'pkgadd' format.
The patch is installed by user root by unpacking the file(s) in the
directory where the original package has been installed:
# cd <install_dir>
# gzip -dc <patchid>/<targzfile> | tar xvpf -
After installing the patch you should correct the file permissions if
your Sun Grid Engine installation is installed as an "admin user" system:
# cd <sge_root>
# util/setfileperm.sh <adminuser> <admingroup> <sge_root>
where <adminuser> is the username of the "admin_user" of your global
cluster configuration and <admingroup> is the group which you set during
your initial installation for the files of your Sun Grid Engine
distribution.
Install Instructions:
---------------------
These installation instructions assume that you are running a homogeneous
Sun Grid Engine cluster where all hosts share the same directory for the
binaries. If you are running Sun Grid Engine in a heterogeneous
environment (mix of 32-bit and 64-bit binaries for Solaris and/or other
operating systems) it is only necessary to shut down the daemons for the
architecture for which the patch is applied. If you installed the
binaries on a local partition, you only need to stop the Sun Grid Engine
daemons on the host on which you are installing the patch.
By default, no jobs may be running when the patch is installed.
There may be pending batch jobs, but no pending interactive jobs (qrsh,
qmake, qsh, qtcsh).
It is possible to install the patch with running batch jobs. To avoid a
failure of the active "sge_shepherd" binary it is necessary to move the
old shepherd binary (and copy it back prior to the installation of the
patch).
Installing the patch is not supported under any circumstances with running
interactive jobs, 'qmake' jobs or running parallel jobs which use the tight
integration support (control_slaves=true is set in the PE configuration).
Stopping the Sun Grid Engine cluster from starting jobs
-------------------------------------------------------
Disable all queues so that no new jobs are started:
# qmod -d '*'
Optional (only needed if there are running jobs which should continue to
run when the patch is installed):
# cd $SGE_ROOT/bin
# mv <arch>/sge_shepherd <arch>/sge_shepherd.sge53
It is important that the binary is moved with the "mv" command. It must
not be copied because this could cause a crash of an active shepherd
process of a running job when the patch is installed.
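The difference between moving and copying can be checked directly. On the
same file system, "mv" only renames the directory entry, so the file keeps
its inode. This is general Unix behaviour and is presumably why the move
is required here, although this README does not state the mechanism. The
sketch below uses two hypothetical helper functions to compare the inode
number before and after the rename.

```shell
#!/bin/sh
# inode_of FILE: print the inode number of FILE (first field of "ls -i").
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# rename_keeps_inode FILE NEWNAME
# Rename FILE to NEWNAME with mv and succeed only if the inode number is
# unchanged afterwards. Hypothetical helper for illustration.
rename_keeps_inode() {
    before=$(inode_of "$1") || return 1
    mv "$1" "$2" || return 1
    after=$(inode_of "$2") || return 1
    [ -n "$before" ] && [ "$before" = "$after" ]
}
```

Try it on a scratch file first rather than on the shepherd binary itself.
If old and new name are on different file systems, "mv" has to copy the
data and the check fails.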
Shutting down Sun Grid Engine qmaster and scheduler
---------------------------------------------------
You need to shut down (and restart) the qmaster and scheduler daemon and
all execution daemons on all Sun Grid Engine hosts.
Shut down the daemons on all your execution hosts. Log in to each
execution host and stop 'sge_execd' and 'sge_commd':
# /etc/init.d/rcsge stop
Then log in to your qmaster machine and stop 'sge_qmaster', 'sge_schedd',
'sge_commd' and, if the machine is also an execution host, 'sge_execd':
# /etc/init.d/rcsge stop
Now verify with the 'ps' command that all Sun Grid Engine daemons on all
hosts are stopped. If you renamed the shepherd binary so that running
batch jobs continue to run during the patch installation, you must not
kill the 'sge_shepherd' processes.
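The 'ps' check can be scripted per host. The following is a minimal
sketch under stated assumptions: running_sge_daemons is a hypothetical
helper, and it expects one command name per line on standard input, as
produced for example by "ps -e -o comm". It reports the four daemons
named above and deliberately ignores 'sge_shepherd', which may still be
running for active batch jobs.

```shell
#!/bin/sh
# running_sge_daemons
# Read command names (one per line, with or without a leading path) from
# stdin and print every SGE daemon that should have been stopped.
# Hypothetical helper; sge_shepherd is intentionally not reported.
running_sge_daemons() {
    awk '
        { n = split($1, parts, "/"); name = parts[n] }
        name ~ /^sge_(qmaster|schedd|commd|execd)$/ { print name }
    '
}
```

Running "ps -e -o comm | running_sge_daemons" on each host should print
nothing once all daemons are stopped.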
Installing the patch and restarting Sun Grid Engine
---------------------------------------------------
Now please install the patch by unpacking the 'tar.gz' files included in
this patch as outlined above.
After installing the patch you need to restart your cluster. Please log in
to your qmaster machine and enter:
# /etc/init.d/rcsge
Now you should repeat this step on all your execution hosts.
After restarting Sun Grid Engine you may again enable your queues:
# qmod -e '*'
If you renamed the shepherd binary you may safely delete the old binary
once all jobs that were running prior to the patch installation have
finished.
README -- Last modified date: Friday, January 20, 2006