Frequently Asked Questions
Administrative Questions
How to generate a grid host cert
After backing up the old cert files, run the following.
source $VDT_LOCATION/setup.sh
$VDT_LOCATION/globus/bin/grid-cert-request -host <hostname>
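The backup step is not spelled out above. A minimal sketch follows, assuming the host certificate and key are named hostcert.pem and hostkey.pem and live in /etc/grid-security (a common Globus layout, but an assumption here). The function name backup_host_cert is made up for illustration.

```shell
#!/bin/sh
# Sketch: copy the existing host cert and key aside before requesting a new one.
# Assumption: the files are hostcert.pem and hostkey.pem inside the given directory.
backup_host_cert() {
    cert_dir=${1:-/etc/grid-security}
    backup_dir="$cert_dir/backup-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    for f in hostcert.pem hostkey.pem; do
        if [ -f "$cert_dir/$f" ]; then
            # -p keeps mode and timestamps, so the key stays private
            cp -p "$cert_dir/$f" "$backup_dir/$f" || return 1
        fi
    done
    # Print where the copies went
    echo "$backup_dir"
}

# Usage (as root):
#   backup_host_cert /etc/grid-security
```

The originals are left in place; only copies are made, so the request step can still be run afterwards.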
How do I adjust the memory parameter on the kernel config line in a Rocks CDROM kernel roll?
The purpose of this fix is to allow Rocks to create very large RAID partitions on 64-bit machines.
Edit the following file:
rocks/src/roll/kernel/src/rocks-boot/enterprise/4/images/x86_64/isolinux.cfg
change:
label internal
kernel vmlinuz
append ramdisk_size=150000 initrd=initrd.img devfs=nomount ks ksdevice=eth0 kssendmac selinux=0
to:
label internal
kernel vmlinuz
append ramdisk_size=150000 initrd=initrd.img devfs=nomount ks ksdevice=eth0 kssendmac selinux=0 mem=1024M
then rebuild the kernel roll:
# cd rocks/src/roll/kernel
# make roll
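The edit above can also be scripted. This is only a sketch: it appends the mem= parameter to the line in the "label internal" stanza that carries selinux=0, and skips lines that already have a mem= setting. The function name add_mem_param is hypothetical.

```shell
#!/bin/sh
# Sketch: append mem=<value> to the kernel options of the "internal" label only.
# $1 = path to isolinux.cfg, $2 = memory value (for example 1024M)
add_mem_param() {
    cfg=$1
    mem=$2
    awk -v mem="$mem" '
        # Track which label stanza we are in
        /^[ \t]*label[ \t]/ { in_internal = ($2 == "internal") }
        # Append only once, and only inside the internal stanza
        in_internal && /selinux=0/ && !/mem=/ { $0 = $0 " mem=" mem }
        { print }
    ' "$cfg" > "$cfg.new" && mv "$cfg.new" "$cfg"
}

# Usage:
#   add_mem_param rocks/src/roll/kernel/src/rocks-boot/enterprise/4/images/x86_64/isolinux.cfg 1024M
```

Running it a second time leaves the file unchanged, so it is safe to include in a rebuild script.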
Why does /dev not appear with the right files on RHEL4 in chroot?
/dev is not a plain directory but a mount point, so it has to be bind-mounted into the chroot:
mount -t tmpfs --bind /dev /sysroot/dev
LCG VOMRS
https://lcg-voms.cern.ch:8443/vo/cms/vomrs
Complete Listing of OSG Configuration Variables
A complete list of OSG Configuration variables
https://twiki.grid.iu.edu/twiki/bin/view/Main/OSGConfigurationParameters
How to Output the Text form of Grid Certificates (Host Cert)
openssl x509 -in cert.pem -text
Converting PEM x509 Format Files to P12(DER) for import into Mozilla/Firefox
Log into the machine on which you have your x509 PEM files for your cert.
cd ~/.globus
openssl pkcs12 -in foo.pem -inkey bar.pem -export -out foo.p12
It will then ask for your password; this is the same password that you would use, e.g., when you run voms-proxy-init.
The resulting foo.p12 can then be imported into your browser.
Entering a UCSD ACT Customer Service Request
http://blink.ucsd.edu/go/csr
Preserving the RSL file from submitted grid jobs
On the server side, edit $VDT_LOCATION/globus/etc/globus-job-manager.conf and set "-save-logfile always"
That should preserve the gram_job_mgr files in the user home dir for debugging.
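The config change can be scripted as well. The sketch below assumes globus-job-manager.conf holds whitespace-separated options such as "-save-logfile on_error"; check your own file before relying on it. The function name is hypothetical, and a .bak copy is kept.

```shell
#!/bin/sh
# Sketch: force "-save-logfile always" in a job manager config file.
# Assumption: an existing setting looks like "-save-logfile <value>".
set_save_logfile_always() {
    conf=$1
    cp -p "$conf" "$conf.bak" || return 1
    if grep -q -e '-save-logfile' "$conf"; then
        # Replace whatever value is currently set
        sed 's/-save-logfile[[:space:]][[:space:]]*[^[:space:]]*/-save-logfile always/' \
            "$conf.bak" > "$conf"
    else
        # No setting present yet, so add one
        printf '%s\n' '-save-logfile always' >> "$conf"
    fi
}

# Usage:
#   set_save_logfile_always "$VDT_LOCATION/globus/etc/globus-job-manager.conf"
```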
Updating OSG CA Certificates
From http://vdt.cs.wisc.edu/releases/1.6.1/certificate_authorities.html:
Run the following in $VDT_LOCATION
# pacman -update CA-Certificates
Useful Grid Twiki for GRAM Errors
http://goc.grid.sinica.edu.tw/gocwiki/SiteProblemsFollowUpFaq
Installing Ubuntu on a Mac Mini
Using Grub
http://doc.gwos.org/index.php/UbuntuOnApple#Introduction_to_Linux_Installation_on_i386_Mac_Mini
UCSD Testing and Monitoring Links
Links, tools and sites related to monitoring the UCSD Tier2.
UCSDT2 ITB RSV Monitoring Link
https://osg-gw-3.t2.ucsd.edu:8443/rsv/
Query the ITB Ress and BDII
condor_status -pool osg-ress-4.fnal.gov -constraint 'GlueSiteName=="UCSDT2-ITB1"' -l
ldapsearch -x -h is-itb.grid.iu.edu -p 2170 -b mds-vo-name=UCSDT2-ITB1,mds-vo-name=local,o=grid
Checking BDII publishing
Site to check whether UCSD is properly reporting to the BDII:
http://is.grid.iu.edu/cgi-bin/status.cgi
SAM Test Page
https://twiki.cern.ch/twiki/bin/view/CMS/SAMForCMS
CMS Prod Exit Code Results (CMS)
http://t2.unl.edu/pa/xml/quality_map_query?team=OSG
Job Robot Report
http://jobrobot.web.cern.ch/JobRobot/
VORS Monitoring
http://vors.grid.iu.edu/cgi-bin/index.cgi
SCRAM Template.pm Error
Error:
SCRAM Error: It appears that the module "Template.pm" is not installed. Please check your installaion. If you are an administrator, you can find the Perl Template Toolkit at www.cpan.org or at the web site of the author (Andy Wardley):
Fix: Install perl-Template-Toolkit and supporting packages
Purging CE Jobs
To fully purge the CE of jobs you need to:
- Remove or move the contents of the condor home area (e.g. /state/data/condor_local)
- Remove or move the contents of the GRAM area $GLOBUS_LOCATION/tmp/gram_job_state/gram_condor_log.*
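The two steps above can be sketched as a function that moves the files aside instead of deleting them. The function name and the "aside" directory are made up for illustration. The entry does not say whether the CE services should be stopped first, so that is left to the operator.

```shell
#!/bin/sh
# Sketch: move Condor and GRAM job state aside instead of deleting it.
# $1 = condor home area (e.g. /state/data/condor_local)
# $2 = GRAM state directory (e.g. $GLOBUS_LOCATION/tmp/gram_job_state)
# $3 = directory to move everything into (must be outside $1 and $2)
purge_ce_state() {
    condor_local=$1
    gram_state=$2
    aside=$3
    mkdir -p "$aside/condor_local" "$aside/gram_job_state" || return 1
    # Move every entry, dotfiles included, out of the condor home area
    find "$condor_local" -mindepth 1 -maxdepth 1 \
        -exec mv {} "$aside/condor_local/" \;
    # Move only the gram_condor_log.* files out of the GRAM area
    find "$gram_state" -maxdepth 1 -type f -name 'gram_condor_log.*' \
        -exec mv {} "$aside/gram_job_state/" \;
}

# Usage:
#   purge_ce_state /state/data/condor_local "$GLOBUS_LOCATION/tmp/gram_job_state" /root/ce-purge
```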
Installing the cert infrastructure only from the VDT
This will install the parts needed to request host certs as well as keep CRLs and CAs up to date on a machine.
#!/bin/sh
mkdir -p /data/vdt
cd /data/vdt
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.25.tar.gz
tar zxvf pacman-3.25.tar.gz
chown root:root -R pacman-3.25
cd pacman-3.25
source setup.sh
cd /data/vdt
VDTSETUP_AGREE_TO_LICENSES=y
export VDTSETUP_AGREE_TO_LICENSES
VDTSETUP_ENABLE_ROTATE=y
export VDTSETUP_ENABLE_ROTATE
VDTSETUP_EDG_CRL_UPDATE=y
export VDTSETUP_EDG_CRL_UPDATE
VDTSETUP_CA_CERT_UPDATER=y
export VDTSETUP_CA_CERT_UPDATER
VDTSETUP_INSTALL_CERTS=r
export VDTSETUP_INSTALL_CERTS
pacman -pretend-platform:linux-rhel-4
pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:CA-Certificates
pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:CA-Certificates-Updater
pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:PPDG-Cert-Scripts
Condor jobs cannot find /state/data/condor_local/execute/dir_XXXX
Because of the remounts, mount order is important. Check that all underlying file systems are mounted before the remounts.
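A quick way to check this is to compare a list of expected mount points against the mount table. This is a sketch only: the function name is hypothetical, and the mount points in the usage line are examples, not the actual UCSD layout.

```shell
#!/bin/sh
# Sketch: report which of the given mount points are absent from a mount table.
# $1 = mount table (normally /proc/mounts); remaining args = mount points
missing_mounts() {
    table=$1
    shift
    status=0
    for mp in "$@"; do
        # Field 2 of each mount table line is the mount point
        if ! awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "$table"; then
            echo "not mounted: $mp"
            status=1
        fi
    done
    return $status
}

# Usage (example paths):
#   missing_mounts /proc/mounts /state/data || echo "fix mounts before remounting"
```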
WS Gram Performance Optimization
http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Performance_Guide.html
Resetting priority factors on the Condor cluster
for i in `condor_userprio -all -allusers |grep "@" | awk -F"@" '{print $1}'|grep ligo`; do for j in `seq 2 5`; do condor_userprio -setfactor ${i}@osg-gw-${j}.t2.ucsd.edu 100; done; done
Local Users Mappings
uscms048
uscms1581
uscms099
uscms1658
uscms1633
uscms076
uscms1586
uscms1285
uscms1674
WS GRAM Errors
Error initializing GAHP
Check that Java is installed and that the condor_config correctly points to its location
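A rough check for this is sketched below. It reads a plain "JAVA = /path" line from a single config file and tests that the path is executable. It does not expand Condor macros or follow local config files, so on a live node "condor_config_val JAVA" is the better source for the value. Function names are hypothetical.

```shell
#!/bin/sh
# Sketch: pull the JAVA setting out of one condor_config file.
# Assumption: the setting is a literal path, not a $(MACRO) reference.
java_from_condor_config() {
    awk '
        /^[ \t]*JAVA[ \t]*=/ { v = $0; sub(/^[^=]*=/, "", v) }
        END { gsub(/^[ \t]+|[ \t]+$/, "", v); print v }
    ' "$1"
}

# Report whether the configured java binary exists and is executable.
check_gahp_java() {
    java=$(java_from_condor_config "$1")
    if [ -n "$java" ] && [ -x "$java" ]; then
        echo "JAVA ok: $java"
    else
        echo "JAVA missing or not executable: '$java'"
        return 1
    fi
}

# Usage:
#   check_gahp_java /path/to/condor_config
```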
Additional CMS Config for OSG
Copy the following files from the old install to the new
add-attributes.conf
./lcg/etc/add-attributes.conf
alter-attributes.conf
./lcg/etc/alter-attributes.conf
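As a sketch, the copy can be done with a small function. It assumes the ./lcg/etc paths above are relative to the root of each install, which the entry does not state explicitly. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: carry the two CMS attribute files over to a new install.
# $1 = root of the old install, $2 = root of the new install
copy_cms_attributes() {
    old=$1
    new=$2
    mkdir -p "$new/lcg/etc" || return 1
    for f in add-attributes.conf alter-attributes.conf; do
        # -p keeps ownership (where permitted), mode and timestamps
        cp -p "$old/lcg/etc/$f" "$new/lcg/etc/$f" || return 1
    done
}

# Usage (example paths):
#   copy_cms_attributes /data/osg-old /data/osg-new
```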
Getting a slot wn-client environment on a node interactively
Log into a node as root
# chroot /chroot/cafuser1
# su - cafuser1
# source /code/osgcode/wn-client-itb/setup.sh
Rocks Commands
Adding a cabinet to rocks
rocks add appliance cabinet-5 membership="Cabinet 5" short-name='c' node='cab5-compute'
OSG-RSV Commands at UCSD
osg-gw-4
$VDT_LOCATION/osg-rsv/setup/configure_osg_rsv --user rsv --init --server y --ce-probes --ce-uri "osg-gw-4.t2.ucsd.edu" --srm-probes --srm-uri "srm-3.t2.ucsd.edu" --srm-dir /pnfs/t2.ucsd.edu/data4/cms/phedex/store/user/tmartin --gridftp-probes --gratia --grid-type "OSG" --consumers --verbose --setup-for-apache --proxy /tmp/x509up_u59001
osg-gw-2
$VDT_LOCATION/osg-rsv/setup/configure_osg_rsv --user rsv --init --server y --ce-probes --ce-uri "osg-gw-2.t2.ucsd.edu" --srm-probes --srm-uri "srm-3.t2.ucsd.edu" --srm-dir /pnfs/t2.ucsd.edu/data4/cms/phedex/store/user/tmartin --gridftp-probes --gratia --grid-type "OSG" --consumers --verbose --setup-for-apache --proxy /tmp/x509up_u59001
OSG RSV
Testing CA Cert Probe by hand
su rsv -c "./cacert-crl-expiry-probe -m org.osg.certificates.cacert-expiry -u osg-gw-4.t2.ucsd.edu -x /tmp/x509up_u59001"
Gratia Search Links
https://t2.unl.edu/gratia/xml/dn_efficiency_summary?vo=cms&facility=UCSD&fixed-height=False
https://t2.unl.edu/gratia/xml/dn_wasted_summary?vo=cms&facility=UCSD&fixed-height=False
Making the RAID devices on the nodes by hand
In the event you need to do this by hand
Create the partitions on the new disk
Stop the devices, then recreate the arrays
mdadm --stop /dev/md0
mdadm --stop /dev/md1
mdadm --create /dev/md0 --chunk=256 --level=0 --raid-devices=4 /dev/sda2 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md1 --chunk=256 --level=0 --raid-devices=4 /dev/sda5 /dev/sdb3 /dev/sdc3 /dev/sdd3
Make the file systems
mkfs.ext3 /dev/md0; mkfs.ext3 /dev/md1
tune2fs -m0 /dev/md0; tune2fs -m0 /dev/md1
Fixing a corrupt ext3 Journal
debugfs -w -R "feature ^has_journal,^needs_recovery" /dev/md2
fsck -y /dev/md2
tune2fs -j /dev/md2
Bulk CA Certs for Web Browsers
TACAR keeps a repository of all the IGTF CAs. You can individually
install the ones you care about directly in your browser (or try a bulk
download and install)
https://www.tacar.org/repos/
SRM Ping
srm-ping srm://bsrm-1.t2.ucsd.edu:8443/srm/v2/server
VOMS Proxy and FTS
https://twiki.cern.ch/twiki/bin/view/CMS/PhedexAdminDocsVomsProxies
HADOOP
Error 1
Exception in thread "main" java.io.IOException: Mkdirs failed to create /cms/store/user/tmartin
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:358)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:487)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:468)
Call to org.apache.hadoop.conf.FileSystem::create((Lorg/apache/hadoop/fs/Path;ZISJ)Lorg/apache/hadoop/fs/FSDataOutputStream;) failed!
Check that hadoop-site.xml is properly configured and that the CLASSPATH is set correctly.
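One quick check is which default file system the hadoop-site.xml names. The sketch below assumes the property is called fs.default.name and that the file uses the usual name/value layout. It is a line-based match, not a real XML parser, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the value of fs.default.name from a hadoop-site.xml.
# Prints nothing if the property is not found.
default_fs_from_site_xml() {
    awk '
        /<name>[ \t]*fs\.default\.name[ \t]*<\/name>/ { found = 1 }
        found && /<value>/ {
            sub(/.*<value>[ \t]*/, "")
            sub(/[ \t]*<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Usage (example path):
#   default_fs_from_site_xml /path/to/hadoop/conf/hadoop-site.xml
```

An empty result, or a value that does not point at the namenode, would suggest the client is not picking up the intended configuration.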
Rocks Command Add Appliance at UCSD
rocks add appliance cabinet-5 membership="Cabinet 5" short-name='c' node='cab5-compute'
rocks add appliance cabinet-4 membership="Cabinet 4" short-name='c' node='cab4-compute'
rocks add appliance cabinet-6 membership="Cabinet 6" short-name='c' node='cab6-compute'
rocks add appliance cabinet-7 membership="Cabinet 7" short-name='c' node='cab7-compute'
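Since the commands differ only in the cabinet number, they can be generated. The sketch below only prints the commands for review and does not run them. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the "rocks add appliance" command for one cabinet number.
print_add_appliance() {
    n=$1
    printf "rocks add appliance cabinet-%s membership=\"Cabinet %s\" short-name='c' node='cab%s-compute'\n" \
        "$n" "$n" "$n"
}

# Print the commands for cabinets 4 to 7
for n in 4 5 6 7; do
    print_add_appliance "$n"
done
```

Once the output looks right, it can be piped to sh on the frontend.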
Memory copy of hadoop fsimage when restarting
First put Hadoop into safe mode (hadoop dfsadmin -safemode enter), then run:
hadoop dfsadmin -metasave <filename>
Authors
-- RamiVanguri - 03 Aug 2006
-- TerrenceMartin - 19 Mar 2007