Frequently Asked Questions



Administrative Questions

How to generate a grid host cert

After backing up the old cert files, run the following:

source $VDT_LOCATION/setup.sh
$VDT_LOCATION/globus/bin/grid-cert-request -host <hostname>



How do I adjust the memory parameter on the kernel config line in a Rocks CDROM kernel roll?

The purpose of this fix is to allow Rocks to create very large RAID partitions on 64-bit machines.

Edit the following file:

rocks/src/roll/kernel/src/rocks-boot/enterprise/4/images/x86_64/isolinux.cfg

Change:

label internal
       kernel vmlinuz
       append ramdisk_size=150000 initrd=initrd.img devfs=nomount ks ksdevice=eth0 kssendmac selinux=0

to:

label internal
       kernel vmlinuz
       append ramdisk_size=150000 initrd=initrd.img devfs=nomount ks ksdevice=eth0 kssendmac selinux=0 mem=1024M

Then rebuild the kernel roll:

   # cd rocks/src/roll/kernel
   # make roll 
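
The isolinux.cfg edit described above can also be scripted, which helps when the roll is rebuilt more than once. This is only a minimal sketch and not part of the Rocks tooling: add_mem_param is a hypothetical helper name, and it assumes the append options sit on a single line in isolinux.cfg.

```shell
# Hypothetical helper (not a Rocks command): append mem=<size> to the
# "append" line of the "label internal" stanza, unless a mem= option
# is already present there. Assumes the append options are on one line.
add_mem_param() {
    cfg="$1"
    mem="${2:-1024M}"
    awk -v mem="$mem" '
        # Track whether we are inside the "label internal" stanza
        /^label[ \t]/ { in_internal = ($2 == "internal") }
        # Extend the append line only once
        in_internal && /^[ \t]*append[ \t]/ && $0 !~ /(^|[ \t])mem=/ {
            $0 = $0 " mem=" mem
        }
        { print }
    ' "$cfg" > "$cfg.new" && mv "$cfg.new" "$cfg"
}
```

It would be called with the path to isolinux.cfg before running make roll. Calling it a second time leaves the file unchanged.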



Why does /dev not appear with the right files on RHEL4 in chroot?

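On RHEL4, /dev is not a plain directory but a mount point (populated by udev), so the device files only appear in the chroot once /dev is bind-mounted into it: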

mount -t tmpfs --bind /dev /sysroot/dev
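
Before bind-mounting, it can be useful to check whether the target is already a mount point, so the mount is not stacked twice. The following is a minimal sketch, assuming a Linux /proc/mounts. is_mount_point is a hypothetical helper, and its optional second argument (an alternative mounts table) exists only so the function can be tried without root.

```shell
# Hypothetical helper: succeed if the given directory appears as a
# mount point in the mounts table (default: /proc/mounts).
is_mount_point() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Possible use (needs root, so it is left commented out here):
# is_mount_point /sysroot/dev || mount -t tmpfs --bind /dev /sysroot/dev
```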



LCG VOMSRS

https://lcg-voms.cern.ch:8443/vo/cms/vomrs



Complete Listing of OSG Configuration Variables

A complete list of OSG Configuration variables

https://twiki.grid.iu.edu/twiki/bin/view/Main/OSGConfigurationParameters



How to Output the Text form of Grid Certificates (Host Cert)

openssl x509 -in cert.pem -text
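
The full -text dump is long. When only the identity and validity period matter, a shorter summary may be enough. This is a minimal sketch: cert_summary is a hypothetical helper name, and the 30-day warning threshold is an arbitrary assumption.

```shell
# Hypothetical helper: print subject, issuer and validity dates of a
# PEM certificate and warn if it expires within 30 days.
cert_summary() {
    cert="${1:-cert.pem}"
    if [ ! -r "$cert" ]; then
        echo "cannot read $cert" >&2
        return 1
    fi
    openssl x509 -in "$cert" -noout -subject -issuer -dates || return 1
    # -checkend N exits non-zero if the cert expires within N seconds
    if ! openssl x509 -in "$cert" -noout -checkend $((30 * 24 * 3600)) >/dev/null; then
        echo "WARNING: $cert expires within 30 days" >&2
    fi
}
```

For a host cert this would be run as, for example, cert_summary /etc/grid-security/hostcert.pem (the path is an assumption about where the cert is installed).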

Converting PEM x509 Format Files to P12(DER) for import into Mozilla/Firefox

Log into the machine on which you have your x509 PEM files for your cert.

cd ~/.globus
openssl pkcs12 -in foo.pem -inkey bar.pem -export -out foo.p12 

It will then ask for your password; this is the same password that you would use, e.g., when you run voms-proxy-init. The new .p12 file (foo.p12 in the command above) can then be imported into your browser.



Entering a UCSD ACT Customer Service Request

http://blink.ucsd.edu/go/csr



Preserving the RSL file from submitted grid jobs

On the server side, edit $VDT_LOCATION/globus/etc/globus-job-manager.conf and set "-save-logfile always".

That should preserve the gram_job_mgr files in the user's home directory for debugging.



Updating OSG CA Certificates

From http://vdt.cs.wisc.edu/releases/1.6.1/certificate_authorities.html:

Run the following in $VDT_LOCATION

# pacman -update CA-Certificates



Useful Grid Twiki for GRAM Errors

http://goc.grid.sinica.edu.tw/gocwiki/SiteProblemsFollowUpFaq



Installing Ubuntu on a Mac Mini

Using Grub

http://doc.gwos.org/index.php/UbuntuOnApple#Introduction_to_Linux_Installation_on_i386_Mac_Mini



UCSD Testing and Monitoring Links

Links, tools and sites related to monitoring the UCSD Tier2.

UCSDT2 ITB RSV Monitoring Link

https://osg-gw-3.t2.ucsd.edu:8443/rsv/

Query the ITB ReSS and BDII

condor_status -pool osg-ress-4.fnal.gov -constraint 'GlueSiteName=="UCSDT2-ITB1"' -l 

ldapsearch -x -h is-itb.grid.iu.edu -p 2170 -b mds-vo-name=UCSDT2-ITB1,mds-vo-name=local,o=grid   

Checking BDII publishing

Site to check whether UCSD is properly reporting to the BDII:

http://is.grid.iu.edu/cgi-bin/status.cgi

SAM Test Page

https://twiki.cern.ch/twiki/bin/view/CMS/SAMForCMS

CMS Prod Exit Code Results (CMS)

http://t2.unl.edu/pa/xml/quality_map_query?team=OSG

Job Robot Report

http://jobrobot.web.cern.ch/JobRobot/

VORS Monitoring

http://vors.grid.iu.edu/cgi-bin/index.cgi



SCRAM Template.pm Error

Error:

SCRAM Error: It appears that the module "Template.pm" is not installed. Please check your installaion. If you are an administrator, you can find the Perl Template Toolkit at www.cpan.org or at the web site of the author (Andy Wardley):

Fix: Install perl-Template-Toolkit and supporting packages

Purging CE Jobs

To fully purge the CE of jobs you need to:

  1. Remove or move the contents of the condor home area (e.g. /state/data/condor_local)
  2. Remove or move the contents of the GRAM area $GLOBUS_LOCATION/tmp/gram_job_state/gram_condor_log.*
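
The two steps above can be sketched as a small script that moves the files aside instead of deleting them, so they can still be inspected later. This is a sketch under assumptions: purge_to_backup is a hypothetical helper, the backup directory name is made up, and it assumes the Condor and Globus services have been stopped first (the steps above do not say so).

```shell
# Hypothetical helper: move every existing path given after the first
# argument into the backup directory named by the first argument.
purge_to_backup() {
    backup="$1"
    shift
    mkdir -p "$backup" || return 1
    for path in "$@"; do
        # An unmatched glob is passed through literally; skip it
        [ -e "$path" ] || continue
        mv "$path" "$backup"/ || return 1
    done
}

# Possible use on the CE (left commented out; paths as in the steps above):
# purge_to_backup "/root/ce-purge-$(date +%Y%m%d)" \
#     /state/data/condor_local/* \
#     "$GLOBUS_LOCATION"/tmp/gram_job_state/gram_condor_log.*
```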


Installing the cert infrastructure only from the VDT

This will install the parts needed to request host certs as well as keep CRLs and CAs up to date on a machine.


#!/bin/bash


mkdir -p /data/vdt
cd /data/vdt
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.25.tar.gz
tar zxvf pacman-3.25.tar.gz
chown root:root -R pacman-3.25
cd pacman-3.25
source setup.sh
cd /data/vdt


VDTSETUP_AGREE_TO_LICENSES=y
export VDTSETUP_AGREE_TO_LICENSES
VDTSETUP_ENABLE_ROTATE=y
export VDTSETUP_ENABLE_ROTATE
VDTSETUP_EDG_CRL_UPDATE=y
export VDTSETUP_EDG_CRL_UPDATE
VDTSETUP_CA_CERT_UPDATER=y
export VDTSETUP_CA_CERT_UPDATER
VDTSETUP_INSTALL_CERTS=r
export VDTSETUP_INSTALL_CERTS


pacman -pretend-platform:linux-rhel-4
pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:CA-Certificates
pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:CA-Certificates-Updater
pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:PPDG-Cert-Scripts



Condor jobs cannot find /state/data/condor_local/execute/dir_XXXX

Because the execute directory sits on a remounted file system, mount order is important: check that all underlying file systems are mounted before the remounts are done.

WS Gram Performance Optimization

http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Performance_Guide.html

Resetting priority factors on the Condor cluster

# Set the priority factor to 100 for every ligo user on osg-gw-2 through osg-gw-5
for i in $(condor_userprio -all -allusers | grep "@" | awk -F"@" '{print $1}' | grep ligo); do
    for j in $(seq 2 5); do
        condor_userprio -setfactor "${i}@osg-gw-${j}.t2.ucsd.edu" 100
    done
done

Local Users Mappings

uscms048
uscms1581
uscms099
uscms1658
uscms1633
uscms076
uscms1586
uscms1285
uscms1674

WS GRAM Errors

Error initializing GAHP

Check that Java is installed and that the condor_config correctly points to its location.
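
A quick way to run this check from the command line is sketched below. It assumes the Java location is set through the JAVA entry of the Condor configuration and queried with condor_config_val; check_java itself is a hypothetical helper name.

```shell
# Hypothetical helper: verify that the given path (normally the value
# of JAVA from the Condor configuration) is an executable file.
check_java() {
    java_path="$1"
    if [ -z "$java_path" ]; then
        echo "JAVA is not set in the Condor configuration" >&2
        return 1
    fi
    if [ ! -x "$java_path" ]; then
        echo "$java_path is missing or not executable" >&2
        return 1
    fi
    echo "JAVA points to $java_path"
}

# Possible use on the gatekeeper (left commented out; needs Condor):
# check_java "$(condor_config_val JAVA)" && "$(condor_config_val JAVA)" -version
```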

Additional CMS Config for OSG

Copy the following files from the old install to the new:

add-attributes.conf

./lcg/etc/add-attributes.conf

alter-attributes.conf

./lcg/etc/alter-attributes.conf
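
The copy can be sketched as follows. The function name and both arguments are hypothetical: the first is the root of the old install, the second the root of the new one, and the relative paths above are assumed to be relative to those roots.

```shell
# Hypothetical helper: copy the two LCG attribute files from the old
# install root ($1) to the new install root ($2), keeping timestamps.
copy_lcg_attributes() {
    old="$1"
    new="$2"
    mkdir -p "$new/lcg/etc" || return 1
    for f in add-attributes.conf alter-attributes.conf; do
        if [ -f "$old/lcg/etc/$f" ]; then
            cp -p "$old/lcg/etc/$f" "$new/lcg/etc/$f" || return 1
        else
            echo "missing: $old/lcg/etc/$f" >&2
        fi
    done
}
```

A file that is missing in the old install is reported on stderr and skipped rather than treated as fatal.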

Getting a slot wn-client environment on a node interactively

Log into a node as root

# chroot /chroot/cafuser1
# su - cafuser1
# source /code/osgcode/wn-client-itb/setup.sh

Rocks Commands

Adding a cabinet to rocks

rocks add appliance cabinet-5 membership="Cabinet 5" short-name='c' node='cab5-compute'

OSG-RSV Commands at UCSD

osg-gw-4

$VDT_LOCATION/osg-rsv/setup/configure_osg_rsv --user rsv --init --server y --ce-probes --ce-uri "osg-gw-4.t2.ucsd.edu" --srm-probes --srm-uri "srm-3.t2.ucsd.edu" --srm-dir /pnfs/t2.ucsd.edu/data4/cms/phedex/store/user/tmartin --gridftp-probes --gratia --grid-type "OSG" --consumers --verbose --setup-for-apache --proxy /tmp/x509up_u59001

osg-gw-2

$VDT_LOCATION/osg-rsv/setup/configure_osg_rsv --user rsv --init --server y --ce-probes --ce-uri "osg-gw-2.t2.ucsd.edu" --srm-probes --srm-uri "srm-3.t2.ucsd.edu" --srm-dir /pnfs/t2.ucsd.edu/data4/cms/phedex/store/user/tmartin --gridftp-probes --gratia --grid-type "OSG" --consumers --verbose --setup-for-apache --proxy /tmp/x509up_u59001

OSG RSV

Testing CA Cert Probe by hand

 su rsv -c "./cacert-crl-expiry-probe -m org.osg.certificates.cacert-expiry -u osg-gw-4.t2.ucsd.edu -x /tmp/x509up_u59001"

Gratia Search Links

https://t2.unl.edu/gratia/xml/dn_efficiency_summary?vo=cms&facility=UCSD&fixed-height=False

https://t2.unl.edu/gratia/xml/dn_wasted_summary?vo=cms&facility=UCSD&fixed-height=False

Making the RAID devices on the nodes by hand

In the event you need to do this by hand:

Create the partitions on the new disk

Stop the devices

mdadm --stop /dev/md0
mdadm --stop /dev/md1

Create the devices

mdadm --create /dev/md0 --chunk=256 --level=0 --raid-devices=4 /dev/sda2 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md1 --chunk=256 --level=0 --raid-devices=4 /dev/sda5 /dev/sdb3 /dev/sdc3 /dev/sdd3

Make the file systems

mkfs.ext3 /dev/md0; mkfs.ext3  /dev/md1
tune2fs -m0 /dev/md0; tune2fs -m0 /dev/md1

Fixing a corrupt ext3 Journal

debugfs -w -R "feature ^has_journal,^needs_recovery" /dev/md2
fsck -y /dev/md2
tune2fs -j /dev/md2

Bulk CA Certs for Web Browsers

TACAR keeps a repository of all the IGTF CAs. You can individually install the ones you care about directly in your browser (or try a bulk download and install)

https://www.tacar.org/repos/

SRM Ping

srm-ping srm://bsrm-1.t2.ucsd.edu:8443/srm/v2/server

VOMS Proxy and FTS

https://twiki.cern.ch/twiki/bin/view/CMS/PhedexAdminDocsVomsProxies

HADOOP

Error 1

Exception in thread "main" java.io.IOException: Mkdirs failed to create /cms/store/user/tmartin                        
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:358)                                 
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:487)                                                 
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:468)                                                 
Call to org.apache.hadoop.conf.FileSystem::create((Lorg/apache/hadoop/fs/Path;ZISJ)Lorg/apache/hadoop/fs/FSDataOutputStream;) failed!                 

Check that hadoop-site.xml is properly configured and that the CLASSPATH is set correctly.

Rocks Command Add Appliance at UCSD

rocks add appliance cabinet-5 membership="Cabinet 5" short-name='c' node='cab5-compute'
rocks add appliance cabinet-4 membership="Cabinet 4" short-name='c' node='cab4-compute'
rocks add appliance cabinet-6 membership="Cabinet 6" short-name='c' node='cab6-compute' 
rocks add appliance cabinet-7 membership="Cabinet 7" short-name='c' node='cab7-compute'

Memory copy of hadoop fsimage when restarting

First put Hadoop into safe mode, then run:

hadoop dfsadmin -metasave <filename>

Authors

-- RamiVanguri - 03 Aug 2006

-- TerrenceMartin - 19 Mar 2007

Topic revision: r47 - 2009/11/10 - 23:59:32 - TerrenceMartin
 