Setting up R-GMA for CMS/LCG0 Testbed DRAFT!

  1. UI
  2. BOSS DB
  3. Receiver
  4. Worker Nodes
  5. Farm (and receiver) Servlets
  6. Registry

User Interface

LCG0 ships without the UK e-Science CA certificates - need to add the UK e-Science CA RPM manually.
LCG0 has no CRL fetch-update script - add manually.

IC-young: installed R-GMA+BOSS on top of std RH7.3, then added LCG0 UI; can't
connect to LB. Looks like LB tries a reverse DNS lookup?

Instead, added BOSS to an LCG0 UI installed by scripts: seems to work.
Copy across the BOSS binaries as compiled on IC-young. Leave DB and
receiver on IC-young.

	boss submit -scheduler edg -jobtype counterdemo -classad CounterDemo.jdl

Steer the submitted job to the correct farm with a condition in the JDL:

	Requirements = (other.GlueCEUniqueID=="gw37.hep.ph.ic.ac.uk:2119/jobmanager-pbs-lcgq");


Worker Nodes

May need to add UK e-Science CA RPM to WN (and CE/GK).

Jobs running on the worker nodes use the linked-in R-GMA API to send data to a nearby set of user servlets, probably on a MON box associated with the farm. Hence all that's really needed are some config files to tell the API where to look for the servlets.

It would also be a good idea to install some support rpms in case of problems with static linking.

Install 
	w3c-libwww-5.3.2-5.i386.rpm
	cppunit-1.8.0-1.i386.rpm
	log4cpp-0.3.4b-3.i386.rpm
	xerces-c-1.7.0-1.i386.rpm

	GNU.LANG_gcc_sys-2.95.2-0_asis_1.i386.rpm
	Add /usr/local/lib to ${LD_LIBRARY_PATH}
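
A minimal sketch of the LD_LIBRARY_PATH step, assuming a Bourne-type shell. The helper name add_lib_dir is made up for illustration; it only appends the directory if it is not already listed, so it is safe to source more than once.

```shell
# Append a directory to LD_LIBRARY_PATH only if it is not already there.
# add_lib_dir is a hypothetical helper, not part of R-GMA or EDG.
add_lib_dir() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}" ;;
    esac
    export LD_LIBRARY_PATH
}

add_lib_dir /usr/local/lib
```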

Create directory
	/opt/edg/etc/edg-rgma
and set the RGMA_PROPS variable to point to it.

Create one-line files:
	${RGMA_PROPS}/CanonicalProducer.props
		canonicalProducerServletLocation=http://servlethost:8080/R-GMA/CanonicalProducerServlet		
	${RGMA_PROPS}/CircularBufferProducer.props
		producerServletLocation=http://servlethost:8080/R-GMA/ProducerServlet
	${RGMA_PROPS}/DatabaseProducer.props
		databaseProducerServletLocation=http://servlethost:8080/R-GMA/DBProducerServlet
	${RGMA_PROPS}/LatestProducer.props
		producerServletLocation=http://servlethost:8080/R-GMA/LatestProducerServlet
	${RGMA_PROPS}/Producer.props
		producerServletLocation=http://servlethost:8080/R-GMA/ProducerServlet	
	${RGMA_PROPS}/ResilientStreamProducer.props
		producerServletLocation=http://servlethost:8080/R-GMA/ResilientStreamProducerServlet
	${RGMA_PROPS}/StreamProducer.props
		producerServletLocation=http://servlethost:8080/R-GMA/StreamProducerServlet

	${RGMA_PROPS}/XMLConverter.props
		schemaLocation=${RGMA_PROPS}/XMLResponse.xsd

with the name of the servlet host (5a on diagram) substituted in, and make 
sure the schema file is at ${RGMA_PROPS}/XMLResponse.xsd.
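
The file list above could be generated along the following lines. This is only a sketch, not the setup script referred to in these notes: write_rgma_props is a hypothetical helper, and writing the expanded directory path into XMLConverter.props (rather than a literal ${RGMA_PROPS}) is an assumption.

```shell
# Sketch: write the one-line R-GMA .props files for a given servlet host.
# write_rgma_props is a hypothetical helper; the file names and property
# keys are the ones listed in these notes.
write_rgma_props() {
    props_dir=$1      # e.g. /opt/edg/etc/edg-rgma
    servlet_host=$2   # e.g. the MON box for the farm
    base="http://${servlet_host}:8080/R-GMA"

    mkdir -p "$props_dir" || return 1

    # Each input line is: file name, property key, servlet name.
    while read -r file key servlet; do
        printf '%s=%s/%s\n' "$key" "$base" "$servlet" > "${props_dir}/${file}.props"
    done <<EOF
CanonicalProducer canonicalProducerServletLocation CanonicalProducerServlet
CircularBufferProducer producerServletLocation ProducerServlet
DatabaseProducer databaseProducerServletLocation DBProducerServlet
LatestProducer producerServletLocation LatestProducerServlet
Producer producerServletLocation ProducerServlet
ResilientStreamProducer producerServletLocation ResilientStreamProducerServlet
StreamProducer producerServletLocation StreamProducerServlet
EOF

    # The XML converter points at the schema file rather than a servlet.
    # Assumption: the path is written out in full.
    printf 'schemaLocation=%s/XMLResponse.xsd\n' "$props_dir" > "${props_dir}/XMLConverter.props"
}

# Usage (as a user who can write to the directory):
#   write_rgma_props /opt/edg/etc/edg-rgma servlethost
```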

Also, IF you use a web proxy and have environment variables set up for 
wget, etc. THEN make sure they are in exactly the form below 
(lower case, full URL, include localhost):
	http_proxy="http://wwwcache.brunel.ac.uk:10000/"
	no_proxy="brunel.ac.uk,brunel.ac.uk:8080,localhost,localhost:8080"
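
A quick sanity check for the proxy variables might look like this. It is a sketch only: check_proxy_env is a hypothetical helper, and treating the trailing slash as required is an assumption read off the example above.

```shell
# Sketch: warn if the proxy variables are not in the form shown above.
# check_proxy_env is a hypothetical helper name.
check_proxy_env() {
    status=0
    if [ -n "${HTTP_PROXY:-}" ] && [ -z "${http_proxy:-}" ]; then
        echo "HTTP_PROXY is set but http_proxy is not: use the lower-case name" >&2
        status=1
    fi
    if [ -n "${http_proxy:-}" ]; then
        case "$http_proxy" in
            http://*/) ;;   # full URL: scheme, host:port, trailing slash
            *) echo "http_proxy should be a full URL, e.g. http://host:port/" >&2
               status=1 ;;
        esac
        case ",${no_proxy:-}," in
            *,localhost,*) ;;   # localhost is listed
            *) echo "no_proxy should include localhost" >&2
               status=1 ;;
        esac
    fi
    return $status
}
```

Running check_proxy_env with no proxy variables set returns success, since the check only applies when a proxy is in use.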

There is a shell script available which will set up the files for you (remember to edit the name of your servlet host), and tell you which environment variables/RPMs you'll have to set up manually (it will download any needed RPMs but won't install them itself). You may also need this XMLResponse.xsd file (from R-GMA 3.1.39).

If you don't yet have a servlet host/MON box, you can put in a dummy name or just use young.brunel.ac.uk, the script's default (nothing should be using R-GMA yet). [If your WNs can see your local web proxy, you could specify the latter in the env. vars described above, in which case young.brunel.ac.uk should even work - but this is presented as a bodge for your amusement, rather than a serious suggestion/requirement!]

Trying this at IC:

Create directory
	/opt/edg/etc/edg-rgma
Bodge the RGMA_PROPS variable by adding to the end of
/opt/globus/etc/globus-user-env.sh
	RGMA_PROPS="/opt/edg/etc/edg-rgma"

Should really create /etc/profile.d/edg-rgma-env.sh containing 
	export RGMA_PROPS="/opt/edg/etc/edg-rgma" 
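
A sketch of what that profile.d file could contain, assuming Bourne-type login shells. The variable is exported so that it reaches processes started from the shell, not just the shell itself.

```shell
# /etc/profile.d/edg-rgma-env.sh (sketch)
# Point the R-GMA API at its config directory and export the variable
# so that child processes (e.g. jobs) inherit it.
RGMA_PROPS="/opt/edg/etc/edg-rgma"
export RGMA_PROPS
```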

Dependency also on w3c-libwww - added check in script.

In R-GMA 3.2.22, deps on 
	w3c-libwww
	cppunit
	log4cpp
are removed. Still stuck with Xerces though.

Can download dependencies from http://www.brunel.ac.uk/~eesrjjn/grid/boss/deps/

Farm Servlets

Setting up a set of servlets on a scratch RH73 box (like a manual MON box install, but without any monitoring...). Based on "Notes on setting up EDG UI" above.

Pre-install

Firewall: you will need ports 8080 (for http) and 8443 (for https) open both outbound and incoming, depending on which you are using. It might be wise to have an externally-visible DNS entry as well as the "real" IP address.

Start from a "stock" RedHat 7.3 box with the current set of updates - make sure 
you have the relevant packages.

I'm using OpenSSH -
        openssh-askpass-3.1p1-14
        openssh-askpass-gnome-3.1p1-14
        openssh-clients-3.1p1-14
        openssh-3.1p1-14
        openssh-server-3.1p1-14

Need libwww, including devel -
        w3c-libwww-apps-5.3.2-5
        w3c-libwww-devel-5.3.2-5
        w3c-libwww-5.3.2-5

Need OpenSSL, including devel -
        openssl-0.9.6b-35.7
        openssl-devel-0.9.6b-35.7
        // openssl-perl-0.9.6b-35.7

Need wget -
        wget-1.8.2-4.73

Need Python2 -
        python2-2.2.2-11.7.3
        python2-devel-2.2.2-11.7.3
        expat-1.95.2-2
        expat-devel-1.95.2-2

The PYTHONPATH variable is set up in the edg_user_env script, so
no need to do anything permanent now - but may be good idea to set it 
in shell used for install process.
[ setenv PYTHONPATH "/opt/edg/lib" or whatever ]

Since you will be using the bits of R-GMA that need a MySQL database, 
(i.e. the Archiver servlet, unless you mess around trying to selectively 
disable it) DON'T rely on the MySQL version that comes with RedHat 7.3 - 
you'll spend hours trying to get round the dependency failures.

R-GMA

Need MySQL for Archiver servlet, even if not running registry. This is not used for BOSS/R-GMA, so you could use an existing MySQL host: specify it when running edg-rgma-config later.

Full MySQL 3.23 from MySQL site
        MySQL-3.23.56-1.i386.rpm
        // MySQL-bench-3.23.56-1.i386.rpm 
        MySQL-client-3.23.56-1.i386.rpm 
        MySQL-devel-3.23.56-1.i386.rpm
        MySQL-Max-3.23.56-1.i386.rpm
        MySQL-shared-3.23.56-1.i386.rpm

If upgrading a std. RH7.3 install, need to remove
        mod_auth_mysql-1.11-1
        MyODBC-2.50.39-4
        MySQL-python-0.9.1-1
        perl-DBD-MySQL-1.2219-6
        php-mysql-4.1.2-7.3.6
        qt-MySQL-3.0.5-7.14
(came with RH7.3) and replace them after.

Probably best to add a password for root now. (After an upgrade the old passwords will still be in place.)
	mysqladmin -u root password 'new password'
	mysqladmin -u root -h {machinename} password 'new password'

If for some reason MySQL doesn't want to start automatically on reboot, do 
        /sbin/chkconfig mysql on
as root.


mm.MySQL (JDBC driver for MySQL, available in the EDG 2 MON external 
collection):
        mm.mysql-2.0.14-1edg


Java 
(available in the EDG 2 MON external collection):
        j2sdk-1.4.1_01-fcs.i586.rpm 
        j2sdk_profile-1.4.1_01-1.noarch.rpm

(You can get away with only the run-time env. jre if you will always use 
non-Java APIs to R-GMA. Remember to remove kaffe/jikes first if they are
already installed...)


Tomcat (from EDG 2 MON external collection):
        tomcat4-4.1.18-full.1jpp.noarch.rpm

>>Once this rpm has been installed please read the instructions given by
>>the rpm. It tells you to create a file /etc/java.conf. This file should
>>contain the location of java eg. JAVA_HOME=/usr/java/j2sdk1.4.1

Tomcat only needs install (rpm -ivh ...) - it is configured and started up 
by R-GMA's setup script, but you may want to briefly fire it up and check 
for errors in the logs. Stop tomcat4 service, and ensure tomcat4 user can 
read config files (server.xml) in /var/tomcat4/conf/ (don't chown/chgrp to 
'tomcat4' as R-GMA must still configure).

"For starters create a /etc/sysconfig/edg file with a 
	EDG_LOCATION=/opt/edg"
Steve Traylen
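
The two small config files mentioned above (/etc/java.conf and /etc/sysconfig/edg) can be created by hand or with something like the sketch below. write_edg_sysconfig and its optional root prefix are hypothetical, added so the function can be tried outside /etc; the JAVA_HOME value is just the example given in the RPM instructions and should match wherever the j2sdk RPM actually installed.

```shell
# Sketch: create /etc/java.conf and /etc/sysconfig/edg.
# write_edg_sysconfig and the "root" prefix are hypothetical.
write_edg_sysconfig() {
    root=${1:-}   # leave empty to write to the real /etc (needs root)
    mkdir -p "${root}/etc/sysconfig" || return 1
    # Example value from the tomcat4 RPM instructions; adjust to your JDK.
    echo 'JAVA_HOME=/usr/java/j2sdk1.4.1' > "${root}/etc/java.conf"
    echo 'EDG_LOCATION=/opt/edg' > "${root}/etc/sysconfig/edg"
}
```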

Get latest R-GMA from WP6 repository by browsing the RH7.3 autobuild 
directory:

server+Java API:
	edg-rgma-api-java-3.1.39-1.noarch.rpm
	edg-rgma-common-3.1.39-1.noarch.rpm
	edg-rgma-servlets-3.1.39-1.noarch.rpm
	edg-rgma-sqlutil-3.1.39-1.noarch.rpm

	Deps from EDG 2 MON external collection
		bouncycastle-jdk14-1.14-1.noarch.rpm
		log4j-1.2.6-1jpp.noarch.rpm
		netlogger-jar-1.0.0-1.i386.rpm
		prevayler-1.3.3-2.i386.rpm
		xerces-j1-1.4.4-12jpp.noarch.rpm
		xml-commons-1.0-0.b2.1jpp.noarch.rpm
		xml-commons-apis-1.0-0.b2.1jpp.noarch.rpm
	Deps from EDG 2 MON edg collection
		edg-java-security-client-1.4.1-1.noarch.rpm
	Deps from EDG 2 UI edg collection
		edg-user-env-0.3-1.noarch.rpm (may not actually be needed)

Pulse:
	edg-rgma-pulse-3.1.39-1.noarch.rpm
	Deps from EDG 2 MON external collection:
		jas-jar-1.0.0-1.i386.rpm
		jxUtil-jar-1.0.1-1.i386.rpm

C++ API (if need to compile something):
	edg-rgma-api-cpp-3.1.39-1.i386.rpm  
	edg-rgma-api-cpp-devel-3.1.39-1.i386.rpm (header files)
	Deps (hunt in WP6 repository)
		log4cpp-0.3.4b-3.i386.rpm
		xerces-c-1.7.0-1.i386.rpm
	Deps from EDG 2 MON external collection:
		GNU.LANG_gcc_shr-2.95.2-0_asis_1.noarch.rpm
		GNU.LANG_gcc_sys-2.95.2-0_asis_1.i386.rpm

As root, run edg-rgma-config (/opt/edg/sbin/edg-rgma-config)
and answer questions about registry and schema (see below).

Then source /opt/edg/etc/profile.d/edg-rgma-env.csh in user .cshrc.

In /opt/edg/etc/edg-rgma/rgma-schemaBrowser.xml, replace the whole {URI} for
the Schema and Registry servers to be used - note this includes the servlet
path and name (/R-GMA/SchemaServlet, etc.). This is probably
	http://infocat.gridpp.ac.uk:8080/R-GMA/SchemaServlet
[This is young.brunel.ac.uk for CMS/BOSS test]
[ general: infocat.gridpp.rl.ac.uk:8080]
If you are likely to be playing with more than one registry, you may want 
to create different files and use a symbolic link to select the 
appropriate one. 
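
One possible way to do the symbolic link switch, as a sketch. The helper select_registry and the rgma-schemaBrowser.NAME.xml naming scheme are assumptions for illustration only.

```shell
# Sketch: keep one schema-browser config per registry and point the live
# file at the chosen one. select_registry and the per-registry file names
# (rgma-schemaBrowser.<name>.xml) are hypothetical.
select_registry() {
    props_dir=$1   # e.g. /opt/edg/etc/edg-rgma
    name=$2        # e.g. "brunel" or "gridpp"
    target="rgma-schemaBrowser.${name}.xml"

    if [ ! -f "${props_dir}/${target}" ]; then
        echo "no such config: ${props_dir}/${target}" >&2
        return 1
    fi
    # -s makes a symbolic link, -f replaces whatever is there already.
    # A relative target keeps the link valid if the directory moves.
    ( cd "$props_dir" && ln -sf "$target" rgma-schemaBrowser.xml )
}
```

Note that ln -sf will also replace a regular rgma-schemaBrowser.xml, so copy the original to a per-registry name before the first switch.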

In user's .bashrc
alias pulse="edg-rgma-pulse ${RGMA_PROPS}/rgma-schemaBrowser.xml &"
or in .cshrc
alias pulse 'edg-rgma-pulse ${RGMA_PROPS}/rgma-schemaBrowser.xml &'
saves a lot of typing...

In the ${RGMA_PROPS} directory, edit all the .props files so that they
contain the full DNS name of the machine rather than just "localhost"
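
The localhost replacement could be scripted roughly as follows. A sketch only: fix_props_host is a hypothetical helper, and it avoids sed -i because older sed versions may not have it.

```shell
# Sketch: replace "localhost" in servlet URLs with the machine's full
# DNS name in every .props file. fix_props_host is a hypothetical helper.
fix_props_host() {
    props_dir=$1   # e.g. $RGMA_PROPS
    fqdn=$2        # e.g. the output of `hostname -f`
    for f in "$props_dir"/*.props; do
        [ -f "$f" ] || continue
        # Only touch the host part of a URL: "//localhost" followed by ":" or "/".
        sed "s|//localhost\([:/]\)|//${fqdn}\1|g" "$f" > "${f}.tmp" || return 1
        # Write back through the original file so its permissions are kept.
        cat "${f}.tmp" > "$f" && rm -f "${f}.tmp"
    done
}

# Usage:
#   fix_props_host "$RGMA_PROPS" "$(hostname -f)"
```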

Should be it.

Servlet logging is controlled by
	$EDG_LOCATION/share/webapps/R-GMA/WEB-INF/classes/log4j.properties
while Java API logging is controlled by
	$RGMA_PROPS/log4j.props

Upgrades (to R-GMA 3.2.22, 3.3.28, 3.3.35, 3.3.37, 3.3.45):
	- Stop tomcat
	- rpm -Fvh any updated dependencies
	- rpm -Fvh R-GMA
	- run rgma-config
		If on a machine with any servlets, claim user servlets are 
		on localhost - this will create new-format web.xml with
		correct registry entries. Then re-run giving all FQDNs.
	- check .props files
	- correct $EDG_LOCATION/share/webapps/R-GMA/WEB-INF/web.xml
	- check /opt/edg/var/edg-rgma/registryconfig.xml
	- fix pulse config files
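
The first four upgrade steps can be strung together as below. This is a sketch under assumptions: the RPM directory layout is made up, the config script is taken to be /opt/edg/sbin/edg-rgma-config as for the initial install, and the default is a dry run that only prints the commands. The remaining checks (.props files, web.xml, registryconfig.xml, pulse config) still need doing by hand.

```shell
# Sketch of the upgrade sequence. With DRY_RUN=1 (the default here) the
# commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_rgma() {
    rpm_dir=$1   # hypothetical layout: new R-GMA RPMs here, dependencies in deps/
    run /sbin/service tomcat4 stop
    run rpm -Fvh "$rpm_dir"/deps/*.rpm      # freshen any updated dependencies
    run rpm -Fvh "$rpm_dir"/edg-rgma-*.rpm  # freshen R-GMA itself
    run /opt/edg/sbin/edg-rgma-config       # assumed name/path of the config script
}

# Usage (as root, once the dry-run output looks right):
#   DRY_RUN=0 upgrade_rgma /root/rgma-rpms
```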

MON boxes:
Check you have
	# Java VM options for memory control
	JAVA_OPTS="-Xms64m -Xmx256m"

in /etc/tomcat4/tomcat4.conf near start

Add 
	#Bodge for R-GMA
	ulimit -n 3072

to /etc/init.d/tomcat4, near start
(must restart using /sbin/service for this to take effect)
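
A rough check that both MON box settings are in place, as a sketch. check_mon_settings is a hypothetical helper; it only looks for the presence of the lines, not their values or their position in the file.

```shell
# Sketch: check that the tomcat4 memory options and the file-descriptor
# limit are present. check_mon_settings is a hypothetical helper.
check_mon_settings() {
    conf=${1:-/etc/tomcat4/tomcat4.conf}
    init=${2:-/etc/init.d/tomcat4}
    status=0
    # Look for an uncommented JAVA_OPTS line that sets a maximum heap size.
    if ! grep -q -e '^[[:space:]]*JAVA_OPTS=.*-Xmx' "$conf"; then
        echo "no JAVA_OPTS with -Xmx found in $conf" >&2
        status=1
    fi
    # Look for an uncommented "ulimit -n" line in the init script.
    if ! grep -q -e '^[[:space:]]*ulimit -n' "$init"; then
        echo "no 'ulimit -n' found in $init" >&2
        status=1
    fi
    return $status
}
```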

Open the web.xml file (/var/tomcat4/webapps/R-GMA/WEB-INF/web.xml). Find 
the 'maxTupleMemory' init parameter and set this to -1 (it will be under 
the StreamProducerServlet). Then restart tomcat.
Rob Byrom 12 Dec 2003 
