SPNEGO: Map SPNs and Create Combined Keytab Files In One Step


I have been wanting to blog about my SPNEGO install guide for a while but have been just a bit busy lately (my usual excuse). However, I just had to help a client set up SPNEGO for their IBM Connections environment, so I decided the time for procrastination is over.

 

If you look at the IBM documentation, the process to create the SPNEGO keytab files and map the correct URLs and fully qualified hostnames (FQHNs) of servers to the AD account is rather onerous. The IBM documentation has you create a separate keytab file for each URL/FQHN that you want to include in the SPNEGO config and then merge them. For the normal user setting up SPNEGO for the first time that is painful and confusing indeed. My process below does it all in one step (one step per URL/FQHN) and adds all the settings to ONE keytab file. I am usually done in 5 minutes, then create the config file using wsadmin commands, and am up and running with SPNEGO in under an hour.

Note: all commands below have to happen ON AN AD DOMAIN CONTROLLER; running them on your workstation will not work.

 

Environment / Variables:

  • SPNEGO AD account: SPNEGOAccount@DOMAIN.COM – DOMAIN\SPNEGOAccount
  • Server FQHN: serverfqhn1.example.com, serverfqhn2.example.com, serverfqhn3.example.com, etc.
  • Connections URL (c-record): connections.example.com



Step 1: Check current SPN mappings for the SPNEGO AD account:

  • setspn -l SPNEGOAccount
    (review output)
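
One more optional check, assuming a reasonably current setspn version (Windows Server 2008 or later): you can query the forest for a single SPN and scan for duplicate registrations, which are a classic cause of broken Kerberos/SPNEGO. A small sketch, using the placeholder hostname from the environment list above:

    REM Search the forest for a specific SPN - it should come back mapped to the SPNEGO account
    setspn -Q HTTP/connections.example.com

    REM Report duplicate SPNs across the forest - duplicates will break Kerberos authentication
    setspn -X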


Step 2: Add SPN mappings to SPNEGOAccount and create the keytab file

[setspn -s] or [setspn -a] could be used just to add/map the SPNs to the account, but that does not create a keytab file:

  • setspn -s HTTP/servernew.example.com SPNEGOAccount
  • setspn -s HTTP/newsite.example.com SPNEGOAccount

 

Run the commands below to create a SINGLE keytab file AND map the SPNs to the account at the same time:

  • ktpass -princ HTTP/servernew.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser SPNEGOAccount -mapOp set -pass password1A -in C:\Temp\KRB\krb5.keytab -out C:\Temp\KRB\krb5.keytab
  • ktpass -princ HTTP/newsite.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser SPNEGOAccount -mapOp add -pass password1A -in C:\Temp\KRB\krb5.keytab -out C:\Temp\KRB\krb5.keytab

 

Note: the first command uses [set]; all the following commands (one for each URL/FQHN you want to add) use [add]. If you do not use [add], each subsequent command will override the previous one, leaving your AD account with only one FQHN/URL mapped to it. THIS IS IMPORTANT!
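
If you have more than a handful of URLs/FQHNs to map, a small batch sketch like the one below keeps the [set]/[add] logic straight. The hostnames, realm, password and keytab path are just the placeholders from this example – adjust them to your environment. In this sketch the first call only writes the keytab with -out; every following call reads it back in with -in and appends:

    @echo off
    setlocal

    REM Placeholders from the example above - adjust to your environment
    set KEYTAB=C:\Temp\KRB\krb5.keytab
    set ACCOUNT=SPNEGOAccount
    set PASS=password1A
    REM The Kerberos realm is normally the AD domain name in UPPER CASE
    set REALM=EXAMPLE.COM

    REM First principal: -mapOp set creates the mapping and writes a fresh keytab
    ktpass -princ HTTP/serverfqhn1.example.com@%REALM% -ptype KRB5_NT_PRINCIPAL -mapUser %ACCOUNT% -mapOp set -pass %PASS% -out %KEYTAB%

    REM Every additional URL/FQHN: -mapOp add, reading and re-writing the SAME keytab file
    for %%H in (serverfqhn2.example.com serverfqhn3.example.com connections.example.com) do (
        ktpass -princ HTTP/%%H@%REALM% -ptype KRB5_NT_PRINCIPAL -mapUser %ACCOUNT% -mapOp add -pass %PASS% -in %KEYTAB% -out %KEYTAB%
    )

    endlocal
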
Step 3: Check whether the SPNs are all correct:

  • setspn -l SPNEGOAccount
    (review the output and confirm the mappings are there)
  • ldifde -f c:\temp\new-output1.txt -r "(servicePrincipalName=HTTP/serverfqhn1.example.com)"
  • ldifde -f c:\temp\new-output2.txt -r "(servicePrincipalName=HTTP/connections.example.com)"
    (Get the output files and review them)
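
As an extra sanity check you can also request a Kerberos service ticket against one of the mapped SPNs from a domain-joined Windows client (Windows 7 / Server 2008 R2 or later ship a klist that can do this). If the mapping is good you get an HTTP/ ticket back; if not, you get a KDC error. The hostname is the placeholder C-record from above:

    REM Request a Kerberos service ticket for one of the mapped SPNs
    klist get HTTP/connections.example.com

    REM List the tickets in the current logon session - the HTTP/ ticket should show up here
    klist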

 

 

Some Gotchas

Which URLs/C-records and server FQHNs to map:

I map EVERYTHING. The main reason is that often your C-record for the site (in our example connections.example.com) will point to the FQHN of a server or a load balancing device. In that case you need BOTH of them mapped. I map all web servers/IHS, WAS servers and (if it exists) the LB address (this is usually overkill and not necessary … but paranoia pays off sometimes).

Command errors:

Depending on your AD forest, the above ktpass command might need the AD account you are mapping to in either the [ACCOUNTNAME@DOMAIN.COM] format or the [DOMAIN\ACCOUNTNAME] format. You will see the error right away when you run it for the first time.
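
In other words, if the first attempt errors out, re-run the exact same command and only switch the -mapUser notation – everything else stays identical. Both variants below are a sketch using the placeholders from this post:

    REM UPN notation
    ktpass -princ HTTP/servernew.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser SPNEGOAccount@DOMAIN.COM -mapOp set -pass password1A -out C:\Temp\KRB\krb5.keytab

    REM Down-level (NetBIOS) notation
    ktpass -princ HTTP/servernew.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser DOMAIN\SPNEGOAccount -mapOp set -pass password1A -out C:\Temp\KRB\krb5.keytab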

SPNEGO setting in WebSphere:

If you go by the IBM documentation (there is a lot of it flying around) you will see that it generally tells you to add the FQHN of the Deployment Manager as the HOSTNAME in the SPNEGO configuration. Keep in mind that works for them because the testers generally work with single-server test installs where ALL the systems run on one server, the Dmgr is also the IHS server, and often they don't bother to change the URL for the Connections setup. What you need in there is the C-record your users will be putting into their browsers to get to Connections, in our example connections.example.com. Should the C-record point to the FQHN of a web server then you could input that address as well. That is why I generally map EVERYTHING; that way you have maximum flexibility should you need to finagle with your architecture and move functionality around.

Oops, you forgot something …

If you suddenly notice that you have to add servers to the SPNEGO setup (maybe you are migrating) – DO NOT ADD MORE MAPPINGS TO THE SPNEGO AD ACCOUNT. That will invalidate the existing keytab files and you will have an SSO outage. To add additional mappings you have to stop all WebSphere servers involved, then add the mappings with the ktpass command using the [add] option against the existing keytab file from one of your WebSphere servers. Then recreate the config file using wsadmin and replace the old keytab files with the new one.
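
A minimal sketch of that "add a server later" scenario, assuming the existing keytab has been copied back to C:\Temp\KRB\krb5.keytab on the domain controller and all involved WebSphere servers are stopped (the new hostname is a made-up placeholder):

    REM Append the new FQHN to the EXISTING keytab - note -mapOp add, never set
    ktpass -princ HTTP/newserver.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser SPNEGOAccount -mapOp add -pass password1A -in C:\Temp\KRB\krb5.keytab -out C:\Temp\KRB\krb5.keytab

    REM Afterwards: recreate the SPNEGO config with wsadmin and replace the keytab on all WebSphere servers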

IBM Connections: Metrics / Cognos and the HTTP timeout . . . .


Let me preface this post with a statement:

I really, really don’t like Cognos. Metrics is a pain as well.
And .. in case it lacked sufficient emphasis … I really, R E A L L Y don't like Cognos.

 

This is an interesting one that I have been battling with for quite some time at my current client. We had been running into errors with Cognos reports not finishing, and the only errors we saw were in the SystemOut.log files, for HTTP sessions suddenly being reset:

[3/5/13 11:51:05:341 EST] 000000b8 CognosBIReque 3 com.ibm.connections.metrics.reportgeneration.cognos.CognosBIRequestProcessor processCognosBIRequest post jobTemplateSearchPath=/content/folder[@name='IBMConnectionsMetrics']/package[@name='Metrics']/folder[@name='static']/jobDefinition[@name='jobtemplate5']

[3/5/13 11:56:05:532 EST] 000000b8 SystemErr R java.net.SocketException: Connection reset

Metrics sends Cognos 5 HTTP requests for each report time range – these correspond to the Jobtemplate1 – Jobtemplate5 reports in Cognos that are called and executed. These HTTP requests are synchronous calls, so they have to stay connected and wait until the Jobtemplate call is finished so Metrics can update the progress. For all successful calls you will see HTTP status 200 results, and that is exactly what you want. We were seeing the above resets for the Jobtemplate4 and Jobtemplate5 calls – it was KILLING ME.

Metrics was not at fault – it has its timeout setting in the metrics-config.xml file (secsPerRequest) and that was set to 3600, so it was off the list of culprits.

We raised the HTTP servers' plugin-cfg.xml timeout setting (ServerIOTimeout) first to 400 seconds and then to 600, and we saw no change.

We then did a test – we changed the interService href in the LotusConnections-config.xml file as follows – by the way, that only works because we have a single Cognos server, not a clustered pair:

<sloc:serviceReference bootstrapHost="" bootstrapPort="" clusterName="admin_replace" enabled="true" serviceName="cognos" ssl_enabled="true">
    <sloc:href>
        <sloc:hrefPathPrefix>/cognos</sloc:hrefPathPrefix>
        <sloc:static href="http://connect.domain.com" ssl_href="https://connect.domain.com"/>
        <sloc:interService href="https://cognosserverFQHN.domain.com:9443"/>
    </sloc:href>
</sloc:serviceReference>

Drum-roll ….. Here we go, it fixed the issue, but it caused the progress display ("xxx% complete") on the Metrics page to be permanently stuck at 0%. What this did do was point us to the problem …. the F5 load balancer that was sitting in front of the dual HTTP servers. It had a permanent 5 minute HTTP thread timeout set and was killing ANY thread that ran over 5 minutes.

 

The Takeaway:

Metrics/Cognos spawns exactly 110 jobs for each Community metrics update request; many of these requests will go over 5 minutes, so you should check that every device/server in your network path has a higher HTTP timeout setting.

 

WebSphere: Errors installing Plug-in fix pack on IHS V7.x


This was a new one; not even the IBMers that I consulted with had run into this before.

I have been working at a client site on a large IBM Connections project since last year – V3.0.0, upgrade to 3.0.1, upgrade to 3.0.1.1 … now multiple code drops for V4 beta installs and preparation to get gold code V4 up as soon as possible (once it is released). In the course of the last year I have probably installed and upgraded more WAS and IHS servers than I had in several years previously – loving every moment of it!

Problem:

Today I had a new V7.0 IHS on AIX to set up and we were running into issues installing the plug-in fix pack for 7.0.0.21. The IHS fix pack went in without a problem, but the plug-in fix pack did not. Errors, errors, errors:

java.lang.NullPointerException
        at com.ibm.ws.install.ni.framework.simplugins.SimVerifyFilePermissionsPlugin$ValidateFilePermissions.checkFilePermissions(SimVerifyFilePermissionsPlugin.java:245)
        at com.ibm.ws.install.ni.framework.simplugins.SimVerifyFilePermissionsPlugin$ValidateFilePermissions.checkFilePermissions(SimVerifyFilePermissionsPlugin.java:317)
        at com.ibm.ws.install.ni.framework.simplugins.SimVerifyFilePermissionsPlugin$ValidateFilePermissions.run(SimVerifyFilePermissionsPlugin.java:139)
java.lang.NullPointerException
        at com.ibm.ws.install.ni.framework.simplugins.SimVerifyFilePermissionsPlugin$ValidateFilePermissions.checkFilePermissions(SimVerifyFilePermissionsPlugin.java:245)
        at com.ibm.ws.install.ni.framework.simplugins.SimVerifyFilePermissionsPlugin$ValidateFilePermissions.checkFilePermissions(SimVerifyFilePermissionsPlugin.java:317)
        at com.ibm.ws.install.ni.framework.simplugins.SimVerifyFilePermissionsPlugin$ValidateFilePermissions.run(SimVerifyFilePermissionsPlugin.java:139)

Nothing was working … I looked at file permissions, ownership, etc. – no change. Root or no root – it failed. I did some searching and after a while came across this technote: swg21408430.

The errors were close enough – I was already using the latest version of the IBM UpdateInstaller (at this time 7.0.0.23), but I decided to wipe out the UpdateInstaller and install it once more, fresh, into a DIFFERENT folder (cue the drum roll) …. that made the difference.

So, sometimes it is not the tool being installed, but rather the tool doing the install that is at fault. AND – UNIX file permissions are not always at fault either. Poor little Unix – there's a good boy!!