Automatic emails when your job starts/stops

On this page, I will show you how to configure email notifications when your job starts or stops. For this example, I will use Amazon Simple Email Service (SES), but you can use any SMTP provider.

Info

Note 1: By default, the Scale-Out Computing on AWS admin user you created during the installation does not have an associated email address. If you want to use this account, you must edit LDAP and add the "mail" attribute.

Note 2: All qmgr commands must be executed on the scheduler host.

Configure SES sender domain

Open the SES console and verify your domain (or specific email addresses). For this example, I will verify my entire domain (soca.dev) and enable DKIM support to prevent email spoofing.

Click 'Verify this domain' to get the list of DNS records you need to create for verification. Once done, wait a couple of hours; you will receive a confirmation when your DNS records are validated.

Configure recipient addresses

By default, SES places your account in the sandbox, where you can only send email to recipients that you have verified manually.

If you want to send email to any address, you need to request production access.
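To check from code whether an account is still in the sandbox, a minimal sketch could look like the following. It assumes boto3 is installed and uses the SES v2 `get_account` call; the helper name `in_sandbox` is hypothetical and not part of Scale-Out Computing on AWS.

```python
def in_sandbox(sesv2_client):
    """Return True when the SES account is still limited to verified recipients.

    `sesv2_client` is expected to behave like boto3.client("sesv2").
    """
    account = sesv2_client.get_account()
    # ProductionAccessEnabled is False while the account is in the sandbox.
    return not account["ProductionAccessEnabled"]


# Example usage (requires boto3 and AWS credentials; the region is a placeholder):
#   import boto3
#   client = boto3.client("sesv2", region_name="<YOUR_SES_REGION_HERE>")
#   print("sandbox" if in_sandbox(client) else "production")
```

The client is passed in as an argument so the check can be exercised without AWS access.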

Notification code

Create a hook file (note: this file can be found under /apps/soca/$SOCA_CONFIGURATION/cluster_hooks/job_notifications.py on your Scale-Out Computing on AWS cluster)

Edit the following section to match your SES settings

ses_sender_email = '<SES_SENDER_EMAIL_ADDRESS_HERE>'
ses_region = '<YOUR_SES_REGION_HERE>'
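To illustrate how these two settings might be used, here is a minimal sketch of sending a notification through SES with boto3. It is not the content of job_notifications.py: the function names, the subject and body wording, and the way the job details are passed in are all assumptions made for illustration.

```python
def build_ses_request(sender, recipient, job_id, job_name, event):
    """Build the keyword arguments for boto3's SES send_email call.

    `event` is a short word such as "started" or "completed" (hypothetical wording).
    """
    subject = f"[SOCA] Job {job_id} ({job_name}) {event}"
    body = f"Your job {job_id} ({job_name}) has {event}."
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {"Data": subject},
            "Body": {"Text": {"Data": body}},
        },
    }


def send_notification(request, region):
    """Send the email through SES; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so build_ses_request works without boto3

    client = boto3.client("ses", region_name=region)
    return client.send_email(**request)


# Example usage with the settings above:
#   request = build_ses_request(ses_sender_email, "mickael@soca.dev",
#                               "1.scheduler", "mytestjob", "started")
#   send_notification(request, ses_region)
```

Separating the request construction from the network call keeps the message format easy to inspect before anything is sent.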

Create the hooks

Once your script is created, configure your scheduler hooks by running the following commands:

user@host: qmgr -c "create hook notify_job_start event=runjob"
user@host: qmgr -c "create hook notify_job_complete event=execjob_end"
user@host: qmgr -c "import hook notify_job_start application/x-python default /apps/soca/$SOCA_CONFIGURATION/cluster_hooks/job_notifications.py"
user@host: qmgr -c "import hook notify_job_complete application/x-python default /apps/soca/$SOCA_CONFIGURATION/cluster_hooks/job_notifications.py"

Note: If you make any changes to the Python file, you must re-run the import hook commands.
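Because both hooks are imported from the same file, the re-import step can be scripted. The sketch below simply wraps the two import commands shown above; the helper names are hypothetical, and it assumes qmgr is on the PATH of the scheduler host.

```python
import subprocess

# Hook names created earlier with "qmgr -c 'create hook ...'".
HOOK_NAMES = ("notify_job_start", "notify_job_complete")


def build_import_commands(hook_file, hook_names=HOOK_NAMES):
    """Return one qmgr argument list per hook, importing the same Python file."""
    return [
        ["qmgr", "-c",
         f"import hook {name} application/x-python default {hook_file}"]
        for name in hook_names
    ]


def reimport_hooks(hook_file):
    """Run the import commands; raises CalledProcessError if qmgr fails."""
    for command in build_import_commands(hook_file):
        subprocess.run(command, check=True)
```

No shell is involved here, so `$SOCA_CONFIGURATION` is not expanded automatically: pass an already resolved path, for example by calling `os.path.expandvars` on it first.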

Test

Let's submit a test job which will last 5 minutes

qsub -N mytestjob -- /bin/sleep 300

Now let's verify that I received the alerts correctly.

Job start:

5 minutes later:

Add/Update email

Run the ldapsearch -x uid=<USER> command to verify that your user has a valid mail attribute and that this attribute points to the correct email address. The example below shows a user without a mail attribute.

user@host: ldapsearch -x uid=mickael
# mickael, People, soca.local
dn: uid=mickael,ou=People,dc=soca,dc=local
objectClass: top
objectClass: person
objectClass: posixAccount
objectClass: shadowAccount
objectClass: inetOrgPerson
objectClass: organizationalPerson
uid: mickael
uidNumber: 5001
gidNumber: 5001
cn: mickael
sn: mickael
loginShell: /bin/bash
homeDirectory: /data/home/mickael

To add or update an email address, create a new LDIF file (e.g. update_email.ldif) with the following content. To change an address that already exists, use "replace: mail" instead of "add: mail".

dn: uid=mickael,ou=People,dc=soca,dc=local
changetype: modify
add: mail
mail: mickael@soca.dev

Then run ldapadd as root:

user@host: ldapadd -x -D cn=admin,dc=soca,dc=local -y /root/OpenLdapAdminPassword.txt -f update_email.ldif
modifying entry "uid=mickael,ou=People,dc=soca,dc=local"
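If you need to do this for several users, the LDIF content can be generated instead of written by hand. The following is a minimal sketch: the function name and its parameters are hypothetical, and the default base DN is the one used in this example.

```python
def build_mail_ldif(uid, email, base_dn="ou=People,dc=soca,dc=local", replace=False):
    """Return LDIF text that sets the mail attribute for one user.

    replace=False emits "add: mail" (as in the example above);
    replace=True emits "replace: mail" to overwrite an existing address.
    """
    operation = "replace" if replace else "add"
    lines = [
        f"dn: uid={uid},{base_dn}",
        "changetype: modify",
        f"{operation}: mail",
        f"mail: {email}",
    ]
    return "\n".join(lines) + "\n"


# Example usage: write the file, then pass it to ldapadd with -f.
#   with open("update_email.ldif", "w") as handle:
#       handle.write(build_mail_ldif("mickael", "mickael@soca.dev"))
```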

Finally, re-run the ldapsearch command and validate that your user now has a mail attribute:

user@host: ldapsearch -x uid=mickael
dn: uid=mickael,ou=People,dc=soca,dc=local
objectClass: top
objectClass: person
objectClass: posixAccount
objectClass: shadowAccount
objectClass: inetOrgPerson
objectClass: organizationalPerson
uid: mickael
uidNumber: 5001
gidNumber: 5001
cn: mickael
sn: mickael
loginShell: /bin/bash
homeDirectory: /data/home/mickael
mail: mickael@soca.dev
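The same check can be done programmatically. The sketch below reads the text printed by ldapsearch and returns the mail value if there is one; the function name is hypothetical, and base64-encoded values (printed as "mail::") are deliberately skipped rather than decoded.

```python
def extract_mail(ldapsearch_output):
    """Return the first plain-text mail value found, or None if there is none."""
    for line in ldapsearch_output.splitlines():
        name, separator, value = line.partition(":")
        if not separator or name.strip().lower() != "mail":
            continue
        if value.startswith(":"):
            # "mail:: ..." means the value is base64-encoded; not handled here.
            continue
        return value.strip()
    return None


# Example usage (run where ldapsearch is available):
#   import subprocess
#   output = subprocess.run(["ldapsearch", "-x", "uid=mickael"],
#                           capture_output=True, text=True, check=True).stdout
#   print(extract_mail(output) or "no mail attribute")
```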

Check the logs

Scheduler hook logs are located in:

  • /var/spool/pbs/server_logs/ for notify_job_start on the Scheduler
  • /var/spool/pbs/mom_logs/ for notify_job_complete on the Execution Host(s)
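To find the relevant entries quickly, you can filter the most recent log file for a hook name. This is a minimal sketch with hypothetical helper names; it assumes the entries you are looking for mention the hook name, and that you run it as a user allowed to read the PBS log directories.

```python
from pathlib import Path


def latest_log(log_dir):
    """Return the most recently modified file in log_dir, or None if it is empty."""
    files = [path for path in Path(log_dir).iterdir() if path.is_file()]
    return max(files, key=lambda path: path.stat().st_mtime, default=None)


def hook_log_lines(log_file, hook_name):
    """Return the lines of log_file that mention hook_name."""
    with open(log_file, errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if hook_name in line]


# Example usage on the scheduler host:
#   log_file = latest_log("/var/spool/pbs/server_logs/")
#   if log_file is not None:
#       for line in hook_log_lines(log_file, "notify_job_start"):
#           print(line)
```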