Should I use backup and restore of my mid-range queue manager?

In several places the MQ Knowledge centre mentions backing up your queue manager, for example in case of problems when migrating.

I could not find an emoji showing a worried wizard, so let me explain my concerns so you can make an informed decision about using it.

Firstly some obvious statements

  • You take a backup so you can restore it at a later date to the same state
  • When you do a restore in-place you overwrite what was there before
  • The result of a restore should be the same as when you did the backup

See, really obvious – but you need to think through the consequences of these.

Creating duplicate messages

Imagine there is a message on the queue saying “transfer 1 million pounds to Colin Paice”. This gets backed up. The message gets processed, and I am rich!

You restore from the backup – and this message reappears – so unless the applications are smart and can detect a duplicate message – I will get even richer!
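As a rough illustration of what a "smart" application might do, here is a minimal sketch of duplicate detection by message ID. Everything in it is hypothetical: the message layout, the `process_once` function, and the use of a Python set in place of a durable store of processed IDs.

```python
# Toy model only: a plain Python set stands in for a durable store of
# processed message IDs. All names here are hypothetical.

def process_once(message, seen_ids, ledger):
    """Apply a payment unless its message ID has been processed before."""
    if message["id"] in seen_ids:
        return False  # duplicate, for example replayed by a restore
    seen_ids.add(message["id"])
    payee = message["payee"]
    ledger[payee] = ledger.get(payee, 0) + message["amount"]
    return True


seen_ids = set()
ledger = {}
payment = {"id": "MSG-0001", "payee": "Colin Paice", "amount": 1_000_000}

first = process_once(payment, seen_ids, ledger)   # original delivery
second = process_once(payment, seen_ids, ledger)  # same message after a restore
print(first, second, ledger["Colin Paice"])
```

Note that this only helps if the record of processed IDs lives somewhere that is not rolled back by the same restore.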

Losing messages

An application queue was empty when it was backed up. A message is put to the queue: “Colin Paice pays you 1 million pounds”. Before this message gets processed the system is restored – resetting the queue to when it was backed up – so the message disappears and you do not get your money.
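Both effects can be reproduced with a toy model. The sketch below is not how MQ stores data; `ToyQueueManager` and the queue names `PAY.OUT` and `PAY.IN` are made up for illustration, and the "backup" is just a deep copy of an in-memory dictionary.

```python
import copy


class ToyQueueManager:
    """In-memory stand-in for a queue manager; not how MQ stores data."""

    def __init__(self):
        self.queues = {}

    def put(self, queue, message):
        self.queues.setdefault(queue, []).append(message)

    def get(self, queue):
        return self.queues[queue].pop(0)

    def depth(self, queue):
        return len(self.queues.get(queue, []))

    def backup(self):
        return copy.deepcopy(self.queues)

    def restore(self, snapshot):
        # An in-place restore overwrites whatever is there now.
        self.queues = copy.deepcopy(snapshot)


qm = ToyQueueManager()
qm.put("PAY.OUT", "transfer 1 million pounds to Colin Paice")
snapshot = qm.backup()   # PAY.OUT holds one message, PAY.IN is empty

qm.get("PAY.OUT")        # the transfer is processed
qm.put("PAY.IN", "Colin Paice pays you 1 million pounds")  # arrives after the backup

qm.restore(snapshot)
print(qm.depth("PAY.OUT"))  # 1: the processed transfer is back
print(qm.depth("PAY.IN"))   # 0: the newer message is gone
```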

Status information gets out of step

The queue manager holds information in queues. For example each channel has information about the sequence number of the message flow. If this gets out of sync, then you have to manually resync them.

If you restore the SYSTEM.CHANNEL.SYNCQ from last week, it will have the values from last week. The partner queue managers have moved on since then, so the channels will fail to start because the sequence numbers do not match, and you need to use the RESET CHANNEL command.
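To see why the channel stops, here is a small sketch of two channel ends that each remember a sequence number. It is a simplification under my own assumptions: `ChannelEnd`, `send_batch` and `resync` are hypothetical names, and the real protocol is more involved than comparing two integers.

```python
class ChannelEnd:
    """One end of a hypothetical channel, remembering its sequence number."""

    def __init__(self, seq=0):
        self.seq = seq


def send_batch(sender, receiver, count):
    """Move `count` messages; both ends must agree on the sequence number first."""
    if sender.seq != receiver.seq:
        raise RuntimeError(
            f"sequence mismatch: sender {sender.seq}, receiver {receiver.seq}"
        )
    sender.seq += count
    receiver.seq += count


def resync(sender, receiver, seq=0):
    """Stand-in for the manual reset: put both ends back on an agreed value."""
    sender.seq = seq
    receiver.seq = seq


sender, receiver = ChannelEnd(), ChannelEnd()
send_batch(sender, receiver, 5)
backed_up_seq = sender.seq        # value captured in last week's backup
send_batch(sender, receiver, 7)   # a week of traffic; both ends move on

sender.seq = backed_up_seq        # restore the sender from the backup
try:
    send_batch(sender, receiver, 1)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

resync(sender, receiver)          # manual intervention
send_batch(sender, receiver, 1)   # traffic flows again
```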

If you really want to do backup and restore…

Before you back up..

  • If this is a full repository, “just” move it to another queue manager.
  • Stop receiver channels, so work stops flowing into the queue manager
  • Set all application input queues to put disabled, to stop applications from putting to the queues.
  • Let the applications drain all application queues (and send the replies back)
  • Make sure all queues, such as Dead Letter Queue, and Event Queues have been processed and the queues are empty.
  • Make sure all transmission queues are empty, including SYSTEM.CLUSTER.TRANSMIT.QUEUE and any Split Cluster Transmit queues.
  • Shut down the queue manager, letting all work end cleanly.
  • Back up the queue manager files
  • Make a record of every configuration change you make, such as alter qlocal.
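The "are all the queues empty?" checks above lend themselves to a small script. The sketch below assumes you have already collected queue depths into a dictionary by whatever means you normally use; the function `queues_blocking_backup`, the `ignore` parameter and the queue `APP.REQUEST` are hypothetical.

```python
def queues_blocking_backup(depths, ignore=()):
    """Return the names of queues that still hold messages, sorted by name."""
    return sorted(
        name for name, depth in depths.items()
        if depth > 0 and name not in ignore
    )


# Hypothetical depths, for example exported from your monitoring tool.
depths = {
    "APP.REQUEST": 0,
    "SYSTEM.DEAD.LETTER.QUEUE": 2,
    "SYSTEM.CLUSTER.TRANSMIT.QUEUE": 0,
    "SYSTEM.ADMIN.QMGR.EVENT": 1,
    "SYSTEM.CHANNEL.SYNCQ": 4,
}

# Queues that hold state rather than work can be excluded from the check.
blocking = queues_blocking_backup(depths, ignore=("SYSTEM.CHANNEL.SYNCQ",))
if blocking:
    print("Not ready to back up, still holding messages:", ", ".join(blocking))
else:
    print("All checked queues are empty")
```

The `ignore` list is there because some system queues, such as SYSTEM.CHANNEL.SYNCQ, hold status information rather than application work and are not expected to be empty.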

If you need to restore.. you need to empty the queue manager before you overwrite it.

  • Make sure there are no threads in doubt.
  • Make sure all transmission queues are empty
  • Have applications process all of the application messages, or offload the messages
  • Shut down MQ
  • Restore from your backup.
  • Any MCA channels which have been used may have the wrong sequence numbers, and will need to be reset
  • You may need to refresh the cluster, so that you get the latest definitions sent to the machine, and the information on local objects is propagated back to the full repository.
  • If you offloaded application messages, restore them
  • Reapply any changes, such as alter QL…
  • You need to be careful about applications connecting to your queue manager before it is ready to do work. You might want to use strmqm -ns to start in restricted mode.

It is dangerous if you restore the queue manager in a different place

You need to be careful if you restore a queue manager to a different place.

If you restore it and start it, then channels are likely to start and messages will flow. For example it will contact the full repository, and send information about the objects in the newly restored queue manager. The full repository will get confused, as you have two queue managers with the same name and the same unique name sending information. It is difficult to resolve this once it has occurred. People have been known to do this when testing their disaster recovery procedures.