Credit: IBM
IBM WebSphere Application Server Version 8 and later (V8.x) provides
support for asynchronous messaging based on the Java™
Message Service (JMS) version 1.1 specification. Using either the default
messaging provider or IBM WebSphere MQ, you can write message-driven beans
(MDBs) that listen on a destination (either a message queue or a topic).
When a message arrives at the destination, the MDB is invoked and its
onMessage() method called. If a “poison message” is delivered to an MDB,
the application can choose to reject it.
In this situation, what happens to the message and how does the application
server behave?
What is a poison message?
A poison message is simply a message that the receiving MDB application is
unable to process. It could be that the message has become corrupt, is in
an unexpected format, or contains information that cannot be handled by
the MDB’s business logic. For example, suppose you have an MDB that
processes book orders. If the MDB receives an order for a book that
doesn’t exist, the message could be considered a poison message.
If a poison message is delivered to an MDB, the bean can do one of three
things:
- Roll back the message to the destination that it came from. This can be
done if the MDB is running within a transaction, and ensures that the
message is not lost. Returning the message to its original destination
will give the MDB the chance to process the message again. This is useful
if the application was unable to handle the message due to a temporary
problem, such as a database being unavailable. To roll back the message,
the MDB should call the setRollbackOnly() method on the message-driven
context associated with the bean.
- Move the message to a different destination. This is particularly useful
when the MDB is not running inside a transaction, as it prevents the
poison message from being lost. A systems administrator can examine the
message at a later date to find out why it could not be processed, and
potentially move it back to the destination being monitored by the MDB so
that it can be reprocessed.
- Discard the message, by doing nothing. This means that the message is
gone forever.
It is the responsibility of the MDB application to determine if it has
received a poison message and how it should be handled. There is no way
for the JMS provider or the application server to determine if a message
is corrupt or cannot be processed.
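
As a rough sketch of these three options, the following plain-Java example mirrors the shape of the decision an MDB makes in onMessage(). To stay self-contained it uses small stand-in types rather than the real Java EE API: Context stands in for the message-driven context (only setRollbackOnly() comes from the text above), while OrderHandler, Action, the "BOOK:" message format, and the in-memory failedOrders list are all hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PoisonActionDemo {
    /** Stand-in for the message-driven context; a real MDB uses the container's context. */
    interface Context {
        void setRollbackOnly();
    }

    /** The three choices an MDB has for a poison message (hypothetical names). */
    enum Action { ROLL_BACK, MOVE, DISCARD }

    /** Hypothetical book-order handler showing where each choice would be made. */
    static class OrderHandler {
        final List<String> processed = new ArrayList<>();
        final List<String> failedOrders = new ArrayList<>(); // stands in for another destination

        private final Context context;
        private final Action poisonAction;

        OrderHandler(Context context, Action poisonAction) {
            this.context = context;
            this.poisonAction = poisonAction;
        }

        /** Plays the role of onMessage(): process the order or apply the poison policy. */
        void onMessage(String body) {
            if (isValidOrder(body)) {
                processed.add(body);
                return;
            }
            switch (poisonAction) {
                case ROLL_BACK:
                    // Only meaningful inside a transaction: the message goes back
                    // to the destination it came from.
                    context.setRollbackOnly();
                    break;
                case MOVE:
                    // Park the message elsewhere so an administrator can inspect it.
                    failedOrders.add(body);
                    break;
                case DISCARD:
                    // Do nothing: the message is consumed and lost.
                    break;
            }
        }

        /** Assumed validation rule: an order must look like "BOOK:<title>". */
        private boolean isValidOrder(String body) {
            return body != null && body.startsWith("BOOK:") && body.length() > 5;
        }
    }

    public static void main(String[] args) {
        boolean[] rolledBack = {false};
        OrderHandler handler = new OrderHandler(() -> rolledBack[0] = true, Action.ROLL_BACK);
        handler.onMessage("BOOK:Dune"); // processed normally
        handler.onMessage("garbage");   // poison message: marked for rollback
        System.out.println("processed=" + handler.processed + ", rolledBack=" + rolledBack[0]);
    }
}
```

In a real MDB the same branch would sit inside onMessage(javax.jms.Message), and the MOVE case would send the message to another JMS destination rather than to a list.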
Rolling back a poison message
An MDB running inside of a transaction can choose to roll back a message
that cannot be processed. What does the application server do in this
situation? The answer depends on what JMS provider is being used.
Using the default messaging provider
MDBs that are configured to use the default messaging provider monitor
queues or topic spaces hosted by a service integration bus. When messages
arrive on the queue or are published on the topic, they are delivered to
the MDB. The behaviour of the application server when an MDB rolls a
message back depends on the values of three properties:
- The JMS message property Redelivery count indicates the number of times
a JMS message has been delivered to an application. This property is
incremented if an MDB has rejected the message after delivery.
- The JMS destination property Maximum failed deliveries specifies the
number of times a message on a destination will be delivered to an MDB
before it is moved to the exception destination that has been defined for
this destination. The default value of the property is 5, which means that
if a message is rolled back five times, the application server will move
it to a different location. This property can be changed on the
destination’s configuration panel in the WebSphere Application Server
administrative console (Figure 1).
Figure 1. Maximum failed deliveries property for the TestDestination
- The JMS destination property Exception destination tells the application
server what to do with poison messages that have been rolled back the
number of times specified in the Maximum failed deliveries property. The
Exception destination property can have one of three values:
  - System: Route messages to the system exception destination
  _SYSTEM.Exception.Destination.<messaging engine name>.
  - None: Leave the message on the original destination.
  - Specify: Move the message to a user-specified exception destination.
The default value for this property is System, so any message whose
Redelivery count has reached Maximum failed deliveries will be moved to the
system exception destination defined for the messaging engine that hosts
the destination. Like the Maximum failed deliveries property, Exception
destination can be changed via the destination’s configuration panel in
the WebSphere administrative console (Figure 2).
Figure 2. Exception destination for the
TestDestination is set to “System”
If the Exception destination property is set to “None” then, by default,
the application server will look at the value of the messaging engine
property Default blocked destination retry interval to
determine how long to wait before redelivering a poison message to an MDB.
This property has the default value of 5000 milliseconds, which equates to
5 seconds.
Figure 3. Default blocked destination retry
interval
You can override this time period for individual JMS destinations by
setting two JMS destination properties:
- You can select Override messaging engine blocked retry timeout default.
- You can set Blocked retry timeout in milliseconds to a value of 0 or
greater. The default value of this property is -1. When you set the
property to this value and select Override messaging engine blocked retry
timeout default, the value of Default blocked destination retry interval
(the messaging engine property) is used to determine how long to wait
before redelivering the message to an MDB.
Figure 4. Blocked retry timeout in milliseconds for the TestDestination is
set to 10000 milliseconds (or 10 seconds)
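
The way the two levels of setting combine can be summarised in a few lines of Java. This is only a sketch of the rule as described above, not code taken from the product: the class and method names are made up, and the parameters simply mirror the three administrative console settings.

```java
class BlockedRetryTimeout {
    /** Messaging engine default quoted in the text: 5000 milliseconds. */
    static final long ENGINE_DEFAULT_MILLIS = 5000L;

    /**
     * Returns the delay before a blocked message is redelivered.
     *
     * @param engineDefaultMillis Default blocked destination retry interval
     * @param overrideSelected    whether the destination's override checkbox is selected
     * @param destinationMillis   Blocked retry timeout in milliseconds (-1 means "not set")
     */
    static long effectiveMillis(long engineDefaultMillis, boolean overrideSelected,
                                long destinationMillis) {
        if (overrideSelected && destinationMillis >= 0) {
            return destinationMillis; // destination-level value wins
        }
        return engineDefaultMillis;   // otherwise fall back to the messaging engine
    }

    public static void main(String[] args) {
        System.out.println(effectiveMillis(ENGINE_DEFAULT_MILLIS, true, 10000));
        System.out.println(effectiveMillis(ENGINE_DEFAULT_MILLIS, true, -1));
    }
}
```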
When a message is rolled back, its Redelivery count is incremented and
compared against the value of Maximum failed deliveries for the
destination from which the message originally came.
If the Redelivery count is less than the Maximum failed deliveries, the
message is returned to the destination so that it can be reprocessed.
If the Redelivery count is equal to or greater than Maximum failed
deliveries, the messaging engine will either move the message to the
Exception destination specified, or wait for the period of time specified
by sib.processor.blockedRetryTimeout before attempting to deliver it
again, if the Exception destination is set to None. This behaviour is
shown in Figure 5.
Figure 5. How the default messaging provider
handles poison messages
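
The decision described above can be sketched as a small Java simulation. It is an illustration of the documented rules only, not the messaging engine's implementation; the class, enum, and method names are hypothetical.

```java
class SibPoisonHandling {
    /** Possible values of the Exception destination property. */
    enum ExceptionDestination { SYSTEM, NONE, SPECIFY }

    /** What happens to a message after the MDB rolls it back. */
    enum Outcome { REDELIVER, MOVE_TO_EXCEPTION_DESTINATION, WAIT_THEN_RETRY }

    /** Applies the comparison made once the Redelivery count has been incremented. */
    static Outcome afterRollback(int redeliveryCount, int maxFailedDeliveries,
                                 ExceptionDestination exceptionDestination) {
        if (redeliveryCount < maxFailedDeliveries) {
            return Outcome.REDELIVER; // back to the original destination
        }
        if (exceptionDestination == ExceptionDestination.NONE) {
            return Outcome.WAIT_THEN_RETRY; // stays put; retried after the blocked retry timeout
        }
        return Outcome.MOVE_TO_EXCEPTION_DESTINATION;
    }

    /**
     * Counts how many rollbacks a poison message goes through before it is moved.
     * Returns -1 when the message is never moved (Exception destination is NONE).
     */
    static int rollbacksUntilMoved(int maxFailedDeliveries,
                                   ExceptionDestination exceptionDestination) {
        int redeliveryCount = 0;
        while (true) {
            redeliveryCount++; // the MDB rolls the message back
            Outcome outcome =
                    afterRollback(redeliveryCount, maxFailedDeliveries, exceptionDestination);
            if (outcome == Outcome.MOVE_TO_EXCEPTION_DESTINATION) {
                return redeliveryCount;
            }
            if (outcome == Outcome.WAIT_THEN_RETRY) {
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        // Defaults from the text: Maximum failed deliveries = 5, Exception destination = System.
        System.out.println(rollbacksUntilMoved(5, ExceptionDestination.SYSTEM));
    }
}
```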
The default behaviour
By default, the Maximum failed deliveries property has the value 5, and the
Exception destination is set to System. If these default values are used,
what happens when a poison message arrives on a destination and is then
delivered to an MDB?
- Since the MDB is unable to process the message, it rolls it back, which
causes the Redelivery count to be increased to 1. The messaging engine
returns the message to the destination because the Redelivery count is
less than the Maximum failed deliveries for the destination.
- The MDB receives the message again, but is still unable to process it,
so the MDB performs a rollback as before. The Redelivery count of the
message is now set to 2, which is still less than Maximum failed
deliveries for the destination, so the messaging engine puts the message
back where it originated.
- This pattern repeats until the message has been rolled back 5 times.
- Now, the value of the Redelivery count is the same as the destination’s
Maximum failed deliveries. Rather than return the message to its original
destination, the messaging engine moves the message to the destination’s
Exception destination, which is specified as
_SYSTEM.Exception.Destination.<messaging engine name>.
Using the WebSphere MQ messaging provider
MDBs that use the WebSphere MQ messaging provider can either use activation
specifications or listener ports to monitor queues or topics hosted by
WebSphere MQ. When a message is put onto a queue or is published on a
specific topic, the message is detected by the activation specification or
listener port and delivered to the MDB.
When an MDB rolls a message back, the behaviour of the application server
is different depending on whether the MDB was bound to an activation
specification or a listener port.
Activation specifications
If an MDB that was configured to use an activation specification rolls a
message back, the behaviour of the application server depends on five
properties:
- The first two:
  - Stop endpoint if message delivery fails and
  - Number of sequential delivery failures before suspending endpoint
  are activation specification advanced properties, and work together to
  determine if an activation specification should stop after a message
  has been rolled back.

  The Stop endpoint if message delivery fails property is a checkbox.
  When selected, the activation specification will keep a count of the
  number of rollbacks that have been performed by the MDBs that are using
  it. When a message is rolled back, the rollback counter increases by
  one. When a message is successfully processed by an MDB, the rollback
  counter is reset to zero.

  If the rollback counter reaches the value specified by the Number of
  sequential delivery failures before suspending endpoint property, then
  the activation specification stops.

  By default, the Stop endpoint if message delivery fails checkbox is
  selected and the Number of sequential delivery failures before
  suspending endpoint property has the value 0. This means that, as soon
  as an MDB rolls a message back, the activation specification will stop.
  These properties can be changed on the Advanced properties panel for an
  activation specification in the WebSphere administrative console (see
  Figure 6).

  Figure 6. The Stop endpoint if message delivery fails and Number of
  sequential delivery failures before suspending endpoint properties for
  TestActivationSpec
- The JMS message property Redelivery count indicates the number of times
a JMS message has been delivered to an application. This property is
incremented if an MDB has rejected the message after delivery.
- The WebSphere MQ queue property Backout threshold (BOTHRESH) specifies
the maximum number of times a message can be put onto a queue before it is
moved onto a different location. The default value for this property is 0,
which means that an activation specification will never attempt to
re-queue messages that have been rolled back by an MDB. The value of
Backout threshold can be set using either the WebSphere MQ command line
utility runmqsc, or the Queue Properties panel in the WebSphere MQ
Explorer (Figure 7).

Figure 7. Backout threshold property for test queue
- The WebSphere MQ queue property Backout requeue queue (BOQNAME) is the
queue that a message is moved to when the message has been rolled back
onto a queue the number of times specified in the Backout threshold
property. Backout requeue queue has no default value, which means that the
application server will move any messages that have exceeded the backout
threshold to the SYSTEM.DEAD.LETTER.QUEUE. The Backout requeue queue
property can be set using either the WebSphere MQ utility runmqsc, or the
WebSphere MQ Explorer. The Queue Properties panel is shown in Figure 8.

Figure 8. Backout requeue queue property for the test queue is set to
SYSTEM.DEAD.LETTER.QUEUE

When an activation specification detects a message on a JMS destination,
the first thing it does is to compare the value of the message’s
Redelivery count to the value of the queue’s Backout threshold. If the
Redelivery count is less than the Backout threshold, the message is
delivered to the MDB for processing. However, if Redelivery count is equal
to the Backout threshold, the WebSphere MQ messaging provider moves the
message onto the Backout requeue queue. If no Backout requeue queue has
been defined, the message is moved to the SYSTEM.DEAD.LETTER.QUEUE.

If the message is delivered to the MDB and is then rolled back, the
activation specification puts the message back onto the JMS destination
that it came from and increments the value of Redelivery count.

The activation specification then checks if the Stop endpoint if message
delivery fails checkbox is selected. If it is, then the activation
specification increments its internal rollback counter and compares the
value of the counter to the value of the Number of sequential delivery
failures before suspending endpoint property. If the two values are equal,
then the activation specification is stopped.

This behaviour is shown in Figure 9.

Figure 9. How the application server handles poison messages when using
the WebSphere MQ JMS Provider
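
To make the interplay of these properties easier to follow, here is a minimal Java simulation of a single poison message that the MDB always rolls back. It is a sketch of the rules as described in this section, not the WebSphere MQ messaging provider's code. All names are hypothetical, a Backout threshold of 0 is treated as "never requeue", and a maxDeliveries guard is added purely so that the simulation terminates.

```java
class ActivationSpecSimulation {
    enum Outcome {
        MOVED_TO_BACKOUT_QUEUE, MOVED_TO_DEAD_LETTER_QUEUE, ENDPOINT_STOPPED, STILL_ON_QUEUE
    }

    /** The four configurable properties discussed above. */
    static final class Config {
        final int backoutThreshold;       // BOTHRESH; 0 means messages are never requeued
        final String backoutRequeueQueue; // BOQNAME; null when not defined
        final boolean stopEndpointIfDeliveryFails;
        final int failuresBeforeSuspend;

        Config(int backoutThreshold, String backoutRequeueQueue,
               boolean stopEndpointIfDeliveryFails, int failuresBeforeSuspend) {
            this.backoutThreshold = backoutThreshold;
            this.backoutRequeueQueue = backoutRequeueQueue;
            this.stopEndpointIfDeliveryFails = stopEndpointIfDeliveryFails;
            this.failuresBeforeSuspend = failuresBeforeSuspend;
        }
    }

    /** Follows one poison message until it is backed out or the endpoint stops. */
    static Outcome deliverPoisonMessage(Config config, int maxDeliveries) {
        int redeliveryCount = 0;
        int rollbackCounter = 0;
        for (int delivery = 0; delivery < maxDeliveries; delivery++) {
            // Check made before each delivery: has the backout threshold been reached?
            if (config.backoutThreshold > 0 && redeliveryCount >= config.backoutThreshold) {
                return config.backoutRequeueQueue != null
                        ? Outcome.MOVED_TO_BACKOUT_QUEUE
                        : Outcome.MOVED_TO_DEAD_LETTER_QUEUE;
            }
            // The MDB receives the message and rolls it back.
            redeliveryCount++;
            if (config.stopEndpointIfDeliveryFails) {
                rollbackCounter++;
                // ">=" covers both the "equal" case and the default limit of 0.
                if (rollbackCounter >= config.failuresBeforeSuspend) {
                    return Outcome.ENDPOINT_STOPPED;
                }
            }
        }
        return Outcome.STILL_ON_QUEUE; // guard reached: the message keeps being redelivered
    }

    public static void main(String[] args) {
        Config defaults = new Config(0, null, true, 0);
        Config tuned = new Config(1, "SYSTEM.DEAD.LETTER.QUEUE", true, 5);
        System.out.println("defaults: " + deliverPoisonMessage(defaults, 100));
        System.out.println("tuned:    " + deliverPoisonMessage(tuned, 100));
    }
}
```

With the default values the simulated endpoint stops after the first rollback, whereas a Backout threshold that is lower than the sequential failure limit lets the message reach the backout queue first.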
The default behaviour
By default, the Stop endpoint if message delivery fails checkbox is
selected, the Number of sequential delivery failures before suspending
endpoint property has a value of 0, the Backout threshold property has its
default value of 0, and the Backout requeue queue property has no value.
So, what is the default behaviour when a poison message is delivered to an
MDB that is using an activation specification to monitor a JMS destination
being hosted by WebSphere MQ?
The MDB rolls back the message, which means that the Redelivery count of
the message will be increased to 1. The activation specification now
checks the Stop endpoint if message delivery fails
checkbox and finds it has been selected, so it increments the rollback
counter.
The activation specification then checks the value of the Number of
sequential delivery failures before suspending endpoint
property, and compares this to the value of the rollback counter. As the
rollback counter is greater than the Number of sequential delivery
failures before suspending endpoint, the activation
specification is stopped.
When the activation specification is restarted, it will detect the poison
message again and compare the Redelivery count of the message with the
value of the queue’s Backout threshold property. This property has its
default value of 0, which means that messages are never requeued, so the
message is delivered to the MDB. If the MDB is still unable to process it,
the message will be rolled back onto the queue, and its Redelivery count
will be incremented to 2. Once again, the activation specification checks
the Stop endpoint if message delivery fails checkbox, finds it is
selected, and increments the rollback counter. The rollback counter now
has a value of 2, which is greater than the value of Number of sequential
delivery failures before suspending endpoint, so the activation
specification is stopped again.
The next time the activation specification is restarted, it will detect the
message, and the whole cycle will be repeated.
As a result of this behaviour, it is possible to end up in a situation
where a poison message blocks the processing of other messages on the
queue.
When a message is rolled back, it is returned to its original position on
the queue. Activation specifications always start processing messages from
the top of the queue, so if the very first message on the queue is a
poison message, the activation specification will detect it, and deliver
it to the MDB. As the message cannot be processed, the MDB will roll it
back, which will cause it to go back to the top of the queue again. The
activation specification will then be shut down. When it restarts, the
activation specification will detect the poison message again, and
redeliver it to the MDB. The MDB will roll it back, again causing the
activation specification to stop.
Change the default behaviour
As you can see, the default behaviour will continue until the poison
message is deleted from the queue by a systems administrator.
To prevent this from happening, you need to:
- Ensure that the queue being monitored by the activation specification
has both a Backout threshold and a Backout requeue queue defined.
- Either:
  - Unselect the Stop endpoint if message delivery fails checkbox, or
  - Leave the Stop endpoint if message delivery fails checkbox selected,
  and set the Number of sequential delivery failures before suspending
  endpoint property to a value greater than the Backout threshold.
For example, suppose that you have an activation specification defined
called TestActivationSpecification, which is monitoring the WebSphere MQ
queue test for messages. The activation specification has the Stop
endpoint if message delivery fails checkbox selected, and the
Number of sequential delivery failures before suspending
endpoint property set to the value 5. The queue test has a
Backout threshold of 1 and a Backout requeue queue of
SYSTEM.DEAD.LETTER.QUEUE.
A message arrives on the queue test, is detected by the activation
specification, and delivered to your MDB. Now, suppose that your MDB is
unable to process this message and rolls it back. The Redelivery count of
the message is now set to 1.
The WebSphere MQ messaging provider checks the Stop endpoint if message
delivery fails checkbox for the activation specification and finds it is
selected, so it increments the rollback counter.
It then compares the value of the rollback counter to the value of the
Number of sequential delivery failures before suspending endpoint
property, which has a value of 5. This is greater than the value of the
rollback counter, so the activation specification continues running.
The next time the activation specification detects the message, the
WebSphere MQ messaging provider checks the message’s Redelivery count and
finds it has a value of 1. It now looks at the Backout threshold for the
queue test, which also has a value of 1. Therefore, the WebSphere MQ
messaging provider decides to back the message out.
The WebSphere MQ messaging provider queries the queue’s Backout requeue
queue property. This is set to SYSTEM.DEAD.LETTER.QUEUE, so the WebSphere
MQ messaging provider removes the message from the test queue, and puts it
onto this one.
The activation specification then goes back to monitoring the queue test
for more messages to arrive.
How the Maximum server sessions property affects poison messages
The activation specification advanced property Maximum server sessions
defines the maximum number of messages that can be processed concurrently.
If this property has a value of 10, and there are 10 messages on the
destination being monitored by the activation specification, then all 10
messages will be processed at the same time, each by an internal server
session associated with the activation specification.
It is important to note that if the Stop endpoint if message
delivery fails checkbox is selected for an activation
specification, and the Number of sequential delivery failures
before suspending endpoint property is set to a value greater
than zero, then the rollback counter maintained by the activation
specification applies across all server sessions.
This means that if different poison messages are rolled back by different
server sessions at the same time, then the activation specification might
stop without trying to move the poison messages to the backout queue.
In addition to this, as soon as a message detected by an activation
specification is successfully processed by a server session, the rollback
counter for the activation specification is reset to zero.
For example, suppose our activation specification
TestActivationSpecification has:
- Stop endpoint if message delivery fails selected.
- Number of sequential delivery failures before suspending endpoint
property set to 3.
- Maximum server sessions property set to 5.
The activation specification is configured to monitor the queue called
test, which has the Backout threshold property set to 5 and the Backout
requeue queue property set to SYSTEM.DEAD.LETTER.QUEUE.
When the activation specification starts up, there are ten poison messages
on the queue. What happens? Good question.
The activation specification detects the first five messages, and delivers
them to five server sessions for processing.
The first server session hands the first poison message to an MDB. The MDB
tries to process it, finds it is unable to do so and rolls it back onto
the queue. The redelivery count of the message is now set to 1, and the
rollback counter associated with the activation specification is set to
1.
In parallel, the second server session gives the second poison message to
another instance of the same MDB. This MDB instance tries to process the
message, is unable to do so, and so rolls it back onto the queue. The
redelivery count of this message is now set to 1. The internal rollback
counter for the activation specification is set to 2.
While all this processing is going on, the third server session passes the
third poison message to another MDB instance. This MDB instance also rolls
the message back, as it is unable to process it. The redelivery count of
the third poison message is set to 1, and more importantly the rollback
counter for the activation specification is set to 3.
At this point, the WebSphere MQ messaging provider detects that the
rollback counter is equal to the value of the Number of sequential
delivery failures before suspending endpoint for the activation
specification. As a result of this, the WebSphere MQ messaging provider
stops the activation specification.
Poison messages 4 and 5 will still be processed by server sessions four and
five respectively, as the messages were given to the server sessions
before the activation specification was stopped.
It is also possible that poison messages 6 and 7 will be processed before
the activation specification is stopped. This is because they will be
delivered to the first and second server sessions once those server
sessions have rolled back poison messages 1 and 2.
In order to change this behaviour, and ensure that the activation
specification always tries to move poison messages to the specified
backout queue rather than stopping, ensure that the Stop endpoint
if message delivery fails checkbox is unselected.
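
The effect of one rollback counter being shared by every server session can be sketched in Java as follows. This is a deliberately simplified, single-threaded model with hypothetical names: each poison message is rolled back in turn, as if by a different server session, and backout processing is ignored so that only the shared counter is visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedRollbackCounterDemo {
    /** One counter per activation specification, shared by all server sessions. */
    static final class RollbackCounter {
        private final AtomicInteger sequentialFailures = new AtomicInteger();
        private final int failuresBeforeSuspend;

        RollbackCounter(int failuresBeforeSuspend) {
            this.failuresBeforeSuspend = failuresBeforeSuspend;
        }

        /** Records a rollback; returns true when the endpoint should stop. */
        boolean onRollback() {
            return sequentialFailures.incrementAndGet() >= failuresBeforeSuspend;
        }

        /** Any successfully processed message resets the counter. */
        void onSuccess() {
            sequentialFailures.set(0);
        }

        int value() {
            return sequentialFailures.get();
        }
    }

    /**
     * Rolls back different poison messages in turn and returns the highest
     * Redelivery count any single message reached when the endpoint stopped.
     */
    static int highestRedeliveryCountAtStop(int poisonMessages, int failuresBeforeSuspend) {
        if (poisonMessages <= 0) {
            throw new IllegalArgumentException("need at least one message");
        }
        RollbackCounter counter = new RollbackCounter(failuresBeforeSuspend);
        int[] redeliveryCounts = new int[poisonMessages];
        int highest = 0;
        for (int i = 0; ; i = (i + 1) % poisonMessages) {
            redeliveryCounts[i]++; // this message was delivered and rolled back
            highest = Math.max(highest, redeliveryCounts[i]);
            if (counter.onRollback()) {
                return highest;
            }
        }
    }

    public static void main(String[] args) {
        // Ten poison messages, endpoint suspends after 3 sequential failures.
        System.out.println(highestRedeliveryCountAtStop(10, 3));
    }
}
```

In this model the endpoint stops while every message still has a Redelivery count of 1, far below the Backout threshold of 5 used in the example, which is why none of the messages is moved to the backout queue.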
Listener ports
Listener ports have been available since WebSphere Application Server V5,
and provide an alternative mechanism for MDBs to monitor JMS destinations
for messages. The behaviour of listener ports has not changed since
WebSphere Application Server V6.1, and has been stabilised since WebSphere
Application Server V7, which means that no new functionality has been
added to them for a while.
This means that the information on listener ports in the developerWorks
article How WebSphere Application Server V6 handles poison messages is
still valid for WebSphere Application Server V8 and later.
WebSphere MQ security considerations
One question that comes up quite a lot is:
What WebSphere MQ authorizations does my WebSphere Application Server
system need in order to back out messages?
In order for WebSphere Application Server to back out messages, the user ID
under which the application server is running needs to have the following
permissions on the backout requeue queue:
- Get
- Inquire
- Pass All Context
- Put
- Set All Context
If the application server does not have these permissions, the application
server will move the message to the dead letter queue that has been
defined for the queue manager.
Using the WebSphere MQ provider on z/OS
The way poison messages are handled when using the WebSphere MQ JMS
provider on z/OS® is the same as described above, with one
minor difference.
WebSphere MQ on z/OS uses in-memory copies of messages. The first time a
message is detected by a listener port, WebSphere MQ stores a copy of it
in memory before passing it to the application server. If the message is
processed successfully, the in-memory copy is deleted and the actual
message is deleted from storage.
In the situation where the MDB rolls back the message, WebSphere MQ will
increment the Redelivery count of the in-memory copy. The application
server then looks at the value of this property on the copy to determine
whether the actual message should be moved to the Backout requeue queue
and whether the listener port should be stopped. If the value of the
in-memory copy’s Redelivery count is more than the Backout threshold, the
actual message is moved to the Backout requeue queue, and the in-memory
copy deleted.
This has implications if the WebSphere MQ system is stopped before a
message has been backed out.
Suppose you have an activation specification defined, called
TestActivationSpec2. The activation specification is configured to monitor
the WebSphere MQ queue test2 for messages, and to stop after 10 sequential
delivery failures. The queue is hosted by a WebSphere MQ queue manager
running on z/OS, and has the Backout threshold property set to 5.
A message arrives on the queue and is detected by
TestActivationSpec2. A copy of the message is made, and is
delivered to an MDB. However, the message cannot be processed, so the MDB
rolls it back. The Redelivery count of the in-memory copy of the message
is incremented, and now has the value 1.
The activation specification detects the message again, and once again
delivers it to the MDB, where it is rolled back for a second time. The
in-memory copy of the message now has a Redelivery count of 2.
Suppose that the z/OS queue manager is shut down at this point. Since the
application server has only been working on an in-memory copy of the
message, the Redelivery count of the actual message stored on the queue is
still set to 0. When the queue manager is restarted, the activation
specification will detect the message again. A new in-memory copy of the
message is made, which has a Redelivery count of 0. When the message is
rolled back by the MDB, the value of the copy’s Redelivery count will be
incremented to 1.
This behaviour continues until the MDB has rolled back the message 5 times.
At this point, the application server determines that the Redelivery count
of the in-memory copy of the message is equal to the queue’s Backout
threshold, and moves the actual message to the queue specified by the
Backout requeue queue property.
In this scenario, the poison message has actually been processed seven
times before it was backed out, which is more than the Backout threshold
that has been defined for the queue.
This behaviour is not ideal.
To prevent this from happening, WebSphere MQ on z/OS needs to be configured
to update the Redelivery count of the actual message in storage as well as
the in-memory copy when a rollback occurs. This means that the Redelivery
count value will be persisted, and therefore can survive a queue manager
restart. To do this, the HardenGetBackout property (the HARDENBO attribute
in MQSC) needs to be enabled for the queue being monitored by the MDB.
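
The arithmetic behind this scenario can be checked with a short Java sketch. It is only a model of the behaviour described above, with hypothetical names. It assumes a single queue manager restart, and that the restart replaces the in-memory Redelivery count with whatever value was last persisted with the message.

```java
class BackoutCountHardening {
    /**
     * Counts how many times a poison message is delivered before it is backed out.
     *
     * @param backoutThreshold      the queue's Backout threshold
     * @param restartAfterRollbacks the queue manager restarts once, after this many rollbacks
     * @param hardened              whether the count is persisted on every rollback
     */
    static int deliveriesBeforeBackout(int backoutThreshold, int restartAfterRollbacks,
                                       boolean hardened) {
        int persistedCount = 0; // Redelivery count stored with the actual message
        int inMemoryCount = 0;  // Redelivery count of the in-memory copy
        int deliveries = 0;
        boolean restarted = false;
        while (inMemoryCount < backoutThreshold) {
            deliveries++;       // delivered to the MDB and rolled back
            inMemoryCount++;
            if (hardened) {
                persistedCount = inMemoryCount;
            }
            if (!restarted && deliveries == restartAfterRollbacks) {
                restarted = true;
                inMemoryCount = persistedCount; // a fresh copy is built from storage
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        System.out.println("not hardened: " + deliveriesBeforeBackout(5, 2, false));
        System.out.println("hardened:     " + deliveriesBeforeBackout(5, 2, true));
    }
}
```

With a Backout threshold of 5 and a restart after two rollbacks, the model gives seven deliveries when the count is not hardened and five when it is.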
Conclusion
This article described what a poison message is, and explained what JMS
applications can do when they encounter them. You also learned how the
default messaging provider and the WebSphere MQ JMS provider handle
situations where an MDB rolls back a poison message, how the default
behaviour can be changed, and how some listener port properties affect the
behaviour of the application server when using the WebSphere MQ JMS
provider.