Fairness in RabbitMQ and SQS
Fairness sounds like a portable queue concept. In practice, RabbitMQ fairness and Amazon SQS fair queues solve two different fairness problems.
- RabbitMQ is trying to avoid giving more work to a busy worker.
- SQS fair queues are trying to avoid letting one tenant increase dwell time for everyone else.
That difference sounds small until you look at where each system measures fairness. RabbitMQ measures it at the consumer delivery boundary. SQS measures it at the tenant or message-group scheduling boundary. Same word, different layer.
The Confusion
Imagine a shared queue with four workers and three tenants.
tenant-a sends 10,000 messages
tenant-b sends 20 messages
tenant-c sends 20 messages
That setup contains two different fairness problems: worker capacity and tenant latency. Comparing RabbitMQ fair dispatch and SQS fair queues as equivalent features is misleading because they operate on different entities:
| System | Fairness target | What it tries to prevent |
|---|---|---|
| RabbitMQ fairness / fair dispatch | Consumers / workers | A slow worker accumulating too many unacknowledged messages |
| Amazon SQS fair queues | Tenants / message groups | One tenant increasing dwell time for other tenants |
The implementation follows from that target.
Push vs Pull Is the Mental Model
The easiest way to get confused here is to bring an SQS mental model to RabbitMQ.
In SQS, consumers poll:
consumer -> ReceiveMessage -> SQS returns messages
SQS does not track a set of connected consumers and push the next message to one of them. A worker asks for messages, SQS chooses which messages to return, and those messages stay in flight until they are deleted or their visibility timeout expires.
RabbitMQ is different. Consumers subscribe to a queue:
consumer -> basic.consume(queue)
After that, RabbitMQ has a live consumer registered on the queue. If there are three consumers subscribed, the broker can push deliveries to those consumers as messages become available.
In the simple case, that looks like round-robin delivery:
message-1 -> consumer A
message-2 -> consumer B
message-3 -> consumer C
message-4 -> consumer A
That is why RabbitMQ can make fairness a consumer-level problem. The broker knows which consumers are subscribed and can track how much unacknowledged work each one is holding.
SQS fair queues do not work that way. SQS has consumer-side settings like batch size, long polling, visibility timeout, and Lambda event source concurrency, but those are not the fairness model. Fair queues are about which message group should be returned to a polling consumer, not which consumer should receive the next pushed delivery.
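To make the two mental models concrete, here is a minimal in-memory sketch. Neither class is real RabbitMQ or SQS code: `PushBroker` and `PullQueue` are hypothetical names, and the pull side leaves out visibility timeouts and in-flight tracking entirely. The only point is who initiates delivery.

```python
from collections import deque


class PushBroker:
    """Toy push model: consumers register once, the broker picks the recipient."""

    def __init__(self):
        self.consumers = []
        self.turn = 0

    def consume(self, callback):
        self.consumers.append(callback)  # loosely analogous to basic.consume

    def publish(self, message):
        callback = self.consumers[self.turn % len(self.consumers)]
        self.turn += 1
        callback(message)  # broker-initiated delivery, in rotation


class PullQueue:
    """Toy pull model: the queue never initiates delivery; callers ask for messages."""

    def __init__(self):
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive(self, max_messages=1):
        count = min(max_messages, len(self.messages))
        return [self.messages.popleft() for _ in range(count)]


received = {"A": [], "B": []}
broker = PushBroker()
broker.consume(received["A"].append)
broker.consume(received["B"].append)
for m in ["m1", "m2", "m3"]:
    broker.publish(m)
print(received)  # {'A': ['m1', 'm3'], 'B': ['m2']}

queue = PullQueue()
for m in ["m1", "m2", "m3"]:
    queue.send(m)
print(queue.receive(max_messages=2))  # ['m1', 'm2']
```

In the push model the broker has a list of consumers it can reason about. In the pull model the queue only ever sees an anonymous request for "up to N messages".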
RabbitMQ Fairness: Fair Dispatch
RabbitMQ does not have an SQS-style “fair queue” feature. The common RabbitMQ term is fair dispatch, and it is not a separate queue type. It is the result of two normal AMQP/RabbitMQ mechanisms used together:
- manual acknowledgements
- consumer prefetch
By default, a RabbitMQ queue distributes messages to active consumers in round-robin order. If there are two workers, RabbitMQ can hand odd-numbered messages to one worker and even-numbered messages to the other.
That works if every message takes roughly the same amount of time.
It breaks down when message cost is uneven:
message-1 takes 30 seconds
message-2 takes 1 second
message-3 takes 30 seconds
message-4 takes 1 second
With naive round-robin dispatch, one worker can get stuck on the expensive jobs while the other worker keeps finishing cheap jobs quickly. RabbitMQ is not looking into the message body or estimating processing cost. It only sees deliveries and acknowledgements.
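The imbalance is easy to see with a few lines of arithmetic. This is a toy model of strict rotation, not RabbitMQ code, and `round_robin_load` is a name made up for the illustration.

```python
def round_robin_load(costs, n_workers):
    """Assign messages to workers in strict rotation and total each worker's work."""
    load = [0] * n_workers
    for i, cost in enumerate(costs):
        load[i % n_workers] += cost  # message i goes to worker i mod n_workers
    return load


costs = [30, 1, 30, 1]  # seconds per message, from the example above
print(round_robin_load(costs, 2))  # [60, 2]
```

One worker ends up with 60 seconds of work while the other has 2.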
The usual fix is:
channel.basic_qos(prefetch_count=1)
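Prefetch is a cap on outstanding work. With prefetch_count=1 and manual acknowledgements, RabbitMQ will not send a consumer another message until the previous one has been acknowledged. With automatic acknowledgements the limit has no effect, because every delivery counts as acknowledged the moment it is sent.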
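Here is a rough simulation of what prefetch_count=1 does to the same four messages. It assumes all messages are already queued at time zero and that delivery and acknowledgement are instantaneous; `fair_dispatch_finish_times` is a hypothetical helper, not part of any client library.

```python
import heapq


def fair_dispatch_finish_times(costs, n_workers):
    """Simulate prefetch_count=1: each message goes to the first worker to become free."""
    # Min-heap of (time the worker becomes free, worker id); all workers start idle.
    free_at = [(0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    busy_until = [0] * n_workers
    for cost in costs:
        t, w = heapq.heappop(free_at)  # next worker to ack and become eligible
        busy_until[w] = t + cost
        heapq.heappush(free_at, (t + cost, w))
    return busy_until


print(fair_dispatch_finish_times([30, 1, 30, 1], 2))  # [31, 31]
```

Both workers finish at 31 seconds, compared with 60 seconds for the unlucky worker under strict rotation.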
How RabbitMQ Decides a Worker Is Busy
RabbitMQ does not know true worker capacity. It does not know whether a worker is using CPU, waiting on I/O, blocked on a database, or stuck because of a bug. It only knows whether a delivered message has been acknowledged.
So the fairness signal is:
busy = this consumer has reached its unacknowledged message limit
That is the whole mechanism behind RabbitMQ fair dispatch: keep track of unacknowledged messages per consumer, and avoid sending more work to consumers already at their prefetch limit.
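The bookkeeping can be sketched in a few lines. This is a toy model under my own assumptions, not RabbitMQ's implementation: `Dispatcher` is a made-up class, and it picks the first eligible consumer rather than rotating among them.

```python
class Dispatcher:
    """Toy model of per-consumer unacked tracking (not RabbitMQ's actual code)."""

    def __init__(self, consumers, prefetch):
        self.prefetch = prefetch
        self.unacked = {c: 0 for c in consumers}

    def eligible(self):
        # A consumer is "busy" once it holds `prefetch` unacknowledged deliveries.
        return [c for c, n in self.unacked.items() if n < self.prefetch]

    def deliver(self):
        """Deliver one message to an eligible consumer, or None if all are busy."""
        candidates = self.eligible()
        if not candidates:
            return None  # the message stays in the queue
        consumer = candidates[0]
        self.unacked[consumer] += 1
        return consumer

    def ack(self, consumer):
        self.unacked[consumer] -= 1


d = Dispatcher(["A", "B", "C"], prefetch=1)
print(d.deliver(), d.deliver(), d.deliver(), d.deliver())  # A B C None
d.ack("B")
print(d.deliver())  # B
```

Nothing in this model inspects the message or measures the worker. The acknowledgement is the only signal.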
The Throughput Tradeoff
prefetch_count=1 gives RabbitMQ the freshest view of which workers are available, but it can reduce throughput because workers receive less buffered work.
In practice, prefetch is a tuning knob:
lower prefetch -> better worker fairness, less local buffering
higher prefetch -> better throughput, more chance that slow workers hold work
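Both sides of that knob show up in a small simulation. The model below is a sketch under stated assumptions: `makespan`, `service_times`, and `rtt` are hypothetical names, the broker delivers in rotation to every consumer below its prefetch limit, each delivery takes `rtt` time units to reach the worker, and acknowledgements arrive instantly. Real brokers and networks are messier.

```python
import heapq


def makespan(n_messages, service_times, rtt, prefetch):
    """Time to drain n_messages in a toy broker model.

    service_times[w] is how long worker w needs per message; rtt is the delay
    between the broker deciding to deliver and the message reaching the worker.
    """
    n_workers = len(service_times)
    unacked = [0] * n_workers
    free_at = [0] * n_workers  # when each worker finishes everything it holds
    acks = []                  # min-heap of (ack time, worker)
    remaining = n_messages
    now = 0

    while True:
        # Deliver in rotation to every consumer below its prefetch limit.
        delivered = True
        while remaining and delivered:
            delivered = False
            for w in range(n_workers):
                if remaining and unacked[w] < prefetch:
                    start = max(now + rtt, free_at[w])
                    free_at[w] = start + service_times[w]
                    heapq.heappush(acks, (free_at[w], w))
                    unacked[w] += 1
                    remaining -= 1
                    delivered = True
        if not acks:
            return now
        now, w = heapq.heappop(acks)  # advance to the next acknowledgement
        unacked[w] -= 1


# Equal workers: buffering one extra message hides the delivery delay.
print(makespan(8, [10, 10], rtt=5, prefetch=1))    # 60
print(makespan(8, [10, 10], rtt=5, prefetch=2))    # 45
# One slow worker: a deep prefetch lets it hold half the queue.
print(makespan(12, [100, 10], rtt=5, prefetch=1))  # 210
print(makespan(12, [100, 10], rtt=5, prefetch=6))  # 605
```

With evenly matched workers, a slightly higher prefetch wins because workers never sit idle waiting for the next delivery. With one slow worker, a high prefetch loses badly because the slow worker holds messages the fast worker could have finished.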
That tradeoff reinforces the main point: RabbitMQ fairness is worker flow control, not tenant isolation.
If tenant A publishes 10,000 messages before tenant B publishes 10, RabbitMQ’s fair dispatch pattern does not automatically give tenant B its own share of delivery opportunities. You would need to model that yourself with separate queues, exchanges, priorities, routing keys, consumer pools, or application-level scheduling.
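As one illustration of the application-level option, here is a sketch that splits a backlog into per-tenant queues and drains them in rotation. Everything lives in memory and `interleave_by_tenant` is a hypothetical helper. In a real RabbitMQ deployment the per-tenant queues would more likely be separate broker queues, which this sketch does not attempt to model.

```python
from collections import deque


def interleave_by_tenant(messages):
    """Yield messages by rotating across per-tenant queues.

    `messages` is an iterable of (tenant, payload) pairs in arrival order.
    """
    queues = {}
    for tenant, payload in messages:
        queues.setdefault(tenant, deque()).append(payload)
    while queues:
        for tenant in list(queues):
            yield tenant, queues[tenant].popleft()
            if not queues[tenant]:
                del queues[tenant]  # tenant has no more work waiting


backlog = [("tenant-a", i) for i in range(5)] + [("tenant-b", 0), ("tenant-c", 0)]
print([tenant for tenant, _ in interleave_by_tenant(backlog)])
# ['tenant-a', 'tenant-b', 'tenant-c', 'tenant-a', 'tenant-a', 'tenant-a', 'tenant-a']
```

Tenant B and tenant C are served on the first rotation even though tenant A's messages arrived first. That scheduling decision is yours to build; fair dispatch does not provide it.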
SQS Fair Queues
Amazon SQS fair queues are almost the inverse.
The consumer code does not set a prefetch count. The producer adds a tenant identifier to each message:
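import com.amazonaws.services.sqs.model.SendMessageRequest;
// MessageGroupId carries the tenant identifier that fair queues group by.
SendMessageRequest request = new SendMessageRequest()
    .withQueueUrl(queueUrl)
    .withMessageBody(messageBody)
    .withMessageGroupId("tenant-123");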
On a standard SQS queue, MessageGroupId becomes a fairness identifier. It does not make the queue FIFO. It does not create ordering within that group. It gives SQS a way to recognize which messages belong to the same logical tenant, customer, client application, request type, or workload group.
That is why there is no direct SQS equivalent of RabbitMQ’s per-consumer prefetch fairness. SQS can observe in-flight messages and message groups, but it is not pushing messages to named subscribed consumers and balancing deliveries across them.
The target metric is dwell time:
dwell time = time from message arrival to message processing
The noisy-neighbor problem is not just that tenant A has a large backlog. A large backlog is expected if tenant A sends more work than the system can process. The real problem is tenant A causing tenant B and tenant C to experience elevated dwell time.
SQS fair queues try to keep quiet tenants’ dwell time low even while the noisy tenant’s own dwell time rises.
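A back-of-the-envelope calculation shows why this matters for the three-tenant setup from earlier. The sketch assumes strict arrival-order processing, four workers that all start idle, and a uniform one second per message. The one-second figure and the `fifo_dwell_seconds` helper are assumptions made for the illustration.

```python
def fifo_dwell_seconds(position, n_workers, service_seconds):
    """Dwell time of the message at 0-based `position` in a strict arrival-order queue.

    Assumes every message takes `service_seconds` and all workers start idle.
    """
    # Messages are picked up in waves of n_workers; wave k starts at k * service_seconds.
    return (position // n_workers) * service_seconds


# tenant-b's first message sits behind tenant-a's 10,000 messages.
print(fifo_dwell_seconds(10_000, n_workers=4, service_seconds=1))  # 2500
```

Tenant B sent 20 messages and waits roughly 2,500 seconds before the first one is touched. That is the dwell time fair queues are designed to shrink.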
What SQS Is Tracking Internally
SQS fair queues identify tenants by MessageGroupId. Messages without a MessageGroupId can still exist in the queue, but each one is treated as its own distinct tenant. AWS recommends setting MessageGroupId on every message so the grouping maps cleanly to a real entity in your system.
SQS then detects noisy neighbors using two signals:
| Signal | Meaning |
|---|---|
| Concurrency share | The tenant’s in-flight messages as a fraction of all in-flight messages |
| Processing time share | The tenant’s recent share of total consumer processing time |
The second signal is important. A tenant can be noisy without having a huge number of messages in flight. If its messages are few but very slow, it can still occupy a large share of consumer time.
AWS documents two noisy-neighbor triggers:
| Trigger | Condition |
|---|---|
| Concurrency share | The tenant has more than 10% of in-flight messages and at least 30 of its own messages in flight |
| Processing time share | The tenant’s recent share of total consumer processing time exceeds 10% |
Those values are approximate. SQS is a distributed system, so detection does not necessarily activate at the exact threshold.
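The two triggers can be written down as a simple predicate. This is only a restatement of the table above: SQS's real detection logic is internal, the thresholds are the approximate values quoted here, and `is_noisy` is a hypothetical function rather than an API.

```python
def is_noisy(tenant_in_flight, total_in_flight, processing_share,
             share_threshold=0.10, min_in_flight=30):
    """Return True if either trigger from the table above fires.

    The default thresholds mirror the approximate values quoted in this article.
    """
    concurrency_share = tenant_in_flight / total_in_flight if total_in_flight else 0.0
    concurrency_trigger = (concurrency_share > share_threshold
                           and tenant_in_flight >= min_in_flight)
    processing_trigger = processing_share > share_threshold
    return concurrency_trigger or processing_trigger


print(is_noisy(40, 200, 0.05))  # True: 20% of in-flight messages and at least 30 in flight
print(is_noisy(5, 20, 0.05))    # False: 25% share, but fewer than 30 in flight
print(is_noisy(3, 200, 0.40))   # True: few messages, but a large processing-time share
```

The middle case shows why a small in-flight population weakens the concurrency signal: with only 20 messages in flight, no tenant can reach the in-flight floor.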
Once SQS marks a tenant as noisy, receive calls are biased toward quiet tenants when quiet-tenant messages are available.
Conceptually:
on ReceiveMessage:
    if quiet-tenant messages are available:
        prefer returning quiet-tenant messages
    else:
        return noisy-tenant messages too
SQS does not drop the noisy tenant’s messages. It does not throttle that tenant’s consumption rate in a strict per-tenant way. If there is spare capacity, or no quiet-tenant work is waiting, consumers can still receive noisy-tenant messages. This is scheduling bias, not admission control.
If multiple tenants are noisy at the same time, AWS says SQS returns messages from the noisy tenant with the fewest in-flight messages first. The goal is still balancing processing time rather than enforcing a fixed tenant quota.
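The selection rule, including the tie-break between several noisy tenants, can be sketched as a function. This is a deterministic toy: `pick_tenant` and its arguments are hypothetical, and the real service applies a bias across a distributed queue rather than a strict rule like this one.

```python
def pick_tenant(waiting, noisy, in_flight):
    """Choose which tenant's message to return for one receive call.

    waiting:   dict tenant -> number of messages waiting in the queue
    noisy:     set of tenants currently flagged as noisy
    in_flight: dict tenant -> number of in-flight messages
    """
    candidates = [t for t, n in waiting.items() if n > 0]
    if not candidates:
        return None
    quiet = [t for t in candidates if t not in noisy]
    if quiet:
        return quiet[0]  # prefer any quiet tenant with waiting work
    # Only noisy tenants have work: favor the one with the fewest in-flight messages.
    return min(candidates, key=lambda t: in_flight.get(t, 0))


waiting = {"tenant-a": 9000, "tenant-b": 3, "tenant-c": 0}
print(pick_tenant(waiting, noisy={"tenant-a"}, in_flight={"tenant-a": 40}))  # tenant-b
print(pick_tenant({"tenant-a": 9000}, noisy={"tenant-a"}, in_flight={"tenant-a": 40}))  # tenant-a
```

The second call is the important one: with no quiet work waiting, the noisy tenant is still served.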
When a Noisy Tenant Becomes Quiet Again
The recovery behavior is also worth noticing.
AWS describes prioritization as continuing until the noisy tenant’s concurrency share and processing-time share fall to levels comparable to quiet tenants.
According to the AWS docs, a tenant stops being treated as noisy when either:
- its backlog has been fully consumed
- no messages from that tenant have been in flight for a continuous 5 minutes
After that, its messages are no longer deprioritized.
That means SQS fair queues are not a permanent classification system. A tenant becomes noisy because of recent queue behavior, and later returns to normal once the noisy condition clears.
This is very different from creating static per-tenant queues or hard per-tenant rate limits. SQS is adapting delivery order inside the shared standard queue.
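The recovery conditions listed above reduce to a small check. As before, this restates the article's description rather than any SQS API: `still_noisy` is a hypothetical helper, and the 300-second default is the five-minute figure quoted above.

```python
def still_noisy(backlog, last_in_flight_at, now, quiet_period_seconds=300):
    """Return False once either recovery condition described above holds.

    backlog:           messages from this tenant still waiting in the queue
    last_in_flight_at: time (seconds) when the tenant last had a message in flight
    now:               current time (seconds), on the same clock
    """
    backlog_drained = backlog == 0
    idle_long_enough = (now - last_in_flight_at) >= quiet_period_seconds
    return not (backlog_drained or idle_long_enough)


print(still_noisy(backlog=500, last_in_flight_at=990, now=1000))  # True
print(still_noisy(backlog=0, last_in_flight_at=990, now=1000))    # False
print(still_noisy(backlog=500, last_in_flight_at=600, now=1000))  # False
```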
One SQS Caveat: Consumer Capacity Still Matters
Fair queues do not remove the need for enough consumers.
For the concurrency-share signal to be useful, SQS needs enough messages in flight for one tenant’s share to stand out. If the consumer fleet is too small, or if Lambda event source mapping is configured with low concurrency or small batches, the in-flight population may be too small for that signal to say much.
AWS says the processing-time-share signal can still detect noisy tenants when in-flight counts are low, but fair queues work best when consumers process enough messages concurrently for both signals to be evaluated.
The Implementation Difference
RabbitMQ:
queue
|
| delivery
v
consumer A ---- unacked count = 1, prefetch = 1 -> not eligible
consumer B ---- unacked count = 0, prefetch = 1 -> eligible
consumer C ---- unacked count = 0, prefetch = 1 -> eligible
SQS:
standard queue with MessageGroupId
tenant-a -> many in-flight messages or high processing-time share -> noisy
tenant-b -> quiet
tenant-c -> quiet
ReceiveMessage prefers tenant-b / tenant-c when their messages are available
RabbitMQ fairness is about which consumer can accept another delivery.
SQS fairness is about which tenant should be favored during message selection.
When I Would Use Each One
I would reach for RabbitMQ’s fair-dispatch pattern when the problem is worker utilization.
Good fit:
- jobs have variable processing cost
- workers should not build large local backlogs
- failed workers should return unacked jobs to the queue
- you control long-running consumers
- you want to tune throughput versus fairness with prefetch
I would reach for SQS fair queues when the problem is shared-queue multi-tenancy.
Good fit:
- many tenants share one standard queue
- one tenant’s burst should not raise dwell time for others
- consumer autoscaling has lag
- you want standard queue throughput and delivery semantics
- you can assign a meaningful MessageGroupId to every message
They can both be called fair. But they are fair to different things.
Key Takeaways
- RabbitMQ fairness is worker flow control: consumer prefetch and manual acknowledgements stop busy consumers from receiving more work.
- SQS fair queues are tenant-aware scheduling: MessageGroupId helps SQS prefer quiet message groups when one group consumes too much processing capacity.
- The knobs live on different sides: RabbitMQ fairness is configured by consumers. SQS fair queues depend on producers setting a useful MessageGroupId.
- The word “fair” hides different goals: RabbitMQ protects workers from overload. SQS fair queues protect quiet groups from noisy-neighbor latency.