Fairness in RabbitMQ and SQS

Fairness sounds like a portable queue concept. In practice, RabbitMQ fairness and Amazon SQS fair queues solve two different fairness problems.

RabbitMQ is trying to avoid giving more work to a busy worker.
SQS fair queues are trying to avoid letting one tenant increase dwell time for everyone else.

That difference sounds small until you look at where each system measures fairness. RabbitMQ measures it at the consumer delivery boundary. SQS measures it at the tenant or message-group scheduling boundary. Same word, different layer.

The Confusion

Imagine a shared queue with four workers and three tenants.

tenant-a sends 10,000 messages
tenant-b sends 20 messages
tenant-c sends 20 messages

That setup contains two different fairness problems: worker capacity and tenant latency. Comparing RabbitMQ fair dispatch and SQS fair queues as equivalent features is misleading because they operate on different entities:

System	Fairness target	What it tries to prevent
RabbitMQ fairness / fair dispatch	Consumers / workers	A slow worker accumulating too many unacknowledged messages
Amazon SQS fair queues	Tenants / message groups	One tenant increasing dwell time for other tenants

The implementation follows from that target.

Push vs Pull Is the Mental Model

The easiest way to get confused here is to bring an SQS mental model to RabbitMQ.

In SQS, consumers poll:

consumer -> ReceiveMessage -> SQS returns messages

SQS does not model delivery as “these are my N connected consumers, and I am pushing the next message to one of them.” A worker asks for messages, SQS chooses which messages to return, and those messages stay in flight until they are deleted or their visibility timeout expires.
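The pull cycle can be sketched as one function. This is a minimal sketch, assuming a client object that behaves like boto3's SQS client (`receive_message` and `delete_message` with these keyword arguments); `poll_once` and `handler` are hypothetical names introduced here for illustration.

```python
def poll_once(sqs, queue_url, handler, max_messages=10, wait_seconds=20):
    """Ask SQS for a batch, process it, and delete what was handled.

    `sqs` is assumed to behave like boto3.client("sqs"); `handler` is a
    hypothetical callable that processes one message body.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows 1 to 10 per call
        WaitTimeSeconds=wait_seconds,      # long polling, up to 20 seconds
    )
    processed = 0
    # The "Messages" key is absent when the poll returns nothing.
    for message in response.get("Messages", []):
        handler(message["Body"])
        # Deleting ends the in-flight state. Without it, the message becomes
        # visible again once its visibility timeout expires.
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        processed += 1
    return processed
```

Nothing in this loop tells SQS who the consumer is or how busy it is. The consumer only asks, and SQS decides what to hand back.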

RabbitMQ is different. Consumers subscribe to a queue:

consumer -> basic.consume(queue)

After that, RabbitMQ has a live consumer registered on the queue. If there are three consumers subscribed, the broker can push deliveries to those consumers as messages become available.

In the simple case, that looks like round-robin delivery:

message-1 -> consumer A
message-2 -> consumer B
message-3 -> consumer C
message-4 -> consumer A

That is why RabbitMQ can make fairness a consumer-level problem. The broker knows which consumers are subscribed and can track how much unacknowledged work each one is holding.

SQS fair queues do not work that way. SQS has consumer-side settings like batch size, long polling, visibility timeout, and Lambda event source concurrency, but those are not the fairness model. Fair queues are about which message group should be returned to a polling consumer, not which consumer should receive the next pushed delivery.

RabbitMQ Fairness: Fair Dispatch

RabbitMQ does not have an SQS-style “fair queue” feature. The common RabbitMQ term is fair dispatch, and it is not a separate queue type. It is the result of two normal AMQP/RabbitMQ mechanisms used together:

manual acknowledgements
consumer prefetch

By default, a RabbitMQ queue distributes messages to active consumers in round-robin order. If there are two workers, RabbitMQ can hand odd-numbered messages to one worker and even-numbered messages to the other.

That works if every message takes roughly the same amount of time.

It breaks down when message cost is uneven:

message-1 takes 30 seconds
message-2 takes 1 second
message-3 takes 30 seconds
message-4 takes 1 second

With naive round-robin dispatch across two workers, one worker receives both 30-second jobs while the other finishes its two 1-second jobs and then sits idle. RabbitMQ is not looking into the message body or estimating processing cost. It only sees deliveries and acknowledgements.
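The gap can be made concrete with a small simulation. This is a sketch under simplifying assumptions: all four messages are already queued, each worker handles one message at a time, and the function names are made up for illustration. It compares strict round-robin assignment with handing each message to whichever worker becomes free first.

```python
import heapq


def round_robin_makespan(costs, workers):
    """Finish time when message i always goes to worker i % workers."""
    busy_until = [0] * workers
    for i, cost in enumerate(costs):
        busy_until[i % workers] += cost
    return max(busy_until)


def first_free_makespan(costs, workers):
    """Finish time when each message goes to the worker that frees up first."""
    free_at = [(0, w) for w in range(workers)]  # (time worker is free, id)
    heapq.heapify(free_at)
    finish = 0
    for cost in costs:
        t, w = heapq.heappop(free_at)  # earliest available worker
        done = t + cost
        finish = max(finish, done)
        heapq.heappush(free_at, (done, w))
    return finish


costs = [30, 1, 30, 1]  # seconds per message, as in the example above
print(round_robin_makespan(costs, 2))  # 60
print(first_free_makespan(costs, 2))   # 31
```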

The usual fix is:

channel.basic_qos(prefetch_count=1)

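Prefetch is a cap on outstanding work. With manual acknowledgements and prefetch_count=1, RabbitMQ will not send a consumer another message until the previous one has been acknowledged. Prefetch has no effect on consumers that use automatic acknowledgements.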
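Putting prefetch and acknowledgements together, a worker might be wired up roughly like this. It is a sketch that assumes `channel` is a pika `BlockingChannel`; `start_worker` and `handle` are hypothetical names, not part of pika.

```python
def start_worker(channel, queue_name, handle):
    """Register a consumer that holds at most one unacknowledged message.

    `channel` is assumed to be a pika BlockingChannel; `handle` is a
    hypothetical callable that processes one message body.
    """
    # Cap outstanding deliveries for consumers on this channel.
    channel.basic_qos(prefetch_count=1)

    def on_message(ch, method, properties, body):
        handle(body)
        # Acknowledge only after the work is done, which frees the slot
        # and lets the broker send the next delivery.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    # auto_ack=False selects manual acknowledgements. Prefetch only limits
    # deliveries when acknowledgements are manual.
    channel.basic_consume(
        queue=queue_name,
        on_message_callback=on_message,
        auto_ack=False,
    )
    return on_message
```

With a real connection, `channel.start_consuming()` would then block and dispatch deliveries to `on_message`.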

How RabbitMQ Decides a Worker Is Busy

RabbitMQ does not know true worker capacity. It does not know whether a worker is using CPU, waiting on I/O, blocked on a database, or stuck because of a bug. It only knows whether a delivered message has been acknowledged.

So the fairness signal is:

busy = this consumer has reached its unacknowledged message limit

That is the whole mechanism behind RabbitMQ fair dispatch: keep track of unacknowledged messages per consumer, and avoid sending more work to consumers already at their prefetch limit.
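As a toy model, the bookkeeping looks something like the class below. It is not RabbitMQ's code, and `ToyBroker` is a name invented for this sketch. It only shows that the busy signal is a counter compared against a limit.

```python
class ToyBroker:
    """Toy model of prefetch-based dispatch. Not RabbitMQ's actual code."""

    def __init__(self, consumers, prefetch):
        self.prefetch = prefetch
        self.unacked = {name: 0 for name in consumers}
        self.order = list(consumers)  # rotation order for round-robin
        self.next_index = 0

    def is_busy(self, name):
        # The only signal: has this consumer reached its unacked limit?
        return self.unacked[name] >= self.prefetch

    def deliver(self):
        """Return the consumer that gets the next message, or None."""
        for offset in range(len(self.order)):
            i = (self.next_index + offset) % len(self.order)
            name = self.order[i]
            if not self.is_busy(name):
                self.unacked[name] += 1
                self.next_index = (i + 1) % len(self.order)
                return name
        return None  # every consumer is at its limit; the message waits

    def ack(self, name):
        self.unacked[name] -= 1


broker = ToyBroker(["A", "B", "C"], prefetch=1)
print([broker.deliver() for _ in range(4)])  # ['A', 'B', 'C', None]
broker.ack("B")
print(broker.deliver())  # B
```

The fourth delivery waits because all three consumers are at their limit. Once B acknowledges, B is the only eligible consumer, regardless of whose turn it would have been.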

The Throughput Tradeoff

prefetch_count=1 gives RabbitMQ the freshest view of which workers are available, but it can reduce throughput: with nothing buffered locally, a worker sits idle between sending an acknowledgement and receiving its next delivery.

In practice, prefetch is a tuning knob:

lower prefetch -> better worker fairness, less local buffering
higher prefetch -> better throughput, more chance that slow workers hold work

That tradeoff reinforces the main point: RabbitMQ fairness is worker flow control, not tenant isolation.

If tenant A publishes 10,000 messages before tenant B publishes 10, RabbitMQ’s fair dispatch pattern does not automatically give tenant B its own share of delivery opportunities. You would need to model that yourself with separate queues, exchanges, priorities, routing keys, consumer pools, or application-level scheduling.
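One of those options, application-level scheduling over per-tenant queues, can be sketched in plain Python. This is a hypothetical scheduler, not a RabbitMQ feature. In-memory deques stand in for whatever per-tenant queues you would actually create, and `interleave_by_tenant` is a name made up here.

```python
from collections import deque
from itertools import islice


def interleave_by_tenant(queues):
    """Yield (tenant, message) pairs, taking one message per tenant per pass.

    `queues` maps a tenant name to a deque of pending messages. This is a
    hypothetical application-level scheduler, not a RabbitMQ feature.
    """
    active = deque(name for name, q in queues.items() if q)
    while active:
        tenant = active.popleft()
        yield tenant, queues[tenant].popleft()
        if queues[tenant]:
            active.append(tenant)  # tenant still has work; rejoin rotation


queues = {
    "tenant-a": deque(range(10_000)),
    "tenant-b": deque(range(20)),
    "tenant-c": deque(range(20)),
}
first_60 = [tenant for tenant, _ in islice(interleave_by_tenant(queues), 60)]
print(first_60.count("tenant-b"), first_60.count("tenant-c"))  # 20 20
```

With the workload from the earlier example, tenant-b and tenant-c are fully drained within the first 60 deliveries instead of waiting behind tenant-a's 10,000 messages.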

SQS Fair Queues

Amazon SQS fair queues are almost the inverse.

The consumer code does not set a prefetch count. The producer adds a tenant identifier to each message:

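import com.amazonaws.services.sqs.model.SendMessageRequest;

// queueUrl and messageBody are assumed to be defined by the caller.
SendMessageRequest request = new SendMessageRequest()
    .withQueueUrl(queueUrl)
    .withMessageBody(messageBody)
    .withMessageGroupId("tenant-123"); // identifies the tenant for fair queues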

On a standard SQS queue, MessageGroupId becomes a fairness identifier. It does not make the queue FIFO. It does not create ordering within that group. It gives SQS a way to recognize which messages belong to the same logical tenant, customer, client application, request type, or workload group.

That is why there is no direct SQS equivalent of RabbitMQ’s per-consumer prefetch fairness. SQS can observe in-flight messages and message groups, but it is not pushing messages to named subscribed consumers and balancing deliveries across them.

The target metric is dwell time:

dwell time = time a message waits in the queue, from arrival until a consumer receives it for processing

The noisy-neighbor problem is not just that tenant A has a large backlog. A large backlog is expected if tenant A sends more work than the system can process. The real problem is tenant A causing tenant B and tenant C to experience elevated dwell time.

SQS fair queues try to keep quiet tenants’ dwell time low even while the noisy tenant’s own dwell time rises.
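Dwell time can be estimated per message on the consumer side. The sketch below assumes the message was received with the `SentTimestamp` and `ApproximateFirstReceiveTimestamp` system attributes requested, and it treats first receipt as the start of processing, which is a simplification. `dwell_time_seconds` is a hypothetical helper name.

```python
def dwell_time_seconds(message):
    """Time between send and first receive, from SQS system attributes.

    Assumes `message` is one entry from a ReceiveMessage response that
    includes the SentTimestamp and ApproximateFirstReceiveTimestamp
    attributes. Both are epoch milliseconds encoded as strings.
    """
    attrs = message["Attributes"]
    sent_ms = int(attrs["SentTimestamp"])
    first_receive_ms = int(attrs["ApproximateFirstReceiveTimestamp"])
    return (first_receive_ms - sent_ms) / 1000.0


example = {
    "Attributes": {
        "SentTimestamp": "1700000000000",
        "ApproximateFirstReceiveTimestamp": "1700000004500",
    }
}
print(dwell_time_seconds(example))  # 4.5
```

Grouping this value by tenant is one way to check whether quiet tenants really do keep low dwell time while a noisy tenant's backlog grows.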

What SQS Is Tracking Internally

SQS fair queues identify tenants by MessageGroupId. Messages without a MessageGroupId can still exist in the queue, but each one is treated as its own distinct tenant. AWS recommends setting MessageGroupId on every message so the grouping maps cleanly to a real entity in your system.

SQS then detects noisy neighbors using two signals:

Signal	Meaning
Concurrency share	The tenant’s in-flight messages as a fraction of all in-flight messages
Processing time share	The tenant’s recent share of total consumer processing time

The second signal is important. A tenant can be noisy without having a huge number of messages in flight. If its messages are few but very slow, it can still occupy a large share of consumer time.

AWS documents two noisy-neighbor triggers:

Trigger	Condition
Concurrency share	The tenant has more than 10% of in-flight messages and at least 30 of its own messages in flight
Processing time share	The tenant’s recent share of total consumer processing time exceeds 10%

Those values are approximate. SQS is a distributed system, so detection does not necessarily activate at the exact threshold.
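To make the two triggers concrete, here is an illustrative model that applies the approximate values quoted above to per-tenant counters. It is not how SQS is implemented, and the function names and input shapes are assumptions made for this sketch.

```python
def noisy_by_concurrency(in_flight, share_threshold=0.10, min_in_flight=30):
    """Tenants whose in-flight share and in-flight count both exceed limits.

    `in_flight` maps tenant -> number of in-flight messages. The defaults
    mirror the approximate values quoted in the text.
    """
    total = sum(in_flight.values())
    if total == 0:
        return set()
    return {
        tenant
        for tenant, count in in_flight.items()
        if count / total > share_threshold and count >= min_in_flight
    }


def noisy_by_processing_time(processing_seconds, share_threshold=0.10):
    """Tenants whose share of recent consumer processing time is too high.

    `processing_seconds` maps tenant -> recent processing time in seconds.
    """
    total = sum(processing_seconds.values())
    if total == 0:
        return set()
    return {
        tenant
        for tenant, seconds in processing_seconds.items()
        if seconds / total > share_threshold
    }


print(noisy_by_concurrency({"tenant-a": 90, "tenant-b": 5, "tenant-c": 5}))
# {'tenant-a'}
```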

Once SQS marks a tenant as noisy, receive calls are biased toward quiet tenants when quiet-tenant messages are available.

Conceptually:

on ReceiveMessage:
  if quiet-tenant messages are available:
    prefer returning quiet-tenant messages
  else:
    return noisy-tenant messages too

SQS does not drop the noisy tenant’s messages. It does not throttle that tenant’s consumption rate in a strict per-tenant way. If there is spare capacity, or no quiet-tenant work is waiting, consumers can still receive noisy-tenant messages. This is scheduling bias, not admission control.

If multiple tenants are noisy at the same time, AWS says SQS returns messages from the noisy tenant with the fewest in-flight messages first. The goal is still balancing processing time rather than enforcing a fixed tenant quota.
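The selection rules described in this section can be written as a small function. This is an illustrative model only. `pick_tenant` and its inputs are hypothetical, and the tie-breaking among quiet tenants is an assumption, since the text only describes ordering among noisy tenants.

```python
def pick_tenant(backlog, in_flight, noisy):
    """Choose which tenant's message to return next, or None if none wait.

    backlog:   tenant -> number of messages waiting in the queue
    in_flight: tenant -> number of messages currently in flight
    noisy:     set of tenants currently marked noisy
    """
    waiting = [t for t, n in backlog.items() if n > 0]
    if not waiting:
        return None
    quiet = [t for t in waiting if t not in noisy]
    if quiet:
        # Prefer quiet tenants. Picking the least-loaded one is an assumption.
        return min(quiet, key=lambda t: in_flight.get(t, 0))
    # Only noisy tenants have work: fewest in-flight messages goes first.
    return min(waiting, key=lambda t: in_flight.get(t, 0))


backlog = {"tenant-a": 100, "tenant-b": 2, "tenant-c": 0}
in_flight = {"tenant-a": 50, "tenant-b": 1}
print(pick_tenant(backlog, in_flight, noisy={"tenant-a"}))  # tenant-b
```

When tenant-b's backlog reaches zero, the same call returns tenant-a, which matches the point that noisy tenants are deprioritized rather than blocked.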

When a Noisy Tenant Becomes Quiet Again

The recovery behavior is also worth noticing.

AWS describes prioritization as continuing until the noisy tenant’s concurrency share and processing-time share fall to levels comparable to quiet tenants.

According to the AWS docs, a tenant stops being treated as noisy when either:

its backlog has been fully consumed
no messages from that tenant have been in flight for a continuous 5 minutes

After that, its messages are no longer deprioritized.
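The two recovery conditions can be expressed as a single check. As with the other sketches, this models the behavior as described above rather than SQS internals, and `still_noisy` and its parameters are hypothetical.

```python
QUIET_AFTER_SECONDS = 5 * 60  # the 5-minute window described above


def still_noisy(backlog, last_in_flight_at, now):
    """Return False once either recovery condition described above holds.

    backlog:           messages from this tenant still waiting in the queue
    last_in_flight_at: time in seconds when the tenant last had a message
                       in flight
    now:               current time in seconds, on the same clock
    """
    if backlog == 0:
        return False  # backlog fully consumed
    if now - last_in_flight_at >= QUIET_AFTER_SECONDS:
        return False  # nothing in flight for a continuous 5 minutes
    return True


print(still_noisy(backlog=500, last_in_flight_at=1000, now=1060))  # True
print(still_noisy(backlog=500, last_in_flight_at=1000, now=1300))  # False
print(still_noisy(backlog=0, last_in_flight_at=1000, now=1060))    # False
```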

That means SQS fair queues are not a permanent classification system. A tenant becomes noisy because of recent queue behavior, and later returns to normal once the noisy condition clears.

This is very different from creating static per-tenant queues or hard per-tenant rate limits. SQS is adapting delivery order inside the shared standard queue.

One SQS Caveat: Consumer Capacity Still Matters

Fair queues do not remove the need for enough consumers.

For the concurrency-share signal to be useful, SQS needs enough messages in flight for one tenant’s share to stand out. If the consumer fleet is too small, or if Lambda event source mapping is configured with low concurrency or small batches, the in-flight population may be too small for that signal to say much.

AWS says the processing-time-share signal can still detect noisy tenants when in-flight counts are low, but fair queues work best when consumers process enough messages concurrently for both signals to be evaluated.

The Implementation Difference

RabbitMQ:

queue
  |
  | delivery
  v
consumer A ---- unacked count = 1, prefetch = 1 -> not eligible
consumer B ---- unacked count = 0, prefetch = 1 -> eligible
consumer C ---- unacked count = 0, prefetch = 1 -> eligible

SQS:

standard queue with MessageGroupId

tenant-a -> many in-flight messages or high processing-time share -> noisy
tenant-b -> quiet
tenant-c -> quiet

ReceiveMessage prefers tenant-b / tenant-c when their messages are available

RabbitMQ fairness is about which consumer can accept another delivery.

SQS fairness is about which tenant should be favored during message selection.

When I Would Use Each One

I would reach for RabbitMQ’s fair-dispatch pattern when the problem is worker utilization.

Good fit:

jobs have variable processing cost
workers should not build large local backlogs
failed workers should return unacked jobs to the queue
you control long-running consumers
you want to tune throughput versus fairness with prefetch

I would reach for SQS fair queues when the problem is shared-queue multi-tenancy.