NSQ Docs 1.3.0

Deployment

What is the recommended topology for nsqd?

We strongly recommend running an nsqd alongside any service(s) that produce messages.

nsqd is a relatively lightweight process with a bounded memory footprint, which makes it well suited to “playing nice with others”.

This pattern aids in structuring message flow as a consumption problem rather than a production one.

Another benefit is that it essentially forms an independent, sharded, silo of data for that topic on a given host.

NOTE: this isn’t an absolute requirement though, it’s just simpler (see question below).
Why can’t nsqlookupd be used by producers to find where to publish to?

NSQ promotes a consumer-side discovery model that alleviates the upfront configuration burden of having to tell consumers where to find the topic(s) they need.

However, it does not provide any means to solve the problem of where a service should publish to. This is a chicken and egg problem, the topic does not exist prior to the first publish.

By co-locating nsqd (see question above), you sidestep this problem entirely (your service simply publishes to the local nsqd) and allow NSQ’s runtime discovery system to work naturally.
I just want to use nsqd as a work queue on a single node, is that a suitable use case?

Yep, nsqd can run standalone just fine.

nsqlookupd is beneficial in larger distributed environments.
How many nsqlookupd should I run?

Typically only a few depending on your cluster size, number of nsqd nodes and consumers, and your desired fault tolerance.

3 or 5 works really well for deployments involving up to several hundred hosts and thousands of consumers.

nsqlookupd nodes do not require coordination to answer queries. The metadata in the cluster is eventually consistent.

Publishing

Do I need a client library to publish messages?

NO! Just use the HTTP endpoints for publishing (/pub and /mpub). It’s simple, it’s easy, and it’s ubiquitous in almost any programming environment.

In fact, the overwhelming majority of NSQ deployments use HTTP to publish.
Why force a client to handle responses to the TCP protocol’s PUB and MPUB commands?

We believe NSQ’s default mode of operation should prioritize safety and we wanted the protocol to be simple and consistent.
When can a PUB or MPUB fail?
1. The topic name is not formatted correctly (to character/length restrictions). See the topic and channel name spec.
2. The message is too large (this limit is exposed as a parameter to nsqd).
3. The topic is in the middle of being deleted.
4. nsqd is in the middle of cleanly exiting.
5. Any client connection-related failures during the publish.
(1) and (2) should be considered programming errors. (3) and (4) are rare and (5) is a natural part of any TCP based protocol.
How can I mitigate scenario (3) above?

Deleting topics is a relatively infrequent operation. If you need to delete a topic, orchestrate the timing such that publishes eliciting topic creations will never be performed until a sufficient amount of time has elapsed since deletion.

Design and Theory

How do you recommend naming topics and channels?

A topic name should describe the data in the stream.

A channel name should describe the work performed by its consumers.

For example, good topic names are encodes, decodes, api_requests, page_views and good channel names are archive, analytics_increment, spam_analysis.
Are there any limitations to the number of topics and channels a single nsqd can support?

There are no built-in limits imposed. It is only limited by the memory and CPU of the host nsqd is running on (per-client CPU usage was greatly reduced in issue #236).
How are new topics announced to the cluster?

The first PUB or SUB to a topic will create the topic on an nsqd. Topic metadata will then propagate to the configured nsqlookupd. Other readers will discover this topic by periodically querying the nsqlookupd.
Can NSQ do RPC?

Yes, it’s possible, but NSQ was not designed with this use case in mind.

We intend to publish some docs on how this could be structured but in the meantime reach out if you’re interested.

pynsq Specific

Why are you forcing me to use Tornado?

pynsq was originally intended as a consumer-focussed library and consuming the NSQ protocol is far simpler to implement in Python with an asynchronous framework (notably due to NSQ’s push oriented protocol).

Tornado’s API is simple and performs reasonably well.
Is the Tornado IOLoop required to publish?

No, nsqd exposes HTTP endpoints (/pub and /mpub) for dead-simple programming language agnostic publishing.

If you’re worried about the overhead of HTTP, don’t be. Additionally, /mpub reduces the overhead of HTTP by publishing in bulk (atomically!).
When would I want to use Writer then?

When high performance and low overhead is a priority.

Writer uses the TCP protcol’s PUB and MPUB commands which have less overhead compared to their HTTP counterparts.
What if I just want to “fire and forget” (I can tolerate message loss!)?

Use Writer and don’t specify a callback to the publish method.

NOTE: this only has the effect of simpler client side code, behind the scenes pynsq still has to handle the response from nsqd (i.e. there isn’t a performance benefit of doing this).

Special thanks to Dustin Oprea (@DustinOprea) for kickstarting this FAQ.