Show HN: go-nbd – A Pure Go NBD Server and Client (github.com/pojntfx)
113 points by pojntfx on March 29, 2023 | 44 comments
Hey HN! I just released go-nbd, a lightweight Go library for effortlessly creating NBD servers and clients. It's a neat tool for creating custom Linux block devices with arbitrary backends, such as a file, a byte slice or (what I'm planning to use it for) a tape drive. While there are already a few partially abandoned projects like this out there, this library tries to be as maintainable as possible by implementing only the most recent handshake revision and baseline functionality for both the client and the server, while still supporting enough of the protocol to be useful.
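
For a feel of the API: a custom backend is just a type with ReadAt/WriteAt (in the io.ReaderAt/io.WriterAt style) plus Size and Sync. Here's a minimal in-memory sketch, assuming that Backend contract from the README; see the repo for the authoritative definitions and a complete server example.

    // assumes: import "io"
    // memoryBackend serves a byte slice as a block device.
    type memoryBackend struct {
        data []byte
    }

    func (b *memoryBackend) ReadAt(p []byte, off int64) (int, error) {
        if off >= int64(len(b.data)) {
            return 0, io.EOF
        }
        return copy(p, b.data[off:]), nil
    }

    func (b *memoryBackend) WriteAt(p []byte, off int64) (int, error) {
        if off >= int64(len(b.data)) {
            return 0, io.ErrShortWrite
        }
        return copy(b.data[off:], p), nil
    }

    func (b *memoryBackend) Size() (int64, error) {
        return int64(len(b.data)), nil
    }

    func (b *memoryBackend) Sync() error {
        return nil // nothing to flush for an in-memory slice
    }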

I'd love to get your feedback :)




NBD = network block device. Hope I saved somebody a google.


Did nobody else learn to spell out an acronym the first time it's used?

I had heard of it, but I had to read "NBD" far too many times in that repo before I saw what it stood for.


I've always thought of it as common courtesy. It's super frustrating when you're reading something and thinking "great, but wtf are you actually going on about?"


No one knows how to write or how to use hypertext properly anymore, and it drives me nuts.


Here I was thinking it's a Go "no big deal" server and client.


"The Network Block Device is a Linux-originated lightweight block access protocol that allows one to export a block device to a client."


How does NBD compare to, say, iSCSI?

Beyond likely being simpler to understand/manage, I mean.


SCSI was a fairly wide-ranging protocol, supporting anything from hard disks to CD recorders to document scanners, and iSCSI could theoretically encapsulate all of that. SCSI also came with a lot of historical quirks, like 6/10/12/16 byte addressing, which were progressively added as devices got larger and requirements got more complex. As a result, implementing software to interact with iSCSI is a pain, because there's simply so much legacy weirdness to deal with.

NBD is much more narrowly focused. It exposes a single block device to the kernel, with a minimal set of commands focused on that use case (e.g. read, write, trim, prefetch, etc). It doesn't do as many things as iSCSI, but that's probably for the better.
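
For reference, the transmission-phase command set in the public NBD spec really is about this small (a non-exhaustive sketch; values are from the spec):

    const (
        NBD_CMD_READ         = 0 // read length bytes at offset
        NBD_CMD_WRITE        = 1 // write length bytes at offset
        NBD_CMD_DISC         = 2 // disconnect cleanly
        NBD_CMD_FLUSH        = 3 // flush writes to stable storage
        NBD_CMD_TRIM         = 4 // discard a range
        NBD_CMD_CACHE        = 5 // hint: prefetch a range
        NBD_CMD_WRITE_ZEROES = 6 // zero a range without sending a payload
    )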


It's much, much simpler than iSCSI, which is an advantage.

It's possibly more idiomatically Linux. But the Linux iSCSI initiator might (last I checked?) do a better job of utilizing the kernel block multiqueue interface than nbd, and thus might get higher I/O performance.

nbd is extremely simple to set up; iSCSI less so.


Thanks. My first thought was Next Business Day, followed by "Why does this thing need a server/client?"


My first thought was that it was a No Big Deal server akin to Python’s simple HTTP server



NBD is a simple protocol; I used it to recover a RAID5 hardware array that lost parity, in just a few lines of C [0] [1].

[0] https://dev.to/rkeene/raid5-lost-raid5-recovered-3kld

[1] https://www.rkeene.org/projects/info/resources/diatribes/blo...


That is an amazing story! How did you write the userspace driver? Was the enclosure running a well-known RAID5 system?


It was a Sun enclosure, so proprietary and no documentation on the raw disk layout.

I'm not sure what you mean regarding how I wrote it; it's a single C file I wrote on my workstation, which I had attached the enclosure to in JBOD mode so I could read every disk directly.
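
The heart of it is just XOR: any single missing RAID5 block is the XOR of the surviving blocks in its stripe. Roughly, sketched in Go rather than the C I used:

    // xorReconstruct rebuilds one missing RAID5 block from the
    // surviving blocks of the same stripe.
    func xorReconstruct(surviving [][]byte) []byte {
        out := make([]byte, len(surviving[0]))
        for _, block := range surviving {
            for i, v := range block {
                out[i] ^= v
            }
        }
        return out
    }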


Very cool! I'm curious if you've explored testing error cases yet? Years ago I fooled around with nbdkit and developed "bad sector" and "bad disk" plugins, and found that the error handling around these scenarios left a little to be desired.

https://github.com/pepaslabs/nbdkit-baddisk-plugin

https://github.com/pepaslabs/nbdkit-badsector-plugin


Thanks! I have not yet actually - I am planning to test this with MHVTL to get some artificial delay in there (for the upcoming tape backend), but something like this would be interesting to integrate/port!


You should probably defer the mutex unlock() and not use naked returns: https://github.com/pojntfx/go-nbd/blob/main/pkg/backend/file...

Deferring overhead is very small nowadays.
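
Something like this, sketched on a hypothetical file backend (the type and fields are illustrative, not the actual code behind the link):

    // assumes: import ("os", "sync")
    type fileBackend struct {
        f    *os.File
        lock sync.Mutex
    }

    func (b *fileBackend) ReadAt(p []byte, off int64) (n int, err error) {
        b.lock.Lock()
        defer b.lock.Unlock() // released on every path, even a panic

        n, err = b.f.ReadAt(p, off)
        return n, err // explicit values instead of a naked return
    }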


I did think of doing that, but from my understanding there is a slight performance hit from `defer`, and there is no other branch where it could deadlock - or am I missing something here? Thanks either way!

Edit: Oh I just saw the addition to your comment - that is exactly what I was thinking of ^^


Defer overhead was mostly fixed in Go 1.14. From: https://go.dev/doc/go1.14

> This release improves the performance of most uses of defer to incur almost zero overhead compared to calling the deferred function directly. As a result, defer can now be used in performance-critical code without overhead concerns.

EDIT: https://github.com/golang/go/issues/14939 I believe is the main tracking bug for this.


What about during a panic?


I think there was no performance hit on panic; it was a memory leak, which was fixed a long time ago.


They mean that the code could deadlock on a panic if the unlock isn't deferred. (At least, in the case where the panic ended up being handled somewhere and the process didn't just exit.)
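
A tiny illustration (doWork is a stand-in for anything that panics):

    // assumes: import "sync"
    var mu sync.Mutex

    func risky() {
        defer func() { recover() }() // the panic is swallowed; the process lives on
        mu.Lock()
        doWork()    // stand-in; suppose it panics here...
        mu.Unlock() // ...then this never runs and mu stays locked
    }

    // The next mu.Lock() anywhere now blocks forever. With
    // `defer mu.Unlock()` right after Lock, this can't happen.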


Good point, thanks!


Since this heavily involves networking, take a look at gnet [0]. You might find some interesting performance improvements by using that over just net.Conn.

[0] https://github.com/panjf2000/gnet
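
For a sense of the programming model, a minimal echo handler with the v2 API looks roughly like this (from memory of the project's README; treat the exact signatures as assumptions and check the repo):

    // assumes: import "github.com/panjf2000/gnet/v2"
    type echoServer struct {
        gnet.BuiltinEventEngine // no-op defaults for the other callbacks
    }

    func (s *echoServer) OnTraffic(c gnet.Conn) gnet.Action {
        buf, _ := c.Next(-1) // take whatever bytes have arrived
        c.Write(buf)         // echo them back
        return gnet.None
    }

    func main() {
        gnet.Run(&echoServer{}, "tcp://:9000", gnet.WithMulticore(true))
    }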


Probably wouldn't do this unless you really needed to; NBD workloads are probably easier than HTTP workloads (a single NBD "mount" might have lots of connections, but you're not adding and removing hundreds of connections per second).


You might be right. Different workloads will definitely have different effects. That said, implementing the gnet api is pretty easy and doesn't require a huge context switch. It is worth a test to see which one performs better.

I used it for a TCP connection (JSON-RPC) workload and it was far better, and the code was cleaner.


Right, I don't want to talk down gnet, it's neat, but you're basically writing libevent-style networking code (i.e., very non-idiomatic Go code), and it seems to me like most of the perf win here is minimizing the number of goroutines you have serving blocking operations, which is not really a problem you're going to have with an NBD implementation.


I see your point. In my case it was basically a proxy concentrator. On one side, accept and hold open a huge number of connections, then maintain a single open connection on the other side. It worked really well for this situation.


Thanks, I had not heard of that package, I will be sure to check it out!


I use NBD for my single-user NAS. You can use FDE (full-disk encryption, LUKS) client-side.

The system managing the physical block devices never sees unencrypted data.

Such a setup shines when you use a laptop as your main machine and wish to have a lot of secure storage. The approach I chose was to bond Ethernet and WiFi 6E and use WireGuard (with a PSK), and wander the house with uninterrupted access.


This sounds really interesting, as in: It's the thing I didn't know I've been looking for because I didn't know it existed.

Could you elaborate on your NBD setup? (What do you use for the server?) And what kind of latencies do you see when you're at home / not at home? How do you handle backups? (Do you back up the encrypted blob server-side or do you back up the unencrypted data client-side, at the cost of having to deal with (probably) limited bandwidth & latencies.)


I just use nbd-server and nbd-client (kernel module nbd). While at home things are fine; when not at home they can get bad unless you use something like UDPspeeder <https://github.com/wangyu-/UDPspeeder>, in which case you just need to deal with slower speeds.
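
The server side is basically just an export stanza in nbd-server's config, something along these lines (paths and the export name are illustrative; see nbd-server(5) for the real set of options):

    [generic]
    # run as an unprivileged user if you like
    # user = nbd
    # group = nbd

    [storage]
    # the backing file only ever holds ciphertext, since LUKS
    # sits on top of /dev/nbd0 on the client side
    exportname = /srv/nbd/storage.img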

I do backups with borg while connected to the NAS with ethernet.


Thanks so much for the insights!

> While at home things are fine, when not at home things can get bad

What does "bad" mean exactly? Unrecoverable errors / faulty writes? It might be an irrational gut feeling on my part but for some reason operating a block device over a network makes me feel uneasy. How easy is it to mess up one's data?


NBD uses TCP, and I've had no unrecoverable (journal) corruption happen.

Imagine you had a physical drive with unreliable reads and writes that would nonetheless always go through eventually. In that scenario I believe you would just see abysmal latency as the OS keeps retrying - and if the latency between retries were in the milliseconds, actual throughput would drop to nearly zero.

That said, a sudden unrecoverable loss of connectivity would be the equivalent of yanking out a physical drive. You would want to use a journaling filesystem.


Thanks for elaborating!


NBD is fairly simple. I wrote a minimal server in Python[0]. It is just a few lines.

0. https://github.com/rvalles/pyamigadebug/blob/master/NBDServe...


Exactly! NBD is a really simple protocol; it's part of why I enjoy using it so much.
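
To illustrate: per the public spec, the whole transmission phase is a fixed 28-byte request header and, for the simple reply form, a 16-byte reply (all big-endian on the wire):

    // Every transmission-phase request is this fixed header,
    // followed by a payload only for writes.
    type nbdRequest struct {
        Magic  uint32 // 0x25609513
        Flags  uint16 // command flags, e.g. FUA
        Type   uint16 // NBD_CMD_READ, NBD_CMD_WRITE, ...
        Handle uint64 // opaque; echoed back in the reply
        Offset uint64 // byte offset into the export
        Length uint32 // byte count
    }

    // The simple reply form, followed by a payload only for reads.
    type nbdSimpleReply struct {
        Magic  uint32 // 0x67446698
        Error  uint32 // 0 on success
        Handle uint64 // matches the request's Handle
    }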


I misread this as NZB for Usenet and was excited someone had written one in Go.




Or if you're not on a mobile device, https://en.wikipedia.org/wiki/Network_block_device


Why is there even a separate domain for mobile that has these issues?


German: English, but Capitalized.



