These days, I design everything for home with extreme simplicity coupled with detailed documentation on how I set things up.
Docker has helped tremendously, since you can essentially use an out-of-the-box Linux distro with docker installed, and you don't really have to install anything else on the hardware. Then if at all possible, I use standard docker images provided by the software developer with no modifications (maybe some small tweaks in a docker-compose file to map to local resources).
Anyway, my advice is to keep the number of customizations to a bare minimum, minimize the number of moving parts in your home solutions, document everything you do (starting with installing the OS all the way through configuring your applications), capture as much of the configuration as you can in declarative formats (like docker compose files), back up all your data, and just as importantly, back up every single configuration file.
A significant amount of data collection by third parties can be eliminated or reduced by retaining control over the internet gateway. Arguably this amount is even greater than what can be affected by simply switching to using carefully selected alternative third parties. IMO, it is a mistake to believe that one can reliably eliminate/reduce data collection simply by choosing the "right" third parties. Whack-A-Mole, cat-and-mouse, whatever the term we use, this is a game the user cannot win. Third parties providing "services" over the internet are outside the user's control. For worse not better, they are subjected to market forces that drive them to collect as much user data as they can get away with.
Regardless of these privacy-destructive market forces, it is still possible to build decent routers from BSD project source code and inexpensive hardware. IMO, this is time well spent.
1. Control by the user
Most of the data passing through it is encrypted: HTTPS, SSH.
Cutting off the phone-home requests is best done on the respective devices: you can run firewalls on most desktops and laptops, and even phones. Phones often go online via GSM or LTE, without passing through the home router.
While a proxy like pihole can be helpful sometimes, cutting off tracking and ads is done best by browser extensions and by using open-source clients, where available.
The best the home router can do is to not be vulnerable to exploits, to be otherwise up to date, and to be fully under the owner's control. That's why my home router runs OpenWrt.
"... cutting off tracking and ads is best done by browser extensions ..."
What if the browser vendor, who is also a data collector, requires the user to log in or otherwise identify herself before she can use extensions?
A home "gateway" is a computer running a kernel with IP forwarding enabled that is being used as the point of egress from the home network to the internet. That is a broad definition and allows for much creativity. That is what I mean by the term "gateway". As such, a gateway can, both in theory and in practice, do anything/nothing that "desktops and laptops, and even phones" can do. Relying solely on pre-configured "limited/special purpose" OS projects as a replacement for DIY and creativity in setting up a gateway was not what I had in mind, but is certainly an option amongst many others.
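For anyone wanting to check whether a given Linux box already meets that definition, here is a minimal sketch. It assumes a Linux kernel, where the IPv4 forwarding switch is exposed at /proc/sys/net/ipv4/ip_forward; BSD systems expose the equivalent setting through sysctl instead, which this does not cover. The function name is my own.

```python
from pathlib import Path


def ip_forwarding_enabled(sysctl_path="/proc/sys/net/ipv4/ip_forward"):
    """Return True if the kernel reports IPv4 forwarding as enabled.

    Assumption: Linux-style procfs. A missing or unreadable file is
    treated as "not forwarding".
    """
    try:
        value = Path(sysctl_path).read_text().strip()
    except OSError:
        return False
    return value == "1"


if __name__ == "__main__":
    print("IPv4 forwarding enabled:", ip_forwarding_enabled())
```

This only reports the kernel switch; whether the machine is actually the point of egress still depends on routing and firewall rules.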
An on-device firewall can firewall individual processes and applications. An upstream / gateway firewall does not have such fine-grained control. That was my point.
Running stuff on my router is entirely possible, but I limit it to routing and running a wireguard endpoint. I prefer to run my private stuff in the confines of the home LAN.
Never said that I did! :)
I use a text-only browser for reading HTML. Many times I do not even use a browser for making HTTP requests.
Truly one can use all these strategies, application-based, router-based, gateway-based, if they are available. They are not mutually exclusive. Personally I just would not feel like I can rely on extensions or other solutions tied to some software I do not compile myself. (I do edit and compile the text-only browser.)
All due respect to Firefox, but I have found compiling it is way too time and resource-intensive. It is way beyond what I need for recreational web use. Firefox users seem to rely on Mozilla to do the right things on their behalf. That is not the sort of "control by the user" I am after.
What is behind Mozilla? Online advertising money. Cannot really count on them to do what I want.
Multi-wan is easier with appliances. I used pfSense over the last 12 years or so with multi-wan on and off (currently off). I've run pfSense in a kvm VM, and you can do multi-wan with this. Though I generally recommend dedicated NICs for the WANs and LAN.
I've looked at the linux based appliances (as late as last week) and only clearos or openwrt supported multi-wan. I could be wrong (I'd like to be as pfSense/OPNsense are FreeBSD based, and that comes with, sadly, huge amounts of baggage, limited hardware support, etc.). I'll likely be looking at that package as a potential replacement for the pfSense system, though if clearos can't handle what I need, OPNsense is like pfSense, but with far less baggage.
If you don't mind tinkering, you might be able to use mwan3.
If you prefer OpenWRT, you can look at running it in a VM along with mwan3.
It is uniquely suited to run services without interruption, like file sharing, messengers, torrents, an HTTP server, etc. Smartphones and laptops are mobile devices with spotty connections, powered by batteries.
Instead we've got Cloud.
This happened a few times to me over the years and then I was lucky enough to go on a packer/terraform course.
Now everything is scripted and stored in git. A Gitlab job rebuilds the VMs from scratch every two weeks to include the latest bugfixes and updates.
It was a lot of work at first but actually most of it was a learning experience.
What happens when those images are not available, terraform/packer change APIs, etc?
Of course, you need to figure out what you need to change and why, but you'll never not need to do this, if you're rolling your own infra. K8s allows you to roll a lot more of the contextual stuff into the system.
Then you've just traded maintaining one system for maintaining another.
The annoying part is that when I do want to do updates (e.g. updating cert-manager from 0.1.x to 1.0.x, etc) it can be a pain. So I save these large updates for once a year or so.
So I guess that's one more thing to worry about it seems, maintaining your own images repository!
I develop mobile applications, and use Sonatype's Nexus repository storage as my primary dependency resolver. Every time I fetch a new dependency it gets cached.
A monthly script then takes care of clearing out any cached dependencies which are not listed in any tagged version of my applications.
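The core of such a cleanup is just a set difference. A rough sketch, not the actual script: the dependency identifiers and the shape of the manifest mapping are made up for illustration, and talking to the repository itself is left out entirely.

```python
def stale_dependencies(cached, tagged_manifests):
    """Return cached dependency IDs not referenced by any tagged release.

    cached: iterable of dependency IDs currently in the cache.
    tagged_manifests: mapping of tag name -> iterable of dependency IDs
    that the tagged version declares (hypothetical structure).
    """
    in_use = set()
    for deps in tagged_manifests.values():
        in_use.update(deps)
    return sorted(set(cached) - in_use)


if __name__ == "__main__":
    cached = ["okhttp:4.9.0", "okhttp:4.9.3", "gson:2.8.6"]
    manifests = {
        "v1.0": ["okhttp:4.9.0", "gson:2.8.6"],
        "v1.1": ["okhttp:4.9.3", "gson:2.8.6"],
    }
    print(stale_dependencies(cached + ["junit:4.12"], manifests))
```

Anything the function returns is a candidate for deletion; anything still listed by at least one tag stays, so old releases remain buildable.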
Plenty of primary sources package their software for Docker these days.
It doesn't need to be perfect - I have a OneNote notebook that has the customizations that I've done to my router (static IP leases and edits to /etc/config/network), and some helper docs for a local Zabbix install in docker that I have. I recently learned how to migrate a database from one docker image to another and there is no way I would remember how to do that for the next time, so I wrote everything I learned down.
Just a simple copy/paste and some explanatory text is usually good enough. Anything more complex (e.g., mirroring config files in github) still (IMO) needs enough bootstrap documentation because unless you're working with it daily you're going to forget how your stuff works.
Additionally a part of my brain is worried that if I get hit by a bus my wife/kids will have a hell of a time figuring out what I did to the network. Onenote won't help them there but I haven't figured out the best way of dealing with this.
(I recognize the irony in a "I'll host it myself" post in storing stuff in onedrive with onenote but oh well)
Besides being relatively lightweight and simple to set up, out-of-the-box draw.io integration is nice. Makes diagramming networks and other things dead simple. And I know "dead simple" means I'm infinitely more likely to actually do it.
That folder is just a collection of markdown files for each program / system and when I save on one device it updates the documentation on them all.
I use Atom to view and edit them on my Linux machines and a markdown editor app on my phone. This allows me to search across the notes too.
I've had this fairly simple, free, open source setup for years with no problems.
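If you don't have an editor with search handy, searching a folder of markdown notes is only a few lines. A small sketch, assuming the notes are UTF-8 `.md` files somewhere under one directory; the function name and the folder layout are my own assumptions, not the setup described above.

```python
from pathlib import Path


def search_notes(notes_dir, term):
    """Return (file name, line number, line) for case-insensitive matches.

    Only *.md files under notes_dir (including subfolders) are searched.
    """
    needle = term.lower()
    hits = []
    for path in sorted(Path(notes_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

For example, `search_notes("notes", "wireguard")` would list every line mentioning WireGuard across all the notes, with the file and line number to jump to.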
I've preferred VMs for functional appliances for a while now. I like the isolation compared to containers. Though YMMV.
Right now, the hardest migration I have is my mail system, which makes use of a fairly powerful pipeline of filters in various postfix connected services. It's not fragile, but it is hard to debug.
I host it myself, as the core thesis of the article pointed out, you can be deplatformed, for any reason, with no recourse. And if you lose your mail, you are probably in a world of hurt.
The one thing I am concerned about is long term backup. I need a cold storage capability of a few 10s of TB, that won't blow up my costs too badly. Likely the best route will be a pair of servers at different DCs, running minio or similar behind a VPN that I can rsync to every now and then. Or same servers with zfs and zfs send/recv.
Thinking about this, but still not sure what to do.
So it's a bit of a catch-22, I want a secure and stable home system, I don't want to spend much time working on it, but I want full flexibility to install and run what I want, and don't want to trust some off the shelf consumer solution that's likely going to be out of support in a couple years.
Ansible will be around for a while, but even if it's not, its (YAML) syntax is incredibly easy to read. Any successor in that area is somewhat likely to have compatibility or at least a migration path.
This together reaps the benefits of Docker (enhanced through Compose), and Ansible is documentation in itself. There's barely any actual comments. The code speaks for itself. Also, I can reproduce my stack with incredible ease.
This right here. I recently lost my home server of 10 years courtesy of the Texas power issues during the winter storm. I rebuilt and started with a fresh Linux install. Having a recent backup of /etc made it so much easier than it could have been. I had more trouble with the network driver on the new mb than with all my services, customizations and data.
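Getting that kind of /etc backup can be very low-tech. A minimal sketch that packs a config directory into a timestamped tarball; the function name, the archive naming scheme and the destination layout are all assumptions for illustration, and it does nothing about rotation or copying the archive off the machine.

```python
import tarfile
import time
from pathlib import Path


def backup_config(source_dir, dest_dir, label="etc"):
    """Pack source_dir into <dest_dir>/<label>-<timestamp>.tar.gz.

    Returns the path of the archive that was written. Files the current
    user cannot read will raise an error, so run it with enough privileges.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{label}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # Store entries under the directory's own name, e.g. "etc/hosts".
        tar.add(str(source), arcname=source.name)
    return archive
```

Called as `backup_config("/etc", "/mnt/backup/config")` from a cron job or timer, this would leave one dated archive per run, which is enough to diff against a fresh install later.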
Eventually, whatever platform / tool you use will need to be upgraded. Security vulnerabilities, new features, etc happen and projects like these can get abandoned within a 5-year timeframe. When you have to migrate to either a new or upgraded platform, you have to figure it all out yourself. When the config is broken by an upstream dependency, you’re on the hook too. Who knows if the build tools you used still work on current versions of things.
Like it or not, we’re all kind of stuck on these platforms we don’t control. The alternative is to become fluent in yet another technical stack, but one that will be used infrequently and won’t really translate to anything else unless you’re trying to build your own cloud service on consumer-grade hardware.
The complexity is still under the bed. We're all going to have to dig under that bed one day. Or we're just going to end up buying new hockey sticks, football pads, etc. Which is to say, we're going to end up with Linux on top of Docker on top of Linux.
That's just my opinion of course. It's possible to write confusing Dockerfiles but really they're mostly just shell scripts. And the idea of "Linux on top of Docker" seems a bit odd - there's only ever one Linux kernel no matter how many containers you have running. Docker is built on Linux.
Also, easy to backup.
Kind of worrisome to abandon lower layers with their problems and build on top of them, but what can you do, but get good at jenga.
Container solutions really only get you out of “doing the work” if you can leverage a prepackaged container management solution from a cloud provider. Self-hosted containers are frequently more trouble than they’re worth since the solutions for managing them are either insanely complex (Kubernetes) or so simplistic you have to write some custom logic to build/deploy them.
Agree with the point about CentOS though; at this point the idea of a Linux “distro” is dead. The way forward is a hypervisor model where the kernel is protected from application code (including all dependencies from the userland) with barebones Docker images like Alpine used as the basis for a declaratively-defined system. The one thing Docker does very well is isolate dependencies which reduces integration complexity and lets you modularize the whole thing. Incidentally this actually makes it harder for hobbyists to get into as the mental model for it has a lot more complexity and you need more expert knowledge to write the scripts to build and configure your app automatically.
Yes, every time a new debian release is out something I fiddled with will break and I have to remember how it works, but I see it as some sort of training. As a dev I don't want to be completely clueless about how things I use every day work, so while changes that break old configs and workflows are annoying I'm forced to learn what changed in the gnu/linux/debian world and maybe even find out why.
Also I get better at documenting things. Years ago I didn't even know what exactly to document since the moment you do something everything just naturally comes together, but after a couple times you kinda get a feeling for what the important bits will be two years down the line.
So about once a year I reserve a perfectly good weekend to upgrade, restructure or otherwise maintain my little home server running debian with things like mariadb, nginx, a filtering bridge for my lan, dnsmasq with a block list, borgbackup, syncthing, cups for printing and a couple other things I don't remember right now.
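For the dnsmasq block list part, one common approach is to turn a hosts-format list into `address=/domain/ip` lines in a dnsmasq config file. A sketch of that conversion, assuming the input is a hosts-style list with optional comments; the function name and the 0.0.0.0 sink address are my choices, not necessarily how the setup above does it.

```python
def hosts_to_dnsmasq(hosts_text, sink="0.0.0.0"):
    """Convert hosts-format block list text into dnsmasq address= lines.

    Accepts "<ip> <name> [<name>...]" lines as well as bare domain names,
    ignores comments and localhost entries, and drops duplicates.
    """
    rules = []
    seen = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        # With more than one field the first is the IP; otherwise it is a bare domain.
        names = parts[1:] if len(parts) > 1 else parts
        for name in names:
            name = name.lower()
            if name in ("localhost", "localhost.localdomain") or name in seen:
                continue
            seen.add(name)
            rules.append(f"address=/{name}/{sink}")
    return rules


if __name__ == "__main__":
    sample = "0.0.0.0 ads.example.com\n127.0.0.1 localhost\ntracker.example.net\n"
    print("\n".join(hosts_to_dnsmasq(sample)))
```

Note that an `address=/example.com/...` rule in dnsmasq also covers subdomains of that name, so a list converted this way blocks somewhat more broadly than the same entries in a plain hosts file.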
The worst thing is that I have often gone through a lot of effort around making it easy to set up and deploy (docker and whatnot) but even that I have forgotten about. (I came across a docker file in an old project and couldn't get it to work properly until I noticed that there was a docker compose file lying around that I had missed)
How do you keep track of documentation? I guess for a project a README in the git's root is a good start, but what about more complex systems stuff that does not live in a git project? For example, I had to manually edit a bunch of config files on my Proxmox setup to get docker and some other things to work properly. Where would I document such manual steps? I am thinking a text file somewhere in cloud storage but then of course I'd need to remember that...
For my VPS, I have a Notion page where each project (name, url, mapped ports) is a row in a table. Then the project page contains a copy of my docker config and various information I might need for maintenance/reinstallation.
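A table like that also works as plain data, which makes it easy to sanity-check. A small sketch, assuming each row is a dict with `name`, `url` and `ports` keys (my own field names, mirroring the columns mentioned above), that flags host ports claimed by more than one project:

```python
from collections import defaultdict


def port_conflicts(projects):
    """Map each host port used by more than one project to the project names."""
    owners = defaultdict(list)
    for project in projects:
        for port in project["ports"]:
            owners[port].append(project["name"])
    return {port: names for port, names in owners.items() if len(names) > 1}


if __name__ == "__main__":
    inventory = [
        {"name": "blog", "url": "https://blog.example", "ports": [8080, 8443]},
        {"name": "wiki", "url": "https://wiki.example", "ports": [8080]},
    ]
    print(port_conflicts(inventory))
```

Running this before adding a new service catches the kind of clash that otherwise only shows up when a container refuses to start.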
What is the learning curve of Nix?
It is very different which can be very off-putting, but is not usually gratuitously different, and once you get used to it, it's pretty straightforward.
After having done it a few times, I find that I can adapt a random project not already in Nixpkgs for nix in under an hour, and it's something I do maybe twice a year or so.
One counterintuitive advantage I found switching to nix from other systems is that since the 4 step "download, configure, make, make install" usually doesn't work, I take the time to make a nix expression. On Gentoo and Arch, I would often just install to /usr/local from source and then forget what I had installed and not know how to upgrade it. If you have more discipline than I do, then it's a bug not a feature, but for me it's super helpful.
1: If the project uses cmake or autotools and has no strange dependencies, then packaging it for nix is trivial. However a surprising number of packages do things like downloading dependencies from the internet at build time, and it's not always immediately obvious how to adapt that to nix. Projects using npm or pip also probably won't work right away just because the long-tail of dependencies means that there will be at least one dependency that isn't already in nixpkgs (haskell should in theory be just as bad, but the strange proclivity for haskellers to use nix means that someone has probably already done the work for you).
Why should you need to remember it? Like you wrote later on, you just "document everything you do", as you do it. That's better than any sort of script or version control, since you can describe it how you like, which means quite succinctly. And it's not too difficult to adopt the mindset of "What would I have needed to know 10 minutes ago to understand what to do".
> Docker has helped tremendously, since you can essentially use an out-of-the-box Linux distro with docker installed, and you don't really have to install anything else on the hardware
This actually illustrates, IMHO, how containers and docker are overused. You're talking about a single machine with a single purpose/set-of-purposes, and no occasional switching of configurations. So - why containerize? Whatever you have on Docker, just have that on your actual system.
> my advice is to keep the number of customizations to a bare minimum
Sound advice based on my own (limited) experience with self-hosted home-servers.
> and just as importantly, back up every single configuration file.
Fine, but don't rely on this too much. It's always a pain, if at all possible, to restore stuff based on the config files. Usually easier to follow your self-instructions for configuration.
I just got into a situation where the mobo started failing, so I decided it was time to reinstall, bought new hardware and threw it together. Within a week I migrated everything that had been customized over ~20 years without taking notes (but with my good ol' trusty diffing software): moved it over from the previous server, optimized a bit, removed unnecessary settings, upgraded postfix and dovecot on the new server, replaced SpamAssassin with rspamd, php-fpm, ...
The server was mostly operational in 2 days. Everything else was studying new software, doing things that I had always wanted to do but didn't want to turn the whole configuration around for (like stuffing everything into jails), customizing netdata, upgrading database/nextcloud/... etc. It would have taken longer if I had lost the data, but I trust zraid and my LTO drive; they have never failed me.
Now I will sniff around it every week, maybe run some update etc. and I will be fine until some disk fails.
Being a system administrator is not my occupation, but it helps if you NEVER EVER become a hostage to a cloud provider. The more you settle into the leisure of someone else doing everything for you, the fewer chances you will have to learn new things and the more you will be at the mercy of someone else. And the longer you enjoy such a situation, the more technology progresses and the bigger the gap between the knowledge you have and the knowledge needed to, in this case, set up a server.
I remember dreaming in 1991, as a kid, that in 20 years everyone will have his own server at home, and how we will transfer files simply. But now everyone is saying they don't have time and buy ready made boxes or pay for the cloud.
The technology came, now it is simpler to use than ever.
But no one has "time" for that now. Or is it really the time?
Don't take having a home server as a pain. It is a great way to learn things.
I can bake a perfect bread too. "Kicked" all my girlfriends out of kitchen. Brewing home beer. ...
But none of this would have happened if I had just gone to the bakery, eaten in restaurants, bought beer in a store. ... Sure I could, but I didn't.
Definitely a good strategy, yeah. And on the documentation front, that's exactly what I've started doing differently, too; after having to recreate my mail server one too many times, I eventually decided "you know what, I should probably write down what I'm doing", and now if things go belly-up again I at least have that starting point.
As a bonus, you can also put it online as a tutorial, which is exactly what I did: https://mail.yellowapple.us :)
All my hosts run a default OS with a 20 line firewall and a bridge.
The top level host has a zpool backed by a blank file in /tank.device.
The actual work is done by a bunch of LXC hosts all cloned from a standard base installation. Anything persistent goes in a per container zfs filesystem mounted in each container’s root. The only thing I ever backup is the /tank.device file.
Wiping everything and building from scratch is a pleasure.
LXC is more like deploying disposable OS instances and then having to deploy your app yourself, separately. It’s more bare metal but has fewer surprises in terms of supporting IPv6 and not having any inscrutable iptables magic happening on the host OS.
Docker is IKEA. LXC is hiring a joiner. (Not a value judgement.)
For me, I treat that just like I do any service in my home. I am the type that will tear apart my dryer to fix it vs buying new or bringing in a repair person.
I repair my own car, appliances, etc. For me the Home server is the same.
1. Code is only as good as its documentation
2. Write code in a way where you’d understand it one year from now
BUT - I'm really thankful for people who keep posting and sharing these sorts of projects; they're the ones iterating the process for the rest of us who need something a bit more turn-key.
I'm excited to see this eventually result in something like the following:
- Standard / Easy to update containerized setup.
- Out of the box multi-location syncs (e.g. home, VPS, etc.)
- Takes 5 minutes to configure/add new locations
I want this to be as easy as adding a new AP to my mesh wifi system at home: plug it in, open the app, name the AP, and click "Done".
At some point you will hit something interesting: Personal Sovereignty.
I've seen other folks hit this in weird ways.
My friend started working on cars with his buddy. They finally got to an old vehicle they took all the way apart and put it together. He had gotten to the point where he could pull the engine and put it on a stand, weld things, paint, redo the wiring harness.
I remember one day I went and looked at it and he sort of casually said, "I can do anything".
Anyway, I think the diagram says something else to me. It says he understands what his setup does enough to show it/explain it to someone else.
I run a very similar setup, only my VPS is just a proxy for my home server and it requires very little maintenance. I run everything with docker-compose and I haven't had to work on my setup at all this year and only about 8 hours in 2020 to set up the Wireguard network to replace the ssh tunnels I was using previously for VPS -> server communications.
At the end of the day YMMV and use what you are comfortable with, but it's not as crazy an undertaking as it sounds.
You can also enable automatic backups for your servers.
But it looks like I got it on a sale, TERAKVM-400 comes down to $6.94 at normal prices.
One of the big things I wanted to accomplish was low cost and easy to integrate / recover from for family in case of bus-factor.
I didn’t expect to compete with the major cloud providers on cost, but the architecture I was dreaming of just wasn’t quite feasible even though it’s tantalizingly close...basically, all the benefits of a p2p internal network with all the convenience of NextCloud and all the export-ability of “just copy all these files to a new disk or cloud provider.”
It’s so close, there’s just always some bottleneck: home upload is too slow, cold cloud storage too hard to integrate with / cache, architecture requires too much maintenance, or similar.
I think NextCloud is very close for personal use, if only there was a plug and play p2p backend datastore / cache backed by plug and play immutable cold storage that could pick up new entries from the p2p layer.
Most of my setup was done through SSH during boring classes in college so I had plenty of time to read documentation and figure out new tools.
Breakdown of (my) issues with the diagram:
- author's interaction with each device is explicitly included, adding unnecessary noise
- "partial" and "full" real-time sync are shown as separate processes, whereas there's no obvious need to differentiate them in such a high-level overview
- devices with "partial" and "full" sync (see above) are colour-coded differently; again differentiation unnecessary
- including onsite & off-site backups in the same diagram is cool but would probably be nicer living in a dedicated backup diagram for better focus
Here's a simplified version of the same diagram:
┌───────────┐                                    ┌──────────────┐
│ nextcloud │                     ┌─────────────►│    phone     │
│   music   │                     │              ├──────────────┤
│   videos  ◄─── realtime sync ───┼─────────────►│    laptop    │
│   photos  │                     │              ├──────────────┤
│   docs    │                     ├─────────────►│   desktop    │
│  calendar │                     │              ├──────────────┤
├───────────┤                     └─────────────►│              │
│    crm    ◄─────┐                              │              │
├───────────┤     │                              │              │
│ analytics ◄─────┤                              │              │
├───────────┤     │                              │              │
│    web    ◄─────┼── daily sync ───────────────►│   synology   │
├───────────┤     │                              │              │
│    git    ◄─────┤                              │              │
├───────────┤     │                              │              │
│ devtools  ◄─────┘                              └──────────────┘
└───────────┘
The messed up presentation on mobile is 100% a mobile bug, for which there is a very easy fix on the dev side, and no good workaround on the commenter side.
Business wise, I'm not sure I'd be willing to pay for just the automation... in reality you don't use it very often. Could be interesting to try (re)selling tightly knit VPSs, more advanced automation features or support.
I think this solution still captures the self hosted ideology while also providing some cool value. I see people reinventing the wheel all the time while trying to automate self hosted processes... but then again maybe that's why we do it, we like the adventure!
Like, how do you persuade the audience of enthusiasts (think: Unifi buyers) to pay for a subscription to managed software they run on their own computers, raspis, whatever? I would probably spend $10/mo on something like that, but much above that and you'd be fighting against the armchair commentary of users who won't appreciate the effort that goes into stability and will basically have a "no wireless, less space than a Nomad, lame" attitude.
On the software side, integrate tightly with your own subscription services (offsite backups, VPS, etc) to upsell to those who want that, and win over the enthusiast crowd by making it possible to host your own alternatives to those services with a little technical know-how.
Open source most components to appeal to enthusiasts, but keep the secret sauce that makes everything seamless and easy to use "source available" so you don't unintentionally turn your core business into a commodity.
Seems viable to me.
I and probably many others would be OK with paying for the upsell part, if it's an optional convenience, but nothing I saw on your site indicates it is, or that "own your data" is in any meaningful way true. How do I own my data if any use of it requires me running stuff on your proprietary box, subscribing to your proprietary service?
Data from Helm is accessible using IMAP, SMTP, CardDAV, CalDAV, WebDAV on the local network (without requiring our service). You own the device, you own the data. There is a standards-based way of accessing that data just as there is with the hard drive from Best Buy.
I can plug the Best Buy hard drive into almost any computer/SAN I want to and use it for the purpose I bought it for without any lock-in. Using the hard drive for my data requires very little trust in Best Buy's good intentions at the time of purchase and zero trust in their continued existence, technical competence or good intentions beyond that -- it is very unlikely that Best Buy will find a way to snoop on my data even if they wanted to. Everything will continue to work fully as intended until mechanical failure sets in. Feels like ownership to me.
In your case I rent some box from you which will lose almost all its intended function the moment your company goes bust or I stop paying you an annual fee. Furthermore it seems I am completely at the mercy of your original and continued good intentions as in addition to using your lock-in for a big price hike later you presumably can also snoop on my data. As far as I can tell there is no really substantial trust differentiator to protonmail. I have to trust them when they claim they won't read my email and are competent enough to keep things secure and I have to trust you (and continue to trust you as long as I want to use the device) that you will encrypt my data and not exfiltrate any of it (or the private key), and furthermore that you run your servers securely enough that no third party will. But what is to stop it? The box is running closed source software that you can remotely update anytime you feel like it, right? I have physical access, but since I don't control the software, what use is that?
Maybe I didn't understand something right, but so far this does not feel like ownership to me.
It sounds more like the worst of all worlds: the lock-in and lack of ownership of a proprietary cloud-based subscription service with the added hassle, inconvenience, downtime and costs of babysitting (and supplying electricity to) a cloud server for you, the provider.
> Data from Helm is accessible using IMAP, SMTP, CardDAV, CalDAV, WebDAV on the local network (without requiring our service).
In what ways is that better than running an IMAP client on my laptop and using it to send data via protonmail, using my own domain, and keeping offline copies of everything (with a periodic upload to backblaze or some other e2e encrypted backup solution)? That seems to offer about the same control over my data, but is cheaper, easier, more convenient, has higher redundancy/uptime and if anything less lock-in. It also doesn't require an additional device that has no use beyond adding an additional failure point and cost center that's my responsibility.
> There is a standards-based way of accessing that data just as there is with the hard drive from Best Buy.
Accessing the data is not enough because what you are selling me is not an overpriced and unergonomic hard drive. You are selling me the ability to send, receive and store email (and likely more).
Don't get me wrong, I kind of like the idea of buying a physical box and a subscription service to self host stuff in a way that gives me better control over my data for an acceptable amount of hassle. But that really requires some amount of openness/auditability and interoperability that currently appears to be absent.
No - we do not rent hardware. When people buy the server from us, they own it. Full stop. There are ongoing costs to make email at home work: a static IP address with good reputation, a security gateway, traffic, etc. If people don't want to pay us for those costs, they will pay them to an ISP and/or an infrastructure provider like AWS. The ease of setup and management comes from the integration of hardware, software and service.
> I am completely at the mercy of your original and continued good intentions as in addition to using your lock-in for a big price hike later you presumably can also snoop on my data.
This is true of any paid service you use right? They can increase your costs at any time. I'm not sure why you think there's something uniquely bad about us for this reason. We have pretty clear values around wanting to know as little about our customers as possible and designing our products end to end around that. We have worked pretty hard at reducing costs, bringing the server price down 60% while doubling its specifications. Our goal is to make this as cost effective and accessible as possible for everyone. We are not interested in locking in customers - it's easy for anyone to take their data off Helm and go to a server of their own making or another service of their choosing. That's not hypothetical - like any company, we have churned customers and supported them in their migration off our product. It's easy to sling these hypotheticals you are concocting but they are not borne out of any reality.
> As far as I can tell there is no really substantial trust differentiator to protonmail.
There is actually a substantial difference. Protonmail holds your data on their servers and therefore can turn it over without a warrant. Well it's encrypted, right? So what could any entity do with that data? Well, Protonmail may be compelled to modify their service to intercept the password on login to decrypt your inbox and turn it over to a government authority (if you don't think that can happen, see what the German government did to Tutanota).
We aren't in a position to do that. Even if the US government came with a court order for your encrypted backups from us, we don't have access to the keys to decrypt them. If we were asked to make firmware changes, we would be retracing the steps of the FBI/Apple San Bernardino case and would enlist the help of the EFF, ACLU and others to fight. I personally believe the case law is pretty clear that they wouldn't win, which is partly why the FBI relented earlier.
> that you can remotely update anytime you feel like it
You make this sound like a terrible thing but really it's not. It allows us to keep our products patched and secured over time.
> In what ways is that better than running an IMAP client on my laptop and using it to send data via protonmail, using my own domain, and keeping offline copies of everything (with a periodic upload to backblaze or some other e2e encrypted backup solution)?
I didn't say people couldn't roll their own solutions. Sure they can - it's just more work, more hassle and more fragile. And I already covered the tradeoffs of keeping that data in the cloud. Protonmail has access to all your email in the clear (inbound and outbound). We do not, and anyone running a server at home would have similar privacy. That's a clear difference.
> Accessing the data is not enough because what you are selling me is not an overpriced and unergonomic hard drive. You are selling me the ability to send, receive and store email (and likely more).
Actually it is, because we were talking about data ownership. Your specific dig, in your parent post, was whether "own your data" is in any meaningful way true.
At that point, I might as well just go with a paid ProtonMail or similar solution.
My expectation for self-hosting is not to have annual or monthly fees.
If you want to self-host email, you need a trustworthy static IP address with reverse DNS. It's considerably more expensive to get this from an ISP. Our annual fee also includes storage for offsite backups. You don't get the same privacy assurances using Protonmail as you do with self-hosting either. For example, Protonmail is privy to the content of all outbound email messages in the clear unless you are communicating with the recipient using E2EE.
From a cost perspective, Helm V2 starts at $199 for 256GB of storage. First-year costs, including the subscription, work out to $298. With Protonmail, their entry-level plan with added storage at the same price buys you an inbox with about 28GB, a small fraction of what you would get with Helm storage-wise, not to mention we don't limit users, email addresses, domains, etc.
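The arithmetic behind those figures can be checked with a quick sketch. The annual subscription price is not quoted directly above; here it is inferred as the difference between the $298 first-year total and the $199 hardware price, which is an assumption. The function and constant names are only illustrative.

```python
# Figures quoted in the comment above; the subscription price is inferred, not quoted.
HELM_HARDWARE_USD = 199       # Helm V2 with 256GB of storage
HELM_FIRST_YEAR_USD = 298     # hardware plus first-year subscription
HELM_STORAGE_GB = 256
PROTONMAIL_STORAGE_GB = 28    # "about 28GB" at the same price


def implied_subscription(first_year_total: int, hardware: int) -> int:
    """Annual subscription implied by the first-year total (assumes no other fees)."""
    return first_year_total - hardware


def storage_ratio(a_gb: float, b_gb: float) -> float:
    """How many times larger a_gb is than b_gb."""
    return a_gb / b_gb


subscription = implied_subscription(HELM_FIRST_YEAR_USD, HELM_HARDWARE_USD)
ratio = storage_ratio(HELM_STORAGE_GB, PROTONMAIL_STORAGE_GB)

print(f"Implied subscription: ${subscription}/year")
print(f"Storage ratio: {ratio:.1f}x")
```

Note that this only compares the first year; in later years the comparison shifts because the hardware cost is not repeated.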
it’s mostly tough because of the high upfront capital costs (manufacturing, r&d, and marketing). people still talk fondly about discontinued apple routers and what nest could have been as an independent venture, for example.
This seems a lot easier to me than on-prem cloud services, either in BYOH form ("but it's just software") or as a packaged appliance ("another hub to install, really?").
I would say that the closest thing to this right now for paid is coming from the storage side— NAS providers like Synology using hardware sales to support a limited ecosystem of "one click" deployable apps. And for free, it's ecosystems like HomeAssistant, which a lot of people just deploy as a fire-and-forget RPi image, but as expected with a free ecosystem, as soon as you get off the ultra-common use cases, you're reading source code to figure out how it works, and wading through a tangle of unmaintained "community" plugins that only do half of what you want.
home audio/theater from prior to the internet revolution might be a good analogy: a bunch of separate boxes that each provide tailored functionality but all work together seamlessly without a lot of technical knowledge. that, but for all sorts of computing devices.
I also think it's still a little too nerd-focused for the average consumer. I'd say I know far more about security, networking and hardware than the average consumer but, compared to the HN crowd, I know next to nothing. I struggle to use a lot of the current solutions because they get bogged down in doing cool technical stuff that is so far outside the scope of the average potential user's wants/needs or the DIY solution will be "easy"... for someone with an extensive CS background and years of experience.
for instance, plug in a smart device and have confidence that it's not doing surreptitious things behind your back, because it's automatically segregated into its own vlan and given only enough network access to be controlled by you without needing to know much about the underlying technologies involved.
"automatically segregated into its own vlan"
Aren't these goals fundamentally at odds? I would imagine that Joe consumer (if they care at all about any of this) would be rather more inclined to entrust the role of orchestrating/segregating their home network devices to an entity like Google than to some random startup.
When you lose trust you end up with your crazy uncle leaving Fox News for Alex Jones and YouTube. You have people becoming QAnon followers.
I say this not to make a political point, but to say that the problem is fundamentally hopeless and I see no way out. You end up landing on one side of the fence or the other. You either just don't think about it and continue to use Google and Facebook and remain ignorant of the problem, or you spiral down the never-ending hole of despair.
We have seen articles recently that tell us not even Signal can be fully trusted. Whether or not it's true is beside the point. The point is, not even the HN crowd is safe from the cliff of paranoia. The seed of doubt has been planted.
Is someone going to trust a small tech startup in 2021? No, not like they would have in 1997. The market for trust has effectively been sealed off today. Because, paradoxically, the Googles and Facebooks ruined it all. They stripped us (all of us, not just HN) of our innocence and naivety. We know not to trust Google, but they are also a known known. A small tech company is a total unknown. We're familiar with how Google is going to bend us over. So if someone is going to do us dirty, it may as well be a known entity. Or... you go and build a cabin in the woods and start writing manifestos.
The most prominent of which is "What happens when they drop support for my use case/lock me out of my account?"
Unfortunately, the cost of running your own one-off solution is rather high. And doubly unfortunately, while I would pay money for a box that I could plug into my computer that provided all of these services, I wouldn't pay enough money to justify someone building it, and selling it to me.
most people, whether they rationalize it or not, are cognizant that we live in the grey gradient of trust for various companies and brands. the vector field is all sorts of wacky and inscrutable, but maybe we can point a few of those vectors in the right direction and some folks will happily slide down it to better (but perhaps not perfect) safety and privacy.
Those curious can check out /r/SelfHosted.
Yeah. I have a basic home server and I feel like even with fairly modest needs/desires (Jellyfin, Deluge, Zoneminder, some kind of file syncing; I gave up on photos because my whole family uses Google for that), it's hard to find a reasonable workflow/setup that covers it all. It was basically down to partitioning by VM (Proxmox) or partitioning by container (Docker), and I went with Docker + Portainer, but I'm not really happy with it; even basic functionality like redeploying a Compose configuration has sat as a feature request for three years.
Maybe I'm wanting it to be something that it just isn't, and I'd be happier with microk8s and managing the apps as Helm charts. But is that just inviting additional complexity where none is needed?
Been running like this for 3 years. No fuss.
Renting space in a rack at a colo facility and putting an nginx server on it is really simple, but it's also expensive compared to the complex solution in the original post.
I have migrated from using cloud provided storage to Nextcloud (been running that for over 4 years now without issues), and have my calendar and contacts in there as well.
My ongoing task is to fully migrate all my images, videos and calibre library from Dropbox to other self hosted entities.
It is a process made over a long time.
Use a cloud provider or your own automation to create a cluster, then apply a set of configs to bring your services up.
I bought a QNAP NAS a month ago. I thought I would get it set up right away for my Linux machines, MacBooks, and network. I was wrong. But I'm slowly learning every couple of days, and now I have a systemd service that mounts two volumes over NFS on my Linux machine.
How I solved it:
1) I use well vetted cloud services for things that are difficult/impossible to self host or have a low impact if lost. (Email, domains, github, etc...)
2) I self host things that are absolutely critical with cloud backups. (Files, Photos, code, notes, etc..)
Using a VPS can also make you more identifiable. Your traffic isn't as easily lost in the noise. The worst thing that I know of people doing is using a VPS for VPN tunneling. While it can have its uses, privacy certainly isn't one of them. You're the only one connecting into it and the only traffic coming out of it.
“Using a VPS can make you more identifiable”
I think you have a problem of “threat model” here. You’re mixing up hiding against hackers, governments, etc and just lumping it under “privacy and security”
Using a VPS isn’t going to make you more identifiable to google, because you’re not using google now. Using a VPN isn’t going to make you more identifiable to your ISP, because all they can see is that you have a VPN up.
Why not use a VPS for VPN? Well you’re only right it would suck if your threat model includes governments or hostile actors, me hiding from my ISP or on a public Wi-Fi? Not a problem.
You conflate a few ideas and threat models.
Security = The ability to not have your stuff accessed or changed.
Privacy = The ability to not have your stuff seen.
Anonymity = The ability to not have your stuff linked back to you.
Threat model = Who are you protecting yourself from?
E.g. the steps I take to not get hacked by the NSA are going to be different than the steps I use to make comments on 4chan or whatever, which are different again from the steps I take to use public Wi-Fi.
Ref: I work for Amazon AWS, my opinions are my own insane ramblings.
Common on laptops, but I wouldn’t assume that for systems/SANs in a data center, much less their virtual disks. Would love to be corrected.
Which AWS has, by definition.
> VM disk encryption is now tied into an HSM or TPM these days, host access wouldn’t help.
Are you passing all of the data through the TPM? If no: you still need to keep the key in memory somewhere, the TPM is just used for offline storage. If yes: the TPM, and the communication with it, is still under AWS' control.
> As for memory, that is now usually encrypted, so no dice there either.
Still need to keep the key somewhere, so same concern as for disk encryption. Except I can pretty much guarantee you're not putting the TPM on the memory's critical path, so...
> The security of a big name public VPS is astoundingly better than what you can do yourself.
Feel free to back such claims up in the future. Because right now this seems to be as false as the rest of your post.
> Using a VPS isn’t going to make you more identifiable to google, because you’re not using google now.
What? It certainly won't make you less identifiable either.
> Using a VPN isn’t going to make you more identifiable to your ISP, because all they can see is that you have a VPN up.
Your VPN provider, on the other hand, can now see all of the traffic, where before they couldn't. So the question is ultimately whether you trust your ISP or VPN provider more.
> Why not use a VPS for VPN? Well you’re only right it would suck if your threat model includes governments or hostile actors, me hiding from my ISP
Sure, if you trust Amazon over your ISP that makes perfect sense. Then again, this is the Amazon that seems to love forcing their employees to piss in bottles, and is on a huge misinformation campaign against treating their employees properly.
That seems like an upstanding place with great leadership.
> or on a public Wi-Fi? Not a problem.
Makes some sense, but it wouldn't really give you much more than hosting the VPN at home. (Well, you'd still have to do the same calculus here for home ISP vs Amazon.)
> You conflate a few ideas and threat models.
Pot, meet kettle.
> Ref: I work for Amazon AWS, my opinions are my own insane ramblings.
Good to know that AWS employees are either clueless about their own offerings, or deliberately spreading misinformation.
Seems like a place that I'd love to trust...
::shrugs:: I don't work for that part of AWS. My opinion came from other experience.
You're not only wrong, but you managed to insult me while being wrong. That's the worst kind of wrong.
If you want some further reading, there is some cool work being done in this space.
If you want privacy and security and you don't trust your provider, then you have to build your own hardware and compile everything you run on it from vetted source, including your kernel. You can do it, but most people decide that on balance it's better to trust someone.
Does it really? It just seems like instead of trusting a big company that everyone knows, you trust a smaller company that not everyone knows that involves more work for you.
I'm pretty sure I've seen articles on HN where VPS companies (maybe DO?) have kicked people off their infrastructure with zero notice. So, not at all different from being locked out of Apple/Google/Amazon.
Yes. VPS is a standardized commodity. If one provider shuts you down you can just move to another.
As you can see, AWS is far from the only game in town. If you can't find two or three from that list that will meet your needs then perhaps you should reassess your quality metric.
(I note in passing that my preferred provider, Linode, is not even on that list.)
NOT EVERYONE HAS THE SAME NEEDS AS YOU.
DIFFERENT CLOUD PROVIDERS OFFER DIFFERENT SERVICES
YOU OBVIOUSLY HAVE NO IDEA WHAT THE MARKET OFFERS
Yes, but VPS is a standardized commodity. If one provider shuts you down you can just switch to another.
Well, 1) sorry but you don't get to decide this, 2) how would anything ever become a thing if people were not allowed to invent new things?
- I'm not favoring the term just opposing your commanding
Thank you all so much for your comments. I didn't expect this to be this high on HN. I'm aware there are simpler solutions for self-hosting, even partially. I'm also aware that my setup is not perfect - that's why this post was created. I was hoping to get some feedback. Not from that many of you, but some friends. :) Ask me anything you like, I'll try to answer every question.
Your system architecture is very clean and understandable. I spend a lot of time marveling at the beautiful but often overly complex diagrams on r/homelabs, which more often than not dissuade me from actually having a go at it. Your explanation made it feel very approachable.
That being said...
> Some people think I’m weird because I’m using a personal CRM.
This strikes me as incredibly...German, hahaha! Is there any reason your Contacts solution doesn't/can't provide this functionality?
Regarding CRM and Contacts - I could possibly fit all the info in the 'about' field for a particular contact, but Monica offers me so much more. With Monica, I can structure the data for a contact in a better way. That 'better way' and the feature set of Monica is why I'm using it.
What's stopping you from hosting at home?
While I admit that I often feel claustrophobic with only ~35-40 Mbps of usable bandwidth, my power costs for several orders of magnitude more usable storage+cpu are in line with what you're paying for VPS right now.
>I was hoping to get some feedback.
Do you run any additional layers of security on top of NextCloud? From something simple like requiring SNI to ward off casual scanning activity, or more advanced like a WAF layer?
I ask because I've been hesitant to trust my whole digital life to something that doesn't have a full-time paid security staff.
I will elaborate: I started out with AWS several years ago. I could never work out how they calculated my bill, and had more than one >$100 shock for hosting my personal services.
I moved to DO and Vultr (stayed with DO for no real reason) and so shut everything down on AWS.
But I still got a $0.50 monthly charge on my credit card. I tried emailing - no response, totally ghosted.
I went through the control panel several times - it is/was a huge mess, obscure by policy obviously - and finally in some far distant corner found something still turned on. I did not understand what it was at the time and can recall no details, but I turned it off with great relief.
A week later I got an email from AWS (!) saying that I had made an error and they had helpfully turned the whatever it was back on...
So I continued to donate $0.50 a month to Amazon until I cancelled the credit card for other reasons. (it would cost $10 for the bank to even think about blocking them)
These days I will crawl over cut glass not to do business with that organised bunch of thieves called Amazon.
(Edit: I found the S3 bucket, but mysteriously no hosted zone to account for the Route 53 bills ¯\_(ツ)_/¯)
Using IAC (Terraform) would solve this in an instant: "terraform destroy". Done.
Why? It's just too much hassle these days; I want my down-time to be no longer dictated by my infrastructure. I don't want to have to spend off-work hours making sure my boxes are patched, my disks are raided, my offsite-backups are scheduled, and my web/email services are running. I just want it all to work, and when it doesn't, I want to be able to complain to someone else and make it their problem to fix it.
For my data, I'll probably still have an on-site backup, but everything else can just live in the cloud, and I'll start sleeping better, due to less stress about keeping it all secure and running.
How about you receiving a lot of spam emails?
I do get a lot of incoming spam though, but I think that's more to do with some of my email addresses being over 20 years old.
I'm getting about 1-2 spam mails a month delivered to my inbox, usually french SEO spam. Not worth investigating.
The reality is probably in excess of $1000/month. This only makes sense for people who have an abundance of spare time, and that's pretty rare these days.
Free software for DIY hosting like this is "free as in piano." Like a huge piano sitting on the street with a sign that says "free piano," it is actually not free at all when you factor in the hidden costs.
One way to understand why people self-host is to understand why people cook their own food. It takes significantly longer to prepare food (get raw material, cut, cook) than to order it. People still do it for $reasons - some find it fun, some find it cheaper, some find it nice to be able to control the taste, some find it healthier to know what's going on their plate, and so on.
Only concentrating on the dollar cost is too narrow a view, IMO.
Your time is only free if it is worth nothing. My time is very valuable. I happily pay other people and companies to do things for me because I'd rather have the time.
I think it's just a normal part of life. When you're young, you have more time than money. When you're old, you have more money than time.
It's common for people to delude themselves into thinking they haven't wasted their time by convincing themselves they did it for fun (or the lols, or whatever) - I'd say the difference is whether they knew (or stated) this upfront, or only after they failed, or had a better solution pointed out to them.
2nd most common also: at least I learnt something / gained xp - which is fair enough, if true.
> Only concentrating on the dollar cost is too narrow a view
Not if you convey other resources/constraints in dollars. Just attach a dollar-value to your free time, perhaps with discounts for things with side-benefits.
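As a minimal sketch of that idea: value the hours at some hourly rate, then apply a discount for side benefits such as learning or enjoyment. The function name, the discount parameter, and every number below are hypothetical, chosen only to illustrate the calculation.

```python
def effective_cost(hours: float, hourly_value: float,
                   side_benefit_discount: float = 0.0) -> float:
    """Dollar cost of time spent, reduced by a discount in [0, 1] for side benefits."""
    if not 0.0 <= side_benefit_discount <= 1.0:
        raise ValueError("side_benefit_discount must be between 0 and 1")
    return hours * hourly_value * (1.0 - side_benefit_discount)


# Hypothetical numbers: 2 hours a month of maintenance, time valued at $100/hour.
plain = effective_cost(2, 100)               # no side benefits counted
with_learning = effective_cost(2, 100, 0.5)  # half the time counted as training

print(f"Plain cost: ${plain:.2f}/month")
print(f"With a 50% side-benefit discount: ${with_learning:.2f}/month")
```

A discount of 1.0 corresponds to time you would have spent on the activity anyway, which makes its effective cost zero.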
> It's common for people to delude themselves into thinking they haven't wasted their time by convincing themselves they did it for fun
I am probably missing some context here because this does not make sense to me. Something is fun because it's fun; what does it even mean for someone to forcibly convince themselves of something that is otherwise? ¯\_(ツ)_/¯
> Just attach a dollar-value to your free time
I do that when someone asks me to do a project for them in my free time, so I can know what to charge them. But there is little value in assigning a dollar-value for time that I am going to spend doing something that _I want to do_. It's like watching a movie, or making a sand-castle in the backyard. I won't enjoy it if I keep thinking "Damn, I just watched a movie for 3 hours, there goes $300 worth of time."
You can have fun imagining the payoffs, only to find they do not appear. Have you ever seen a movie and been disappointed, or played a game and found it lacking? "Fun" is not an absolute measure, and review doesn't necessarily capture how fun something is versus how fun you think it ought to be - plenty of people give things higher value than their "fun" value, despite claiming to only have done it for fun - the missing value is ideological.
> there is little value in assigning a dollar-value for time that I am going to spend doing something that _I want to do_
Only if there is, for some reason, literally only one thing you want to do. But if there are competing things you might like to do then comparing them makes sense. One way or another, if you choose to watch a movie, or build a sandcastle, you are comparing the two to decide which. Using monetary values is just a more formal way of doing that for larger, less impulsive, projects.
My long-winded point is that all of the things I've picked up have been invaluable to me at work, especially in my time as a contractor where I would be switching between many different stacks. If you want to find a "true" cost for self-hosting, you need to also treat it as training.
I don't really believe it's any different from say, a woodworker that has a shop at home. They may spend the workday just doing framing, but odds are good they find the time to make a chair, a bird house, something to keep their skills sharp.
True for some things, like things that are not at all related to your work. But your job should be actively trying to make you better at your job, and a better person.
Large companies like the one I work for hire outside firms to offer classes to the employees for free, and on company time. If there is a new version of a piece of software that is significantly different from an old one, my company pays for the users to go to training, or to train online. This is very common for products like Office or the Adobe suite. But for some reason, as developers, we too often think that we're supposed to better ourselves on our own dime. If it benefits your current employer, the current employer should chip in.
Recent previous discussion at: https://news.ycombinator.com/item?id=26672009
The main concern is autonomy, not economic costs.
I expect you know this already, which is why the puppy analogy sort of fails.
I love my free time and there is precious little. But I don't think of it as costing ME $100/hr when I wash, dry, and detail my car, especially as I like doing it.
And two hours a month seems high.
You'll need a devops team as soon as you use the cloud (e.g. Kubernetes).
Those will easily cost more.
1. Don't feed the FAANG
2. Store your SoR media, notes, documents on your own NAS
3. Automate a backup of the NAS, preferably both on and off site (I use rsync from a pi + large disk + cloud blob storage)
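For the local half of step 3, a minimal sketch of an rsync wrapper that could run from cron or a systemd timer on the Pi. The paths are hypothetical, and the upload to cloud blob storage is left out because it depends on the provider. The flags used are standard rsync options: `-a` (archive mode), `--delete` (remove files from the destination that no longer exist in the source) and `--dry-run` (report what would change without changing anything).

```python
from __future__ import annotations

import shlex
import subprocess


def build_rsync_command(source: str, destination: str,
                        dry_run: bool = False) -> list[str]:
    """Build an rsync command that mirrors the contents of source into destination."""
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def run_backup(source: str, destination: str, dry_run: bool = False) -> int:
    """Run rsync and return its exit code (0 means success)."""
    return subprocess.run(build_rsync_command(source, destination, dry_run)).returncode


if __name__ == "__main__":
    # Hypothetical paths: the NAS share mounted on the Pi, and the large local disk.
    command = build_rsync_command("/mnt/nas/share", "/mnt/backup/share", dry_run=True)
    print(shlex.join(command))
```

Because `--delete` makes the destination an exact mirror, a file deleted by mistake on the NAS also disappears from this copy on the next run; the off-site copy or a snapshotting scheme is what protects against that.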
If you get an off-the-shelf NAS, get one with at least 2GB of RAM! Synology is particularly notorious for selling NAS units with 512MB (WTF?!) of RAM, and then when you try to run a few applications it grinds to a halt.