My thought was to use hardware RAID for the 2 HDDs, then boot off an SSD with Debian (very familiar with it, and I use it for my current server, which has 30+ Docker containers). Basically I like and am good at Docker, so I'd like to stick to Debian + Docker. But if hardware RAID isn't the best option for HDDs nowadays, I'll learn the better thing.
Which drives? Renewed or refurb drives are half the cost, so should I buy extra used ones and just be ready to swap when they fail?
You don't want hardware RAID. Some options you can research:
mdadm - Linux software RAID
ZFS - Combined RAID and filesystem
Btrfs - A filesystem that can also do RAID things
Some OS options to consider:
Debian - good if you want to learn to do everything yourself
TrueNAS Scale - Commercial NAS OS. A bit of work to get started, but very stable once going.
Unraid - Enthusiast-focused NAS OS. Not as stable as TrueNAS, but easier to get started with and a lot of community support.
There are probably other software/OSes to consider, but those are the ones I have experience with. I personally use ZFS on TrueNAS, with a lot of help from this YouTube channel: https://youtube.com/@lawrencesystems?si=O1Z4BuEjogjdsslF
Ditto on hardware RAID. Adding a hardware controller just inserts a potentially catastrophic point of failure. With software RAID and RAID-likes, you can probably recover/rebuild, and it's not like the overhead is the big burden it was back in the 90s.
I got a server from e-waste because its RAID card had failed, and since it had SAS drives, they couldn't pull the data off it with anything else. It was the domain controller and NAS, so as you can imagine, very disruptive to the business. As they should, they had an offsite backup of the system, so we just restored onto a gaming PC as a temporary solution until we moved them to M365 instead.
I just use software RAID on it now and so far so good for about 180 days.
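For reference, a minimal software RAID1 setup with mdadm looks something like this. This is a sketch, not that poster's exact setup; the device names (/dev/sda, /dev/sdb) and mount point are placeholders:

```shell
# Create a RAID1 array from two disks (this destroys any existing data on them).
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb

# Put a filesystem on the array and mount it.
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/storage

# Check array health; a rebuild in progress shows up here too.
cat /proc/mdstat
mdadm --detail /dev/md0

# Persist the array config so it assembles on boot (Debian paths).
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u
```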
> TrueNAS Scale - Commercial NAS OS. A bit of work to get started, but very stable once going.
> Unraid - Enthusiast-focused NAS OS. Not as stable as TrueNAS, but easier to get started with and a lot of community support.
Since OP wants to use Docker, I would not recommend either. TrueNAS Scale does not support it usefully, and the implementation in Unraid is also weird. Also, the main benefit of Unraid is mixing drives of different sizes, and OP wants to RAID.
I'd recommend Btrfs in RAID1 over hardware or mdadm RAID. You get filesystem snapshotting as a feature, which would be nice to have before running a system update.
For disk drives, I'd recommend new ones if you can afford them. You should also look into shucking: you buy an external drive and then remove (shuck) the HDD from inside. You can get enterprise-grade disks for cheaper than buying the same disk on its own. The website https://shucks.top tracks the prices of various disk drives, letting you know when there are good deals.
I second this. I use Btrfs over ZFS for its reduced footprint, and it has always been very reliable. With a couple of commands I replaced a disk, and a monthly btrfs scrub makes me sleep peacefully (relatively).
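The "couple of commands" workflow looks roughly like this. A sketch with placeholder device names and a hypothetical subvolume called @data:

```shell
# Create a Btrfs filesystem with RAID1 for both data and metadata across two disks.
mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb
mount /dev/sda /mnt/storage   # any member device mounts the whole filesystem

# Snapshot a subvolume before a risky change (e.g. a system update).
btrfs subvolume snapshot /mnt/storage/@data /mnt/storage/@data-pre-update

# Monthly scrub: reads everything and repairs from the good mirror on checksum errors.
btrfs scrub start /mnt/storage
btrfs scrub status /mnt/storage

# Replace a failing disk in place with a new one.
btrfs replace start /dev/sda /dev/sdc /mnt/storage
btrfs replace status /mnt/storage
```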
Then just go with Debian + Docker. As RAID software I would recommend ZFS: it's a filesystem that does both, and it also checks integrity at the file level (and lots more).
I personally would only buy new ones. No matter the brand, just the best TB/€ you can get.
For the motherboard, basically every chipset gives you 4 SATA ports. You could consider picking one that supports unbuffered ECC memory, but that is not a must. If you want to hardware-transcode in Jellyfin, then Intel is probably your best bet, since the iGPU with Quick Sync is pretty good and well supported; otherwise I would go AMD.
For 4 drives you can use most ATX cases; no specific recommendations here.
No hardware RAID. Use ZFS if you can. Mirror the boot SSD. I would use a stripe over mirrors with 4 HDDs; two drives are not enough redundancy.
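A stripe over mirrors (roughly RAID10) with 4 HDDs can be sketched like this; the pool name `tank` and the device names are placeholders:

```shell
# Two mirrored pairs, striped together. The pool survives one disk
# failure per mirror, and rebuilds only copy one disk's worth of data.
zpool create tank \
    mirror /dev/sda /dev/sdb \
    mirror /dev/sdc /dev/sdd

# Check pool layout and health.
zpool status tank

# Periodic scrub to catch silent corruption early.
zpool scrub tank
```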
Use enterprise or nearline drives, if you can.
Debian is great, you can install Proxmox on top of it, but from the sound of it plain Debian would work for you.
For HDDs the best way is to think of them like shoes or tires. They will eventually fail, but they also may fail prematurely. I always recommend having a spare drive ready.
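To catch a drive failing prematurely before it actually dies, SMART monitoring helps. A quick sketch using smartmontools on Debian (the device name is a placeholder):

```shell
# Install the tools and get a quick pass/fail verdict for a drive.
apt install smartmontools
smartctl -H /dev/sda

# Full report: reallocated sectors, pending sectors, power-on hours, etc.
smartctl -a /dev/sda

# Run a long self-test in the background, then read the result later.
smartctl -t long /dev/sda
smartctl -l selftest /dev/sda
```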
If you want to build it yourself, you have to decide on size.
Are you trying to keep it as small as possible?
Do you want a dedicated GPU for multiple Jellyfin streams? (Definitely get the Intel Arc A380: cheap and an encoding beast.)
If you don't want to start a rack and don't want to go with a prebuilt NUC, there are 2 PC cases I would recommend.
Node 304 and Node 804.
Node 304 is mini-ITX (1 PCIe slot, 1 M.2 slot for boot OS, 4 HDDs, SFX-L PSU, and great cooling)
Node 804 is micro-ATX (2 PCIe slots, 2 M.2 slots, 8-10 HDDs, ATX PSU, and 2 chambers for the HDDs to stay cool)
Why do you want an N100? Is electricity very expensive where you are, making idle power a big factor? Desktop CPUs are more powerful, can idle down to 10 W or so without a GPU, and can take way more RAM.
TL;DR: go with a prebuilt NUC, or go with a desktop CPU for a custom build.
I just rebuilt my TrueNAS in a Node 804 and LOVE it. So much hard drive space. I wanted to get the 304 for my personal backup server, but got the Thermaltake Core V1 instead. It looks uglier, but works well too.
I’ve had a great experience with the TrueNAS Mini-X system I bought. ZFS has great RAID options, and TrueNAS makes managing a system really easy. You can get a box built and configured by them, with 16 GB ECC RAM and five (empty) drive bays, for about $1150 at the most affordable end. https://www.truenas.com/truenas-mini/
One thing to be careful about: you can’t add drives to a ZFS vdev once it’s been created, but you can add new vdevs to an existing pool. So, you can start with two mirrored drives, then add another two mirrored drives to that pool later.
(A vdev is a sub-unit of a ZFS storage pool, and you have to choose your RAID topology for each vdev and then compose those into a storage pool)
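The grow-by-adding-vdevs path above can be sketched like this (pool name `tank` and device names are placeholders):

```shell
# Start with one mirrored pair...
zpool create tank mirror /dev/sda /dev/sdb

# ...later, grow the pool by adding a second mirror vdev.
# Note: "zpool add" is effectively permanent -- double-check the command,
# and don't confuse it with "zpool attach" (which adds a disk to an
# existing mirror rather than a new vdev to the pool).
zpool add tank mirror /dev/sdc /dev/sdd

# The pool now stripes new writes across both mirrors.
zpool status tank
```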
But also, in case RAID1 is your only backup plan for them: in such cases I prefer to have only one HDD in the machine and use the other one as a backup in a separate machine, preferably in another location. I find that the (lower) probability of losing all the data at once (fire, flood, burglary, weirdly specific accidents, etc.) outweighs missing the latest 12h (or whatever) of data.
And of course you can select what to back up/rsync or not.
E.g. with Immich: after returning to operation, the app will just re-sync any pics missing since the last backup.
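A sketch of that kind of nightly pull with rsync (the hostname `mainserver` and the paths are made up for illustration):

```shell
# Pull the data share onto the backup machine. -a preserves permissions
# and timestamps, --delete mirrors deletions so the copy matches the source.
rsync -a --delete --exclude 'cache/' \
    mainserver:/srv/data/ /srv/backup/data/

# A cron entry on the backup box to run it every night at 03:00:
# 0 3 * * * rsync -a --delete mainserver:/srv/data/ /srv/backup/data/
```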
Also, with two systems you don't have to care that much about drive quality. I'm now buying Exos 22+TB because why not. But when I needed quiet drives I bought Red Plus (not the regular Red ones, nor the Pro ones); they are even quieter than Exos, but smol.
The majority of our household stuff is on a Synology DS920+ (x86). I installed Docker and Portainer on it and then run most of my local services (Immich, Invidious, Alexandrite (the Lemmy frontend), Miniflux, Dokuwiki, and Heimdall) using the Portainer UI.
I'm still running Plex as a manually installed Syno package, because I haven't taken the time to figure out hardware transcoding for other setups.
The 920 also manages cameras (via Surveillance Station), handles all offsite backups (we all back up our workstations to the 920, and it backs up online), runs private DNS and the reverse proxy for Docker, and hosts my personal VPN. I'm currently in the process of swapping the 4+ year old drives for new ones that will up my capacity (using SHR) from 12 TB to 30 (with redundancy).
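For anyone wanting the same Docker + Portainer setup on a plain Linux box, getting Portainer CE running is a single `docker run` (this follows Portainer's standard install; the port and volume name are the defaults):

```shell
# Persistent volume for Portainer's own data.
docker volume create portainer_data

# Run Portainer CE; it manages containers via the mounted Docker socket.
# The web UI is then at https://<host>:9443
docker run -d \
    --name portainer \
    --restart=always \
    -p 9443:9443 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v portainer_data:/data \
    portainer/portainer-ce:latest
```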
If you can, do at least three nodes with high availability. It is more expensive and trickier to set up, but in the long run it is worth it when hosting for others. You can literally unplug a system and it will fail over.
It is overkill, but you can use Proxmox with a Docker swarm.
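The swarm side of that is only a few commands; a minimal sketch (the IP is an example, and each swarm member could be a VM on a different Proxmox node):

```shell
# On the first node (manager): initialize the swarm.
docker swarm init --advertise-addr 192.168.1.10

# The init command prints a join command with a token; run it on each worker:
# docker swarm join --token <token> 192.168.1.10:2377

# Deploy services from a compose file; swarm spreads them across nodes
# and reschedules containers if a node goes down.
docker stack deploy -c docker-compose.yml mystack
docker node ls
```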
While this is a great approach for any business hosting mission-critical or user-facing resources, it is WAY overkill for a basic self-hosted setup involving family and friends.
For this to make sense, you need to have access to 3 different physical locations with their own ISPs or rent 3 different VPS.
Assuming one would use only 1 data drive + an equal parity drive per node, now we're talking about 6 drives with the total usable capacity of one. If one decides to use fewer drives and link the nodes to one or two data drives (remotely), I/O and latency become an issue, and you've effectively introduced more points of failure than before.
Not even talking about the massive increase in initial and running costs, as well as the administrative headaches, this isn't worth it for basically anyone.
I think this is the way and not an overkill at all!
It's super easy to cluster Proxmox, and you make your inevitable admin job easier. Not to mention backups, first testing and setting up a VM on your own server before copying it to theirs, etc.
You need a minimum of three Ceph nodes, but really four if you want it to work well. And Ceph isn't really designed with clusters that small in mind; 7 nodes would be more reasonable.
While clustering Proxmox using Ceph is cool as fuck, it's not easy or cheap to accomplish at home.