Concrete systemd unit patterns that reduced flakiness: restart policies, resource limits, and structured logs.

Systemd Tricks We Use to Keep Services Boring

After a few painful outages caused by homemade init scripts, we moved everything to systemd and wrote down the patterns that worked.

Pattern: Restart with Backoff

We had a service that occasionally failed to bind its port on boot.

```ini
[Unit]
Description=API service
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/api
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

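Restart=on-failure restarts the process only after an unclean exit, and RestartSec=5 waits five seconds before each attempt. Together they gave the process room to recover without flapping.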
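On its own, RestartSec=5 is a fixed delay, so a service that never recovers would be restarted every five seconds indefinitely. A minimal sketch of a drop-in that caps the retries and grows the delay follows. The drop-in path and every numeric value here are illustrative assumptions, not settings from our fleet, and the last two options assume systemd 254 or newer.

```ini
# Hypothetical drop-in: /etc/systemd/system/api.service.d/restart.conf
[Unit]
# Stop retrying if the unit is started more than 5 times within 5 minutes.
# Values are illustrative assumptions.
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
Restart=on-failure
RestartSec=5
# Assumes systemd 254 or newer: increase the delay between automatic restarts
# from RestartSec up to RestartMaxDelaySec over RestartSteps attempts.
RestartSteps=5
RestartMaxDelaySec=60
```

When the start limit is hit, the unit stays in the failed state until someone runs `systemctl reset-failed api` or starts it manually, which is usually preferable to a silent restart loop.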

Pattern: Non-Root with Limits

We saw file descriptor exhaustion during load tests.

We added User=api so the service no longer runs as root, and LimitNOFILE=65536 to raise its open file descriptor limit.
We used Ansible to roll the unit file change across the fleet.
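For reference, both settings belong in the [Service] section of the unit. This is a minimal sketch that assumes a system account named api already exists on the host; systemd does not create it for you with User= alone.

```ini
[Service]
ExecStart=/usr/local/bin/api
# Run as an unprivileged account; assumes the "api" user already exists.
User=api
# Set both the soft and hard open file descriptor limit for this service.
LimitNOFILE=65536
```

After the unit file changes on a host, `systemctl daemon-reload` followed by a restart of the service is needed for the new user and limit to take effect.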

Pattern: Journald as a Timeline

When something goes wrong, we start with the unit's logs since the current boot, then narrow to the last 15 minutes:

`journalctl -u api -b`
`journalctl -u api --since "-15min"`

Systemd didn’t fix our code, but it made failures predictable and repeatable.

About Kiril urbonas
