Linux File System Hunting: 10 Discoveries That Changed How I See the OS

Most Linux learning starts with commands.
This exploration started with a different question: if Linux really treats everything as a file, what does the filesystem reveal about how the system actually behaves?
I treated the machine like a live investigation target and followed configuration files, virtual filesystems, process metadata, networking artifacts, and service definitions.
Here are the most meaningful findings.
1) /etc is the policy brain of the machine
What it does:
/etc stores system-wide configuration: users, DNS, networking, host identity, services, PAM policies, and more.
Why it exists:
Linux separates policy/configuration from binaries. Programs live in /usr/bin or /sbin; behavior is defined in /etc.
What problem does it solve:
Predictable administration and automation. You can rebuild behavior by restoring config state, not by editing binaries.
Interesting insight:
I used to think /etc was “just config files.” It’s more than that: it is the declarative contract for system behavior.
Change /etc correctly, and the machine’s personality changes.
2) /etc/resolv.conf taught me DNS is layered, not static
What it does:
Defines DNS resolvers (nameserver), search domains, and lookup behavior.
Why it exists:
Applications need a common resolver interface, so every app doesn’t implement DNS independently.
What problem does it solve:
Centralized name resolution across the OS.
Interesting insight:
On many systems, /etc/resolv.conf is generated (e.g., by systemd-resolved, NetworkManager, DHCP clients).
So editing it directly may be temporary. The real source of truth can live elsewhere, and this file is often the runtime projection of several subsystems.
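One way to check this on a given machine is to ask whether the file is a symlink and which resolvers it currently lists. The following is a minimal sketch, not a definitive tool: the function names `list_nameservers` and `resolv_origin` are made up for illustration, and the optional path argument exists only so the functions can be pointed at a copy of the file.

```sh
# Illustrative helpers (hypothetical names), assuming a POSIX shell and awk.

# Print each "nameserver" address in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Report whether the file is a symlink. A symlink usually means another
# subsystem (for example systemd-resolved) owns the real content.
resolv_origin() {
    f="${1:-/etc/resolv.conf}"
    if [ -L "$f" ]; then
        echo "symlink -> $(readlink "$f")"
    elif [ -e "$f" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}
```

Running `resolv_origin` and then `list_nameservers` with no arguments inspects the live file. If the first reports a symlink, edits made directly to /etc/resolv.conf are likely to be overwritten by whatever generates the target.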
3) /etc/nsswitch.conf revealed that DNS is only part of “name lookup.”
What it does:
Controls lookup order for identities and hostnames, e.g.:
hosts: files dns
passwd: files systemd ldap
(varies by distro)
Why it exists:
Linux can resolve names from multiple backends: local files, DNS, LDAP, mDNS, etc.
What problem does it solve:
Flexible identity and host resolution across standalone servers, enterprise domains, and hybrid environments.
Interesting insight:
“Can’t resolve hostname” isn’t always a DNS issue. It might be a lookup-order issue.
A single line in nsswitch.conf can explain “works on one host, fails on another.”
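To compare lookup order between two hosts, it helps to print the sources for one database in the order they are tried. This is a rough sketch under a few assumptions: `nss_sources` is a hypothetical helper name, the second argument is only there to point it at a copy of the file, and bracketed action items such as `[NOTFOUND=return]` are printed as-is rather than interpreted.

```sh
# Print the ordered lookup sources for one NSS database (e.g. hosts, passwd).
# Usage: nss_sources <database> [path-to-nsswitch.conf]
nss_sources() {
    awk -v db="$1:" '
        { sub(/#.*/, "") }                         # drop comments
        $1 == db { for (i = 2; i <= NF; i++) print $i }
    ' "${2:-/etc/nsswitch.conf}"
}
```

For example, `nss_sources hosts` on a host that lists `files` before `dns` means /etc/hosts wins over DNS for any name present in both. When validating, `getent hosts <name>` goes through this NSS order, whereas tools such as `dig` query DNS directly and bypass it.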
4) /proc is a live kernel API, not a normal directory
What it does:
/proc exposes the runtime kernel and process state as files (/proc/cpuinfo, /proc/meminfo, /proc/<pid>/...).
Why it exists:
Kernel internals must be inspectable in a standard, scriptable way.
What problem does it solve:
Observability without custom tooling. Monitoring tools and admins can query the live state through file reads.
Interesting insight:
Files in /proc are generated on demand.
So reading them is like calling an API endpoint backed by kernel memory, not opening static disk content.
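The "file read as API call" idea can be sketched with /proc/meminfo, which is laid out as `Key: value kB` lines. The helper name `meminfo_kb` is an assumption for illustration, and the optional second argument only exists so it can be run against a saved snapshot.

```sh
# Print the value of one /proc/meminfo field, e.g. MemTotal or MemAvailable.
# Usage: meminfo_kb <field> [path]
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}
```

Calling `meminfo_kb MemAvailable` twice a few seconds apart will usually return different numbers, even though nothing wrote to the file in between. `ls -l /proc/meminfo` also reports a size of 0, a hint that the content is produced at read time rather than stored.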
5) /proc/<pid> made process forensics surprisingly transparent
What it does:
Per-process directories expose the command line, environment, open file descriptors, memory maps, and current working directory.
Why it exists:
Processes are first-class managed entities. Linux exposes rich metadata for debugging, tracing, and auditing.
What problem does it solve:
Rapid diagnosis of “what is this process actually doing?”
Interesting insight:
/proc/<pid>/fd is powerful: it shows every open file and socket as a symlink.
This bridges process behavior with filesystem and networking in one place, which makes it great for incident response.
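A small loop over that directory is enough to see the mapping from descriptor number to target. This is a sketch only: `list_fds` is a hypothetical name, and the second argument (an alternate /proc root) is an assumption added so the function can be exercised against a fake directory tree.

```sh
# Print "fd -> target" for every open descriptor of a process.
# Usage: list_fds [pid] [proc-root]   (defaults: current shell, /proc)
list_fds() {
    dir="${2:-/proc}/${1:-$$}/fd"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue           # skip if nothing matched
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}
```

With no arguments it inspects the current shell. Targets that look like `socket:[...]` or `pipe:[...]` are not paths on disk but kernel object identifiers. Reading another user's process generally requires elevated privileges.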
6) /etc/passwd, /etc/shadow, and /etc/group showed Unix identity design in practice
What they do:
/etc/passwd: user accounts and metadata
/etc/shadow: password hashes and aging policies (restricted)
/etc/group: group membership and privilege grouping
Why they exist:
Authentication and authorization require structured identity records.
What problem do they solve:
User management, access control, and policy enforcement.
Interesting insight:
Separation of /etc/passwd and /etc/shadow is a security boundary:
world-readable identity info remains accessible, while secrets are restricted.
It’s a simple but elegant least-privilege design.
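The split is visible in the data itself. A passwd entry has seven colon-separated fields (name, password placeholder, UID, GID, comment, home, shell), and an `x` in the second field means the hash lives in /etc/shadow. The helpers below are a minimal sketch with made-up names (`passwd_summary`, `shadowed_accounts`), and the path argument is there only for running them against a copy.

```sh
# Print "name uid shell" for each account in a passwd-format file.
passwd_summary() {
    awk -F: 'NF >= 7 { print $1, $3, $7 }' "${1:-/etc/passwd}"
}

# Print the names of accounts whose password field is "x",
# i.e. whose hash is stored in the shadow file instead.
shadowed_accounts() {
    awk -F: '$2 == "x" { print $1 }' "${1:-/etc/passwd}"
}
```

Comparing `ls -l /etc/passwd /etc/shadow` shows the other half of the boundary: the first is typically world-readable, while the second is readable only by root or a dedicated group (exact modes vary by distro).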
7) /var/log is the system’s memory of intent, failure, and recovery
What it does:
Stores logs from kernel, services, authentication events, package managers, and scheduled tasks.
Why it exists:
Systems are asynchronous and failure-prone; logs provide causal history.
What problem does it solve:
Troubleshooting, auditing, compliance, and operational learning.
Interesting insight:
Logs are more than errors—they narrate state transitions.
When correlated with service definitions and network state, they explain not just what failed but why the system made certain decisions.
8) /dev proved that device access is normalized through file semantics
What it does:
Represents hardware and pseudo-devices as special files (/dev/sda, /dev/null, /dev/tty, /dev/random).
Why it exists:
Uniform I/O interface. Programs can interact with devices via file operations.
What problem does it solve:
A consistent programming model across disks, terminals, and kernel-provided virtual devices.
Interesting insight:
/dev is dynamically managed (typically by udev), meaning device files appear/disappear as hardware state changes.
This ties physical events (plugging hardware) directly into filesystem visibility.
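The "special file" distinction can be checked with the shell's own file tests, which separate character devices from block devices. A minimal sketch, where `dev_kind` is a hypothetical helper name:

```sh
# Classify a path as a character device, block device, other file, or missing.
dev_kind() {
    if [ -c "$1" ]; then
        echo "character device"
    elif [ -b "$1" ]; then
        echo "block device"
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}
```

`dev_kind /dev/null` reports a character device, and a disk node such as /dev/sda (where present) reports a block device. Running it on a path before and after plugging in hardware is a simple way to watch the node appear.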
9) /etc/systemd and unit files exposed startup as a dependency graph, not a script sequence
What they do:
Define service units, targets, dependencies, restart policies, environment, and execution behavior.
Why it exists:
Modern systems need deterministic boot and service orchestration.
What problem does it solve:
Reliable startup ordering, failure handling, service supervision, and lifecycle control.
Interesting insight:
Service management in Linux is strongly declarative now.
Instead of “run script A then B,” systemd models relationships: “start B when A is ready, restart C on failure, bind D to target X.”
It feels closer to infrastructure orchestration than legacy init scripts.
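Those relationships are plain `Key=Value` lines in the unit file, so they can be pulled out of `systemctl cat` output with a simple filter. This is a sketch, not a full parser: `unit_relations` is a made-up name, it reads unit text on standard input, and the list of directives it matches is a hand-picked subset of ordering, dependency, and restart settings.

```sh
# Read unit-file text on stdin and print only the relationship directives.
unit_relations() {
    awk '/^(After|Before|Requires|Wants|BindsTo|PartOf|Restart|WantedBy)=/ {
        i = index($0, "=")                       # split at the first "="
        print substr($0, 1, i - 1) ": " substr($0, i + 1)
    }'
}
```

Piping `systemctl cat` for a service into `unit_relations` reduces the unit to its edges in the graph: what it waits for, what it pulls in, and what happens when it fails.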
10) /boot clarified that “Linux” has multiple layers before user space even starts
What it does:
Stores kernel images, initramfs/initrd, and bootloader-related artifacts.
Why it exists:
Early boot needs a minimal, reliable filesystem location for kernel + initial userspace.
What problem does it solve:
Transition from firmware/bootloader to full OS initialization.
Interesting insight:
The initramfs is effectively a temporary micro-userspace that prepares the real root filesystem.
Boot is not one jump into Linux; it’s a staged handoff pipeline.
What this exploration changed for me
Linux feels less like “a command-line OS” and more like a transparent operating model where:
configuration is explicit (/etc)
runtime state is inspectable (/proc, /sys)
identity and permissions are composable (passwd, groups, mode bits)
hardware is abstracted through files (/dev)
services are graph-managed (systemd)
history is recoverable (/var/log)
The biggest lesson: Linux is debuggable because it externalizes internals into stable filesystem interfaces.
If you know where to look, the system explains itself.
Minimal command references used during investigation (non-tutorial)
Only where needed to validate findings:
cat /etc/nsswitch.conf
cat /proc/meminfo
ls -l /proc/<pid>/fd
journalctl -xe
systemctl cat <service>
These commands were just lenses; the real learning came from understanding why those files exist and what architecture decisions they represent.



