Linux Process Management — Mastering ps, top, kill & Process Lifecycle

🎯 What You Will Learn

  • How to inspect all running processes using ps aux and ps -ef with full output interpretation
  • How to monitor system processes in real time using top and htop
  • How to terminate processes cleanly and forcefully using kill and killall
  • What PID and PPID mean and why the parent-child process relationship matters
  • What a zombie process is, why it forms, and how to deal with it

📝 Topic Overview


🔹 ps aux — BSD-Style Full Process Snapshot

ps (process status) prints a snapshot of currently running processes. The aux flags are the most commonly used combination.

Flag breakdown:

  • a — show processes from all users (not just the current user)
  • u — show in user-oriented format (adds USER, %CPU, %MEM columns)
  • x — include processes not attached to a terminal (daemons, background services)

ps aux

Sample output and column meanings:

| Column | Meaning |
| --- | --- |
| USER | Owner of the process |
| PID | Process ID (unique identifier) |
| %CPU | CPU time used divided by the time the process has been running (a lifetime average, not the current rate) |
| %MEM | Resident memory (RSS) as a percentage of physical RAM |
| VSZ | Virtual memory size (KB): total address space reserved |
| RSS | Resident Set Size (KB): actual RAM currently in use |
| TTY | Terminal associated with the process (? = no terminal/daemon) |
| STAT | Process state (see state codes below) |
| START | Time or date the process started |
| TIME | Total accumulated CPU time used |
| COMMAND | Command that launched the process |

Process state codes (STAT column):

| Code | Meaning |
| --- | --- |
| R | Running or runnable (on the CPU or ready queue) |
| S | Interruptible sleep (waiting for an event, e.g., I/O) |
| D | Uninterruptible sleep (usually waiting on disk I/O; signals, including SIGKILL, take effect only after it wakes) |
| Z | Zombie: process finished but not yet reaped by parent |
| T | Stopped (via SIGSTOP or Ctrl+Z) |
| < | High priority (negative nice value) |
| N | Low priority (positive nice value) |
| s | Session leader (e.g., a shell) |
| l | Multi-threaded process |
| + | In the foreground process group |

# Filter ps output for a specific process name
ps aux | grep nginx

# Sort by CPU usage (highest first)
ps aux --sort=-%cpu | head -10

# Sort by memory usage (highest first)
ps aux --sort=-%mem | head -10
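
STAT values usually combine several of the codes above, for example Ss+ for a foreground shell. The sketch below decodes such a string one character at a time. The helper name describe_stat is made up for illustration, and it only knows the codes listed in the table.

```shell
# Decode a ps STAT string (e.g. "Ss+") using the codes from the table above.
# describe_stat is a hypothetical helper name, not part of ps.
describe_stat() {
    local stat=$1 rest ch
    while [ -n "$stat" ]; do
        rest=${stat#?}          # everything after the first character
        ch=${stat%"$rest"}      # the first character itself
        case $ch in
            R) echo "R: running or runnable" ;;
            S) echo "S: interruptible sleep" ;;
            D) echo "D: uninterruptible sleep" ;;
            Z) echo "Z: zombie" ;;
            T) echo "T: stopped" ;;
            '<') echo "<: high priority" ;;
            N) echo "N: low priority" ;;
            s) echo "s: session leader" ;;
            l) echo "l: multi-threaded" ;;
            +) echo "+: foreground process group" ;;
            *) echo "$ch: not covered by the table" ;;
        esac
        stat=$rest
    done
}

describe_stat "Ss+"
```

On a live system the input could come from something like ps -o stat= -p <PID>.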

🔹 ps -ef — POSIX/System V Style Process Listing

ps -ef is the System V (POSIX) style equivalent — standard across all Unix-like systems including macOS, Solaris, and AIX.

Flag breakdown:

  • -e — show every process on the system
  • -f — full format listing (adds PPID, STIME, UID columns)

ps -ef

Key columns unique to -ef:

| Column | Meaning |
| --- | --- |
| UID | User ID (owner) |
| PID | Process ID |
| PPID | Parent Process ID: the PID that spawned this process |
| C | CPU utilization (integer) |
| STIME | Start time |
| TTY | Terminal |
| TIME | CPU time consumed |
| CMD | Full command with arguments |

ps aux vs ps -ef: Both show all processes. aux gives %CPU, %MEM, VSZ, RSS, which makes it better for resource analysis. -ef shows PPID natively, which makes it better for tracing the process tree. For a visual hierarchy, add the f (forest) flag to the BSD form: ps auxf draws an ASCII process tree.

# Show process tree using ps
ps auxf

# Show the PID, parent PID, and command for a specific PID
# (grep <PID> would also match that number anywhere else on a line)
ps -o pid,ppid,cmd -p <PID>
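
Because PPID is the third column of ps -ef, a small awk filter can list the direct children of a process by comparing that column exactly. This is only a sketch: children_of is a made-up helper name, and the sample rows are invented so the example runs without a live system.

```shell
# Print rows of `ps -ef` output whose PPID (column 3) equals the given PID.
# children_of is a hypothetical helper; it reads ps -ef text on stdin
# and always keeps the header row.
children_of() {
    awk -v parent="$1" 'NR == 1 || $3 == parent'
}

# Invented sample in ps -ef layout (not real output).
sample='UID   PID  PPID  C STIME TTY    TIME     CMD
root  812     1  0 09:00 ?      00:00:00 nginx: master process
www   813   812  0 09:00 ?      00:00:01 nginx: worker process
www   814   812  0 09:00 ?      00:00:01 nginx: worker process
root  900     1  0 09:05 ?      00:00:00 sshd'

printf '%s\n' "$sample" | children_of 812
```

Against a running system the same filter would be used as ps -ef | children_of <PID>.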

🔹 top — Real-Time Interactive Process Monitor

top is a live, continuously refreshing view of system resource usage and process activity. It updates every 3 seconds by default.

top

Real output example:

top - 18:16:43 up  4:20,  1 user,  load average: 0.42, 0.53, 0.61
Tasks: 341 total,   1 running, 340 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.4 us,  1.6 sy,  0.3 ni, 92.5 id,  1.1 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7722.0 total,    826.3 free,   4203.1 used,   2692.6 buff/cache
MiB Swap:  11817.5 total,  10912.4 free,    905.1 used.   2239.7 avail Mem

Line 1 — System Summary

top - 18:16:43 up 4:20, 1 user, load average: 0.42, 0.53, 0.61

| Field | Value | Meaning |
| --- | --- | --- |
| Current time | 18:16:43 | Wall clock time when this snapshot was taken |
| Uptime | up 4:20 | System has been running for 4 hours 20 minutes |
| Users | 1 user | One user session currently logged in |
| Load avg (1m) | 0.42 | Average number of runnable/waiting processes over last 1 minute |
| Load avg (5m) | 0.53 | Same, over last 5 minutes |
| Load avg (15m) | 0.61 | Same, over last 15 minutes |

Reading load averages: On Linux, the load average counts processes that are running, waiting for CPU time, or in uninterruptible sleep (usually disk I/O). Compare it with the core count: on a single-core machine a load of 1.0 means the CPU is fully busy, and on a 4-core machine the same point is 4.0. A 1-minute load of 0.42 indicates a largely idle, healthy system. The three values also show the trend: here the 1-minute figure (0.42) is lower than the 15-minute figure (0.61), so load is falling. The reverse pattern, a 1-minute value well above the 15-minute value, is the warning sign of increasing pressure.
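
To compare a load figure against the core count, divide one by the other. The following is a minimal sketch, assuming a Linux system with /proc/loadavg; load_per_core is an illustrative helper name, not a standard command.

```shell
# Divide a load average by the number of CPU cores.
# A result near 1.00 means the cores are, on average, fully occupied.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Live reading (Linux only): the first three fields of /proc/loadavg
# are the 1, 5 and 15 minute load averages.
if [ -r /proc/loadavg ]; then
    read -r one five fifteen rest < /proc/loadavg
    cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    echo "1-minute load per core: $(load_per_core "$one" "$cores")"
fi
```

With the sample values from the text, load_per_core 4.0 4 prints 1.00 (fully busy) and load_per_core 0.42 1 prints 0.42.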


Line 2 — Task (Process) Summary

Tasks: 341 total, 1 running, 340 sleeping, 0 stopped, 0 zombie

| Field | Value | Meaning |
| --- | --- | --- |
| Total | 341 | Total number of processes currently known to the kernel |
| Running | 1 | Processes actively on a CPU core right now |
| Sleeping | 340 | Processes waiting on I/O, timers, or events (normal) |
| Stopped | 0 | Processes paused via SIGSTOP or Ctrl+Z |
| Zombie | 0 | Dead processes not yet reaped by their parent |

What’s normal: Having nearly all processes sleeping is completely healthy — it means they’re idle and waiting for work. 0 zombie is the ideal state. If zombie count climbs steadily, a parent process has a bug and isn’t calling wait().


Line 3 — CPU Usage Breakdown

%Cpu(s):  4.4 us,  1.6 sy,  0.3 ni, 92.5 id,  1.1 wa,  0.0 hi,  0.0 si,  0.0 st

| Field | Value | Meaning |
| --- | --- | --- |
| us | 4.4% | User space: CPU time spent running your applications |
| sy | 1.6% | System/kernel: CPU time spent on kernel operations (syscalls, drivers) |
| ni | 0.3% | Nice: CPU time for user processes with adjusted (lowered) priority |
| id | 92.5% | Idle: CPU doing nothing; available capacity |
| wa | 1.1% | I/O wait: CPU idle while waiting for outstanding I/O, mainly disk, to complete |
| hi | 0.0% | Hardware interrupts: time handling hardware signals (keyboard, NIC) |
| si | 0.0% | Software interrupts: time handling kernel software interrupt processing |
| st | 0.0% | Steal time: CPU cycles taken by the hypervisor (only non-zero in VMs) |

Reading this snapshot: 92.5% idle means the system is under very light load. 1.1% wa (I/O wait) is low and normal. If wa climbs above 10–20%, you likely have a disk bottleneck. If sy is consistently high, kernel-level activity (syscalls, context switches) may be a concern. st > 0 on a VM signals the host hypervisor is overcommitted.
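
For scripted checks, a single value can be pulled out of the %Cpu(s) line and compared against a threshold. This is a sketch under assumptions: cpu_field is a hypothetical helper, the 10% threshold is just the lower bound mentioned above, and the parsing expects the space-separated layout of the sample line.

```shell
# Extract one field (us, sy, ni, id, wa, hi, si, st) from a top "%Cpu(s)" line.
# cpu_field is an illustrative helper: it finds the label and prints the
# number that precedes it.
cpu_field() {
    local line=$1 name=$2
    printf '%s\n' "$line" | awk -v name="$name" '{
        for (i = 2; i <= NF; i++) {
            label = $i
            sub(/,$/, "", label)          # drop the trailing comma
            if (label == name) { print $(i - 1); exit }
        }
    }'
}

# The sample line from this article.
sample='%Cpu(s):  4.4 us,  1.6 sy,  0.3 ni, 92.5 id,  1.1 wa,  0.0 hi,  0.0 si,  0.0 st'

wa=$(cpu_field "$sample" wa)
if awk -v wa="$wa" 'BEGIN { exit !(wa > 10) }'; then
    echo "I/O wait is high: ${wa}%"
else
    echo "I/O wait looks normal: ${wa}%"
fi
```

For a live reading, the line could be captured in batch mode with top -bn1 | grep '%Cpu' and passed to the same function.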


Line 4 — Physical Memory (RAM)

MiB Mem : 7722.0 total, 826.3 free, 4203.1 used, 2692.6 buff/cache

| Field | Value | Meaning |
| --- | --- | --- |
| total | 7722.0 MiB | Total installed physical RAM (~7.5 GiB) |
| free | 826.3 MiB | Completely unused RAM |
| used | 4203.1 MiB | RAM actively used by processes |
| buff/cache | 2692.6 MiB | RAM used by kernel for disk buffers and file cache |

Don’t panic about low “free” memory. Linux aggressively uses spare RAM as disk cache (buff/cache) to speed up file access. This memory is immediately reclaimable when a process needs it. The real available memory is shown on the Swap line as avail Mem — here 2239.7 MiB is truly available for new allocations.
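
The same estimate can be read directly from the kernel. The sketch below assumes a Linux kernel that exposes a MemAvailable line in /proc/meminfo (values there are in kB); mem_available_mib is an illustrative helper name.

```shell
# Print MemAvailable in MiB from meminfo-formatted text on stdin.
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%.1f\n", $2 / 1024 }'
}

# Live reading (Linux only).
if [ -r /proc/meminfo ]; then
    echo "Available: $(mem_available_mib < /proc/meminfo) MiB"
fi
```

This is handy in scripts and alerts, where starting top just to read one number would be overkill.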


Line 5 — Swap Space

MiB Swap: 11817.5 total, 10912.4 free, 905.1 used. 2239.7 avail Mem

| Field | Value | Meaning |
| --- | --- | --- |
| total | 11817.5 MiB | Total swap space (~11.5 GiB, on disk) |
| free | 10912.4 MiB | Unused swap |
| used | 905.1 MiB | Data currently swapped out to disk |
| avail Mem | 2239.7 MiB | Estimated RAM available for new processes without swapping |

Swap in use (905.1 MiB) means some memory pages were moved to disk — likely from long-idle processes. Moderate swap use is fine. Heavy swap use (swap used near total) combined with high wa CPU time is a classic sign of memory pressure — the system is constantly swapping pages in and out, causing slowdowns.


Interactive keyboard shortcuts inside top:

| Key | Action |
| --- | --- |
| P | Sort by CPU usage |
| M | Sort by memory usage |
| k | Kill a process (prompts for PID and signal) |
| r | Renice a process (change priority) |
| u | Filter by a specific user |
| 1 | Toggle per-CPU-core breakdown |
| h | Help screen |
| q | Quit |

🔹 htop — Enhanced Interactive Process Viewer

htop is a modern, color-coded, mouse-enabled alternative to top. It is not installed by default on all systems.

# Install if not present
sudo apt install htop      # Debian/Ubuntu
sudo dnf install htop      # Fedora/RHEL

htop

Advantages over top:

  • Visual CPU/memory bars per core
  • Mouse-clickable interface
  • Scroll horizontally to see full command lines
  • Multi-select and bulk-kill processes (Space to select, F9 to kill)
  • Tree view built in (F5)

Key shortcuts:

| Key | Action |
| --- | --- |
| F2 | Setup / configuration |
| F3 | Search for a process by name |
| F4 | Filter processes |
| F5 | Tree view (shows parent-child hierarchy) |
| F6 | Sort by column |
| F9 | Kill selected process (choose signal) |
| F10 | Quit |

🔹 kill — Send Signals to a Process by PID

kill does not only terminate processes — it sends signals to them. Termination is just the most common use.

# Syntax
kill [signal] <PID>

# Default signal is SIGTERM (15) — polite termination request
kill 1234

# Force kill with SIGKILL (9) — cannot be caught or ignored
kill -9 1234

# List all available signals
kill -l

Essential signals:

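| Signal | Number | Meaning |
| --- | --- | --- |
| SIGTERM | 15 | Graceful termination request; the process can clean up |
| SIGKILL | 9 | Immediate, unconditional kill; cannot be caught |
| SIGHUP | 1 | Hangup; often used to reload config (e.g., nginx, sshd) |
| SIGSTOP | 19 | Pause/suspend a process (cannot be caught) |
| SIGCONT | 18 | Resume a stopped process |
| SIGINT | 2 | Interrupt (same as pressing Ctrl+C) |

The numbers shown are those used on x86 and ARM Linux. Some values, including SIGSTOP and SIGCONT, differ on other architectures, so prefer signal names (for example kill -STOP) in scripts.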
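
To see what "can clean up" means on the receiving side, here is a small sketch of a shell worker that installs a SIGTERM handler with trap. The names worker and log_file are illustrative, and the temporary file stands in for whatever state a real program would save.

```shell
# A background worker that cleans up when it receives SIGTERM.
# worker and log_file are hypothetical names used only for this demo.
log_file=$(mktemp)

worker() {
    # Run the handler on SIGTERM, then exit cleanly.
    trap 'echo "cleaning up" >> "$log_file"; exit 0' TERM
    while :; do
        sleep 1
    done
}

worker &
worker_pid=$!

sleep 1                     # give the worker time to install its handler
kill -TERM "$worker_pid"    # same signal a plain `kill` sends
wait "$worker_pid"          # returns once the handler has run and exited
cat "$log_file"
```

Sending SIGKILL instead would end the worker without running the handler, so nothing would be written to the log.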

Best practice: Always try SIGTERM (15) first — it allows the process to save state and exit cleanly. Only escalate to SIGKILL (9) if the process doesn’t respond after a few seconds. SIGKILL cannot be intercepted; it is handled entirely by the kernel.
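
The escalation described above can be wrapped in a small function. This is a sketch, not a hardened tool: graceful_kill and is_running are made-up names, the default timeout of 5 seconds is an arbitrary choice, and the zombie check relies on the Linux /proc filesystem.

```shell
# Return 0 if the PID exists and is not a zombie.
# Note: kill -0 also fails for processes you lack permission to signal.
is_running() {
    local pid=$1 state
    kill -0 "$pid" 2>/dev/null || return 1
    state=$(awk '/^State:/ { print $2 }' "/proc/$pid/status" 2>/dev/null)
    [ "$state" != "Z" ]
}

# Send SIGTERM, wait up to $2 seconds (default 5), then escalate to SIGKILL.
# Prints the name of the signal that ended the process.
graceful_kill() {
    local pid=$1 timeout=${2:-5} waited=0
    kill -TERM "$pid" 2>/dev/null || { echo "no such process"; return 1; }
    while is_running "$pid"; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            echo "SIGKILL"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "SIGTERM"
}
```

Usage would look like graceful_kill 1234 10, which waits up to ten seconds before forcing the issue.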


🔹 killall — Send Signals to Processes by Name

killall targets processes by name rather than PID — useful when you don’t know or want to look up the PID.

# Gracefully terminate all processes named "nginx"
killall nginx

# Force kill all instances
killall -9 firefox

# Kill only processes owned by a specific user
killall -u www-data nginx

# Interactively confirm before killing each match
killall -i python3

Warning: killall on Linux kills by name; on some Unix systems (e.g., Solaris), killall kills every single process on the system. Always verify which OS you’re on before using it in scripts.


🔹 PID and PPID — The Process Identity System

PID (Process ID): A unique integer assigned by the kernel to every process at creation. PIDs are assigned sequentially and wrap around after reaching the system maximum. The historical default is 32768, many current 64-bit distributions raise it to 4194304, and the limit can be read or changed via /proc/sys/kernel/pid_max.

PPID (Parent Process ID): The PID of the process that created (forked) this process. Every process except PID 1 has a parent.

Special PIDs:

| PID | Process | Role |
| --- | --- | --- |
| 0 | Swapper/idle | Kernel internal; not a real user process |
| 1 | init / systemd | The first userspace process; parent of all orphaned processes |
| 2 | kthreadd | Parent of all kernel threads |

# Find PID of a running process by name
pgrep nginx

# Find PID and PPID together (-d, joins multiple matching PIDs with commas)
ps -o pid,ppid,comm -p "$(pgrep -d, bash)"

# Read the current shell's PID/PPID from /proc
grep -E '^(Pid|PPid):' /proc/$$/status
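
The PPID link can be followed repeatedly to walk from any process up to the top of the tree. The sketch below assumes Linux with /proc mounted; ancestors is an illustrative helper name.

```shell
# Print "PID NAME" for a process and each of its ancestors, one per line,
# by following the PPid field in /proc/<pid>/status.
ancestors() {
    local pid=${1:-$$}
    while [ "${pid:-0}" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        printf '%s %s\n' "$pid" \
            "$(sed -n 's/^Name:[[:space:]]*//p' "/proc/$pid/status")"
        pid=$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status")
    done
}

# Walk up from the current shell.
ancestors "$$"
```

The walk stops when PPid is 0, which is what PID 1 reports. Inside a container the chain ends at the container's own PID 1 rather than the host's.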

🔹 Zombie Processes — The Living Dead

A zombie process (state Z in ps) is a process that has finished executing but whose exit status has not yet been collected by its parent.

How a zombie forms:

  1. Child process calls exit() — it terminates and releases its memory and resources.
  2. The kernel keeps a small entry in the process table to store the exit code.
  3. The parent is expected to call wait() or waitpid() to collect that exit code.
  4. If the parent never calls wait(), the child entry stays in the process table forever — it becomes a zombie.
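
The steps above can be reproduced on purpose. In this sketch a shell starts a child and then replaces itself with sleep, a program that never calls wait(), so the child stays a zombie until the parent exits. It assumes Linux with /proc mounted; children_with_state is a made-up helper name, and the sleep durations are arbitrary.

```shell
# Print "PID STATE" for every direct child of the given parent PID,
# using the State, Pid and PPid fields of /proc/<pid>/status.
children_with_state() {
    local parent=$1 f
    for f in /proc/[0-9]*/status; do
        awk -v parent="$parent" '
            /^State:/ { state = $2 }
            /^Pid:/   { pid = $2 }
            /^PPid:/  { ppid = $2 }
            END       { if (ppid == parent) print pid, state }
        ' "$f" 2>/dev/null
    done
}

# The inner shell forks `sleep 1`, then execs `sleep 4` in its own place.
# After one second the child has exited, but nothing reaps it.
sh -c 'sleep 1 & exec sleep 4' &
parent_pid=$!

sleep 2                                   # the child has exited by now
zombie_report=$(children_with_state "$parent_pid")
echo "children of $parent_pid: $zombie_report"   # state column should be Z

wait "$parent_pid"                        # parent exits and the entry disappears
```

While the parent is still sleeping, the same child also shows up with STAT Z in ps output.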

Key facts about zombies:

  • A zombie holds no memory, no CPU — it’s just a process table entry
  • You cannot kill a zombie with kill -9 — it’s already dead
  • To eliminate a zombie, you must fix or kill its parent
  • If the parent dies, systemd/init (PID 1) adopts the zombie and reaps it automatically

# Find zombie processes (STAT begins with Z)
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'

# Find the parent of a zombie
ps -o ppid= -p <zombie_PID>

# Ask the parent to exit; once it is gone, PID 1 adopts and reaps the zombie
kill <parent_PID>
kill -9 <parent_PID>   # only if the parent ignores SIGTERM

Zombie vs Orphan: A zombie is a dead child whose parent hasn’t called wait(). An orphan is a living child whose parent has already died — orphans are immediately re-parented to PID 1 (systemd), which will properly reap them when they finish.


🔧 Commands & Quick Reference

| Command | Purpose | Example |
| --- | --- | --- |
| ps aux | Snapshot of all processes (BSD format) | ps aux \| grep nginx |
| ps -ef | Snapshot with PPID visible (POSIX format) | ps -ef \| grep python |
| ps auxf | Process tree in ASCII art | ps auxf \| less |
| top | Real-time process monitor | top then press P |
| htop | Enhanced real-time monitor | htop then F5 for tree |
| kill <PID> | Send SIGTERM to a process | kill 4321 |
| kill -9 <PID> | Force kill (SIGKILL) | kill -9 4321 |
| kill -1 <PID> | Reload config (SIGHUP) | kill -1 $(pgrep nginx) |
| killall <name> | Kill all processes by name | killall -9 firefox |
| pgrep <name> | Find PID by process name | pgrep sshd |
| pkill <pattern> | Kill by name pattern (matches substrings, unlike killall's exact name match) | pkill -9 zombie_app |

💡 References & Learning Resources

  • man ps, man top, man kill — the primary source of truth (Beginner-friendly)
  • “The Linux Programming Interface” by Michael Kerrisk, Ch. 26 (Monitoring Child Processes) (Advanced/Deep dive)
  • “Linux Command Line and Shell Scripting Bible” by Richard Blum (Beginner-friendly)
  • strace -e trace=process ls — trace only process-related syscalls live (Intermediate)
  • Linux proc(5) man page: man 5 proc — full reference for /proc/<PID>/status fields (Intermediate)

📊 Quick Recap

  • ps aux is your go-to snapshot tool; %CPU, %MEM, RSS, and the STAT column reveal process health at a glance.
  • ps -ef shows PPID by default (ps aux does not), making it ideal for tracing parent-child relationships.
  • top gives real-time CPU/memory stats; htop adds color, mouse support, and a tree view — prefer htop for interactive use.
  • Always attempt SIGTERM (15) before SIGKILL (9) — give processes the chance to clean up gracefully.
  • killall targets by name; kill targets by PID — use pgrep to bridge them: kill $(pgrep nginx).
  • PID identifies a process uniquely; PPID reveals who spawned it, forming the process family tree rooted at PID 1 (systemd).
  • A zombie is a finished process whose exit status was never collected by its parent — it holds no resources but cannot be removed until the parent calls wait() or dies itself.

🏷️ Tags

#Linux #ProcessManagement #ps #top #htop #kill #killall #PID #PPID #ZombieProcess #Signals #SIGTERM #SIGKILL #SystemMonitoring #CLI #Intermediate #SysAdmin #UnixInternals
This post is licensed under CC BY 4.0 by the author.