
Linux DevOps Interview Questions and Answers

Q1: Explain the differences between RHEL, CentOS, and AlmaLinux.

Answer: RHEL is a commercial enterprise Linux distribution with official support and a 10-year lifecycle. CentOS was a free RHEL clone but was replaced by CentOS Stream, a development platform that sits just upstream of RHEL. AlmaLinux emerged as a community-driven, free RHEL fork aiming for 1:1 binary compatibility. Choose RHEL for enterprise environments that require vendor support and certifications, AlmaLinux for RHEL compatibility without licensing costs, and CentOS Stream for early access to upcoming RHEL features.

Q2: How would you troubleshoot a server running out of disk space?

Answer: I’d follow these steps (a command sketch follows the list):

  1. Run df -h to identify affected partitions
  2. Use du -hsx /* | sort -rh | head -10 to find large directories
  3. Check common culprits: log files in /var/log, database data, Docker images
  4. Find large recent files with find / -mtime -7 -type f -size +100M
  5. Check deleted but open files with lsof | grep deleted
  6. Take appropriate action: compress logs, set up rotation, clean caches
  7. Implement monitoring to prevent future occurrences
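
A minimal sketch of that sequence, assuming root access and standard GNU userland tools; the cleanup commands at the end are examples, not a prescription:

# 1-2. Locate the full filesystem and its largest top-level directories
df -h
du -hsx /* 2>/dev/null | sort -rh | head -10

# 3-4. Drill into common culprits and find large files modified in the last week
du -hs /var/log/* 2>/dev/null | sort -rh | head -10
find / -xdev -mtime -7 -type f -size +100M 2>/dev/null

# 5. Space held by deleted-but-open files (restart the owning service to release it)
lsof +L1 2>/dev/null

# 6. Example cleanup actions
journalctl --vacuum-size=200M      # trim the systemd journal
docker system prune -f             # remove unused Docker data, if Docker is in use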

Q3: Describe your approach to Linux server hardening.

Answer: My approach includes the following, with representative commands sketched after the list:

  1. Minimizing the attack surface:
    • Install only necessary packages
    • Disable unused services
    • Use a host-based firewall (UFW/firewalld)
  2. User account security:
    • Strong password policies
    • SSH key authentication
    • Restrict root login
    • Implement least privilege principle
  3. Network security:
    • Configure firewall rules
    • Disable unused ports
    • Use intrusion detection (fail2ban)
  4. Regular maintenance:
    • Keep systems patched
    • Enable automatic security updates
    • Regular security audits
  5. Monitoring and logging:
    • Central log management
    • File integrity monitoring
    • Regular log review
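
A few representative hardening commands, assuming a RHEL-family host with firewalld and sshd; the package and service names are illustrative, and fail2ban requires the EPEL repository on RHEL clones:

# 1. Minimize the attack surface: drop unneeded packages, stop unused services
dnf remove -y telnet
systemctl disable --now cups

# 2. SSH: key-only authentication, no root login
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
systemctl restart sshd

# 3. Network: allow only required services, add intrusion detection
firewall-cmd --permanent --add-service=https
firewall-cmd --permanent --remove-service=dhcpv6-client
firewall-cmd --reload
dnf install -y fail2ban

# 4. Automatic security updates
dnf install -y dnf-automatic
systemctl enable --now fail2ban dnf-automatic.timer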

Docker and Containerization Questions

Q1: Explain the difference between Docker containers and virtual machines.

Answer: Docker containers share the host OS kernel and are lightweight (MBs), starting in seconds with efficient resource usage. Virtual machines run a complete OS with its own kernel, are heavyweight (GBs), and start more slowly (minutes), but they provide stronger, hardware-level isolation. Containers are ideal for microservices and applications that share the same OS requirements, while VMs are better for applications requiring different OS environments or complete isolation.

Q2: How would you secure a Docker container in production?

Answer: To secure Docker containers (an example docker run invocation follows the list):

  1. Use minimal base images (Alpine, distroless)
  2. Scan images for vulnerabilities
  3. Run containers with non-root users
  4. Apply principle of least privilege with capabilities
  5. Use read-only filesystems where possible
  6. Implement network isolation with user-defined bridge networks
  7. Keep Docker engine updated
  8. Use secrets management for sensitive data
  9. Implement logging and monitoring
  10. Automate security scanning in pipelines
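
Several of these controls map directly onto docker run flags. A sketch, assuming a hypothetical image myorg/myapp:1.0 that listens on a privileged port and therefore keeps only NET_BIND_SERVICE:

# Isolated user-defined bridge network
docker network create app-net

# Non-root user, read-only root filesystem, minimal capabilities,
# no privilege escalation, resource limits, and the isolated network
docker run -d \
  --name web \
  --user 1000:1000 \
  --read-only \
  --tmpfs /tmp \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --security-opt no-new-privileges \
  --memory 512m \
  --pids-limit 200 \
  --network app-net \
  myorg/myapp:1.0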

Cloudflare, CDN, and Security Questions

Q1: What is a Web Application Firewall (WAF), and how does it differ from traditional firewalls?

Answer: A WAF protects web applications by filtering HTTP traffic at OSI Layer 7 (the application layer). It inspects HTTP/HTTPS traffic to protect against application-level attacks such as XSS, SQL injection, and CSRF. Traditional firewalls operate at OSI Layers 3-4 (network and transport), filtering traffic by IP address, port, and protocol without understanding application context. WAFs focus on application vulnerabilities using deep packet inspection, while traditional firewalls handle network-level threats with simpler rules based on IP/port/protocol combinations.

Q2: How would you implement security measures for a web application?

Answer: I would implement the following, with a quick verification sketch after the list:

  1. Input validation server-side and client-side
  2. Authentication with MFA where possible
  3. Authorization using principle of least privilege
  4. HTTPS with proper TLS configuration
  5. Security headers (HSTS, CSP, X-Frame-Options)
  6. WAF to protect against common attacks
  7. Rate limiting to prevent brute force
  8. CSRF tokens for form submissions
  9. Regular vulnerability scanning
  10. Security monitoring and logging
  11. Database security - parameterized queries, least privilege access
  12. Regular updates for all components
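
As a quick spot-check of items 4-5, the HTTPS redirect and security headers can be verified from the command line (example.com is a placeholder):

# HTTP should redirect to HTTPS
curl -sI http://example.com | head -1

# Security headers should actually be served
curl -sI https://example.com | grep -iE 'strict-transport-security|content-security-policy|x-frame-options'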

Nginx, Ansible, and Jenkins Questions

Q1: How would you configure Nginx as a reverse proxy with SSL termination?

Answer:

# Obtain SSL certificate first with certbot
server {
    listen 443 ssl http2;
    server_name example.com;
    
    # SSL configuration
    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    
    # Security headers
    add_header Strict-Transport-Security "max-age=31536000" always;
    
    # Proxy settings
    location / {
        proxy_pass http://backend_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name example.com;
    return 301 https://$host$request_uri;
}
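
Assuming certbot with its nginx plugin is installed, the certificate can be obtained and the configuration validated roughly like this (the domain is a placeholder; certonly keeps certbot from rewriting the hand-written server blocks):

# Obtain the Let's Encrypt certificate without modifying the nginx config
certbot certonly --nginx -d example.com

# Validate the configuration and reload
nginx -t && systemctl reload nginx

# Verify that automatic renewal will work
certbot renew --dry-run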

Q2: Describe an Ansible playbook you’ve created and what it accomplished.

Answer: I created a playbook for standardized web server deployment that:

  1. Updated system packages
  2. Installed Nginx, PHP-FPM, and required extensions
  3. Configured Nginx with optimized settings and SSL
  4. Set up proper file permissions and security hardening
  5. Deployed application code from a Git repository
  6. Configured monitoring agents
  7. Set up log rotation and backup scripts

The playbook used roles for reusability, variable files for different environments, and handlers for service restarts. This resulted in consistent deployments that reduced setup time from hours to minutes with built-in security best practices.
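
Runs against different environments would look roughly like this; the inventory layout, playbook name, and group names are illustrative:

# Dry run against staging, showing what would change
ansible-playbook -i inventories/staging/hosts.ini webserver.yml --check --diff

# Apply to production, limited to the web servers group, prompting for the vault password
ansible-playbook -i inventories/production/hosts.ini webserver.yml \
  --limit web_servers --ask-vault-pass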

Q3: How would you set up a basic Jenkins pipeline for a web application?

Answer: I’d create a Jenkinsfile with these stages:

  1. Checkout: Pull code from repository
  2. Build: Install dependencies and build assets
  3. Test: Run unit and integration tests
  4. Quality: Static code analysis and security scanning
  5. Package: Create deployable artifact
  6. Deploy: Deploy to staging environment
  7. Validate: Run smoke tests on staging
  8. Approval: Manual approval gate for production
  9. Production: Deploy to production

The pipeline would include notifications, timeout controls, and artifact archiving. For a web app, I’d use Docker for consistent environments and integrate security scanning tools to identify vulnerabilities early.

VMware and Proxmox Questions

Q1: Compare VMware and Proxmox. When would you choose one over the other?

Answer: VMware is proprietary, with enterprise features like vMotion and DRS, excellent scalability, and vendor support, but it carries significant licensing costs. Proxmox is open source, based on KVM and LXC, with a solid web management interface and basic clustering, and it is free with optional support subscriptions.

Choose VMware for enterprise environments with large-scale deployments, critical workloads requiring vendor support, and when budget allows. Choose Proxmox for smaller organizations with budget constraints, when you need both VMs and containers from one platform, or for development/testing environments.

Q2: Describe your experience with VM snapshots and backups.

Answer: I’ve implemented both snapshot and backup strategies (example commands follow the lists):

For snapshots:

  • Used for short-term protection during updates/changes
  • Implemented automated pre-update snapshots
  • Established retention policies (usually 24-48 hours)
  • Set up monitoring for orphaned snapshots
  • Educated teams on snapshot limitations (performance impact)

For backups:

  • Implemented agent-based and agentless VM backups
  • Set up incremental backup schedules with full backups weekly
  • Configured offsite replication for disaster recovery
  • Performed regular test restores to verify integrity
  • Used backup verification tools
  • Implemented 3-2-1 backup strategy (3 copies, 2 different media, 1 offsite)
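
On Proxmox, for example, the snapshot and backup pieces come down to a few commands; the VM ID and storage name below are hypothetical:

# Short-lived pre-change snapshot, removed once the change is validated
qm snapshot 101 pre-update --description "before kernel update"
qm listsnapshot 101
qm delsnapshot 101 pre-update

# Full backup with vzdump to a dedicated backup storage
vzdump 101 --mode snapshot --compress zstd --storage backup-nfs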

Automation and Troubleshooting Questions

Q1: How would you troubleshoot a web application that’s running slowly?

Answer: I’d follow a systematic approach; key diagnostic commands are sketched after the list:

  1. Gather information:
    • User reports and specific symptoms
    • Recent changes or deployments
    • Check monitoring dashboards
  2. Isolate the issue:
    • Frontend vs. backend performance
    • Check server resource utilization (CPU, memory, disk I/O)
    • Database query performance
    • Network latency and bandwidth
  3. Tools and diagnostics:
    • Browser developer tools for frontend issues
    • top, htop, iotop for server resources
    • Application performance monitoring tools
    • Database query analyzers
    • tcpdump or Wireshark for network analysis
  4. Common solutions:
    • Scale resources vertically or horizontally
    • Optimize database queries and add indices
    • Implement caching (Redis, Memcached)
    • Configure CDN for static assets
    • Enable compression and minification
    • Optimize code (N+1 query problems, memory leaks)
  5. Document and prevent:
    • Document root cause and solution
    • Implement monitoring to catch early warnings
    • Add performance testing to CI/CD pipeline
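
A few of the server-side checks from steps 2-3 as concrete commands; the URL is a placeholder, iostat comes from the sysstat package, and the MySQL slow-log path depends on the server configuration:

# Server resources: CPU, memory, disk I/O
top -b -n 1 | head -20
free -h
iostat -xz 1 3

# Where does the time go for a single request? (DNS, connect, TLS, first byte, total)
curl -s -o /dev/null \
  -w 'dns=%{time_namelookup} connect=%{time_connect} tls=%{time_appconnect} ttfb=%{time_starttransfer} total=%{time_total}\n' \
  https://example.com/

# Slowest database queries (MySQL example; requires the slow query log to be enabled)
mysqldumpslow -s t -t 5 /var/log/mysql/mysql-slow.log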

Q2: Describe your approach to automation in a DevOps environment.

Answer: My approach to automation focuses on:

  1. Identify repetitive tasks first - target high-frequency, error-prone, or time-consuming processes
  2. Start small and iterate - automate simple tasks first, then expand
  3. Use infrastructure as code - Terraform, CloudFormation for infrastructure
  4. Configuration management - Ansible for consistent environments
  5. CI/CD pipelines - Jenkins, GitHub Actions for automated testing and deployment
  6. Version control everything - store all automation code in Git
  7. Focus on idempotency - scripts should be safely rerunnable
  8. Include error handling - robust error reporting and recovery
  9. Documentation - self-documenting code with clear comments
  10. Monitoring and alerting - to verify automation is working

The goal is to reduce manual intervention, increase consistency, and enable self-service where possible.
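
To make the idempotency and error-handling points concrete, here is a minimal sketch of a rerunnable shell task; the package and file paths are examples:

#!/usr/bin/env bash
# Idempotent deployment step: safe to rerun, only acts when state differs from the target.
set -euo pipefail

log() { echo "[$(date -Is)] $*"; }
trap 'log "ERROR: failed at line $LINENO"' ERR

# Install nginx only if it is missing
if ! command -v nginx >/dev/null 2>&1; then
    log "Installing nginx"
    dnf install -y nginx
else
    log "nginx already installed, skipping"
fi

# Deploy the config only when it actually changed, then validate and reload once
if ! cmp -s nginx.conf /etc/nginx/nginx.conf; then
    log "Updating nginx.conf"
    install -m 0644 nginx.conf /etc/nginx/nginx.conf
    nginx -t && systemctl reload nginx
fi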

This post is licensed under CC BY 4.0 by the author.