DevOps Interview Linux Preparation Study Notes
These examples provide simple solutions to common DevOps interview questions, along with detailed explanations of each component.
1. Troubleshooting SSH Connection to EC2 Instance
Definition: SSH (Secure Shell) connection failures to EC2 instances can occur due to security group configurations, key permissions, network issues, or instance problems.
Key Points:
- Check security groups first - they’re the most common blocker
- Verify the instance is running and passed status checks
- Ensure your private key has proper permissions (chmod 400)
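The permission point can be checked before connecting. OpenSSH refuses a private key whose mode grants any access to group or others, so the following is a minimal sketch of that test. The function name key_perms_ok is hypothetical, and the sketch assumes GNU stat on Linux.

```shell
#!/bin/sh
# Sketch: succeed only if no group/other permission bits are set on a key file.
# Assumes GNU stat (Linux); BSD/macOS stat uses different flags.
key_perms_ok() (
    key_file=$1
    mode=$(stat -c '%a' "$key_file") || exit 2   # octal mode, e.g. 400 or 644
    # The leading 0 makes the shell read the mode as octal; 077 masks group/other bits.
    if [ $(( 0$mode & 077 )) -ne 0 ]; then
        echo "$key_file has mode $mode; try: chmod 400 $key_file" >&2
        exit 1
    fi
)
```

Usage: key_perms_ok your-key.pem && ssh -i your-key.pem ec2-user@<host>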
Step-by-Step Troubleshooting:
- Verify instance state
- Check if the instance is running in the EC2 Console
- Ensure the instance has passed its status checks
- Check security group rules
- Verify inbound rules allow SSH traffic (port 22) from your IP
- Command:
aws ec2 describe-security-groups --group-ids sg-xxxxx
- Validate network settings
- Confirm the subnet has a route to the internet gateway
- Ensure no Network ACLs are blocking port 22
- Check key pair
- Verify you’re using the correct private key (.pem file)
- Set proper permissions:
chmod 400 your-key.pem
- Use verbose mode for more details
ssh -v -i your-key.pem ec2-user@ec2-xx-xx-xx-xx.compute-1.amazonaws.com
- Try EC2 Instance Connect
- Use as an alternative to verify the instance is accepting connections
- Check system logs in EC2 console
- System log may show SSH service failures
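How a connection attempt fails can hint at which step above to revisit: a timeout usually means packets are being dropped (security group, NACL, or routing), while an immediate refusal suggests the instance is reachable but nothing is listening on port 22. The sketch below probes the port and prints a likely cause. It assumes bash and coreutils timeout are installed; the function names explain_probe and probe_ssh_port are hypothetical, and the diagnoses are hints rather than certainties.

```shell
#!/bin/sh
# Sketch: probe TCP port 22 and interpret the result.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are available.

# Map the probe's exit status to a likely cause.
explain_probe() {
    case "$1" in
        0)   echo "open: port 22 answers; check the key and username next" ;;
        124) echo "timeout: packets dropped; check security group, NACL, routing" ;;
        *)   echo "refused: host answered but sshd may be down, or the name did not resolve" ;;
    esac
}

probe_ssh_port() {
    # `timeout` exits with 124 when the command does not finish in time.
    timeout 5 bash -c ': </dev/tcp/"$1"/22' _ "$1" 2>/dev/null
    explain_probe "$?"
}
```

Usage: probe_ssh_port ec2-xx-xx-xx-xx.compute-1.amazonaws.com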
2. Troubleshooting 5xx Errors on Ubuntu Server
Definition: 5xx errors are server-side errors indicating the server failed to fulfill a valid request, commonly seen in web applications.
Key Points:
- 5xx errors almost always indicate server problems, not client issues
- Common causes: resource exhaustion, application crashes, or misconfiguration
- Log files are your best diagnostic tools
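To see which 5xx codes are being returned and how often, the access log can be summarized by status code. This is a minimal sketch that assumes the common/combined log format, where the status code is the ninth whitespace-separated field; the function name count_5xx is hypothetical, and a custom log format would need a different field number.

```shell
#!/bin/sh
# Sketch: count responses per 5xx status code in an access log.
# Assumes common/combined log format (status code is field 9).
count_5xx() {
    awk '$9 ~ /^5[0-9][0-9]$/ { n[$9]++ }
         END { for (s in n) print s, n[s] }' "$1" | sort
}
```

Usage on a default Ubuntu install (paths are the package defaults and may differ on your server): count_5xx /var/log/nginx/access.log or count_5xx /var/log/apache2/access.log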
Step-by-Step Troubleshooting:
- Check web server logs
- Apache: /var/log/apache2/error.log
- Nginx: /var/log/nginx/error.log
- Check application logs
- Check application-specific logs (e.g., /var/log/application.log)
- For containerized apps, check container logs
- Monitor system resources
# Check CPU and memory usage
top
# Check disk space
df -h
# Check for running processes
ps aux | grep apache2 # or nginx, etc.
- Restart web services
sudo systemctl restart apache2 # or nginx
- Check configuration files
- Validate config syntax:
sudo apache2ctl configtest # or: sudo nginx -t
- Check database connections
- Ensure database services are running
- Check connection parameters in configuration files
- Review recent changes
- Check deployment logs for recent changes
- Verify configuration changes
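Restarting a web server with a broken configuration can turn intermittent 5xx errors into a full outage, so it may be worth combining the config test and the restart from the steps above. A minimal sketch follows; the function name restart_if_valid is hypothetical.

```shell
#!/bin/sh
# Sketch: restart a service only if its config test command succeeds.
# First argument is the systemd unit; the rest is the test command to run.
restart_if_valid() {
    service=$1
    shift
    if "$@"; then
        sudo systemctl restart "$service"
    else
        echo "config test failed; $service was not restarted" >&2
        return 1
    fi
}
```

Usage: restart_if_valid nginx sudo nginx -t, or restart_if_valid apache2 sudo apache2ctl configtest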
3. Adding and Mounting a New Disk on a Server
Definition: Adding a new disk involves attaching the physical/virtual storage, creating a filesystem, and mounting it to make it accessible within the filesystem hierarchy.
Key Points:
- Always check disk identity before partitioning/formatting
- Create a permanent mount entry in /etc/fstab for persistence after reboot
- Consider filesystem type based on your needs (ext4 is common for Linux)
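One way to act on the first point is to refuse to touch a device that already carries a filesystem or is mounted. The sketch below relies on lsblk printing the FSTYPE and MOUNTPOINT columns for the disk and each of its partitions; the function name device_looks_unused is hypothetical, and an empty result is a hint, not proof, that the disk holds no data.

```shell
#!/bin/sh
# Sketch: succeed only if lsblk reports no filesystem and no mount point
# for the device or any of its partitions.
device_looks_unused() {
    info=$(lsblk -n -o FSTYPE,MOUNTPOINT "$1") || return 2
    # Strip whitespace; anything left means a filesystem type or mount point exists.
    if [ -n "$(printf '%s' "$info" | tr -d '[:space:]')" ]; then
        echo "refusing: $1 already has a filesystem or is mounted" >&2
        return 1
    fi
}
```

Usage: device_looks_unused /dev/xvdf && sudo fdisk /dev/xvdf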
Step-by-Step Guide:
- Identify the new disk
lsblk
# or (listing disks with fdisk requires root)
sudo fdisk -l
- Create a partition
sudo fdisk /dev/xvdf # Replace with your disk
# n (new partition)
# p (primary partition)
# Accept defaults for partition number and size
# w (write changes)
- Create a filesystem
sudo mkfs -t ext4 /dev/xvdf1 # Replace with your partition
- Create a mount point
sudo mkdir /data # Or any preferred directory
- Mount the filesystem
sudo mount /dev/xvdf1 /data
- Configure automatic mounting
# Get the UUID
sudo blkid /dev/xvdf1
# Add to /etc/fstab
echo "UUID=<uuid-from-blkid> /data ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab
- Verify the mount
df -h
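Appending to /etc/fstab with tee -a adds a duplicate line every time the command is rerun. A minimal sketch of an append that skips mount points already listed is shown below; the function name add_fstab_entry and its optional third argument (an alternate fstab path, handy for trying it on a copy) are hypothetical, and the ext4 options mirror the entry used above.

```shell
#!/bin/sh
# Sketch: append an ext4 fstab entry unless the mount point is already listed.
# Arguments: UUID, mount point, optional fstab path (defaults to /etc/fstab).
add_fstab_entry() {
    uuid=$1
    mount_point=$2
    fstab=${3:-/etc/fstab}
    # Field 2 of a non-comment fstab line is the mount point.
    if awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
        echo "$mount_point is already listed in $fstab" >&2
        return 0
    fi
    printf 'UUID=%s %s ext4 defaults,nofail 0 2\n' "$uuid" "$mount_point" |
        sudo tee -a "$fstab" >/dev/null
}
```

After adding the entry, running sudo mount -a is a quick way to confirm that the new line parses and mounts before the next reboot.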
4. Recovering from a Lost PEM File for an AWS Ubuntu Server
Definition: A lost PEM file means you’ve lost SSH access to your instance because the private key is no longer available to authenticate your connection.
Key Points:
- AWS doesn’t store your private key, so it can’t recover it for you
- Recovery involves creating a new key and applying it to the instance
- For EBS-backed instances, you can use another instance to help recover access
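Since the helper instance has to live in the same Availability Zone as the volume, it can help to look up the root volume's ID and AZ before starting. The sketch below uses the describe-volumes attachment filters; the function name root_volume_info is hypothetical, and /dev/sda1 is assumed as the root device name, which you should confirm for your instance.

```shell
#!/bin/sh
# Sketch: print "<volume-id> <availability-zone>" (tab-separated) for the
# volume attached to an instance as its root device.
# Arguments: instance ID, optional root device name (assumed /dev/sda1).
root_volume_info() {
    instance=$1
    root_device=${2:-/dev/sda1}
    aws ec2 describe-volumes \
        --filters "Name=attachment.instance-id,Values=$instance" \
                  "Name=attachment.device,Values=$root_device" \
        --query 'Volumes[0].[VolumeId,AvailabilityZone]' \
        --output text
}
```

Usage: root_volume_info i-1234567890abcdef0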
Step-by-Step Recovery:
- Create a new key pair
aws ec2 create-key-pair --key-name NewKeyPair --query 'KeyMaterial' --output text > NewKeyPair.pem
chmod 400 NewKeyPair.pem
- Stop the instance (Note: This will cause downtime)
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
- Detach the root volume
aws ec2 detach-volume --volume-id vol-1234567890abcdef0
- Launch a temporary instance
- Launch in the same AZ as the detached volume, and launch it with the new key pair
- Attach the volume to the temporary instance
aws ec2 attach-volume --volume-id vol-1234567890abcdef0 --instance-id i-abcdef1234567890 --device /dev/sdf
- Mount the volume and modify authorized_keys
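sudo mkdir /mnt/recovery
sudo mount /dev/xvdf1 /mnt/recovery # Device name may differ; confirm with lsblk
# ~/.ssh/authorized_keys here belongs to the temporary instance, so it must contain the NewKeyPair public key
sudo cp ~/.ssh/authorized_keys /mnt/recovery/home/ubuntu/.ssh/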
- Detach, reattach to original instance, and restart
sudo umount /mnt/recovery
aws ec2 detach-volume --volume-id vol-1234567890abcdef0
aws ec2 attach-volume --volume-id vol-1234567890abcdef0 --instance-id i-1234567890abcdef0 --device /dev/sda1
aws ec2 start-instances --instance-ids i-1234567890abcdef0
- Connect with the new key
ssh -i NewKeyPair.pem ubuntu@ec2-xx-xx-xx-xx.compute-1.amazonaws.com
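The stop, detach, and attach commands in the steps above return before the state change has finished, so running them back to back in a script can fail. A minimal sketch of the first half of the procedure with AWS CLI waiters between steps is shown below. The function names run and move_root_volume_to_rescue and the DRY_RUN switch are hypothetical; with DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: stop the instance, detach its root volume, and attach it to a
# rescue instance, waiting for each state change to complete.

# Print the command when DRY_RUN=1 (default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments: original instance ID, rescue instance ID, root volume ID.
move_root_volume_to_rescue() {
    instance=$1
    rescue=$2
    volume=$3
    run aws ec2 stop-instances --instance-ids "$instance" &&
    run aws ec2 wait instance-stopped --instance-ids "$instance" &&
    run aws ec2 detach-volume --volume-id "$volume" &&
    run aws ec2 wait volume-available --volume-ids "$volume" &&
    run aws ec2 attach-volume --volume-id "$volume" \
        --instance-id "$rescue" --device /dev/sdf &&
    run aws ec2 wait volume-in-use --volume-ids "$volume"
}
```

Usage: review the printed plan with move_root_volume_to_rescue i-1234567890abcdef0 i-abcdef1234567890 vol-1234567890abcdef0, then set DRY_RUN=0 to execute it. The same waiters apply when moving the volume back to the original instance.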