How to safeguard yourself from the notorious "rm -rf" command in production

This article was inspired by the postmortem that the GitLab team published after the database outage on January 31, 2017. It is great when companies do not keep incidents behind closed doors but share their experience and knowledge with the public. This helps drive innovation, lets others learn from the experience, and strengthens the trust people put in the company (thanks to GitLab.com).

The main story of this incident follows a GitLab team member who was paged after hours to investigate high database load. The engineer suspected spam and attempted to fix the problem by deleting spam users and blocking IPs. However, due to a misunderstanding, the engineer ran the rm -rf command in the wrong SSH session and accidentally deleted the entire production database.

The team then scrambled to recover the data. They discovered that none of their automated backups were functional and eventually had to restore from a manual snapshot taken hours before the incident. This resulted in data loss for anything created in the preceding six hours.

Lessons learned from this incident include:

  • Importance of testing backups: The team discovered that none of their automated backups were functional, highlighting the importance of regularly testing backups to ensure they work as expected.

  • Thorough load testing: The incident exposed a replication lag scenario that could have been identified through thorough load testing.

  • Code review: The engineer accidentally deleted the database due to running an untested command. Code review practices could have helped prevent this mistake.

  • Importance of documentation: The team was unfamiliar with how the backup process worked and there was no documentation on how to handle replication lag. Better documentation could have prevented the situation from escalating.

  • Asynchronous deletes: The incident also highlighted the danger of synchronous deletes. The team decided to implement soft deletes instead, where a user is marked as deleted but their data is not immediately removed.

  • Dual control principle: Ensure that at least two team members are involved in risky operations in the production environment. This approach adds an extra layer of security and accountability, but it also has a cost in efficiency and practicality.
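
The first lesson, testing backups, lends itself to automation. Below is a minimal sketch, assuming a tar-based backup of a plain directory. The function name verify_backup and the paths in the usage comment are hypothetical, and a database backup would have to be restored into a scratch instance rather than compared with diff.

```shell
#!/usr/bin/env bash
# Minimal sketch: prove that a tar backup can actually be restored.
# verify_backup is a hypothetical helper name, not a standard tool.
verify_backup() {
    local source_dir archive scratch rc
    source_dir=$1
    archive=$2
    scratch=$(mktemp -d) || return 1
    # Restore into a scratch directory, never over the live data.
    if tar -xzf "$archive" -C "$scratch"; then
        # The archive is assumed to be created relative to source_dir,
        # so the restored tree should be identical to the source tree.
        if diff -r "$source_dir" "$scratch" >/dev/null; then rc=0; else rc=1; fi
    else
        rc=1
    fi
    rm -r -- "${scratch:?}"
    return "$rc"
}

# Usage (paths are placeholders):
#   tar -czf /backups/data.tar.gz -C /srv/data .
#   verify_backup /srv/data /backups/data.tar.gz && echo "backup restores cleanly"
```

Running such a check on a schedule turns "we have backups" into "we have backups that restore", which is the part that was missing during the incident.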

How to safely delete files in a Linux environment?

So, now let's discuss how we can safeguard ourselves from the rm -rf command in particular.

Deleting files safely in Linux involves ensuring you're removing the correct files and have backups if necessary. Here are steps and tips to delete files safely:

  1. Backup Important Data: Before deleting any files, ensure you have a backup of important data. You can use tools like rsync, tar, or a dedicated backup solution to back up your data.

  2. Use the rm command carefully: The rm command is used to delete files and directories:

    • To delete a single file, use rm filename.

    • To delete multiple files at once, list them with spaces in between, like rm file1.txt file2.txt.

    • To delete a directory and its contents recursively, use rm -r directoryName.

  3. Use the -i option for interactive deletion: To be safe, use the -i option with rm to make the deletion interactive. rm then prompts before deleting each file so that you can confirm. For example, rm -i filename asks for confirmation before deleting the specified file.

  4. Use the trash-cli utility: Instead of permanently deleting files, you can use a utility like trash-cli, which moves files to a trash folder, mimicking the recycle bin feature of graphical environments. Install trash-cli using your package manager (e.g., apt install trash-cli on Debian/Ubuntu), and use trash-put filename to move files to the trash safely. To restore a trashed file, use the trash-restore command.

  5. Use Custom Scripts: Users who have specific needs or who work in environments without a native trash system can write custom scripts. These scripts could move files to a designated directory (serving as a "trash" folder) and could even implement restore functionality. Such scripts can be written in bash, Python, or any scripting language supported by Linux. This approach requires more work but offers maximum flexibility.

  6. Double-Check File Names and Paths: Before hitting Enter, double-check the file names and paths to ensure you're deleting the correct files. A typo can lead to deleting the wrong file or directory.

  7. Avoid running rm as root when possible: Running rm with superuser privileges increases the risk of system damage if you mistakenly delete important system files. Only use sudo rm if you are certain about the files you're deleting and their impact on the system.

  8. Use Wildcards Carefully: Wildcards (e.g., *) can be powerful but dangerous when used with rm. For example, rm * deletes all files in the current directory. Always double-check when using wildcards to prevent unintended deletions.

  9. Consider Using a Graphical File Manager: If you're unsure about command-line operations, consider using a graphical file manager. This can provide a more intuitive and visual way to select and delete files, often with a trash/recycle bin feature that allows for recovery of mistakenly deleted files. On GNOME-based systems, you can use the native Nautilus file manager to trash files.
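
The custom script idea from step 5 can be sketched in a few lines of shell. This is an illustration under stated assumptions, not a replacement for trash-cli: the TRASH_DIR location and the function names trash and untrash are hypothetical, and nothing here ever empties the trash directory.

```shell
#!/usr/bin/env bash
# Minimal sketch of a "move to trash" helper. TRASH_DIR and the function
# names are hypothetical; trash-cli is the more complete solution.
TRASH_DIR="${TRASH_DIR:-$HOME/.local/share/simple-trash}"

trash() {
    local target slot base
    mkdir -p -- "$TRASH_DIR" || return 1
    for target in "$@"; do
        if [ ! -e "$target" ] && [ ! -L "$target" ]; then
            printf 'trash: %s: no such file or directory\n' "$target" >&2
            return 1
        fi
        # One fresh, timestamped directory per item, so equal names never collide.
        slot=$(mktemp -d "$TRASH_DIR/$(date +%Y%m%dT%H%M%S).XXXXXX") || return 1
        base=$(basename -- "$target")
        mv -- "$target" "$slot/" || return 1
        # Print where the item went so that it can be restored later.
        printf '%s\n' "$slot/$base"
    done
}

# Move a trashed item back to a path of your choice (prompts before overwriting).
untrash() {
    mv -i -- "$1" "$2"
}
```

For example, trash report.txt prints the new location of the file, and untrash with that location and ./report.txt puts it back. Moving instead of unlinking keeps the data on disk, so the trash directory should live on a file system with enough free space.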

Remember, files deleted with rm are difficult to recover, especially on systems without a dedicated "trash" area for command-line deletions. Always proceed with caution and ensure you have backups of important data. Before performing any risky operation in a production environment, always notify your team members about what you are going to do.
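
The advice to double-check paths, wildcards, and the machine you are on can be combined into one small guard. The sketch below is a hypothetical helper (confirm_rm is not a standard command): it prints the host name and the already expanded argument list, and only deletes after the host name has been typed back.

```shell
#!/usr/bin/env bash
# Minimal sketch: show where you are and what the arguments expand to, then
# require the host name to be typed before deleting. confirm_rm is a
# hypothetical helper name.
confirm_rm() {
    local host answer
    [ "$#" -gt 0 ] || return 2
    host="${HOSTNAME:-$(uname -n)}"
    printf 'Host: %s\nAbout to delete:\n' "$host" >&2
    # Wildcards were expanded by the shell before this function was called,
    # so this list is exactly what would be removed.
    printf '  %s\n' "$@" >&2
    printf 'Type the host name to continue: ' >&2
    IFS= read -r answer || return 1
    if [ "$answer" != "$host" ]; then
        printf 'Host name did not match, nothing deleted.\n' >&2
        return 1
    fi
    rm -r -- "$@"
}

# Usage (the path is a placeholder):
#   confirm_rm ./build/*
```

Typing the host name is deliberately more effort than pressing y: it forces a look at which session the command is about to run in.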

Secure SSH sessions with different coloring

One way to protect yourself from accidental deletes is to color your SSH sessions, so that you are constantly reminded visually of the environment you are currently logged in to.

You can customize the terminal window colors for SSH sessions to differentiate between production and secondary (or development/testing) environments. Below are a few approaches:

1. Terminal Profile Settings

Many terminal emulators (like GNOME Terminal, Konsole, iTerm2 for macOS, etc.) allow you to create profiles with custom colors, fonts, and other settings. You can create a "Production" profile with a red background and a "Secondary" profile with a blue background. When you open a new terminal window to connect to a server, you can select the appropriate profile for that environment.

2. SSH Config File

You can use the SSH configuration file (~/.ssh/config) to define host aliases for your servers that change the terminal color upon connection and revert it upon disconnection. This relies on echo or printf statements that emit terminal escape sequences, and it is limited by the terminal emulator's support for those sequences.

An example entry in ~/.ssh/config might look like this for a production server:

Host production
    HostName production.example.com
    User username
    PermitLocalCommand yes
    # Runs locally once connected: set background to red
    LocalCommand printf '\033]11;#FF0000\007'
    # A remote command gets no terminal by default, so request one
    RequestTTY yes
    # Start a shell; when it exits, set background to blue
    RemoteCommand bash; printf '\033]11;#0000FF\007'

And for a secondary server:

Host secondary
    HostName secondary.example.com
    User username
    PermitLocalCommand yes
    # Runs locally once connected: set background to green
    LocalCommand printf '\033]11;#00FF00\007'

Note: The LocalCommand and RemoteCommand options, and particularly the escape sequences used, might behave differently depending on your terminal emulator. The sequences above use OSC 11, an xterm control sequence that sets the background color. It is not one of the standard ANSI color codes, and not every terminal emulator supports it. You might need to adjust the sequences based on your terminal emulator's capabilities and your personal preferences.
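
The SSH configuration file does not appear to offer an option that runs a local command when the session ends, so another way to restore the color on disconnect is a wrapper function in your local shell startup file. The sketch below makes two assumptions: the function names ssh_tinted and prod are hypothetical, and the terminal is assumed to understand OSC 111 (the xterm sequence that resets the background color to the profile default).

```shell
#!/usr/bin/env bash
# Minimal sketch: tint the terminal for the duration of one SSH session.
# ssh_tinted and prod are hypothetical helper names.
ssh_tinted() {
    local color rc
    color=$1
    shift
    printf '\033]11;%s\007' "$color"   # OSC 11: set the background color
    if ssh "$@"; then rc=0; else rc=$?; fi
    printf '\033]111\007'              # OSC 111: reset the background color
    return "$rc"
}

# Shortcut for the "production" host alias from ~/.ssh/config.
prod() {
    ssh_tinted '#FF0000' production "$@"
}
```

Because the reset runs locally after ssh returns, it also fires when the connection drops, which a command executed on the remote side cannot guarantee.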

3. Bash Profile or Bashrc

You can also customize your .bash_profile or .bashrc file on the server to emit color codes when you log in, but this approach changes the color after login, which might not be as immediately visible as changing the terminal window's color. This method requires that you have the ability to modify these files on the server, which might not be suitable or allowed for production environments.
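
As an illustration, the following lines could go into ~/.bashrc on the server. This is a sketch under an assumed naming convention: production host names are taken to start with "prod", and prompt_for_host is a hypothetical function name. It colors the prompt rather than the whole window, using standard SGR color codes inside the bash prompt escapes.

```shell
#!/usr/bin/env bash
# Minimal sketch for ~/.bashrc on the server. Assumption: production host
# names start with "prod". prompt_for_host is a hypothetical helper name.
prompt_for_host() {
    case "$1" in
        # Bold white text on a red background for production hosts.
        prod*) printf '%s' '\[\033[1;37;41m\][PROD \h]\[\033[0m\] \w \$ ' ;;
        # Bold green text for everything else.
        *)     printf '%s' '\[\033[1;32m\][\h]\[\033[0m\] \w \$ ' ;;
    esac
}

PS1="$(prompt_for_host "${HOSTNAME:-unknown}")"
```

The \[ and \] markers tell bash that the enclosed bytes do not take up space on the line, which keeps line editing and wrapping correct.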

4. Using Terminal Multiplexers

If you use a terminal multiplexer like tmux or screen, you can configure it to use different color schemes based on the session name, which you might set to reflect the environment you're connected to. This approach requires some familiarity with your multiplexer's configuration files and options.
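
With tmux, this can be sketched as one session per environment whose status bar color depends on the session name. The "prod" prefix convention and the function names below are assumptions for illustration, and the status-style option requires a reasonably recent tmux (1.9 or later).

```shell
#!/usr/bin/env bash
# Minimal sketch: pick a tmux status bar style from the session name.
# The "prod" prefix and both function names are hypothetical.
status_style_for() {
    case "$1" in
        prod*) printf 'bg=red,fg=white' ;;
        *)     printf 'bg=green,fg=black' ;;
    esac
}

ensure_env_session() {
    local session
    session=$1
    # Create the session in the background only if it does not exist yet.
    tmux has-session -t "$session" 2>/dev/null || tmux new-session -d -s "$session"
    tmux set-option -t "$session" status-style "$(status_style_for "$session")"
}

# Usage:
#   ensure_env_session prod-db && tmux attach-session -t prod-db
```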

While customizing terminal colors can be a useful visual reminder of the environment you're working with, ensure that any changes you make do not interfere with your ability to read text in the terminal or distinguish other important information. Always test your configuration changes in a safe environment before applying them to production or critical systems.

Using chattr command to prevent accidental deletion in Linux

The chattr (change attribute) command in Linux allows you to modify the file attributes on a Linux file system to increase data security and integrity. One of its most powerful features for preventing accidental deletion or modification is the ability to set the immutable (i) attribute on a file or directory. When a file or directory is marked as immutable, even users with root privileges cannot delete, modify, rename, or create a hard link to it until the immutable attribute is removed. This can be particularly useful for protecting critical configuration files or sensitive data.

Using chattr to prevent accidental deletion

  1. Set the Immutable Attribute

    To make a file immutable and thus prevent it from being accidentally deleted or modified, you would use the +i attribute with chattr. For example, to make file.txt immutable, you would use the following command:

     sudo chattr +i file.txt
    

    To apply this attribute to a directory and all the files within it, you would use the -R (recursive) option:

     sudo chattr +i -R /path/to/directory
    
  2. Verify the Attributes

    To check if a file or directory has the immutable attribute set, you can use the lsattr command:

     lsattr file.txt
    

    For directories, especially when applied recursively, you might want to list attributes for all contained files:

     lsattr -R /path/to/directory
    
  3. Remove the Immutable Attribute

    If you need to modify or delete the file or directory later, you will first need to remove the immutable attribute using the -i option:

     sudo chattr -i file.txt
    

    Again, for a directory, especially if the attribute was applied recursively, you would use:

     sudo chattr -i -R /path/to/directory
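
The verification step can also be scripted, for example to audit a list of files that are supposed to be protected. The sketch below assumes the usual lsattr output format (flag string first, then the path); is_immutable and report_unprotected are hypothetical helper names. A path for which lsattr fails, for instance on an unsupported file system, is reported as not immutable.

```shell
#!/usr/bin/env bash
# Minimal sketch: check the immutable flag by parsing lsattr output.
# is_immutable and report_unprotected are hypothetical helper names.
is_immutable() {
    local flags
    # lsattr -d reports on the path itself; the first field is the flag string.
    flags=$(lsattr -d -- "$1" 2>/dev/null | awk '{ print $1; exit }')
    case "$flags" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print every given path that does not carry the immutable flag.
report_unprotected() {
    local path
    for path in "$@"; do
        is_immutable "$path" || printf 'not immutable: %s\n' "$path"
    done
}

# Usage (paths are placeholders):
#   report_unprotected /etc/fstab /etc/ssh/sshd_config
```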
    

Important Considerations

  • Superuser Only: Setting and removing immutable attributes can only be done by the root user or with sudo privileges.

  • Double-Edged Sword: While the immutable attribute is a powerful tool for preventing accidental deletions, it can also interfere with system updates, backups, and other maintenance tasks if used without careful consideration. For example, if a script or a system update needs to modify a file marked as immutable, it will fail until the attribute is removed.

  • Not a Backup Solution: While chattr can prevent accidental deletions, it is not a substitute for having a proper backup. Always ensure you have regular backups of critical data.

  • File System Support: The chattr command works on most Linux file systems, including ext2, ext3, ext4, and btrfs. However, its functionality might not be supported on all file systems, so it's important to verify compatibility with your specific setup.

Using chattr with the immutable attribute is a practical way to safeguard critical files and directories against accidental deletions or modifications, enhancing system security and data integrity.
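
For planned maintenance on a protected file, the remove, change, and re-protect sequence can be wrapped so that the attribute is always put back. This is a minimal sketch: with_unlocked is a hypothetical helper name, and it assumes that the file was immutable to begin with and that you are allowed to run chattr through sudo.

```shell
#!/usr/bin/env bash
# Minimal sketch: lift the immutable flag for one command, then restore it.
# with_unlocked is a hypothetical helper name.
with_unlocked() {
    local file rc
    file=$1
    shift
    sudo chattr -i "$file" || return 1
    if "$@"; then rc=0; else rc=$?; fi
    # Restore the protection even if the command failed.
    sudo chattr +i "$file" || return 1
    return "$rc"
}

# Usage (the path is a placeholder):
#   with_unlocked /etc/critical.conf sudoedit /etc/critical.conf
```

The file is unprotected while the wrapped command runs, so the window should be kept as short as possible.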
