Initial commit: Intelligent Disk Cleaner

2026-02-03 14:36:25 -05:00
commit ec7625d4d9
11 changed files with 1769 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,71 @@
# Intelligent Disk Cleaner
A zero-dependency Python script for Linux servers that prevents disk exhaustion by monitoring usage and intelligently cleaning up log files.
## Key Features
1. **Zero External Dependencies**: Runs on standard Python 3 libraries. No `pip install` needed.
2. **Dual Thresholds**:
    * **Warning (80%)**: Alerts admins to high usage and identifies the largest open file (suspected culprit) without modifying any files.
    * **Critical (95%)**: Triggers active cleanup to recover space immediately.
3. **Intelligent Truncation**: Shrinks large active log files in place by discarding the oldest 50% of their data, so the writing process keeps its open file handle.
4. **Fallback Cleanup**: If active logs cannot be shrunk, iteratively deletes the oldest rotated log files first.
5. **Spam Prevention**: Rate-limits email notifications to once every 8 hours per disk volume.
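As an illustration of the in-place truncation idea in item 3, here is a minimal sketch. It is not taken from `disk_cleaner.py`: the function name `truncate_keep_tail` and its parameters are hypothetical, and it assumes the writing process opened the log in append mode (`O_APPEND`), so its later writes land at the new end of the file. It does not guard against data appended while the copy is running.

```python
import os


def truncate_keep_tail(path, keep_fraction=0.5, chunk_size=1024 * 1024):
    """Shrink `path` in place, keeping roughly the newest `keep_fraction` of bytes.

    The file is never unlinked or replaced, so its inode (and any handle
    another process holds on it) stays valid. Returns the bytes freed.
    Hypothetical helper for illustration only.
    """
    with open(path, "r+b") as f:
        size = f.seek(0, os.SEEK_END)
        read_pos = size - int(size * keep_fraction)
        if read_pos == 0:
            return 0  # nothing to discard

        # Skip to the end of the current line so the kept data
        # starts on a line boundary.
        f.seek(read_pos)
        f.readline()
        read_pos = f.tell()

        # Slide the tail of the file down to offset 0, chunk by chunk.
        write_pos = 0
        while True:
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            f.seek(write_pos)
            f.write(chunk)
            read_pos += len(chunk)
            write_pos += len(chunk)

        # Cut off everything after the relocated tail.
        f.truncate(write_pos)
    return size - write_pos
```

Because the data is moved within the same file rather than written to a replacement, the inode is unchanged. A writer that did not use append mode would keep writing at its old offset and leave a sparse gap, which is why the append-mode assumption matters.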
## Requirements
* **Operating System**: Linux (relies on `/proc` filesystem for process analysis).
* **Runtime**: Python 3.6+.
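To show why `/proc` is required, the following sketch walks `/proc/<pid>/fd` to find the largest regular file that any process holds open. It is an assumption about how such a scan could look, not the script's actual code; `largest_open_file` and its `proc_root` parameter are hypothetical names.

```python
import os
import stat


def largest_open_file(proc_root="/proc"):
    """Return (path, size_bytes) of the largest open regular file, or None.

    Hypothetical helper: scans every numeric directory under `proc_root`
    and inspects the symlinks in its `fd` subdirectory.
    """
    best = None
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)  # path the descriptor points to
                st = os.stat(link)          # follows the link to the open file
            except OSError:
                continue
            if not stat.S_ISREG(st.st_mode):
                continue  # skip sockets, pipes, devices
            if best is None or st.st_size > best[1]:
                best = (target, st.st_size)
    return best
```

Without root privileges, the `fd` directories of other users' processes are unreadable and are silently skipped, so the result may miss the real culprit.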
## Configuration
Open `disk_cleaner.py` and configure the settings at the top of the file:
```python
# --- Configuration ---
THRESHOLD_PERCENT = 95.0 # Critical cleanup trigger
WARNING_THRESHOLD_PERCENT = 80.0 # Warning email trigger
RATE_LIMIT_SECONDS = 8 * 3600 # 8-hour cooldown
# Email Settings
SMTP_SERVER = "smtp.example.com"
SMTP_PORT = 25
EMAIL_FROM = "alerts@example.com"
EMAIL_TO = ["admins@example.com"]
```
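For orientation, this is one way the two thresholds could map a volume's usage to a state. It is a sketch only: `usage_percent` and `classify` are hypothetical names, and treating exactly 95% as critical (`>=`) is an assumption. Note that `shutil.disk_usage` can report a slightly different percentage than `df`, which accounts for blocks reserved for root.

```python
import shutil

THRESHOLD_PERCENT = 95.0          # Critical cleanup trigger
WARNING_THRESHOLD_PERCENT = 80.0  # Warning email trigger


def usage_percent(mount_point):
    """Return used space on `mount_point` as a percentage of total space."""
    usage = shutil.disk_usage(mount_point)
    return usage.used / usage.total * 100.0


def classify(percent):
    """Map a usage percentage to 'ok', 'warning', or 'critical'.

    Assumption: both thresholds are inclusive lower bounds.
    """
    if percent >= THRESHOLD_PERCENT:
        return "critical"
    if percent >= WARNING_THRESHOLD_PERCENT:
        return "warning"
    return "ok"
```

For example, `classify(usage_percent("/"))` would give the state of the root volume.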
## Usage
### Manual Execution
Run with root privileges to ensure access to all system processes (required for accurate open file detection):
```bash
sudo python3 disk_cleaner.py
```
### Automation (Cron)
To monitor continuously, add a cron job (e.g., run hourly).
1. Open root crontab:
```bash
sudo crontab -e
```
2. Add the schedule:
```bash
# Run every hour at minute 0
0 * * * * /usr/bin/python3 /opt/scripts/disk_cleaner.py >> /var/log/disk_cleaner.log 2>&1
```
## Logic Flow
1. **Check Usage**: Scans all mounted partitions (ignoring pseudo-filesystems).
2. **Evaluate State**:
    * **Below 80%**: Healthy. Exit.
    * **80% to below 95% (Warning)**: Scan open files and find the largest file held open by any process. Send a "WARNING" email with the file path and size. **No action taken.**
    * **95% or above (Critical)**: Scan open files, then attempt cleanup.
        * **Strategy A**: After additional checks confirm that the file is a log, truncate the largest active log file by 50% and check whether enough space was recovered.
        * **Strategy B**: If Strategy A fails or frees too little space, find rotated logs (e.g., `.log.1`, `.gz`) and delete the oldest ones until usage drops.
        * **Action Taken**: Send an "URGENT" email detailing the cleanup.
        * **Action Failed**: If space cannot be freed, send a "CRITICAL" email naming the largest suspect file.
3. **Rate Limit**: Before sending any email, check the state file (`/tmp/disk_cleaner_state.json`). If an email was sent for this volume in the last 8 hours, suppress the notification.
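The rate-limit step can be sketched as follows. The layout of the state file (a JSON object mapping each volume to the Unix timestamp of its last email) is an assumption, as is the function name `should_send`; the actual script may store its state differently.

```python
import json
import time

RATE_LIMIT_SECONDS = 8 * 3600  # 8-hour cooldown


def should_send(volume, state_path, now=None):
    """Return True if an email for `volume` may be sent, and record the send.

    Hypothetical helper. Assumes the state file holds a JSON object of
    the form {"<volume>": <last_sent_unix_timestamp>, ...}.
    """
    now = time.time() if now is None else now
    try:
        with open(state_path) as f:
            state = json.load(f)
    except (OSError, ValueError):
        state = {}  # missing or corrupt state file: start fresh
    if not isinstance(state, dict):
        state = {}

    last = state.get(volume)
    if last is not None and now - last < RATE_LIMIT_SECONDS:
        return False  # still inside the cooldown window

    state[volume] = now
    with open(state_path, "w") as f:
        json.dump(state, f)
    return True
```

Keying the state by volume means an alert about `/var` is not suppressed by a recent alert about `/`. Because files under `/tmp` are typically cleared on reboot, the cooldown would reset after a restart.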