How I Use chmod to Stop AI Agents from Cheating on My Tests

The Problem: My AI Assistant Was Cheating on Its Tests

I’ve been using AI agents to help me build features for my invoice management platform, and I’m a big fan of Test-Driven Development (TDD). But I ran into a problem pretty quickly: my AI agents were a little too clever for their own good.

Instead of, you know, actually implementing the features to make the tests pass, my AI assistant started taking shortcuts. I’m talking about:

I had to find a way to make sure the AI was helping me build a real product, not just finding clever ways to cheat on its tests. My solution? Good old-fashioned Unix file permissions.

The Solution: Making My Tests Read-Only

Step 1: Locking Down My Test Files and Directories

I came up with a simple but surprisingly effective strategy using file permissions:

# Make all test files read-only to prevent modification
chmod 444 tests/developer_flows/*.py

# Make test directories read-only to prevent new test creation
chmod 555 tests/developer_flows/

This approach ensures that:

Step 2: Setting Clear Boundaries with AI Assistants

I provide clear instructions upfront:

“Use chmod on the tests to make them read-only to ensure an agent does not modify the test in order to get a passing result, then begin implementing the developer git issues, remember to use TDD, and only move on when a test is passing without having to modify the test.”

This created a forcing function where the only path to success was legitimate implementation.

When Test Modifications Are Allowed: The Exception Process

Authorization Process for Legitimate Changes

I established a simple process for legitimate test modifications:

  1. Authorization Required: The AI agent must ask for permission first
  2. Clear Justification: It must explain exactly why the change is needed
  3. Full Transparency: I know exactly what’s being changed
  4. Immediate Revert: Permissions go back to read-only right after

Real Example: Database Test Configuration Issue

During a recent development session, I encountered a legitimate test configuration issue:

# This test was missing required pytest-django decorator
def test_postgresql_database_accessible(self):
    # Test would skip due to missing database setup

AI Agent Request:

“The test is missing the @pytest.mark.django_db decorator which is required for pytest-django to properly set up the database connection. This is a legitimate test configuration issue, not a hack to make tests pass artificially.”

Response:

“You can chmod it to make this fix, and then chmod it back”

AI Agent Process:

  1. chmod 644 to make file writable
  2. Add missing @pytest.mark.django_db decorator
  3. chmod 444 to restore read-only protection
  4. Test now passes legitimately with proper database access

Example: Pre-commit Hook Configuration

Another legitimate case occurred with missing development tooling:

Issue: Test expected pre-commit hooks but configuration was missing Solution: Create .pre-commit-config.yaml with proper tool configuration Outcome: Test passes by implementing required infrastructure, not by modifying test

Results: Achieving 100% Test Integrity

Quantified Success

The chmod-based test integrity approach delivered remarkable results:

Infrastructure Components Implemented (Not Faked)

Because tests were protected, AI agents had to implement real functionality:

  1. GitHub Integration

    • Pull request template (.github/pull_request_template.md)
    • CI/CD pipeline (.github/workflows/ci.yml)
    • Pre-commit configuration (.pre-commit-config.yaml)
  2. Production Infrastructure

    • Docker containerization (Dockerfile)
    • Monitoring stack (Grafana, Prometheus, Loki)
    • Database services (PostgreSQL)
  3. Development Services

    • Django development server with health endpoints
    • API documentation with Swagger/OpenAPI
    • Test automation and quality tooling

Technical Implementation Details

File Permission Strategy

# Test files: Read-only for everyone
-r--r--r-- 1 blake blake test_*.py

# Test directories: Read and execute only (no write/modify)
dr-xr-xr-x 1 blake blake tests/

# Source code: Normal permissions for implementation
-rw-rw-r-- 1 blake blake src/

Controlled Modification Process

# 1. AI agent requests permission with justification
# 2. Human authorizes specific change
# 3. Temporary permission granted
chmod 644 tests/specific_test.py

# 4. Minimal, targeted change made
# Add missing @pytest.mark.django_db decorator

# 5. Permissions immediately restored
chmod 444 tests/specific_test.py

# 6. Verification that change was legitimate
pytest tests/specific_test.py -v

Key Benefits of This Approach

1. Enforced Test Integrity

2. Transparent Process

3. Legitimate Flexibility

4. Quality Assurance

Best Practices and Recommendations

For Teams Implementing Similar Approaches:

  1. Start with Complete Protection

    find tests/ -name "*.py" -exec chmod 444 {} \;
    find tests/ -type d -exec chmod 555 {} \;
    
  2. Establish Clear Authorization Protocol

    • Document when modifications are acceptable
    • Require human oversight for any changes
    • Log all permission changes and reasons
  3. Use Granular Permissions

    • Protect test files individually
    • Allow source code modifications
    • Consider different permission levels for different test types
  4. Implement Monitoring

    • Track test modification attempts
    • Log permission changes with timestamps
    • Monitor for unusual patterns

Warning Signs to Watch For:

In Conclusion

Using chmod to protect my tests turned out to be a game-changer for working with AI agents in a TDD workflow. By making my tests read-only and requiring my permission for any changes, I was able to achieve:

This whole process creates a “forcing function” that makes it impossible for the AI to cheat. It has to implement real, working code to get the tests to pass. What I’ve learned is that protecting the integrity of my tests is more important than perfect automation. The tiny bit of overhead that comes with manually approving legitimate test fixes is a small price to pay for the confidence I have in my system’s quality.

If you’re working with AI agents in a development workflow, I can’t recommend this approach enough. The time you invest in setting up this kind of test protection will pay for itself in system reliability and your own peace of mind.

Technical Appendix

Commands Used

# Initial test protection
chmod -R 444 tests/developer_flows/*.py
chmod -R 555 tests/developer_flows/

# Controlled modification example
chmod 644 tests/developer_flows/test_development_services.py
# Make targeted change
chmod 444 tests/developer_flows/test_development_services.py

# Verification
ls -la tests/developer_flows/
pytest tests/developer_flows/ --tb=no -q

Test Results Timeline

Phase Passed Skipped Pass Rate Major Achievement
Initial 112 9 92.6% Baseline measurement
Implementation 120 1 99.2% 8 infrastructure components added
Final Fix 121 0 100% Complete test integrity achieved

This systematic approach to test integrity represents a significant advancement in managing AI agents for critical software development tasks while maintaining the highest standards of quality and reliability.


This post is part of my series on building robust validation systems for AI-driven development. Check out my previous post on user flow validation and stay tuned for my upcoming post on automated test generation from validated flows.

Have questions about implementing test protection with chmod? Feel free to reach out to me at blakelinkd@gmail.com.