izaac/ai-trace-scanner

ai-trace-scan

Detect AI/agentic authorship fingerprints in a codebase.

Scans git history, branch names, config files, and source comments for traces left by AI coding assistants (Copilot, Claude, Cursor, Aider, etc.).

Setup

One-time setup — installs uv (if needed) and creates the virtual environment:

make install

After that, uv run handles everything automatically (venv activation, Python version, dependencies). You don't need to run make again.

Platform notes

| Platform | How uv is installed |
| --- | --- |
| macOS / Linux (standard) | make install downloads it via the official installer |
| NixOS | Use nix-shell (a shell.nix is provided) or add uv to your system config |
| Already have uv | make install skips the download, just runs uv sync |

NixOS:

nix-shell                       # drops you into a shell with uv + git
uv sync                         # create venv and install
uv run ai-trace-scan --help

Usage

Once installed, run from anywhere using uv run --project:

# Shorthand — add to .bashrc / .zshrc
alias ai-scan='uv run --project /path/to/ai-trace-scanner ai-trace-scan'

Scan an entire repo

ai-scan /path/to/repo

Scans the last 50 commits, all branches, the config files, and the source tree.

Scan only your current branch

The most common use case — check your feature branch before pushing:

cd /path/to/repo
ai-scan --branch $(git branch --show-current) .

This scans only commits in your branch that aren't in main/master, plus the source tree and config files.
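
The "commits in your branch that aren't in main/master" selection corresponds to git's two-dot revision range. The sketch below shows one way to list those commits from Python. It is an illustration rather than the scanner's own code: the function names are hypothetical, and the base branch is passed in explicitly because the README does not say how the tool picks between main and master.

```python
import subprocess


def branch_only_args(branch: str, base: str = "main") -> list[str]:
    """Build a git command listing commits reachable from `branch` but not `base`."""
    return ["git", "rev-list", f"{base}..{branch}"]


def branch_only_commits(repo: str, branch: str, base: str = "main") -> list[str]:
    """Run the command inside `repo` and return the commit hashes, newest first."""
    out = subprocess.run(
        branch_only_args(branch, base),
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return out.stdout.split()
```

Calling branch_only_commits(".", "my-feature") in a checkout would return only the commits unique to my-feature.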

Scan staged changes before committing

cd /path/to/repo
ai-scan --staged

Only checks the diff of what you're about to commit. Useful as a pre-commit hook (see below).

Scan unstaged work-in-progress

cd /path/to/repo
ai-scan --unstaged

Scans your working tree changes before you even stage them. Use both together to check everything pending:

ai-scan --staged --unstaged

Limit commit history depth

ai-scan --commits 10 /path/to/repo

Disable color (for CI / piping)

ai-scan --no-color /path/to/repo

Quiet mode (findings only, no banner)

ai-scan --quiet /path/to/repo

What it detects

| Category | Severity | Examples |
| --- | --- | --- |
| Git trailers | High | Co-authored-by: Copilot, Co-authored-by: Claude |
| Commit messages | High | "generated by Copilot", "as an AI", "per your instructions" |
| AI-to-user language | High | "as you requested", "per your request", "let me know if you'd like" |
| AI first-person voice | High | "I've implemented", "here's the implementation", "I'll create" |
| Commit authors | High | copilot[bot], devin[bot], +Copilot@users.noreply.github.com |
| Commit diffs | Medium | Added lines containing AI attribution, AI TODOs, instruction remnants |
| Emoji commit prefixes | Low | Commits starting with emoji characters (common in AI-generated messages) |
| Tag annotations | High | Annotated tags with AI trailers or attribution |
| Branch names | Medium | copilot/, claude/, aider-, ai-, devin-, sweep- prefixes |
| Config files | High/Medium | AGENTS.md, CLAUDE.md, .cursorrules, .aider*, .github/copilot-instructions.md |
| Source comments | Medium | "generated with Claude", "copilot-generated" (pygments-parsed, comments only) |
| Prose content | Medium | "written by Claude", "powered by GPT" in markdown/text files |
| GitHub Actions | Medium | Workflows using AI actions, API keys (OPENAI_API_KEY), AI SDK installs |
| Commit timing | Medium | Clusters of commits with suspiciously small gaps (possible automation) |
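
To make the first rows of the table concrete, here is a minimal sketch of how git trailers and bot authors might be matched with regular expressions. These are not the scanner's actual patterns: TRAILER_RE, BOT_AUTHOR_RE, and find_traces are hypothetical names, and the tool names come only from the examples listed in the table.

```python
import re

# Hypothetical patterns, built only from the examples in the table above.
TRAILER_RE = re.compile(
    r"^co-authored-by:[ \t]*.*\b(copilot|claude)\b",
    re.IGNORECASE | re.MULTILINE,
)
BOT_AUTHOR_RE = re.compile(r"\b(copilot|devin)\[bot\]", re.IGNORECASE)


def find_traces(message: str, author: str) -> list[str]:
    """Return the categories that match a commit message and author name."""
    hits = []
    if TRAILER_RE.search(message):
        hits.append("git-trailer")
    if BOT_AUTHOR_RE.search(author):
        hits.append("commit-author")
    return hits
```

A commit whose message ends with a "Co-authored-by: Copilot" trailer would be reported under git-trailer, while an ordinary message from a human author yields an empty list.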

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | No findings |
| 1 | Findings detected |
| 2 | Error |
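
A CI script can branch on these exit codes. The sketch below wraps the CLI with subprocess and maps the code to a label. It assumes uv and the scanner are installed on the machine; describe_exit and run_scan are hypothetical helper names, not part of the tool.

```python
import subprocess

# Meanings taken from the exit-code table above.
EXIT_MEANINGS = {0: "no findings", 1: "findings detected", 2: "error"}


def describe_exit(code: int) -> str:
    """Translate a scanner exit code into a short label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(project_dir: str, repo: str) -> str:
    """Run the scanner through uv and return the label for its exit code."""
    cmd = [
        "uv", "run", "--project", project_dir,
        "ai-trace-scan", "--quiet", "--no-color", repo,
    ]
    # check=False: a non-zero exit is an expected outcome, not an exception.
    result = subprocess.run(cmd, check=False)
    return describe_exit(result.returncode)
```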

Pre-commit hook

Add to .pre-commit-config.yaml:

- repo: local
  hooks:
    - id: ai-trace-scan
      name: AI trace scan
      entry: uv run --project /path/to/ai-trace-scanner ai-trace-scan --staged
      language: system
      pass_filenames: false

Or as a git hook directly in .git/hooks/pre-commit:

#!/bin/sh
uv run --project /path/to/ai-trace-scanner ai-trace-scan --staged

Exclude patterns

Filter out noise from things you don't control (upstream branches, etc.):

# Exclude all upstream remote branches
ai-scan --exclude 'upstream/' /path/to/repo

# Exclude specific config files
ai-scan --exclude 'AGENTS\.md' --exclude 'CLAUDE\.md' /path/to/repo

# Combine multiple excludes
ai-scan --exclude 'upstream/' --exclude 'origin/copilot/' /path/to/repo

Persistent excludes with config file

Create .ai-trace-scan.yml in the repo root:

exclude: ['upstream/', 'AGENTS\.md', 'CLAUDE\.md']

Config file excludes are merged with CLI --exclude flags.
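
The escaped dots in patterns such as 'AGENTS\.md' suggest that excludes are regular expressions. Under that assumption, merging and matching could look like the sketch below. The function names are hypothetical, and whether the tool searches anywhere in the string (as re.search does here) or anchors the match is not stated in this README.

```python
import re


def merge_excludes(
    config_patterns: list[str], cli_patterns: list[str]
) -> list[re.Pattern[str]]:
    """Combine config-file and CLI patterns, dropping duplicates but keeping order."""
    merged = dict.fromkeys(config_patterns + cli_patterns)
    return [re.compile(p) for p in merged]


def is_excluded(target: str, patterns: list[re.Pattern[str]]) -> bool:
    """True if any pattern matches anywhere in the branch name or path."""
    return any(p.search(target) for p in patterns)


# Config-file patterns first, then CLI patterns; "upstream/" appears in both.
patterns = merge_excludes(["upstream/", r"AGENTS\.md"], ["origin/copilot/", "upstream/"])
```

With these patterns, is_excluded("upstream/main", patterns) is true and is_excluded("origin/main", patterns) is false.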

JSON output

For CI pipelines or further processing:

ai-scan --format json /path/to/repo

# Pipe to jq for filtering
ai-scan --format json /path/to/repo | jq '[.[] | select(.severity == "high")]'
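
The jq filter above implies that the JSON output is an array of objects with a severity field. A Python equivalent of that filter might look like this sketch. Only the severity key is implied by this README; the category key in the sample data is made up for illustration.

```python
import json


def high_severity(raw_json: str) -> list[dict]:
    """Keep only findings whose severity is "high", like the jq filter above."""
    findings = json.loads(raw_json)
    return [f for f in findings if f.get("severity") == "high"]


# Made-up sample; only the "severity" key is implied by the README.
sample = json.dumps(
    [
        {"severity": "high", "category": "git-trailer"},
        {"severity": "medium", "category": "branch-name"},
    ]
)
print(high_severity(sample))
```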

Commit timing detection and fix

The scanner flags commits that are suspiciously close together. It can also rewrite timestamps to look like natural work sessions.

WARNING: --fix-dates rewrites git history. While safety checks (backup branches, tree verification, future-date guards) and unit tests are in place, unforeseen scenarios may still cause data loss. Always keep a backup of your repository before rewriting history. The author is not responsible for any data loss resulting from the use of this tool.

How it works

By default, dates are anchored to the present — the last commit lands at "now" and the rest spread backwards. This ensures no future dates.

Single session (default): Spreads commits evenly across a time window, then adds random jitter so the gaps are not identical.

                                              last commit = now
+~1h±jitter       +~1h±jitter       +~1h±jitter       |
    |                  |                  |             |
  06:22             07:08              08:34          09:45
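
The single-session idea can be sketched in a few lines: space the commits evenly over the window, pin the last one to "now", and nudge the others by a random amount. This is an illustration under stated assumptions, not the tool's implementation. The function name and parameters are hypothetical, and capping the jitter below half a step is a choice made here so that commits never swap order.

```python
import random
from datetime import datetime, timedelta, timezone


def spread_single_session(count, spread_hours, jitter_minutes, now, rng):
    """Return `count` ascending timestamps, the last one exactly at `now`."""
    if count < 1:
        return []
    if count == 1:
        return [now]
    step = timedelta(hours=spread_hours) / (count - 1)
    # Cap jitter below half a step so neighbours can never swap order.
    max_jitter = min(timedelta(minutes=jitter_minutes), step * 0.49)
    stamps = []
    for i in range(count):
        base = now - step * (count - 1 - i)
        if i < count - 1:  # the last commit stays pinned to "now"
            base += max_jitter * rng.uniform(-1.0, 1.0)
        stamps.append(base)
    return stamps


now = datetime(2026, 3, 29, 9, 45, tzinfo=timezone.utc)
stamps = spread_single_session(4, 3, 10, now, random.Random(0))
```

Because the last timestamp equals now and every other one is earlier, no date can land in the future.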

Burst mode (--burst): Splits commits into work sessions with idle days between them — simulates the pattern of working in bursts then taking days off.

Session 1 (3 days ago)     idle      Session 2 (today)
  c1  c2  c3           ~2 days gap     c4  c5  c6
  |   |   |                           |   |   |
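
Burst mode can be sketched the same way: divide the commits into sessions, end the last session at "now", and push each earlier session back by the idle gap. The code below is a simplified illustration with hypothetical names. It omits jitter for brevity and assumes the session length is shorter than the idle gap, otherwise sessions would overlap.

```python
from datetime import datetime, timedelta, timezone


def split_into_sessions(count: int, sessions: int) -> list[int]:
    """Divide `count` commits into `sessions` groups of near-equal size."""
    sessions = max(1, min(sessions, count))
    base, extra = divmod(count, sessions)
    return [base + (1 if i < extra else 0) for i in range(sessions)]


def burst_schedule(count, sessions, idle_days, spread_hours, now):
    """Return ascending timestamps grouped into sessions, the last ending at `now`."""
    sizes = split_into_sessions(count, sessions)
    stamps = []
    for index, size in enumerate(sizes):
        # Each earlier session ends `idle_days` before the one after it.
        session_end = now - timedelta(days=idle_days) * (len(sizes) - 1 - index)
        step = timedelta(hours=spread_hours) / max(size - 1, 1)
        stamps.extend(session_end - step * (size - 1 - k) for k in range(size))
    return stamps


now = datetime(2026, 3, 29, 9, 45, tzinfo=timezone.utc)
schedule = burst_schedule(6, 2, 2, 3, now)
```

Here six commits become two sessions of three, the first ending two days before the second.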

Anchor modes (--anchor):

| Mode | Behavior | Use case |
| --- | --- | --- |
| present (default) | Last commit = now, rest spread backwards | Normal use; always safe |
| first-commit | First commit keeps its original date, rest spread forward | When you want to preserve the start date (may produce future dates) |

Usage

# Preview what would change (safe, no modifications)
ai-scan --fix-dates --dry-run /path/to/repo

# Single session: spread 10 commits over 3 hours (default)
ai-scan --fix-dates /path/to/repo

# Longer session
ai-scan --fix-dates --spread 6 /path/to/repo

# Burst mode: 3 work sessions, ~2 days idle between each
ai-scan --fix-dates --burst 3,2 /path/to/repo

# Burst with custom session length
ai-scan --fix-dates --burst 2,3 --spread 4 /path/to/repo

# Tighter jitter for less variance
ai-scan --fix-dates --jitter 5 /path/to/repo

# Fix only a feature branch
ai-scan --fix-dates --branch my-feature /path/to/repo

# Anchor from first commit instead of present (may produce future dates)
ai-scan --fix-dates --anchor first-commit /path/to/repo

# Override remote-push safety check (you know what you're doing)
ai-scan --fix-dates --force /path/to/repo

# GPG/SSH sign all commits after rewriting
ai-scan --fix-dates --sign /path/to/repo

# Full example: burst + sign + force
ai-scan --fix-dates --all-commits --force --burst 4,3 --spread 5 --jitter 25 --sign /path/to/repo

Safety checks

--fix-dates rewrites git history, so it runs several checks before touching anything:

| Check | What happens | Override |
| --- | --- | --- |
| Dirty working tree | Refuses to run if you have unstaged or staged changes | Commit or stash first |
| Git operation in progress | Refuses if a rebase, merge, or cherry-pick is active | Finish or abort it |
| Pushed commits | Refuses if any commits exist on remote tracking branches | --force |
| Backup branch | Creates backup/fix-dates-YYYYMMDD-HHMMSS before rewriting | Automatic |
| Post-rewrite verification | Compares tree SHAs before/after to confirm file content is unchanged | None; prints restore command on failure |

After every rewrite, undo instructions are printed:

  To undo: git reset --hard backup/fix-dates-20260329-163500
  To clean up backup: git branch -D backup/fix-dates-20260329-163500
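
The backup-branch name and the tree comparison can be reproduced by hand if you want to double-check a rewrite. The sketch below is one way to do that and is not the tool's own code: the function names are hypothetical. It relies on git rev-parse with the `<rev>^{tree}` syntax, which resolves a revision to its tree object.

```python
import subprocess
from datetime import datetime


def backup_branch_name(moment: datetime) -> str:
    """Build a name in the backup/fix-dates-YYYYMMDD-HHMMSS format."""
    return moment.strftime("backup/fix-dates-%Y%m%d-%H%M%S")


def tree_sha(repo: str, rev: str) -> str:
    """Ask git for the tree object that a revision points to."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-parse", f"{rev}^{{tree}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return out.stdout.strip()


def content_unchanged(repo: str, backup: str, rewritten: str = "HEAD") -> bool:
    """True when the rewritten tip has the same file content as the backup."""
    return tree_sha(repo, backup) == tree_sha(repo, rewritten)
```

Rewriting dates changes commit hashes but not tree hashes, so equal tree SHAs indicate that the files at the branch tip are identical.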

Use --dry-run to preview the timestamp changes without modifying history:

ai-scan --fix-dates --dry-run /path/to/repo
  [DRY RUN] Would rewrite 5 commits:

    a1b2c3d4e5f6  2026-03-29T10:00:00-07:00  ->  2026-03-29T10:00:00-07:00
    f6e5d4c3b2a1  2026-03-29T10:01:00-07:00  ->  2026-03-29T10:48:12-07:00
    ...

Detection

Commit timing analysis runs automatically during scans. Adjust sensitivity:

# Flag clusters with less than 3-minute average gaps (stricter)
ai-scan --cluster-threshold 3 /path/to/repo
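
As a rough illustration of what timing detection involves, the sketch below groups consecutive commits whose gaps stay under a threshold. It is a simplification and not the scanner's algorithm: it requires every gap in a run to be below the threshold (which also keeps the average below it), and the min_size parameter is a hypothetical addition.

```python
from datetime import datetime, timedelta


def find_clusters(stamps, threshold_minutes, min_size=3):
    """Return (first, last) index pairs for runs of closely spaced commits.

    `stamps` must be sorted in ascending order.
    """
    threshold = timedelta(minutes=threshold_minutes)
    clusters = []
    start = 0
    for end in range(1, len(stamps) + 1):
        # Close the current run when the list ends or the next gap is too large.
        if end == len(stamps) or stamps[end] - stamps[end - 1] >= threshold:
            if end - start >= min_size:
                clusters.append((start, end - 1))
            start = end
    return clusters


base = datetime(2026, 3, 29, 10, 0)
minutes = [0, 1, 2, 60, 61]
commits = [base + timedelta(minutes=m) for m in minutes]
print(find_clusters(commits, 3))
```

The first three commits are one minute apart and form a cluster; the last two are also close together but fall below the minimum run size.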

Development

Running tests

The test suite uses pytest and covers patterns, config loading, output formatting, source scanning, git operations, date rewriting, and the CLI entry point.

# Run the full suite (verbose)
make test

# Run the full suite (quiet, CI-friendly)
uv run --extra test pytest tests/ -q

# Run a single test file
uv run --extra test pytest tests/test_dates.py -v

# Run a specific test by name
uv run --extra test pytest tests/test_dates.py -k "test_fix_dates_burst" -v

# Run with coverage (if pytest-cov is installed)
uv run --extra test pytest tests/ --cov=ai_trace_scan --cov-report=term-missing

Test files live in tests/ and mirror the package structure:

| Test file | What it covers |
| --- | --- |
| test_patterns.py | Regex patterns for AI traces |
| test_config.py | .ai-trace-scan.yml loading and exclude filters |
| test_output.py | Text and JSON output formatting |
| test_source_scan.py | Pygments comment extraction, gitignore, file tree |
| test_git_scan.py | Commit, branch, tag, and staged-change scanning |
| test_dates.py | Clustering, scan_dates, fix_dates, safety, weekends |
| test_cli.py | Argparse flags, exit codes, end-to-end CLI invocations |

Linting and formatting

# Check without modifying
make lint

# Auto-fix and reformat
make format

make lint runs (in order):

  1. ruff — pycodestyle, pyflakes, isort, bugbear, simplify rules
  2. black — code formatting check (line length 100)
  3. mypy — strict type checking
  4. mdformat — Markdown formatting check (GFM + tables)

Pre-commit hooks

Activate hooks that run linters, formatters, AI trace scan, and tests on every commit:

make setup-hooks

Hooks (via pre-commit):

| Hook | What it does |
| --- | --- |
| ruff | Lint + auto-fix Python |
| ruff-format | Format Python (ruff's built-in formatter) |
| black | Format Python |
| mdformat | Format Markdown |
| ai-trace-scan | Scan staged changes for AI traces |
| pytest | Run unit tests |

Makefile targets

| Target | Command | Description |
| --- | --- | --- |
| install | make install | Install uv (if needed) and sync deps |
| run | make run ARGS="..." | Run the scanner with arbitrary arguments |
| test | make test | Run the full pytest suite |
| lint | make lint | Check ruff, black, mypy, mdformat |
| format | make format | Auto-fix ruff + reformat black + mdformat |
| setup-hooks | make setup-hooks | Install pre-commit git hooks |
| clean | make clean | Remove .venv |
