Programming with Aider: Impressions, Overview

Some rough impressions after programming mainly with Aider. I first used it exclusively for about 4 days back in March, using version 0.77 and mostly Claude Sonnet 3.7, and then again for a week at the end of July, using 0.80.2.

Usability pretty good

I like the workflow of using this command line combined with an editor of choice.

I like that it always has the latest state of my code without having to be nudged to refresh files (unlike Cursor).

It mostly seemed to generate decent code in 0.77 (mostly using Claude 3.7 sonnet) and 0.85.2 (using Claude 4 sonnet and Qwen 3 Coder). Sonnet is fairly quick too. The experience varies a lot with the choice of model, not surprisingly.

So far, I have never gotten into a "bug loop" like my frequent experience with Cursor (eg it fixed problem 1, introduced problem 2; it fixed problem 2, introduced problem 3; fixed problem 3 by re-introducing problem 1, then repeat).

Managing which files get included as context is a bit more hassle: lots of manual /add and /drop of files. Cursor mostly guesses right.

Too many changes at once, unreadable diff format

When changes are made in the terminal, they scroll by, typically with more than I can digest. This is my biggest usability complaint. It is actually training me to ignore the LLM output because there's just too much of it.

It doesn't help that the diff format used (usually) is the least readable diff format I have ever encountered: if modifying a method, it seems to show the entire old method body, then the entire new method body. There is no visual indication of which of those lines changed or how.

I don't see much in the docs about tweaking this. I wish I could change it. Example:

<<<<<<< SEARCH
def setup_colored_logging():
    # Initialize colorama
    colorama.init()

    # Set up handler with our custom formatter
    handler = logging.StreamHandler()
    handler.setFormatter(ColoredFormatter("%(levelname)-8s %(name)s: %(message)s"))

    # Configure root logger
    root_logger = logging.getLogger()
    # Remove any existing handlers
    for h in root_logger.handlers[:]:
        root_logger.removeHandler(h)
    # Add our handler
    root_logger.addHandler(handler)
    root_logger.setLevel(logging.INFO)
=======
def setup_colored_logging():
    # Initialize colorama
    colorama.init()

    # Set up handler with our custom formatter
    handler = logging.StreamHandler()
    handler.setFormatter(ColoredFormatter("%(levelname)-8s %(name)s: %(message)s"))

    # Configure root logger
    root_logger = logging.getLogger()
    # Remove only StreamHandler instances (console handlers), keep file handlers
    for h in root_logger.handlers[:]:
        if isinstance(h, logging.StreamHandler) and not isinstance(h, logging.handlers.RotatingFileHandler):
            root_logger.removeHandler(h)
    # Add our colored console handler
    root_logger.addHandler(handler)
    root_logger.setLevel(logging.INFO)
>>>>>>> REPLACE

In the docs, I see that you can override the diff format. But this changes the model behavior, which might or might not work well. I think what I'd like is to have control of how the diff is displayed to me independently of the model diff format.
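To make that wish concrete, here is a minimal sketch of a display-side converter: it takes a SEARCH/REPLACE block as plain text and re-renders it as a unified diff using Python's standard difflib, without touching the format the model is asked to produce. The function name `search_replace_to_unified` and the regex are illustrative assumptions, not part of Aider, and the example input is a trimmed excerpt of the block above.

```python
import difflib
import re

# Matches one SEARCH/REPLACE block as it appears in the chat output.
BLOCK_RE = re.compile(
    r"^<<<<<<< SEARCH\n(?P<old>.*?)^=======\n(?P<new>.*?)^>>>>>>> REPLACE\s*$",
    re.DOTALL | re.MULTILINE,
)


def search_replace_to_unified(block: str, context: int = 2) -> str:
    """Render a SEARCH/REPLACE block as a unified diff (hypothetical helper)."""
    match = BLOCK_RE.search(block)
    if match is None:
        raise ValueError("no SEARCH/REPLACE block found")
    old_lines = match.group("old").splitlines(keepends=True)
    new_lines = match.group("new").splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile="before", tofile="after", n=context
    )
    return "".join(diff)


# Trimmed excerpt of the example block shown earlier in this post.
example = """<<<<<<< SEARCH
    # Remove any existing handlers
    for h in root_logger.handlers[:]:
        root_logger.removeHandler(h)
=======
    # Remove only StreamHandler instances (console handlers), keep file handlers
    for h in root_logger.handlers[:]:
        if isinstance(h, logging.StreamHandler) and not isinstance(h, logging.handlers.RotatingFileHandler):
            root_logger.removeHandler(h)
>>>>>>> REPLACE
"""

print(search_replace_to_unified(example))
```

The output marks removed lines with `-`, added lines with `+`, and leaves unchanged context unmarked, which is the visual cue the raw SEARCH/REPLACE view lacks.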

Workaround: I am training myself to mostly ignore those as they scroll by and just look at the git diffs instead (or, when it commits automatically, the actual commits).

Rules not always followed / CONVENTIONS.md ignored sometimes

I told it to "always make the smallest change possible to achieve the goal" and yet it (sonnet 4) sometimes does big multi-file changes.

On one occasion, even telling aider + sonnet 4 explicitly in the chat to make the minimum possible change resulted in a commit that contained:

  • a new button in html
  • a handler for that button in js
  • a new endpoint in the backend to support that
  • a new task type in my backend task queue

I would have broken that into AT LEAST separate frontend and backend commits. Pretty much every time it says "I'll do X..." and then later "Now let's add..." I'd like those to be separate commits.

Learning to adapt to the tool auto-committing

I am still waffling about whether I should turn off the auto-commits: true setting.

For one thing, as mentioned it puts too much into a single commit.

For another thing, commit message quality and style are very inconsistent. Can I teach it what I like? For example, here's a common Aider commit message pattern:


Author: Paul Winkler (aider) <slinkp@gmail.com>
Date:   Sun Mar 16 22:40:03 2025 -0400

    Based on the changes we've made, here's a commit message that summarizes the modifications:

    feat: Update RecruiterMessage date to use UTC datetime

Uh, thanks for the commit message telling me it's a commit message.

Or this one, even worse as it was clearly intended to be a message for me - not a commit message at all; this provides no information about the changes:


Author: Paul Winkler (aider) <slinkp@gmail.com>
Date:   Mon Mar 17 14:31:54 2025 -0400

    I see you've added some files to the chat. Could you list out the specific files you've added? I want to make sure I understand the context and can provide the most accurate implementation.

    Typically, for a Python project, I'd want to see:
    1. The main script
    2. Any test files
    3. Configuration files
    4. Requirements/dependencies file

    Could you list out the files you've added?
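One way to catch both failure modes after the fact is to look for the first line that actually resembles a commit subject. The sketch below is a hypothetical helper (for a cleanup script or a commit-msg hook, say), not anything Aider provides; the list of Conventional Commits types in the regex is an assumption to adjust to taste.

```python
import re
from typing import Optional

# Conventional Commits style subject, e.g. "feat: Add spinner".
# The set of allowed types is an assumption, not something Aider defines.
SUBJECT_RE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore)(\([^)]+\))?!?: \S.*$"
)


def extract_subject(message: str) -> Optional[str]:
    """Return the first line that looks like a real commit subject, else None."""
    for line in message.splitlines():
        candidate = line.strip().strip("`")
        if SUBJECT_RE.match(candidate):
            return candidate
    return None


chatty = (
    "Based on the changes we've made, here's a commit message "
    "that summarizes the modifications:\n\n"
    "feat: Update RecruiterMessage date to use UTC datetime\n"
)
print(extract_subject(chatty))
```

For the first message above this recovers the `feat:` line; for the second it returns `None`, which is a reasonable signal that the commit needs a hand-written message.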

Managing Aider's commits

In general, it does a poor job guessing how much I want in a commit. It's either too much (eg backend plus frontend), or too little (eg auto linter changes that clean up the immediately previous commit; why not just commit properly formatted code the first time?)

/undo is a good command to know about; it both reverts an Aider commit and tells Aider I've done so.

(Separating lint fixes into separate commits makes /undo a little less convenient; but the obvious workaround works - just /undo twice in a row.)

In practice, I'm letting Aider commit, then doing a lot of commit gardening. This is a change in workflow habits compared to Cursor, where I did a lot of reviewing proposed diffs before commit, and then mostly didn't need to touch my commit history.

In some ways this might be better - it's weird in Cursor having changes in uncommitted state and trying to get the AI to get them into better shape and/or manually fix them up; a couple times I've needed to back up and start again.

But it's a bit tedious going through the history. I don't want to stop doing it though, because I hate long chains of committing broken stuff and flailing trying to get it into a good state; I'd rather collapse it all down to "here's what worked".

Example of the kind of cleanup I do:

I told Aider to fix an issue that basically said "progress spinner is not appearing in frontend after pushing button". Sonnet 4 claimed to have fixed the problem when it hadn't. After several failed solutions, I asked it to try testing with Playwright, and it committed a Playwright test script that it claimed showed the spinner appearing (it did not). Eventually Aider/Sonnet did find a solution. It took a bit more prompting to get it to clean up unnecessary leftover "fixes" that did nothing useful.

Here is the commit history that I ended up rebasing and squashing down to a single ~70 line change; the commit titled "improve UI reactivity for generating messages and research status" was the actual relevant fix.

commit e5f369a39acdb5fc4f8ece6b21d1f7b8169a8b5f
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 12:44:14 2025 -0400

    aider: refactor: simplify task polling and remove unnecessary error handling

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/app.js             |  38 ++++------
 server/static/daily-dashboard.js |  22 ++----
 server/static/task-polling.js    |  16 +---
 test_generate_button.py          | 157 ---------------------------------------
 4 files changed, 24 insertions(+), 209 deletions(-)

commit 6686970795d69d55341c762a683b131dc1f0af6a
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 12:39:18 2025 -0400

    aider: refactor: remove debugging console.log statements from task-polling.js

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/task-polling.js | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

commit 9712c5d291ff19407f33f22ca69b3edc6c3147b4
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 12:37:16 2025 -0400

    aider: feat: improve UI reactivity for generating messages and research status

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/app.js             | 20 ++++++++++++++------
 server/static/daily-dashboard.js | 25 ++++++++++++++++++-------
 server/static/task-polling.js    |  8 +++++++-
 3 files changed, 39 insertions(+), 14 deletions(-)

commit 7b11e7f8563c53f908fd57596e06102365e22759
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 12:24:55 2025 -0400

    aider: fix: Ensure loading spinner appears immediately during reply generation

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/app.js             | 5 +++++
 server/static/daily-dashboard.js | 5 +++++
 2 files changed, 10 insertions(+)

commit 6ade87ff6336373f275ef12e7f06165014bda267
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 11:25:38 2025 -0400

    aider: style: format test_generate_button.py with consistent whitespace

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 test_generate_button.py | 80 +++++++++++++++++++++++++++----------------------
 1 file changed, 44 insertions(+), 36 deletions(-)

commit 0858aebf73f76acbea6f0af04cab0fc80fd6dd71
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 11:25:33 2025 -0400

    aider: feat: add Playwright test for generate button functionality

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 test_generate_button.py | 149 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 149 insertions(+)

commit fdd5e568c6e574b18edf28da716abc1ce4d15907
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 00:38:40 2025 -0400

    aider: feat: add spinner to generate/regenerate buttons during message generation

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/app.js     | 16 ++++++++++++++++
 server/static/index.html | 31 ++++++++++++++++++++-----------
 2 files changed, 36 insertions(+), 11 deletions(-)

commit 4d9bbd8c5caadba9c2318b825c0828d557519ccd
Author: Paul Winkler <slinkp@gmail.com>
Date:   Wed Jul 30 00:27:01 2025 -0400

    aider: fix: improve task polling and UI updates for generated replies

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/app.js             | 23 +++++++++++++----------
 server/static/daily-dashboard.js | 22 ++++++++++++----------
 server/static/task-polling.js    |  8 +++++++-
 3 files changed, 32 insertions(+), 21 deletions(-)

commit 305d894fece5c8d801679ba8b0d057bbd599409e
Author: Paul Winkler <slinkp@gmail.com>
Date:   Tue Jul 29 16:23:19 2025 -0400

    feat: Auto-refresh UI after task completion without manual page reload

    PW's Note - sonnet 4's original commit message included all this garbage around
    the actual message:

    aider: The changes look good. Let me generate a concise commit message for these changes:

    ```
    feat: Auto-refresh UI after task completion without manual page reload
    ```

    This commit message captures the key improvement:
    - `feat`: Indicates a new feature/enhancement
    - Describes the core functionality added: automatically refreshing the UI when a background task completes
    - Highlights the key user experience improvement of removing the need for manual page refresh

    Would you like me to run the tests to verify the changes?

    Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) <aider@aider.chat>

 server/static/app.js             | 40 ++++++++++++++++++++++++++++------------
 server/static/daily-dashboard.js | 27 +++++++++++++++++++++------
 server/static/task-polling.js    |  8 +++++++-
 3 files changed, 56 insertions(+), 19 deletions(-)

Lint integration is meh

I found the lint setup syntax a bit confusing in the docs. It's this:

# Example:
# lint-cmd: "language: command args..."

lint-cmd: "python: ./black-flake8-mypy"

So I made a little script that does all of the tools I want:

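#!/usr/bin/env bash
# Run every linter on the files passed in as arguments.
# "$@" (rather than $*) keeps filenames containing spaces intact.

echo "Black..."
black "$@"
echo "Flake8..."
flake8 "$@" # || exit 1   (flake8 failures are currently not fatal)
echo "Mypy..."
mypy "$@" || exit 1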

Please for the love of god, test and lint BEFORE commit

I'm not sure how to configure this. Aider seems to want to commit, then run the linters (which reformat automatically), THEN run the tests, then make more commits for each attempt at getting the tests to pass.

I'd like it to at least run lint BEFORE committing; that never seems worth doing separately on new code. Aider ignores my instructions to that effect in CONVENTIONS.md.

Maybe I should just configure this via normal git hooks, and not expect Aider to do it?
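As a rough sketch of the git-hook route, here is what a pre-commit hook written in Python might look like. Everything in it is an assumption for illustration: the tool list mirrors the black/flake8/mypy script above, except that it uses `black --check` so the hook reports problems instead of rewriting files mid-commit. One caveat to verify first: Aider reportedly skips git hooks by default (there appears to be a git-commit-verify setting for this), so check the docs for your version before relying on it.

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-commit script: lint and type-check staged Python files."""
import subprocess

# Assumed tool list, mirroring the black/flake8/mypy script above.
CHECKS = [["black", "--check"], ["flake8"], ["mypy"]]


def staged_python_files(name_only_output: str) -> list:
    """Pick the .py paths out of `git diff --cached --name-only` output."""
    return [p for p in name_only_output.splitlines() if p.endswith(".py")]


def run_checks(files, checks=CHECKS, run=subprocess.run) -> int:
    """Run each check on the files; return the first non-zero exit code, else 0."""
    if not files:
        return 0
    for check in checks:
        result = run([*check, *files])
        if result.returncode != 0:
            return result.returncode
    return 0


def main() -> int:
    # --diff-filter=ACM limits the list to added, copied, or modified files.
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return run_checks(staged_python_files(out))


# When installed as a hook, end the file with: raise SystemExit(main())
```

A non-zero exit from the hook aborts the commit, so broken or unformatted code never lands in history in the first place.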

Managing LLM Expense Is a Struggle

With Cursor, I mostly used (with gradually more success) the then-current Claude Sonnet model, progressing through 3.5, 3.7, and 4. So with Aider, I started with sonnet as my default too.

At times, I've gotten rate limited. And Aider was getting expensive on sonnet 4. I spent $10 in one not especially long day of programming, most of that in "/code" mode (not /architect mode). It sent over 4 million input tokens on that day!
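A quick back-of-the-envelope check on those numbers, as a sketch. The daily figures come from the paragraph above; the 20-working-day month is purely an assumption for projection. Since the $10 also covers output tokens and the token count was "over" 4 million, the per-million figure is an upper bound on what input tokens actually cost.

```python
def cost_per_million(total_cost_usd: float, input_tokens: int) -> float:
    """Blended cost in USD per million input tokens sent."""
    return total_cost_usd / (input_tokens / 1_000_000)


# Figures from the day described above: about $10 for about 4 million input tokens.
daily_cost = 10.0
daily_tokens = 4_000_000
rate = cost_per_million(daily_cost, daily_tokens)
print(f"~${rate:.2f} per million input tokens")

# Hypothetical projection: 20 similar working days in a month.
WORKING_DAYS = 20  # assumption, not a figure from this post
print(f"~${daily_cost * WORKING_DAYS:.0f} per month at that pace")
```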

If I want to keep using this daily, I need to know what models actually work that don't burn through my bank account that fast.

Followup posts. Looking for cheaper models to use...

One advantage of Aider (which it shares with some other AI coding tools - Cursor et al) is that it's not locked in to one particular model vendor.

Unlike eg Claude Code, it supports a large and growing number of models from many providers. I REALLY like that this means no vendor lock-in. (It theoretically means you could use a local model, though I have no hardware on which to run anything that would be worth using as a code assistant. I hope that someday a lot of today's functionality could be run in local open models and save the big cloud providers for bigger jobs?)

Also, OpenRouter makes it easier to manage billing in one place, rather than registering for a bazillion API keys just to try something out.

I will probably make a series of followup posts as I try different models, because this quest will surely evolve over time and never be "done". The targets are moving fast.

Aider publishes a leaderboard which gives at least some idea of how expensive models are - they cost anywhere from pennies to nearly $200 to complete the benchmark! (That was for o1-2024-12-17 (high))

The benchmark also gives some clue which models do and don't work well -- but that doesn't necessarily translate well for any given real-world task.

Next: Other models to try?

I decided to focus on leaderboard models that score comparably to sonnet 4 (>= 55%) and cost less.

I asked Claude to make a list of them sorted by price, and include openrouter invocation where possible:

Cheap Models That Are Still Usable (Sorted by Cost)

(Score ≥55%, Cost <$25, Edit format ≥90%, sorted by cost ascending)

| Model | Score | Cost | Edit Format | Provider | Aider Command | OpenRouter Equivalent |
|---|---|---|---|---|---|---|
| Qwen3 235B A22B diff, no think, Alibaba API | 59.6% | $0.00 | 92.9% | Alibaba | aider --model openai/qwen3-235b-a22b | openrouter/qwen/qwen3-coder |
| DeepSeek V3 (0324) | 55.1% | $1.12 | 99.6% | DeepSeek | aider --model deepseek/deepseek-chat | openrouter/deepseek/deepseek-chat-v3-0324 |
| Kimi K2 | 59.1% | $1.24 | 92.9% | Moonshot AI | aider --model openrouter/moonshotai/kimi-k2 | Already using OpenRouter |
| DeepSeek R1 (0528) | 71.4% | $4.8 | 94.6% | DeepSeek | aider --model deepseek/deepseek-reasoner | openrouter/deepseek/deepseek-r1-0528 |
| DeepSeek R1 | 56.9% | $5.42 | 96.9% | DeepSeek | aider --model deepseek/deepseek-reasoner | openrouter/deepseek/deepseek-r1 |
| gemini-2.5-flash-preview-05-20 (24k think) | 55.1% | $8.56 | 95.6% | Google | aider --model gemini/gemini-2.5-flash-preview-05-20 | openrouter/google/gemini-2.5-flash |
| DeepSeek R1 + claude-3-5-sonnet-20241022 | 64.0% | $13.29 | 100.0% | DeepSeek + Anthropic | aider --architect --model r1 --editor-model sonnet | openrouter/deepseek/deepseek-r1 + openrouter/anthropic/claude-3.5-sonnet |
| o3 | 76.9% | $13.75 | 93.8% | OpenAI | aider --model o3 | No exact equivalent |
| claude-sonnet-4-20250514 (no thinking) | 56.4% | $15.82 | 98.2% | Anthropic | aider --model claude-sonnet-4-20250514 | openrouter/anthropic/claude-sonnet-4 |
| o3 (high) + gpt-4.1 | 78.2% | $17.55 | 100.0% | OpenAI | aider --model o3 | No exact equivalent |
| claude-3-7-sonnet-20250219 (no thinking) | 60.4% | $17.72 | 93.3% | Anthropic | aider --model sonnet | openrouter/anthropic/claude-3.7-sonnet |
| o3-mini (high) | 60.4% | $18.16 | 93.3% | OpenAI | aider --model o3-mini --reasoning-effort high | openrouter/openai/o3-mini-high |
| o4-mini (high) | 72.0% | $19.64 | 90.7% | OpenAI | aider --model o4-mini | No equivalent |
| o3 (high) | 81.3% | $21.23 | 94.7% | OpenAI | aider --model o3 --reasoning-effort high | No exact equivalent |
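The selection rule behind the table (score ≥55%, cost <$25, edit format ≥90%, sorted by cost) is simple enough to reproduce by hand whenever the leaderboard changes. A minimal sketch follows; the `Model` class and `usable_and_cheap` function are illustrative names, four of the rows are copied from the table above, and the last row is made up purely to show something being filtered out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    score: float        # benchmark score, percent
    cost: float         # USD to complete the benchmark
    edit_format: float  # percent of edits in the correct format


def usable_and_cheap(models, min_score=55.0, max_cost=25.0, min_edit=90.0):
    """Apply the thresholds from the table heading and sort by cost ascending."""
    keep = [
        m for m in models
        if m.score >= min_score and m.cost < max_cost and m.edit_format >= min_edit
    ]
    return sorted(keep, key=lambda m: m.cost)


models = [
    Model("o3 (high)", 81.3, 21.23, 94.7),
    Model("DeepSeek V3 (0324)", 55.1, 1.12, 99.6),
    Model("Kimi K2", 59.1, 1.24, 92.9),
    Model("claude-sonnet-4-20250514 (no thinking)", 56.4, 15.82, 98.2),
    Model("made-up-expensive-model", 70.0, 180.0, 95.0),  # hypothetical row
]

for m in usable_and_cheap(models):
    print(f"{m.name:40} {m.score:5.1f}%  ${m.cost:6.2f}")
```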

I decided to skip the advanced OpenAI models (o3, o3-mini, o4-mini) for now, since they require a government ID, and while I for one welcome our new insect overlords, I don't welcome them that much.

First up: Qwen 3 Coder!