
Sonnet 3.5 is using a lot of output tokens, hitting 4k output token limit #705

Closed
vessenes opened this issue Jun 22, 2024 · 32 comments
Labels
bug Something isn't working

Comments

@vessenes

Issue

Asking for a large-scale change with sonnet, I see this output:

Model claude-3-5-sonnet-20240620 has hit a token limit!

Input tokens: 4902 of 200000
Output tokens: 3220 of 4096
Total tokens: 8122 of 200000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html

None of these numbers is over the stated limit. However, here is what my anthropic API console returns:

Jun 21, 2024 at 7:56 PM claude-3-5-sonnet-20240620 39.10 5068 4096 sse None

The "None" is in the "Error" column. The 4096 is tokens generated. So, it looks like we actually got 4096 tokens out of Anthropic, but either we're using the wrong tokenizer for aider, or ... ?

Version and model info

Aider v0.39.0, called with aider --sonnet

@vessenes (Author)

P.S. I note that aider's input token calculation is wrong as well.

@paul-gauthier (Owner)

Thanks for trying aider and filing this issue. Others have reported similar issues recently. Your confirmation that sonnet did indeed return 4k tokens is very helpful info.

@paul-gauthier paul-gauthier added the bug Something isn't working label Jun 22, 2024
@NeuralAIM

Same with DeepSeek Coder

@Emasoft

Emasoft commented Jun 22, 2024

I can confirm this. I can't work with Sonnet 3.5 because it stops after a while and prints an error about the token limit being reached, even though the limit was not reached.

@paul-gauthier (Owner)

paul-gauthier commented Jun 22, 2024

It is reaching the output limit. The token counts being shown aren't accurate because Anthropic hasn't released a tokenizer for these models. But Anthropic itself is returning the token limit error.
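For context on why the displayed counts are approximate: without an official tokenizer, tools commonly fall back to a rough heuristic (such as characters divided by 4) or to another model's tokenizer. The sketch below is purely illustrative of that kind of fallback, not aider's actual code:

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate, used when no official tokenizer exists.

    A common fallback is the characters/4 heuristic. Estimates like this
    are why a report of "~3,220 of 4,096" output tokens can disagree with
    the count the provider's API actually returns.
    """
    return max(1, len(text) // 4)
```

An estimate like this can easily be off by several hundred tokens on real code, which matches the mismatch reported in this thread.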

@paul-gauthier (Owner)

4k tokens is like ~hundreds of lines of code. Are you guys routinely asking for a single change that involves hundreds of lines of code?

I understand that a refactor might tend in this direction. I can certainly concoct a situation to force this to happen.

But I'd really love if folks could share some example output from .aider.chat.history.md that shows this problem happening naturally "in the wild".

@Yona544

Yona544 commented Jun 23, 2024

=======
procedure TfrmMasterDetail.SaveSQL1Click(Sender: TObject);
var
DBGrid: TDBGrid;
MenuItem: TMenuItem;
Query: TFDQuery;
JSONFile: string;
JSONArray,

Model claude-3-5-sonnet-20240620 has hit a token limit!

Input tokens: 9790 of 200000
Output tokens: 3037 of 4096
Total tokens: 12827 of 200000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html
Did not ask for hundreds of lines of code.

see file
.aider.chat.history.md

@paul-gauthier (Owner)

Thanks @Yona544, I really appreciate that. Unfortunately it appears that the model output doesn't get saved to .aider.chat.history.md when this error occurs! I have updated the main branch to improve logging for this.

@Emasoft

Emasoft commented Jun 24, 2024

@paul-gauthier

4k tokens is like ~hundreds of lines of code. Are you guys routinely asking for a single change that involves hundreds of lines of code?

I don't understand your surprise at hearing that we reach the 4k limit so often. If a project is anything more than a small script, I usually reach that limit every 3-4 exchanges with the AI. And yes, a single change often needs to modify hundreds of lines of code. For example, yesterday I found a bug caused by a function that needed 4 parameters to work correctly but was only accepting 2 arguments (the mistake was assuming those 2 additional parameters were constants, when they were not). So aider changed the function definition from 2 to 4 arguments. But (and this is the main point) it also had to change every place in the code that called that function, ensuring that each time 4 variables were correctly instantiated and passed instead of only 2. Since that function was used very often in the program, the code changes were easily above 4k tokens, and probably much more than that. That scenario is a common occurrence when fixing bugs.

Anyway, I found a TEMPORARY WORKAROUND until this is fixed.

I created a new custom configuration file called sonnet_cfg.json with a reduced token limit (3072 instead of 4096). After many tests, I found that this is the maximum value you can use to compensate for aider's estimation error (a buffer of 1024 tokens). This is the full content of the file:

{
    "claude-3-5-sonnet-20240620": {
        "max_tokens": 3072,
        "max_input_tokens": 200000,
        "max_output_tokens": 3072,
        "input_cost_per_token": 0.000003,
        "output_cost_per_token": 0.000015,
        "litellm_provider": "anthropic",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": true,
        "tool_use_system_prompt_tokens": 159
    },
    "claude-3-haiku-20240307": {
        "max_tokens": 4096,
        "max_input_tokens": 200000,
        "max_output_tokens": 4096,
        "input_cost_per_token": 0.00000025,
        "output_cost_per_token": 0.00000125,
        "litellm_provider": "anthropic",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": true,
        "tool_use_system_prompt_tokens": 264
    }
}

All you need to do is create the above file and then launch aider with this command:

aider --model-metadata-file sonnet_cfg.json --model claude-3-5-sonnet-20240620 --weak-model claude-3-haiku-20240307 --edit-format diff --map-tokens 1024 --restore-chat-history

Note that you MUST use the 'diff' edit format for this trick to be effective. It is a temporary solution, but it works for me. I have never hit the token limit error since.

PS: You can remove "--restore-chat-history" from the command if you want to save tokens (it doesn't work anyway...)

@paul-gauthier (Owner)

Modifying the model metadata doesn't affect whether a token limit is hit. Aider doesn't use the max_output_tokens value for anything (except printing it). Aider doesn't enforce the token limit. Anthropic's API server returns the token limit error because the underlying LLM has errored out.

All the OpenAI models and Opus have the same 4k output token limit. But no one ever reported this output token limit issue until Sonnet launched.

I have confirmed that Sonnet is really "chatty" with its SEARCH/REPLACE blocks. It often includes a ton of unchanging code in both the search and replace sections.

For example, I made a code request to change 2 lines spread apart in a large source file. Sonnet made a SEARCH/REPLACE block that included all the intervening lines!

I've updated the prompts to dissuade Sonnet from this behavior. The change is available in the main branch.

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

I would really appreciate feedback from anyone who is able to try this improved version.

@Emasoft

Emasoft commented Jun 24, 2024

Modifying the model metadata doesn't affect whether a token limit is hit. Aider doesn't use the max_output_tokens value for anything (except printing it). Aider doesn't enforce the token limit. Anthropic's API server returns the token limit error because the underlying LLM has errored out.

Ok, but no matter what you say, my trick works. If I remove it and revert to the 4096 value, I get the token limit error again. Why? IMO all aider needs to do for my trick to work is to pass the custom model configuration to LiteLLM. And it does. And that is all we need to avoid the limit error. Because then LiteLLM will use the max_tokens value when calling the completion API of Anthropic, and that will result in fewer tokens sent back by Anthropic as a response. Maybe Sonnet will 'regulate' itself by answering with 3072 tokens instead of 4096, I don't know. That will be enough to prevent LiteLLM or aider from (over)estimating (due to the wrong tokenizer, or whatever issue is behind this bug) a response length greater than 4096.

Here is Aider passing the custom model to Litellm with register_model():

litellm.register_model(model_def)

...and here is LiteLLM register_model() overriding the max_tokens original parameter:

https://github.com/BerriAI/litellm/blob/769c60e4f6e6e98e91e0619b5c20906d1c43cb20/litellm/utils.py#L1927
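The mechanism being described can be sketched abstractly. This is an illustrative model of "register metadata, then build the provider request from it", not LiteLLM's actual internals; the function names and registry are hypothetical:

```python
# Hypothetical sketch of metadata registration overriding max_tokens.
MODEL_REGISTRY = {}

def register_model(model_defs):
    # Loosely mimics litellm.register_model: user-supplied metadata
    # overrides the built-in defaults for the listed models.
    MODEL_REGISTRY.update(model_defs)

def build_request(model, messages):
    # If the provider request inherits max_tokens from the registered
    # metadata, a smaller registered value caps how many tokens the API
    # is allowed to generate for the response.
    meta = MODEL_REGISTRY.get(model, {})
    return {
        "model": model,
        "messages": messages,
        "max_tokens": meta.get("max_tokens", 4096),
    }
```

Under this model, registering 3072 for Sonnet would make every completion request ask for at most 3072 output tokens, which is consistent with the workaround's observed effect.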

I would really appreciate feedback from anyone who is able to try this improved version.

I will test it as soon as possible.

@Yona544

Yona544 commented Jun 24, 2024

I'm using the new version and got

Model claude-3-5-sonnet-20240620 has hit a token limit!
Token counts below are approximate.

Input tokens: ~15,227 of 200,000
Output tokens: ~2,991 of 4,096 -- possibly exceeded output limit!
Total tokens: ~18,218 of 200,000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html

.aider.chat.history.md

Hope this is helpful in resolving the issue.

In a case like this, I go back and ask aider to break down the task and do one part at a time. Maybe if aider can detect this token-limit hit, it could automatically re-prompt to generate a list of actions and feed one action at a time until done.

@Emasoft

Emasoft commented Jun 24, 2024

I also tested the new version (Aider v0.39.1-dev), and the problem persists:

Model claude-3-5-sonnet-20240620 has hit a token limit!  
Token counts below are approximate.  

Input tokens: ~24,889 of 200,000  
Output tokens: ~3,218 of 4,096 -- possibly exceeded output limit!  
Total tokens: ~28,107 of 200,000  
 
For more info: https://aider.chat/docs/troubleshooting/token-limits.html  

This is the chat transcript:

Aider.chat.history.zip

Hope this helps.

@Yona544

Yona544 commented Jun 24, 2024

@Emasoft, when using the sonnet_cfg.json file as you described, I'm getting:

(aider) Y:\Projects\posexport>aider --model-metadata-file sonnet_cfg.json --model claude-3-5-sonnet-20240620 --weak-model claude-3-haiku-20240307 --edit-format diff --map-tokens 1024 --restore-chat-history

Loaded 1 litellm model file(s)

  • sonnet_cfg.json
    Aider v0.39.1-dev
    Models: claude-3-5-sonnet-20240620 with diff edit format, weak model claude-3-haiku-20240307
    Git repo: \\192.168.0.35\YonaVmDataShare\Projects\posexport.git with 27 files
    Repo-map: using 1024 tokens
    Restored previous conversation history.

summarizer unexpectedly failed for claude-3-haiku-20240307

@Emasoft

Emasoft commented Jun 24, 2024

@Yona544 No idea. Works perfectly fine for me. Have you tried omitting --restore-chat-history? If your previous model configuration was much different, it may not be restored correctly.

@paul-gauthier paul-gauthier changed the title incorrect token limit warning with sonnet 3.5 Sonnet 3.5 is using a lot of output tokens, hitting 4k output token limit Jun 24, 2024
@paul-gauthier (Owner)

I just released v0.40.0 which has even stronger prompting to stop Sonnet from wasting so many tokens on unchanging chunks of code. If you have a chance to upgrade and try it, please let me know how it works for you.

@thiswillbeyourgithub

I don't know if this is helpful, but although Anthropic has not released a tokenizer, they do provide access to a token-counting function.

As per their chatbot:

import anthropic

client = anthropic.Anthropic()
total_tokens = client.count_tokens("Sample text")

(I'm thinking this could be used to reverse-engineer the tokenizer, so maybe look for that on GitHub.)

@Emasoft

Emasoft commented Jun 24, 2024

@paul-gauthier I tested the 0.40.0 and this is the result:

Model claude-3-5-sonnet-20240620 has hit a token limit!  
Token counts below are approximate.  

Input tokens: ~28,911 of 200,000  
Output tokens: ~3,277 of 4,096 -- possibly exceeded output limit!  
Total tokens: ~32,188 of 200,000  

For more info: https://aider.chat/docs/troubleshooting/token-limits.html  

aider.chat.history40.md.zip

It seems that the issue was not resolved. Hope this helps.

@cngarrison

I'm running 0.40.0 and got a token limit error today.

I'd like to propose a different solution, which has worked for me in other cases (automated code) where I hit the output token limit. Get Claude to "keep speaking". Send another request, with the last message an "assistant" role with contents of previous response. Claude will pick up where he left off from before. I use this technique to get complete XML blocks that are well over 4k tokens. Claude handles it very well.

Here is a simple example:

The docs discuss prefilling the assistant response as a supported technique, but for different use-cases:

So, getting Claude to "keep speaking" would allow for code search/replace blocks well in excess of the 4k token limit.
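The "keep speaking" loop can be sketched generically. In the sketch below, `complete` is a hypothetical stand-in for the actual API call (a real implementation would use the Messages API's assistant-prefill behavior and check for `stop_reason == "max_tokens"`); it returns the generated text plus a flag saying whether the output limit was hit:

```python
def continue_until_done(complete, base_messages, max_rounds=10):
    """Stitch together a response that was cut off at the output-token limit.

    `complete(messages)` is a stand-in for an LLM API call; it must return
    a (text, hit_limit) tuple. While the response is truncated, the partial
    answer is resent as a trailing assistant message, so the model resumes
    exactly where it left off.
    """
    parts = []
    for _ in range(max_rounds):
        msgs = list(base_messages)
        if parts:
            # Prefill: the model continues from the end of this content.
            msgs.append({"role": "assistant", "content": "".join(parts)})
        text, hit_limit = complete(msgs)
        parts.append(text)
        if not hit_limit:
            break
    return "".join(parts)
```

The `max_rounds` cap is a safety valve so a model that never stops can't loop forever; the concatenated pieces are then parsed as one response.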

@ackoi

ackoi commented Jun 25, 2024

I'm running 0.40.0 and got a token limit error today.

I'd like to propose a different solution, which has worked for me in other cases (automated code) where I hit the output token limit. Get Claude to "keep speaking". Send another request, with the last message an "assistant" role with contents of previous response. Claude will pick up where he left off from before. I use this technique to get complete XML blocks that are well over 4k tokens. Claude handles it very well.

Here is a simple example:

The docs discuss prefilling the assistant response as a supported technique, but for different use-cases:

So, getting Claude to "keep speaking" would allow for code search/replace blocks well in excess of the 4k token limit.

I agree with this, this is what I recommended yesterday as well: #716

@paul-gauthier (Owner)

It may indeed be useful to add some "continue" functionality. But the root cause of the problem is that Sonnet has been wasting thousands of tokens by outputting tons of unchanging code. It's important to address that root cause first, before assessing the need for additional workarounds.

@Yona544

Yona544 commented Jun 25, 2024

Model claude-3-5-sonnet-20240620 has hit a token limit!
Token counts below are approximate.

Input tokens: ~24,884 of 200,000
Output tokens: ~3,027 of 4,096 -- possibly exceeded output limit!
Total tokens: ~27,911 of 200,000

Aider v0.40.1
Models: claude-3-5-sonnet-20240620 with diff edit format, weak model claude-3-haiku-20240307
Git repo: \\192.168.0.35\YonaVmDataShare\Projects\posexport.git with 28 files
Repo-map: using 1024 tokens

.aider.chat.history.md

@Emasoft

Emasoft commented Jun 25, 2024

I don't think that any prompt would make Sonnet output less code. Or at least not without negative consequences. Maybe the reason it's good at coding is the same reason why it outputs more code. Perhaps rewriting part of the code helps it be more accurate when predicting the next token. Token prediction is based on what precedes it after all. Maybe the best solution would be to implement the "continue" trick.

@vadi2

vadi2 commented Jun 25, 2024

I have to agree, when using Sonnet outside of aider it's a breath of fresh air when it returns the entire snippet of code instead of just giving you a diff.

@cngarrison

It may indeed be useful to add some "continue" functionality. But the root cause of the problem is that Sonnet has been wasting thousands of tokens by outputting tons of unchanging code. It's important to address that root cause first, before assessing the need for additional workarounds.

If you can get Sonnet to be more terse, that's a good quick win.

But the problem of exceeding the output token limit will always be there, even if it doesn't happen as frequently (e.g. I asked aider for a new file today; there was no diffing to make it smaller). Getting Claude to continue speaking will solve that.

If you can suggest which file you think the "loop & continue" logic belongs in, I can create a PR for it. (Python isn't my native language.)

@unphased

I'm hitting the token limit because I'm trying to do too many things in one go, but usually the issue is that Sonnet includes too much context.

Example I just hit:

src/config.rs


 <<<<<<< SEARCH
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
     pub blame_separator_format: BlameLineNumbers,
     pub blame_palette: Vec<String>,
     pub blame_separator_style: Option<Style>,
     pub blame_timestamp_format: String,
     pub blame_timestamp_output_format: Option<String>,
     pub color_only: bool,
     pub commit_regex: Regex,
     pub commit_style: Style,
     pub cwd_of_delta_process: Option<PathBuf>,
     pub cwd_of_user_shell_process: Option<PathBuf>,
     pub cwd_relative_to_repo_root: Option<String>,
     pub decorations_width: cli::Width,
     pub default_language: String,
     pub diff_stat_align_width: usize,
     pub error_exit_code: i32,
     pub file_added_label: String,
     pub file_copied_label: String,
     pub file_modified_label: String,
     pub file_removed_label: String,
     pub file_renamed_label: String,
     pub file_regex_replacement: Option<RegexReplacement>,
     pub right_arrow: String,
     pub file_style: Style,
     pub git_config: Option<GitConfig>,
     pub git_minus_style: Style,
     pub git_plus_style: Style,
     pub grep_context_line_style: Style,
     pub grep_file_style: Style,
     pub classic_grep_header_file_style: Style,
     pub classic_grep_header_style: Style,
     pub ripgrep_header_style: Style,
     pub grep_line_number_style: Style,
     pub grep_match_line_style: Style,
     pub grep_match_word_style: Style,
     pub grep_output_type: Option<GrepType>,
     pub grep_separator_symbol: String,
     pub handle_merge_conflicts: bool,
     pub hunk_header_file_style: Style,
     pub hunk_header_line_number_style: Style,
     pub hunk_header_style_include_file_path: HunkHeaderIncludeFilePath,
     pub hunk_header_style_include_line_number: HunkHeaderIncludeLineNumber,
     pub hunk_header_style_include_code_fragment: HunkHeaderIncludeCodeFragment,
     pub hunk_header_style: Style,
     pub hunk_label: String,
     pub hyperlinks_commit_link_format: Option<String>,
     pub hyperlinks_file_link_format: String,
     pub hyperlinks: bool,
     pub inline_hint_style: Style,
     pub inspect_raw_lines: cli::InspectRawLines,
     pub keep_plus_minus_markers: bool,
     pub line_buffer_size: usize,
     pub line_fill_method: BgFillMethod,
     pub line_numbers_format: LeftRight<String>,
     pub line_numbers_style_leftright: LeftRight<Style>,
     pub line_numbers_style_minusplus: MinusPlus<Style>,
     pub line_numbers_zero_style: Style,
     pub line_numbers: bool,
     pub styles_map: Option<HashMap<style::AnsiTermStyleEqualityKey, Style>>,
     pub max_line_distance_for_naively_paired_lines: f64,
     pub max_line_distance: f64,
     pub max_line_length: usize,
     pub merge_conflict_begin_symbol: String,
     pub merge_conflict_ours_diff_header_style: Style,
     pub merge_conflict_theirs_diff_header_style: Style,
     pub merge_conflict_end_symbol: String,
     pub minus_emph_style: Style,
     pub minus_empty_line_marker_style: Style,
     pub minus_file: Option<PathBuf>,
     pub minus_non_emph_style: Style,
     pub minus_style: Style,
     pub navigate_regex: Option<String>,
     pub navigate: bool,
     pub null_style: Style,
     pub null_syntect_style: SyntectStyle,
     pub pager: Option<String>,
     pub paging_mode: PagingMode,
     pub plus_emph_style: Style,
     pub plus_empty_line_marker_style: Style,
     pub plus_file: Option<PathBuf>,
     pub plus_non_emph_style: Style,
     pub plus_style: Style,
     pub relative_paths: bool,
     pub show_themes: bool,
     pub side_by_side_data: side_by_side::SideBySideData,
     pub side_by_side: bool,
     pub syntax_set: SyntaxSet,
     pub syntax_theme: Option<SyntaxTheme>,
     pub tab_cfg: utils::tabs::TabCfg,
     pub tokenization_regex: Regex,
     pub true_color: bool,
     pub truncation_symbol: String,
     pub whitespace_error_style: Style,
     pub wrap_config: WrapConfig,
     pub zero_style: Style,
 }
 =======
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
     pub blame_separator_format: BlameLineNumbers,
     pub blame_palette: Vec<String>,
     pub blame_separator_style: Option<Style>,
     pub blame_timestamp_format: String,
     pub blame_timestamp_output_format: Option<String>,
     pub color_only: bool,
     pub commit_regex: Regex,
     pub commit_style: Style,
     pub cwd_of_delta_process: Option<PathBuf>,
     pub cwd_of_user_shell_process: Option<PathBuf>,
     pub cwd_relative_to_repo_root: Option<String>,
     pub decorations_width: cli::Width,
     pub default_language: String,
     pub diff_stat_align_width: usize,
     pub error_exit_code: i32,
     pub file_added_label: String,
     pub file_copied_label: String,
     pub file_modified_label: String,
     pub file_removed_label: String,
     pub file_renamed_label: String,
     pub file_regex_replacement: Option<RegexReplacement>,
     pub right_arrow: String,
     pub file_style: Style,
     pub git_config: Option<GitConfig>,
     pub git_minus_style: Style,
     pub git_plus_style: Style,
     pub grep_context_line_style: Style,
     pub grep_file_style: Style,
     pub classic_grep_header_file_style: Style,
     pub classic_grep_header_style: Style,
     pub ripgrep_header_style: Style,
     pub grep_line_number_style: Style,
     pub grep_match_line_style: Style,
     pub grep_match_word_style: Style,
     pub grep_output_type: Option<GrepType>,
     pub grep_separator_symbol: String,
     pub handle_merge_conflicts: bool,
     pub hunk_header_file_style: Style,
     pub hunk_header_line_number_style: Style,
     pub hunk_header_style_include_file_path: HunkHeaderIncludeFilePath,
     pub hunk_header_style_include_line_number: HunkHeaderIncludeLineNumber,
     pub hunk_header_style_include_code_fragment: HunkHeaderIncludeCodeFragment,
     pub hunk_header_style: Style,
     pub hunk_label: String,
     pub hyperlinks_commit_link_format: Option<String>,
     pub hyperlinks_file_link_format: String,
     pub hyperlinks: bool,
     pub inline_hint_style: Style,
     pub inspect_raw_lines: cli::InspectRawLines,
     pub keep_plus_minus_markers: bool,
     pub line_buffer_size: usize,
     pub line_fill_method: BgFillMethod,
     pub line_numbers_format: LeftRight<String>,
     pub line_numbers_style_leftright: LeftRight<Style>,
     pub line_numbers_style_minusplus: MinusPlus<Style>,
     pub line_numbers_zero_style: Style,
     pub line_numbers: bool,
     pub styles_map: Option<HashMap<style::AnsiTermStyleEqualityKey, Style>>,
     pub max_line_distance_for_naively_paired_lines: f64,
     pub max_line_distance: f64,
     pub max_line_length: usize,
     pub merge_conflict_begin_symbol: String,
     pub merge_conflict_ours_diff_header_style: Style,
     pub merge_conflict_theirs_diff_header_style: Style,
     pub merge_conflict_end_symbol: String,
     pub minus_emph_style: Style,
     pub minus_empty_line_marker_style: Style,
     pub minus_file: Option<PathBuf>,
     pub minus_non_emph_style: Style,
     pub minus_style: Style,
     pub navigate_regex: Option<String>,
     pub navigate: bool,
     pub null_style: Style,
     pub null_syntect_style: SyntectStyle,
     pub pager: Option<String>,
     pub paging_mode: PagingMode,
     pub plus_emph_style: Style,
     pub plus_empty_line_marker_style: Style,
     pub plus_file: Option<PathBuf>,
     pub plus_non_emph_style: Style,
     pub plus_style: Style,
     pub relative_paths: bool,
     pub show_themes: bool,
     pub side_by_side_data: side_by_side::SideBySideData,
     pub side_by_side: bool,
     pub syntax_set: SyntaxSet,
     pub syntax_theme: Option<SyntaxTheme>,
     pub tab_cfg: utils::tabs::TabCfg,
     pub tokenization_regex: Regex,
     pub true_color: bool,
     pub truncation_symbol: String,
     pub whitespace_error_style: Style,
     pub wrap_config: WrapConfig,
     pub zero_style: Style,
     pub file_decorator_filter: Option<String>,
     pub file_decorator_hyperlink_format: Option<String>,
 }
 >>>>>>> REPLACE

(code is from the repo https://github.com/dandavison/delta)

I applied some natural-language coaxing as follows, and you can see it was successful.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
src/config.rs src/delta.rs src/handlers/diff_header.rs
> You should be able to see that you wrote too much, exceeding the token limit. PLEASE DO NOT PLACE EXCESSIVE CONTEXT IN THE SEARCH REPLACE BLOCKS. YOU SHOULD TRUNCATE that block for config so that we don't waste thousands of tokens on unchanging code. KEEP THE SEARCH REPLACE BLOCKS TERSE !!!PLEASE!!!. That will allow us to do larger tasks in one go. Trust that the diff/patch algorithm will find the appropriate code you're referencing. Let's try this again.

I apologize for the previous verbose response. You're right, I should keep the SEARCH/REPLACE blocks concise. Let's implement the requested changes step by step.

First, let's add a new configuration option in src/config.rs:

src/config.rs


 <<<<<<< SEARCH
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
 =======
 pub struct Config {
     pub available_terminal_width: usize,
     pub background_color_extends_to_terminal_width: bool,
     pub blame_code_style: Option<Style>,
     pub blame_format: String,
     pub file_decorator_filter: Option<String>,
 >>>>>>> REPLACE


Now, let's modify src/delta.rs to use this new configuration option:

src/delta.rs


 <<<<<<< SEARCH
 impl<'a> StateMachine<'a> {
     pub fn new(writer: &'a mut dyn Write, config: &'a Config) -> Self {
         Self {
             line: "".to_string(),
             raw_line: "".to_string(),
             state: State::Unknown,
 =======
 impl<'a> StateMachine<'a> {
     pub fn new(writer: &'a mut dyn Write, config: &'a Config) -> Self {
         Self {
             line: "".to_string(),
             raw_line: "".to_string(),
             state: State::Unknown,
             file_decorator_filter: config.file_decorator_filter.clone(),
 >>>>>>> REPLACE


Finally, let's update src/handlers/diff_header.rs to use the filter:

src/handlers/diff_header.rs


 <<<<<<< SEARCH
 pub fn write_generic_diff_header_header_line(
     line: &str,
     raw_line: &str,
     painter: &mut Painter,
     mode_info: &mut String,
     config: &Config,
 ) -> std::io::Result<()> {
     // If file_style is "omit", we'll skip the process and print nothing.
     // However in the case of color_only mode,
     // we won't skip because we can't change raw_line structure.
     if config.file_style.is_omitted && !config.color_only {
         return Ok(());
     }
 =======
 pub fn write_generic_diff_header_header_line(
     line: &str,
     raw_line: &str,
     painter: &mut Painter,
     mode_info: &mut String,
     config: &Config,
 ) -> std::io::Result<()> {
     if config.file_style.is_omitted && !config.color_only {
         return Ok(());
     }
     if let Some(filter) = &config.file_decorator_filter {
         if !line.contains(filter) {
             return Ok(());
         }
     }
 >>>>>>> REPLACE


These changes add a new configuration option for filtering file decorators and implement basic filtering logic. You may need to add more complex filtering or hyperlink generation based on your specific requirements.


Applied edit to src/handlers/diff_header.rs
Applied edit to src/delta.rs
Applied edit to src/config.rs
Commit a44b40f Implemented file decorator filter in config and delta state machine.
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
src/config.rs src/delta.rs src/handlers/diff_header.rs
>

@paul-gauthier (Owner)

The main branch has experimental support for continuing Sonnet responses that hit the 4k output limit. It should happen seamlessly without any user intervention required. You can install the latest like this:

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

If you have a chance to try it, please let me know if it works better for you.

@cngarrison

If you have a chance to try it, please let me know if it works better for you.

I asked Claude to add some swagger doc comments to all endpoints, lots of tokens - it worked a treat.

The terminal output got "messed up" when Claude was first asked to keep speaking; I wasn't paying close enough attention to describe the symptoms. It was in the middle of a diff, and the diff formatting disappeared when speaking resumed.

The diffs applied cleanly though. (One of the diffs needed to be regenerated due to errors, but that worked normally.)

So aside from some UI glitches, the "keep speaking" feature is working great for me. 👍

Thanks heaps for this feature - it's a real game changer. I can be more ambitious when asking Claude for larger features.

@paul-gauthier (Owner)

The main branch has fixes for the rendering glitches that have been happening when aider asks Sonnet to continue its response.

@Emasoft

Emasoft commented Jul 1, 2024

Tested 0.40.7. Works like a charm. It is a monster. It refactors files of any size like it is nothing. The continue trick with Sonnet is truly the holy grail. Aider beats GitHub Copilot and Cursor hands down. I'm going to cancel both subscriptions.

@ackoi

ackoi commented Jul 1, 2024

The main branch has experimental support for continuing Sonnet responses that hit the 4k output limit. It should happen seamlessly without any user intervention required. You can install the latest like this:

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

If you have a chance to try it, please let me know if it works better for you.

Much appreciated, giving it a try today!

@paul-gauthier (Owner)

This all went out in v0.41.0 today.

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.
