Automatic Weight Calc based on NearSwap #179

Closed
wants to merge 3 commits into from

@Steel-skull Steel-skull commented Feb 25, 2024

Adds a merge method based on NearSwap.

It uses DARE/TIES and adds automatic weight calculation based on near-tuned interpolation on a per-parameter basis, so theoretically it is gradient weights on steroids.
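
For readers unfamiliar with the idea, here is a minimal sketch of NearSwap-style per-parameter weighting. It is not the code in this PR and does not show the DARE/TIES integration. It assumes the interpolation weight for each parameter is `t / |base - other|` clipped to `[0, 1]`, so parameters that are already close get swapped fully and distant ones barely move. Plain Python lists stand in for tensors, and the names `nearswap_weights`, `nearswap`, and `t` are hypothetical.

```python
def nearswap_weights(base, other, t):
    """Per-parameter interpolation weights (assumed form: t / |base - other|, clipped to [0, 1])."""
    weights = []
    for b, o in zip(base, other):
        diff = abs(b - o)
        # Identical parameters: the weight has no effect, so use 1.0 and avoid dividing by zero.
        w = 1.0 if diff == 0.0 else min(1.0, t / diff)
        weights.append(w)
    return weights


def nearswap(base, other, t):
    """Linearly interpolate each parameter from base toward other using its own weight."""
    weights = nearswap_weights(base, other, t)
    return [b * (1.0 - w) + o * w for b, o, w in zip(base, other, weights)]


# Toy example: three "parameters" with differences of 0, 0.5, and 4.
base = [0.0, 1.0, 2.0]
other = [0.0, 1.5, 6.0]
merged = nearswap(base, other, t=1.0)
print(merged)
```

Under this assumed weighting, no parameter moves by more than `t`: close parameters take the other model's value outright, and far ones shift by exactly `t` toward it.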

I'm looking at turning this into an automatic density function that feeds back into the weight function by calculating a gradient/variance-based density map, with a gradient falloff for each parameter.

These are the ramblings of a sleep-deprived dude.

I would love to hear from others on this, as it was a random 3 a.m. thought and I'm sure the implementation could be changed.

I still need to change a few other files.

This is based on the work of:
https://huggingface.co/lodrick-the-lafted
&
https://huggingface.co/LilyWinter

who developed the NearSwap algorithm that was used on:
https://huggingface.co/alchemonaut/BoreanGale-70B
https://huggingface.co/alchemonaut/QuartetAnemoi-70B-t0.0001

Adds a new merge method based on NearSwap.

It uses DARE/TIES and adds automatic weight calculation based on near-tuned interpolation on a per-parameter basis.

So theoretically it is gradient weights on steroids.

I'm looking at turning this into an automatic density function that feeds back into the weight function by calculating a gradient- and variance-based density map, with a gradient falloff for each parameter.

These are the ramblings of a sleep-deprived idiot.
@Steel-skull changed the title from "Update generalized_task_arithmetic.py" to "Automatic Weight Calc based on NearSwap" on Feb 25, 2024
fixed batch dimension logic

added weight defaults
Update generalized_task_arithmetic.py
@NextGenOP

Why closed?

@Steel-skull
Author

There's too much going on IRL, so I couldn't push any further. I may look into it later.
