
ZeRO-Inference refresh #4197

Merged · 113 commits · Sep 11, 2023
Changes from 1 commit
ade9096
INT4 weight only quantization (#479)
donglinz May 5, 2023
2461449
Moving quantization into post_init_method and add int4 dequantization…
donglinz May 17, 2023
8751edf
Refactor: move int4 code to deepspeed/inference (#528)
donglinz Jun 5, 2023
df1859d
zero++ tutorial PR (#3783)
HeyangQin Jun 21, 2023
d81a6ad
[Fix] _conv_flops_compute when padding is a str and stride=1 (#3169)
zhiruiluo Jun 21, 2023
a8c182a
fix interpolate flops compute (#3782)
cli99 Jun 22, 2023
c4c442f
use `Flops Profiler` to test `model.generate()` (#2515)
CaffreyR Jun 22, 2023
fc9e1ee
revert PR #3611 (#3786)
jeffra Jun 22, 2023
40045dc
bump to 0.9.6
jeffra Jun 22, 2023
49a0a1b
ZeRO++ chinese blog (#3793)
HeyangQin Jun 23, 2023
2c62cb4
remove staging trigger (#3792)
jeffra Jun 23, 2023
4dc65f7
DeepSpeed-Triton for Inference (#3748)
stephen-youn Jun 23, 2023
e1119d8
ZeRO++ (#3784)
HeyangQin Jun 23, 2023
01b843a
adding zero++ to navigation panel of deepspeed.ai (#3796)
HeyangQin Jun 23, 2023
319b64e
Add ZeRO++ Japanese blog (#3797)
tohtana Jun 23, 2023
b4a2c0a
Bug Fixes for autotuner and flops profiler (#1880)
cli99 Jun 23, 2023
b7e1010
Missing strided copy for gated MLP (#3788)
cmikeh2 Jun 23, 2023
e5b1ead
Requires grad checking. (#3789)
jomayeri Jun 23, 2023
9c756cf
bump to 0.10.0
jeffra Jun 23, 2023
a204edc
Fix Bug in transform.cu (#3534)
rraminen Jun 23, 2023
f6e2e38
bug fix: triton importing error (#3799)
stephen-youn Jun 23, 2023
c1a7d3c
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 23, 2023
65ed548
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 24, 2023
d7ac329
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 26, 2023
83f1102
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 27, 2023
16555b2
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 27, 2023
9d7b654
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 28, 2023
c121f90
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 29, 2023
f6b2962
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 29, 2023
dd6bb04
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 29, 2023
6e5a1f1
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 29, 2023
1fbbbbf
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 30, 2023
e44eb86
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jun 30, 2023
26d8823
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 1, 2023
83e1752
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 3, 2023
b5446a2
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 4, 2023
7f9b2fa
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 5, 2023
9fb79a3
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 5, 2023
9643da2
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 6, 2023
464f99a
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 6, 2023
fbf5068
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 6, 2023
208870a
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 6, 2023
62b47f3
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 6, 2023
c5f62c3
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 7, 2023
78528ae
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 7, 2023
b52c407
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 8, 2023
dfe3b82
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 10, 2023
f086c39
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 10, 2023
04718e4
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 12, 2023
63db286
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 12, 2023
f32c947
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 13, 2023
3b7c583
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 13, 2023
7963bc7
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 13, 2023
441fffe
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 14, 2023
72f37ab
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 14, 2023
f5eb5df
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 17, 2023
e595621
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 18, 2023
427d9eb
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 19, 2023
e9708d6
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 19, 2023
6da3e48
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 19, 2023
c031179
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 19, 2023
0de04b3
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 20, 2023
a733676
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 20, 2023
fd2ca3a
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 21, 2023
9665c46
Rebase
tjruwase Jul 21, 2023
af181e5
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 21, 2023
68935e8
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 22, 2023
37b7743
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 22, 2023
b8153eb
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 24, 2023
8b9815f
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 25, 2023
79781ef
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 25, 2023
2b3664c
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 25, 2023
f6ded65
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 26, 2023
cea2dd9
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 26, 2023
30626f0
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 27, 2023
ccb6817
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 27, 2023
a20815b
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 28, 2023
2d8c49a
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 28, 2023
762a1bb
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Jul 31, 2023
8ac993f
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 1, 2023
9d054da
Workaround quant bug
tjruwase Aug 2, 2023
fabf2c0
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 3, 2023
1bc1af2
Fix dequant bug
tjruwase Aug 3, 2023
cc0b0f1
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 3, 2023
fd1cede
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 5, 2023
f694a93
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 7, 2023
4d29d5e
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 8, 2023
636e5e4
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 9, 2023
a39b6e2
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 9, 2023
de64c54
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 9, 2023
cf479bf
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 9, 2023
df4f25c
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 9, 2023
38bb552
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 10, 2023
e86e1c5
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 11, 2023
ec59340
Address PR feedback
tjruwase Aug 11, 2023
8e19577
Use super() __exit__
tjruwase Aug 14, 2023
704e0f0
Fix unit tests
tjruwase Aug 14, 2023
95643e3
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 15, 2023
19018b6
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 16, 2023
b886682
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 16, 2023
7948971
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 17, 2023
9242d36
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 17, 2023
c51a072
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 18, 2023
c816d50
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 19, 2023
d9b1672
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 21, 2023
a746aca
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 22, 2023
f0afcf3
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 22, 2023
956ed2f
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 22, 2023
e1276ab
Merge branch 'master' of github.com:microsoft/DeepSpeed
jeffra Aug 23, 2023
f940e1e
Rebase
tjruwase Aug 23, 2023
0129db2
Fix rebase conflict
tjruwase Aug 24, 2023
a63e92b
Merge branch 'master' into staging-zero-inference-v1
tjruwase Aug 30, 2023
9723d93
Merge branch 'master' into staging-zero-inference-v1
awan-10 Sep 8, 2023
Refactor: move int4 code to deepspeed/inference (#528)
* Move int 4 code to deepspeed/inference

* fix

* fix

* fix
donglinz committed Jun 5, 2023
commit 8751edf51c508a48b31c754196d25fd524fead9f
29 changes: 0 additions & 29 deletions deepspeed/compression/inference/config.py

This file was deleted.

2 changes: 0 additions & 2 deletions deepspeed/inference/__init__.py
@@ -2,5 +2,3 @@
 # SPDX-License-Identifier: Apache-2.0
 
 # DeepSpeed Team
-
-from .engine import InferenceEngine
2 changes: 2 additions & 0 deletions deepspeed/inference/config.py
@@ -101,6 +101,8 @@ class BaseQuantConfig(DeepSpeedConfigModel):
 
 class WeightQuantConfig(BaseQuantConfig):
     enabled = True
+    quantized_initialization: Dict = {}
+    post_init_quant: Dict = {}
 
 
 class ActivationQuantConfig(BaseQuantConfig):
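The two fields added above separate the two quantization entry points this PR supports. A hedged sketch of the resulting `ds_config` shape (key names follow this diff; the concrete values are illustrative, not defaults):

```python
# 'quantized_initialization' configures weight quantization applied while the
# model is being constructed, while 'post_init_quant' maps layer-name patterns
# to per-layer settings applied after construction. Values are examples only.
ds_config = {
    'weight_quantization': {
        'quantized_initialization': {
            'num_bits': 4,
            'group_size': 64,
            'group_dim': 0,
            'symmetric': False
        },
        'post_init_quant': {
            'fc': {'num_bits': 4, 'group_size': 64, 'group_dim': 0, 'symmetric': False}
        }
    }
}
print(sorted(ds_config['weight_quantization'].keys()))
# → ['post_init_quant', 'quantized_initialization']
```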
@@ -7,10 +7,9 @@
 from torch import nn
 from typing import Dict
 import gc
-from deepspeed.compression.inference import layers
+from deepspeed.inference.quantization import layers
 from .layers import QUANTIZATION_LAYER_MAPPINGS
-from .utils import get_AsyncPartitionedParameterSwapper
-from ..helper import recursive_setattr
+from .utils import get_AsyncPartitionedParameterSwapper, recursive_setattr
 from deepspeed.utils.logging import logger
 from collections import deque
 from transformers.utils.generic import ContextManagers
@@ -35,7 +34,7 @@ def _init_group_wise_weight_quantization(model: nn.Module, ds_config: Dict) -> n
     matched_module_count = 0
 
     assert 'weight_quantization' in ds_config, 'Please provide quantization config in ds_config'
-    quantization_config = ds_config['weight_quantization']
+    quantization_config = ds_config['weight_quantization']['post_init_quant']
 
     # Return nvme swapper if exists, else return None.
     # For nvme offloading we must use the same swapper here as model initialized.
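The change above makes `_init_group_wise_weight_quantization` read per-layer settings from the nested `post_init_quant` key rather than from `weight_quantization` directly. A minimal sketch of that lookup (the helper name is ours; the assertion message matches the diff):

```python
from typing import Dict

def get_post_init_quant_config(ds_config: Dict) -> Dict:
    # Same precondition as in the diff: the settings must be present.
    assert 'weight_quantization' in ds_config, 'Please provide quantization config in ds_config'
    # After this refactor the per-layer settings live one level deeper.
    return ds_config['weight_quantization']['post_init_quant']

ds_config = {
    'weight_quantization': {
        'post_init_quant': {
            'fc': {'num_bits': 4, 'group_size': 64, 'group_dim': 0, 'symmetric': False}
        }
    }
}
print(list(get_post_init_quant_config(ds_config)))  # → ['fc']
```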
@@ -192,6 +192,24 @@ def get_AsyncPartitionedParameterSwapper(model: nn.Module):
     return None
 
 
+def recursive_setattr(model, module_name, module):
+    """
+    Recursively set the attribute of a module.
+    Args:
+        model (`torch.nn.Module`)
+            The model to set the attribute in.
+        module_name (`str`)
+            The name of the module to set the attribute in.
+        module (`torch.nn.Module`)
+            The module to set the attribute to.
+    """
+    split_list = module_name.split('.')
+    output = model
+    for name in split_list[:-1]:
+        output = getattr(output, name)
+    output.__setattr__(split_list[-1], module)
+
+
 def concat_to_compat_param(quantized_weight: Tensor,
                            quant_scale: Tensor,
                            quant_min: Tensor,
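The newly added `recursive_setattr` walks a dotted attribute path down to the parent module and swaps the leaf, which is how quantized layers replace their originals. A self-contained sketch of the same logic (plain objects stand in for `torch.nn.Module`, since only attribute access matters here):

```python
class Module:
    """Plain stand-in for torch.nn.Module; only attribute access is needed."""
    pass

def recursive_setattr(model, module_name, module):
    """Set the attribute named by a dotted path, e.g. 'self_attn.q_proj'."""
    split_list = module_name.split('.')
    output = model
    for name in split_list[:-1]:          # walk down to the parent module
        output = getattr(output, name)
    setattr(output, split_list[-1], module)  # replace the leaf attribute

model = Module()
model.self_attn = Module()
model.self_attn.q_proj = 'original_linear'
recursive_setattr(model, 'self_attn.q_proj', 'quantized_linear')
print(model.self_attn.q_proj)  # → quantized_linear
```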
5 changes: 3 additions & 2 deletions deepspeed/runtime/config.py
@@ -29,7 +29,7 @@
 from .activation_checkpointing.config import DeepSpeedActivationCheckpointingConfig
 from ..comm.config import DeepSpeedCommsConfig
 from ..monitor.config import get_monitor_config
-from ..compression.inference.config import WeightQuantizationConfig
+from ..inference.config import WeightQuantConfig
 
 from deepspeed import comm as dist
 from deepspeed.runtime.config_utils import DeepSpeedConfigModel
@@ -871,7 +871,8 @@ def _initialize_params(self, param_dict):
 
         self.nebula_config = DeepSpeedNebulaConfig(param_dict)
 
-        self.weight_quantization_config = WeightQuantizationConfig(param_dict)
+        self.weight_quantization_config = WeightQuantConfig(
+            **param_dict['weight_quantization']) if 'weight_quantization' in param_dict else None
 
     def _batch_assertion(self):
 
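The guarded construction above only builds a `WeightQuantConfig` when the user's config actually contains a `weight_quantization` section, leaving the attribute `None` otherwise. A sketch of that pattern under stated assumptions (`WeightQuantConfigSketch` is a hypothetical stand-in; in DeepSpeed, `WeightQuantConfig` is a pydantic `DeepSpeedConfigModel`):

```python
from typing import Dict, Optional

class WeightQuantConfigSketch:
    """Illustrative stand-in mimicking WeightQuantConfig's fields only."""
    def __init__(self, enabled: bool = True,
                 quantized_initialization: Optional[Dict] = None,
                 post_init_quant: Optional[Dict] = None):
        self.enabled = enabled
        self.quantized_initialization = quantized_initialization or {}
        self.post_init_quant = post_init_quant or {}

def build_weight_quant_config(param_dict: Dict) -> Optional[WeightQuantConfigSketch]:
    # Mirrors _initialize_params: build the config only when the key exists,
    # otherwise leave it as None (the None-checks downstream rely on this).
    if 'weight_quantization' in param_dict:
        return WeightQuantConfigSketch(**param_dict['weight_quantization'])
    return None

assert build_weight_quant_config({}) is None
cfg = build_weight_quant_config(
    {'weight_quantization': {'post_init_quant': {'fc': {'num_bits': 4}}}})
print(cfg.post_init_quant)  # → {'fc': {'num_bits': 4}}
```

Returning `None` rather than an empty config is what makes the extra truthiness check in `partition_parameters.py` below necessary.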
4 changes: 2 additions & 2 deletions deepspeed/runtime/zero/partition_parameters.py
@@ -31,7 +31,7 @@
                               debug_param2name_id, debug_param2name_id_shape_status)
 from deepspeed.accelerator import get_accelerator
 from ..swap_tensor.partitioned_param_swapper import AsyncPartitionedParameterSwapper, PartitionedParamStatus
-from deepspeed.compression.inference.utils import _quantize_param, WEIGHT_QUANTIZATION_LAYERS, wrap_quantized_functional, wrap_load_from_state_dict
+from deepspeed.inference.quantization.utils import _quantize_param, WEIGHT_QUANTIZATION_LAYERS, wrap_quantized_functional, wrap_load_from_state_dict
 
 param_count = 0
 partitioned_param_data_shape = [0]
@@ -295,7 +295,7 @@ def __init__(self, enabled=True, mem_efficient_linear=True, ds_config=None, dtyp
         self.wrapped_cls = set()
 
         self.quantized_initialization = None
-        if ds_config is not None and ds_config.weight_quantization_config.quantized_initialization:
+        if ds_config is not None and ds_config.weight_quantization_config and ds_config.weight_quantization_config.quantized_initialization:
             self.quantized_initialization = ds_config.weight_quantization_config.quantized_initialization
 
     def __enter__(self):
@@ -7,9 +7,9 @@
 import torch
 import torch.nn as nn
 from deepspeed.accelerator import get_accelerator
-from deepspeed.compression.inference.quantization import _init_group_wise_weight_quantization
-from deepspeed.compression.inference.utils import Quantizer, DeQuantizer
-from deepspeed.compression.inference.layers import QuantizedLinear
+from deepspeed.inference.quantization.quantization import _init_group_wise_weight_quantization
+from deepspeed.inference.quantization.utils import Quantizer, DeQuantizer
+from deepspeed.inference.quantization.layers import QuantizedLinear
 from transformers.models.opt.modeling_opt import OPTDecoderLayer
 from transformers import AutoConfig, OPTConfig, AutoModel
 import pytest
@@ -258,35 +258,37 @@ def test_model_quantization():
 
     ds_config = {
         'weight_quantization': {
-            'fc': {
-                'num_bits': bits,
-                'group_size': 64,
-                'group_dim': 0,
-                'symmetric': False
-            },
-            'self_attn.q_proj': {
-                'num_bits': bits,
-                'group_size': 64,
-                'group_dim': 0,
-                'symmetric': False
-            },
-            'self_attn.k_proj': {
-                'num_bits': bits,
-                'group_size': 64,
-                'group_dim': 0,
-                'symmetric': False
-            },
-            'self_attn.v_proj': {
-                'num_bits': bits,
-                'group_size': 64,
-                'group_dim': 0,
-                'symmetric': False
-            },
-            'self_attn.out_proj': {
-                'num_bits': bits,
-                'group_size': 64,
-                'group_dim': 0,
-                'symmetric': False
+            'post_init_quant': {
+                'fc': {
+                    'num_bits': bits,
+                    'group_size': 64,
+                    'group_dim': 0,
+                    'symmetric': False
+                },
+                'self_attn.q_proj': {
+                    'num_bits': bits,
+                    'group_size': 64,
+                    'group_dim': 0,
+                    'symmetric': False
+                },
+                'self_attn.k_proj': {
+                    'num_bits': bits,
+                    'group_size': 64,
+                    'group_dim': 0,
+                    'symmetric': False
+                },
+                'self_attn.v_proj': {
+                    'num_bits': bits,
+                    'group_size': 64,
+                    'group_dim': 0,
+                    'symmetric': False
+                },
+                'self_attn.out_proj': {
+                    'num_bits': bits,
+                    'group_size': 64,
+                    'group_dim': 0,
+                    'symmetric': False
+                }
+            }
         }
     }
@@ -321,11 +323,13 @@ def test_quantized_linear():
 
     ds_config = {
         'weight_quantization': {
-            'layer': {
-                'num_bits': 4,
-                'group_size': 64,
-                'group_dim': 0,
-                'symmetric': False
+            'post_init_quant': {
+                'layer': {
+                    'num_bits': 4,
+                    'group_size': 64,
+                    'group_dim': 0,
+                    'symmetric': False
+                }
             }
         }
     }