{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":705996574,"defaultBranch":"main","name":"gpt-fast","ownerLogin":"pytorch-labs","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2023-10-17T05:30:32.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/107212512?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1720842386.0","currentOid":""},"activityList":{"items":[{"before":"7f2c92219ba72bab409fc8421c378f6f732853fb","after":"9dce6a4d267ca036cbaf5f862d21467c7b2f488a","ref":"refs/heads/gqa_support","pushedAt":"2024-07-13T03:58:02.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jainapurva","name":"Apurva Jain","path":"/jainapurva","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/19538305?s=80&v=4"},"commit":{"message":"Lint fixes","shortMessageHtmlLink":"Lint fixes"}},{"before":"091515ab5b06f91c0d6a3b92f9c27463f738cc9b","after":"7f2c92219ba72bab409fc8421c378f6f732853fb","ref":"refs/heads/gqa_support","pushedAt":"2024-07-13T03:51:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jainapurva","name":"Apurva Jain","path":"/jainapurva","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/19538305?s=80&v=4"},"commit":{"message":"Updated sdpa with enable_gqa=True","shortMessageHtmlLink":"Updated sdpa with enable_gqa=True"}},{"before":null,"after":"091515ab5b06f91c0d6a3b92f9c27463f738cc9b","ref":"refs/heads/gqa_support","pushedAt":"2024-07-13T03:46:26.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"jainapurva","name":"Apurva Jain","path":"/jainapurva","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/19538305?s=80&v=4"},"commit":{"message":"Unified Llama 3 (8b,70b) + Safetensors support (#169)\n\n* unify llama 3 support\r\n\r\n* add safetensors support\r\n\r\n* Bug\r\n\r\n* Add safetensors to reqs\r\n\r\n* rope base bug fix. 
Thx @xavierpuigf\r\n\r\nFrom comment\r\nhttps://github.com/pytorch-labs/gpt-fast/pull/169#issuecomment-2123919020\r\n\r\n* Update model.py","shortMessageHtmlLink":"Unified Llama 3 (8b,70b) + Safetensors support (#169)"}},{"before":"900cd67e08074cf4cacafd3654794306db0eef41","after":"091515ab5b06f91c0d6a3b92f9c27463f738cc9b","ref":"refs/heads/main","pushedAt":"2024-06-26T05:00:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"yanboliang","name":"Yanbo Liang","path":"/yanboliang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1962026?s=80&v=4"},"commit":{"message":"Unified Llama 3 (8b,70b) + Safetensors support (#169)\n\n* unify llama 3 support\r\n\r\n* add safetensors support\r\n\r\n* Bug\r\n\r\n* Add safetensors to reqs\r\n\r\n* rope base bug fix. Thx @xavierpuigf\r\n\r\nFrom comment\r\nhttps://github.com/pytorch-labs/gpt-fast/pull/169#issuecomment-2123919020\r\n\r\n* Update model.py","shortMessageHtmlLink":"Unified Llama 3 (8b,70b) + Safetensors support (#169)"}},{"before":"e71d268012881c619f049f39626df1cf3510e186","after":"900cd67e08074cf4cacafd3654794306db0eef41","ref":"refs/heads/main","pushedAt":"2024-06-26T04:45:07.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"yanboliang","name":"Yanbo Liang","path":"/yanboliang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1962026?s=80&v=4"},"commit":{"message":"Update installation instructions in README.md (#178)\n\nOtherwise tiktoken is missing","shortMessageHtmlLink":"Update installation instructions in README.md (#178)"}},{"before":"9b908fb74da891bf65ec1c4d7c70684b84e999a6","after":"e71d268012881c619f049f39626df1cf3510e186","ref":"refs/heads/main","pushedAt":"2024-06-17T02:51:46.000Z","pushType":"pr_merge","commitsCount":4,"pusher":{"login":"yanboliang","name":"Yanbo Liang","path":"/yanboliang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1962026?s=80&v=4"},"commit":{"message":"Merge pull request #166 from yanboliang/llama3-8b\n\nLlama3 8b perf 
numbers on A100","shortMessageHtmlLink":"Merge pull request #166 from yanboliang/llama3-8b"}},{"before":"6e0b5eb3bde74b7c4de1f3e3d7e7e047f0b62924","after":"2fe820027fa7228ecf39275af82552681db73b40","ref":"refs/heads/LayerSkip","pushedAt":"2024-06-15T00:35:53.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"mostafaelhoushi","name":"Mostafa Elhoushi","path":"/mostafaelhoushi","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1451293?s=80&v=4"},"commit":{"message":"Add self-speculative instructions","shortMessageHtmlLink":"Add self-speculative instructions"}},{"before":"faa70ae6f8aab4a2e74505a57d3df430e2700d8e","after":"6e0b5eb3bde74b7c4de1f3e3d7e7e047f0b62924","ref":"refs/heads/LayerSkip","pushedAt":"2024-06-15T00:26:47.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"mostafaelhoushi","name":"Mostafa Elhoushi","path":"/mostafaelhoushi","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1451293?s=80&v=4"},"commit":{"message":"handle default early exit","shortMessageHtmlLink":"handle default early exit"}},{"before":"6253c6bb054e658d67566150f87329b87815ae63","after":"9b908fb74da891bf65ec1c4d7c70684b84e999a6","ref":"refs/heads/main","pushedAt":"2024-06-14T17:14:11.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"yanboliang","name":"Yanbo Liang","path":"/yanboliang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1962026?s=80&v=4"},"commit":{"message":"Merge pull request #181 from VikParuchuri/main\n\nFix rope base issue with llama 3","shortMessageHtmlLink":"Merge pull request #181 from VikParuchuri/main"}},{"before":null,"after":"faa70ae6f8aab4a2e74505a57d3df430e2700d8e","ref":"refs/heads/LayerSkip","pushedAt":"2024-06-14T16:03:35.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"mostafaelhoushi","name":"Mostafa Elhoushi","path":"/mostafaelhoushi","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1451293?s=80&v=4"},"commit":{"message":"fixing bugs in 
KVQ","shortMessageHtmlLink":"fixing bugs in KVQ"}},{"before":null,"after":"eaae2f52bfc153c6b458b715e16f5c903f8cd40f","ref":"refs/heads/gh/kwen2501/1/orig","pushedAt":"2024-06-12T20:42:57.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"kwen2501","name":"Ke Wen","path":"/kwen2501","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6676466?s=80&v=4"},"commit":{"message":"Use DTensor-based tensor parallel\n\nghstack-source-id: b55b264d20bd2c0054f7248435fd605a452e876b\nPull Request resolved: https://github.com/pytorch-labs/gpt-fast/pull/180","shortMessageHtmlLink":"Use DTensor-based tensor parallel"}},{"before":null,"after":"547765c000c4bf1af43ab4855e15cf968efd1a45","ref":"refs/heads/gh/kwen2501/1/head","pushedAt":"2024-06-12T20:42:54.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"kwen2501","name":"Ke Wen","path":"/kwen2501","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6676466?s=80&v=4"},"commit":{"message":"Use DTensor-based tensor parallel\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Use DTensor-based tensor parallel"}},{"before":null,"after":"6253c6bb054e658d67566150f87329b87815ae63","ref":"refs/heads/gh/kwen2501/1/base","pushedAt":"2024-06-12T20:42:54.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"kwen2501","name":"Ke Wen","path":"/kwen2501","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6676466?s=80&v=4"},"commit":{"message":"Merge pull request #176 from yanboliang/exa\n\nUpdate Grok-1 and DBRX support in README","shortMessageHtmlLink":"Merge pull request #176 from yanboliang/exa"}},{"before":"de06b53a4f95c72cd3abd0a8e9fa2d6913676c1a","after":"eab18b71dd45c9e25aec3f7b6d8a776d112a541e","ref":"refs/heads/grok1","pushedAt":"2024-06-06T04:51:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Chillee","name":"Horace He","path":"/Chillee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6355099?s=80&v=4"},"commit":{"message":"Update 
model.py","shortMessageHtmlLink":"Update model.py"}},{"before":"1090423ed9d81bf7b4181936dea4120f89fa4579","after":"faa70ae6f8aab4a2e74505a57d3df430e2700d8e","ref":"refs/heads/ak/layer_skip","pushedAt":"2024-06-04T21:20:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AkshatSh","name":"Akshat Shrivastava","path":"/AkshatSh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9097613?s=80&v=4"},"commit":{"message":"fixing bugs in KVQ","shortMessageHtmlLink":"fixing bugs in KVQ"}},{"before":"6767545367d4e92a2a22b7cbbf85b6b17a4013ca","after":"1090423ed9d81bf7b4181936dea4120f89fa4579","ref":"refs/heads/ak/layer_skip","pushedAt":"2024-06-03T08:27:24.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AkshatSh","name":"Akshat Shrivastava","path":"/AkshatSh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9097613?s=80&v=4"},"commit":{"message":"initial KVQ implementation","shortMessageHtmlLink":"initial KVQ implementation"}},{"before":"7056eb75c04d4ccd33a522e7e8f6ce971ef88253","after":"6767545367d4e92a2a22b7cbbf85b6b17a4013ca","ref":"refs/heads/ak/layer_skip","pushedAt":"2024-06-01T07:07:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AkshatSh","name":"Akshat Shrivastava","path":"/AkshatSh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9097613?s=80&v=4"},"commit":{"message":"sharing model weights","shortMessageHtmlLink":"sharing model weights"}},{"before":null,"after":"7056eb75c04d4ccd33a522e7e8f6ce971ef88253","ref":"refs/heads/ak/layer_skip","pushedAt":"2024-06-01T04:30:51.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"AkshatSh","name":"Akshat Shrivastava","path":"/AkshatSh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9097613?s=80&v=4"},"commit":{"message":"initial changes","shortMessageHtmlLink":"initial 
changes"}},{"before":"e2cfa34145e7f14f8390c8ab727f8d7451b26fec","after":"6253c6bb054e658d67566150f87329b87815ae63","ref":"refs/heads/main","pushedAt":"2024-05-21T01:04:58.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"yanboliang","name":"Yanbo Liang","path":"/yanboliang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1962026?s=80&v=4"},"commit":{"message":"Merge pull request #176 from yanboliang/exa\n\nUpdate Grok-1 and DBRX support in README","shortMessageHtmlLink":"Merge pull request #176 from yanboliang/exa"}},{"before":"ca0d85075b3cf92ead264da3826c8cc9f0207185","after":"7180321ed7749bf9dde2b62e50a3dd2fcf6b5543","ref":"refs/heads/malfet/set-prec-to-float16","pushedAt":"2024-05-17T05:37:56.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"malfet","name":"Nikita Shulga","path":"/malfet","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2453524?s=80&v=4"},"commit":{"message":"[WIP] Set precision to float16","shortMessageHtmlLink":"[WIP] Set precision to float16"}},{"before":"1095a5c465d5f6af734a8b86e3f7be49ecfc7668","after":"e2cfa34145e7f14f8390c8ab727f8d7451b26fec","ref":"refs/heads/main","pushedAt":"2024-05-10T04:54:13.000Z","pushType":"pr_merge","commitsCount":18,"pusher":{"login":"yanboliang","name":"Yanbo Liang","path":"/yanboliang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1962026?s=80&v=4"},"commit":{"message":"Merge pull request #175 from yanboliang/band-emb\n\nRemove nn.Embedding layer from model size","shortMessageHtmlLink":"Merge pull request #175 from yanboliang/band-emb"}},{"before":"30d69b3245a29823e7c4c5ae6a1f48fa38267afd","after":"1095a5c465d5f6af734a8b86e3f7be49ecfc7668","ref":"refs/heads/main","pushedAt":"2024-05-07T21:49:28.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"Chillee","name":"Horace He","path":"/Chillee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6355099?s=80&v=4"},"commit":{"message":"Revert quantization additions to something that 
works on CUDA still","shortMessageHtmlLink":"Revert quantization additions to something that works on CUDA still"}},{"before":"a2aa7d6d7b01ef55c024b5891197c80569f3be83","after":"de06b53a4f95c72cd3abd0a8e9fa2d6913676c1a","ref":"refs/heads/grok1","pushedAt":"2024-05-05T21:42:49.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"Chillee","name":"Horace He","path":"/Chillee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6355099?s=80&v=4"},"commit":{"message":"Added grok-1 support","shortMessageHtmlLink":"Added grok-1 support"}},{"before":null,"after":"a2aa7d6d7b01ef55c024b5891197c80569f3be83","ref":"refs/heads/grok1","pushedAt":"2024-05-05T21:36:18.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"Chillee","name":"Horace He","path":"/Chillee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6355099?s=80&v=4"},"commit":{"message":"Added grok-1 support","shortMessageHtmlLink":"Added grok-1 support"}},{"before":null,"after":"ca0d85075b3cf92ead264da3826c8cc9f0207185","ref":"refs/heads/malfet/set-prec-to-float16","pushedAt":"2024-05-03T18:08:40.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"malfet","name":"Nikita Shulga","path":"/malfet","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2453524?s=80&v=4"},"commit":{"message":"[WIP] Set precision to float16","shortMessageHtmlLink":"[WIP] Set precision to float16"}},{"before":"c21a88962b02ee54b74999078e81be0fd24ac2af","after":"30d69b3245a29823e7c4c5ae6a1f48fa38267afd","ref":"refs/heads/main","pushedAt":"2024-04-29T21:02:52.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Chillee","name":"Horace He","path":"/Chillee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6355099?s=80&v=4"},"commit":{"message":"llama3 8B support, tiktoken tokenizer (#158)\n\n* WIP: llama3 support, tiktoken tokenizer\r\n\r\n* Finalizing","shortMessageHtmlLink":"llama3 8B support, tiktoken tokenizer 
(#158)"}},{"before":"2a9b8283f83ca416faacfa1cb637ea49543e6a99","after":"c21a88962b02ee54b74999078e81be0fd24ac2af","ref":"refs/heads/main","pushedAt":"2024-04-18T06:15:58.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Chillee","name":"Horace He","path":"/Chillee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6355099?s=80&v=4"},"commit":{"message":"Update quantize.py","shortMessageHtmlLink":"Update quantize.py"}},{"before":"095b2229ee3a40e379c11f05b94bd6923db63b4b","after":"2a9b8283f83ca416faacfa1cb637ea49543e6a99","ref":"refs/heads/main","pushedAt":"2024-04-11T01:29:37.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"HDCharles","name":null,"path":"/HDCharles","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39544797?s=80&v=4"},"commit":{"message":"Merge pull request #156 from pytorch-labs/094_fix_shape_gptq\n\nshape fix for gptq","shortMessageHtmlLink":"Merge pull request #156 from pytorch-labs/094_fix_shape_gptq"}},{"before":null,"after":"f2c6534f083f1931e09c3e1a41eb5659acbd1caa","ref":"refs/heads/094_fix_shape_gptq","pushedAt":"2024-04-10T19:13:04.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"HDCharles","name":null,"path":"/HDCharles","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39544797?s=80&v=4"},"commit":{"message":"shape fix for gptq\n\nSummary: aligns with previous shape fixes\n(https://github.com/pytorch-labs/gpt-fast/pull/152)\n\nTest Plan:\n\nexport MODEL_REPO=meta-llama/Llama-2-7b-chat-hf\npython quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 10\npython eval.py --checkpoint_path checkpoints/$MODEL_REPO/model_int4-gptq.g32.cuda.pth --tasks wikitext\n\nwikitext: {'word_perplexity,none': 12.4647656874071, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 1.6028703940149458, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 0.6806577757911142, 
'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}\n\npython quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4\npython eval.py --checkpoint_path checkpoints/$MODEL_REPO/model_int4.g32.cuda.pth --tasks wikitext\n\nwikitext: {'word_perplexity,none': 12.639992147818221, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 1.6070602521912754, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 0.6844240198082908, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}\n\nReviewers:\n\nSubscribers:\n\nTasks:\n\nTags:","shortMessageHtmlLink":"shape fix for gptq"}},{"before":"410cc25bd2fae6f60ef145d6e172277fcaac5590","after":"55b9f6e0a947cd4ffc18567d8709b6b57bb99922","ref":"refs/heads/gh/HDCharles/9/orig","pushedAt":"2024-04-09T18:21:27.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"HDCharles","name":null,"path":"/HDCharles","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/39544797?s=80&v=4"},"commit":{"message":"testing HQQ [not for land]\n\nSummary:\n\nfor eval=5\nwikitext: {'word_perplexity,none': 11.49343838017535, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 1.6110947678444059, 'byte_perplexity_stderr,none':\n\nfor eval all\n...\n\nTest Plan: sh run.sh\n\nReviewers:\n\nSubscribers:\n\nTasks:\n\nTags:\n\nghstack-source-id: e1564ea867790825ad8a00c8de8a672a349b8a48\nPull Request resolved: https://github.com/pytorch-labs/gpt-fast/pull/155","shortMessageHtmlLink":"testing HQQ [not for land]"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEfl_9igA","startCursor":null,"endCursor":null}},"title":"Activity · pytorch-labs/gpt-fast"}