Train from scratch #50

aoluming · 2020-08-14T01:24:53Z

Has anyone tried training C3D from scratch on UCF101? The accuracy stays at 1%. I also tried other models that I implemented myself, and the accuracy is 1% there too. The learning rate is 1e-5. Does anyone have any ideas?

Farabi-shafkat · 2020-08-14T12:35:49Z

I am also trying to train from scratch, and after 13 or so epochs the train and test accuracies are about 43% and 28%. I'm also using a custom architecture, so maybe the bug is in your code. It's not possible to suggest anything more specific without knowing the details of your code.

aoluming · 2020-08-14T13:40:22Z

@Farabi-shafkat
Thank you for your kind reply! Have you ever tried training C3D from scratch? This is my custom architecture: https://github.com/aoluming/Cost_model. It follows the CVPR 2019 paper 'Collaborative Spatiotemporal Feature Learning for Video Action Recognition', which is not open source. I would really appreciate it if you could check the code for me.

Farabi-shafkat · 2020-08-16T11:54:43Z

Hello, no, I have not trained C3D from scratch. I am by no means an expert; I looked at your code but could not find any bug in your custom network implementation. However, there is one thing that might be wrong. Check out this thread:
#30 (comment)

aoluming · 2020-08-16T13:12:03Z

Hello, you are very modest, and thank you for doing so much for me. I will try what that thread suggests in my code. @Farabi-shafkat

libb999 · 2020-08-18T09:45:31Z

I get the same 1% accuracy when training from scratch with this code.

aoluming · 2020-08-18T11:54:18Z

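@libb999 Are you using C3D as well? Have you tried other models? With this repo's C3D plus the pretrained weights it provides, accuracy reaches 97% after a few iterations, but training from scratch gives 1%. That seems absurd to me. Where do you think the problem might be?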

libb999 · 2020-08-19T00:50:58Z

Yes, I'm also using C3D, and my situation is exactly the same as yours.

aoluming · 2020-08-19T01:15:27Z

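@libb999 I also get 1% with other models, but I printed the outputs during training and noticed a problem: if I don't add dropout to the network, the predicted class indices are basically one or two fixed values, no matter how many epochs I run. I don't know whether the problem is in the network, the training code, or the data. But accuracy is high with the pretrained weights, which suggests the data is probably fine.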
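One way to make the "only one or two fixed class indices" observation concrete is to count the predicted indices over an epoch. The following is only a minimal sketch in plain Python: `summarize_predictions` and the sample list are hypothetical, and in a real run the indices would come from the argmax of the model outputs.

```python
from collections import Counter

def summarize_predictions(pred_indices, num_classes):
    """Report how many distinct classes are predicted and how dominant the top one is."""
    counts = Counter(pred_indices)
    top_class, top_count = counts.most_common(1)[0]
    return {
        "distinct_classes": len(counts),
        "coverage": len(counts) / num_classes,
        "top_class": top_class,
        "top_fraction": top_count / len(pred_indices),
    }

# Hypothetical predictions collected during one epoch (not real results)
collapsed = [3, 3, 3, 57, 3, 3, 57, 3]
summary = summarize_predictions(collapsed, num_classes=101)
print(summary)
```

A very small `distinct_classes` value together with a large `top_fraction` would indicate that the network has collapsed onto a few classes rather than predicting noisily across all of them.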

libb999 · 2020-08-19T01:28:29Z

I think there is something wrong with the code.

aoluming · 2020-08-19T01:37:29Z

Agreed, but I checked the loss and printed the gradients, and they look fine. Gradients are being backpropagated, but the loss just won't go down. @libb999

shanchao0906 · 2020-09-29T07:34:50Z

Has anyone solved this? My training accuracy stays very low.

shanchao0906 · 2020-10-17T02:46:56Z

BryceWayne · 2020-10-26T19:11:25Z

I ran into the same issue. If you reduce the number of classes, the model will converge. For instance, if you reduce UCF101 to 7 classes and train from scratch, the model converges to 95% validation accuracy. Training from scratch is known to take a very long time.
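As a rough illustration of reducing the number of classes, the sketch below filters a list of (clip path, label) pairs down to a chosen subset and remaps the labels to 0..K-1. The function name, file names, and labels are made up for the example; they are not taken from this repo's data loader.

```python
def subset_by_class(samples, keep_classes):
    """Keep samples whose label is in keep_classes and remap labels to 0..K-1."""
    remap = {old: new for new, old in enumerate(sorted(keep_classes))}
    return [(path, remap[label]) for path, label in samples if label in remap]

# Hypothetical (clip_path, label) pairs
samples = [("a.avi", 0), ("b.avi", 5), ("c.avi", 42), ("d.avi", 5), ("e.avi", 100)]
small = subset_by_class(samples, keep_classes={5, 42, 100})
print(small)
```

With a subset like this, the final layer of the model would also need K outputs instead of 101, since the remapped labels only run from 0 to K-1.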

HuangZuShu · 2020-11-19T09:49:12Z

Has anyone solved this problem? I ran into the same issue. The loss converges quickly, but the accuracy is only 1% for top-1 and 5% for top-10.
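A quick sanity check for numbers like these is to compare them with chance level. The sketch below assumes 101 classes (UCF101) and uniformly random guessing; the helper names are made up for illustration.

```python
import math

NUM_CLASSES = 101  # UCF101

def chance_topk(k, num_classes=NUM_CLASSES):
    """Expected top-k accuracy when the k guesses are chosen uniformly at random."""
    return k / num_classes

def uniform_cross_entropy(num_classes=NUM_CLASSES):
    """Cross-entropy loss of a model that assigns equal probability to every class."""
    return math.log(num_classes)

print(f"top-1 chance: {chance_topk(1):.4f}")
print(f"top-5 chance: {chance_topk(5):.4f}")
print(f"loss of a uniform prediction: {uniform_cross_entropy():.4f}")
```

If the training loss settles near ln(101), about 4.615, the network is probably outputting a near-uniform distribution, so the loss may have stopped moving without the model having learned anything.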

BryceWayne · 2020-11-19T16:38:38Z

The loss should be computed from the outputs. Training works well for me now.

BryceWayne · 2020-11-19T21:23:52Z

Make sure to check that the loss is computed with the outputs.

HuangZuShu · 2020-11-20T01:03:17Z

Thank you for your reply! My loss function is computed from the outputs, as you can see in the following picture, and I couldn't find any problem.

skyqwe123 · 2021-03-08T09:17:18Z

@jfzhang95 I hit the same problem when I train from scratch on UCF101. The accuracy is very low (about 0.001). Do you have any suggestions? Thanks.

skyqwe123 · 2021-03-18T12:30:23Z

@aoluming Did you solve this?

alonelysnake · 2021-04-05T16:41:59Z

In the original code it is loss = criterion(outputs, labels), but shouldn't it actually use probs? Would the accuracy improve if this part were changed?
`if phase == 'train':
    outputs = model(inputs)
else:
    with torch.no_grad():
        outputs = model(inputs)

probs = nn.Softmax(dim=1)(outputs)
preds = torch.max(probs.data, 1)[1]
labels = labels.long()
loss = criterion(probs, labels)`

Krystal0606 · 2021-04-07T01:49:33Z

Hello, did this solve it for you? Did the accuracy improve?

Krystal0606 · 2021-04-07T01:59:26Z

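Hello, how do I load the pretrained model? When I train from scratch, the accuracy stops improving at around 20 epochs: training accuracy stays between 0.22 and 0.24, and validation accuracy oscillates between 0.25 and 0.27. I don't see the 1% case mentioned above. What could be going on? @aoluming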

alonelysnake · 2021-04-07T03:47:15Z

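I tried it and it still doesn't work. You mentioned that your accuracy is around 0.22-0.24. Did you make any changes to the code, or did you just set the paths and hyperparameters and run it directly?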

Krystal0606 · 2021-04-07T06:49:12Z

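I made no changes. I preprocessed UCF101 following his data-processing method and trained from scratch, and the accuracy stopped improving at around 20 epochs. Have you tried running with the pretrained model? @alonelysnake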

Krystal0606 · 2021-04-07T06:50:30Z

Now I remember: I did change the learning rate from 1e-5 to 1e-3, but it made little difference. @alonelysnake

alonelysnake · 2021-04-07T07:59:20Z

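@Krystal0606 I've tried both with and without the pretrained model, and the result is around 1% either way. Without the change I mentioned earlier, the loss becomes NaN at a learning rate of 1e-3 and stays around 9 at 1e-4 or lower. With the change, training also runs at 1e-3 and the loss drops to around 4, but the accuracy stays the same.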

alonelysnake · 2021-04-07T08:11:06Z

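@Krystal0606 My loss and accuracy kept fluctuating over the first few epochs, so I only trained each run for five to ten epochs. Did your training start out like mine and then suddenly begin improving at some epoch, or did it improve steadily from the start?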

Krystal0606 · 2021-04-07T09:23:29Z

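My loss has been decreasing steadily and the accuracy has been improving steadily, but the accuracy starts oscillating once it reaches about 22%. At a learning rate of 1e-3 I did not get a NaN loss; my loss started at around 4 and eventually dropped to about 3. I don't know what is going on here. @alonelysnake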

alonelysnake · 2021-04-07T10:59:42Z

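@Krystal0606 I saw someone on Zhihu who used the pretrained model without changing the code and got over 90% accuracy after 20 epochs. So results seem to vary a lot between machines. Could it be a random seed issue? I haven't looked into this.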
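To test the random seed hypothesis, one option is to fix the seeds before training so that runs on the same machine become comparable. This is only a sketch: `set_seed` is a hypothetical helper, and it seeds NumPy and PyTorch only when they are installed.

```python
import random

def set_seed(seed):
    """Seed Python's RNG, plus NumPy and PyTorch when they are available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # ignored when CUDA is unavailable
    except ImportError:
        pass

# Seeding twice with the same value should reproduce the same random draws
set_seed(0)
first = [random.random() for _ in range(3)]
set_seed(0)
second = [random.random() for _ in range(3)]
print(first == second)
```

Fixing the seed makes runs easier to compare, but it does not by itself show which configuration is correct, and some GPU operations may still be non-deterministic.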

Taylor-X76 · 2021-05-25T05:24:12Z

After training for 10 epochs, the accuracy of C3D is also 1%.
I also changed the loss to loss = criterion(probs, labels);
otherwise the ACC becomes NaN.

1009qjm · 2021-10-27T12:31:21Z

With the same hyperparameters, my training accuracy stops rising at seventy-something percent. The dataset is also UCF101.

Robin-WZQ · 2021-11-17T12:09:07Z

The line loss = criterion(outputs, labels) should not need to be changed: nn.CrossEntropyLoss already applies softmax internally. See https://blog.csdn.net/LIsaWinLee/article/details/107683641
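The effect of applying softmax twice can be checked numerically. The sketch below re-implements softmax cross-entropy for a single sample in plain Python, so it runs without PyTorch; it is meant to mimic what nn.CrossEntropyLoss does with raw scores, and the helper names and logits are made up for illustration.

```python
import math

def log_softmax(xs):
    """Numerically stable log-softmax of a list of scores."""
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]

def softmax(xs):
    return [math.exp(v) for v in log_softmax(xs)]

def cross_entropy(scores, target):
    """Softmax cross-entropy on raw scores for one sample."""
    return -log_softmax(scores)[target]

num_classes = 101
target = 0
# Hypothetical logits from a very confident, correct prediction
logits = [20.0] + [0.0] * (num_classes - 1)

loss_on_logits = cross_entropy(logits, target)
loss_on_probs = cross_entropy(softmax(logits), target)  # softmax applied twice
floor = math.log(1 + (num_classes - 1) / math.e)

print(loss_on_logits, loss_on_probs, floor)
```

Because probabilities lie in [0, 1], passing them through softmax again keeps the loss above ln(1 + 100/e), roughly 3.63 for 101 classes, even for a perfect prediction, whereas the loss on the raw outputs can approach 0. So feeding probs into the criterion squashes the training signal rather than fixing it.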

Eunchan24 · 2021-11-19T05:22:44Z

@aoluming
Hello, I have the same problem as you.
The accuracy is only between 0.02 and 0.03.
Did you solve this problem?
Your help would be greatly appreciated.

232525 · 2022-01-06T12:48:00Z

You could try a larger batch size, such as 16 or 20. It should work wonders +-_-+

Wangdanchunbufuz · 2024-01-16T06:02:34Z

I have run into training getting stuck at this point without making any progress. Any help would be appreciated.

hongbo-miao mentioned this issue May 8, 2021

Low accuracy for HMDB51 stanford-action-recognition/ar#10

Open

Eunchan24 mentioned this issue Nov 19, 2021

It's a question where the accuracy is too low accuracy. accuracy(0.02~0.03) #67

Closed
