side improvement isn't side-refinement ? #3

Open
hcnhatnam opened this issue Apr 7, 2019 · 12 comments

Comments

@hcnhatnam

I think "side improvement" is not the right term; the paper calls it "side-refinement".

@yizt
Owner

yizt commented Apr 7, 2019

@hcnhatnam Thanks for the correction; it has been fixed. The term is now translated as "侧边细化" (side-refinement).

@hcnhatnam
Author

I asked earlier but have not had an answer yet: did you implement side-refinement as shown here?
[Image: Screenshot from 2019-04-08 11-23-57]

@yizt
Owner

yizt commented Apr 8, 2019

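@hcnhatnam My implementation does not work that way. The ground truth itself is also split: if a GT has x-axis coordinates [x1, x2] = [5.3, 68.7], it is split into the following 5 GTs: [5.3, 16.], [16., 32.], [32., 48.], [48., 64.], [64., 68.7]. For anchors matched to the middle 3 GTs, the side-refinement regression target is 0; only anchors matched to the leftmost and rightmost GTs get a non-zero side-refinement regression target, namely:
dx = ((5.3+16)-(0+16))/16; for anchors matched to [5.3, 16.]
dx = ((64.+68.7)-(64+72))/16; for anchors matched to [64., 68.7]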

@hcnhatnam
Author

Sorry, but the ground truth is [x1, y1, x2, y2]. Why is the ground truth [5.3, 68.7] here? I don't understand.

@yizt
Owner

yizt commented Apr 8, 2019

@hcnhatnam Right, the ground truth is a quadrilateral with coordinates [lt_x, lt_y, rt_x, rt_y, rb_x, rb_y, lb_x, lb_y]. Side-refinement only depends on the x-axis coordinates, so I omitted the y-axis coordinates.

@hcnhatnam
Author

I understand now, and I really appreciate your help. But I think it should be dx = ((64.+68.7)-(64+70))/16, not 72.

@yizt
Owner

yizt commented Apr 8, 2019

@hcnhatnam It should be dx = ((64.+68.7)-(64+80))/16; (* ̄︶ ̄)
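
The split and the corrected targets can be reproduced with a short sketch. This is not the repository's code: the function names `split_gt_x` and `side_refinement_target` are hypothetical, and the sketch assumes that each split GT is matched to the 16-pixel grid cell that contains it (the real anchor matching may differ).

```python
import math

ANCHOR_WIDTH = 16  # fixed anchor width in pixels, as used in this thread


def split_gt_x(x1, x2, width=ANCHOR_WIDTH):
    """Split the GT x-range [x1, x2] at every multiple of `width`."""
    cuts = [x1]
    cut = math.floor(x1 / width) * width + width  # first grid line right of x1
    while cut < x2:
        cuts.append(float(cut))
        cut += width
    cuts.append(x2)
    return list(zip(cuts[:-1], cuts[1:]))


def side_refinement_target(segment, width=ANCHOR_WIDTH):
    """dx = ((gt_x1 + gt_x2) - (anchor_x1 + anchor_x2)) / width.

    Assumption: the matched anchor is the grid cell containing the segment.
    """
    gt_x1, gt_x2 = segment
    anchor_x1 = math.floor(gt_x1 / width) * width
    anchor_x2 = anchor_x1 + width
    return ((gt_x1 + gt_x2) - (anchor_x1 + anchor_x2)) / width


segments = split_gt_x(5.3, 68.7)
for seg in segments:
    # Interior segments coincide with their anchors, so their target is 0.
    print(seg, side_refinement_target(seg))
```

With the example GT [5.3, 68.7], only the first and last segments get a non-zero target; the last one uses the anchor [64, 80], matching the corrected formula.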

@hcnhatnam
Author

ohh... ok ok

@hcnhatnam
Author

@yizt I think there is some confusion here.
[Image: Screenshot from 2019-04-08 21-07-38]
The paper considers O*:

  • dx (first anchor) = O* = (5.3-8)/16; (8 = (0+16)/2 = Cax, the center of the anchor on the x-axis)
  • dx (last anchor) = O* = (68.7-72)/16; (72 = (64+80)/2 = Cax, the center of the anchor on the x-axis)
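
As a rough illustration of the relative offset described above, O* = (x_side - Cax) / anchor width, here is a minimal sketch. The function name `paper_side_offset` is hypothetical, and the anchors [0, 16] and [64, 80] are taken from the example in this thread.

```python
def paper_side_offset(x_side, anchor_x1, anchor_x2):
    """Relative offset from the anchor center to the text-line side."""
    anchor_cx = (anchor_x1 + anchor_x2) / 2  # Cax
    anchor_w = anchor_x2 - anchor_x1         # 16 in this thread
    return (x_side - anchor_cx) / anchor_w


first = paper_side_offset(5.3, 0, 16)    # (5.3 - 8) / 16
last = paper_side_offset(68.7, 64, 80)   # (68.7 - 72) / 16
print(first, last)
```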

@yizt
Owner

yizt commented Apr 9, 2019

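@hcnhatnam You are right; the implementation here does not follow the paper's logic exactly.
My understanding: the paper says x_side is the predicted x-coordinate of the horizontal side nearest to the current anchor. That is somewhat ambiguous in itself, since the nearest side could be the left one or the right one, and the logic is more complex and convoluted. So I followed the idea of center-point regression and shift the anchor's center directly toward the GT's center, which is simpler and more consistent. The shift distance is (anchor_cx-gt_cx) * 2: anchor_cx - gt_cx is the offset between the centers, and (anchor_cx-gt_cx) * 2 is the distance the anchor has to move to coincide with the GT. The final scale-invariant regression target is
dx=(anchor_cx-gt_cx) * 2/w, which is identical to ((anchor_x1+anchor_x2) - (gt_x1 + gt_x2))/w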
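
The equivalence of the two forms can be checked numerically. This is only a sketch with hypothetical function names; for the sign it follows the worked examples earlier in the thread, which subtract the anchor coordinates from the GT coordinates.

```python
def center_based_target(anchor_x1, anchor_x2, gt_x1, gt_x2):
    """Twice the center offset, normalized by the anchor width."""
    anchor_cx = (anchor_x1 + anchor_x2) / 2
    gt_cx = (gt_x1 + gt_x2) / 2
    w = anchor_x2 - anchor_x1
    return (gt_cx - anchor_cx) * 2 / w


def edge_sum_target(anchor_x1, anchor_x2, gt_x1, gt_x2):
    """Same quantity written with the sums of the edge coordinates."""
    w = anchor_x2 - anchor_x1
    return ((gt_x1 + gt_x2) - (anchor_x1 + anchor_x2)) / w


# Anchors and split GTs from the example in this thread.
cases = [(0, 16, 5.3, 16.0), (16, 32, 16.0, 32.0), (64, 80, 64.0, 68.7)]
for case in cases:
    a, b = center_based_target(*case), edge_sum_target(*case)
    assert abs(a - b) < 1e-12  # the two forms agree up to rounding
    print(case, a)
```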

yizt reopened this Apr 9, 2019
@NamNguyenThanh

Hi @yizt,
I understand what you implemented for side-refinement. But in your ICDAR 2015 results, I think the refinement does not only affect the head and tail anchors of the text-line ground truth (a shift of less than 16 pixels); some boxes appear to be shifted by more than 16 pixels (e.g. the picture below).
[Image: 56456128_1988716384756644_1062434923161321472_n]

@yizt
Owner

yizt commented Apr 9, 2019

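@NamNguyenThanh Thanks for your feedback! There are two reasons:
a) The true x-coordinate offset should lie in (-16, 16), and all regression targets in the training samples are in that range, so in theory the probability of exceeding 16 pixels should be small. However, the network has no explicit constraint that limits the offset to 16 pixels, so a prediction may exceed 16 pixels.
b) The network input is 720*720, while the visualization here was saved with pyplot as a 1600*1600 image. The width of 16 is relative to 720*720, so the shift in the example image probably does not exceed 16 either.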
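
To make point b) concrete, here is a small sketch that converts a distance at network resolution into pixels of the saved image. It assumes the 720*720 input is scaled uniformly to 1600*1600; the function name `to_saved_image_pixels` is hypothetical.

```python
def to_saved_image_pixels(pixels, network_size=720, saved_size=1600):
    """Scale a distance from network resolution to the saved image."""
    return pixels * saved_size / network_size


# A 16-pixel shift at 720*720 looks like roughly 35.6 pixels at 1600*1600.
print(to_saved_image_pixels(16))
```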
