Training not converging well, Dataset available #22

Open
samhodge-aiml opened this issue Sep 3, 2023 · 8 comments

@samhodge-aiml

Here are my modifications to the source code:

diff --git a/configs/Test/images.yaml b/configs/Test/images.yaml
index 81a5824..4435fb2 100644
--- a/configs/Test/images.yaml
+++ b/configs/Test/images.yaml
@@ -12,4 +12,5 @@ training:
   auto_scheduler: True
   eval_pose_every: -1
 extract_images:
-  resolution: [540, 960]
\ No newline at end of file
+  resolution: [3024, 4032]
+with_depth: False
diff --git a/configs/preprocess.yaml b/configs/preprocess.yaml
index c56b1fd..d3ec72c 100644
--- a/configs/preprocess.yaml
+++ b/configs/preprocess.yaml
@@ -1,9 +1,9 @@
 depth:
   type: DPT
 dataloading:
-  path: data/nerf_llff_data
-  scene: ['fern']
+  path: data/Test
+  scene: ['images']
   resize_factor: 
   load_colmap_poses: False
 training:
-  mode: 'all'
\ No newline at end of file
+  mode: 'all'
diff --git a/dataloading/dataset.py b/dataloading/dataset.py
index d40af73..846273d 100644
--- a/dataloading/dataset.py
+++ b/dataloading/dataset.py
@@ -82,11 +82,16 @@ class DataField(object):
         _, _, h, w = imgs.shape
 
         if customized_focal:
-            focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            #focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            FX_ = 13/35.0
+            CX_ = 4032
+            CY_ = 3024
+            FY_= FX_*(CY_/CX_)
+            focal_gt = [[FX_, 0, CX_], [0, FY_, CY_], [0, 0, 1]]
             if resize_factor is None:
                 resize_factor = 1
-            fx = focal_gt[0, 0] / resize_factor
-            fy = focal_gt[1, 1] / resize_factor
+            fx = focal_gt[0][0] / resize_factor
+            fy = focal_gt[1][1] / resize_factor
         else:
             if load_colmap_poses:
                 fx, fy = focal, focal
diff --git a/environment.yaml b/environment.yaml
index dfde749..a81a313 100644
--- a/environment.yaml
+++ b/environment.yaml
@@ -4,12 +4,13 @@ channels:
   - conda-forge
   - anaconda
   - defaults
+  - nvidia
 dependencies:
-  - python=3.9
-  - pytorch=1.7
-  - torchvision=0.8.2 
-  - torchaudio 
-  - cudatoolkit=10.1
+  - python
+  - pytorch=2.0.0
+  - torchvision=0.15.0
+  - torchaudio=2.0.0
+  - pytorch-cuda=11.8
   - cffi
   - cython
   - imageio
@@ -39,4 +40,4 @@ dependencies:
     - lpips
     - setuptools
     - kornia==0.5.0
-    - imageio-ffmpeg
\ No newline at end of file
+    - imageio-ffmpeg

And here is my dataset:

https://drive.google.com/drive/folders/1ZZgZUrFrnP47rx8bN5K6yvYnSC50a-9G?usp=sharing

To start training, I put the images in

data/Test/images/images

and then ran the preprocess and train commands.

The TensorBoard log is attached here:

log.zip

Screenshot from 2023-09-03 13-07-27

Is this OK, or did I muck up the intrinsics?

I have attached a JPG so you can look at the EXIF information:

6063

Screenshot from 2023-09-03 13-09-33

I think it may be 14 rather than 13; I will try again.
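
Just for reference, a minimal sketch of pulling the focal length out of the EXIF data with Pillow (the tag names are the standard EXIF ones; the file name is only a placeholder for the attached photo):

from PIL import Image, ExifTags

img = Image.open("IMG_6063.jpg")          # placeholder name for the attached JPG
exif_ifd = img.getexif().get_ifd(0x8769)  # the Exif sub-IFD holds the camera tags
tags = {ExifTags.TAGS.get(k, k): v for k, v in exif_ifd.items()}

# FocalLength is the physical focal length in mm; FocalLengthIn35mmFilm is the
# 35 mm-equivalent value, which already folds in the sensor crop factor.
print(tags.get("FocalLength"), tags.get("FocalLengthIn35mmFilm"))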

@samhodge-aiml

Made the change to 14 rather than 13, and changed if customized_focal: to if customized_focal or True:

@samhodge-aiml

Maybe all I needed was patience.

Screenshot from 2023-09-04 06-40-25

@samhodge-aiml commented Sep 4, 2023

Does this seem correct?

@bianwenjing

I am worried that my modification with the width and height (below) is in error:

diff --git a/dataloading/dataset.py b/dataloading/dataset.py
index d40af73..846273d 100644
--- a/dataloading/dataset.py
+++ b/dataloading/dataset.py
@@ -82,11 +82,16 @@ class DataField(object):
         _, _, h, w = imgs.shape
 
         if customized_focal:
-            focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            #focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            FX_ = 13/35.0
+            CX_ = 4032
+            CY_ = 3024
+            FY_= FX_*(CY_/CX_)
+            focal_gt = [[FX_, 0, CX_], [0, FY_, CY_], [0, 0, 1]]
             if resize_factor is None:
                 resize_factor = 1
-            fx = focal_gt[0, 0] / resize_factor
-            fy = focal_gt[1, 1] / resize_factor
+            fx = focal_gt[0][0] / resize_factor
+            fy = focal_gt[1][1] / resize_factor
         else:
             if load_colmap_poses:
                 fx, fy = focal, focal

I am also wondering if CX_ should be 4032//2 and CY_ should be 3024//2, i.e. the image centre.
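
For comparison, this is how I would assemble the intrinsics in pixel units, assuming the loader expects K in pixels (the division of fx and fy by resize_factor suggests it does) and that 14 mm is the 35 mm-equivalent focal length:

import numpy as np

W, H = 4032, 3024        # image size in pixels
f_mm = 14.0              # focal length from EXIF (assumed 35 mm-equivalent)
sensor_width_mm = 35.0   # frame width the equivalent value refers to
                         # (strictly, full-frame width is 36 mm; 35 matches the value used above)

fx = f_mm / sensor_width_mm * W   # focal length converted to pixels
fy = fx                           # square pixels assumed
cx, cy = W / 2.0, H / 2.0         # principal point at the image centre

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]], dtype=np.float32)
print(K)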

I am also wondering whether the following change

diff --git a/configs/Test/images.yaml b/configs/Test/images.yaml
index 81a5824..4435fb2 100644
--- a/configs/Test/images.yaml
+++ b/configs/Test/images.yaml
@@ -12,4 +12,5 @@ training:
   auto_scheduler: True
   eval_pose_every: -1
 extract_images:
-  resolution: [540, 960]
\ No newline at end of file
+  resolution: [3024, 4032]

messes up the internals of the convolutions. Can I go for large-scale resolution like this?
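
One thing I might try instead of feeding the full 4032x3024 frames is downscaling them beforehand. A rough sketch with Pillow, keeping the 4:3 aspect ratio at quarter size, 1008x756 (the images_small output folder is a made-up name):

from pathlib import Path
from PIL import Image

src = Path("data/Test/images/images")        # full-size photos
dst = Path("data/Test/images/images_small")  # hypothetical output folder
dst.mkdir(parents=True, exist_ok=True)

for p in sorted(src.glob("*.jpg")):
    img = Image.open(p)
    # quarter resolution keeps the 4:3 aspect ratio: 4032x3024 -> 1008x756
    small = img.resize((img.width // 4, img.height // 4), Image.LANCZOS)
    small.save(dst / p.name)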

I am also wondering if I could speed up training. The batch size of 1 seems to be under-utilising resources: my 24 GB RTX 3090 is only using a fraction of its VRAM and a fraction of its GPU utilisation.

grep -rn batch configs/
configs/default.yaml:14:  batchsize: 1
configs/default.yaml:78:  batch_size: 1
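
If the bottleneck is just how little work lands on the GPU each step, I assume the relevant knobs are these two from configs/default.yaml, overridden in the scene config along the lines of:

training:
  batch_size: 8            # images per batch (default 1)
  n_training_points: 4096  # training points sampled per step (default 1024)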

Screenshot from 2023-09-05 06-24-51

@samhodge-aiml

Yeah, nah, that didn't work. Trying again with different intrinsics values:

diff --git a/configs/Test/images.yaml b/configs/Test/images.yaml
index 81a5824..264c4cf 100644
--- a/configs/Test/images.yaml
+++ b/configs/Test/images.yaml
@@ -12,4 +12,5 @@ training:
   auto_scheduler: True
   eval_pose_every: -1
 extract_images:
-  resolution: [540, 960]
\ No newline at end of file
+  resolution: [765, 1008]
+with_depth: False
diff --git a/configs/default.yaml b/configs/default.yaml
index adb9cb0..92aae7b 100644
--- a/configs/default.yaml
+++ b/configs/default.yaml
@@ -75,7 +75,7 @@ training:
   load_distortion_dir: model_distortion.pt
   n_training_points: 1024
   scheduling_epoch: 10000
-  batch_size: 1
+  batch_size: 8
   learning_rate: 0.001
   focal_lr: 0.001
   pose_lr: 0.0005
diff --git a/configs/preprocess.yaml b/configs/preprocess.yaml
index c56b1fd..d3ec72c 100644
--- a/configs/preprocess.yaml
+++ b/configs/preprocess.yaml
@@ -1,9 +1,9 @@
 depth:
   type: DPT
 dataloading:
-  path: data/nerf_llff_data
-  scene: ['fern']
+  path: data/Test
+  scene: ['images']
   resize_factor: 
   load_colmap_poses: False
 training:
-  mode: 'all'
\ No newline at end of file
+  mode: 'all'
diff --git a/dataloading/dataset.py b/dataloading/dataset.py
index d40af73..717ce8d 100644
--- a/dataloading/dataset.py
+++ b/dataloading/dataset.py
@@ -81,12 +81,17 @@ class DataField(object):
         imgs = np.transpose(imgs, (0, 3, 1, 2))
         _, _, h, w = imgs.shape
 
-        if customized_focal:
-            focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+        if customized_focal or True:
+            #focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            FX_ = 14.0/35.0
+            CX_ = 35.0/2.0
+            CY_ = (2032/3024) * CX_
+            FY_= FX_*(CY_/CX_)
+            focal_gt = [[FX_, 0, CX_], [0, FY_, CY_], [0, 0, 1]]
             if resize_factor is None:
                 resize_factor = 1
-            fx = focal_gt[0, 0] / resize_factor
-            fy = focal_gt[1, 1] / resize_factor
+            fx = focal_gt[0][0] / resize_factor
+            fy = focal_gt[1][1] / resize_factor
         else:
             if load_colmap_poses:
                 fx, fy = focal, focal
diff --git a/environment.yaml b/environment.yaml
index dfde749..a81a313 100644
--- a/environment.yaml
+++ b/environment.yaml
@@ -4,12 +4,13 @@ channels:
   - conda-forge
   - anaconda
   - defaults
+  - nvidia
 dependencies:
-  - python=3.9
-  - pytorch=1.7
-  - torchvision=0.8.2 
-  - torchaudio 
-  - cudatoolkit=10.1
+  - python
+  - pytorch=2.0.0
+  - torchvision=0.15.0
+  - torchaudio=2.0.0
+  - pytorch-cuda=11.8
   - cffi
   - cython
   - imageio
@@ -39,4 +40,4 @@ dependencies:
     - lpips
     - setuptools
     - kornia==0.5.0
-    - imageio-ffmpeg
\ No newline at end of file
+    - imageio-ffmpeg

@samhodge-aiml

This is with training from COLMAP and hallucinated depth maps:

Screenshot from 2023-09-12 07-06-35

hmm.mp4

What are the limitations on the input dataset?

@bianwenjing

Hi, sorry for my late reply. The input images I used are consecutive and closely sampled from a video. This is essential because the point cloud loss requires a dense matching between two views. I noticed that the images you provided are rather sparse, which might make the point cloud loss less effective.

@samhodge

Yeah, they are from photos rather than a video. I can try shooting the same location again with a much more densely sampled approach. Thanks for your response.
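
If I go the video route, a rough sketch of dumping frames with imageio (already in the environment, together with imageio-ffmpeg); the file name and frame step are placeholders:

import imageio

reader = imageio.get_reader("scene_walkthrough.mp4")  # hypothetical video of the scene
step = 3  # keep every 3rd frame: dense enough for matching without thousands of images

for i, frame in enumerate(reader):
    if i % step == 0:
        imageio.imwrite(f"data/Test/images/images/{i:05d}.png", frame)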
