This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Finetuning on chairsSDHom epe doesn't go down. #31

Open
vivasvan1 opened this issue Dec 22, 2020 · 1 comment
vivasvan1 commented Dec 22, 2020

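The problem: during fine-tuning, the validation EPE on Sintel climbs quickly (sintel.final goes from about 1.74 at step 1 to about 10.09 at step 900), even though the training EPE decreases. See the logs below.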

I added a data loading script for chairsSDHom.
Changes:

  1. Data is now loaded in iterate_data instead of reading all images into a list in main.py.
  2. Added chairsSDHom.py and chairsSDHom.yaml.

All the code I updated is attached below.

1. main.py

...
...
elif dataset_cfg.dataset.value == "chairsSDHom":
        batch_size=3
        orig_shape= [384,512]
        # training
        chairsSDHom_dataset = chairsSDHom.list_data()
        print(chairsSDHom_dataset['flow'][0])
        from pympler.asizeof import asizeof
        trainImg1 = list(chairsSDHom_dataset['image_0'])
        trainImg2 = list(chairsSDHom_dataset['image_1'])
        trainFlow = list(chairsSDHom_dataset['flow'])
        trainMask = list(chairsSDHom_dataset['mask'])
        trainSize = len(trainFlow)
        training_datasets = [(trainImg1, trainImg2, trainFlow, trainMask)] * batch_size

        # validation - sintel
        sintel_dataset = sintel.list_data()
        divs = ('training',) if not getattr(config.network, 'class').get() == 'MaskFlownet' else ('training2',)
        for div in divs:
                for k, dataset in sintel_dataset[div].items():
                        dataset = dataset[:samples]
                        img1, img2, flow, mask = [[sintel.load(p) for p in data] for data in zip(*dataset)]
                        validationSize = len(flow)
                        validation_datasets['sintel.' + k] = (img1, img2, flow, mask)
...
...
def iterate_data(iq, dataset):
    if dataset_cfg.dataset.value == 'chairsSDHom' or dataset_cfg.dataset.value == "things3d":
        gen = index_generator(len(dataset[0]))
        while True:
            i = next(gen)
            data = [item[i] for item in dataset]
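            if dataset_cfg.dataset.value == "chairsSDHom":
                # NOTE: the crop/transpose below expects every array to be 3-D (H, W, C);
                # a mask read as a 2-D (H, W) array would need a trailing channel axis first.
                data = [skimage.io.imread(data[0]), skimage.io.imread(data[1]),
                        chairsSDHom.load(data[2]), skimage.io.imread(data[3])]
            elif dataset_cfg.dataset.value == "things3d":
                # Read both frames with the same reader: cv2.imread returns BGR while
                # skimage.io.imread returns RGB, so mixing them swaps channels in one frame.
                data = [skimage.io.imread(data[0]).astype('uint8'), skimage.io.imread(data[1]).astype('uint8'),
                        things3d.load(data[2]).astype('float16')]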
            space_x, space_y = data[0].shape[0] - orig_shape[0], data[0].shape[1] - orig_shape[1]
            crop_x, crop_y = space_x and np.random.randint(space_x), space_y and np.random.randint(space_y)
            data = [np.transpose(arr[crop_x: crop_x + orig_shape[0], crop_y: crop_y + orig_shape[1]], (2, 0, 1)) for arr in data]
            # horizontal flip: reverse the width axis and negate the horizontal flow component
            if np.random.randint(2):
                data = [arr[:, :, ::-1] for arr in data]
                data[2] = np.stack([-data[2][0, :, :], data[2][1, :, :]], axis = 0)
            iq.put(data)
    else:
        gen = index_generator(len(dataset[0]))
        while True:
            i = next(gen)
            data = [item[i] for item in dataset]
            space_x, space_y = data[0].shape[0] - orig_shape[0], data[0].shape[1] - orig_shape[1]
            crop_x, crop_y = space_x and np.random.randint(space_x), space_y and np.random.randint(space_y)
            data = [np.transpose(arr[crop_x: crop_x + orig_shape[0], crop_y: crop_y + orig_shape[1]], (2, 0, 1)) for arr in data]
            # horizontal flip: reverse the width axis and negate the horizontal flow component
            if np.random.randint(2):
                data = [arr[:, :, ::-1] for arr in data]
                data[2] = np.stack([-data[2][0, :, :], data[2][1, :, :]], axis = 0)
            iq.put(data)
...

Everything else is unchanged, yet training behaves as shown in the logs below.

updated code.zip
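
For reference, here is a minimal, self-contained sketch of the flip augmentation used in iterate_data, run on dummy arrays. It assumes (C, H, W) arrays in which flow channel 0 holds the horizontal component u and channel 1 the vertical component v; the helper name `flip_sample` is hypothetical and not part of the repository. Reversing the last axis mirrors the sample left-right, so only u changes sign.

```python
import numpy as np


def flip_sample(img1, img2, flow):
    """Mirror a training sample left-right (hypothetical helper).

    Assumes (C, H, W) arrays with flow[0] = u (horizontal) and flow[1] = v (vertical).
    """
    # Reverse the width axis of every array.
    img1, img2, flow = (arr[:, :, ::-1] for arr in (img1, img2, flow))
    # A left-right mirror reverses horizontal motion, so negate u and keep v.
    flow = np.stack([-flow[0], flow[1]], axis=0)
    return img1, img2, flow


# Dummy 1x3 sample: the pixel at x=0 moves 2 px to the right.
img = np.arange(3, dtype=np.uint8).reshape(1, 1, 3)
flow = np.zeros((2, 1, 3), dtype=np.float32)
flow[0, 0, 0] = 2.0

flipped_img, _, flipped_flow = flip_sample(img, img, flow)
print(flipped_img[0, 0])   # pixel values in mirrored order
print(flipped_flow[0, 0])  # u after the flip: the motion now sits at x=2 and points left
```

If a loader stores the flow channels in a different order, the sign change would have to be applied to the other channel, so this is worth checking against whatever chairsSDHom.load returns.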


Logs:

[2020/12/22 21:36:48] start=0, train=21670, val=224, host=ludwig, batch=3
[2020/12/22 21:36:48] batch=8, config='MaskFlownet_ft.yaml', dataset_cfg='chairsSDHom.yaml', shard=1, gpu_device='1', checkpoint='5adNov03', clear_steps=True, network='MaskFlownet', debug=False, valid=False, predict=False, resize=''
[2020/12/22 21:36:54] steps=1, epe=81.23613661839343, total_time=0.00
[2020/12/22 21:37:20] steps=1, sintel.clean=1.4036083221435547, sintel.final=**1.7385120391845703**
[2020/12/22 21:37:20] steps=2, epe=82.52426050579368, total_time=31.65
[2020/12/22 21:37:21] steps=3, epe=70.33922181313649, total_time=15.62
[2020/12/22 21:37:21] steps=4, epe=64.53729546698513, total_time=10.30
[2020/12/22 21:37:21] steps=5, epe=73.13790790314701, total_time=7.64
[2020/12/22 21:37:22] steps=6, epe=69.97008332644914, total_time=6.04
[2020/12/22 21:37:22] steps=7, epe=63.190831684866595, total_time=4.98
[2020/12/22 21:37:23] steps=8, epe=69.54386270096657, total_time=4.23
[2020/12/22 21:37:23] steps=9, epe=71.65906570549198, total_time=3.66
[2020/12/22 21:37:24] steps=10, epe=70.68287622669239, total_time=3.22
[2020/12/22 21:37:24] steps=11, epe=68.10887379487774, total_time=2.88
[2020/12/22 21:37:24] steps=12, epe=65.31357897717663, total_time=2.59
[2020/12/22 21:37:25] steps=13, epe=67.39865911195284, total_time=2.36
[2020/12/22 21:37:25] steps=14, epe=66.05316386284305, total_time=2.16
[2020/12/22 21:37:26] steps=15, epe=62.74090359794587, total_time=1.99
[2020/12/22 21:37:26] steps=16, epe=65.24516708995266, total_time=1.85
[2020/12/22 21:37:27] steps=17, epe=61.783343363284466, total_time=1.72
[2020/12/22 21:37:27] steps=18, epe=66.12157773880946, total_time=1.61
[2020/12/22 21:37:27] steps=19, epe=65.41601491031372, total_time=1.51
[2020/12/22 21:37:28] steps=20, epe=67.27401184191667, total_time=1.42
[2020/12/22 21:37:41] steps=50, epe=64.05605013410363, total_time=0.57
[2020/12/22 21:38:03] steps=100, epe=60.72789733634401, total_time=0.45
[2020/12/22 21:38:30] steps=100, sintel.clean=3.107024669647217, sintel.final=**3.6572041511535645**
[2020/12/22 21:38:51] steps=150, epe=58.168171286698964, total_time=0.55
[2020/12/22 21:39:14] steps=200, epe=55.366796654848244, total_time=0.45
[2020/12/22 21:39:41] steps=200, sintel.clean=4.636238098144531, sintel.final=**5.08129358291626**
[2020/12/22 21:40:03] steps=250, epe=52.92103477169547, total_time=0.56
[2020/12/22 21:40:25] steps=300, epe=50.651504112365515, total_time=0.45
[2020/12/22 21:40:52] steps=300, sintel.clean=5.46751070022583, sintel.final=**5.855245113372803**
[2020/12/22 21:41:13] steps=350, epe=48.90560261388807, total_time=0.55
[2020/12/22 21:41:36] steps=400, epe=47.090479957163055, total_time=0.45
[2020/12/22 21:42:02] steps=400, sintel.clean=6.850785255432129, sintel.final=**7.147568702697754**
[2020/12/22 21:42:24] steps=450, epe=45.47630244939083, total_time=0.55
[2020/12/22 21:42:47] steps=500, epe=43.721847967473224, total_time=0.45
[2020/12/22 21:43:14] steps=500, sintel.clean=7.392406940460205, sintel.final=**7.563663005828857**
[2020/12/22 21:43:36] steps=550, epe=41.861068025751216, total_time=0.56
[2020/12/22 21:43:59] steps=600, epe=40.728338542736246, total_time=0.45
[2020/12/22 21:44:25] steps=600, sintel.clean=8.37342643737793, sintel.final=**8.398472785949707**
[2020/12/22 21:44:47] steps=650, epe=39.22414651439415, total_time=0.55
[2020/12/22 21:45:09] steps=700, epe=38.01273616706755, total_time=0.45
[2020/12/22 21:45:36] steps=700, sintel.clean=8.904271125793457, sintel.final=**8.86906623840332**
[2020/12/22 21:45:57] steps=750, epe=36.68394209224638, total_time=0.55
[2020/12/22 21:46:20] steps=800, epe=35.51223404091925, total_time=0.45
[2020/12/22 21:46:46] steps=800, sintel.clean=9.723841667175293, sintel.final=**9.715934753417969**
[2020/12/22 21:47:08] steps=850, epe=34.441762749200876, total_time=0.55
[2020/12/22 21:47:30] steps=900, epe=33.21928807435762, total_time=0.45
[2020/12/22 21:47:56] steps=900, sintel.clean=10.129880905151367, sintel.final=**10.09166431427002**
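
For anyone reading the epe, sintel.clean and sintel.final columns: end-point error is conventionally the Euclidean distance between predicted and ground-truth flow vectors, averaged over pixels. The sketch below shows that standard definition in NumPy. The function name `endpoint_error` and the optional `mask` argument are illustrative and are not taken from the MaskFlownet code, whose exact reduction may differ.

```python
import numpy as np


def endpoint_error(pred, gt, mask=None):
    """Average end-point error between two (2, H, W) flow fields."""
    # Per-pixel Euclidean distance between predicted and true flow vectors.
    dist = np.sqrt(np.sum((pred - gt) ** 2, axis=0))
    if mask is None:
        return float(dist.mean())
    # Average only over pixels where the mask is nonzero (valid pixels).
    valid = mask.astype(bool)
    return float(dist[valid].mean())


gt = np.zeros((2, 4, 4), dtype=np.float32)
gt[0] = 3.0  # u = 3 px everywhere
gt[1] = 4.0  # v = 4 px everywhere
zero_pred = np.zeros_like(gt)

# The EPE of an all-zero prediction equals the mean ground-truth flow magnitude.
baseline = endpoint_error(zero_pred, gt)
print(baseline)  # → 5.0
```

The zero-prediction value can serve as a rough baseline. If the step-1 training epe (about 81 here) is at or above the mean magnitude of the loaded chairsSDHom flow, that may indicate the ground truth is being decoded with an unexpected scale or channel order, though this is only one possible explanation.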

Question 1) Any idea why the network behaves this way, and how I might fix it?
Question 2) Is there anything clearly wrong in the edits I have made?

Thank you so much. I highly appreciate your work. <3 :D

@vivasvan1 vivasvan1 reopened this Dec 22, 2020
vivasvan1 (Author) commented

Output after finetuning
[image: KLE_1134 jpg4]

Output of original maskflownet 5ac*.pth
[image: KLE_1134 jpg4]
