Skip to content

Add simclr eval#13

Open
sanaAyrml wants to merge 59 commits into
mainfrom
add_simclr_eval
Open

Add simclr eval#13
sanaAyrml wants to merge 59 commits into
mainfrom
add_simclr_eval

Conversation

@sanaAyrml

@sanaAyrml sanaAyrml commented Feb 1, 2024

Copy link
Copy Markdown
Contributor

PR Type

[Feature ]

Short Description

I added evaluation module in this new PR which adds a classification head to pre-trained backbone. Then it freezes backbone and only train classification head and report accuracy over train and test data.


This change is Reviewable

@afkanpour

Copy link
Copy Markdown
Collaborator

It looks like this PR includes changes unrelated to linear eval (perhaps from multi-gpu support?). Can you please fix that?

@sanaAyrml

Copy link
Copy Markdown
Contributor Author

It looks like this PR includes changes unrelated to linear eval (perhaps from multi-gpu support?). Can you please fix that?

I have put this pr before merging ICGAN, maybe that is the reason to have extra changes. I will update it today and put it up.

@sanaAyrml sanaAyrml marked this pull request as draft February 12, 2024 17:50
Comment thread run_simCLR.py
Comment thread evaluate_simCLR.py Outdated
Comment thread evaluate_simCLR.py
Comment thread evaluate_simCLR.py Outdated
Comment thread evaluate_simCLR.py Outdated
Comment thread evaluate_simCLR.py Outdated
Comment thread SimCLR/data_aug/supervised_dataset.py Outdated
size (int): Image size.
"""
transform_list = [
transforms.Resize(size=(size,size)),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For inference I believe SimCLR performs a random center crop before resizing. Can you please have a look at either their paper or the code to verify?

@sanaAyrml sanaAyrml Feb 16, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code they didn't have any transformation rather than .ToTensor, but I thought this would fail on imagenet as we have various images sizes. However in the paper they mentioned "As preprocessing, all images were resized to 224 pixels along the shorter side using bicubic resampling, after which we took a 224 × 224 center crop." It is bit vague for me, should we first resize on smaller dimension and then centre crop? or should we first crop on smaller dimension and then resize?

@sanaAyrml sanaAyrml marked this pull request as ready for review February 20, 2024 18:53
data_utils.CenterCropLongEdge(),
transforms.Resize((size, size)),
transforms.ToTensor(),
transforms.Normalize(self.config.norm_mean, self.config.norm_std),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this? Is it causing an error?

Comment thread evaluate_simCLR.py Outdated

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this implementation should be available. Do you need additional capability beyond this? https://github.com/VectorInstitute/GenerativeSSL/blob/main/icgan/data_utils/utils.py#L29

Comment thread eval_simclr.slrm
@@ -0,0 +1,51 @@
#!/bin/bash

#SBATCH --job-name=train_sunrgbd

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename

Comment thread run_simCLR.py

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added changes to run_simCLR.py in my PR that takes checkpoint_dir and creates a directory based on the time of running the training job. So let's revert these changes.

from torch import nn
from torchvision import models

from ..exceptions.exceptions import InvalidBackboneError

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use absolute paths.

Comment thread SimCLR/simclr.py
self.device_id = kwargs["device_id"]
self.writer = SummaryWriter()
# Create a directory to save the model checkpoints and logs
now = datetime.now()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added logic to run_simCLR.py that does something like this. So you have to either remove these parts, or remove my changes before merging.

Comment thread train_simclr.slrm
#SBATCH --nodes=1
#SBATCH --gres=gpu:4
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-node=1

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep in mind that this uses a single GPU per node.

Comment thread train_simclr.slrm
# srun” executes the script <ntasks-per-node * nodes> times
srun python run_simCLR.py \
# srun execute ntasks-per-node * nodes times
srun pythong run_simCLR.py \

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pythong?

Comment thread evaluate_simCLR.py

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the github repo from which we borrowed simclr implementation provides pretrained model checkpoints, can you please download one of them and test this script on it to see if the results match with what's reported by them?

@afkanpour afkanpour left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this? If not, should we delete it?

Reviewable status: 0 of 19 files reviewed, 30 unresolved discussions (waiting on @sanaAyrml and @vahid0001)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants