Steps followed for Bengali.AI
- Understand the data directory and do some EDA on the data files.
- For storage efficiency, the data is provided in Parquet format.
- The flattened images are stored in a single dataframe per file, one image per row.
- So you have to unflatten each row to rebuild the image (a PIL Image object) and then convert it to a tensor.
- There are multiple input dataframes, so you have to join them using pd.concat (see the loading sketch after this list).
- Delete the individual objects afterwards to free up some memory.
- The images are grayscale (only one channel) but the pre-trained model expects 3-channel input, so you have to convert grayscale to RGB with PIL's Image.convert("RGB").
- Also, the pre-trained model expects 224×224 images while our originals are 137×236, so you have to resize them to the larger dimensions (see the conversion sketch below).
- We create a custom dataset class inheriting from PyTorch's Dataset. This class lets us access an image and its label, transform the image using the input arguments, and convert it to RGB (a 3-channel image); see the dataset sketch below.
- Then we create an instance of this dataset with the transforms applied.
- Create a sampler to randomly set aside 20% of the data for validation and keep 80% for training.
- Create DataLoader objects for train and val using the samplers (see the sampler/DataLoader sketch below).
- Then create the train_model function, which fine-tunes the pre-trained model for the specified number of epochs using the criterion (loss function), optimizer, and scheduler (still to explore more). This function also does the forward pass, loss computation, backward pass, and evaluation metric (accuracy) computation. Finally, it prints the loss and accuracy for each epoch on both train and val data (see the training-loop sketch below).
- Set the computation device to CUDA.
- Specify the hyperparameters: learning rate, weight decay, momentum.
- Run the model and iterate by changing the hyperparameters (see the setup sketch below).
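Loading sketch. One way to do the loading, concatenation, unflattening, and memory cleanup described above. The file names (train_image_data_*.parquet), the image_id column, and the 137×236 shape follow the competition's data layout; treat them as assumptions if your local copy differs.

```python
import gc

import numpy as np
import pandas as pd

# Assumed file layout: the training images are split across several Parquet files.
PARQUET_FILES = [f"train_image_data_{i}.parquet" for i in range(4)]
HEIGHT, WIDTH = 137, 236  # original image dimensions

# Read each Parquet file and join them into a single dataframe.
parts = [pd.read_parquet(path) for path in PARQUET_FILES]
images_df = pd.concat(parts, ignore_index=True)

# Delete the per-file dataframes to free up memory.
del parts
gc.collect()

# Each row is one flattened image; reshape back to (height, width).
image_ids = images_df["image_id"].values
pixels = images_df.drop(columns=["image_id"]).values.astype(np.uint8)
images = pixels.reshape(-1, HEIGHT, WIDTH)
```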
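Conversion sketch. The grayscale-to-RGB conversion and the resize can be done directly with PIL, shown here on a single unflattened array from the previous step; in the full pipeline the dataset class and transforms below handle this per sample.

```python
from PIL import Image

arr = images[0]              # one unflattened grayscale array
img = Image.fromarray(arr)   # single-channel ("L") PIL image
img = img.convert("RGB")     # replicate the channel to get 3 channels
img = img.resize((224, 224)) # size expected by the pre-trained model
```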
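Dataset sketch. A minimal version of the custom dataset class and its instantiation, assuming `images` is the (N, 137, 236) array built above and `labels` is a matching (N,) integer array taken from the label dataframe (both names are assumptions, not the competition's exact columns).

```python
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class GraphemeDataset(Dataset):
    """Serves one (image, label) pair at a time from the in-memory arrays."""

    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Rebuild a PIL image and expand it to 3 channels for the pre-trained model.
        img = Image.fromarray(self.images[idx]).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        label = torch.tensor(self.labels[idx], dtype=torch.long)
        return img, label


# Resize to 224x224, convert to tensor, normalize with ImageNet statistics.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

dataset = GraphemeDataset(images, labels, transform=train_transform)
```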
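Sampler/DataLoader sketch. One option that matches the 80/20 split described above is a SubsetRandomSampler over a shuffled index array (the batch size is an assumption).

```python
import numpy as np
from torch.utils.data import DataLoader, SubsetRandomSampler

# Randomly set aside 20% of the indices for validation, keep 80% for training.
num_samples = len(dataset)
indices = np.random.permutation(num_samples)
split = int(0.2 * num_samples)
val_idx, train_idx = indices[:split], indices[split:]

train_loader = DataLoader(dataset, batch_size=64,
                          sampler=SubsetRandomSampler(train_idx))
val_loader = DataLoader(dataset, batch_size=64,
                        sampler=SubsetRandomSampler(val_idx))
```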
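Training-loop sketch. A possible shape for the train_model function described above: it runs a train and a val phase per epoch, does the forward pass, loss computation, backward pass and optimizer step (training only), steps the scheduler, and prints the loss and accuracy for each phase.

```python
import torch


def train_model(model, dataloaders, criterion, optimizer, scheduler,
                device, num_epochs=5):
    """Train and evaluate for num_epochs, printing loss/accuracy per phase."""
    for epoch in range(num_epochs):
        for phase in ("train", "val"):
            if phase == "train":
                model.train()
            else:
                model.eval()

            running_loss, running_correct, total = 0.0, 0, 0
            for inputs, labels in dataloaders[phase]:
                inputs, labels = inputs.to(device), labels.to(device)
                optimizer.zero_grad()

                # Gradients are only tracked during the training phase.
                with torch.set_grad_enabled(phase == "train"):
                    outputs = model(inputs)            # forward pass
                    loss = criterion(outputs, labels)  # loss computation
                    if phase == "train":
                        loss.backward()                # backward pass
                        optimizer.step()

                preds = outputs.argmax(dim=1)
                running_loss += loss.item() * inputs.size(0)
                running_correct += (preds == labels).sum().item()
                total += inputs.size(0)

            if phase == "train":
                scheduler.step()

            print(f"Epoch {epoch + 1}/{num_epochs} [{phase}] "
                  f"loss={running_loss / total:.4f} "
                  f"acc={running_correct / total:.4f}")
    return model
```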
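Setup sketch. Finally, the pieces wired together. The backbone (ResNet-18), the number of classes, and the hyperparameter values are assumptions; the point is only to show the device, criterion, optimizer (learning rate, momentum, weight decay), and scheduler being passed into train_model.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumed backbone and head: any ImageNet-pretrained model taking 224x224 RGB input works.
num_classes = 168  # assumed target count; adjust to the label being predicted
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01,
                      momentum=0.9, weight_decay=1e-4)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

model = train_model(model,
                    {"train": train_loader, "val": val_loader},
                    criterion, optimizer, scheduler, device, num_epochs=5)
```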