CS411: Applications of Computer Vision and Deep Learning

Taught online in Semester II, 2024, at RKMVERI (3 credits)

Course Details


Computer vision algorithms are increasingly part of everyday life, which motivated this course. It begins with foundational Python 3 programming and moves on to practical implementation using OpenCV, a widely used library for classical computer vision. Participants will then explore the PyTorch framework and learn to design models for classification and image segmentation, two tasks central to both industry and research. Beyond theory, the course aims to offer hands-on experience in building a deployable real-world application that puts these skills into practice.

This introductory course blends theory with hands-on tutorials, making it well suited to anyone eager to get started in computer vision. The tutorial-driven approach is designed for those who want to apply classical and modern deep learning algorithms to real-world problems. Those who have previously taken computer vision courses may find some overlap in theory; however, this course complements them with programming and application skills that are widely used in industry. A PDF of this webpage can be found [ here ].

We will use Google Space for communication and slide sharing. Please find the Google Space here. Due to certain restrictions, I cannot put the slides online at the moment; however, they are available in the private Google Space.

Grading


(Tentative)
50%: Assignments (about 3-4)
20%: Recitation
30%: End-semester examination

Pre-requisites


No formal prerequisites; only a willingness to learn and explore.

Instructors


Jimut Bahan Pal, Seshadri Mazumder

Contact me


<my-last-name>.<my-first-name>@iitb.ac.in

Schedule, Materials and Topics Covered


Date Time Topic
13-01-2024 4:30 - 7:30 P.M. (3 hrs) Lec-1: Overview of the course & General Introductions. [ Attendance | General Notes | Recitation Topics | Lecture - 1 slides ]
20-01-2024 1:00 - 4:00 P.M. (3 hrs) Lec-2: NumPy basics, PyTorch fundamentals, Autograd, Optimizers, Loss functions, Activation functions [ Attendance | Autograd Notes | More notes | Lecture - 2 Notebook ]
28-01-2024 5:00 - 8:00 P.M. (3 hrs) Lec-3: Sigmoid, Softmax, Categorical Cross Entropy, Binary Cross Entropy, Back-propagation theory [ Attendance | Notes ]
03-02-2024 5:30 - 9:00 P.M. (3.5 hrs) Lec-4: Classification pipeline, Parameter Calculation in models [ Attendance | Notes | Parameter calculation notebook | Classification Pipeline notebook | TEAM_FORMATION_RECITATION ]
11-02-2024 2:00 - 5:00 P.M. (3 hrs) Lec-5: Theory of Back-prop continuation, Discussions about assignments, Word Vectors [ Attendance | Notes ]
ASSIGNMENT-1 (Multi-Class Classification pipeline using PyTorch) out on 11-02-24 (Deadline: 3rd-March-2024, 11:59 P.M.)
17-02-2024 5:00 - 7:00 P.M. (2 hrs) Lec-6: Lecture by Seshadri Mazumder on NLP [ Attendance | NLP-1 Slides ]
25-02-2024 11:00 P.M. - 1:30 A.M. (2.5 hrs) Lec-7: Lecture by Seshadri Mazumder on CBOW, Skip-gram, etc. (Discussion about: Image Segmentation) [ Attendance | NLP-2 Slides ]
03-03-2024 10:00 P.M. - 12:30 A.M. (2.5 hrs) Lec-8: Image Segmentation using U-Net (PyTorch Code) - Segmenting Skin Lesion (Single Class) [ Attendance | Notebook ]
04-03-2024 10:30 P.M. - 12:30 A.M. (2.0 hrs) Lec-9: Lecture by Seshadri Mazumder on Primordial Language Models [ Attendance ]
10-03-2024 10:00 P.M. - 1:30 A.M. (3.5 hrs) Lec-10: VAE theory [ Attendance | VAE Theory | VAE Paper - Recommended Reading ]
11-03-2024 10:00 P.M. - 12:30 A.M. (2.5 hrs) Lec-11: Neural Machine Translation using LSTM - coding session-1 by Seshadri Mazumder [ Attendance | Notebook ]
ASSIGNMENT-2 (Multi-Class Semantic Segmentation pipeline using PyTorch) out on 13-03-24 (Deadline: 5th-April-2024, 11:59 P.M.) | [ ASSIGNMENT BACKUP DAYS REMAINING ]
16-03-2024 1:00 P.M. - 3:30 P.M. (2.5 hrs) Assignment-1 evaluations for credited students
19-03-2024 10:00 P.M. - 12:30 A.M. (2.5 hrs) Lec-12: Neural Machine Translation using LSTM - coding session-2 by Seshadri Mazumder [ Attendance | Notebook ]
28-03-2024 10:00 P.M. - 1:30 A.M. (3.5 hrs) Lec-13: Transformers and Attention by Seshadri Mazumder [ Attendance | Notebook-2,4,5 ]
10-04-2024 10:00 P.M. - 1:00 A.M. (3.0 hrs) Lec-14: Transformers theory [ Attendance | Minor notes ]


Total hours covered so far: 42

Useful Online Resources for the Curious


Completely optional; intended for advanced, curious students who have additional time and an interest in research.

Reference Materials


Course materials (e.g., slides and Jupyter notebooks) will be made available via the Google Space.
Week-wise pointers to additional resources will be posted on the course website or in the Google Space.
Here are the classic books, if you are interested:

Acknowledgements


This course was designed to fulfil the teaching requirement of the Prime Minister's Research Fellowship. I am grateful to my supervisor, Prof. Suyash P. Awate (CSE, IIT Bombay), for his assistance in acquiring the materials necessary for online teaching. I am also thankful to Swami Punyeshwarananda, Head of the Computer Science Department at RKMVERI, for trusting me with the responsibility of teaching this course. Finally, my heartfelt gratitude goes to my department, CMInDS, including its dedicated faculty (Prof. Manjunath), supportive staff (Taha and Suraj sir), and cherished friends. Their unwavering support has been instrumental in making all of this possible.
