Course Details
Computer vision algorithms are increasingly part of everyday applications, which prompted
the development of this course. It begins with foundational Python 3 programming and moves on
to practical implementation using OpenCV, a widely used library for classical computer vision.
Participants will then work with the PyTorch framework and learn to design models for
classification and image segmentation, two tasks central to both industry and research.
Beyond theory, the course may offer hands-on experience in building a deployable real-world
application that puts the acquired skills into practice.
This introductory program combines theoretical concepts with hands-on tutorials, making it
suitable for anyone who wants to get started in computer vision. The tutorial-driven approach
is designed for those keen to apply classical and modern deep learning algorithms to
real-world scenarios. For those who have previously taken computer vision courses, there may
be some overlap in theory; however, this course complements them with a focus on programming
and application, skills widely used in industry for building real-world applications.
A PDF of this webpage can be found [ here ].
We will actively use Google Space for communication and slide sharing. Please find the Google Space here.
Please note that, due to certain restrictions, I cannot put the slides online at the moment; however, they are
available in the private Google Space.
Grading
(Tentative)
50%: Assignments (about 3-4)
20%: Recitation
30%: End-semester examination
Pre-requisites
No formal prerequisites; only a willingness to learn and explore.
Instructors
Jimut Bahan Pal, Seshadri Mazumder
Contact me
<my-last-name>.<my-first-name>@iitb.ac.in
Schedule, Materials and Topics Covered
Assignments and Question Papers:
Assignment-1
Assignment-2
End-Semester Question Paper
Recitation Slides:
Privacy in Age Recognition From Images - Anirban & Sourish
Uncertainty Sets for Image Classifiers using Conformal Prediction - Bidit & Srijan
Normalizing Flow - Shreyas
Date | Time | Topic
13-01-2024 | 4:30 - 7:30 P.M. (3 hrs) | Lec-1: Overview of the course & General Introductions. [ Attendance | General Notes | Recitation Topics | Lecture - 1 slides ]
20-01-2024 | 1:00 - 4:00 P.M. (3 hrs) | Lec-2: NumPy basics, PyTorch fundamentals, Autograd, Optimizers, Loss functions, Activation functions [ Attendance | Autograd Notes | More notes | Lecture - 2 Notebook ]
28-01-2024 | 5:00 - 8:00 P.M. (3 hrs) | Lec-3: Sigmoid, Softmax, Categorical Cross Entropy, Binary Cross Entropy, Back-propagation theory [ Attendance | Notes ]
03-02-2024 | 5:30 - 9:00 P.M. (3.5 hrs) | Lec-4: Classification pipeline, Parameter Calculation in models [ Attendance | Notes | Parameter calculation notebook | Classification Pipeline notebook | TEAM_FORMATION_RECITATION ]
11-02-2024 | 2:00 - 5:00 P.M. (3 hrs) | Lec-5: Back-propagation theory (continued), Discussions about assignments, Word Vectors [ Attendance | Notes ]
ASSIGNMENT-1 (Multi-Class Classification pipeline using PyTorch) out on 11-02-24 (Deadline: 3rd-March-2024, 11:59 P.M.)
17-02-2024 | 5:00 - 7:00 P.M. (2 hrs) | Lec-6: Lecture by Seshadri Mazumder on NLP [ Attendance | NLP-1 Slides ]
25-02-2024 | 11:00 P.M. - 1:30 A.M. (2.5 hrs) | Lec-7: Lecture by Seshadri Mazumder on CBOW, Skip-gram, etc. (Discussion about image segmentation) [ Attendance | NLP-2 Slides ]
03-03-2024 | 10:00 P.M. - 12:30 A.M. (2.5 hrs) | Lec-8: Image Segmentation using U-Net (PyTorch Code) - Segmenting Skin Lesion (Single Class) [ Attendance | Notebook ]
04-03-2024 | 10:30 P.M. - 12:30 A.M. (2.0 hrs) | Lec-9: Lecture by Seshadri Mazumder on Primordial Language Models [ Attendance ]
10-03-2024 | 10:00 P.M. - 1:30 A.M. (3.5 hrs) | Lec-10: VAE theory [ Attendance | VAE Theory | VAE Paper - Recommended Reading ]
11-03-2024 | 10:00 P.M. - 12:30 A.M. (2.5 hrs) | Lec-11: Neural Machine Translation using LSTM - coding session-1 by Seshadri Mazumder [ Attendance | Notebook ]
ASSIGNMENT-2 (Multi-Class Semantic Segmentation pipeline using PyTorch) out on 13-03-24 (Deadline: 5th-April-2024, 11:59 P.M.) [ ASSIGNMENT BACKUP DAYS REMAINING ]
16-03-2024 | 1:00 P.M. - 3:30 P.M. (2.5 hrs) | Assignment-1 evaluations for credited students and related discussions.
19-03-2024 | 10:00 P.M. - 12:30 A.M. (2 hrs) | Lec-12: Neural Machine Translation using LSTM - coding session-2 by Seshadri Mazumder [ Attendance | Notebook ]
28-03-2024 | 10:00 P.M. - 1:30 A.M. (3.5 hrs) | Lec-13: Transformers and Attention by Seshadri Mazumder [ Attendance | Notebook-2,4,5 ]
10-04-2024 | 10:00 P.M. - 1:00 A.M. (3.0 hrs) | Lec-14: Transformers theory [ Attendance | Minor notes ]
24-04-2024 | 5:00 P.M. - 8:00 P.M. (3.0 hrs) | Recitation Presentation [ Attendance | Presentation ]
08-05-2024 | 09:00 P.M. - 11:00 P.M. (2.0 hrs) | Viva for Assignment-2 and closure [ Attendance ]
Total hours covered till now: 46.5
Useful Online Resources for the Curious
Completely optional; intended for advanced, curious students who have additional time and an interest in research.
Reference Materials
Course materials, e.g., slides and Jupyter notebooks, will be made available via the Google Space.
Week-wise pointers to additional resources will be shared via the course website or the Google Space.
Here are the classic books, if you are interested:
- Deep Learning with PyTorch (Thomas Viehmann, Eli Stevens, Luca Pietro Giovanni Antiga), Manning Publications, 2020.
- (Optional) Deep Learning (Ian J. Goodfellow, Yoshua Bengio and Aaron Courville), MIT Press, 2016.
Acknowledgements
This course was designed to fulfil the teaching requirement of the Prime Minister's Research Fellowship.
I am grateful to my supervisor, Prof. Suyash P. Awate (CSE, IIT Bombay), for his assistance in acquiring
the materials necessary for online teaching. I am also thankful to Swami Punyeshwarananda, the Head of the
Computer Science Department at RKMVERI, for trusting me with the responsibility of teaching this course.
Finally, my heartfelt gratitude goes to my department, CMInDS, including its dedicated faculty
(Prof. Manjunath), supportive staff (Taha and Suraj sir), and cherished friends.
Their unwavering support made all of this possible.