How I ended up with a data science job
My topsy turvy ride to data science
Back in 2009, following a traditional middle class pathway, I cleared JEE & got into the Civil Engineering Department at IIT Bombay. I then changed my branch from Civil to Mechanical Engineering at the end of the first year on account of my good grades (in hindsight, it made no difference whatsoever!). But I never really felt excited about my core engineering classes. I interned at a major automobile manufacturing firm and at a computational fluid dynamics software firm, 2 areas I thought I’d enjoy because I was interested in racing & F-1, but I could never see myself working in those fields for the rest of my life. I wouldn’t have minded studying further, if only I had known what area I wanted to deep dive into.
Being sort of disillusioned with engineering, I ended up joining a Big-4 firm as a management consultant, because consulting had somehow become the “dream job” during campus placements. You get a fancy lifestyle travelling around telling CXOs what to do with their businesses while still in your 20s & it also opens the door to various exit options. I actually did learn quite a bit about how businesses work across various sectors. It helped improve my soft skills as well, but I felt I wasn’t doing justice to my technical skills. So after ~2 years, I quit the job, but again had no idea what to do next, except that I didn’t want to do an MBA.
After consulting, I went to an edtech startup called Avanti, founded by my college seniors, aimed at providing high quality math & science education to students in remote areas using technology & an innovative pedagogy. Since I didn’t have any clear long term goals back then, I joined this startup in the hope of doing something good while I tried to figure my stuff out. Working towards a common mission with a young, talented & motivated team was easily the best professional experience of my career! It almost felt like an extended semester of college, working with like-minded people from the top schools of the country.
Breaking into data science
My foray into data science started while I was working on a large scale school transformation project with the Haryana government at Avanti. We were working with a team from J-PAL who were measuring the efficacy of Avanti’s classroom pedagogy. Seeing how they measured student learning purely using data got me really hooked on statistics. A big advantage of working at an early stage startup is that you have the freedom to create your own role, and Avanti’s founding team supported me a lot in starting the data science practice in the company.
Learning goes on
I had strong Excel skills & some brief coding experience when I started, but I had to rely mainly on self learning to pick up data wrangling, visualization & modeling skills across tools like R, Python & Tableau. I was fortunate that Avanti had a small but very strong team of developers who helped me pick up a lot of the best practices in tech along the way. Most of my projects back then revolved around creating web apps / dashboards to A) help students track their learning and fix their gaps and B) help teachers run classrooms more smoothly. These reporting tools laid the foundation for a lot of ML applications, so I always had something interesting to look forward to learning.
I moved on from Avanti after 4 years to head the data science team at TransUnion, a major credit bureau. Compared to a startup, things move slowly in a big MNC, but then, you get more perks, and need to tackle a fairly different set of problems such as scale, security and reliability. Regardless of the projects I work on, learning has now become a permanent part of my lifestyle, and if nothing else, it’s this aspect of my job that I feel the most excited about.
The uncertain road ahead
The field of ML is changing at an amazing speed. To give you a rough idea, it was just 6 years ago that a winning solution to a competition on Kaggle was written in a forgotten language like Common Lisp. Lisp, of all languages!! Today, you have tools like TensorFlow & PyTorch to train giant networks on GPUs & TPUs on the cloud. And then, every day, you keep hearing about so many advancements in deep learning, such as the new GPT-3 model being able to write students’ college application essays & viral posts on Hacker News.
Will this hype in ML or data science last? Surely NOT!! We might even see an AI winter if the current research is not able to meet the oh-so-high expectations of investors and companies all around the world. I’m pretty sure the role of the unicorn “data scientist” will also go away soon. ML might just get integrated into software engineering, creating a new role alongside front-end & back-end development, while Business Intelligence & Data Analyst roles might see more requirements added to their job descriptions.
The hype has really helped the L&D functions of HR departments around the world though. Without much investment from their side, their IT workforce has almost been forced to learn & adapt to this new technology due to excessive competition in the job market. For someone just entering the field, I must admit the road is much tougher. The job market is highly saturated with thousands of entry level candidates hoping to break into ML or AI, and the situation has gotten worse with COVID-19 (you can say the same thing about entry level CS jobs as well).
So should you really be working in this industry right now? Regardless of an economic boom or bust, companies will always be on the lookout for good problem solvers who can quickly learn and adapt. You need to commit to the learning lifestyle though. For me, personally, I realized I needed to fill a lot of gaps in my theoretical CS knowledge, and that is something I’m enjoying learning these days. A wise man once said you should start from scratch every 5 years. Maybe that sounds too extreme, but I’ve realized that as long as you’re moving, it’s not that difficult to change your trajectory.