Choosing a university major was difficult for me. It felt like the first step in committing to a career and I wanted a little bit of everything. I liked math and programming, but I also wanted a job that allowed me to be creative, gave me a platform for communication, and was versatile enough to explore different industries. After some research, UC San Diego's Halıcıoğlu Data Science Institute (HDSI) data science program seemed like a good fit. Despite my decision to follow this path, I still had doubts and the assumptions I made at the beginning reflected this skepticism. However, as I work through my final trimesters, I am pleased (and surprised!) at how the realities of my experience have diverged from those expectations.
Expectation #1: Data science will consist of many repetitive math and programming classes.
The reality: Although mathematics and programming are pillars, there is actually a lot of variety in the classes.
Looking back, my classes have had a lot more variety than I expected. Programming and math classes are the majority, but each course offers a different perspective on core topics while equipping us with a wide variety of tools. There is also significantly more diversity in the field, ranging from classes on definitions of statistical fairness to bioinformatics. I also found niches I especially enjoyed in healthcare, data ethics, and privacy. This helped me broaden my perspectives on the roles and industries I could enter as a data scientist early on.
Expectation #2: I would work alone most of the time.
The reality: I work a lot with others and I am better for it.
I like working with people. Ideas are generated faster, I feel more creative, and it's more fun! However, I initially gave in to the stereotype and imagined myself doing my data science homework hunched over a laptop for most of the day, so I was surprised by how much group work there was. Almost all of my programming and math classes encourage us to work with at least one other person. Meeting and working with people I didn't know took me out of my comfort zone and honed my communication and teamwork skills. Even in professional settings where my work was independent, I found that working with other interns made me a better data scientist. Although each of us had similar foundational skills, leaning on each other's different strengths and areas of focus made all of us better overall.
Expectation #3: Data science is the same as machine learning.
The reality: Machine learning is only part of the data science project lifecycle.
To be fair, I didn't know much about data science or how machine learning (ML) was defined when I started my journey. Still, entering the HDSI program, I thought data science was synonymous with ML. I imagined that most of my classes and work would consist of creating predictive models and delving into neural networks. Instead, most courses and jobs in data science focus on data cleaning, exploration, and visualization, and ML analysis takes less time than expected in the end… at least for now.
Expectation #4: My job could be automated.
The reality: Certain responsibilities can be automated, but the creativity of data scientists as problem solvers cannot.
This concern arose during my first natural language processing class, where my professor showed how quickly GPT-3 could write code. It was daunting as a beginning data scientist: how was I supposed to compete with models that could correctly write SQL queries faster than I could read them? However, this exercise was intended to illustrate that our role as technologists was not simply to learn how to use tools, but to understand the inherent processes that allow them to function. Large language models (LLMs) still can't do a data scientist's job properly, but they will eventually (and inevitably) improve, and when they do, I'm optimistic that they will be more of a help than a harm to data scientists. Unlike data scientists, LLMs are not problem solvers. They cannot generate original ideas, use creativity to work through ambiguous problems, or communicate effectively with different audiences. This may change in the future, but through my education and professional experiences, I am confident that I can still make a positive impact in this field.
Takeaways
As part of my journey into data science, I learned to accept the unexpected that comes with reality. I learned that the breadth and depth of data science were ideal for doing a little bit of everything: research, programming, analysis, and storytelling. With that, I am confident in my decision to pursue data science and am excited to see what the next phase of my career has in store for me.