Image by author
I like to think of ChatGPT as a smarter version of StackOverflow: very useful, but not about to replace the professionals anytime soon. As a former data scientist, I spent a lot of time playing with ChatGPT when it came out, and I was quite impressed with its coding capabilities. It could generate some pretty useful code from scratch and offer suggestions on my own code. It was pretty good at debugging when I asked it for help with an error message.
But inevitably, the more time I spent using it, the more I ran into its limitations. For any developers who fear ChatGPT will take their jobs, here's a list of what ChatGPT can't do.
The first limitation has nothing to do with its capabilities, but with legality. Any code generated solely by ChatGPT and copied and pasted by you into a company product could expose your employer to a nasty lawsuit.
This is because ChatGPT freely extracts code snippets from the data it was trained on, which comes from all over the Internet. "I had ChatGPT generate some code for me and instantly recognized which GitHub repository it got a big chunk of it from," explained Reddit user ChunkyHabaneroSalsa.
Ultimately, there is no way to know where ChatGPT's code came from or what license it was under. And even if it was generated completely from scratch, anything created by ChatGPT is not copyrightable. As Bloomberg Law writers Shawn Helms and Jason Krieser put it, "A 'derivative work' is 'a work based upon one or more preexisting works.' ChatGPT is trained on pre-existing works and generates output based on that training."
If you use ChatGPT to generate code, you may get in trouble with your employers.
Here's a fun test: have ChatGPT create code that runs statistical analysis in Python.
Is the statistical analysis correct? Probably not. ChatGPT does not know whether the data meets the assumptions necessary for the test results to be valid. ChatGPT also doesn't know what the stakeholders want to see.
For example, I could ask ChatGPT to help me determine whether there is a statistically significant difference in satisfaction ratings between different age groups. ChatGPT suggests an independent samples t-test and finds no statistically significant difference across age groups. But the t-test is not the best option in this case, for several reasons: there may be more than two age groups, or the data may not be normally distributed.
Image from decipherzone.com
A full-stack data scientist would know which assumptions to check and what type of test to run, and could perhaps give ChatGPT more specific instructions. But ChatGPT itself will happily generate correct code for the wrong statistical analysis, making the results unreliable and unusable.
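To make that concrete, here is a minimal sketch (mine, not ChatGPT's) of the kind of assumption-checking a data scientist would do before picking a test. The survey.csv file and its age_group/satisfaction columns are made up for illustration; the point is that the choice of test depends on checks ChatGPT never sees.

```python
import pandas as pd
from scipy import stats

# Hypothetical survey data with "age_group" and "satisfaction" columns
df = pd.read_csv("survey.csv")
groups = [g["satisfaction"].values for _, g in df.groupby("age_group")]

# Check normality within each group (Shapiro-Wilk) and equal variances (Levene)
normal = all(stats.shapiro(g).pvalue > 0.05 for g in groups)
equal_var = stats.levene(*groups).pvalue > 0.05

if len(groups) == 2 and normal and equal_var:
    # Only now is an independent samples t-test defensible
    result = stats.ttest_ind(*groups)
elif len(groups) > 2 and normal and equal_var:
    # More than two age groups: one-way ANOVA instead of a t-test
    result = stats.f_oneway(*groups)
else:
    # Assumptions violated: fall back to a non-parametric test
    result = stats.kruskal(*groups)

print(result)
```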
For any problem like this, one that requires critical thinking and problem-solving, ChatGPT is not the best option.
Any data scientist will tell you that part of the job is understanding and interpreting the priorities of a project's stakeholders. ChatGPT, or any AI, cannot fully understand or manage them.
For one thing, stakeholder priorities often involve complex decision-making that takes into account not only data, but also human factors, business objectives, and market trends.
For example, in an app redesign, you might find that the marketing team wants to prioritize user-engagement features, the sales team is pushing for features that support cross-selling, and the customer service team needs better in-app support features to help users.
ChatGPT can provide information and generate reports, but it cannot make nuanced decisions that align with the varied (and sometimes competing) interests of different stakeholders.
Additionally, stakeholder management often requires a high degree of emotional intelligence: the ability to empathize with stakeholders, understand their concerns on a human level, and respond to their emotions. ChatGPT lacks emotional intelligence and cannot manage the emotional aspects of stakeholder relationships.
You may not think of this as a coding task, but the data scientist currently working on the code for that new feature release knows how closely that work is tied to stakeholder priorities.
ChatGPT doesn't come up with anything truly new. It can only remix and reframe what it has learned from its training data.
Image from theinsaneapp.com
Want to know how to change the legend size on your R chart? No problem: ChatGPT can pull from thousands of StackOverflow answers to questions asking the same thing. But what about something it's unlikely to have encountered before (an example I asked ChatGPT to generate), like hosting a community meal where each person's plate must contain an ingredient that begins with the same letter as their last name, and you want to make sure there's a good variety of dishes?
When I tried this prompt, it gave me Python code that decided the name of the dish had to match the last name, not even capturing the ingredient requirement correctly. It also wanted me to create 26 categories of dishes, one per letter of the alphabet. It wasn't a smart answer, probably because it was a completely novel problem.
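For contrast, here is a rough sketch (again mine, not ChatGPT's) of how the prompt could actually be interpreted: match each guest's last-name initial against the ingredients of candidate dishes, then check the menu for variety. The guest list and dish catalogue are made up for illustration.

```python
# Hypothetical guests and dishes
guests = ["Maria Alvarez", "Ken Brown", "Priya Chandra"]
dishes = {
    "apple salad": ["apple", "lettuce", "walnut"],
    "bean chili": ["beans", "tomato", "chili"],
    "carrot soup": ["carrot", "onion", "ginger"],
}

def plate_for(guest: str) -> list[str]:
    """Dishes containing an ingredient that starts with the guest's last-name initial."""
    initial = guest.split()[-1][0].lower()
    return [name for name, ingredients in dishes.items()
            if any(item.lower().startswith(initial) for item in ingredients)]

for guest in guests:
    options = plate_for(guest)
    print(guest, "->", options or "no matching dish - add one!")

# Variety check: flag dishes that no guest can use so the menu stays balanced
used = {dish for guest in guests for dish in plate_for(guest)}
print("Unused dishes:", set(dishes) - used)
```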
Last but not least, ChatGPT cannot code ethically. It does not have the ability to make value judgments or understand the moral implications of a piece of code as a human being does.
Ethical coding involves considering how the code might affect different groups of people, ensuring it does not discriminate or cause harm, and making decisions that align with ethical standards and social norms.
For example, if you ask ChatGPT to write code for a loan approval system, it could generate a model based on historical data. However, it cannot understand the social implications of a model that potentially denies loans to marginalized communities because of biases in that data. It would be up to the human developers to recognize the need for fairness and equity, to look for and correct biases in the data, and to ensure that the code aligns with ethical practices.
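As a sketch of what that human step might look like, here is a minimal bias audit a developer could layer on top of whatever model was generated. The loan_applications.csv file and its "group" and "approved" columns are hypothetical, and the 80% threshold is just the rough "four-fifths rule" heuristic, not a substitute for a real fairness review.

```python
import pandas as pd

# Hypothetical historical decisions: one row per application,
# with a demographic "group" column and a 0/1 "approved" column
df = pd.read_csv("loan_applications.csv")

# Compare approval rates across groups (a simple demographic parity check)
rates = df.groupby("group")["approved"].mean()
print(rates)

# Flag a disparity if the worst-off group's rate falls far below the best-off group's
disparity = rates.min() / rates.max()
if disparity < 0.8:
    print(f"Warning: disparate impact ratio {disparity:.2f} - investigate the data and features")
```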
It's worth noting that people aren't perfect at this either: someone coded Amazon's biased recruiting tool, and someone coded the Google Photos categorization that identified Black people as gorillas. But we humans are better at it. ChatGPT lacks the empathy, awareness, and moral reasoning necessary to code ethically.
Humans can understand the broader context, recognize the subtleties of human behavior, and hold debates about right and wrong. We engage in ethical debates, weigh the pros and cons of a particular approach, and take responsibility for our decisions. When we make mistakes, we can learn from them in a way that contributes to our moral growth and understanding.
I loved Redditor Empty_Experience_10's take on it: "If all you do is program, you're not a software engineer, and yes, your job will be replaced. If you think software engineers get paid a lot because they can write code, you have a fundamental misunderstanding of what it is to be a software engineer."
I've found ChatGPT to be great for debugging, reviewing my code, and getting answers a little faster than searching for that StackOverflow answer. But a lot of "coding" is more than just punching Python into a keyboard. It's knowing what your business's objectives are. It's understanding how much care must go into algorithmic decisions. It's building relationships with stakeholders, truly understanding what they want and why, and finding a way to make it happen.
It's storytelling, it's knowing when to choose a pie chart or a bar chart, and it's understanding the narrative the data is trying to tell you. It's being able to communicate complex ideas in simple terms that stakeholders can understand and act on.
ChatGPT can't do any of that. As long as you can, your job will be safe.
Nate Rosidi is a data scientist and works in product strategy. He is also an adjunct professor teaching analytics, and the founder of StrataScratch, a platform that helps data scientists prepare for their interviews with real interview questions from top companies. Connect with him on Twitter: StrataScratch or LinkedIn.