Image created by the author with DALL•E 3
We know that programming is a useful (essential?) skill for data scientists to possess. But what level of programming skill is necessary? Should a data scientist aspire to be “good enough” or instead want to become an expert-level programmer? Should we aspire to be coding ninjas?
If we are going to explore this topic, we should first have an idea of what a beginner, intermediate, and expert-level programmer looks like, or at least what their code looks like.
Below you will find 2 programming tasks, each with 3 code snippets: one each for a hypothetical beginner, intermediate, and expert programmer’s approach to completing the task, with some explanation of the differences. This should give us a foundation on which to build a discussion about the importance of programming skills.
Remember, these are made-up approaches meant to mimic programming at these different levels. All scripts are functional and get the job done, but they do it with varying degrees of elegance, efficiency, and pythonicity.
Let’s first take a task that is simple but can be approached in multiple ways: finding the factorial of a given number. Let’s implement this task for hypothetical beginner, intermediate, and expert Python programmers, and compare the differences in the code.
Beginner Approach
A beginner could take a simple approach, using a for loop to calculate the factorial. This is how they might do it.
n = int(input("Enter a number to find its factorial: "))
factorial = 1

if n < 0:
    print("Factorial does not exist for negative numbers")
elif n == 0:
    print("The factorial of 0 is 1")
else:
    # Multiply the running product by every integer from 1 to n
    for i in range(1, n + 1):
        factorial *= i
    print(f"The factorial of {n} is {factorial}")
Intermediate Approach
An intermediate programmer could use a function to improve code reusability and readability, and also draw on the math library (here, math.prod) to compute the product.
import math

def factorial(n):
    if n < 0:
        return "Factorial does not exist for negative numbers"
    elif n == 0:
        return 1
    else:
        # math.prod requires Python 3.8 or later
        return math.prod(range(1, n + 1))

n = int(input("Enter a number to find its factorial: "))
result = factorial(n)
print(f"The factorial of {n} is {result}")
Expert Approach
A skilled programmer could use recursion and add type hints for better maintainability. They can also make use of Python’s concise and expressive syntax.
from typing import Union

def factorial(n: int) -> Union[int, str]:
    # Chained conditional expression: base case, recursive case, then invalid input
    return 1 if n == 0 else n * factorial(n - 1) if n > 0 else "Factorial does not exist for negative numbers"

n = int(input("Enter a number to find its factorial: "))
print(f"The factorial of {n} is {factorial(n)}")
Summary
Let’s take a look at the differences in the code and what stands out the most between experience levels.
- Beginner: Uses longer, more general code with no functions or libraries and simple logic.
- Intermediate: Uses a function for better structure and math.prod to calculate the product.
- Expert: Uses recursion for elegance, adds type hints, and uses Python’s conditional expression for conciseness.
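One practical caveat with the recursive version is worth sketching. CPython caps recursion depth (the default limit is commonly 1000 and can be read with sys.getrecursionlimit()), so a recursive factorial fails for large inputs where a product-based version does not. The function names below are hypothetical, chosen only to tell the two variants apart, and the error-string branch is left out to keep the comparison short.

```python
import math
import sys


def factorial_recursive(n: int) -> int:
    """Recursive factorial, in the style of the expert snippet (n >= 0 only)."""
    return 1 if n == 0 else n * factorial_recursive(n - 1)


def factorial_prod(n: int) -> int:
    """Product-based factorial, in the style of the intermediate snippet."""
    return math.prod(range(1, n + 1))


# Pick an input deeper than the interpreter's current recursion limit.
big_n = sys.getrecursionlimit() + 100

try:
    factorial_recursive(big_n)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

# The product-based version handles the same input without recursing.
prod_result = factorial_prod(big_n)
```

Elegance and robustness can pull in different directions here, so which version is “better” depends on the inputs you expect.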
For a second example, consider the task of generating the first n numbers of the Fibonacci sequence. This is how programmers of different levels could approach this task.
Beginner Approach
A beginner could use a basic for loop and a list to collect the Fibonacci numbers.
n = int(input("How many Fibonacci numbers to generate? "))
fibonacci_sequence = []

if n <= 0:
    print("Please enter a positive integer.")
elif n == 1:
    print([0])
else:
    fibonacci_sequence = [0, 1]
    # Each new number is the sum of the previous two
    for i in range(2, n):
        next_number = fibonacci_sequence[-1] + fibonacci_sequence[-2]
        fibonacci_sequence.append(next_number)
    print(fibonacci_sequence)
Intermediate Approach
An intermediate programmer could use a list comprehension for a more compact approach. Note that the comprehension below is used only for its side effect of appending to the list.
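n = int(input("How many Fibonacci numbers to generate? "))

if n <= 0:
    print("Please enter a positive integer.")
else:
    fibonacci_sequence = [0, 1]
    # The comprehension's own result is discarded; only the appends matter
    [fibonacci_sequence.append(fibonacci_sequence[-1] + fibonacci_sequence[-2]) for _ in range(n - 2)]
    # Slicing covers the n == 1 case, where only the first seed value is wanted
    print(fibonacci_sequence[:n])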
Expert Approach
An expert could use a generator for a more memory-efficient approach, along with Python’s tuple unpacking to update both variables in a single line.
def generate_fibonacci(n: int):
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

n = int(input("How many Fibonacci numbers to generate? "))
if n <= 0:
    print("Please enter a positive integer.")
else:
    print(list(generate_fibonacci(n)))
Summary
Let’s look at the main programmatic differences that separate the experience levels.
- Beginner: Uses basic lists and control structures; simple but a bit verbose.
- Intermediate: Uses a list comprehension for a more pythonic and concise solution.
- Expert: Employs a generator for a memory-saving solution and uses tuple unpacking to update variables elegantly.
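To make the memory point concrete, here is a rough sketch comparing the size of a generator object with the size of a fully built list. It reuses generate_fibonacci from the expert snippet; the count of 10,000 is an arbitrary assumption, and sys.getsizeof reports only the container’s own size (not the integers it references), so treat the comparison as indicative rather than exact.

```python
import sys
from itertools import islice
from typing import Iterator


def generate_fibonacci(n: int) -> Iterator[int]:
    """Yield the first n Fibonacci numbers one at a time."""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b


count = 10_000  # arbitrary size chosen for illustration

lazy = generate_fibonacci(count)         # nothing has been computed yet
eager = list(generate_fibonacci(count))  # every value is held in memory

generator_size = sys.getsizeof(lazy)
list_size = sys.getsizeof(eager)

# A generator can also be consumed partially, e.g. only the first five values.
first_five = list(islice(generate_fibonacci(count), 5))
```

The generator object stays small no matter how large count is, because values are produced only when something asks for them.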
If all of the example code works and ultimately gets the job done, why should we strive to become the best coders we can be? Great question!
Becoming a competent programmer is about more than just making code work. Below are some reasons why it is beneficial to strive to be a better coder:
1. Efficiency
- Time: Writing more efficient code means that tasks are completed faster, which is beneficial for both the programmer and anyone who uses the software.
- Resource utilization: Efficient code uses less CPU and memory, which can be crucial for applications running with limited resources or at large scale.
2. Readability and Maintainability
- Collaboration: Code is often written and maintained by teams. Clean, well-structured, and well-commented code is much easier for others to understand and collaborate on.
- Longevity: As projects grow or evolve, maintainable code is easier to extend, debug, and refactor, saving time and effort in the long run.
3. Reusability
- Modularity: Writing functions or modules that solve a problem well means you can easily reuse that code in other projects or contexts.
- Community Contributions: High-quality code can be open sourced and benefit a broader community of developers.
4. Robustness and Reliability
- Error Handling: Advanced programmers often write code that can not only solve problems but also handle errors gracefully, making the software more reliable.
- Testing: Understanding how to write testable code, and the tests themselves, helps ensure that the code works as expected in various scenarios.
5. Skill Recognition
- Career Advancement: Being recognized as an expert coder can lead to promotions, job opportunities, and higher salaries.
- Personal satisfaction: There is a sense of accomplishment and pride in knowing that you are capable of writing high-quality code.
6. Adaptability
- New technologies: Strong fundamental skills make it easier to adapt to new languages, libraries or paradigms.
- Problem Solving: A deeper understanding of programming concepts improves your ability to approach problems creatively and effectively.
7. Cost-Effectiveness
- Less debugging: Well-written code is typically less error-prone, reducing the amount of time and resources spent debugging.
- Scalability: Good code can be scaled up or down more easily, making it more cost-effective in the long run.
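As a small illustration of the robustness and testing points above, here is one possible sketch of a factorial function that raises exceptions instead of returning an error string, together with a few plain assertions. The names safe_factorial and raises, and the choice of exception types, are assumptions made for this example; they are not part of the snippets earlier in the article.

```python
import math


def safe_factorial(n: int) -> int:
    """Return n! for a non-negative integer n.

    Raises TypeError for non-integers and ValueError for negative input,
    so callers never have to inspect the return value for an error string.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    return math.prod(range(1, n + 1))


def raises(exc_type, func, *args) -> bool:
    """Tiny test helper: True if func(*args) raises exc_type."""
    try:
        func(*args)
    except exc_type:
        return True
    return False


# Simple checks that could later move into a unittest or pytest suite.
assert safe_factorial(0) == 1
assert safe_factorial(5) == 120
assert raises(ValueError, safe_factorial, -1)
assert raises(TypeError, safe_factorial, 2.5)
```

Because the function contains no input() or print() calls, it can be imported and reused elsewhere, which also touches on the modularity point.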
So while getting the job done is certainly important, the way you do it can have wide-ranging implications for your personal development, your team, and your organization. We should all strive to become the best programmers possible, and that goes for data scientists too.
Matthew Mayo (@mattmayo13) has a master’s degree in computer science and a postgraduate diploma in data mining. As Editor-in-Chief of KDnuggets, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, machine learning algorithms, and exploring emerging AI. He is driven by the mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.