What kind of data analysis can AI do?
We already know ChatGPT as the most versatile AI tool, with plugins that let you do almost anything. It can generate working code in Python, R, and many other languages, as well as complex SQL queries. As you can imagine, combining these capabilities lets you use AI for almost every part of your data analysis work.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_123_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="100%"/>
Use cases include:
- Querying
- Cleaning and other processing
- Visualizing
When it comes to working with data, specialized tools like Julius AI (for CSV files) or BlazeSQL (for SQL databases) are designed specifically for this purpose. Unlike ChatGPT, these tools do not require you to upload or connect your data and explain it every time you open them.
ChatGPT works for quick analysis of a CSV file, but most companies store their data in SQL databases within private networks. Specialized tools can connect to these secure SQL databases and answer your questions by querying the database and inspecting the results.
How could AI replace data analysts?
Data analytics is about getting insights from data. Data analysts and data scientists are the people with the technical skills to provide stakeholders with the information they need. But things have changed, and AI tools can now successfully complete some of the tasks that were previously only possible for data analysts and scientists.
In theory, a business stakeholder with no technical skills could now plug their data into an AI tool and make a request like “Get monthly revenue grouped by product, for the top 3 products of the year.” The AI can then fetch the data and even visualize it. The user would only need to spend a few seconds typing the request. If they had asked a human colleague, they might not have gotten an answer for a few days or more.
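To give a sense of what such a request translates into, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and the sample rows are all invented for illustration; a real tool would generate SQL against your own schema, and this is only one plausible way to write the query.

```python
import sqlite3

# Hypothetical schema and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (product TEXT, sale_date TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('A', '2023-01-15', 100), ('A', '2023-02-10', 150),
        ('B', '2023-01-20', 80),  ('B', '2023-02-05', 60),
        ('C', '2023-01-03', 40),  ('C', '2023-02-18', 70),
        ('D', '2023-01-09', 10),  ('D', '2023-02-25', 5);
""")

# Step 1 (CTE): find the 3 products with the highest revenue in the year.
# Step 2: report monthly revenue for those products only.
QUERY = """
WITH top_products AS (
    SELECT product
    FROM sales
    WHERE strftime('%Y', sale_date) = '2023'
    GROUP BY product
    ORDER BY SUM(amount) DESC
    LIMIT 3
)
SELECT s.product,
       strftime('%Y-%m', s.sale_date) AS month,
       SUM(s.amount) AS revenue
FROM sales AS s
JOIN top_products AS t ON t.product = s.product
WHERE strftime('%Y', s.sale_date) = '2023'
GROUP BY s.product, month
ORDER BY s.product, month;
"""

monthly_revenue = conn.execute(QUERY).fetchall()
for row in monthly_revenue:
    print(row)
```

Even this small request needs two steps (rank the products, then break revenue down by month), which is exactly the kind of detail a non-technical user would have to trust the tool to get right.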
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_993_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="100%"/>
Seeing a picture like this can be surprising and worrying for data analysts, but replacing analysts and data scientists is not that simple. Running a SQL query and charting the result is just part of their job, and even that can’t always be done reliably using AI. It may have worked in the screenshot above, but what if the result is wrong even though it looks correct?
It seems like it’s time to talk about some limitations of AI in working with data.
Limitation #1: AI Hallucinations
Most people who have worked with ChatGPT and similar tools have heard the term “hallucination” in this context. When you ask these tools about something they don’t know, they will sometimes just make things up.
The reason for these hallucinations is simple: LLMs are like very advanced autocomplete algorithms. They return what is most likely to be the next message in a conversation, based on the data they were trained on. Thanks to high-quality data sets and advanced training techniques, this “autocompletion” works so well that these tools can satisfy complex requests with remarkably high-quality results. Unfortunately, when they encounter situations that their training data did not prepare them for, the most likely next message may not actually make much sense.
What if the AI generates some code that runs, but the code returns incorrect data? The business stakeholder using an AI data analyst may have no idea that the result is incorrect, because they cannot spot the error without understanding the code.
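As a concrete illustration of code that runs but returns the wrong number, here is a small sketch using Python's built-in sqlite3 module. The `orders` and `order_items` tables and their values are hypothetical. The first query looks reasonable, but the join repeats each order once per item, so revenue is silently overstated.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, total REAL);
    CREATE TABLE order_items (order_id INTEGER, product TEXT);
    INSERT INTO orders VALUES (1, 100), (2, 50);
    INSERT INTO order_items VALUES (1, 'A'), (1, 'B'), (2, 'C');
""")

# Plausible-looking query: order 1 has two items, so after the join
# its total appears twice and is counted twice in the sum.
wrong_revenue = conn.execute("""
    SELECT SUM(o.total)
    FROM orders AS o
    JOIN order_items AS i ON i.order_id = o.order_id
""").fetchone()[0]

# Summing the order totals directly avoids the duplication.
correct_revenue = conn.execute(
    "SELECT SUM(total) FROM orders"
).fetchone()[0]

print(wrong_revenue, correct_revenue)
```

Both queries execute without any error or warning. Only someone who understands how the join multiplies rows would notice that the first figure is inflated.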
Limitation #2: Business Knowledge
Typically, when a new data analyst starts working at a company, they will have to learn what some of the columns and values mean. This is because the data model was designed by the company. You can’t just analyze data without understanding where it comes from, because common knowledge is not enough to understand most databases.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_757_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="100%"/>
AI tools like BlazeSQL allow you to include this information for the AI to use, but a data analyst or data scientist will still be needed to keep it up to date.
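One generic way to picture this kind of business knowledge is a small data dictionary that gets turned into text the model can read alongside a question. The sketch below is an assumption-laden illustration: the column names, their meanings, and the `build_context` helper are all hypothetical, and it does not represent how BlazeSQL or any other specific tool stores or uses this information.

```python
# Hypothetical data dictionary: column names and meanings are invented.
DATA_DICTIONARY = {
    "orders.status": "1 = paid, 2 = refunded, 3 = cancelled",
    "orders.amt_net": "Order value in EUR, excluding VAT",
}


def build_context(dictionary: dict[str, str]) -> str:
    """Render the dictionary as plain-text notes, one column per line."""
    lines = [
        f"- {column}: {meaning}"
        for column, meaning in sorted(dictionary.items())
    ]
    return "Column notes:\n" + "\n".join(lines)


context = build_context(DATA_DICTIONARY)
print(context)
```

Without notes like these, nothing in the schema tells the model that a status of 2 means "refunded". When the meaning of a column changes, someone who knows the business has to update the dictionary.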
Limitation #3: Sometimes the AI Just Gets Stuck (“Blind Spots”)
You may have seen examples where ChatGPT got stuck on a very basic question. These questions are usually very easy to answer, but they require the AI to reason in a way that it is not very good at.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_984_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="100%"/>
We can call these cases “blind spots,” and they also exist when writing code. For example, a common blind spot when AI generates SQL queries is the use of subqueries (including common table expressions). AI models often generate queries that attempt to select a column from a subquery, even though that column does not exist in the subquery.
WITH recent_orders AS (
    SELECT
        customer_id,
        MAX(order_date) AS latest_order_date
    FROM
        orders
    GROUP BY
        customer_id
)
SELECT
    customer_id,
    product_id, -- (This column is not defined in the subquery)
    latest_order_date
FROM
    recent_orders
Even when the mistake is pointed out to them, they will often make the same mistake when trying again.
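The sketch below runs the query above against a tiny, invented `orders` table using Python's built-in sqlite3 module, and then shows one possible repair. The intended result is not stated in the original query, so the fix is an assumption: it joins the subquery back to `orders` to pick up the `product_id` of each customer's latest order, and it assumes at most one order per customer per date.

```python
import sqlite3

# Hypothetical sample data, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, product_id INTEGER, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 10, '2023-01-05'), (1, 11, '2023-03-01'), (2, 12, '2023-02-14');
""")

CTE = """
WITH recent_orders AS (
    SELECT customer_id, MAX(order_date) AS latest_order_date
    FROM orders
    GROUP BY customer_id
)
"""

# The generated query: product_id is not exposed by recent_orders.
BROKEN = CTE + "SELECT customer_id, product_id, latest_order_date FROM recent_orders"

# One possible fix: join back to orders to recover product_id.
FIXED = CTE + """
SELECT r.customer_id, o.product_id, r.latest_order_date
FROM recent_orders AS r
JOIN orders AS o
  ON o.customer_id = r.customer_id
 AND o.order_date = r.latest_order_date
ORDER BY r.customer_id
"""

try:
    conn.execute(BROKEN)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

latest = conn.execute(FIXED).fetchall()
print(error_message)
print(latest)
```

In this case the database rejects the broken query outright, which is the better outcome: the mistake is visible. The harder cases are the ones where the query runs and returns something plausible.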
Limitation #4: AI Models Agree Too Much
AI models will tend to agree with you, even when you are wrong. This can be a big problem when the AI model is supposed to play the role of an expert, since an expert should be able to correct you when you make a mistake.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_97_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="50%"/>
Limitation #5: Input Length
A human could spend months learning about a project and the database, gathering a lot of important information. On the other hand, an LLM typically has a “token limit,” meaning it can only accept a certain amount of information.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_976_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="100%"/>
This input length (also known as the “token limit”) is often restrictive when dealing with complex tasks. How could you condense those months of learning into a few pages and fit them into the AI model?
The widely available version of GPT-4 is limited to about 12 pages of input and output combined. Keep in mind that a data analyst attends hours of meetings and reads documentation and reports. Everything the model produces (GPT-4’s code and explanations) must be subtracted from those 12 pages, since the limit covers the output, not just the input.
This means that a major data analysis project that requires a lot of learning and exploration is simply not feasible.
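The shared input-and-output budget described above can be sketched as simple arithmetic. The figures here are assumptions for illustration: an 8,192-token window and the common rule of thumb of roughly 4 characters per token for English text. Real limits vary by model, and accurate counts require the model's own tokenizer.

```python
# Assumed figures, for illustration only.
CONTEXT_WINDOW_TOKENS = 8192  # assumed total budget (input + output)
CHARS_PER_TOKEN = 4           # rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def input_budget(reserved_output_tokens: int) -> int:
    """Tokens left for the prompt once space for the answer is reserved."""
    return CONTEXT_WINDOW_TOKENS - reserved_output_tokens


def fits(prompt: str, reserved_output_tokens: int) -> bool:
    """True if the prompt fits alongside the reserved output space."""
    return estimate_tokens(prompt) <= input_budget(reserved_output_tokens)


# Stand-in for 40,000 characters of project notes.
project_notes = "x" * 40_000
print(estimate_tokens(project_notes))  # 10000
print(input_budget(1500))              # 6692
print(fits(project_notes, 1500))       # False
```

Under these assumptions, even a modest set of notes overflows the window once room is reserved for the model's answer, so the notes have to be cut down or summarized first.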
Limitation #6: Social Skills
Last but not least, ChatGPT and other AI chatbots are… just chatbots. Human interaction and social skills are a big part of working on data projects, whether it’s building trust, dealing with office politics, or interpreting non-verbal communication. These elements are crucial to successfully collaborating with stakeholders and completing a project.
What’s Next?
As you can see, AI has a number of limitations that prevent it from being a fully capable data analyst. The list above only contains a few of the main limitations, but there are many other big obstacles when it comes to replacing a data expert. In other words, you don’t need to worry about AI replacing you!
Having said that, AI is already having a significant impact on data analysts and scientists. It may not be perfect, but it already offers incredible value.
Work faster with AI
Writing code, whether Python, SQL, or R, can be time-consuming. These AI tools may not be 100% accurate, but they still work well most of the time. It’s often 10 times faster to quickly review what they generated than to do everything from scratch.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_467_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="100%"/>
In cases where the AI struggles or makes mistakes frequently, it may be faster to do it from scratch. In other cases, the massive increase in productivity is worth the occasional debugging effort. The important thing is to experiment with different tools, know their strengths and weaknesses, and integrate them into your workflow accordingly.
What about the future?
Things are progressing extremely quickly, so some of the current limitations won’t necessarily be a factor for long. This is especially true now that so many people are using AI tools, as the tools learn from their users. These interactions are used to train the models, and there are millions of interactions every day.
ChatGPT has the fastest-growing user base of all time, and it learns from that user base.
<img decoding="async" src="https://technicalterrence.com/wp-content/uploads/2023/11/1698836540_487_AI-vs-Data-Analysts-Top-Six-Limitations-Impacting-the-Future.png" alt="AI vs. Data Analysts: Top Six Limitations Impacting the Future of Analytics" width="60%"/>
With competitors like Claude, Bard, and others joining the race, we’re sure to see big improvements soon.
Being prepared for these changes is easy: just keep an eye out for new tools and experiment with them. That way, you’ll know their strengths and weaknesses and can make sure you’re taking advantage of the latest technology and adapting as it evolves.
With that in mind, some tools to consider include:
- BlazeSQL (for SQL databases)
- ChatGPT Advanced Data Analysis (for CSV and other files)
- PandasAI (adds generative AI to the pandas library)
Justus Mulli is a data scientist and founder, with experience in finance, healthcare, and e-commerce. He leverages his expertise in data science and artificial intelligence to implement disruptive AI solutions across various industries and professions.