Recently I was playing with deep learning models in TensorFlow and consequently got introduced to managing data as tensors.
As a data engineer who works all day with tables that I can easily slice, dice, and visualize, I had absolutely no intuition for working with tensors, and I seemed to constantly run into the same errors that, especially in the beginning, went far over my head.
However, delving into them has taught me a lot about tensors and TensorFlow, and I wanted to consolidate those learnings here to use as a reference.
If you have a favorite bug, fix, or debugging tip, please leave a comment!
Before delving into the errors themselves, I want to document some of the simple, lightweight code snippets that I've found useful for debugging. (Although for legal reasons it should be noted that we, of course, always debug with official debugging functions and never just with dozens of print statements 🙂)
Seeing inside our TensorFlow datasets
First, looking at our actual data. When we print a DataFrame or SELECT * in SQL, we see the data! When we print a TensorFlow dataset, we see…
<_TensorSliceDataset element_spec=(TensorSpec(shape=(2, 3), dtype=tf.int32, name=None), TensorSpec(shape=(1, 1), dtype=tf.int32, name=None))>
This is all pretty useful information, but it doesn't help us understand what's really going on in our data.
To print a single tensor within the execution graph, we can leverage tf.print. This article is a wonderful deep dive into tf.print that I highly recommend if you plan to use it frequently: Using tf.Print() in TensorFlow
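If you just want a quick taste, here's a minimal sketch (the doubling function is a made-up example):
import tensorflow as tf

@tf.function
def double(x):
    # tf.print executes inside the graph, where a plain print() would
    # only fire once during tracing
    tf.print("x inside the graph:", x)
    return x * 2

double(tf.constant([1, 2, 3]))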
But when we work with TensorFlow datasets during development, sometimes we need to see a few values at a time. For that, we can loop through and print individual samples like this:
import numpy as np
import tensorflow as tf

# Generate dummy 2D data
np.random.seed(42)
num_samples = 100
num_features = 5
X_data = np.random.rand(num_samples, num_features).astype(np.float32)
y_data = 2 * X_data[:, 0] + 3 * X_data[:, 1] - 1.5 * X_data[:, 2] + 0.5 * X_data[:, 3] + np.random.randn(num_samples)

# Turn it into a TensorFlow Dataset
dataset = tf.data.Dataset.from_tensor_slices((X_data, y_data))

# Print the first 10 rows
for i, (features, label) in enumerate(dataset.take(10)):
    print(f"Row {i + 1}: Features - {features.numpy()}, Label - {label.numpy()}")
We can also use skip to reach a specific index:
mini_dataset = dataset.skip(50).take(20)
for i, (features, label) in enumerate(mini_dataset):
    print(f"Row {i + 1}: Features - {features.numpy()}, Label - {label.numpy()}")
Knowing the specs of our tensors
When working with tensors, we also need to know their shape, rank, dimension, and data type (if some of that vocabulary is unfamiliar to you, as it was to me initially, don't worry, we'll come back to it later in the article). Anyway, below are some lines of code for collecting this information:
# Create a sample tensor
sample_tensor = tf.constant([[1, 2, 3], [4, 5, 6]])

# Get the size of the tensor (total number of elements)
tensor_size = tf.size(sample_tensor).numpy()
# Get the rank of the tensor
tensor_rank = tf.rank(sample_tensor).numpy()
# Get the shape of the tensor
tensor_shape = sample_tensor.shape
# Get the dimensions of the tensor
tensor_dimensions = sample_tensor.shape.as_list()
# Print the results
print("Tensor Size:", tensor_size)
print("Tensor Rank:", tensor_rank)
print("Tensor Shape:", tensor_shape)
print("Tensor Dimensions:", tensor_dimensions)
Which outputs:
Tensor Size: 6
Tensor Rank: 2
Tensor Shape: (2, 3)
Tensor Dimensions: [2, 3]
Augmenting model.summary()
Finally, it is always useful to be able to see how data moves through a model and how the shape changes at inputs and outputs between layers. The source of many errors will be a mismatch between these expected input and output shapes and the shape of a given tensor.
model.summary(), of course, gets the job done, but we can supplement that information with the following snippet, which adds a bit more context about the model and layer inputs and outputs:
print("###################Input Shape and Datatype#####################")
(print(i.shape, i.dtype) for i in model.inputs)
print("###################Output Shape and Datatype#####################")
(print(o.shape, o.dtype) for o in model.outputs)
print("###################Layer Input Shape and Datatype#####################")
(print(l.name, l.input, l.dtype) for l in model.layers)
So let's jump into some mistakes!
Rank
ValueError: Shape must be rank x but is rank y….
Well, first of all, what is rank? Rank is just the unit of dimensionality we use to describe tensors. A rank 0 tensor is a scalar; a rank 1 tensor is a vector; a rank 2 tensor is a matrix, and so on for all n-dimensional structures.
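As a quick illustration, here's a minimal sketch of the first three ranks:
scalar = tf.constant(3)                 # rank 0: a single value
vector = tf.constant([1.0, 2.0, 3.0])   # rank 1: a vector
matrix = tf.constant([[1, 2], [3, 4]])  # rank 2: a matrix
print(tf.rank(scalar).numpy(), tf.rank(vector).numpy(), tf.rank(matrix).numpy())
# 0 1 2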
Take, for example, a 5-dimensional tensor.
rank_5_tensor = tf.constant([[[[[1, 2], [3, 4]], [[5, 6], [7, 8]]], [[[9, 10], [11, 12]], [[13, 14], [15, 16]]]],
                             [[[[17, 18], [19, 20]], [[21, 22], [23, 24]]], [[[25, 26], [27, 28]], [[29, 30], [31, 32]]]]])
print("\nRank 5 Tensor:", rank_5_tensor.shape)
Rank 5 Tensor: (2, 2, 2, 2, 2)
The code above shows that each of the five dimensions has a size of two. If we wanted to index into it, we could do so along any of these axes. To get to the last element, 32, we would run something like:
rank_5_tensor.numpy()[1][1][1][1][1]
The official tensor documentation has some really useful visualizations to make this a little more understandable.
Back to the error: it simply indicates that the given tensor has a different rank than expected by a particular function. For example, if the error states "Shape must be rank 1 but is rank 0…", it means we are providing a scalar value where a 1-D tensor is expected.
Let's take the following example, where we try to multiply tensors with the matmul method.
import tensorflow as tf
import numpy as np
# Create a TensorFlow dataset with random matrices
num_samples = 5
matrix_size = 3
dataset = tf.data.Dataset.from_tensor_slices(np.random.rand(num_samples, matrix_size, matrix_size))
mul = [1, 2, 3, 4, 5, 6]

# Define a function that uses tf.matmul
def matmul_function(matrix):
    return tf.matmul(matrix, mul)
# Apply the matmul_function to the dataset using map
result_dataset = dataset.map(matmul_function)
If we take a look at the documentation, matmul expects at least a rank 2 tensor, so multiplying the matrix by [1, 2, 3, 4, 5, 6], which is just a rank 1 tensor, will generate this error.
ValueError: Shape must be rank 2 but is rank 1 for '{{node MatMul}} = MatMul(T=DT_DOUBLE, transpose_a=false, transpose_b=false)(args_0, MatMul/b)' with input shapes: (3,3), (2).
A great first step with this error is to dig into the documentation and understand what the function you are using is looking for (here is a good list of the functions available on tensors: raw_ops).
Then use the rank method to determine what we are actually providing:
print(tf.rank(mul))
tf.Tensor(1, shape=(), dtype=int32)
As far as solutions go, tf.reshape is usually a good place to start. Let's take a brief moment to talk about tf.reshape, as it will be a faithful companion throughout our TensorFlow journey: tf.reshape(tensor, shape, name=None)
Reshape simply takes the tensor we want to reshape and another tensor containing the shape we want the output to have. For example, let's reshape our multiplication input:
mul = [1, 2, 3, 4, 5, 6]
tf.reshape(mul, [3, 2]).numpy()
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)
Our variable will become a (3, 2) tensor (3 rows, 2 columns). A quick note: tf.reshape(t, [3, -1]).numpy() will produce the same thing, because -1 tells TensorFlow to calculate the size of that dimension so that the total size remains constant. The number of elements in the shape tensor is the rank.
Once we create a tensor with the proper rank, our multiplication will work fine!
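For completeness, here's one hedged way to patch the matmul example above (assuming a (3, 2) multiplier is what we actually want; the float64 cast matches the dtype that np.random.rand produces):
# Reshape the rank 1 list into a rank 2 (3, 2) matrix with a matching dtype
mul = tf.reshape(tf.constant([1, 2, 3, 4, 5, 6], dtype=tf.float64), (3, 2))

def matmul_function(matrix):
    return tf.matmul(matrix, mul)  # (3, 3) x (3, 2) -> (3, 2)

result_dataset = dataset.map(matmul_function)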
Shape
ValueError: layer input is incompatible with layer….
Having an intuitive understanding of the shape of the tensor and how it interacts and changes between the layers of the model has made life with deep learning much easier.
First, getting the basic vocabulary out of the way: the shape of a tensor refers to the number of elements along each dimension, or axis, of the tensor. For example, a 2D tensor with 3 rows and 4 columns has shape (3, 4).
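For example:
tensor_2d = tf.zeros((3, 4))
print(tensor_2d.shape)  # (3, 4): 3 rows, 4 columns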
So what can go wrong with the shape? I'm glad you asked, quite a few things!
First, the shape and range of the training data must match the input shape expected by the input layer. Let's take a look at an example, a basic CNN:
import tensorflow as tf
from tensorflow.keras import layers, models

# Create a function to generate sample data
def generate_sample_data(num_samples=100):
    for _ in range(num_samples):
        features = tf.random.normal(shape=(64, 64, 3))
        labels = tf.one_hot(tf.random.uniform(shape=(), maxval=10, dtype=tf.int32), depth=10)
        yield features, labels
# Create a TensorFlow dataset using the generator function
sample_dataset = tf.data.Dataset.from_generator(generate_sample_data, output_signature=(tf.TensorSpec(shape=(64, 64, 3), dtype=tf.float32), tf.TensorSpec(shape=(10,), dtype=tf.float32)))
# Create a CNN model with an input layer expecting (128, 128, 3)
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Fit the model using the dataset
model.fit(sample_dataset.batch(32).repeat(), epochs=5, steps_per_epoch=100)
Trying to run the above code will result in:
ValueError: Input 0 of layer "sequential_5" is incompatible with the layer: expected shape=(None, 128, 128, 3), found shape=(None, 64, 64, 3)
This is because our model expects the input tensor to have shape (128, 128, 3), while our generated data is (64, 64, 3).
In a situation like this, our good friend reshape, or another TensorFlow function, tf.image.resize, can help. If, as in this case, we are working with images, we can simply resize them, or change what our model's input expects:
# Assumed target size; it must match the model's expected input shape
target_shape = (128, 128)

def resize_image(image, label):
    resized_image = tf.image.resize(image, size=target_shape)
    return resized_image, label

# Apply the resize function to the entire dataset
resized_dataset = sample_dataset.map(resize_image)
In this context, it's useful to know a little about how common types of models and model layers expect input in different ways, so let's take a little detour.
Dense-layer deep neural networks take one-dimensional (or two-dimensional, depending on whether you include the batch size, but we'll talk about batch size in a moment) tensors of the form (feature_size,), where feature_size is the number of features in each sample.
Convolutional neural networks take data representing images as three-dimensional tensors of (width, height, channels), where channels is the color scheme: 1 for grayscale and 3 for RGB.
And finally, recurrent neural networks, like the LSTM, take two dimensions: (timesteps, feature_size).
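To make those expectations concrete, here's a small sketch (with toy layer sizes, not the model from this article) showing the input shape each layer type declares; note the leading None that Keras reserves for the batch:
from tensorflow.keras import layers, models

dense_model = models.Sequential([layers.Dense(8, input_shape=(20,))])               # (feature_size,)
cnn_model = models.Sequential([layers.Conv2D(8, (3, 3), input_shape=(64, 64, 3))])  # (width, height, channels)
rnn_model = models.Sequential([layers.LSTM(8, input_shape=(10, 4))])                # (timesteps, feature_size)

for m in (dense_model, cnn_model, rnn_model):
    print(m.layers[0].input_shape)
# (None, 20)
# (None, 64, 64, 3)
# (None, 10, 4)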
But back to the errors! Another common culprit behind TensorFlow shape errors has to do with how the shape changes as the data passes through the model layers. As mentioned above, different layers accept inputs of different shapes and can also reshape their output.
Going back to our CNN example above, let's see what happens when we remove the Flatten layer. If we try to run the code, we will see
ValueError: Shapes (None, 10) and (None, 28, 28, 10) are incompatible
This is where printing all the input and output shapes of our model along with our data shapes is useful to help us identify where there is a discrepancy.
model.summary() will show us
Layer (type)                     Output Shape              Param #
=================================================================
conv2d_15 (Conv2D)               (None, 126, 126, 32)      896
max_pooling2d_10 (MaxPooling2D)  (None, 63, 63, 32)        0
conv2d_16 (Conv2D)               (None, 61, 61, 64)        18496
max_pooling2d_11 (MaxPooling2D)  (None, 30, 30, 64)        0
conv2d_17 (Conv2D)               (None, 28, 28, 64)        36928
flatten_5 (Flatten)              (None, 50176)             0
dense_13 (Dense)                 (None, 64)                3211328
dense_14 (Dense)                 (None, 10)                650
=================================================================
Total params: 3268298 (12.47 MB)
Trainable params: 3268298 (12.47 MB)
Non-trainable params: 0 (0.00 Byte)
And our further diagnosis will reveal
###################Input Shape and Datatype#####################
(None, 128, 128, 3) <dtype: 'float32'>
###################Output Shape and Datatype#####################
(None, 10) <dtype: 'float32'>
###################Layer Input Shape and Datatype#####################
conv2d_15 KerasTensor(type_spec=TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name='conv2d_15_input'), name='conv2d_15_input', description="created by layer 'conv2d_15_input'") float32
max_pooling2d_10 KerasTensor(type_spec=TensorSpec(shape=(None, 126, 126, 32), dtype=tf.float32, name=None), name='conv2d_15/Relu:0', description="created by layer 'conv2d_15'") float32
conv2d_16 KerasTensor(type_spec=TensorSpec(shape=(None, 63, 63, 32), dtype=tf.float32, name=None), name='max_pooling2d_10/MaxPool:0', description="created by layer 'max_pooling2d_10'") float32
max_pooling2d_11 KerasTensor(type_spec=TensorSpec(shape=(None, 61, 61, 64), dtype=tf.float32, name=None), name='conv2d_16/Relu:0', description="created by layer 'conv2d_16'") float32
conv2d_17 KerasTensor(type_spec=TensorSpec(shape=(None, 30, 30, 64), dtype=tf.float32, name=None), name='max_pooling2d_11/MaxPool:0', description="created by layer 'max_pooling2d_11'") float32
flatten_5 KerasTensor(type_spec=TensorSpec(shape=(None, 28, 28, 64), dtype=tf.float32, name=None), name='conv2d_17/Relu:0', description="created by layer 'conv2d_17'") float32
dense_13 KerasTensor(type_spec=TensorSpec(shape=(None, 50176), dtype=tf.float32, name=None), name='flatten_5/Reshape:0', description="created by layer 'flatten_5'") float32
dense_14 KerasTensor(type_spec=TensorSpec(shape=(None, 64), dtype=tf.float32, name=None), name='dense_13/Relu:0', description="created by layer 'dense_13'") float32
That's a lot of output, but we can see that the dense_13 layer expects inputs of shape (None, 50176), while the conv2d_17 layer outputs (None, 28, 28, 64).
The Flatten layer transforms the multidimensional output of the previous layers into the one-dimensional (flat) vector that the Dense layer expects.
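We can verify that with a quick one-off sketch:
x = tf.zeros((1, 28, 28, 64))              # a batch of one conv2d_17-style output
print(tf.keras.layers.Flatten()(x).shape)  # (1, 50176), since 28 * 28 * 64 = 50176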
The Conv2D and MaxPooling2D layers also change their input data in other interesting ways, but those are outside the scope of this article. For an impressive breakdown, see: Ultimate Guide to Input Shape and Model Complexity in Neural Networks
But what about the batch size? I haven't forgotten!
If we break our code once again by removing .batch(32) from the dataset in model.fit, we will get the error:
ValueError: Input 0 of layer "sequential_10" is incompatible with the layer: expected shape=(None, 128, 128, 3), found shape=(128, 128, 3)
This is because the first dimension of a layer's input is reserved for the batch size, the number of samples we want the model to work through at a time. For a great deep dive, read Difference Between a Batch and an Epoch.
The batch size defaults to None before fitting, as we can see in the model summary output, and our model expects us to set it elsewhere, depending on how we tune that hyperparameter. We can also force a batch size on our input layer by using batch_input_shape instead of input_shape, but that decreases our flexibility when testing different values.
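A quick way to watch the batch dimension appear is to compare element_spec before and after batching (reusing sample_dataset from the CNN example above):
print(sample_dataset.element_spec[0].shape)            # (64, 64, 3)
print(sample_dataset.batch(32).element_spec[0].shape)  # (None, 64, 64, 3)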
Type
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type)….
Finally, let's talk a little about some specific data types in Tensors.
The error above is another one that can be a little disconcerting if you are used to working in database systems with tables built from all kinds of data types, but it is one of the simplest to diagnose and solve, although there are a couple of common causes to keep in mind.
The main problem is that, although tensors support a variety of data types, when we convert a NumPy array to tensors (a common flow within deep learning), the data types must be floats. The script below initializes a contrived example of a DataFrame with None and with string data points. Let's look at some problems and solutions for this example:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
data = [
    [None, 0.2, '0.3'],
    [0.1, None, '0.3'],
    [0.1, 0.2, '0.3'],
]
X_train = pd.DataFrame(data=data, columns=["x1", "x2", "x3"])
y_train = pd.DataFrame(data=[1, 0, 1], columns=["y"])

# Create a TensorFlow dataset
train_dataset = tf.data.Dataset.from_tensor_slices((X_train.to_numpy(), y_train.to_numpy()))
# Define the model
model = Sequential()
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Fit the model using the TensorFlow dataset
model.fit(train_dataset.batch(3), epochs=3)
Running this code will tell us that:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float).
The most obvious problem is that we are sending a NumPy array that contains a non-float type: an object. If you have an actual column of categorical data, there are many ways to convert it to numeric data (one-hot encoding, etc.), but that's outside the scope of this discussion.
We can confirm this by running print(X_train.dtypes), which tells us what's in our DataFrame that TensorFlow doesn't like:
x1 float64
x2 float64
x3 object
dtype: object
If we encounter non-float data points, the following line will magically solve all our problems:
X_train = np.asarray(X_train).astype('float32')
Another thing to check is whether you have None or np.nan anywhere.
To find out, we can use a few lines of code like:
null_mask = X_train.isnull().any(axis=1)
null_rows = X_train[null_mask]
print(null_rows)
Which tells us that we have nulls in rows 0 and 1:
x1 x2 x3
0 NaN 0.2 0.3
1 0.1 NaN 0.3
If so, and that is expected/intentional, we should replace those values with an acceptable alternative. fillna can help us here:
X_train.fillna(value=0, inplace=True)
With these changes applied, our NumPy array will successfully convert to a tensor dataset and we will be able to train our model.
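Put together, a hedged version of the full fix for our toy DataFrame (assuming zero is an acceptable fill value) might look like:
# Fill the NaNs, then coerce every column (including the '0.3' strings) to float32
X_train = X_train.fillna(value=0).astype('float32')
y_train = y_train.astype('float32')

train_dataset = tf.data.Dataset.from_tensor_slices((X_train.to_numpy(), y_train.to_numpy()))
model.fit(train_dataset.batch(3), epochs=3)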
I often find that I learn more about a particular technology when I have to fix bugs, and I hope this was helpful to you as well.
If you have interesting tips and tricks or funny Tensorflow bugs, pass them on!