## The Silence of the Integers (Part 2)

In [119]:
import numpy as np
import scipy
from scipy import sparse

Version check, please

In [120]:
print(f'Numpy version={np.__version__}')
print(f'Scipy version={scipy.__version__}')

Numpy version=1.16.0
Scipy version=1.2.0


Let's define two numpy arrays. `y` contains the value 255, which is the maximum value for `np.uint8`.

In [121]:
x = np.array([2, 3, 4], dtype=np.uint8)
y = np.array([5, 6, 255], dtype=np.uint8)

Now, let's add `x` and `y`, all really simple.

In [122]:
x + y

array([7, 9, 3], dtype=uint8)

Whoops, the last element shouldn't really be 3, it should be 259 (4 + 255), right? The sum silently wrapped around modulo 256.
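To see where the 3 comes from, here is a minimal plain-Python sketch of what 8-bit unsigned addition does. It is only a model of the arithmetic, not numpy's actual implementation, and the names `MODULUS`, `exact` and `wrapped` are made up for illustration.

```python
# Plain-Python model of uint8 addition: results are kept modulo 2**8.
BITS = 8
MODULUS = 2 ** BITS  # 256 distinct values, 0..255

x = [2, 3, 4]
y = [5, 6, 255]

# The mathematically exact sums.
exact = [a + b for a, b in zip(x, y)]
# What survives once each sum is squeezed back into 8 bits.
wrapped = [s % MODULUS for s in exact]

print(exact)    # [7, 9, 259]
print(wrapped)  # [7, 9, 3]
```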

What if we define an invalid array to begin with?

In [123]:
z = np.array([3, 4, 666], dtype=np.uint8)
z

array([  3,   4, 154], dtype=uint8)

Doesn't look any better, really: 666 was silently reduced modulo 256 to 154.
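One way to avoid this is to check the range before casting. The sketch below does that with `np.iinfo`. `checked_array` is a hypothetical helper, not a numpy function, and it assumes the input values fit into `int64` and that the target dtype is narrower than 64 bits.

```python
import numpy as np

def checked_array(values, dtype):
    """Hypothetical helper: build an integer array, refusing out-of-range values."""
    info = np.iinfo(dtype)
    # Assumption: every value fits into int64, so nothing is lost at this step.
    wide = np.asarray(values, dtype=np.int64)
    if wide.size and (wide.min() < info.min or wide.max() > info.max):
        raise OverflowError(
            f"values outside [{info.min}, {info.max}] do not fit into {np.dtype(dtype)}"
        )
    # Safe now: every value is known to be representable in the target dtype.
    return wide.astype(dtype)

print(checked_array([3, 4, 255], np.uint8))

try:
    checked_array([3, 4, 666], np.uint8)
except OverflowError as err:
    print(f"refused: {err}")
```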

Maybe scipy is smarter?

In [124]:
zs = sparse.csr_matrix(z, dtype=np.uint16)

In [125]:
print(f'data={zs.data}; zs.dtype={zs.dtype}')

data=[  3   4 154]; zs.dtype=uint16


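No, it isn't. Though to be fair, `z` already held 154 by the time scipy saw it, so the wider `uint16` came too late to help.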

Let's just add two scalars. Surely nothing can go wrong here??!!

In [126]:
np.uint8(255) + np.uint8(2)

1

I'm sorry??!! 

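But wait, there is hope! Numpy is currently ignoring all overflows here (its stock default is to warn, but the previous settings echoed below show `'ignore'`), so let's just increase the error sensitivity a bit.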

In [127]:
np.seterr(all='raise')

{'divide': 'ignore', 'over': 'ignore', 'under': 'ignore', 'invalid': 'ignore'}

In [128]:
np.uint8(255) + np.uint8(2)

FloatingPointError: overflow encountered in ubyte_scalars

Ahhh, much better!
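If flipping a global switch feels heavy-handed, the same check can be scoped to a block with `np.errstate`. This is a small sketch of that alternative, not what the cells above do. The `raised` flag is only there for illustration.

```python
import numpy as np

raised = False
# Only the code inside the with-block runs with overflow errors enabled.
with np.errstate(over='raise'):
    try:
        np.uint8(255) + np.uint8(2)
    except FloatingPointError as err:
        raised = True
        print(f"caught: {err}")

print(raised)
```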

Let's try adding the two arrays again!

In [129]:
x + y

array([7, 9, 3], dtype=uint8)

I'm really sorry, but for integers `np.seterr` only catches overflow on scalars, not on arrays. Integer array additions keep wrapping silently regardless of the setting. How very useful indeed!
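Since numpy won't check integer arrays for us, we can do it by hand: add in a wider type, then verify that the result still fits. A minimal sketch follows. `checked_add` is a hypothetical helper, and it assumes both inputs share one integer dtype narrower than 64 bits.

```python
import numpy as np

def checked_add(a, b):
    """Hypothetical helper: add two integer arrays, raising instead of wrapping."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Assumption: inputs are narrower than 64 bits, so int64 holds the exact sums.
    wide = a.astype(np.int64) + b.astype(np.int64)
    info = np.iinfo(a.dtype)
    if wide.size and (wide.min() < info.min or wide.max() > info.max):
        raise OverflowError(f"sum does not fit into {a.dtype}")
    # Every sum is in range, so casting back loses nothing.
    return wide.astype(a.dtype)

x = np.array([2, 3, 4], dtype=np.uint8)
y = np.array([5, 6, 255], dtype=np.uint8)

try:
    checked_add(x, y)
except OverflowError as err:
    print(f"refused: {err}")

print(checked_add(x, np.array([5, 6, 250], dtype=np.uint8)))
```

The price is a temporary array eight times the size of the `uint8` inputs, which may matter for large data.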

What about pandas? Maybe pandas is a better citizen?

In [130]:
import pandas as pd
print(f'Pandas version={pd.__version__}')

Pandas version=0.23.1


In [131]:
df1 = pd.DataFrame(x)
df2 = pd.DataFrame(y)
df3 = df1.add(df2)
df3

   0
0  7
1  9
2  3


Hmm... not really. Apparently it just takes whatever `dtype` the arrays have and goes with it.

Any better luck when we specify the `dtype`?

In [132]:
df1 = pd.DataFrame(x, dtype=np.uint16)
df2 = pd.DataFrame(y, dtype=np.uint16)
df3 = df1.add(df2)
df3

     0
0    7
1    9
2  259


Ahh, finally a correct value somewhere!
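The value is correct because both frames were widened before the addition. The same effect can be had on existing frames with `astype`, as in this small sketch (it rebuilds the numpy `x` and `y` from earlier so it runs on its own):

```python
import numpy as np
import pandas as pd

x = np.array([2, 3, 4], dtype=np.uint8)
y = np.array([5, 6, 255], dtype=np.uint8)

# Widen first, add second; widening the result afterwards would be too late.
df1 = pd.DataFrame(x).astype(np.uint16)
df2 = pd.DataFrame(y).astype(np.uint16)
df3 = df1.add(df2)

print(df3[0].tolist())  # [7, 9, 259]
print(df3[0].dtype)     # uint16
```

Note that `uint16` merely moves the cliff to 65535, it does not remove it.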

Now what about PyTorch? Surely these clever Facebook people will not fall into the silent integer overflow trap!

In [133]:
import torch
print(f'pyTorch version={torch.__version__}')

pyTorch version=1.0.1.post2


In [134]:
x = torch.randint(low=136, high=255, size=(1, 3), dtype=torch.uint8)
y = torch.randint(low=136, high=255, size=(1, 3), dtype=torch.uint8)
print(f'x={x}; y={y}')

x=tensor([[214, 241, 245]], dtype=torch.uint8); y=tensor([[215, 167, 186]], dtype=torch.uint8)


In [135]:
x + y

tensor([[173, 152, 175]], dtype=torch.uint8)

Well, I'm sorry again, but they do. Maybe they've just invested all their money in privacy rather than engineering?

But how bad is it?

In [136]:
z = torch.randint(low=666, high=999, size=(1, 3), dtype=torch.uint8) 

In [137]:
z

tensor([[156, 225, 155]], dtype=torch.uint8)

Very bad.

Ok, if anyone can do stuff then it must be the Google people! 

In [138]:
import tensorflow as tf
print(f'{tf.__version__}')

1.12.0


In [139]:
x = tf.constant(value=np.array([1, 2, 255]), name='x', shape=(1, 3), dtype=tf.uint8)
y = tf.constant(value=np.array([3, 4, 2]), name='y', shape=(1, 3), dtype=tf.uint8)

In [140]:
session = tf.Session()
session.run(tf.global_variables_initializer())
print(session.run(tf.add(x, y)))

[[4 6 1]]


Despite all the clunkiness, it fails and fails and fails.

In [141]:
z = tf.constant(value=np.array([1000, 2000, 2550]), name='z', shape=(1, 3), dtype=tf.uint8)

In [142]:
print(session.run(z))

[[232 208 246]]


And fails and fails and fails.
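Since none of the libraries will stop us, one defensive habit is to let numpy pick a dtype that is wide enough before the data goes anywhere else. Here is a hedged sketch using `np.min_scalar_type` and `np.promote_types`. `smallest_int_dtype` is a hypothetical helper and assumes a non-empty sequence of plain Python ints.

```python
import numpy as np

def smallest_int_dtype(values):
    """Hypothetical helper: narrowest integer dtype that holds every value."""
    # Assumption: `values` is a non-empty sequence of plain Python ints.
    lo = np.min_scalar_type(min(values))
    hi = np.min_scalar_type(max(values))
    # Promote so that one dtype covers both ends of the range.
    return np.promote_types(lo, hi)

values = [1000, 2000, 2550]
dtype = smallest_int_dtype(values)
z = np.array(values, dtype=dtype)

print(dtype)       # uint16
print(z.tolist())  # [1000, 2000, 2550]
```

This only protects the stored values. Sums of them can still overflow the chosen dtype.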