Categorical variables and twinx make basically no sense

AJ M
Example code is pasted below. Basically, just wanted to say that this is either completely non-intuitive, or I am failing to understand something fundamental about matplotlib.

My expectation was that twinx would let me plot categorical values at the positions they already have on the original axis (i.e., using the original axis's order and coordinates). Instead, if you use categorical values, the twinned axes squeezes them into the length of its own categorical list/array. In the example code, if you comment out the x1/x2 categorical assignments and uncomment the integer ones, it works as you'd expect. You can even swap the order in which the integer series are plotted, but NOT if you use the text assignments, despite the fact that the integer and text axes are meant to be the same.


Anyway, someone please let me know if there is some design principle I'm missing here, or if this is a special case. I couldn't find it in any of the top-level documentation (see https://matplotlib.org/gallery/lines_bars_and_markers/categorical_variables.html and https://matplotlib.org/examples/api/two_scales.html), and it took me the better part of this afternoon to figure out. The only conclusion I can come to is that matplotlib treats these values differently, and converts the text arrays to integers under the hood without trying to align them.

Cheers,

AJ

Code:

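```python
import matplotlib.pyplot as plt

# Categorical x values; swap in the commented integer versions to compare.
x1 = ['apples', 'bananas', 'cheerios']
# x1 = [1, 3, 5]
y1 = [5, 6, 15]

x2 = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']
# x2 = [1, 2, 3, 4, 5]
y2 = [100, 200, 300, 400, 500]

fig, ax = plt.subplots()

# Plot the full category list on the original axes first.
ax.scatter(x2, y2)
# Twin the axes and plot the shorter category list against a second y-axis.
ax2 = ax.twinx()
ax2.scatter(x1, y1, color="orange")

plt.show()
```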

_______________________________________________
Matplotlib-users mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-users
Re: Categorical variables and twinx make basically no sense

Jody Klymak


On Mar 13, 2019, at  21:40 PM, AJ M <[hidden email]> wrote:

The only conclusion I can come to is that matplotlib treats these values differently, and converts the text arrays to integers under the hood without trying to align them.

That's basically correct, to my understanding. The twin axes is a new axes that shares its x-limits with the old one, but we don’t have any way of passing “categories” from one axes to the next, so it carries its own category->integer conversion, which gets made anew when you call scatter. The first axes is the one that gets the tick labels.
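To picture the mismatch, here is a rough plain-Python model of a per-axes category->integer conversion. It is only a sketch of the idea described above, not Matplotlib's actual internals, and `build_category_map` is a hypothetical helper made up for illustration. It assumes each axes numbers its categories in order of first appearance; the exact behaviour may differ between Matplotlib versions.

```python
def build_category_map(labels):
    """Map each distinct label to an integer, in order of first appearance."""
    mapping = {}
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)
    return mapping


x1 = ['apples', 'bananas', 'cheerios']
x2 = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']

# Each axes builds its own mapping from the data it was given.
map_first = build_category_map(x2)  # mapping the first axes would hold
map_twin = build_category_map(x1)   # mapping the twin axes would hold

# Where the x1 points "should" go versus where an independent mapping puts them.
positions_first = [map_first[c] for c in x1]
positions_twin = [map_twin[c] for c in x1]
print(positions_first, positions_twin)  # [0, 2, 4] [0, 1, 2]
```

Under this model 'bananas' sits at position 2 on the first axes but at position 1 on the twin, which matches the misalignment in the original example.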

You *may* be able to pass the converter to the second axes, but I’m not sure.  

Easier would be to just do the following:

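```python
import matplotlib.pyplot as plt
import numpy as np

# Pad y1 with NaN so it lines up with the full category list;
# np.nan is used because the np.NaN alias was removed in NumPy 2.0.
y1 = [5, np.nan, 6, np.nan, 15]
y2 = [100, 200, 300, 400, 500]

# One shared category list for both axes.
x = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']

fig, ax = plt.subplots()
ax.scatter(x, y2)
ax2 = ax.twinx()
# NaN entries are simply not drawn.
ax2.scatter(x, y1, color="orange")
plt.show()
```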
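If the shorter series comes as separate category and value lists (as in the original post), the NaN padding can be generated rather than typed by hand. The following is a minimal sketch; `pad_to_categories` is a hypothetical helper name, and it assumes every category in the short list also appears in the full list.

```python
import math


def pad_to_categories(all_categories, categories, values):
    """Return values aligned to all_categories, with NaN where no value exists."""
    lookup = dict(zip(categories, values))
    return [lookup.get(c, math.nan) for c in all_categories]


x = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']
x1 = ['apples', 'bananas', 'cheerios']
y1 = [5, 6, 15]

y1_padded = pad_to_categories(x, x1, y1)
print(y1_padded)  # [5, nan, 6, nan, 15]
```

The padded list can then be passed to `ax2.scatter(x, y1_padded, ...)` in place of the hand-written `y1`.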

Cheers, Jody

Re: Categorical variables and twinx make basically no sense

AJ M
Great, thanks for the reply. I tried the NaN solution during my tinkering, but thought I should be able to get away without recreating the axis at the same dimensions, since that isn't required when using integers. Just making sure I wasn't missing anything more obvious.
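For reference, the integer route can keep the text labels if the category->position lookup is built once and the tick labels are set by hand. This is only a sketch of that idea; the `position` dictionary is my own illustration (not something Matplotlib provides), and the Agg backend line is there just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

x1 = ['apples', 'bananas', 'cheerios']
y1 = [5, 6, 15]
x2 = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']
y2 = [100, 200, 300, 400, 500]

# One shared lookup, built from the full category list.
position = {name: i for i, name in enumerate(x2)}
pos1 = [position[name] for name in x1]
pos2 = [position[name] for name in x2]

fig, ax = plt.subplots()
ax.scatter(pos2, y2)
# Label the integer positions with the category names.
ax.set_xticks(pos2)
ax.set_xticklabels(x2)

# The twin axes receives plain integers, so no category conversion is involved.
ax2 = ax.twinx()
ax2.scatter(pos1, y1, color="orange")

fig.canvas.draw()
```

Because both scatter calls receive integers from the same lookup, the order in which they are plotted no longer matters.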

On Thu, Mar 14, 2019 at 12:29 AM Jody Klymak <[hidden email]> wrote:

