[Matplotlib-devel] Units discussion...


[Matplotlib-devel] Units discussion...

Jody Klymak
Hi all,

To carry on the Gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation...

In response to @anntzer, I thought about the units support a bit. It seems that, rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would hand data back to the user as unit-aware data, though not necessarily in the unit that the user supplied: e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply.
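A minimal sketch of the invertible float mapping described above. The `Length` class, `to_float`, and `from_float` are hypothetical names invented for illustration; they are not Matplotlib API, and the choice of meters as the internal convention is only the example from the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Length:
    """Hypothetical unit-aware value; not part of Matplotlib."""
    value: float
    unit: str  # "m" or "in"


METERS_PER_UNIT = {"m": 1.0, "in": 0.0254}


def to_float(length: Length) -> float:
    """Map a Length to a plain float, always expressed in meters."""
    return length.value * METERS_PER_UNIT[length.unit]


def from_float(number: float) -> Length:
    """Inverse mapping: returns meters, whatever unit was originally supplied."""
    return Length(number, "m")


eight_inches = Length(8.0, "in")
as_float = to_float(eight_inches)   # about 0.2032 (meters)
round_trip = from_float(as_float)   # a Length in meters, not inches
```

The inverse does not remember that the user supplied inches; it only guarantees that converting its result again yields the same float.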

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
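As a rough illustration of display units living in the formatter, the function below labels values stored in meters as inches. The name `inches_label` is hypothetical; the `(value, pos)` signature is chosen because that is the shape `matplotlib.ticker.FuncFormatter` expects, but the sketch itself does not import Matplotlib.

```python
METERS_PER_INCH = 0.0254


def inches_label(value_m: float, pos=None) -> str:
    """Format an axis value stored in meters as a tick label in inches."""
    return f"{value_m / METERS_PER_INCH:.1f} in"


# The stored data stay in meters; only the labels change.
tick_positions_m = [0.0, 0.0254, 0.2032]
labels = [inches_label(v) for v in tick_positions_m]
```

Switching the plot to centimeters would mean swapping the formatter, not reconverting the data.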

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
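One way to read the "one converter per axis" rule is sketched below. `AxisSketch`, `Inches`, `length_converter`, and `REGISTRY` are all hypothetical stand-ins, and picking the converter from the type of the first value is an assumption made for the example.

```python
class Inches:
    """Hypothetical unit type used only for this sketch."""
    def __init__(self, value):
        self.value = value


def length_converter(x):
    """Convert Inches to meter floats; pass bare numbers through unchanged."""
    if isinstance(x, Inches):
        return x.value * 0.0254
    return float(x)


# Maps a data type to the converter that handles it.
REGISTRY = {Inches: length_converter}


class AxisSketch:
    def __init__(self):
        self.converter = None
        self.values = []

    def update(self, data):
        if self.converter is None:
            # The first call to the axis decides the converter.
            self.converter = REGISTRY.get(type(data[0]), float)
        # Every later call, bare floats included, goes through that converter.
        self.values.extend(self.converter(x) for x in data)

    def clear_converter(self):
        """Explicit reset, for users who know what they are doing."""
        self.converter = None


axis = AxisSketch()
axis.update([Inches(1), Inches(2)])  # latches length_converter
axis.update([0.5])                   # bare float, passed through by the converter
```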

Whats missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.

Cheers,   Jody



 
_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel

Re: Units discussion...

Drain, Theodore R (392P)
We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?
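For readers who have not used the current API: as far as I understand it, `matplotlib.units.ConversionInterface` defines `convert`, `axisinfo`, and `default_units`, and `axisinfo` returns a `units.AxisInfo` carrying locators, formatters, and a label. The stand-in below mirrors only that shape with stdlib classes; `AxisInfoSketch`, `LengthConverterSketch`, and the `(magnitude, unit)` tuple representation are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

METERS_PER_UNIT = {"m": 1.0, "in": 0.0254}


@dataclass
class AxisInfoSketch:
    """Stand-in for the information a converter hands back to an axis."""
    majfmt: Optional[Callable[[float], str]] = None
    label: Optional[str] = None


class LengthConverterSketch:
    """Values are (magnitude, unit_name) tuples in this sketch."""

    @staticmethod
    def default_units(value, axis=None):
        # Use the unit of the data when the axis has none set.
        return value[1]

    @staticmethod
    def convert(value, unit, axis=None):
        # Convert the value into a float expressed in the axis unit.
        magnitude, source_unit = value
        return magnitude * METERS_PER_UNIT[source_unit] / METERS_PER_UNIT[unit]

    @staticmethod
    def axisinfo(unit, axis=None):
        # The converter, not the axis, decides how ticks and labels look.
        return AxisInfoSketch(
            majfmt=lambda x: f"{x:g} {unit}",
            label=f"length [{unit}]",
        )


info = LengthConverterSketch.axisinfo("in")
converted = LengthConverterSketch.convert((0.0254, "m"), "in")
```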

The current unit API is pretty simple and lives in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc.  So having a good object-oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be "convert once at construction and never touch units again".   We are constantly adjusting limits, moving artists, etc. in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to spell out explicitly.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it clearer what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=[hidden email]> on behalf of Jody Klymak <[hidden email]>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...


Re: Units discussion...

Jody Klymak
Dear Ted,

Thanks so much for engaging on this.  

Don’t worry, nothing at all is changing w/o substantial back and forth, and an OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some cleanup and better documentation.

FWIW, I’ve not personally done much programming w/ units; I’ve just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them every day will be absolutely key.

Cheers,   Jody

> On Feb 6, 2018, at  14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

Re: Units discussion...

dstansby
Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:
  • A mapping from your unit objects to floating point numbers
  • A mapping from those floats back to your unit objects

As far as I know, the second function is new and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible, when the user requests things such as the axis limits.
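The two-function idea can be sketched with naive `datetime` objects standing in for the user's unit objects, in the spirit of `date2num`. `AxisLimitsSketch`, `to_float`, and `from_float` are hypothetical names, and days since 1970-01-01 is just a convention picked for the example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def to_float(moment: datetime) -> float:
    """Function 1: unit object -> float (days since EPOCH)."""
    return (moment - EPOCH).total_seconds() / 86400.0


def from_float(days: float) -> datetime:
    """Function 2: float -> unit object."""
    return EPOCH + timedelta(days=days)


class AxisLimitsSketch:
    def __init__(self, to_float, from_float):
        self._to_float = to_float
        self._from_float = from_float
        self._limits = (0.0, 1.0)  # stored as plain floats only

    def set_lim(self, low, high):
        # Convert as soon as the values enter the axis.
        self._limits = (self._to_float(low), self._to_float(high))

    def get_lim(self):
        # Convert back as late as possible, when the user asks.
        return tuple(self._from_float(v) for v in self._limits)


axis = AxisLimitsSketch(to_float, from_float)
axis.set_lim(datetime(2018, 2, 3), datetime(2018, 2, 7))
low, high = axis.get_lim()
```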

Also worth noting that any major change like this will go into Matplotlib 3.0 at the earliest, so it will be Python 3 only.

David


On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:

Re: Units discussion...

Hannah
I think what's also being proposed, and I think Ted also suggested, is an API audit to figure out how units are/are not being implemented in each function. Potentially we could even try to smooth out inconsistencies (like between plot and scatter).
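One possible shape for such an audit, sketched under the assumption that methods are tagged with a declared unit policy. The `unit_policy` decorator, `ArtistSketch`, and `audit` are hypothetical and do not exist in Matplotlib; the point is only that untagged public methods can be listed mechanically.

```python
import inspect


def unit_policy(accepts: bool, returns: bool):
    """Tag a method with its declared unit behaviour."""
    def tag(func):
        func.unit_policy = {"accepts_units": accepts, "returns_units": returns}
        return func
    return tag


class ArtistSketch:
    @unit_policy(accepts=True, returns=True)
    def set_position(self, xy):
        ...

    @unit_policy(accepts=False, returns=False)
    def set_offset(self, xy):
        ...

    def get_position(self):  # not yet audited
        ...


def audit(cls):
    """Return {public method name: declared policy, or None if undeclared}."""
    report = {}
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue
        report[name] = getattr(member, "unit_policy", None)
    return report


report = audit(ArtistSketch)
untagged = sorted(name for name, policy in report.items() if policy is None)
```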

On Feb 7, 2018 6:43 AM, "David Stansby" <[hidden email]> wrote:

Re: Units discussion...

Drain, Theodore R (392P)
In reply to this post by dstansby
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.    

Using floats/numpy arrays internally is going to be easier and faster, so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy, like the Axes/Axis API.  But we also use low-level APIs like the patches.  Are those unitized?  This is the pro and con of using something like Python, where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit API and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

  *   A mapping from your unit objects to floating point numbers
  *   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]<mailto:[hidden email]>> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

> On Feb 6, 2018, at  14:17 PM, Drain, Theodore R (392P) <[hidden email]<mailto:[hidden email]>> wrote:
>
> We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.
>
> I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?
>
> The current unit api is pretty simple and in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).
>
> Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.
>
> So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).
>
> I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.
>
> Ted
> ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.
>
_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel

Re: Units discussion...

Antony Lee
I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever-moving target).  In particular, tests on unit support ("unit unit tests"? :-)) currently rely only on the old JPL unit code that ended up integrated into Matplotlib's test suite, and do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From Ted's email it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to work completely in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so that the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however, Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).
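
As a rough illustration of the affine case, the sketch below labels ticks in inches while the data stay as metre floats. The helper `make_tick_formatter` and its `scale`/`offset`/`suffix` parameters are hypothetical; the only thing borrowed from Matplotlib is the `(x, pos)` call signature that `matplotlib.ticker.FuncFormatter` expects of its function.

```python
METERS_PER_INCH = 0.0254


def make_tick_formatter(scale, offset=0.0, suffix=""):
    """Return f(x, pos) that labels an internal float x as (x - offset) / scale."""
    def format_tick(x, pos=None):
        return f"{(x - offset) / scale:g}{suffix}"
    return format_tick


# Internal floats are metres; only the tick labels are shown in inches.
inch_labels = make_tick_formatter(METERS_PER_INCH, suffix=" in")
print(inch_labels(0.0254), inch_labels(0.0508))
```

A non-affine link (such as the relativistic time frames Ted mentioned) would not fit this pattern, since a single scale and offset could no longer map the floats to labels.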

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony


Re: Units discussion...

Jody Klymak
Thanks Antony, 



On Feb 8, 2018, at  8:09 AM, Antony Lee <[hidden email]> wrote:


> From Ted's email it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to work completely in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so that the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however, Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

… or an indication that the astropy (for instance) use-case is good enough to base an API around.  

> As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

I was going to suggest that distinction as well.  Anything that requires `axes.add_artist` is deunitized since we use those artists all over the place internally and keeping track of whether we have units or not would be really hard.  

Cheers,   Jody


Re: Units discussion...

Drain, Theodore R (392P)
In reply to this post by Antony Lee
Sorry - that's not what I meant.  The unit conversion API that's in place works fine.  I can't think of a better way to describe the use cases than the basic ones that seem (at least to me) to be obvious.  Numbers with units (5*km) and time classes (datetime or some other time class like the ones we use) are the primary use case.   Another way to say it is that users have data whose normal representation is not float, and they want to plot it, control how the transformation to float is done (plot in km or miles, in UTC or GPS time), and manipulate the plot after it's plotted (get bounds, change bounds, change units, move artists, edit data, etc.) in the non-float representation that their data is already in.

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the thousands of lines of code our users have written) if that would help, but anything that's part of the public API (axes, artists, patches, etc.) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is to create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary APIs for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.
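
A toy sketch of the kind of artist-level behaviour I mean is below. Nothing here is Matplotlib code: `KmConverter` and `UnitizedLine` are hypothetical stand-ins, quantities are modelled as plain `(value, unit)` tuples, and km is assumed as the internal float convention.

```python
KM_PER_MILE = 1.609344


class KmConverter:
    """Hypothetical converter: (value, unit) tuples <-> floats in km."""
    factors = {"km": 1.0, "mi": KM_PER_MILE}

    def to_float(self, q):
        value, unit = q
        return value * self.factors[unit]

    def from_float(self, x, unit="km"):
        return (x / self.factors[unit], unit)


class UnitizedLine:
    """Stand-in for an artist that stores floats but talks units at its API."""

    def __init__(self, converter):
        self._converter = converter
        self._xdata = []  # plain floats, in km

    def set_xdata(self, quantities):
        # Convert on the way in, so drawing code only ever sees floats.
        self._xdata = [self._converter.to_float(q) for q in quantities]

    def get_xdata(self, unit="km"):
        # Convert on the way out, in whatever unit the caller asks for.
        return [self._converter.from_float(x, unit) for x in self._xdata]


line = UnitizedLine(KmConverter())
line.set_xdata([(1, "mi"), (5, "km")])
line.set_xdata([(10, "km")])  # edit after creation, still in unitized space
print(line.get_xdata("mi"))
```

The internal storage stays float-only, so this is compatible with the "floats internally" idea; the difference is only in where the unitized boundary is drawn.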

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.
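
As a sketch of what such a test could look like, the same function is run below with bare floats, unitized inputs, and a mix of both. The function `span`, the converter `km_to_float` and the `(value, unit)` tuple representation are all invented for the example; in a pytest-based suite the case table would more naturally become a `pytest.mark.parametrize` decorator.

```python
def span(values, to_float=float):
    """Function under test: width of the data range, computed on floats."""
    floats = [to_float(v) for v in values]
    return max(floats) - min(floats)


def km_to_float(q):
    """Hypothetical converter for (value, unit) tuples; bare numbers are km."""
    if isinstance(q, tuple):
        value, unit = q
        return value * {"km": 1.0, "m": 0.001}[unit]
    return float(q)


# One table of cases covering non-unit, unit, and mixed inputs.
CASES = [
    ("bare floats", [1.0, 3.0], float),
    ("unitized", [(1, "km"), (3000, "m")], km_to_float),
    ("mixed", [1.0, (3000, "m")], km_to_float),
]


def run_cases():
    for name, values, conv in CASES:
        result = span(values, conv)
        assert abs(result - 2.0) < 1e-9, name
    return len(CASES)


print(run_cases(), "cases passed")
```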

Ted

________________________________________
From: [hidden email] <[hidden email]> on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target).  In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level API's like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

  *   A mapping from your unit objects to floating point numbers
  *   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
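
A minimal sketch of the two functions described above, in plain Python.  The class and method names (`Length`, `LengthConverter`, `to_float`, `from_float`) are invented for illustration and are not part of the existing `units.ConversionInterface`; the choice of metres as the internal float convention is also just an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Length:
    """Hypothetical toy quantity; stands in for a pint/astropy/JPL unit object."""
    value: float
    unit: str  # "m" or "in"


_TO_METERS = {"m": 1.0, "in": 0.0254}


class LengthConverter:
    """Toy two-way converter: unit objects <-> floats (always metres)."""

    def to_float(self, q):
        # Function 1: unit object -> float; bare numbers pass through as metres.
        if isinstance(q, Length):
            return q.value * _TO_METERS[q.unit]
        return float(q)

    def from_float(self, x):
        # Function 2 (the new one): float -> unit object, in the converter's
        # own convention (metres), not necessarily the unit the user supplied.
        return Length(x, "m")


conv = LengthConverter()
x = conv.to_float(Length(8, "in"))   # convert as soon as data enters
back = conv.from_float(x)            # convert as late as possible, on request
```

Note that the round trip gives back metres rather than inches, which matches the "8*in comes back as 0.203*m" behaviour described in the original proposal.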

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

> On Feb 6, 2018, at  14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:
>
> We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.
>
> I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?
>
> The current unit api is pretty simple and in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).
>
> Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.
>
> So in addition to the ConversionInterface API, I think there are other items that would be useful to have explicitly spelled out.  Things like which API's in MPL should accept units and which won't and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).
>
> I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.
>
> Ted
> ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.
>
> ________________________________________
> From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=[hidden email]> on behalf of Jody Klymak <[hidden email]>
> Sent: Saturday, February 3, 2018 9:25 PM
> To: matplotlib development list
> Subject: [Matplotlib-devel] Units discussion...
>
> Hi all,
>
> To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation…
>
> In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in, and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).
>
> User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with ticks locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates), the downstream-supplied tick formatter and labeller would do it.
>
> Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
>
> What's missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.
>
> Cheers,   Jody
>
>
>
>
> _______________________________________________
> Matplotlib-devel mailing list
> [hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>
> https://mail.python.org/mailman/listinfo/matplotlib-devel

Re: Units discussion...

Antony Lee
The problem here (as you mentioned) is that essentially close to everything is a public API in Matplotlib, and I believe quite strongly that it is unreasonable to make every function check for unitized data (and what about attributes? are bboxes and transforms supposed to handle units too?).  For example, this leads to Line2D.get_data having the orig=True/False kwarg, and the class needs to internally keep both unitized and deunitized data around; duplicating this support throughout all artists would be a *lot* of code.

While Axes methods can reasonably support units out of the box, I think it is more reasonable to have (up to bikeshedding) `Axes.unitize` and `Axes.deunitize` and then have people who need to play with the artists themselves do e.g. `artist.set_data(artist.axes.deunitize(unitized_data))` and `artist.axes.unitize(artist.get_data())`.  Yes, I realize this may be more work for you, but it's also a tradeoff of less work for us :-)  With this design, another possibility (which I guess Tom is not going to like, but I actually think is reasonable) would be for you to patch all the Artist classes yourself to support unitized data in all methods you want (using the proper wrapper methods).
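
For concreteness, the round trip proposed above might look roughly like this.  `unitize` and `deunitize` do not exist on Matplotlib's Axes today; `ToyAxes`, `ToyLine` and `Km` below are hypothetical stand-ins written in plain Python so the sketch runs on its own.

```python
class Km:
    """Hypothetical unit object: a distance in kilometres."""
    def __init__(self, value):
        self.value = value


class ToyAxes:
    """Stand-in for Axes; only the two proposed helpers are modelled."""

    def deunitize(self, data):
        # Unit objects -> plain floats (metres, by this toy's convention).
        return [d.value * 1000.0 if isinstance(d, Km) else float(d) for d in data]

    def unitize(self, data):
        # Plain floats -> unit objects.
        return [Km(x / 1000.0) for x in data]


class ToyLine:
    """Stand-in for an artist that only ever stores floats."""

    def __init__(self, axes):
        self.axes = axes
        self._y = []

    def set_data(self, y):
        self._y = list(y)

    def get_data(self):
        return list(self._y)


ax = ToyAxes()
line = ToyLine(ax)

# Going in: the caller deunitizes explicitly before touching the artist.
line.set_data(line.axes.deunitize([Km(1), Km(2.5)]))
floats = line.get_data()

# Coming out: the caller unitizes explicitly when it wants unit objects back.
roundtrip = line.axes.unitize(line.get_data())
```

The artist never sees a unit object, so it needs no unit-handling code of its own; the cost is that every caller working at the artist level has to do the conversion by hand.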

The "moving target" part is basically that there has never been complete support for units everywhere in the code base, and because things are added in a piecemeal fashion rather than with a well thought-out design, I'm a bit tired of the constant stream of "oh, this function doesn't support datetimes, we need to fix it".  Again, I believe rethinking the design in a comprehensive fashion would help with that.

Antony

2018-02-08 18:13 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
Sorry - that's not what I meant.  The unit conversions API that's in place works fine.  I can't think of a better way to describe the use cases than the basic ones that seem (at least to me) to be obvious.  Numbers with units (5*km) and time classes (datetime or some other time class like we use) are the primary use case.   Another way to say it is that users have data where the normal representation is not float and they want to plot it, control how the transformation to float is done (plot in km or miles, in UTC or GPS time) and manipulate the plot after it's plotted (get bounds, change bounds, change units, move artists, edit data, etc) in the non-float representation that their data is already in.

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted


Re: Units discussion...

Jody Klymak
In reply to this post by Drain, Theodore R (392P)

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data. 

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
    a) in matplotlib modules
    b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That's maybe fine, but as a new developer, I found the units support woefully under-documented.  The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without something considering units which then triggers a bug reports which require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.  

Ted

________________________________________
From: [hidden email] <[hidden email]> on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low-level APIs like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit API and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

 *   A mapping from your unit objects to floating point numbers
 *   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
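The two functions described above could look roughly like the following. This is only a sketch under stated assumptions: the `Length` class, the conversion table, and both function names are hypothetical, and the choice of meters as the internal float is arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Length:
    """Toy unit object (hypothetical): a magnitude plus a unit name."""
    magnitude: float
    unit: str


# Conversion factors to the assumed internal representation (meters).
_TO_METERS = {"m": 1.0, "cm": 0.01, "in": 0.0254}


def length_to_float(obj):
    """Function 1: unit object -> float (always meters)."""
    return obj.magnitude * _TO_METERS[obj.unit]


def float_to_length(value):
    """Function 2: float -> unit object, in the unit the converter prefers."""
    return Length(value, "m")


internal = length_to_float(Length(8, "in"))  # what the plotting code would see
returned = float_to_length(internal)         # what the user gets back on request
```

Note that the round trip hands back meters rather than the inches that were supplied, matching the "8*in comes back as roughly 0.203*m" convention from the original proposal.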

Also worth noting that any major change like this will go into Matplotlib 3.0 at the earliest, so will be Python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them every day will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit API is pretty simple and lives in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).
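For readers who have not looked at that interface, here is a rough stand-in for its shape. The three method names (`convert`, `axisinfo`, `default_units`) follow `matplotlib.units.ConversionInterface` as I understand it; everything else (`Meters`, `MetersConverter`, `AxisInfoSketch`) is hypothetical, and the sketch deliberately does not import Matplotlib.

```python
from collections import namedtuple

# Stand-in for units.AxisInfo; the real class also carries locators/formatters.
AxisInfoSketch = namedtuple("AxisInfoSketch", ["label"])


class Meters:
    """Toy unitized value (hypothetical)."""
    def __init__(self, value):
        self.value = value


class MetersConverter:
    """Mirrors the three converter hooks without importing Matplotlib."""

    @staticmethod
    def convert(obj, unit, axis):
        # Unit objects become floats; anything else passes through unchanged.
        if isinstance(obj, Meters):
            return float(obj.value)
        if isinstance(obj, (list, tuple)):
            return [MetersConverter.convert(o, unit, axis) for o in obj]
        return obj

    @staticmethod
    def axisinfo(unit, axis):
        # This is where a converter would choose tick locators and formatters.
        return AxisInfoSketch(label=f"length [{unit}]")

    @staticmethod
    def default_units(x, axis):
        # Unit to assume when the caller does not name one.
        return "m"


floats = MetersConverter.convert([Meters(1), Meters(2.5), 4.0], "m", axis=None)
info = MetersConverter.axisinfo("m", axis=None)
```

With Matplotlib installed, a real converter would subclass `matplotlib.units.ConversionInterface` and be registered with `matplotlib.units.registry[Meters] = MetersConverter()`. Note that this interface only goes from unit objects to floats, which is why the reverse mapping is being discussed as new.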

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel
Re: Units discussion...

Drain, Theodore R (392P)
Sorry if it came across that way - I wasn't trying to say that it doesn't need work.  I completely agree with you on 1) and 2).

As far as what to do about units in the draw stage - I'm not saying anything like that (though that might be the result).  I'm saying that to support units, we should design the APIs to support units and be explicit about which APIs support units and which don't (new doc tag maybe?).  I'm not making any statements about the underlying effect of that statement on the code.  There are probably a number of designs that meet that API goal but I don't know enough of the MPL internals to advocate for one over the other.

I think we can help with building a better toy unit system.  Or we can standardize on datetime and some existing unit package.  Whatever makes it easier for people to write test cases.

 ________________________________________
From: Jody Klymak <[hidden email]>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
        a) in matplotlib modules
        b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That’s maybe fine, but as a new developer, I found the units support woefully under-documented.  The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.
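To make the "test both non-unit and unit inputs" idea concrete, here is a minimal sketch of what such a paired test could look like. The `Seconds` class, `deunitize`, and `span_width` are hypothetical stand-ins, not Matplotlib code; the point is only the pattern of running the same call with and without units.

```python
import unittest


class Seconds:
    """Toy unitized value (hypothetical)."""
    def __init__(self, value):
        self.value = value


def deunitize(x):
    """Hypothetical helper: strip units on entry, pass bare numbers through."""
    return float(x.value) if isinstance(x, Seconds) else float(x)


def span_width(start, stop):
    """Stand-in for library code under test; works on floats internally."""
    return deunitize(stop) - deunitize(start)


class TestSpanWidth(unittest.TestCase):
    def test_bare_floats(self):
        self.assertEqual(span_width(1.0, 3.5), 2.5)

    def test_unitized(self):
        self.assertEqual(span_width(Seconds(1.0), Seconds(3.5)), 2.5)

    def test_agree(self):
        # The same call with and without units must give the same result.
        self.assertEqual(span_width(1.0, 3.5),
                         span_width(Seconds(1.0), Seconds(3.5)))


result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestSpanWidth)
)
```

A developer who does not use units day to day could copy this pattern with whatever toy unit class the project settles on.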

Ted

________________________________________
From: [hidden email] on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

Re: Units discussion...

dstansby
In reply to this post by Jody Klymak
*puts hand up* I'm (sort of...) a Matplotlib developer and use (am starting to use) units in my day to day research.

My proposal remains this:
  • The user provides a method for converting objects of their custom class to floating point numbers
  • The user provides a method for converting floating point numbers to their custom class
  • Everything in Matplotlib that accepts data accepts custom type objects
  • The first thing it does is convert these objects to floats, and then everything we do internally is with those floats

Can anyone point out reasons that this isn't the right way to do it? This has the advantages:

  • Matplotlib takes no responsibility for the conversion
  • We only ever calculate with floats
  • You can use whatever objects you want, as long as your converter goes object --> float

Having thought about it a little, this seems like the obvious way to do it, but I may be missing something.
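A rough sketch of that proposal applied to an artist-like object, so that data can still be set and read back with units after construction. All names here (`Km`, `LineSketch`, the two converter functions) are hypothetical and only illustrate the convert-on-entry, convert-back-on-request pattern.

```python
class Km:
    """Toy custom type (hypothetical): a distance in kilometres."""
    def __init__(self, value):
        self.value = value


def km_to_float(obj):
    """User-supplied: custom object -> float (bare numbers pass through)."""
    return float(obj.value) if isinstance(obj, Km) else float(obj)


def float_to_km(value):
    """User-supplied: float -> custom object."""
    return Km(value)


class LineSketch:
    """Stand-in for an artist that stores only floats internally."""

    def __init__(self, to_float, from_float):
        self._to_float = to_float
        self._from_float = from_float
        self._xdata = []

    def set_xdata(self, xs):
        # Convert as soon as data enters; nothing unitized is kept.
        self._xdata = [self._to_float(x) for x in xs]

    def get_xdata(self):
        # Convert back as late as possible, only when the user asks.
        return [self._from_float(x) for x in self._xdata]


line = LineSketch(km_to_float, float_to_km)
line.set_xdata([Km(1), Km(2), 3.5])  # mixed custom objects and bare floats
```

Everything between `set_xdata` and `get_xdata` sees plain floats, while the caller never has to leave unitized space.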

David


On 8 February 2018 at 17:39, Jody Klymak <[hidden email]> wrote:

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data. 

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).  

2) write a developer’s guide explaining how units should be/are implemented 
a) in matplotlib modules
        b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  Thats maybe fine, but as a new developer, I found the units support woefully under-documented.  The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.  

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without something considering units which then triggers a bug reports which require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.  

Ted

________________________________________
From: [hidden email] <[hidden email]> on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]<[hidden email]>>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to easier and faster so I think that's a plus.  The biggest issue we're going to run in to is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level API's like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

 *   A mapping from your unit objects to floating point numbers
 *   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible, when the user requests things such as the axis limits.
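
A minimal sketch of the two mappings in plain Python, for illustration only. The `Length` class, the choice of metres as the internal float representation, and the names `to_float` and `from_float` are all hypothetical; none of them is part of Matplotlib's current unit API.

```python
from dataclasses import dataclass

# Hypothetical conversion table: how many metres one of each unit is.
_METRES_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    """Hypothetical user-side unit object: a value plus a unit name."""
    value: float
    unit: str


def to_float(x):
    """Mapping 1: unit object -> float (this converter always uses metres)."""
    return x.value * _METRES_PER_UNIT[x.unit]


def from_float(f):
    """Mapping 2: float -> unit object, in the converter's own convention."""
    return Length(f, "m")


internal = to_float(Length(8, "in"))   # what the plotting code would see
roundtrip = from_float(internal)       # what the user would get back
print(internal, roundtrip)
```

Note that the round trip hands back metres rather than the inches the user supplied, which matches the behaviour described in the original proposal.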

Also worth noting that any major change like this will go into Matplotlib 3.0 at the earliest, so will be Python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to spell out explicitly.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <[hidden email]> on behalf of Jody Klymak <[hidden email]>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation….

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
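
As a rough illustration of unit control living in the tick labelling rather than in the data: assume (for this sketch only) that the converter stores metres as floats. A label function can then present the same positions in inches or centimetres without touching the stored values. `make_label_func` is a hypothetical helper; in Matplotlib it could be wrapped in `matplotlib.ticker.FuncFormatter`, which calls its function with `(value, position)`.

```python
def make_label_func(metres_per_unit, suffix):
    """Return a tick-label function for positions stored as metre floats."""
    def label(value, pos=None):
        # Only the label is rescaled; the underlying float is unchanged.
        return f"{value / metres_per_unit:.1f} {suffix}"
    return label


inch_labels = make_label_func(0.0254, "in")
cm_labels = make_label_func(0.01, "cm")

ticks_m = [0.0, 0.0254, 0.0508]          # tick positions in metres
print([inch_labels(t) for t in ticks_m])
print([cm_labels(t) for t in ticks_m])
```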

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
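
The "one converter per axis, set by the first call" rule could be sketched roughly as follows. `AxisSketch`, `Inches`, and `inches_to_metres` are hypothetical names made up for this example, and the registry is just a dict from type to converter function; this is not how Matplotlib's axis classes are actually written.

```python
class Inches:
    """Hypothetical unit object."""
    def __init__(self, value):
        self.value = value


def inches_to_metres(x):
    # The converter itself decides what to do with bare floats:
    # here they are passed through unchanged.
    if isinstance(x, Inches):
        return x.value * 0.0254
    return float(x)


class AxisSketch:
    """Hypothetical stand-in for an axis that latches onto one converter."""

    def __init__(self, registry):
        self._registry = registry   # maps type -> converter function
        self.converter = None       # chosen by the first data passed in

    def convert(self, values):
        values = list(values)
        if self.converter is None and values:
            self.converter = self._registry.get(type(values[0]), float)
        conv = self.converter if self.converter is not None else float
        # Every later call, including bare floats, goes through the converter.
        return [conv(v) for v in values]

    def clear_converter(self):
        """Explicit reset, for users who know why they need it."""
        self.converter = None


axis = AxisSketch({Inches: inches_to_metres})
first = axis.convert([Inches(8)])         # sets the converter
second = axis.convert([1.0, Inches(2)])   # reuses it, floats included
print(first, second)
```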

What's missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.

Cheers,   Jody




_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel

Re: Units discussion...

Antony Lee
If you think you can make this work, I'm all for it.  This would definitely be a project where I think a large PR covering many changes would be nicer (well, it'd still be hell to review...) to convince us skeptics :-) that the approach is indeed viable.
Antony

2018-02-08 19:02 GMT+01:00 David Stansby <[hidden email]>:
*puts hand up* I'm (sort of...) a Matplotlib developer and use (am starting to use) units in my day to day research.

My proposal remains this:
  • The user provides a method for converting objects of their custom class to floating point numbers
  • The user provides a method for converting floating point numbers to their custom class
  • Everything in Matplotlib that accepts data accepts custom type objects
  • The first thing it does is convert these objects to floats, and then everything we do internally is with those floats

Can anyone point out reasons that this isn't the right way to do it? This has the advantages:

  • Matplotlib takes no responsibility for the conversion
  • We only ever calculate with floats
  • You can use whatever objects you want, as long as your converter goes object --> float

Having thought a little bit this seems like the obvious way to do it, but I may be missing something.

David


On 8 February 2018 at 17:39, Jody Klymak <[hidden email]> wrote:

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data. 

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
   a) in matplotlib modules
   b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That's maybe fine, but as a new developer, I found the units support woefully under-documented.  The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: [hidden email] <[hidden email]> on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From Ted's email it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to work completely in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so that the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however, Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level APIs like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

 *   A mapping from your unit objects to floating point numbers
 *   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible, when the user requests things such as the axis limits.

Also worth noting that any major change like this will go into Matplotlib 3.0 at the earliest, so will be Python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to spell out explicitly.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <[hidden email]> on behalf of Jody Klymak <[hidden email]>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation….

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.

What's missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.

Cheers,   Jody





Re: Units discussion...

Drain, Theodore R (392P)
It sounds like the solution to not supporting units everywhere was to tell users to add a one line call to all inputs and a one line call to all outputs to convert to/from floats.  So if that's a viable solution, then doesn't it follow that having that code in the library is also a viable solution? And isn't the point of a library to reduce the same code being written over and over again?  I'm not saying it's not work.  But I don't see any concrete problems with the approach being put forth.

One possible plan might be:

1) Decide on the data management approach.  Starting with something like "units are removed and added at the function interface points and not kept internally" might be fine.  So basically classes have to do the input->internal conversion for inputs and the internal->output conversion for return values but unitized data is not kept internally.  This does require that unit converters also handle non-unitized data as MPL classes will be calling other methods with their internal data (non-unitized) but that shouldn't be a huge problem.  This might also lead to a class hierarchy of unit converters where a user converter tries to handle the data and if it can't, it calls the base converter which handles floats, numpy arrays, etc.

2) Update the doc processing (or just codify a standard) to identify unit supporting methods.  Add developer docs to explain how and where external<->internal data conversions should occur.

3) Implement "standard" date and unit test classes and converters to use for all test cases

4) Create a prioritized list of the APIs to work on.  Start with Axes and add standardized unit handling in all of those methods first (or maybe prioritize the methods in Axes since there are a lot of them).  Ideally, this includes updating the docs and having unit and non-unit test cases for each.  Once that's done and proves the concept, start working through the underlying Artists and Patches as time permits.
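
The converter hierarchy mentioned in step 1 might look something like the sketch below: a user converter handles its own type and defers everything else to a base converter that understands plain numbers. `BaseConverter`, `Quantity`, and `QuantityConverter` are hypothetical names for illustration, not existing Matplotlib classes.

```python
class BaseConverter:
    """Handles data that is already unit-free (numbers, lists of numbers)."""

    def to_float(self, x):
        if isinstance(x, (list, tuple)):
            # Dispatch per element so subclasses still see their own types.
            return [self.to_float(v) for v in x]
        return float(x)


class Quantity:
    """Hypothetical unit object: a value and its scale to the base unit."""
    def __init__(self, value, scale):
        self.value = value
        self.scale = scale


class QuantityConverter(BaseConverter):
    """User converter: tries its own type first, then falls back to the base."""

    def to_float(self, x):
        if isinstance(x, Quantity):
            return x.value * x.scale
        return super().to_float(x)


conv = QuantityConverter()
print(conv.to_float(Quantity(2, 0.01)))         # unitized input
print(conv.to_float(3))                         # bare number
print(conv.to_float([1, Quantity(1, 1000.0)]))  # mixed sequence
```

Because non-unitized data falls through to the base class, internal code that passes plain floats between methods would keep working unchanged.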

I think this approach allows individual methods to be updated and tested.  They don't all have to be done at once and since the converter has to handle non-unitized data, this shouldn't break existing code.  It seems like each method in Axes can be its own PR w/ test case to make review and merging simpler.  Working from the top (Axes) down also means that the internal classes don't need to be changed.
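
One way to picture "units are removed and added at the function interface points" on a per-method basis is a small decorator, sketched below. Everything here (`unitized`, `Km`, `LimitsSketch`, and the metre-based internal storage) is hypothetical and only meant to show that methods can be converted one at a time while the stored data stays as plain floats.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class Km:
    """Hypothetical unit object."""
    value: float


def km_to_float(x):
    # Accept unitized and bare inputs alike; internal unit is metres.
    return x.value * 1000.0 if isinstance(x, Km) else float(x)


def float_to_km(f):
    return Km(f / 1000.0)


def unitized(to_float, from_float):
    """Strip units from positional arguments and re-attach them on return."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(self, *args):
            result = func(self, *(to_float(a) for a in args))
            return None if result is None else from_float(result)
        return wrapper
    return decorate


class LimitsSketch:
    def __init__(self):
        self._xmax = 0.0   # always a plain float (metres) internally

    @unitized(km_to_float, float_to_km)
    def set_xmax(self, xmax):
        self._xmax = xmax

    @unitized(km_to_float, float_to_km)
    def get_xmax(self):
        return self._xmax


lim = LimitsSketch()
lim.set_xmax(Km(2))
print(lim._xmax, lim.get_xmax())
```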

I can probably help w/ resources for doing this but I'll have to check on availability once a plan is finalized.

Ted



________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=[hidden email]> on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 10:16 AM
To: David Stansby
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

If you think you can make this work, I'm all for it.  This would definitely be a project where I think a large PR covering many changes would be nicer (well, it'd still be hell to review...) to convince us skeptics :-) that the approach is indeed viable.
Antony

2018-02-08 19:02 GMT+01:00 David Stansby <[hidden email]>:
*puts hand up* I'm (sort of...) a Matplotlib developer and use (am starting to use) units in my day to day research.

My proposal remains this:

  *   The user provides a method for converting objects of their custom class to floating point numbers
  *   The user provides a method for converting floating point numbers to their custom class
  *   Everything in Matplotlib that accepts data accepts custom type objects
  *   The first thing it does is convert these objects to floats, and then everything we do internally is with those floats

Can anyone point out reasons that this isn't the right way to do it? This has the advantages:

  *   Matplotlib takes no responsibility for the conversion
  *   We only ever calculate with floats
  *   You can use whatever objects you want, as long as your converter goes object --> float

Having thought a little bit this seems like the obvious way to do it, but I may be missing something.

David

On 8 February 2018 at 17:39, Jody Klymak <[hidden email]> wrote:

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
   a) in matplotlib modules
   b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That's maybe fine, but as a new developer, I found the units support woefully under-documented.  The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: [hidden email] on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).
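
To make the affine case concrete, here is a minimal sketch in plain Python. It assumes the stored floats are in one unit and only the tick labels are wanted in another; the function name `make_affine_label` and the Celsius/Fahrenheit choice are illustrative assumptions, not Matplotlib API.

```python
def make_affine_label(scale, offset, suffix):
    """Return a tick-label function mapping an internal float to text.

    The label value is scale * x + offset (an affine map), so the
    plotted data itself never has to be converted back.
    """
    def label(x):
        return f"{scale * x + offset:g} {suffix}"
    return label


# Internal (deunitized) floats are assumed to be degrees Celsius.
to_fahrenheit = make_affine_label(9 / 5, 32.0, "degF")

tick_positions = [0.0, 50.0, 100.0]   # stored and drawn as Celsius floats
tick_labels = [to_fahrenheit(t) for t in tick_positions]
print(tick_labels)
```

A non-affine relation (such as the relativistic time frames mentioned above) would not fit this pattern, because a single scale and offset could no longer describe the mapping.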

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level API's like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

 *   A mapping from your unit objects to floating point numbers
 *   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
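
A rough sketch of what those two functions could look like, in plain Python. The `Length` class, the unit names, and the choice of meters as the internal float are all assumptions made for illustration; real unit packages and the existing `units.ConversionInterface` look different.

```python
from dataclasses import dataclass


# Hypothetical toy unit type; real packages (pint, astropy.units) differ.
@dataclass(frozen=True)
class Length:
    value: float
    unit: str  # "m", "cm" or "inch"


_METERS_PER = {"m": 1.0, "cm": 0.01, "inch": 0.0254}


def to_float(length):
    """Function 1: unit object -> float (always meters internally)."""
    return length.value * _METERS_PER[length.unit]


def from_float(x, unit="m"):
    """Function 2: internal float -> unit object, in a unit of our choosing."""
    return Length(x / _METERS_PER[unit], unit)


internal = to_float(Length(8.0, "inch"))   # what the plotting code would store
limit = from_float(internal)               # what the user gets back on request
print(internal, limit)
```

Note that the round trip returns meters rather than the inches that were supplied, which is the behaviour described in the original proposal.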

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to have explicitly spelled out.  Things like which API's in MPL should accept units and which won't and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=[hidden email]> on behalf of Jody Klymak <[hidden email]>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation….

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied, e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates), the downstream-supplied tick formatter and labeller would do it.

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
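
The "one converter per axis" rule could be sketched roughly as below. `ToyAxis`, `pick_converter`, and the `(value, unit)` tuple convention are hypothetical stand-ins chosen for the example; none of them is Matplotlib's actual `Axis` or registry code.

```python
class ToyAxis:
    """Hypothetical stand-in for an axis; not Matplotlib's Axis class."""

    def __init__(self):
        self.converter = None

    def update_data(self, data):
        # The first call fixes the converter; later calls reuse it.
        if self.converter is None:
            self.converter = pick_converter(data)
        return self.converter(data)

    def clear_converter(self):
        # Explicit reset, for users who know what they are doing.
        self.converter = None


def pick_converter(data):
    """Choose a converter from the type of the first datum (assumed rule)."""
    if isinstance(data[0], tuple):          # (value, unit) pairs
        return lengths_to_meters
    return lambda values: [float(v) for v in values]


def lengths_to_meters(values):
    factors = {"m": 1.0, "inch": 0.0254}
    out = []
    for v in values:
        if isinstance(v, tuple):
            out.append(v[0] * factors[v[1]])
        else:
            out.append(float(v))            # this converter lets bare floats through
    return out


axis = ToyAxis()
first = axis.update_data([(1.0, "m"), (10.0, "inch")])
second = axis.update_data([2.0, 3.0])       # bare floats go to the same converter
print(first, second)
```

Whether a converter accepts bare floats, as this one does, or rejects them is left to the converter itself under the proposal.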

What's missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.

Cheers,   Jody




_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel

Re: Units discussion...

Jody Klymak
In reply to this post by Drain, Theodore R (392P)


On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <[hidden email]> wrote:

I think we can help with building a better toy unit system.  Or we can standardize on datetime and some existing unit package.  Whatever makes it easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units handling in that it doesn’t support multiple units.  It's really just a class of data that we have a known conversion to float for.

What we need an example of is how the following should work. 

```python
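import numpy as np

# `myunitclass` is a placeholder for a unit package; `in` is a reserved
# word in Python, so the inch unit is spelled `inch` here.
x = np.arange(10)
y = x*2 * myunitclass.inch
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)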
```

So when a new feature is added, we can ask that its units support is made clear.  I guess I don’t mind if those are astropy units or yt units, or pint, or?? though there will be some pushback about including another test dependency.  

Would pint units work?  It's a very small dependency, but maybe it is not as full-featured, or is structured wildly differently from the others?

A test suite to my mind would:
 - test basic functionality
 - test mixing units of the same dimension (e.g. inches and centimeters)
 - test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
 - test that disallowed mixed dimensions fail.
 - ??
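
Two of these checks could be sketched against a toy unit class as follows. The `Quantity` class and its unit table are invented for this example and stand in for whichever unit package ends up being used; the tests exercise the unit class only, not any plotting call.

```python
# Hypothetical toy unit class for tests; not part of Matplotlib or any unit package.
class Quantity:
    _TO_BASE = {"m": ("length", 1.0), "cm": ("length", 0.01),
                "inch": ("length", 0.0254), "s": ("time", 1.0)}

    def __init__(self, value, unit):
        self.value, self.unit = value, unit

    @property
    def dimension(self):
        return self._TO_BASE[self.unit][0]

    def to(self, unit):
        if self._TO_BASE[unit][0] != self.dimension:
            raise ValueError(f"cannot convert {self.unit} to {unit}")
        factor = self._TO_BASE[self.unit][1] / self._TO_BASE[unit][1]
        return Quantity(self.value * factor, unit)


def test_mixing_allowed_dimensions():
    # inches and centimeters share a dimension, so they can share an axis
    assert abs(Quantity(1.0, "inch").to("cm").value - 2.54) < 1e-12


def test_disallowed_mixed_dimensions_fail():
    try:
        Quantity(1.0, "s").to("cm")
    except ValueError:
        return
    raise AssertionError("mixing time and length should fail")


test_mixing_allowed_dimensions()
test_disallowed_mixed_dimensions_fail()
```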

Cheers,  Jody









________________________________________
From: Jody Klymak <[hidden email]>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
   a) in matplotlib modules
   b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That's maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: [hidden email] on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level API's like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

*   A mapping from your unit objects to floating point numbers
*   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to have explicitly spelled out.  Things like which API's in MPL should accept units and which won't and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <[hidden email]> on behalf of Jody Klymak <[hidden email]>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation….

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied, e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates), the downstream-supplied tick formatter and labeller would do it.

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.

What's missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.

Cheers,   Jody




--
Jody Klymak    







Re: Units discussion...

Nathan Goldbaum


On Thu, Feb 8, 2018 at 12:48 PM, Jody Klymak <[hidden email]> wrote:


On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <[hidden email]> wrote:

I think we can help with building a better toy unit system.  Or we can standardize on datetime and some existing unit package.  Whatever makes it easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units handling in that it doesn’t support multiple units.  It's really just a class of data that we have a known conversion to float for.

What we need an example of is how the following should work. 

```python
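import numpy as np

# `myunitclass` is a placeholder for a unit package; `in` is a reserved
# word in Python, so the inch unit is spelled `inch` here.
x = np.arange(10)
y = x*2 * myunitclass.inch
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)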
```

So when a new feature is added, we can ask that its units support is made clear.  I guess I don’t mind if those are astropy units or yt units, or pint, or?? though there will be some pushback about including another test dependency.  

Would pint units work?  It's a very small dependency, but maybe it is not as full-featured, or is structured wildly differently from the others?

One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. There are some classes of errors that only happen for one style of unit arrays and other classes of errors that only happen for the other.
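
A scalar analogy of the two styles, using only the standard library. This is a sketch of the general subclass-versus-wrapper distinction, with `float` standing in for `ndarray`; the class names are made up, and real ndarray subclasses and pint's wrapper have hooks that change the details.

```python
class SubclassLength(float):
    """Unit-carrying subclass of float (scalar analogue of an ndarray subclass)."""

    def __new__(cls, value, unit):
        obj = super().__new__(cls, value)
        obj.unit = unit
        return obj


class WrapperLength:
    """Unit-carrying wrapper holding a plain float (analogue of a wrapper array)."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit


sub = SubclassLength(2.0, "m")
wrapped = WrapperLength(2.0, "m")

# Subclass style: code written for floats runs, but the unit is silently lost.
doubled = sub * 2
print(type(doubled).__name__, hasattr(doubled, "unit"))

# Wrapper style: code written for floats fails loudly instead.
try:
    wrapped * 2
except TypeError as exc:
    print("wrapper rejected:", exc)
```

So a test that only uses one style can miss the failure mode of the other: silent unit loss for the subclass, and type errors for the wrapper.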
 

A test suite to my mind would:
 - test basic functionality
 - test mixing units of the same dimension (e.g. inches and centimeters)
 - test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
 - test that disallowed mixed dimensions fail.
 - ??

Cheers,  Jody









________________________________________
From: Jody Klymak <[hidden email]>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
   a) in matplotlib modules
   b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That’s maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: [hidden email] on behalf of Antony Lee <[hidden email]>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever-moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently rely only on the old JPL unit code that ended up integrated into Matplotlib's test suite, and do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From Ted's email it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to work completely in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however, Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).
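
To make the affine argument concrete, here is a small sketch under assumed units: Fahrenheit as the unitized display value and Kelvin as the deunitized internal float. Evenly spaced display ticks stay evenly spaced internally when the mapping is affine, which is why placing and labelling ticks is enough. The squaring function is only a hypothetical stand-in for a non-affine mapping; it is not a model of the relativistic time frames Ted mentioned.

```python
def f_to_k(f):
    """Affine map: Fahrenheit (display) -> Kelvin (internal float)."""
    return (f - 32.0) * 5.0 / 9.0 + 273.15


def k_to_f(k):
    """Inverse of f_to_k."""
    return (k - 273.15) * 9.0 / 5.0 + 32.0


def spacing(positions):
    """Differences between consecutive tick positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


display_ticks = [0.0, 50.0, 100.0]

# Affine mapping: equal steps in display units give equal internal steps.
affine_steps = spacing([f_to_k(t) for t in display_ticks])

# Hypothetical non-affine mapping: equal display steps become unequal.
nonaffine_steps = spacing([t ** 2 for t in display_ticks])

print(affine_steps)
print(nonaffine_steps)
```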

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level API's like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

*   A mapping from your unit objects to floating point numbers
*   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
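
A minimal sketch of that two-function idea, with unit objects represented as plain `(value, unit)` tuples. The names `to_float`, `from_float`, and `ToyAxis` are hypothetical and are not part of the current `units.ConversionInterface`; the point is only that the axis stores floats and applies the inverse as late as possible.

```python
FACTORS = {"m": 1.0, "cm": 0.01, "inch": 0.0254}  # meters per unit


def to_float(quantity):
    """Function 1: unit object -> float (always meters internally)."""
    value, unit = quantity
    return value * FACTORS[unit]


def from_float(x):
    """Function 2: float -> unit object, in the converter's own convention."""
    return (x, "m")


class ToyAxis:
    """Stand-in for an axis; it only ever stores bare floats."""

    def __init__(self):
        self._lim = (0.0, 1.0)

    def set_lim(self, lo, hi):
        # Units are stripped as soon as they enter.
        self._lim = (to_float(lo), to_float(hi))

    def get_lim(self):
        # The inverse is applied only when the user asks for data back.
        return tuple(from_float(v) for v in self._lim)


axis = ToyAxis()
axis.set_lim((0, "inch"), (8, "inch"))
print(axis.get_lim())  # upper limit comes back in meters, not inches
```

Note that the user supplied inches and gets meters back, which matches the "not necessarily in the unit that the user supplied" caveat from the original proposal.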

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at  14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out.  Things like which API's in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <[hidden email]> on behalf of Jody Klymak <[hidden email]>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation….

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied, e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
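
A sketch of how a downstream formatter could do that, assuming the stored floats are meters. The `make_formatter` helper and the unit table are hypothetical; the `(x, pos)` signature simply mirrors the callable that Matplotlib's `FuncFormatter` expects, so the returned function could plausibly be wrapped that way.

```python
M_PER_UNIT = {"m": 1.0, "cm": 0.01, "inch": 0.0254}  # meters per display unit


def make_formatter(display_unit):
    """Build a tick-label function for floats that are stored in meters."""
    factor = M_PER_UNIT[display_unit]

    def fmt(x_m, pos=None):
        # Only the label changes; the plotted float stays in meters.
        return f"{x_m / factor:.2f} {display_unit}"

    return fmt


label_in_inches = make_formatter("inch")
label_in_cm = make_formatter("cm")

for tick in (0.0, 0.0254, 0.0508):
    print(label_in_inches(tick), "|", label_in_cm(tick))
```

Switching the plot from inches to centimeters then means swapping the formatter (and a matching locator), not touching the stored data.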

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
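
The one-converter-per-axis rule could look roughly like the sketch below. Everything here (`LengthConverter`, `FloatConverter`, `pick_converter`, `ToyAxis`) is a hypothetical illustration, not the existing registry in `matplotlib.units`; unit objects are again plain `(value, unit)` tuples.

```python
FACTORS = {"m": 1.0, "cm": 0.01, "inch": 0.0254}  # meters per unit


class LengthConverter:
    """Accepts (value, unit) tuples and chooses to pass bare numbers through."""

    def convert(self, v):
        if isinstance(v, tuple):
            value, unit = v
            return value * FACTORS[unit]
        return float(v)  # bare floats are taken to be meters already


class FloatConverter:
    """Accepts bare numbers only; float() rejects anything else."""

    def convert(self, v):
        return float(v)


def pick_converter(sample):
    return LengthConverter() if isinstance(sample, tuple) else FloatConverter()


class ToyAxis:
    def __init__(self):
        self.converter = None

    def add_data(self, values):
        if self.converter is None:
            # The first call to the axis decides the converter.
            self.converter = pick_converter(values[0])
        # Every later call, bare floats included, goes through it.
        return [self.converter.convert(v) for v in values]

    def set_converter(self, converter):
        # Explicit override, for users who know what they are doing.
        self.converter = converter


ax = ToyAxis()
print(ax.add_data([(1, "inch"), (2, "inch")]))  # converter fixed here
print(ax.add_data([0.5]))                       # bare float, same converter
```

An axis whose first call received bare floats would get `FloatConverter`, and a later unitized call would then fail with a `TypeError` unless the user resets the converter.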

What’s missing?  I don’t think this is wildly different than what we have, but maybe a bit more clear.

Cheers,   Jody





--
Jody Klymak    






_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel




Re: Units discussion...

Jody Klymak


One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. Some classes of errors only happen for one style of unit array, and other classes of errors only happen for the other.


OK, glad I asked.  Is that the case w/ JPL units as well?  If that stipulation makes things easier, maybe we could enforce it?

Is there a smaller library that subclasses ndarray for units support?  I imagine we could vendor a subset of whatever astropy or yt do.  Or maybe they aren’t so huge that they would be unreasonable to make as test dependencies.  yt is only 68 MB.


Cheers,   Jody

--
Jody Klymak    







Re: Units discussion...

Drain, Theodore R (392P)
In reply to this post by Nathan Goldbaum
Does numpy subclassing really matter?  If the docs say the unit converter must convert from the external type to the internal type, then as long as the converter does that, it doesn't matter what the external type is or what it inherits from right?  The point is that the converter class is the only class manipulating the external data objects - MPL shouldn't care what they are or what they inherit from.

I think one issue is that data types are malleable in the API right now.  Lists, tuples, numpy, ints, floats, etc are all possible inputs in many/most cases.  IMO, the unit API should not be malleable at all.  The unit converter API should say that the return type of external->internal conversion is always a specific value type (e.g. list of float, numpy float 64 array).
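
As a sketch of that "not malleable" contract: whatever container comes in, the external->internal conversion below always returns a list of float. The function name `to_internal` is hypothetical, and a float64 numpy array would serve equally well as the fixed return type.

```python
from numbers import Real


def to_internal(values):
    """Always return a list of float, whatever container came in."""
    if isinstance(values, Real):
        values = [values]  # promote a scalar to a one-element sequence
    return [float(v) for v in values]


print(to_internal(3))
print(to_internal((1, 2)))
print(to_internal(range(3)))
```

Code downstream of the converter can then assume one type and never has to branch on lists, tuples, ints, or scalars.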

Jody: IMO, your example should plot the data in inches in the first plot call, then convert the second input to inches and plot that.  The plot call supports the xunits keyword argument, which tells the converter what floating point unit conversion to apply.  If that keyword is not specified, then it defaults to the type of the input.  The example that needs to be more clear is if I do this:

ax.plot( x1, y1, xunits="km" )
ax.plot( x2, y2, xunits="miles" )

IMO, either the floats are km or miles, not both.  So either the first call sticks the converter to using km and the second xunits is ignored.  Or the second input overrides the first and requires that the first artists go back through a conversion to miles.  Either is a reasonable choice for behavior (but the first is much easier to implement).
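
The two choices can be sketched with a toy axis. This is an illustration under assumed names (`ToyAxis`, `set_units`, `(value, unit)` tuples as unit objects), not how `Axes.plot` is implemented. `plot` implements the first option, where the first call fixes the float unit and later `xunits` values are ignored; `set_units` shows what the second option would have to do to the artists that already exist.

```python
KM_PER_UNIT = {"km": 1.0, "miles": 1.609344}  # 1 mile is exactly 1.609344 km


class ToyAxis:
    def __init__(self):
        self.units = None   # float unit of the axis, fixed by the first plot
        self.artists = []   # each artist holds floats expressed in self.units

    def plot(self, quantities, xunits=None):
        """quantities is a list of (value, unit) tuples."""
        if self.units is None:
            # Default to the unit of the input when xunits is not given.
            self.units = xunits or quantities[0][1]
        # Option one: a later xunits argument is ignored, the first call wins.
        floats = [v * KM_PER_UNIT[u] / KM_PER_UNIT[self.units]
                  for v, u in quantities]
        self.artists.append(floats)
        return floats

    def set_units(self, new_units):
        """Option two: re-convert every existing artist to the new unit."""
        scale = KM_PER_UNIT[self.units] / KM_PER_UNIT[new_units]
        self.artists = [[f * scale for f in a] for a in self.artists]
        self.units = new_units


ax = ToyAxis()
ax.plot([(1.0, "km")], xunits="km")
ax.plot([(1.0, "miles")], xunits="miles")  # stored in km under option one
print(ax.units, ax.artists)
```

Option two needs every artist to be reachable and re-convertible after the fact, which is the extra implementation cost mentioned above.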

________________________________________
From: Nathan Goldbaum <[hidden email]>
Sent: Thursday, February 8, 2018 10:52 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

On Thu, Feb 8, 2018 at 12:48 PM, Jody Klymak <[hidden email]> wrote:


On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <[hidden email]> wrote:

I think we can help with building a better toy unit system.  Or we can standardize on datetime and some existing unit package.  Whatever makes it easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units handling in that it doesn’t support multiple units.  It’s really just a class of data that we have a known conversion to float for.

What we need an example of is how the following should work.

```python
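import numpy as np

# `myunitclass` is a placeholder for some unit library; `ax` is an existing Axes.
x = np.arange(10)
y = x*2 * myunitclass.inch  # `in` is a Python keyword, so spelled `inch` here
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)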
```

So when a new feature is added, we can ask that its units support is made clear.  I guess I don’t mind if those are astropy units or yt units, or pint, or?? though there will be some pushback about including another test dependency.

Would pint units work?  It's a very small dependency, but maybe not as full featured, or structured wildly differently from the others?

One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. Some classes of errors only happen for one style of unit array, and other classes of errors only happen for the other.


A test suite to my mind would
 - test basic functionality
 - test mixing compatible units of the same dimension (e.g. inches and centimeters)
 - test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
 - test that disallowed mixed dimensions fail.
 - ??

Cheers,  Jody









________________________________________
From: Jody Klymak <[hidden email]<mailto:[hidden email]>>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).

2) write a developer’s guide explaining how units should be/are implemented
a) in matplotlib modules
       b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  Thats maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.

Thanks a lot,   Jody


This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without something considering units which then triggers a bug reports which require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: [hidden email]<mailto:[hidden email]><mailto:[hidden email]> <[hidden email]<mailto:[hidden email]><mailto:[hidden email]>> on behalf of Antony Lee <[hidden email]<mailto:[hidden email]><mailto:[hidden email]>>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]<mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]>>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to easier and faster so I think that's a plus.  The biggest issue we're going to run in to is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level API's like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official' public API vs what can be optimized to just use floats.  Obviously the dev's would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <[hidden email]<mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]>>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

*   A mapping from your unit objects to floating point numbers
*   A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <[hidden email]<mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

On Feb 6, 2018, at  14:17 PM, Drain, Theodore R (392P) <[hidden email]<mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spelled out.  Things like which API's in MPL should accept units and which won't and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set to one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which work with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=[hidden email]<mailto:matplotlib-devel-bounces+ted.drain=[hidden email]><mailto:matplotlib-devel-bounces+ted.drain=[hidden email]><mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>> on behalf of Jody Klymak <[hidden email]<mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to a more stringent documentation and implimentation….

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the the tick locators and formatters.  It would also have an inverse that would supply data back to the user in unit-aware data (though not necessarily in the unit that the user supplied. e.g. if they supply 8*in, the and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply.).

User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with ticks locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates), the downstream-supplied tick formatter and labeller would do it.

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.

What’s missing?  I don’t think this is wildly different from what we have, but maybe a bit clearer.

Cheers,   Jody




_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel

--
Jody Klymak
http://web.uvic.ca/~jklymak/







Re: Units discussion...

Jody Klymak


> On 8 Feb 2018, at 11:08, Drain, Theodore R (392P) <[hidden email]> wrote:
>
> Does numpy subclassing really matter?  If the docs say the unit converter must convert from the external type to the internal type, then as long as the converter does that, it doesn't matter what the external type is or what it inherits from right?  The point is that the converter class is the only class manipulating the external data objects - MPL shouldn't care what they are or what they inherit from.
>
> I think one issue is that data types are malleable in the API right now.  Lists, tuples, numpy, ints, floats, etc are all possible inputs in many/most cases.  IMO, the unit API should not be malleable at all.  The unit converter API should say that the return type of external->internal conversion is always a specific value type (e.g. list of float, numpy float 64 array).

Yep, I think we all agree on this, but it ends up being messy….


> Jody: IMO, your example should plot the data in inches in the first plot call, then convert the second input to inches and plot that.  The plot call supports the xunits keyword argument, which tells the converter what floating point unit conversion to apply.  If that keyword is not specified, then it defaults to the type of the input.  The example that needs to be clearer is if I do this:
>
> ax.plot( x1, y1, xunits="km" )
> ax.plot( x2, y2, xunits="miles" )
>
> IMO, either the floats are km or miles, not both.  So either the first call sticks the converter to using km and the second xunits is ignored.  Or the second input overrides the first and requires that the first artists go back through a conversion to miles.  Either is a reasonable choice for behavior (but the first is much easier to implement).

That’d be great.  That’s not what our toy does now.  This way of setting the units is also not very flexible. I could imagine users wanting to change units at some point, either by setting the units in the `ax.plot` calls or explicitly on the `Axis` objects themselves.  If we carry the unitized objects around, and only convert at draw time, post-facto conversion is fine.  If we carry de-unitized data around, then we need an inverse so we can re-convert.
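
To make the "we need an inverse" point concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and only for illustration: `Quantity` stands in for whatever unit object pint/astropy/yt would supply, and `LengthConverter`, `to_floats` and `from_floats` are made-up names, not the existing `units.ConversionInterface`.

```python
from dataclasses import dataclass

# Hypothetical conversion factors to the base unit (meters).
TO_METERS = {"m": 1.0, "cm": 0.01, "in": 0.0254, "km": 1000.0}


@dataclass(frozen=True)
class Quantity:
    """A toy unit-aware value; a stand-in for a real unit library object."""
    value: float
    unit: str


class LengthConverter:
    """Toy converter: unitized values <-> bare floats in meters."""

    @staticmethod
    def to_floats(quantities):
        # Forward mapping, applied once when data enters the plot.
        return [q.value * TO_METERS[q.unit] for q in quantities]

    @staticmethod
    def from_floats(floats, unit="m"):
        # Inverse mapping, applied only when the user asks for data back.
        return [Quantity(f / TO_METERS[unit], unit) for f in floats]


# Mixed inches and centimeters both end up as meter floats.
stored = LengthConverter.to_floats([Quantity(8.0, "in"), Quantity(50.0, "cm")])

# The stored floats can be handed back in any unit the converter knows about.
in_inches = LengthConverter.from_floats(stored, unit="in")
```

With something like this, the stored data is only ever floats, and re-converting to a different unit later is just a call to the inverse.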

*My* idea, which some others have also espoused, is that the converter converts to floats (likely representing some base unit that makes sense, i.e. in the example above “meters” or “kilometers”).  The “xunits” are malleable until draw time, at which point the Formatter and Locator decide how to format themselves.  The xdata is never changed.  That’s basically how our datetime formatting works - it is converted to days since epoch and then the Formatter and Locator decide how to format the axis.  I think this works equally well for other artists plotted in dataspace.  Because you have an inverse function, other tools that rely on getting data-space data like cursor position, making a box, etc., can still return the inverse in appropriate units.
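
A rough sketch of what I mean by the formatter deciding at draw time, again in plain Python with made-up names (`PER_METER` and `make_tick_formatter` are hypothetical). The returned function takes `(value, pos)`, which is the call shape that `matplotlib.ticker.FuncFormatter` expects, but nothing here depends on Matplotlib.

```python
# Hypothetical scale factors from the stored base unit (meters) to display units.
PER_METER = {"m": 1.0, "cm": 100.0, "km": 0.001}


def make_tick_formatter(unit):
    """Return a tick-label function that shows meter floats in `unit`."""
    scale = PER_METER[unit]

    def format_tick(value_in_meters, pos=None):
        # Only the label changes; the stored float is left untouched.
        return f"{value_in_meters * scale:g} {unit}"

    return format_tick


# The data stored on the artist, always in meters.
xdata = [0.0, 500.0, 1000.0]

# Switching display units means swapping the formatter, not touching xdata.
labels_m = [make_tick_formatter("m")(x) for x in xdata]
labels_km = [make_tick_formatter("km")(x) for x in xdata]
# labels_km is ['0 km', '0.5 km', '1 km']
```

A matching Locator would pick tick positions that are round numbers in the display unit; that part is left out here.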

Cheers,  Jody




> ________________________________________
> From: Nathan Goldbaum <[hidden email]>
> Sent: Thursday, February 8, 2018 10:52 AM
> To: Jody Klymak
> Cc: Drain, Theodore R (392P); matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> On Thu, Feb 8, 2018 at 12:48 PM, Jody Klymak <[hidden email]> wrote:
>
>
> On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <[hidden email]> wrote:
>
> I think we can help with building a better toy unit system.  Or we can standardize on datetime and some existing unit package.  Whatever makes it easier for people to write test cases.
>
> For me, the problem w/ datetime is that it is not fully featured units handling in that it doesn’t support multiple units.  It’s really just a class of data for which we have a known conversion to float.
>
> What we need an example of is how the following should work.
>
> ```python
> x = np.arange(10)
> # `in` is a reserved word in Python, so the inch unit is spelled `inch` here
> y = x*2 * myunitclass.inch
> ax.plot(x, y)
> z = x*2 * myunitclass.cm
> ax.plot(x, z)
> ```
>
> So when a new feature is added, we can ask that its units support is made clear.  I guess I don’t mind if those are astropy units or yt units, or pint, or?? though there will be some pushback about including another test dependency.
>
> Would pint units work?  It’s a very small dependency, but maybe not as full featured, or structured wildly differently from the others?
>
> One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. There are some classes of errors that only happen for one style of unit arrays and other classes of errors that only happen for the other.
>
>
> A test suite to my mind would
> - test basic functionality
> - test mixing allowed dimensions (i.e. inches and centimeters)
> - test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
> - test that disallowed mixed dimensions fail.
> - ??
>
> Cheers,  Jody
>
>
>
>
>
>
>
>
>
> ________________________________________
> From: Jody Klymak <[hidden email]>
> Sent: Thursday, February 8, 2018 9:39 AM
> To: Drain, Theodore R (392P)
> Cc: matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.
>
> OK, *for discussion*: A scope of work for JPL and Matplotlib might be:
>
> 1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I’m a little dismayed with the state of things).
>
> 2) write a developer’s guide explaining how units should be/are implemented
>       a) in matplotlib modules
>       b) by downstream developers (this is probably adequate already).
>
> It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists.  That’s maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.
>
> OTOH, maybe Antony and I are poor people to lead this charge, given that we don’t need unit support.  But I don’t think we are being hypercritical in pointing out it needs work.
>
> Thanks a lot,   Jody
>
>
> This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary API's for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.    Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.
>
> I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without anyone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.
>
> Ted
>
> ________________________________________
> From: [hidden email] <[hidden email]> on behalf of Antony Lee <[hidden email]>
> Sent: Thursday, February 8, 2018 8:09 AM
> To: Drain, Theodore R (392P)
> Cc: matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.
>
> One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but do not test integration with the two major unit packages I am aware of (pint and astropy.units).
>
> From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).
>
> As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).
>
> Antony
>
> 2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <[hidden email]>:
> That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.
>
> Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy like the Axes/Axis API.  But we also use low level APIs like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.
>
> Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal.  I'm sure we can help with that effort.
>
> That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.
>
> Ted
>
> ________________________________________
> From: David Stansby <[hidden email]>
> Sent: Wednesday, February 7, 2018 3:42 AM
> To: Jody Klymak
> Cc: Drain, Theodore R (392P); matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:
>
> *   A mapping from your unit objects to floating point numbers
> *   A mapping from those floats back to your unit objects
>
> As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
>
> Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.
>
> David
>
> On 7 February 2018 at 06:06, Jody Klymak <[hidden email]> wrote:
> Dear Ted,
>
> Thanks so much for engaging on this.
>
> Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.
>
> FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.
>
> Cheers,   Jody
>
> On Feb 6, 2018, at  14:17 PM, Drain, Theodore R (392P) <[hidden email]> wrote:
>
> We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.
>
> I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?
>
> The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).
>
> Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.
>
> So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.   It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).
>
> I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it clearer what kind of data is acceptable in each.
>
> Ted
> ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.
>
> ________________________________________
> From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=[hidden email]> on behalf of Jody Klymak <[hidden email]>
> Sent: Saturday, February 3, 2018 9:25 PM
> To: matplotlib development list
> Subject: [Matplotlib-devel] Units discussion...
>
> Hi all,
>
> To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation…
>
> In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).
>
> User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
>
> Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
>
> What’s missing?  I don’t think this is wildly different from what we have, but maybe a bit clearer.
>
> Cheers,   Jody
>
>
>
>
> --
> Jody Klymak
> http://web.uvic.ca/~jklymak/
>
>
>
>
>
>

--
Jody Klymak    
http://web.uvic.ca/~jklymak/




