[Matplotlib-devel] plot confidence ellipse

[Matplotlib-devel] plot confidence ellipse

Carsten Schelp
Hello,

Quite some time ago I had to plot confidence ellipses for two-dimensional datasets.
Back then, there was no out-of-the-box solution in the libraries I used.
Also, all the solutions that I found when searching online seemed cumbersome or did not even work well.

So I brewed my own. You can see the result of my effort in this blog post (Python, using numpy and matplotlib), along with an explanation of how and why it works:
https://carstenschelp.github.io/2018/09/14/Plot_Confidence_Ellipse_001.html

If you guys think that this functionality might be useful within matplotlib, I would happily prepare this approach - pythonic-checks, tests and all - and send a patch or pull request.
I consider the confidence ellipse a kind of "cousin" to the "scatter(...)" function. Maybe it is appropriate to place it next to "scatter" - but that is still to be discussed, I think.
Do you think that a function to plot the confidence ellipse of a two-dimensional dataset should be part of matplotlib?
I am curious to hear your opinion. If the answer is "yes" I will make time to get the existing code "matplotlib-ready".

(Sorry, I am posting this for the second time. The first time, I sent an HTML message, which messed up the readability on the list.)
Kind regards, Carsten
_______________________________________________
Matplotlib-devel mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/matplotlib-devel

Re: plot confidence ellipse

Jody Klymak
Hi Carsten,

Thanks a lot for the cool example, and interest in contributing to matplotlib.

The only matplotlib part of this is drawing an ellipse, which we already have code for. From what I can tell, the rest is data processing and application of statistics. While we have historically had some data processing and statistics in the package, we are actively trying to remove as much of that as possible and leave it for numpy, scipy, pandas etc. Given that, I don’t think matplotlib is the correct home for this functionality. But perhaps I’ve misunderstood your proposal. It’s certainly possible that other libraries would find this very appropriate.

Thanks,   Jody

> On 21 Feb 2019, at 12:40, Carsten Schelp <[hidden email]> wrote:
> [...]

--
Jody Klymak
http://web.uvic.ca/~jklymak/

Re: plot confidence ellipse

Antony Lee-3
In reply to this post by Carsten Schelp
Dear Carsten,

Thank you for sharing your example.
I personally think it looks like a reasonably common task, but Matplotlib is trying to move away from doing the statistical computations in favor of concentrating on just providing the plotting parts (at least that's my personal take).  Hence, the example may be a better addition to the Matplotlib gallery (under examples/ in the source tree); feel free to open a PR for that.

Cheers,

Antony

On Thu, Feb 21, 2019 at 9:40 PM Carsten Schelp <[hidden email]> wrote:
[...]

Re: plot confidence ellipse

matplotlib - devel mailing list
In reply to this post by Jody Klymak
On Thu, Feb 21, 2019 at 12:59 PM Jody Klymak <[hidden email]> wrote:
> Hi Carsten,
>
> From what I can tell, the rest is data processing and application of statistics.  While we have historically had some data processing and statistics in the package, we are actively trying to remove as much of that as possible and leave it for numpy, scipy, pandas etc.  Given that, I don’t think matplotlib is the correct home for this functionality.

Maybe statsmodels is, though, if it's not already there.

Also -- there is a bit of maybe-tricky MPL code in there with applying transforms -- so it *may* make sense to add a utility to MPL to draw such ellipses.
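
For readers who want to see what the transform step involves, here is a minimal sketch in the spirit of the blog post linked earlier in the thread (not necessarily its exact code). It draws a normalized ellipse at the origin, then rotates, scales, and shifts it with an `Affine2D` before adding it to the axes. The function name `draw_confidence_ellipse` and the `n_std` parameter are assumptions made for illustration; they are not an existing Matplotlib API.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Ellipse
from matplotlib.transforms import Affine2D


def draw_confidence_ellipse(x, y, ax, n_std=2.0, **kwargs):
    """Hypothetical helper: add a covariance ellipse for (x, y) to ax."""
    cov = np.cov(x, y)
    pearson = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

    # Ellipse of the normalized (unit-variance) data, centered at the origin.
    kwargs.setdefault("facecolor", "none")
    kwargs.setdefault("edgecolor", "tab:red")
    ellipse = Ellipse(
        (0, 0),
        width=2 * np.sqrt(1 + pearson),
        height=2 * np.sqrt(1 - pearson),
        **kwargs,
    )

    # Rotate by 45 degrees, stretch to n_std standard deviations per axis,
    # and move to the sample mean; then chain with the data transform.
    transform = (
        Affine2D()
        .rotate_deg(45)
        .scale(n_std * np.sqrt(cov[0, 0]), n_std * np.sqrt(cov[1, 1]))
        .translate(np.mean(x), np.mean(y))
    )
    ellipse.set_transform(transform + ax.transData)
    return ax.add_patch(ellipse)
```

Typical use would be `fig, ax = plt.subplots(); ax.scatter(x, y); draw_confidence_ellipse(x, y, ax)`. The part that is easy to get wrong is the order of operations: the rotation has to happen before the per-axis scaling, and the whole chain has to be composed with `ax.transData` so the patch ends up in data coordinates.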

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]

[Matplotlib-devel] plot confidence ellipse

Carsten Schelp
Thanks for the feedback, everybody! Appreciated.

It is definitely a good idea to guard that line "What should really be in MPL, and what not".

I remember my situation as a user, when I first had to plot those ellipses.
I thought "MPL gives me boxplots and histograms - I'll have a look whether I can also get a confidence ellipse, here."

Just like with scatter() or hist() I expected to put data in and get back a plot.
Not analysis, but mere visualisation.
And this is basically what confidence_ellipse() does. (Maybe I should then not return the matrix with the Pearson coefficients, for clarity.)

In the function, numpy.cov() is the only thing that looks a bit like data analysis, from my perspective.
The rest of the code is addressing that non-trivial geometric problem to get the ellipse right.
And I think that MPL is also supposed to help users with geometrical problems, as long as they concern standard plotting functionality like histograms, boxplots and confidence ellipses?
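
To make the split between "analysis" and "geometry" concrete, here is a rough numpy-only sketch, under the assumption that the ellipse is built from the Pearson coefficient as described above. Only `numpy.cov()` touches the data; everything after it is plain geometry. The name `confidence_ellipse_points` and its parameters are hypothetical, chosen just for this illustration.

```python
import numpy as np


def confidence_ellipse_points(x, y, n_std=2.0, num=100):
    """Hypothetical helper: outline of the n_std covariance ellipse, shape (num, 2)."""
    cov = np.cov(x, y)  # the only "data analysis" step
    pearson = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

    # Axis-aligned ellipse of the normalized data: radii sqrt(1 +/- pearson).
    t = np.linspace(0.0, 2.0 * np.pi, num)
    unit = np.stack([
        np.sqrt(1 + pearson) * np.cos(t),
        np.sqrt(1 - pearson) * np.sin(t),
    ])

    # Rotate by 45 degrees so the axes line up with (1, 1) and (-1, 1).
    c = s = np.sqrt(0.5)
    rot = np.array([[c, -s], [s, c]])

    # Stretch each axis to n_std standard deviations and shift to the mean.
    scale = n_std * np.sqrt(np.diag(cov))
    mean = np.array([np.mean(x), np.mean(y)])
    return (scale[:, None] * (rot @ unit)).T + mean
```

One way to check the geometry: every returned point should lie at Mahalanobis distance `n_std` from the sample mean with respect to `np.cov(x, y)`.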

hist(), for instance, uses numpy.histogram() to bin the data. But only to get the plot right, not for analysis. And hist() is not to be considered particularly data-aware.

The statsmodels module on the other hand has a lot of data-awareness and no geometry/plot awareness.
confidence_ellipse() would not really fit in there, I think.

Please don't get me wrong: I am not arguing. I just want to show you where my idea came from: "Hey this would be handy to have in MPL".

The MPL gallery is a great place to show the "howto", though. For the time being, I can submit an example there.

If the devel community changes its mind, it knows how to find me :-)


Kind regards, Carsten