Different Types of Basic Matplotlib Charts

00:00 Speaker 1: What is going on, everybody, and welcome to the second part of section 2 for data visualization with Python using matplotlib. So in this section we're going to be just talking about the basic plotting options, how to use matplotlib and all of that. Now one of the most basic versions of the graph that we can possibly do is one where we just have some points and then we connect those points by lines, and this is generally just referred to as a line graph. This is also the default graph of matplotlib in most cases and it makes for a great introduction to the module. So first of all, to use most of matplotlib, typically your first line is gonna be an import, and it's gonna be "import matplotlib.pyplot as plt." Now the pyplot framework is one of the main frameworks for creating charts and graphs in matplotlib.

01:02 S1: Now as a side note, the original idea for matplotlib was to have it overall mimic the MATLAB plotting framework. MATLAB, for those of you who do not know, is a high-level language with an interactive development style a lot like what Python can be, though MATLAB is almost solely used by engineering and scientific folks. It's not as generic or general as Python can be. And now really there's nothing MATLAB does that Python doesn't. So that was kind of why the creator of matplotlib, or creators, saw that there was a little bit of a hole in Python as far as scientific graphing was concerned. Now once you've imported this, you're ready to do basic graphs. You don't actually need any other imports besides this pyplot module. Now what we're going to do is, let's just do a really simple plot, so we can use plt. And the reason why we import this as plt... You could do "from matplotlib import pyplot" and that would be totally fine.

02:06 S1: It's just that most people tend to use plt as the shorthand for pyplot, so it's just kind of a standard that people use, so we'll continue to use that standard. Now with pyplot we can use the .plot method, so we can do "plt.plot," and we can use this to create a graph in the background. Now with most graphics what ends up happening is the computer kind of draws everything in the background, so it creates this image sort of behind the scenes as far as all graphics are concerned. And it draws all that stuff in the background. And then at the very end, when we're all done modifying the image, it will bring it to the screen and display it to the user. So right now we just have this one line that we're gonna plot, but maybe later you've got eight lines that you're plotting on a graph, or maybe you've got multiple figures and subplots and all this kind of stuff, and maybe you've got a legend and labels and a title and all kinds of other stuff.

03:07 S1: So as you can imagine, if you were showing it at each step of the way, this would be kind of inefficient. Also, as far as graphics are concerned, it's the rendering of graphics that takes up the most processing in most cases. So "plt.plot," and then in here you put parameters, and there's actually a lot of arguments and keyword arguments, or args and kwargs, that you can pass through here, but the most basic is you pass an X and a Y. Or your Xs and Ys. So for now we'll make these into lists and we'll just use one, two, three, four and a five, and then we'll put in... So those are our X variables, and then we'll just make up some numbers. You can copy the same numbers I'm choosing or you can do your own; it really doesn't matter.

03:54 S1: Now once you're done plotting all the things that you wanna plot and conceivably doing all of the other things that you might wanna do... We're just gonna keep this one as simple as possible. What we wanna do is actually bring the graph up with a "plt.show." Now unless we're doing something with possibly the animate functionality of matplotlib, there's really no modifying a graph once you show it. And when you call "plt.show" like this, your script will pause. So for example we could say, "print 'got here'," okay. So that will print out to the console when we actually arrive at this line. So let's go ahead and save and run that. Again, if you're not familiar with coding in IDLE, you can code and you can either press F5 and that's good enough, or you can just go up to Run and Run Module. I usually press F5, so I usually say, "save and run" or just run, and then it will ask you if you wanna save the module; it will pop up with a little window like this. I'll go ahead and say, "yes." And then from there it will run and we should get our graph popped up here in a second. There she is, and that's that, so a real simple graph. Now I'll bring over the console here and as you can see we have not printed that "got here" yet, so we could close this and then it says, "got here." Now one thing that you might do is, you might plot a graph, do some modifications, we would...

05:20 S1: We'll just envision doing some modifications here, show that graph, okay cool, close, and you can see that nothing further happened, right? And the reason why is once you show and you exit, that's gonna clear the plot. So let's say you're graphing here and then you showed it and then you, I don't know, print it... We don't need to show that anymore; I was just showing that it would pause the code at that point. But you plot here, let's say, and you use "plt.show," and when you close that graph it's gonna get rid of all that background stuff, so you will actually need to re-call "plt.plot" like this to show it again, okay? Just in case you were looking to make modifications or anything like that.

06:09 S1: So anyways, that's it for a really basic introduction to how matplotlib is gonna work. Again, you need to import matplotlib in some form, but for the most part, right out of the gate, all we really need is pyplot. From there, you can use the .plot method here, and while you can pass a lot of other arguments and keyword arguments that we'll be talking about later, you just need to pass an X and a Y, and then finally you do a "plt.show."
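The three steps recapped above (import pyplot, plot an X and a Y, then show) can be sketched roughly as follows. The Y values are made up, since the transcript does not record the numbers typed in the video, and the "Agg" backend line is an assumption added only so the sketch runs without a display; remove it to get the interactive window described in the video.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [5, 7, 4, 6, 8]  # made-up values, not the ones from the video

# Draw the line "in the background"; nothing is displayed yet.
lines = plt.plot(x, y)

# Bring the finished figure to the screen. With an interactive backend
# the script pauses here until the window is closed.
plt.show()

print("got here")
```

With an interactive backend, "got here" shows up in the console only after the graph window is closed, which matches the pause described in the video.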

06:36 S1: Some errors that you might have down the road, it would be something like this. Let's say right now X, it has five variables and Y has five variables, but what if X had six variables and we try to plot that? You see we get this nasty error and it's a ValueError and you get this... Mainly it's the "X and Y must have the same first dimension." Basically what that means is length, so when you see that error you should just automatically know that, "Oh, X and Y have a different length." And a lot of times X and Y are gonna be dynamic variables, so it's not as black and white as this is, where you can visually look at it and be like one, two, three, four, five. No.

07:16 S1: You might have thousands of points and you wouldn't even look; they're saved variables and you never actually look at them. But if you see that error, usually what I'll do is I'll just print the len of X, and then you can do the same thing here, print the len of Y. Now obviously we don't have X and Y, but in theory you would have X equal to this list and Y equal to this list. And then if these came up as wildly different, or maybe sometimes there's only one number, it's like one greater or something like that, you can work on fixing your graph. But just keep in mind that if these are not the same length you're gonna get that "not the same first dimension" error.
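A minimal sketch of that debugging step, assuming X has six values and Y has five (the Y values are made up). The exact wording of the error message can differ between matplotlib versions, so treat the printed text as indicative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5, 6]  # six values
y = [5, 7, 4, 6, 8]     # five values (made up)

# First thing to check when the error appears: are the lengths equal?
print(len(x), len(y))   # prints: 6 5

error_message = None
try:
    plt.plot(x, y)
except ValueError as err:
    # matplotlib refuses to plot X and Y of different lengths
    error_message = str(err)
    print("ValueError:", error_message)
```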

07:56 S1: Now, this was done in three lines, really simple stuff. Matplotlib isn't too bad. Though anyone who's taking a math class or whatever knows that we have to have other things like titles and stuff like this in our graphs. That's what we're gonna be talking about in the next part, doing titles and labels and text and stuff like that, so stay tuned for that.

00:00 Speaker 1: Hello, everyone, and welcome to part three of Section 2 of our data visualization with Python and matplotlib tutorial series. In this part, what we're gonna be talking about is adding titles and labels to our graphs. So we'll just run this real quick, and show our current... This is our graph. So a pretty basic graph, nothing too fancy here. Now, if you've ever taken a math class in school, chances are you know that graphs require titles and labels, otherwise, you missed points on your homework. So while sometimes it's super obvious what your graph represents, your viewers... And maybe it's super obvious to you, but your viewers might not even know. And people tend to glaze over data and stuff. So it's useful if we can add labels and titles, and stuff like that. So people expect, at least, to see a title saying what this data represents. And generally, we also wanna see X and Y labels. Now sometimes, the X label is so obvious, maybe it's a time stamp, or something like that, so we know these are times.

01:04 S1: But generally, we wanna label the Y-axis too, simply because, a lot of times, it's the Y-axis that we don't really know what it represents. But sometimes you can have time on the Y-axis, too, or something that is obvious. But it's always best to be safe and label them. Anyways, labeling and adding titles, and stuff in matplotlib is pretty simple. So moving forward, what we're gonna go ahead and do is kind of convert this to a slightly more realistic example of typically how you'll have plots. You'll have an X and then a Y, and that's gonna equal something. In our case, we'll just keep the lists that we have here. But generally you might have some sort of function, or something that's assigning these values. I'm gonna change this so that our graph looks a little more interesting. There we go. And then when we plot, we'll plot X, and then Y like that. So now we have an M, anyway.

[chuckle]

02:07 S1: So generally, when you go to plot something, you're not gonna hard-code everything. If you're hard-coding things, why are we using programs anyways? So now that we've plotted, we've got plt.plot, and then what we'll do is we'll come here, and we'll just say, "plt.xlabel," and this is exactly what it sounds like. It's the label for our x-axis. So we'll say, the X label is equal to, and then we just pass a string here, "Plot number." There's other parameters too that we'll probably talk about later, but for now we'll just pass some simple text. Then we can also pass a "plt.ylabel," and this will be for the y-axis. The x-axis is the bottom axis, the y-axis is the one that goes up and down. So you've got plot number, and then for the y label, we'll just say, "A random number," because that's what it is. So now we can save and run that. And what we get here is a graph, and it might be kind of hard to see on the screen for y'all, but you should be able to see it on yours.
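The axis labels described above can be sketched like this. The Y values are made up to give a rough "M" shape, and the "Agg" backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 5, 3, 5, 1]  # made-up values that draw a rough "M"

plt.plot(x, y)

# Label the bottom (x) axis and the up-and-down (y) axis.
plt.xlabel("Plot number")
plt.ylabel("A random number")

plt.show()
```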

03:13 S1: Now you've got these marks here, and you also have this label on the X and Y, and also your Y-axis label is automatically formatted to be up and down with the axis. You didn't even need to do any coding for that. So it's pretty cool. Next we can add a title as well. And doing that is pretty simple. We just do "plt.title." Bam. And what we're gonna do is add the title, and we'll just call this an "Epic graph," because that's what it is. So save and run that, and sure enough, there we go. We've got a title here which is just automatically slightly larger than the other labels, and stuff like that. Now, sometimes you might have a really, really long title. Epic graph, and I'm trying to think of something we can add here. "Epic graph tutorial for dataviz," 'cause we're cool, "In Python with matplotlib." Let's see if that's long enough to be a problem. No. [chuckle] "Tutorial showing labels and titles." Okay, that should be long enough. So we'll save and run that. And sure enough, we see, okay, yes, it's running off the screen, and that's not good, and nobody wants a really long title like that anyways.

04:36 S1: So what we can do is we can use the new line, so we can get rid of that extra space there, and then use a backslash. So that's the slash that's above your enter key, not the one that's to the left of your shift. Then n, and this is a newline character, and matplotlib is gonna recognize that, and put it on a new line for us. So now we can run this, and you'll see here that now you've got "Epic graph tutorial for dataviz in Python and matplotlib tutorial showing labels and titles." So now it's a little easier to read and it's not running off the page. Okay. Actually I want that graph back. And the other thing I want us to go ahead and cover before we get too deep, is what all we can do in these windows. So let's make this a little bigger here. So you can see there's a bunch of buttons down here on the bottom left. And each button does fairly different things. But really, until we start doing anything, the first three buttons are basically worthless. So let's play with this fourth one, so this one that's like a cross. Click on that and you'll see your cursor has changed a little bit. And what we can do is, when we do this, we can click-and-hold on the graph, and basically click-and-drag the graph about.
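The newline trick for long titles might look like the sketch below. The exact title wording and where the line break falls are reconstructed loosely from the transcript, and the Y values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 5, 3, 5, 1]  # made-up values

plt.plot(x, y)
plt.xlabel("Plot number")
plt.ylabel("A random number")

# "\n" inside the string splits the title across two lines so it no
# longer runs off the edge of the figure.
plt.title("Epic graph tutorial for dataviz in Python with matplotlib\n"
          "Tutorial showing labels and titles")

plt.show()
```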

05:56 S1: So by default, matplotlib is gonna try to make sure that your graph is nice and centered on the page and you can see all the points. And that's about it. But you might wanna move things around or whatever and that's how you would do it. Now, the next thing to the right of that is this zoom button and it does exactly what it sounds like. You can zoom. So, you get that and then you click and you basically draw a box. So, click and drag the box. Bam. And you'll zoom in to that point. The next one is this configure subplots, that's to the right of the zoom. We'll click on that and we get this little slider window. And this will be a little more useful later on, but this is kind of the... I don't know. Alignments and kind of principles of the graph. But you'll see here. So, left, we can adjust that and that's how much space is to the left of the graph. So, you can see that we can completely knock off all the ticks and even the spine there gets lost by zero-zero. Or we can create a ton of space that's just absurd. And you can see where the old default was. But we can do this to make it use all the space, let's say. Same thing with the bottom. You can do the same thing. And then you've got right and top. And then you've got this W space and H space. And with this graph, we can do this all day and we notice nothing really happens.

07:16 S1: Where this W space and H space will actually come into play is when you have multiple subplots. We'll get there. But between subplots, there's also a space between them. If you're familiar with maybe HTML, you can think about it like padding between all the elements. So, anyways, that's this configure. Finally, you have the save the figure. You can click that and you can save an image of that figure somewhere. So, you can save it to the desktop. You can save it as PNG. You can also save it as vector graphics, stuff like that. And then, going back over to these first three. You can use these arrows, like this would be the back. So, it's like a back button. So, you can hit back and you go back to the original look. And you can hit it again and again and eventually, you'll get to the beginning. But then, you can go back forward again to where we were. And then, if you just happen to get so lost and you don't wanna figure out where home is again, you click on the home. And bam, it'll take you to the original view. So, that's kind of useful.
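The configure-subplots sliders and the save button also have code counterparts in pyplot. These calls are not shown in the video: plt.subplots_adjust and plt.savefig are existing pyplot functions, but the specific margin values and the in-memory buffer below are assumptions chosen only for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 5, 3, 5, 1]  # made-up values

plt.plot(x, y)

# Same six settings as the slider window: margins as fractions of the
# figure, plus the spacing between subplots (wspace / hspace), which has
# no visible effect while there is only one plot.
plt.subplots_adjust(left=0.15, bottom=0.15, right=0.95, top=0.90,
                    wspace=0.2, hspace=0.2)

# Save the figure as PNG. A file name such as "graph.png" works too;
# an in-memory buffer is used here so the sketch leaves no file behind.
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```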

08:17 S1: So, that's it for this part. What we're gonna be doing in the next part is talking about adding legends. So, a little more information about the plot that we're looking at. It's really useful to have legends, especially if you have multiple lines on a graph. Basically, it becomes essential at that point. So, anyway, that's what we're gonna be covering in the next tutorial so stay tuned for that.

00:00 Speaker 1: What is going on everyone and welcome to the fourth part of our section two with Data Visualization with Python in Matplotlib tutorial series. In this section, what we're talking about is the basics of matplotlib and in this specific part, we're gonna be talking about legends. So, aside from titles and labels, another pretty integral part to graphs is a legend which is almost certainly needed if the graph has more than one line, bar, or whatever. So, adding legends with matplotlib is pretty straight forward at least at the outset, but like everything in matplotlib, the legends can be highly customized. For now, we're gonna stick with a really simple legend, but you might find pretty quickly that legends can get in the way a lot.

00:49 S1: So, you can move them around in the graph or you can move them outside the graph, but then maybe if you have another subplot, now they're getting in the way of that other subplot and blah blah blah blah blah. So, knowing how to work with them is gonna be really important and we'll talk more about customizing them down the line. But at least for now, we're gonna cover just a really basic subplot with a basic legend. So, in order to have a legend, we need to have some way to tell the legend what the lines are. So, the legend isn't just gonna know what the label of a line is. We have to pass that, and the way we pass that is through the plot method of pyplot. So, what we're gonna need now though is another line.

01:38 S1: So, let's go ahead and create a second line to graph. So, right now, we're plotting these X and Y. Let's go ahead and let's just copy the Y here, come down, paste, and then we'll just make this Y2, and then we'll just pass some new numbers in here. Okay. That'll be good enough. So, then we'll go ahead and do plt.plot. Just copy this again, paste, and then X can stay the same, but this time plot Y2. Now, let's go ahead and plot that and just look at it. So, you can see that we have two lines, but we really don't know which line is which. So, that's kind of the whole point of a legend. So, we can close out of this. And now, what we'll do is we'll just do a comma, and we add this time a label, and label equals, and then we pass a string argument here. So, this will be the "Initial line" and then here, we'll have a label equals and this one will be "New line."

02:41 S1: So, now we can run that, right? And nothing really happens. So, we've got the labels, but we don't actually have a legend yet. So, we can close this and now we're ready to add a legend. So, what we can do is basically right before plt.show, we can call this legend in. Now, legend just like everything else does take parameters and that's kinda where we go about customizing a legend. But at the most basic level, it has defaults, and that's the reason why you can just do plt.legend. And then I'll just... That'll cause a legend to show up. So, we can run this and we'll see now up at the top right here, we have a legend and you'll have... We have a little short line here and then that's the initial line and a new line.
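Putting the labels and the legend call together, a minimal sketch could look like this. Both sets of Y values are made up, and the "Agg" backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 5, 3, 5, 1]   # made-up values
y2 = [4, 2, 6, 3, 7]  # made-up values for the second line

# The label keyword tells the legend what each line is called.
plt.plot(x, y, label="Initial line")
plt.plot(x, y2, label="New line")

# Labels alone draw nothing; the legend appears only once this is called.
legend = plt.legend()

plt.show()
```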

03:31 S1: And if you have say scatter plots or different markers that we'll talk about later on, they'll be marked here which is pretty useful especially down the road if you've got various different types of lines and all that. It's not just a line. Like if it's a thick line, it'll be thicker and stuff like that. So, it's pretty useful. But as you can see, the legend is almost in the way. I mean, we can see that that goes from this point and then to this point, but you can see the legend is just kind of in the way. I mean, it would be nice if the legend was like, I don't know, here or always not on the line. And so, you can put a legend over here, on this side, or under the graph, or on top of the graph, or something like that. You can do that, it just gets a little hairy as you start moving it around, but it's all possible.

04:17 S1: So anyway, let's close out of this. And with the legend and with anything, like when you call plot for example, you call some plotting into being, it's just there was no plot. But with legend, since we've already got the labels here, that's why you're able to get away with just calling plt.legend empty parameters and something still pops up because we basically have already given the legend all the stuff it needs.

04:42 S1: Okay. So, that's really all there is to it with legends. So, we'll go ahead and cut it off here. In the next tutorial, what we're gonna be talking about is bar charts and how to do a simple bar chart with matplotlib. So, stay tuned for that.

00:00 Speaker 1: What's going on everybody? Welcome to part five of section two of our data visualization with Python and matplotlib tutorial series. In this part, what we're gonna be talking about is bar charts. So, bar charts are a real simple graph type where you've got bars of data. So, if you don't know what a bar chart is, you're about to see one. But chances are, you already know what one is. So, what we're gonna go ahead and do is, we can leave this X, Y and really we can probably get away with using this exact data here. We'll get rid of these labels. We'll just probably rewrite those and just delete everything else. The legend, we can keep. I don't really see any problem with that. Yeah. So, let's go ahead and plot a bar.

00:46 S1: So, to do bar graphs, what you have to do with matplotlib is, you have to just notify matplotlib of your intentions, basically, no matter what you're doing. So, if you're doing a regular graph, it's fine. You do plt.plot and everything's great. And all you do is specify a line type, if you want to and a marker type, and you're good to go. Okay. It just so happens the default line type is a straight line, so we didn't have to do any of that. If you wanna do a scatter plot, you have to tell matplotlib, "Hey, I'm about to throw scatter plot data at you." If you wanna do a bar chart, you have to tell it in advance, "Hey, I'm about to do a bar." So, the way that you do that is plt., and if you couldn't guess, it's bar. [chuckle] And so with bar, you'd pass again Xs and Ys. So, we can do something really simple, one, two, three and then, we can pass a five, a three, and a four. Whoops. Five, three, and a four. And it's really that, that's all you have to do, so we could actually graph that really quick. And there you go, you've got some bars.
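The first bar chart from the video, using the X and Y values read out above, can be sketched as follows. The "Agg" backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

# plt.bar takes the bar positions (Xs) and the bar heights (Ys).
bars = plt.bar([1, 2, 3], [5, 3, 4])

plt.show()
```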

01:48 S1: And you probably saw an error, and this is the error. You got "no labeled objects found." And the reason why we're getting that error is we're asking the legend to show up but it has nothing to show, so we get that error. It's not a game-stopping error or anything like that. The code will still run and the script will still continue, but that's what you're getting. So, if you see that error, it's because you're asking for a legend and you haven't given the legend anything to display. Now, we've already got Xs and Ys predefined, so let's go ahead and do X, Y. Good enough. And then, let's give it a label. And the label will just be 1. And then, let's go ahead and plot another one, plt.bar, and then keep X, but this time we'll do Y2. And the label here will be 2. Now, we can save and run that, and we'll see a problem. [chuckle]

02:45 S1: Sure enough, there's your problem. You can see that... Well, first of all, we've got some overlapping data, so that's not the best. But also, we can see in our legend that these aren't different colors. [chuckle] So, that's not really helping us either. So, for example, what we could do is, we could change X. Let's change X just for the sake of example. Two, four, six, eight, 10, and then we'll make an X2 equal to one, three, five, seven, and nine. Any other stuff can stay the same, so X, X2, this will be X2. Let's save and run that one. There we go. So now, you've got lots of bars, but again, we can't really see which part is which here, so the next thing that we need to change is the color of the bar, specifically. So, we can use color equals... And for now, let's just use G. Okay. So, G is for green and we can change this one, too. We can say color equals... Whoops. Color equals M for magenta, and we'll save and run that. And there you go, you can see that we've changed the color and you can see the new color via the legend and everything's a lot easier to read here.

04:03 S1: Now, of course, we probably should keep our... Let's just add a quick plt.xlabel, and this would be maybe "Bar number"... And then, plt.ylabel, "Bar height." And don't forget your L in label. And then, plt.title, "Bar chart tutorial." It's always a good idea to make sure you always have labels 'cause otherwise people get confused. So, we got the colors, and we've got the labels, and we got the legend, so that's a pretty decent bar chart there, just a nice simple bar chart. And then, soon down the road, a couple of sections later, we'll be talking about 3D bar charts. So, that'll be pretty cool. So, that's basically all there is to it for a simple bar chart.
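The finished bar chart, with the two interleaved sets of bars, colors, labels, title, and legend, might look like the sketch below. The X and X2 values come from the transcript; the bar heights are made up, since the video's numbers are not recorded.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

x = [2, 4, 6, 8, 10]
x2 = [1, 3, 5, 7, 9]
y = [4, 7, 5, 8, 6]   # made-up bar heights
y2 = [6, 3, 8, 5, 7]  # made-up bar heights

# Odd and even positions keep the two series from overlapping;
# "g" is green and "m" is magenta.
plt.bar(x, y, label="1", color="g")
plt.bar(x2, y2, label="2", color="m")

plt.xlabel("Bar number")
plt.ylabel("Bar height")
plt.title("Bar chart tutorial")
legend = plt.legend()

plt.show()
```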

04:56 S1: Now, in the next tutorial, we're gonna be talking about a histogram. A lot of people consider bar charts and histograms to be the same thing, but they are pretty fundamentally different. So, that's what we'll be talking about next. And if you don't know the difference between a bar chart and a histogram, you're about to. So, stay tuned for that and thanks for watching.

00:00 Speaker 1: Hello everybody and welcome to part six of section two of our data visualization with matplotlib and Python tutorial series. In this part what we're gonna be talking about is histograms. Now while they're really similar, bar charts and histograms serve pretty different purposes. A bar chart is great for comparing things in major categories but pretty poor for illustrating distributions of data, at least out of the box. Consider showing maybe a bunch of test scores. How might you show the distribution of grades, right? So you could show each grade, but this wouldn't be really visually that appealing and it wouldn't even really do that good at showing a distribution, because you could have like a bar per grade or something that a person achieved, but this would result in possibly a hundred different bars and that would be really messy. Now what we kinda wanna do generally when we do some sort of a distribution is maybe we wanna combine into groups like As, Bs, Cs and so on.

01:07 S1: With bar charts, we could code the logic that would handle for something like that and we could still use bar charts, but a histogram just makes more sense to use and matplotlib has histograms built in. So if that's your goal, then you might as well use the built-in histogram functionality. So let's go ahead and basically remove everything except for the matplotlib.pyplot stuff, we still need that. And let's say you're a teacher and you've got test scores, so test_scores equals and then this will just be a list of test scores, and really we could do it in numerical order but it really doesn't matter. So I'm just gonna throw in some numbers here and you guys can feel free to copy me exactly or just throw in any sort of test scores you want, but I'm just gonna put in some numbers.

02:04 S1: We definitely wanna have a good amount of numbers and basically anything from, I don't know, 40s onward. Hopefully people didn't score worse than a 40 on your test, but it's conceivable. And once you're done making up a bunch of numbers, that'll be good for now. So we've got some test scores and then what we wanna do is plot up these test scores. Now, first of all, we could make a bunch of Xs by doing something like this. So we can say X equals, and then we'll do a list here. So we'll do an X for each of the test scores, so we could do something like this and run that. Oh, I never did the plt.show. Let me just call it right here, plt.show, there we go. And so these are our test scores, right? So X is just the number. So, so far we actually don't even know really that well, this is just the student number and the score that they got. So we can kind of see a few people scored here, almost everybody did better than a 40, or everyone did better than 40. So you can kinda see that, but it's really hard to gather any insights from a representation this way. So instead we can use a histogram, and let me go ahead and add that plt.show before I forget.
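A sketch of that first attempt, one bar per student. The scores are made up, and both the way X is built (one index per score) and the use of plt.bar are assumptions inferred from the transcript, which only says that X is "just the number" and later refers to "this bar."

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

# Made-up test scores, mostly from the 40s upward.
test_scores = [55, 62, 48, 73, 81, 90, 67, 45, 77,
               88, 95, 59, 71, 84, 66, 100, 79, 52]

# One X per score: effectively a student number (assumed construction).
x = [i for i in range(len(test_scores))]

bars = plt.bar(x, test_scores)

plt.show()
```

Every student gets a bar, which is why it is hard to read a distribution off this chart.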

03:31 S1: I'll leave this bar just simply because it can be useful. Now, well actually let's show that one and then we'll plot down here the other one. So for a histogram, generally you have what are called bins, and bins are containers basically, it's like you throw something into a bin. So we're gonna say bins equals, and then in here we'll put the possible bins. Right now the lowest was a 40. In theory, we could've had anything from zero on up, so we can just do zero, 10, 20, 30, 40, 50, 60, 70, 80, 90, and I suppose we could've had people that scored a 100. So those are our possible bins. And then what we can do is, we can say plt.hist, for histogram, and what we want is first we do the test scores themselves, then we put them in the various bins, and then we'll do histtype and then we'll say equals 'bar', and then the rwidth, this is just how wide are the bars... We'll just do 0.8 so they're not completely touching, and then we can do a plt.show, and let's go ahead and run that. And so this is the old version so we close out of that, and here's our new one. So you can see we've got 40, 50, 60, 70, 80, 90, 100, pretty cool. So it's a little easier for representation purposes.
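The histogram call described above can be sketched as follows. The scores are made up, and starting the bins at zero is an assumption based on the "anything from zero" remark in the transcript.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt

# Made-up test scores, mostly from the 40s upward.
test_scores = [55, 62, 48, 73, 81, 90, 67, 45, 77,
               88, 95, 59, 71, 84, 66, 100, 79, 52]

# Bin edges: each score lands in the ten-point range that contains it.
bins = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# rwidth=0.8 draws each bar at 80% of its bin width so neighbours
# are not completely touching.
counts, edges, patches = plt.hist(test_scores, bins,
                                  histtype="bar", rwidth=0.8)

plt.show()
```

plt.hist returns the count per bin along with the bin edges, so the grouping can also be inspected without looking at the picture.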