A14: Matplotlib Essentials
This article is a part of “Data Science from Scratch — Can I to I Can”, A Lecture Notes Book Series.
✅ A suggestion: open a new Jupyter notebook and type the code while reading this article. Doing is learning. And yes, please read the comments, they are very useful!
Welcome to the Matplotlib Essentials lecture. Matplotlib is the most popular plotting library for Python.
- They say on their website, Matplotlib tries to make easy things easy and hard things possible.
Matplotlib was originally written by John Hunter to visualize electrocorticography (ECoG) data of epilepsy patients during his post-doctoral research in neurobiology. He created this library to replicate MATLAB's plotting capabilities in Python (if you have ever worked with MATLAB, Matplotlib will feel natural to you). Later on, this open-source library emerged as the most widely used plotting library for the Python programming language and a core component of the scientific Python stack, along with NumPy, SciPy and IPython.
Along with providing great control over every element in a figure, Matplotlib makes it very easy to get started with simple plots. It is highly customizable and, with just a few lines of code, it generates high-quality line plots, histograms, power spectra, bar charts, error charts, scatter plots, etc.
The official documentation of Matplotlib is provided on its website. It is a great idea to explore the Matplotlib examples and tutorials there to learn more about this state-of-the-art plotting library.
✅ Matplotlib source code on github
✅ This is another great matplotlib tutorial.
In this section, we will learn matplotlib’s key features with examples. Let’s get started!
Let’s check the version of our matplotlib first.
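A minimal sketch of the version check, assuming Matplotlib is already installed in your environment. The printed value depends on your installation.

```python
import matplotlib

# Print the installed Matplotlib version string
# (the exact value depends on your environment).
print(matplotlib.__version__)
```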
First things first, we need to import the library. matplotlib.pyplot is commonly imported as plt.
We are working in a Jupyter notebook, which provides a convenient way of displaying plots within the notebook using the %matplotlib inline command. This is only for Jupyter notebooks; if you are using another editor, you'll call plt.show() at the end of all your plotting commands.
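The import and display setup described above might look like the sketch below. The magic command is written as a comment here because it only works inside a Jupyter cell, not in a plain Python script.

```python
import matplotlib.pyplot as plt  # "plt" is the conventional alias

# In a Jupyter notebook, run this magic once in a cell
# (it is notebook syntax, not plain Python):
# %matplotlib inline

# Outside Jupyter, finish your plotting commands with:
# plt.show()
```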
We need data to work with.
Let's start with a simple example using two NumPy arrays to plot numbers along the x axis and their squares along the y axis.
(We could use lists; however, I opted to work with NumPy arrays. Most likely we will be using NumPy arrays or pandas columns in this course, which essentially also behave like arrays.)
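Here is one way to create such data. The range 0 to 10 with 11 points is an assumption made for illustration; any numeric range works.

```python
import numpy as np

# Assumed example data: 11 evenly spaced numbers from 0 to 10
x = np.linspace(0, 10, 11)

# Their squares, computed element-wise
y = x ** 2

print(x)
print(y)
```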
Basic Plotting
Creating a basic line plot.
We have data in x and y. Let's create some plots using this data.
☞ Do you remember the <Shift+Tab> trick? It shows the documentation of a function in the Jupyter notebook along the way!
Reference: List of the Key matlab Commands
plot() is the basic method that plots y versus x as lines and/or markers.
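A minimal sketch of a basic line plot with plot(). The data, the labels and the title text are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed example data: numbers and their squares
x = np.linspace(0, 10, 11)
y = x ** 2

# plot() returns a list of Line2D objects; 'r' asks for a red line
lines = plt.plot(x, y, 'r')
plt.xlabel('x')
plt.ylabel('y')
plt.title('y = x squared')

# Outside Jupyter, call plt.show() here to display the figure.
```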
Creating multiple plots on the same canvas.
subplot(): provides a convenient way of creating multiple plots on the same canvas!
tight_layout(): automatically adjusts subplot parameters to give the specified padding.
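The two functions above can be sketched together as follows. The 1 x 2 layout and the colors are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

plt.figure()  # start a fresh canvas

# subplot(nrows, ncols, index): 1 row, 2 columns, first plot
plt.subplot(1, 2, 1)
plt.plot(x, y, 'r')

# 1 row, 2 columns, second plot
plt.subplot(1, 2, 2)
plt.plot(y, x, 'b')

# Adjust the spacing so labels and plots do not overlap
plt.tight_layout()
```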
Matplotlib “Object Oriented” approach
We have seen the basic plotting, which makes it very quick and easy to generate plots. However, the object-oriented approach is recommended for more control and customization of our plots.
Let's break it down with a formal introduction to Matplotlib's object-oriented API for plotting data.
The idea behind the object-oriented approach is that we create figure objects and then call methods or attributes on those objects. This elegant approach is especially helpful when we are dealing with a canvas that has multiple plots on it.
To start with, let's create a figure instance "fig1" and add axes to that figure:
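A minimal sketch of these steps. The axes position values, the labels and the title are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

# Create an empty canvas
fig1 = plt.figure()

# Add axes: [left, bottom, width, height] as fractions of the figure (0 to 1)
axes = fig1.add_axes([0.1, 0.1, 0.8, 0.8])

# Plot on the axes and decorate
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('Object-oriented plot')
```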
Let's review what we did:
- Created an object (empty canvas) "fig1": plt.figure()
- Added "axes" on "fig1": fig1.add_axes()
- Plotted data on "axes": axes.plot()
- Set labels and title: axes.set_xlabel(), axes.set_ylabel(), axes.set_title()
This takes a little more code and might look complicated in the beginning. However, the advantage is that we have full control of where the plot axes are placed, and we can easily add more than one set of axes to the figure.
Let's learn how to create an inset plot using the object-oriented approach!
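One possible sketch of an inset plot. The names axes_main and axes_inset and the position values are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

fig2 = plt.figure()

# Main axes fill most of the canvas; inset axes sit inside them
axes_main = fig2.add_axes([0.1, 0.1, 0.8, 0.8])
axes_inset = fig2.add_axes([0.2, 0.5, 0.4, 0.3])

# Main plot
axes_main.plot(x, y, 'r')
axes_main.set_xlabel('x')
axes_main.set_ylabel('y')
axes_main.set_title('Main plot')

# Inset plot with the axes swapped
axes_inset.plot(y, x, 'b')
axes_inset.set_xlabel('y')
axes_inset.set_ylabel('x')
axes_inset.set_title('Inset plot')
```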
Let's review, once again, what we did:
- Created an object (empty canvas) "fig2".
- Added main and inset "axes" on "fig2".
- Plotted data on the "main axes" and set the labels.
- Plotted data on the "inset axes" and set the labels.
So, this is the workflow we will follow with Matplotlib in the coming lectures for data plotting.
Creating a figure and a set of subplots — “Object Oriented” approach
Notice the difference: subplot is different from subplots (with an additional s).
subplots(): creates a figure and a set of subplots. A very convenient way to create layouts of subplots, including the enclosing figure object, in a single call.
☞ <Shift+Tab> for the docstring
Let’s start with a basic use case:
Let’s plot some data
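A basic use of subplots() could look like this sketch. Called without arguments, it returns one figure and one axes object; the data and labels are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

# One call creates both the figure and the axes
fig, axes = plt.subplots()

# Plot some data on the axes
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('Plot created with subplots()')
```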
Let's create two empty canvases (fig1 and fig2) with two axes on each canvas to do some stuff.
We can also access the plots on axes1 and axes2 individually using their index values!
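A sketch of two canvases with two axes each, assuming a 1 x 2 layout. With more than one subplot, subplots() returns the axes as a NumPy array, so each plot is reachable by index.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

# Two canvases, each with 1 row and 2 columns of axes
fig1, axes1 = plt.subplots(nrows=1, ncols=2)
fig2, axes2 = plt.subplots(nrows=1, ncols=2)

# Access the axes individually by index
axes1[0].plot(x, y, 'r')
axes1[0].set_title('fig1, first plot')
axes1[1].plot(y, x, 'b')
axes1[1].set_title('fig1, second plot')

# Or loop over them
for ax in axes2:
    ax.plot(x, y, 'g')

fig1.tight_layout()
fig2.tight_layout()
```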
Figure size, aspect ratio and DPI
While creating the Figure object, Matplotlib allows the aspect ratio, DPI and figure size to be specified.
figsize: width and height of the figure in inches
dpi: dots per inch (pixels per inch).
To set the figure size, we can pass the same arguments to subplots(), for example subplots(figsize=(10,5)).
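Both ways of setting the size can be sketched as follows. The particular sizes and the DPI value are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

# figsize is (width, height) in inches; dpi is dots per inch
fig = plt.figure(figsize=(8, 4), dpi=100)
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])
axes.plot(x, y, 'r')

# The same keyword works with subplots()
fig_b, axes_b = plt.subplots(figsize=(10, 5))
axes_b.plot(x, y, 'b')
```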
Saving figures
savefig(): this method saves high-quality figures in a range of formats, including .jpg, .pdf, .png, .eps, etc.
Let's try to save the figure above; we have the object fig for that. We need to call .savefig() on fig.
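A minimal sketch of saving a figure. The file name my_figure.png and the temporary folder are hypothetical choices made so the example runs anywhere; in practice you would pass your own path.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

fig, axes = plt.subplots(figsize=(8, 4))
axes.plot(x, y, 'r')

# Hypothetical output path; the extension selects the file format
out_path = os.path.join(tempfile.gettempdir(), 'my_figure.png')

# dpi controls the resolution of the saved image
fig.savefig(out_path, dpi=200)
```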
Decorating the figures
So, we have learned that we can add an x label, a y label and a plot title using:
axes.set_xlabel("x_label")
axes.set_ylabel("y_label")
axes.set_title("plot_title")
We can use the label="label_text" keyword argument when plots or other objects are added to the figure, and then call the legend() method, without arguments, to add the legend to the figure.
Let's learn with an example:
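A sketch of labels, title and legend together. The curves and the label texts are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)

fig, axes = plt.subplots()

# The label keyword supplies the text shown in the legend
axes.plot(x, x ** 2, label='x squared')
axes.plot(x, x ** 3, label='x cubed')

axes.set_xlabel('x_label')
axes.set_ylabel('y_label')
axes.set_title('plot_title')

# Without arguments, legend() picks up the labels given above
legend = axes.legend()
```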
The position of the legend can be specified using the optional keyword argument loc in the legend() function. Use <Shift+Tab> to see the options.
See the official documentation page for details.
The most common loc values are:
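As a sketch, here are a few loc codes with their string names. This is a subset chosen for illustration, not necessarily the exact list from the lecture; the name common_locs is hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)

fig, axes = plt.subplots()
axes.plot(x, x ** 2, label='x squared')

# A few loc codes and their equivalent string names
common_locs = {
    0: 'best',         # let Matplotlib decide the optimal location
    1: 'upper right',
    2: 'upper left',
    3: 'lower left',
    4: 'lower right',
}

# Each call replaces the previous legend on these axes
for code, name in common_locs.items():
    axes.legend(loc=code)  # same effect as axes.legend(loc=name)

# The string form is usually easier to read
legend = axes.legend(loc='upper left')
```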
Colors, linewidths, linetypes, marker styles etc.
There are lots of options available in Matplotlib to customize a plot.
Let's explore a few; we will learn more and more as the course goes on.
Official Documentation
The color and other graphical elements can be defined in a number of ways. MATLAB-like syntax can be used, where 'r' means red, 'g' means green, etc. The MATLAB API for selecting line styles is also supported; for example, 'r.-' means a red line with dots.
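The MATLAB-like format strings can be sketched as below. The second curve and its 'g--' style are assumptions added for comparison.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

fig, axes = plt.subplots()

# 'r.-' = red color, point markers, solid line
line_a, = axes.plot(x, y, 'r.-')

# 'g--' = green color, dashed line, no markers
line_b, = axes.plot(x, y + 10, 'g--')
```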
The appropriate way is to set colors with the color= parameter. Colors given by their names or RGB hex codes can also be used. There is another very useful optional parameter, alpha, that can be used along with color to control the opacity (useful when data points are on top of each other!).
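A short sketch of the color and alpha parameters. The specific color name and hex codes are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

fig, axes = plt.subplots()

# Color by name, half transparent
line_named, = axes.plot(x, y, color='red', alpha=0.5)

# Colors by RGB hex code
line_hex_a, = axes.plot(x, y + 10, color='#1155dd')
line_hex_b, = axes.plot(x, y + 20, color='#15cc55')
```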
Below is another example using a range of related parameters to make your plot beautiful!
Let's move on and explore a little more to make the figure attractive; it's important in storytelling!
- How to change the line width with the linewidth or lw keyword argument
- How to change the line style with the linestyle or ls keyword arguments
- How to set the marker with the marker and markersize keyword arguments
The code and the figure below could be a good reference for you while creating attractive plots!
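The keyword arguments listed above might be combined as in this sketch. All style values here are assumptions picked for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

fig, axes = plt.subplots(figsize=(8, 4))

# Short forms: lw (line width) and ls (line style)
line_a, = axes.plot(x, y, color='blue', lw=2, ls='--',
                    marker='o', markersize=8)

# Long forms: linewidth and linestyle, plus a marker fill color
line_b, = axes.plot(x, y + 20, color='green', linewidth=3, linestyle=':',
                    marker='s', markersize=6, markerfacecolor='yellow')
```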
Matplotlib conveniently allows control over the axes:
- Set the x and y limits using the set_xlim and set_ylim methods
- Use axis('tight') to automatically get "tightly fitted" axes ranges
Let's learn with examples:
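A sketch comparing the default range, the tight range and custom limits side by side. The limit values are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 11)
y = x ** 2

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Default axes ranges
axes[0].plot(x, y, 'r')
axes[0].set_title('default axes ranges')

# Tightly fitted axes ranges
axes[1].plot(x, y, 'r')
axes[1].axis('tight')
axes[1].set_title('tight axes')

# Custom limits on both axes
axes[2].plot(x, y, 'r')
axes[2].set_xlim([2, 5])
axes[2].set_ylim([0, 30])
axes[2].set_title('custom axes ranges')

fig.tight_layout()
```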
While doing data science, we create several kinds of plots; some commonly used ones are histograms, scatter plots, bar plots, pie charts, etc. It's very easy to create these plots using this state-of-the-art Python library. Let's look at some example plots, and if you want to explore more than this, please explore the official documentation of Matplotlib for more examples.
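A compact sketch of the four plot types mentioned above. All the data here is made up for illustration (random numbers with a fixed seed, and hypothetical categories A, B and C).

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data for illustration only
rng = np.random.default_rng(42)
data = rng.normal(size=500)
xs = rng.random(50)
ys = rng.random(50)
categories = ['A', 'B', 'C']   # hypothetical category names
values = [3, 7, 5]

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Histogram: distribution of one numeric variable
counts, bin_edges, _ = axes[0, 0].hist(data, bins=20)
axes[0, 0].set_title('Histogram')

# Scatter plot: relationship between two numeric variables
axes[0, 1].scatter(xs, ys)
axes[0, 1].set_title('Scatter plot')

# Bar plot: one value per category
axes[1, 0].bar(categories, values)
axes[1, 0].set_title('Bar plot')

# Pie chart: share of each category in the total
axes[1, 1].pie(values, labels=categories, autopct='%1.0f%%')
axes[1, 1].set_title('Pie chart')

fig.tight_layout()
```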
With time and practice, you will get familiar with more plots and when to use them.
Keep practicing to brush-up and add new skills.
Excellent work!
Your clap and share can help us reach someone who is struggling to learn these concepts.
Good luck!
See you in the next lecture on “A15: Matplotlib Advance”.
Note: This complete course, including video lectures and jupyter notebooks, is available on the following links:
Dr. Junaid Qazi is a Subject Matter Specialist, Data Science & Machine Learning Consultant and a Team Builder. He is a Professional Development Coach, Mentor, Author, and Invited Speaker. He can be reached for consulting projects and/or professional development training via LinkedIn.