What data visualization library is the best?

Rachel Beery
3 min read · Feb 12, 2021

No matter how good your findings are, they won't have their intended impact unless you can convey them to others. As data scientists, we need to communicate our data and findings clearly. Data visualization is one of the most important tools for communicating data science and educating non-technical audiences. In this project, I experimented with three different data visualization libraries, and I give a brief overview of each below.

1. Matplotlib

The foundation of data visualization in Python is undoubtedly Matplotlib. It is powerful and complex, which is why many other libraries are built on top of its existing features. Matplotlib works well as a base for first understanding the data and exploring phenomena in it. The default styling is basic and simple, which can be useful for a beginner data scientist. The following libraries use the tools of this foundation to build visualizations with less code and more styling options.
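
As a rough illustration of that exploratory use, here is a minimal sketch of a basic Matplotlib bar chart. The genres and dollar amounts are made-up placeholders, not figures from this project.

```python
import matplotlib.pyplot as plt

# Hypothetical placeholder values in millions of dollars (not project data).
genres = ["Action", "Comedy", "Drama"]
worldwide_gross = [350, 120, 90]

# One figure with a single set of axes.
fig, ax = plt.subplots(figsize=(6, 4))

# A plain vertical bar chart with Matplotlib's default styling.
ax.bar(genres, worldwide_gross)
ax.set_title("Worldwide gross by genre (placeholder data)")
ax.set_xlabel("Genre")
ax.set_ylabel("Gross (millions USD)")
fig.tight_layout()
```

Calling plt.show() afterwards displays the chart in a script, while a notebook usually renders it automatically.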

2. Seaborn

One of the most popular libraries for building well put together, aesthetically pleasing visualizations is Seaborn. It takes fewer lines of code to create basic or complex visualizations that are ready for a presentation. Seaborn also makes more advanced plots straightforward, including categorical plots. In this project, a horizontal bar chart compares gross movie profits by genre, with each bar showing the domestic portion of the gross in dark blue against the overall worldwide gross. To get the most out of both libraries, one can create the more advanced visualization with Seaborn code and then customize it with Matplotlib code.
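
The horizontal bar chart described above could be sketched roughly as follows. The column names and numbers are assumptions made up for illustration, and the lighter color for the worldwide bars is a guess; only the dark blue domestic portion comes from the project.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up placeholder numbers in millions of dollars (not project data).
df = pd.DataFrame({
    "genre": ["Action", "Comedy", "Drama"],
    "worldwide_gross": [350, 120, 90],
    "domestic_gross": [140, 70, 40],
})

sns.set_theme(style="whitegrid")
fig, ax = plt.subplots(figsize=(7, 4))

# Draw the worldwide total first, then the domestic portion on top of it.
sns.barplot(data=df, x="worldwide_gross", y="genre",
            color="lightblue", label="Worldwide", ax=ax)
sns.barplot(data=df, x="domestic_gross", y="genre",
            color="darkblue", label="Domestic", ax=ax)

# Customize the Seaborn plot with plain Matplotlib calls on the same axes.
ax.set_xlabel("Gross (millions USD)")
ax.set_ylabel("")
ax.set_title("Gross by genre (placeholder data)")
ax.legend(loc="lower right")
fig.tight_layout()
```

Because Seaborn draws onto a regular Matplotlib axes object, the labels, title, and legend can be adjusted with the same calls used in a pure Matplotlib plot.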

3. Plotly

If interactivity and customization are what you are looking for, then look no further than Plotly! When exploring which visualization would best show the details of individual data points on a graph, I found that plotly was the best option.

Plotly Express is a high-level interface built on top of the existing plotly library (it ships with it as plotly.express). As its name implies, it makes plots even quicker to write than with plotly's lower-level API.

In this project, plotly was used to answer the question of whether IMDb ratings affect the profitability of a given movie. With Plotly Express, we were able to make the visualization interactive, so that the audience can hover over any point on the plot and see the specific movie name and details.

With different datasets and conclusions to draw, there are many choices in data visualization. To keep the styling consistent throughout a project, though, it is highly recommended that the data scientist choose one library and stick with it. In closing, the choice of visualization is always the data scientist's final decision, but it is good practice to stay open to new ways of visualizing your data.

Rachel Beery
Flatiron Data Science Program Student