
Chapter 1: Drawing a line chart with matplotlib

posted on 2023-05-21 17:53


Series Article Directory

Chapter 1: Drawing a line chart with matplotlib
Chapter 2: Drawing a bar chart with matplotlib
Chapter 3: Drawing a histogram with matplotlib
Chapter 4: Drawing a scatter chart with matplotlib
Chapter 5: Drawing a pie chart with matplotlib
Chapter 6: Drawing a heat map with matplotlib
Chapter 7: Drawing stacked bar charts with matplotlib
Chapter 8: Drawing multiple graphs in one canvas with matplotlib



Foreword

As the saying goes, a picture is worth a thousand words. Data visualization presents data graphically, which makes the patterns in the data easier to see. Once we understand those patterns, we can make better business decisions.


1. What is a line chart?

A line chart is a statistical chart made of points and lines. It is often used to show how values change over continuous time intervals or across ordered categories. The x-axis usually represents a continuous time interval or an ordered category (such as stage 1, stage 2, stage 3). The y-axis represents the quantified data; negative values are drawn below the zero line of the y-axis. Line segments connect each pair of adjacent data points.

Line charts are used to analyze how something trends over time or across ordered categories. When there are several data sets, they also show how the data sets interact and influence one another over the same range. The direction of each line segment indicates whether the change is positive or negative, and its slope indicates how large the change is.
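To make the idea of direction and slope concrete, here is a minimal sketch that computes the change between adjacent points by hand. The `stages` and `values` lists are made-up data for illustration only; they do not come from the examples in this chapter.

```python
# Hypothetical data: one value per ordered stage.
stages = [1, 2, 3, 4, 5]
values = [10, 14, 13, 13, 20]

# Slope of each segment: change in value divided by change in x.
slopes = [
    (values[i + 1] - values[i]) / (stages[i + 1] - stages[i])
    for i in range(len(values) - 1)
]

for start, slope in zip(stages, slopes):
    direction = "up" if slope > 0 else "down" if slope < 0 else "flat"
    print(f"stage {start} -> {start + 1}: slope {slope:+.1f} ({direction})")
```

The sign of each slope is what the eye reads as the segment going up or down, and its size is what the eye reads as steepness.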


2. Drawing a line chart

1. Use default styles

In this lesson we will see how to make a line chart. First we need to install the matplotlib library, which can be done with pip install matplotlib. Once it is installed, we can use it to draw. For example:

from matplotlib import pyplot as plt

dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]

plt.plot(dev_x, dev_y)
plt.show()

The first line of the code above imports pyplot from matplotlib under the alias plt. dev_x and dev_y are two lists of 11 values each, which are passed to the plot function to draw the line. The plot function takes the two data sets as its arguments. After calling plot, you need to call the show function; otherwise, when the code is run as a script, the graph will not be displayed. The code above produces the following graph:
[Figure: line chart drawn with the default style]
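Since plot pairs the two lists element by element, they must have the same length. The sketch below, which uses a shortened copy of the data, shows what plot returns and what happens when the lengths differ. The "Agg" backend line and the `mismatch_error` name are assumptions added so the sketch can run without opening a window; they are not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
from matplotlib import pyplot as plt

dev_x = [25, 26, 27]
dev_y = [38496, 42000, 46752]

# plot returns a list with one line object per line drawn.
lines = plt.plot(dev_x, dev_y)
line = lines[0]
print("points stored on the line:", len(line.get_xdata()))

# Three x values but only two y values: plot refuses the data.
mismatch_error = None
try:
    plt.plot(dev_x, dev_y[:2])
except ValueError as exc:
    mismatch_error = exc
    print("plot rejected the data:", exc)

plt.close("all")
```

If you see this ValueError in your own code, the first thing to check is that both lists have the same number of elements.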

The graph above does not say what the horizontal and vertical axes represent, and it has no title. Let's add that information:

from matplotlib import pyplot as plt

dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]

plt.plot(dev_x, dev_y)

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.show()

The code above labels the horizontal axis with the xlabel function, labels the vertical axis with the ylabel function, and sets the title of the graph with the title function. The graph with this information added is shown below:
[Figure: line chart with axis labels and a title]
The graph above contains only one line. Now let's add a second one. The code is as follows:

from matplotlib import pyplot as plt

dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(dev_x, dev_y)

py_dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(py_dev_x, py_dev_y)

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.show()

The resulting graph is as follows:
[Figure: line chart with two lines]
In the code above, we added a second line from another data set. Looking closely, we can see that dev_x and py_dev_x are identical. To keep the code concise, we can keep just one copy, for example:

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y)

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y)

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.show()

The graph generated by the code above is shown below:
[Figure: two lines drawn from a shared list of x values]
The graph now has two lines, but nothing in it says what each one represents, so we cannot tell them apart. To distinguish them, we need to label each line. For example:

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y)

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y)

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend(['All developers', 'Python developers'])

plt.show()

The resulting graph is as follows:
[Figure: two lines with a legend]
In the code above, we use the legend function to label the two lines. The legend function takes a list, and the order of its elements must match the order in which the lines were drawn; otherwise the labels end up on the wrong lines. Whenever we change the drawing order, we also have to change the order of the list passed to legend. That is the drawback of this method, so let's look at another one.

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.show()

This method attaches the label through the label parameter of the plot function. We still need to call the legend function afterwards, otherwise the labels will not be displayed, but legend no longer needs any arguments. The graph with labeled lines is shown below:
[Figure: two labeled lines with a legend]
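The following sketch illustrates why the label parameter is the safer choice: the lines are deliberately drawn in the opposite order, and each label still stays with its own data. It uses a shortened copy of the data and English label text. The "Agg" backend line and the use of plt.gca() to inspect the legend entries are assumptions made so the sketch can check itself without a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
from matplotlib import pyplot as plt

ages_x = [25, 26, 27]
dev_y = [38496, 42000, 46752]
py_dev_y = [45372, 48876, 53850]

# Drawing order is swapped on purpose: the Python line comes first.
plt.plot(ages_x, py_dev_y, label="Python developers")
plt.plot(ages_x, dev_y, label="All developers")
legend = plt.legend()

# Each label is stored on its own line, so the pairing cannot drift.
handles, labels = plt.gca().get_legend_handles_labels()
for handle, label in zip(handles, labels):
    print(label, [int(v) for v in handle.get_ydata()])

plt.close("all")
```

With the list-based call, swapping the two plot lines without also swapping the list would have silently mislabeled both lines.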

2. Style setting

All the graphs above use the default style. We can also set the style through parameters, including the color, shape, and thickness of the lines. For example:

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers", color="blue", marker=".", linestyle="-")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers", color="green", marker=".", linestyle="--")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.show()

In the code above, the color parameter sets the color of each line, the marker parameter sets the marker drawn at each data point, and the linestyle parameter sets the line style (solid or dashed here). The graph produced by the code above is shown below:
[Figure: blue solid line and green dashed line with point markers]
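The example above sets color and shape but not thickness. As a small sketch, thickness can be set with the linewidth parameter of plot, which is given in points. The data is a shortened copy of the example data, the second line is shifted by an arbitrary 5000 just to keep the two lines apart, and the "Agg" backend line is an assumption so the sketch runs without a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28]
dev_y = [38496, 42000, 46752, 49320]
shifted_y = [v + 5000 for v in dev_y]  # made-up second series for comparison

# Same kind of call as before, plus linewidth (measured in points).
(thin_line,) = plt.plot(ages_x, dev_y, color="blue", marker=".", linestyle="-", linewidth=1)
(thick_line,) = plt.plot(ages_x, shifted_y, color="green", marker="o", linestyle="--", linewidth=3)

print(thin_line.get_linewidth(), thick_line.get_linewidth())
plt.close("all")
```

A thicker line draws attention, so it is usually worth reserving it for the one series you want the reader to notice first.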
The value of the color parameter can also be a hex value. A hex value consists of six hexadecimal digits: the first two give the intensity of red, the middle two the intensity of green, and the last two the intensity of blue. For example:

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers", color="#FF0000", marker=".", linestyle="-")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers", color="#00FF00", marker=".", linestyle="--")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.show()

In the code above, #FF0000 specifies red and #00FF00 specifies green. The graph produced by the code above is shown below:
[Figure: red solid line and green dashed line]
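To see how the six digits split into three channels, here is a small helper written for this explanation. hex_to_rgb is a hypothetical name and is not part of matplotlib; it only decodes the string by hand.

```python
def hex_to_rgb(hex_color):
    """Split '#RRGGBB' into three integers in the range 0-255."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hexadecimal digits, got {hex_color!r}")
    # Two digits per channel: red, then green, then blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


for color in ("#FF0000", "#00FF00", "#0000FF", "#FF8800"):
    print(color, hex_to_rgb(color))
```

FF is the largest two-digit hexadecimal number (255), so #FF0000 means full red with no green and no blue. matplotlib also offers matplotlib.colors.to_rgb, which does a similar conversion but returns floats between 0 and 1.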
We can also add grid lines to the graph, for example:

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers", color="#FF0000", marker=".", linestyle="-")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers", color="#00FF00", marker=".", linestyle="--")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.grid(True)

plt.show()

The code above calls the grid function and passes True as its argument. The resulting graph is shown below:
[Figure: line chart with grid lines]
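The grid function accepts more than True. As a hedged sketch, the axis parameter limits the grid to one direction, and line properties such as linestyle and alpha change how the grid lines look. The data is a shortened copy of the example data, and the "Agg" backend line and the `y_grid_visible` check are assumptions added so the sketch can run and verify itself without a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28]
dev_y = [38496, 42000, 46752, 49320]

plt.plot(ages_x, dev_y, marker=".")

# Only horizontal grid lines (axis="y"), dotted and semi-transparent.
plt.grid(True, axis="y", linestyle=":", alpha=0.5)

# Inspect the first grid line of the y-axis to confirm it is switched on.
y_grid_visible = plt.gca().yaxis.get_gridlines()[0].get_visible()
print("y grid visible:", y_grid_visible)

plt.close("all")
```

A light, dotted grid helps the reader estimate values without competing with the data lines.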
To reduce wasted blank space in the graph, we call the tight_layout function, which automatically adjusts the padding around the plot. For example:

from matplotlib import pyplot as plt

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers", color="#FF0000", marker=".", linestyle="-")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers", color="#00FF00", marker=".", linestyle="--")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.grid(True)

plt.tight_layout()

plt.show()

After adding the call to tight_layout, the generated graph is shown below:
[Figure: line chart after calling tight_layout]
So far we have changed the style of the graph by setting parameters. pyplot also comes with some built-in styles that we can use directly, for example:

from matplotlib import pyplot as plt

plt.style.use('fivethirtyeight')

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.grid(True)

plt.tight_layout()

plt.show()

The code above uses the fivethirtyeight style, and the generated graph is shown below:
[Figure: line chart in the fivethirtyeight style]
There are other styles as well, such as ggplot:

from matplotlib import pyplot as plt

plt.style.use('ggplot')

ages_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]

dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
plt.plot(ages_x, dev_y, label="All developers")

py_dev_y = [45372, 48876, 53850, 57287, 63016, 65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(ages_x, py_dev_y, label="Python developers")

plt.xlabel("Age")
plt.ylabel("Annual salary")
plt.title("Relationship between age and salary")

plt.legend()

plt.grid(True)

plt.tight_layout()

plt.show()

The resulting graph is shown below:
[Figure: line chart in the ggplot style]
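To find out which style names can be passed to plt.style.use, we can look at plt.style.available. The sketch below also uses plt.style.context, which applies a style only inside a with-block so that later figures keep the default look. The data is a shortened copy of the example data, and the "Agg" backend line and the `styled_grid` variable are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
from matplotlib import pyplot as plt

# Names accepted by plt.style.use(); the exact list depends on the installed version.
print(sorted(plt.style.available))

ages_x = [25, 26, 27, 28]
dev_y = [38496, 42000, 46752, 49320]

# The style applies only inside the with-block.
with plt.style.context("ggplot"):
    plt.plot(ages_x, dev_y, label="All developers")
    plt.legend()
    # A style is a set of default settings; ggplot, for instance, turns the grid on.
    styled_grid = plt.rcParams["axes.grid"]

print("grid enabled by the ggplot style:", styled_grid)
plt.close("all")
```

Because plt.style.use changes the defaults for everything drawn afterwards, the with-block form is handy when only one figure in a script should look different.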


3. Application scenarios

1. Applicable scenarios

Showing how one variable trends over time or across ordered categories, such as how salary changes with age in the example above.

2. Inapplicable scenarios

  • There are too many points on the x-axis.
  • There are too many data sets, so the lines pile up and the key points are hard to pick out.
  • The variable is 0 most of the time.

Summary

In this chapter, we covered how to draw a line chart, including how to set its style, and looked at the scenarios where a line chart is and is not a good fit.

Next chapter: Drawing bar charts with matplotlib




Author:Disheartened

link:http://www.pythonblackhole.com/blog/article/25318/21f9c6fd046dfe39465d/

source:python black hole net

Please indicate the source for any form of reprinting. If any infringement is discovered, it will be held legally responsible.
