
Crawling weather data with Python and performing visual analysis

Posted on 2023-05-21 18:07


  1. Data Acquisition Logic

  1. Data schema

Historical weather data schema

    {
        'Information of the day': '2023-01-01 Sunday',
        'Maximum temperature': '8℃',
        'Minimum temperature': '5℃',
        'Weather': 'Cloudy',
        'Wind direction information': 'North wind 3'
    }
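Every value in a record of this shape is a string, so it usually has to be parsed before any analysis. The following is a minimal sketch of one way to do that, assuming the English field names shown above and values formatted like '8℃' and 'North wind 3'. `parse_record` is a hypothetical helper introduced here for illustration; it is not part of the original code.

```python
def parse_record(record):
    """Convert one raw weather record (all strings) into typed fields.

    Assumes the field names and value formats of the schema above.
    """
    # '2023-01-01 Sunday' -> date string and weekday name
    date, weekday = record['Information of the day'].split(' ', 1)
    # 'North wind 3' -> direction text and numeric wind scale
    direction, scale = record['Wind direction information'].rsplit(' ', 1)
    return {
        'date': date,
        'weekday': weekday,
        'high': int(record['Maximum temperature'].rstrip('℃')),
        'low': int(record['Minimum temperature'].rstrip('℃')),
        'weather': record['Weather'],
        'wind_direction': direction,
        'wind_scale': int(scale),
    }


sample = {
    'Information of the day': '2023-01-01 Sunday',
    'Maximum temperature': '8℃',
    'Minimum temperature': '5℃',
    'Weather': 'Cloudy',
    'Wind direction information': 'North wind 3',
}
parsed = parse_record(sample)
print(parsed)
```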

  1. Data crawling

1. Import library

    import numpy as np
    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    from matplotlib import pyplot as plt
    from pandas import Series, DataFrame

2. Disguise the program as a browser

    # Request headers that make the script look like a regular browser
    headers = {
        'Host': 'lishi.tianqi.com',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.63'
    }

3. Capture weather data

    url = 'https://lishi.tianqi.com/shanghai/202301.html'  # Shanghai weather, January 2023
    res = requests.get(url, headers=headers)
    res.encoding = 'utf-8'  # decode the response body as UTF-8
    html = BeautifulSoup(res.text, 'html.parser')
    data_all = []
    # The daily records are <li> items inside <div class="tian_three">
    tian_three = html.find("div", {"class": "tian_three"})
    lishi = tian_three.find_all("li")
    for i in lishi:
        # Each <li> holds one <div> per field of the day's record
        lishi_div = i.find_all("div")
        data = []
        for j in lishi_div:
            data.append(j.text)
        data_all.append(data)
    print(data_all)
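The URL above encodes the city and the month, so crawling other months or cities mostly comes down to building more URLs. The sketch below assumes that the pattern `https://lishi.tianqi.com/<city>/<YYYYMM>.html` seen in the single example URL also holds for other months and city slugs; that generalization is an assumption, and `month_url` and `year_urls` are hypothetical helper names.

```python
BASE_URL = 'https://lishi.tianqi.com'


def month_url(city, year, month):
    """Build the history-page URL for one city and month.

    The URL pattern is inferred from the single example in the article.
    """
    if not 1 <= month <= 12:
        raise ValueError(f'month must be in 1..12, got {month}')
    # Zero-pad the month so January 2023 becomes '202301'
    return f'{BASE_URL}/{city}/{year}{month:02d}.html'


def year_urls(city, year):
    """Return the twelve monthly URLs for a city and year."""
    return [month_url(city, year, month) for month in range(1, 13)]


urls = year_urls('shanghai', 2023)
print(urls[0])  # https://lishi.tianqi.com/shanghai/202301.html
```

Each URL can then be passed to the same `requests.get(url, headers=headers)` call used above.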

4. Data storage

Before storing the data, process it to make the later analysis easier. Split the "当日信息" (information of the day) field into two fields, "日期" (date) and "星期" (day of the week), and split "风向信息" (wind direction information) the same way into "风向" (wind direction) and "级数" (wind scale). Finally, save the data as a CSV file.

    weather = pd.DataFrame(data_all)
    weather.columns = ["当日信息", "最高气温", "最低气温", "天气", "风向信息"]
    weather_shape = weather.shape
    print(weather)
    # Split "当日信息" on spaces and keep the first two parts: date and weekday
    result = DataFrame(weather['当日信息'].apply(lambda x: Series(str(x).split(' '))))
    result = result.loc[:, 0:1]
    result.columns = ['日期', '星期']
    # Split "风向信息" the same way: wind direction and wind scale
    result1 = DataFrame(weather['风向信息'].apply(lambda x: Series(str(x).split(' '))))
    result1 = result1.loc[:, 0:1]
    result1.columns = ['风向', '级数']
    # Replace the two combined columns with the four split columns
    weather = weather.drop(columns='当日信息')
    weather = weather.drop(columns='风向信息')
    weather.insert(loc=0, column='日期', value=result['日期'])
    weather.insert(loc=1, column='星期', value=result['星期'])
    weather.insert(loc=5, column='风向', value=result1['风向'])
    weather.insert(loc=6, column='级数', value=result1['级数'])
    # Save as CSV
    weather.to_csv("上海23年1月天气.csv", encoding="utf_8")
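The same split can also be written more compactly with pandas' vectorized string methods. This is an alternative sketch rather than the article's method: it assumes each combined field contains exactly one separating space, and the two sample rows are illustrative values, not crawled data.

```python
import pandas as pd

# Illustrative rows in the same shape as the crawled table (not real data)
raw = pd.DataFrame({
    '当日信息': ['2023-01-01 星期日', '2023-01-02 星期一'],
    '最高气温': ['8℃', '10℃'],
    '风向信息': ['北风 3级', '东南风 2级'],
})

# str.split(..., expand=True) returns one column per part
date_week = raw['当日信息'].str.strip().str.split(' ', n=1, expand=True)
date_week.columns = ['日期', '星期']
wind = raw['风向信息'].str.strip().str.split(' ', n=1, expand=True)
wind.columns = ['风向', '级数']

# Put the split columns around the remaining original columns
tidy = pd.concat(
    [date_week, raw.drop(columns=['当日信息', '风向信息']), wind],
    axis=1,
)
print(tidy)
```

`n=1` limits the split to the first space, so any further spaces stay in the second part instead of creating extra columns.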

5. Data analysis

Note: the data analysis uses the weather data for Beijing in January 2023 (not the Shanghai data from the crawling example), as shown in the figure below:

1. Weather conditions in Beijing in January 2023

    # Data processing
    plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
    plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
    # Strip the '℃' suffix and convert temperatures to integers
    weather['最高气温'] = weather['最高气温'].map(lambda x: int(x.replace('℃', '')))
    weather['最低气温'] = weather['最低气温'].map(lambda x: int(x.replace('℃', '')))
    dates = weather['日期']
    highs = weather['最高气温']
    lows = weather['最低气温']
    # Plot daily highs and lows and shade the range between them
    fig = plt.figure(dpi=128, figsize=(10, 6))
    plt.plot(dates, highs, c='red', alpha=0.5)
    plt.plot(dates, lows, c='blue', alpha=0.5)
    plt.fill_between(dates, highs, lows, facecolor='blue', alpha=0.2)
    # Chart formatting
    plt.title('2023北京1月天气情况', fontsize=24)
    plt.xlabel('', fontsize=6)
    fig.autofmt_xdate()
    plt.ylabel('气温', fontsize=12)
    plt.tick_params(axis='both', which='major', labelsize=10)
    # Show only every fifth date on the x axis
    plt.xticks(dates[::5])
    # Display the chart
    plt.show()
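The shaded band in this chart is the daily temperature range, and the same quantity can be summarized numerically. The sketch below is one possible way to do that with the standard library; `summarize` is a hypothetical helper, and the four sample values are made up for illustration.

```python
from statistics import mean


def summarize(highs, lows):
    """Summarize daily highs and lows given as equal-length lists of numbers."""
    if not highs or len(highs) != len(lows):
        raise ValueError('highs and lows must be non-empty and equally long')
    # Daily range: the height of the shaded band on each day
    ranges = [high - low for high, low in zip(highs, lows)]
    # Index of the first day with the widest range
    widest_day_index = max(range(len(ranges)), key=ranges.__getitem__)
    return {
        'mean_high': mean(highs),
        'mean_low': mean(lows),
        'mean_range': mean(ranges),
        'widest_day_index': widest_day_index,
    }


# Made-up sample values, not crawled data
highs = [8, 10, 6, 3]
lows = [5, 4, 1, -2]
stats = summarize(highs, lows)
print(stats)
```

With the article's DataFrame, `list(weather['最高气温'])` and `list(weather['最低气温'])` could be passed in once the temperatures have been converted to integers.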

2. Pie chart of climate distribution in Beijing in January 2023

January 2023 has 31 days, so the data contains 31 rows. Keep the loop count in sync with the number of rows when iterating.

    # Pie chart of weather types
    # A separate name keeps the `weather` DataFrame available for the wind chart
    weather_list = list(weather['天气'])
    dic_wea = {}
    for i in range(len(weather_list)):  # 31 rows for January
        if weather_list[i] in dic_wea.keys():
            dic_wea[weather_list[i]] += 1
        else:
            dic_wea[weather_list[i]] = 1
    print(dic_wea)
    explode = [0.01] * len(dic_wea.keys())
    color = ['lightskyblue', 'silver', 'yellow', 'salmon', 'grey', 'lime', 'gold', 'red', 'green', 'pink']
    plt.pie(list(dic_wea.values()), explode=explode, labels=list(dic_wea.keys()), autopct='%1.1f%%', colors=color)
    plt.title('北京23年1月天气候分布饼图')
    plt.show()
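The counting loop can also be expressed with `collections.Counter` from the standard library, which avoids hard-coding the number of days. This is a minimal sketch; `weather_shares` is a hypothetical helper and the sample list is illustrative, not crawled data.

```python
from collections import Counter


def weather_shares(conditions):
    """Return {weather type: (count, percentage)}, most frequent first."""
    counts = Counter(conditions)
    total = sum(counts.values())
    return {
        name: (count, 100 * count / total)
        for name, count in counts.most_common()
    }


# Illustrative sample, not crawled data
sample = ['多云', '晴', '多云', '阴', '晴', '多云']
shares = weather_shares(sample)
for name, (count, percent) in shares.items():
    print(f'{name}: {count} days ({percent:.1f}%)')
```

The keys and counts of the result can be passed to `plt.pie` in the same way as `dic_wea` above.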

3. Wind Scale Chart

Define a custom change_wind function that converts each wind direction label into an angle in degrees, then calculate the average wind scale for each wind direction.

    def change_wind(wind):
        """Convert wind direction labels to angles in degrees."""
        angles = {
            "北风": 90, "南风": 270, "西风": 180, "东风": 360,
            "东北风": 45, "西北风": 135, "西南风": 225, "东南风": 315,
        }
        # Labels that are not in the table are left unchanged
        return [angles.get(w, w) for w in wind]

    # Wind direction radar chart
    wind = list(weather['风向'])
    # Strip the '级' suffix and convert the wind scale to an integer
    weather['级数'] = weather['级数'].map(lambda x: int(x.replace('级', '')))
    wind_speed = list(weather['级数'])
    wind = change_wind(wind)
    degs = np.arange(45, 361, 45)
    temp = []
    for deg in degs:
        # Collect the wind scales recorded for this direction
        speed = [s for w, s in zip(wind, wind_speed) if w == deg]
        # Average scale for the direction, or 0 if it never occurred
        temp.append(sum(speed) / len(speed) if speed else 0)
    print(temp)
    N = 8
    # Bar centers are offset by half a sector, so the bar for `deg`
    # spans the sector from deg - 45 to deg
    theta = np.arange(N) * 2 * np.pi / N + np.pi / N
    # Radial values
    radii = np.array(temp)
    # Create the polar axes
    plt.axes(polar=True)
    # RGB value per sector: the larger x is, the closer the color is to blue
    peak = max(temp) or 1  # avoid dividing by zero when every average is 0
    colors = [(1 - x / peak, 1 - x / peak, 0.6) for x in radii]
    plt.bar(theta, radii, width=(2 * np.pi / N), bottom=0.0, color=colors)
    plt.title('风级图', x=0.2, fontsize=20)
    plt.show()
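The averaging step can be checked on its own, without plotting. The sketch below reuses the angle convention from change_wind and groups the wind scales by direction with a `defaultdict`. `mean_scale_by_angle` is a hypothetical helper, the sample values are made up, and '无持续风向' stands in for any label that has no entry in the mapping.

```python
from collections import defaultdict

# Same angle convention as change_wind in the article
WIND_ANGLE = {
    '东北风': 45, '北风': 90, '西北风': 135, '西风': 180,
    '西南风': 225, '南风': 270, '东南风': 315, '东风': 360,
}


def mean_scale_by_angle(directions, scales):
    """Return the average wind scale for the angles 45, 90, ..., 360."""
    buckets = defaultdict(list)
    for direction, scale in zip(directions, scales):
        angle = WIND_ANGLE.get(direction)
        if angle is not None:  # skip labels that have no angle
            buckets[angle].append(scale)
    means = []
    for angle in range(45, 361, 45):
        values = buckets.get(angle, [])
        # 0 marks a direction that never occurred
        means.append(sum(values) / len(values) if values else 0)
    return means


# Made-up sample values, not crawled data
directions = ['北风', '北风', '东南风', '无持续风向']
scales = [3, 1, 2, 0]
means = mean_scale_by_angle(directions, scales)
print(means)
```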




Author: python98k

Link: http://www.pythonblackhole.com/blog/article/25350/b11af16226bed7f2defa/

Source: python black hole net

Please indicate the source for any form of reprinting. If any infringement is discovered, it will be held legally responsible.
