Python epidemic data visualization (crawler + data visualization) (Jupyter environment)

posted on 2023-06-06 11:46


Table of contents

1 Project background

2 Project goals

3 Project Analysis

3.1 Data Acquisition

3.1.1 Analysis Website

3.1.2 Find the url where the data is located

3.1.3 Get data

3.1.4 Analyzing data

3.1.5 Saving data

3.2 Data Visualization

3.2.1 Read data

3.2.2 Bar chart of the number of confirmed cases and the number of deaths in each region

3.2.3 Map of the number of confirmed cases in each region

3.2.4 Ring diagram of the distribution of the number of confirmed cases in each region

3.2.5 Line chart of the distribution of the number of confirmed cases in each region

Project source code:


1 Project background

At the end of 2019, an outbreak of pneumonia (COVID-19) began and spread worldwide; it was later confirmed to be caused by a novel coronavirus (SARS-CoV-2).

2 Project goals

Using publicly available data obtained by crawling, this project produces a few visualizations, in the hope of helping readers better understand how the epidemic is developing and giving everyone more confidence that the virus can be defeated.

3 Project Analysis

3.1 Data Acquisition

3.1.1 Analysis Website

First, find the page that holds the data we want to crawl:

https://news.qq.com/zt2020/page/feiyan.htm#/

3.1.2 Find the url where the data is located

The page loads its data from the following URL:

    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'

3.1.3 Get data

Fetch the JSON data with a crawler:

    import requests

    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    # verify=False skips TLS certificate verification
    response = requests.get(url, verify=False)
    json_data = response.json()['data']
    china_data = json_data['diseaseh5Shelf']['areaTree'][0]['children']  # a list of region records
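The lookup above assumes the response always has the expected shape. As a minimal sketch, the same key path can be wrapped in a helper that fails with a readable message when a key is missing. The function name `extract_regions` and the sample payload are hypothetical; only the key path (`data`, `diseaseh5Shelf`, `areaTree`, first element, `children`) comes from the code above.

```python
def extract_regions(payload):
    """Return the list of region records from a decoded API response.

    Follows the same key path as the snippet above and raises ValueError
    with a readable message if the payload does not have that shape.
    """
    try:
        return payload["data"]["diseaseh5Shelf"]["areaTree"][0]["children"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"unexpected response shape: {exc!r}") from exc


# Hypothetical payload with the same nesting as the real response.
sample_payload = {
    "data": {
        "diseaseh5Shelf": {
            "areaTree": [
                {"name": "中国", "children": [{"name": "北京"}, {"name": "上海"}]}
            ]
        }
    }
}

regions = extract_regions(sample_payload)
print([r["name"] for r in regions])
```

With the real response, this would presumably be called as `extract_regions(response.json())`.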

3.1.4 Analyzing data

Iterate over the list with a for loop and store each region's values in a dictionary:

    data_set = []
    for i in china_data:
        data_dict = {}
        # region name
        data_dict['province'] = i['name']
        # current confirmed cases
        data_dict['nowConfirm'] = i['total']['nowConfirm']
        # number of deaths
        data_dict['dead'] = i['total']['dead']
        # number of recoveries
        data_dict['heal'] = i['total']['heal']
        data_set.append(data_dict)
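The loop above raises a `KeyError` if any record lacks one of the fields. A minimal sketch of a more tolerant variant follows, assuming that a missing count can be treated as 0 (a design choice for illustration, not something the source specifies). The helper name `parse_region` and the sample numbers are hypothetical.

```python
def parse_region(record):
    """Flatten one region record into the four columns used in this article."""
    total = record.get("total", {})
    return {
        "province": record.get("name", ""),
        "nowConfirm": total.get("nowConfirm", 0),
        "dead": total.get("dead", 0),
        "heal": total.get("heal", 0),
    }


# Hypothetical records; the second one has no "heal" field.
sample_regions = [
    {"name": "北京", "total": {"nowConfirm": 12, "dead": 9, "heal": 100}},
    {"name": "上海", "total": {"nowConfirm": 5, "dead": 7}},
]

data_set = [parse_region(r) for r in sample_regions]
print(data_set)
```

Whether a default of 0 is appropriate depends on the analysis; skipping incomplete records is another option.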

3.1.5 Saving data

    df = pd.DataFrame(data_set)
    df.to_csv('yiqing_data.csv')
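Note that `df.to_csv` also writes the DataFrame index as an unnamed first column unless `index=False` is passed. To show the resulting file layout without pandas, here is a small sketch using the standard-library `csv` module instead (a swapped-in technique, not the author's method). The rows are hypothetical, and an in-memory buffer stands in for `yiqing_data.csv`.

```python
import csv
import io

# Hypothetical rows with the same columns as data_set.
rows = [
    {"province": "北京", "nowConfirm": 12, "dead": 9, "heal": 100},
    {"province": "上海", "nowConfirm": 5, "dead": 7, "heal": 80},
]
fieldnames = ["province", "nowConfirm", "dead", "heal"]

# Write a header line followed by one line per region.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

# Read the data back; every value comes back as a string.
buffer.seek(0)
restored = list(csv.DictReader(buffer))
print(restored[0])
```

Because CSV stores everything as text, the counts need converting back to integers after reading, which pandas does automatically in `read_csv`.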

3.2 Data Visualization

3.2.1 Read data

    # keep the nine regions with the most current confirmed cases
    df2 = df.sort_values(by=['nowConfirm'], ascending=False)[:9]
    df2
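The one-liner above sorts the regions by `nowConfirm` in descending order and keeps the first nine. The same idea in plain Python might look like the sketch below; the helper `top_n` and the sample rows are hypothetical and only illustrate the sort-then-slice step.

```python
# Hypothetical rows; only the column used for sorting matters here.
rows = [
    {"province": "A", "nowConfirm": 12},
    {"province": "B", "nowConfirm": 640},
    {"province": "C", "nowConfirm": 5},
    {"province": "D", "nowConfirm": 48},
]


def top_n(records, key, n):
    """Return the n records with the largest value under `key`."""
    return sorted(records, key=lambda r: r[key], reverse=True)[:n]


top = top_n(rows, "nowConfirm", 3)
print([r["province"] for r in top])
```

Slicing never fails when fewer than `n` rows exist; it simply returns all of them, which matches the behavior of `[:9]` on a short DataFrame.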

3.2.2 Bar chart of the number of confirmed cases and the number of deaths in each region

  

    bar = (
        Bar()
        # x axis: the first six regions
        .add_xaxis(list(df['province'].values)[:6])
        .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
        .add_yaxis("Recovered", df['heal'].values.tolist()[:6])
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Confirmed cases and deaths by region"),
            datazoom_opts=[opts.DataZoomOpts()],
        )
    )
    bar.render_notebook()

3.2.3 Map of the number of confirmed cases in each region

    china_map = (
        Map()
        # data is a list of [province, value] pairs
        .add("Current confirmed", [list(i) for i in zip(df['province'].values.tolist(), df['nowConfirm'].values.tolist())], "china")
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Confirmed cases by region"),
            visualmap_opts=opts.VisualMapOpts(max_=600, is_piecewise=True),
        )
    )
    china_map.render_notebook()
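The map takes its data as a list of `[province, value]` pairs and uses a fixed upper bound of 600 for the color scale. If the largest value in the data is far from 600, the scale may not reflect the data well. As a rough sketch, the bound could be derived from the data instead. The helper `rounded_max` and the sample values are hypothetical, and the sketch assumes a non-empty list of counts.

```python
import math

# Hypothetical sample values standing in for the DataFrame columns.
provinces = ["北京", "上海", "广东"]
now_confirm = [12, 5, 640]

# Same pairing expression as in the map code above.
pairs = [list(p) for p in zip(provinces, now_confirm)]


def rounded_max(values, step=100):
    """Round the largest value up to the next multiple of `step` (at least `step`)."""
    return max(step, math.ceil(max(values) / step) * step)


visual_max = rounded_max(now_confirm)
print(pairs)
print(visual_max)
```

The result could then be passed as `max_=visual_max` in `opts.VisualMapOpts` in place of the hard-coded value.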

3.2.4 Ring diagram of the distribution of the number of confirmed cases in each region

    pie = (
        Pie()
        .add(
            "",
            [list(i) for i in zip(df2['province'].values.tolist(), df2['nowConfirm'].values.tolist())],
            # inner and outer radius; two values produce a ring
            radius=["10%", "30%"],
        )
        .set_global_opts(
            legend_opts=opts.LegendOpts(orient="vertical", pos_top="70%", pos_left="70%"),
        )
        .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    )
    pie.render_notebook()

  

3.2.5 Line chart of the distribution of the number of confirmed cases in each region

    line = (
        Line()
        .add_xaxis(list(df['province'].values))
        .add_yaxis("Recovered", df['heal'].values.tolist())
        .add_yaxis("Deaths", df['dead'].values.tolist())
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Deaths and recoveries"),
        )
    )
    line.render_notebook()

Project source code:

    import requests  # module for sending network requests
    import json
    import pprint  # module for pretty-printing output
    import pandas as pd  # a very important module for data analysis
    from pyecharts import options as opts
    from pyecharts.charts import Bar, Line, Pie, Map, Grid
    import urllib3
    from pyecharts.globals import CurrentConfig, NotebookType

    # configure the notebook environment type
    CurrentConfig.NOTEBOOK_TYPE = NotebookType.JUPYTER_NOTEBOOK
    CurrentConfig.ONLINE_HOST = 'https://assets.pyecharts.org/assets/'
    # silence "InsecureRequestWarning: Unverified HTTPS request is being made to host 'api.inews.qq.com'"
    urllib3.disable_warnings()

    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url, verify=False)
    json_data = response.json()['data']
    china_data = json_data['diseaseh5Shelf']['areaTree'][0]['children']  # a list of region records

    data_set = []
    for i in china_data:
        data_dict = {}
        # region name
        data_dict['province'] = i['name']
        # current confirmed cases
        data_dict['nowConfirm'] = i['total']['nowConfirm']
        # number of deaths
        data_dict['dead'] = i['total']['dead']
        # number of recoveries
        data_dict['heal'] = i['total']['heal']
        data_set.append(data_dict)

    df = pd.DataFrame(data_set)
    df.to_csv('yiqing_data.csv')
    df2 = df.sort_values(by=['nowConfirm'], ascending=False)[:9]
    df2

    # bar = (
    #     Bar()
    #     .add_xaxis(list(df['province'].values)[:6])
    #     .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
    #     .add_yaxis("Recovered", df['heal'].values.tolist()[:6])
    #     .set_global_opts(
    #         title_opts=opts.TitleOpts(title="Confirmed cases and deaths by region"),
    #         datazoom_opts=[opts.DataZoomOpts()],
    #     )
    # )
    # bar.render_notebook()

    # china_map = (
    #     Map()
    #     .add("Current confirmed", [list(i) for i in zip(df['province'].values.tolist(), df['nowConfirm'].values.tolist())], "china")
    #     .set_global_opts(
    #         title_opts=opts.TitleOpts(title="Confirmed cases by region"),
    #         visualmap_opts=opts.VisualMapOpts(max_=600, is_piecewise=True),
    #     )
    # )
    # china_map.render_notebook()

    # pie = (
    #     Pie()
    #     .add(
    #         "",
    #         [list(i) for i in zip(df2['province'].values.tolist(), df2['nowConfirm'].values.tolist())],
    #         radius=["10%", "30%"],
    #     )
    #     .set_global_opts(
    #         legend_opts=opts.LegendOpts(orient="vertical", pos_top="70%", pos_left="70%"),
    #     )
    #     .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    # )
    # pie.render_notebook()

    line = (
        Line()
        .add_xaxis(list(df['province'].values))
        .add_yaxis("Recovered", df['heal'].values.tolist())
        .add_yaxis("Deaths", df['dead'].values.tolist())
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Deaths and recoveries"),
        )
    )
    line.render_notebook()



Author:Fiee

link:http://www.pythonblackhole.com/blog/article/80209/df5b972574692ebf6c02/

source:python black hole net

Please indicate the source for any form of reprinting. If any infringement is discovered, it will be held legally responsible.
