วิธีอ่านวัตถุ BeautifulSoup Div Tag เป็นพจนานุกรม

Aug 18 2020

ใหม่สำหรับ HTML และ BeautifulSoup ที่นี่ดังนั้น appologies . . ฉันอ่านเว็บไซต์อสังหาริมทรัพย์ที่มี BS4 และได้รับข้อมูลที่ฉันต้องการใน Div Class โดยเฉพาะ

list_1_divs = soup.find_all('div', class_="ListingCell-AllInfo ListingUnit")

BS4 พบ 29 Parent Divs แต่ละตัวมี Divs ย่อยจำนวนมาก แต่ข้อมูลทั้งหมดที่ฉันต้องการดูเหมือนจะอยู่ในพาเรนต์ดังนั้นฉันจึงลบ Divs ลูกทั้งหมดออก ผลลัพธ์ของพาเรนต์ Div ในตัวแปร " s_row " ดูเหมือนสตริงเมื่อฉันพิมพ์ แต่โหมดดีบั๊กอธิบาย " s_row " เป็น{Tag: 3} ที่มี attrs = {dict: 13}จากนั้นแสดงรายการองค์ประกอบที่ฉันต้องการใน nice รายการที่มีโครงสร้างในหน้าต่าง Debug

ฉันจะพิมพ์ (หรือส่งผ่านไปยัง Pandas) พจนานุกรมที่อยู่ภายในออบเจ็กต์ {Tag} ได้อย่างไร เป้าหมายสุดท้ายของฉันคือการมีตาราง 13 องค์ประกอบในพจนานุกรมเป็นคอลัมน์โดยมี 29 แถวที่มีค่าจาก " s_row " แต่ละรายการ ขอบคุณล่วงหน้า.

รหัส:

import urllib.request
from bs4 import BeautifulSoup
wiki = "https://www.lamudi.com.ph/metro-manila/makati/rockwell-1/buy/"
page = urllib.request.urlopen(wiki)
soup = BeautifulSoup(page, features='html.parser')
list_divs = soup.find_all('div', class_="ListingCell-AllInfo ListingUnit")
for s_row in list_divs:
    for child in s_row.find_all("div"):
        child.decompose()
    print(s_row)

คำตอบ

1 AndrejKesely Aug 18 2020 at 20:03

ถ้าฉันเข้าใจคุณถูกต้องคุณต้องการแยกทุกแอตทริบิวต์เป็นคอลัมน์ใน dataframe:

import pandas as pd
import urllib.request
from bs4 import BeautifulSoup


wiki = "https://www.lamudi.com.ph/metro-manila/makati/rockwell-1/buy/"
page = urllib.request.urlopen(wiki)
soup = BeautifulSoup(page, features='html.parser')
list_divs = soup.find_all('div', class_="ListingCell-AllInfo ListingUnit")
all_data = []
for s_row in list_divs:
    all_data.append({})
    for a in s_row.attrs:
        if a == 'class':
            continue
        all_data[-1][a] = s_row[a]

df = pd.DataFrame(all_data)
print(df)

พิมพ์:

   data-price data-category                data-subcategories data-car_spaces data-bedrooms  ... data-price_range data-sqm_range data-rooms_total data-land_size data-subdivisionname
0    82000000   condominium       ["condominium","3-bedroom"]               2             3  ...              NaN            NaN              NaN            NaN                  NaN
1     9800000   condominium          ["condominium","studio"]             NaN             1  ...              NaN            NaN              NaN            NaN                  NaN
2    48990000   condominium  ["condominium","double-bedroom"]             NaN             2  ...      37.8M-48.9M     93-121 sqm              NaN            NaN                  NaN
3    73730000   condominium       ["condominium","3-bedroom"]             NaN             3  ...      45.3M-73.7M    126-202 sqm              NaN            NaN                  NaN
4    26600000   condominium  ["condominium","single-bedroom"]             NaN             1  ...            26.6M         62 sqm              NaN            NaN                  NaN
5    27500000   condominium  ["condominium","double-bedroom"]               1             2  ...              NaN            NaN              NaN            NaN                  NaN
6   130000000   condominium     ["condominium","penthouse-1"]             NaN             4  ...              NaN            NaN              NaN            NaN                  NaN
7    78000000   condominium       ["condominium","3-bedroom"]               2             3  ...              NaN            NaN              NaN            NaN                  NaN
8    55000000   condominium       ["condominium","3-bedroom"]               2             3  ...              NaN            165                3            NaN                  NaN
9    19000000   condominium  ["condominium","single-bedroom"]               1             1  ...              NaN             64                1            NaN                  NaN
10   30000000   condominium  ["condominium","double-bedroom"]             NaN             2  ...              NaN            NaN              NaN            NaN                  NaN
11   14000000   condominium  ["condominium","single-bedroom"]             NaN             1  ...              NaN            NaN              NaN            NaN                  NaN
12   50000000   condominium       ["condominium","3-bedroom"]             NaN             3  ...              NaN            NaN              NaN            NaN                  NaN
13   48000000   condominium       ["condominium","3-bedroom"]             NaN             3  ...              NaN            NaN              NaN            NaN                  NaN
14   27000000   condominium  ["condominium","double-bedroom"]             NaN             2  ...              NaN            NaN              NaN            NaN                  NaN
15   36000000   condominium       ["condominium","3-bedroom"]             NaN             3  ...              NaN            NaN              NaN            NaN                  NaN
16   52000000         house   ["house","single-family-house"]               4             3  ...              NaN            NaN              NaN            110         Palm Village
17   48000000   condominium       ["condominium","3-bedroom"]               2             3  ...              NaN            NaN                4            NaN                  NaN
18   37500000   condominium  ["condominium","double-bedroom"]               2             2  ...              NaN            NaN              NaN            NaN                  NaN
19   19000000   condominium  ["condominium","double-bedroom"]               1             2  ...              NaN            NaN              NaN            NaN                  NaN
20   66700000   condominium       ["condominium","3-bedroom"]               2             3  ...              NaN            NaN              NaN            NaN                  NaN
21   16500000   condominium  ["condominium","double-bedroom"]               1             2  ...              NaN            NaN              NaN            NaN                  NaN
22   12900000   condominium  ["condominium","single-bedroom"]               1             1  ...              NaN            NaN              NaN            NaN                  NaN
23   20000000   condominium  ["condominium","double-bedroom"]               1             2  ...              NaN            NaN              NaN            NaN                  NaN
24   17300000   condominium  ["condominium","single-bedroom"]             NaN             1  ...              NaN            NaN              NaN            NaN                  NaN
25   25000000   condominium  ["condominium","double-bedroom"]             NaN             2  ...              NaN            NaN              NaN            NaN                  NaN
26   14000000   condominium  ["condominium","single-bedroom"]             NaN             1  ...              NaN            NaN              NaN            NaN                  NaN
27   32000000   condominium  ["condominium","double-bedroom"]             NaN             2  ...              NaN            NaN              NaN            NaN                  NaN
28   38000000   condominium  ["condominium","double-bedroom"]               1             2  ...              NaN            NaN              NaN            NaN                  NaN

[29 rows x 17 columns]