代码之家 › 专栏 › 技术社区 › Davide Lorino

用pandas将suds对象转换为数据帧

pandas python

Davide Lorino · 技术社区 · 7 年前

我有一个列表,如下所示:

`[(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   messageId = "68027b94b892396ed29581cde9ad07ff"
   status = "sent"
   type = "normal"
   }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113404"
   status = "sent"
   type = "transactional"
 }, (deliveryObject){
   id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113403"
   status = "sent"
   type = "transactional"
   }
]`

当我呼唤 type() python告诉我这是一个列表。

当我用 pd.DataFrame(df) ,结果如下:

my list as a dataframe

有人能帮我吗?数据帧应该有列名称,如“id”、“start”、“messageid”等,但它们只是作为每个观察的第一个元素出现,列名称显示为0、1、2等。

感谢您的帮助,谢谢!

3 回复 | 直到 7 年前

fcsr 7 年前

如果这是针对Bronto的,并且正在使用SOAP和SUDS实现。那么deliverobject只是一个suds对象。

你可以做到

from suds.client import Client

list_of_deliveryObjects = [(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   messageId = "68027b94b892396ed29581cde9ad07ff"
   status = "sent"
   type = "normal"
   }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113404"
   status = "sent"
   type = "transactional"
 }, (deliveryObject){
   id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113403"
   status = "sent"
   type = "transactional"
   }
]


data = [Client.dict(suds_object) for suds_object in list_of_deliveryObjects]
df = pd.DataFrame(data)

vladsiv 7 年前

好吧,这个看起来不好看,但它起作用了。我将您的列表转换为字符串:

import re
import pandas as pd

x = """[(deliveryObject){
   id = "0bf003ee0000000000000000000002a11cb6"
   start = 2019-01-02 09:30:00
   messageId = "68027b94b892396ed29581cde9ad07ff"
   status = "sent"
   type = "normal"
   }, (deliveryObject){
   id = "0bf0BE3ABFFDF8744952893782139E82793B"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113404"
   status = "sent"
   type = "transactional"
 }, (deliveryObject){
   id = "0bf0702D03CB42D848CBB0B0AF023A87FA65"
   start = 2018-12-29 23:00:00
   messageId = "0bc403eb0000000000000000000000113403"
   status = "sent"
   type = "transactional"
   }
]"""

然后我用regex列出了一些字典:

a = re.sub(' =', ':', x)
a = re.sub('\(deliveryObject\)', '', a)

for x in ['id', 'start', 'messageId', 'status', 'type']:
    a = re.sub(x, '\''+x+'\'', a)

a = re.sub("(?<=[\"0])\n(?= +?[\'])", '\n,', a)
a = re.sub('(?<=[0])\n(?=,)', '\"\n', a)
a = re.sub('(?<=[:]) (?=[0-9])', ' \"', a)
a = re.sub('(?<= )\"(?=[\w])', '[\"', a)
a = re.sub('(?<=[\w])\"(?=\n)', '\"]', a)

现在您有一个字典列表。第一排像这样

list_of_dict = eval(a)
df = pd.DataFrame(list_of_dict[0])
print(df.head())

                                     id                start                         messageId status    type
0  0bf003ee0000000000000000000002a11cb6  2019-01-02 09:30:00  68027b94b892396ed29581cde9ad07ff   sent  normal

从字典列表中添加其余字典。

拜托,请随意改善我的雷吉克斯,我知道它看起来很糟糕。

IftahP 7 年前

我这样做了:

import pandas as pd
lst =[{
   'id':"0bf003ee0000000000000000000002a11cb6",
   'start' : "2019-01-02 09:30:00",
   'messageId': "68027b94b892396ed29581cde9ad07ff",
   'status' : "sent",
   'type' : "normal"
   },{
   'id' :  "0bf0BE3ABFFDF8744952893782139E82793B",
   'start' :  "2018-12-29 23:00:00",
   'messageId' :  "0bc403eb0000000000000000000000113404",
   'status' :  "sent",
   'type' :  "transactional"
 }, {
   'id' :  "0bf0702D03CB42D848CBB0B0AF023A87FA65",
   'start' :  "2018-12-29 23:00:00",
   'messageId' :  "0bc403eb0000000000000000000000113403",
   'status' :  "sent",
   'type' :  "transactional"
   }]
df = pd.DataFrame(lst)
df

得到了这个(也见附图):

    id  messageId   start   status  type
0   0bf003ee0000000000000000000002a11cb6    68027b94b892396ed29581cde9ad07ff    2019-01-02 09:30:00 sent    normal
1   0bf0BE3ABFFDF8744952893782139E82793B    0bc403eb0000000000000000000000113404    2018-12-29 23:00:00 sent    transactional
2   0bf0702D03CB42D848CBB0B0AF023A87FA65    0bc403eb0000000000000000000000113403    2018-12-29 23:00:00 sent    transactional

Result