
Python 2.7: split and prepend `col.` to column names

  • saravanatn  · Tech Community  · 6 years ago

    I am using Python 2.7.

    I am trying to extract the column names of an array column using Python.

    The array column is defined as follows:

    `col` array<struct< columnname:string,columnname1:int,columnname2:decimal(10,0),
    columnname3:decimal(9,2)>>
    

    What I have tried so far:

    import re
    str=input("enter any string:")
    fields=str.split(",")
    for x in fields:
      name=x.split(":")
      seminame=name[0]+','
      firstname=seminame.find('`')
      lastname=seminame.rfind('`')
      fullname=seminame[(firstname+1):lastname]
      replacename1=fullname.replace(')', '')
      replacename2=fullname.replace('2', '')
      replacename3=fullname.replace('9', '')
      replacename4=fullname.replace('10', '')
      replacename5=fullname.replace('0', '')
      finalname='.'+replacename5
      print(finalname)
    

    Input:
    '`col` array<struct< columnname:string,columnname1:int,columnname2:decimal(10,0),
    columnname3:decimal(9,2)>>'
    

    This is the output I get and the output I want:

    Actual output
    .col,
    .columnname1,
    .columnname2,
    .),
    
    Expected output
    col.columnname,
    col.columnname1,
    col.columnname2,
    col.columnname3
    
    1 reply  |  last activity 6 years ago
  •   Madhan Varadhodiyil    6 years ago

    Why not use re to do the same thing?

    import re

    # Use a name other than `str` so the built-in type is not shadowed.
    text = "'`col` array<struct< columnname:string,columnname1:int,columnname2:decimal(10,0),columnname3:decimal(9,2)>>'"
    word = re.findall(r"`\w+`", text)  # match the backtick-quoted column name
    word = " ".join(word)
    word = re.sub(r'\W+', '', word)  # strip the backticks
    columnnames = re.findall(r"(\w+):", text)  # every word directly before a `:`
    for c in columnnames:
        # The capture group already excludes the `:`, so no extra cleanup is needed.
        print("%s.%s," % (word, c))
    

    Output:

    col.columnname,
    col.columnname1,
    col.columnname2,
    col.columnname3,
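
    The loop above prints a comma after every name, including the last one, whereas the expected output in the question has no comma after col.columnname3. A minimal sketch of one way to get that exact shape, assuming the definition contains a single backtick-quoted column name and that every identifier followed by `:` is a struct field. The function name `qualified_column_names` is hypothetical and not part of the original answer.

```python
import re


def qualified_column_names(schema):
    """Return ['col.field', ...] for a definition like '`col` array<struct<...>>'."""
    # The first backtick-quoted identifier is taken as the top-level column name.
    column = re.search(r"`(\w+)`", schema)
    if column is None:
        return []
    # Every identifier directly followed by ':' is treated as a struct field.
    fields = re.findall(r"(\w+)\s*:", schema)
    return ["%s.%s" % (column.group(1), field) for field in fields]


schema = ("`col` array<struct< columnname:string,columnname1:int,"
          "columnname2:decimal(10,0),columnname3:decimal(9,2)>>")
names = qualified_column_names(schema)
# Joining with ",\n" puts a comma after every name except the last.
print(",\n".join(names))
```

    Returning a list instead of printing inside the loop leaves the formatting (commas, newlines) to the caller. The code sticks to syntax that runs on Python 2.7 as well as Python 3.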
    

    To read the input from a file, you can use the open(filename, mode) function:

    import re

    with open("test.txt", "r") as h:
        text = h.read()  # read the whole file as one string
        word = re.findall(r"`\w+`", text)
        word = " ".join(word)
        word = re.sub(r'\W+', '', word)
        columnnames = re.findall(r"(\w+):", text)
        for c in columnnames:
            print("%s.%s," % (word, c))
    

    To write the result to a file:

    import re

    # Both files are closed automatically when the with-block exits,
    # so no explicit close() calls are needed.
    with open("test.txt", "r") as h, open("output.dat", "a") as w:
        text = h.read()
        word = re.findall(r"`\w+`", text)
        word = " ".join(word)
        word = re.sub(r'\W+', '', word)
        columnnames = re.findall(r"(\w+):", text)
        for c in columnnames:
            data = "%s.%s," % (word, c)
            w.write(data + "\n")
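
    The file-based snippets above read the whole file at once, so if the file held more than one backtick-quoted column, the names would be glued together into a single prefix (for example colother). A rough sketch of a line-by-line variant, assuming each definition starts with its backtick-quoted name and may wrap onto following lines, as the definition in the question does. The name `convert_file` is hypothetical, and the output is opened with "w" (overwrite) rather than "a" (append), which is a design choice and not something the original answer prescribes.

```python
import re


def convert_file(src_path, dst_path):
    """Write 'column.field' lines to dst_path; return how many were written."""
    current = None  # most recent backtick-quoted column name seen so far
    count = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            match = re.search(r"`(\w+)`", line)
            if match:
                current = match.group(1)
            if current is None:
                continue  # fields before any column name cannot be qualified
            # Fields on a wrapped line still belong to the current column.
            for field in re.findall(r"(\w+)\s*:", line):
                dst.write("%s.%s\n" % (current, field))
                count += 1
    return count
```

    A call such as convert_file("test.txt", "output.dat") would then write one qualified name per line, without trailing commas.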