
How to add a new column with a header to a CSV using awk

  •  6
  • data princess  · Tech Community  · 7 years ago

    I am using some awk in a bash script that processes CSVs. The awk does this:

    ORIG_FILE="score_model.csv"   
    NEW_FILE="updates/score_model.csv"    
    awk -v d="2017_01" -F"," 'BEGIN {OFS = ","} {$(NF+1)=d; print}' $ORIG_FILE > $NEW_FILE 
    

    It performs this transformation:

    # before
    model_description,      type,    effective_date, end_date
    Inc <= 40K,             Retired, 08/05/2016,     07/31/2017
    Inc > 40K Age <= 55 V5, Retired, 04/30/2016,     07/31/2017
    Inc > 40K Age > 55 V5 , Retired, 04/30/2016,     07/31/2017
    
    # after, bad
    model_description,      type,    effective_date, end_date,   2017_01  
    Inc <= 40K,             Retired, 08/05/2016,     07/31/2017, 2017_01
    Inc > 40K Age <= 55 V5, Retired, 04/30/2016,     07/31/2017, 2017_01
    Inc > 40K Age > 55 V5 , Retired, 04/30/2016,     07/31/2017, 2017_01
    

    I would like the new column to have a header, so that the new CSV looks like this:

    # after, desired
    model_description,      type,    effective_date, end_date,   cmpgn_group  
    Inc <= 40K,             Retired, 08/05/2016,     07/31/2017, 2017_01
    Inc > 40K Age <= 55 V5, Retired, 04/30/2016,     07/31/2017, 2017_01
    Inc > 40K Age > 55 V5 , Retired, 04/30/2016,     07/31/2017, 2017_01
    

    I know there is a way to specify what to do on the first line separately, but I have not been able to figure it out.

    3 answers  |  as of 7 years ago
        1
  •  6
  •   Rahul Verma    7 years ago

    Using sed

    $ sed '1s/$/,\tcmpgn_group/; 2,$s/$/,\t2017_01/' file
    

    i.e. for the 1st line: append ,\tcmpgn_group
    for lines 2 to $ (the last line): append ,\t2017_01
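
    The two line addresses can be tried end to end on a small sample. This is only a minimal sketch: `append_column_sed` is a hypothetical helper name, and it appends a plain `,` rather than `,\t`, because reading `\t` in the replacement as a tab is a GNU sed behavior that POSIX does not guarantee.

```shell
#!/bin/sh
# append_column_sed HEADER VALUE FILE  (hypothetical helper, not from the answer)
# Line 1 gets ",HEADER" appended; every later line gets ",VALUE".
# Assumption: HEADER and VALUE contain no "/", "&", or "\" characters,
# because they are pasted directly into the sed replacement text.
append_column_sed() {
    sed -e "1s/\$/,$1/" -e "2,\$s/\$/,$2/" "$3"
}
```

    Called as `append_column_sed cmpgn_group 2017_01 file`, it writes the modified CSV to standard output, like the command above.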

    Using awk

    $ awk -v d="2017_01" -F"," 'FNR==1{a="cmpgn_group"} FNR>1{a=d} {print $0",\t"a}' f1
    

    Output:

    model_description,      type,    effective_date, end_date,      cmpgn_group
    Inc <= 40K,             Retired, 08/05/2016,     07/31/2017,    2017_01
    Inc > 40K Age <= 55 V5, Retired, 04/30/2016,     07/31/2017,    2017_01
    Inc > 40K Age > 55 V5 , Retired, 04/30/2016,     07/31/2017,    2017_01
    
        2
  •  4
  •   RavinderSingh13 Nikita Bakshi    7 years ago

    The following awk (a small change to your solution) should work for you.

    ORIG_FILE="score_model.csv"   
    NEW_FILE="updates/score_model.csv"    
    awk -v d="2017_01" -F"," 'BEGIN {OFS = ","} FNR==1{$(NF+1)="cmpgn_group"} FNR>1{$(NF+1)=d;} 1' $ORIG_FILE > $NEW_FILE 
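
    The `BEGIN {OFS = ","}` part matters because assigning to any field, including `$(NF+1)`, makes awk rebuild `$0` by joining all fields with `OFS`. A small sketch of that behavior, using a made-up `;`-separated input line (an assumption chosen only to make the effect visible):

```shell
# The input uses ";" as its separator; the record is rebuilt with OFS=",".
rebuilt=$(printf 'a;b\n' | awk -F';' 'BEGIN {OFS = ","} {$(NF+1) = "c"; print}')
echo "$rebuilt"       # a,b,c

# Without setting OFS, the default single space joins the rebuilt record.
default_ofs=$(printf 'a;b\n' | awk -F';' '{$(NF+1) = "c"; print}')
echo "$default_ofs"   # a b c
```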
    

    Solution 2: Alternatively, we can drop the $(NF+1) approach (creating a new field) and print the line directly.

    awk -v d="2017_01" -F"," 'BEGIN {OFS = ","} {printf("%s%s%s",$0,OFS,FNR>1?d RS:"cmpgn_group" RS)}' $ORIG_FILE > $NEW_FILE
    

    Explanation of the above command:

    awk -v d="2017_01" -F"," ' ##Set the variable d to 2017_01 and the field separator to a comma.
    BEGIN{                     ##Start the BEGIN section of awk.
      OFS = ","                ##Set the output field separator to a comma.
    }                          ##Close the BEGIN block.
    {
      ##printf with %s%s%s prints three strings: $0 (the current line), OFS (the comma),
      ##and then, when FNR>1 (any line after the first), the variable d followed by RS;
      ##otherwise (the very first line of Input_file) the string "cmpgn_group" followed by RS.
      ##RS is the record separator, whose default value is a newline.
      printf("%s%s%s",$0,OFS,FNR>1?d RS:"cmpgn_group" RS)
    }
    ' $ORIG_FILE > $NEW_FILE   ##Pass $ORIG_FILE as the Input_file and redirect the output to $NEW_FILE.
    
        3
  •  4
  •   Ed Morton    7 years ago
    awk -v d="2017_01" 'BEGIN{FS=OFS=","} {print $0, (NR>1?d:"cmpgn_group")}' file
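
    For use in a script like the one in the question, this one-liner could be wrapped so that both the header name and the value are passed in as awk variables. This is only a sketch: the function name `add_column` and the awk variable `h` are hypothetical, and it assumes a simple CSV with no quoted fields spanning several lines.

```shell
#!/bin/sh
# add_column HEADER VALUE INPUT OUTPUT  (hypothetical wrapper around the one-liner)
# Appends HEADER to the first line and VALUE to every other line.
# INPUT and OUTPUT must be different files, since OUTPUT is truncated first.
add_column() {
    awk -v h="$1" -v d="$2" 'BEGIN {FS = OFS = ","} {print $0, (NR > 1 ? d : h)}' "$3" > "$4"
}
```

    With the variables from the question, the call would be `add_column cmpgn_group 2017_01 "$ORIG_FILE" "$NEW_FILE"`. Quoting the file variables keeps paths containing spaces intact.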