
Cleaning up a JSON string with regular expressions

  • Erin  ·  5 years ago

    I currently have the following JSON string:

    '{"ECommerce ":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":},"Tools ":{" Grunt ":," Gulp ":," Vagrant ":},"Containers ":{" LXC ":," Docker ":," Rocket ":},"Digital ":{" SEO ":," Email Marketing ":," Inbound Marketing ":," Crowdfunding ":," Content Distribution ":," Display Advertising ":," Ad Planning and Buying ":," Article Writing ":," SEM ":," Customer Relationship Management ":," Viral Marketing ":," Market Research ":," Social Media ":," Affiliate Marketing ":," Lead Generation ":},"Performance ":{" LoadStorm ":," httperf ":," JMeter ":," LoadUI ":," Blazemeter ":," LoadImpact ":," Nouvola ":," LoadRunner ":," Soasta CloudTest ":},
    

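    It has a mix of colons, quotes, and extra curly braces inside the {}. I want to get rid of these characters so that I can turn the string into a Python dict. Is there a way to use a regular expression to remove the extraneous characters (that is, the " : { characters) only where they occur inside the inner {} brackets, so that the first colon after the first key "ECommerce" is left in place?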

    I have bolded the characters that I think will raise a JSONDecodeError:

    {"ECommerce ":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":}

    If this is not possible, what other approach could I use to handle this?

  •   GirkovArpa    5 years ago

    const string = '{"ECommerce ":{" Shopify ":," Magento ":," WooCommerce ":," Squarespace ":},"Tools ":{" Grunt ":," Gulp ":," Vagrant ":},"Containers ":{" LXC ":," Docker ":," Rocket ":},"Digital ":{" SEO ":," Email Marketing ":," Inbound Marketing ":," Crowdfunding ":," Content Distribution ":," Display Advertising ":," Ad Planning and Buying ":," Article Writing ":," SEM ":," Customer Relationship Management ":," Viral Marketing ":," Market Research ":," Social Media ":," Affiliate Marketing ":," Lead Generation ":},"Performance ":{" LoadStorm ":," httperf ":," JMeter ":," LoadUI ":," Blazemeter ":," LoadImpact ":," Nouvola ":," LoadRunner ":," Soasta CloudTest ":},';
    
    const json = string
      .replace(/\s*"\s*/g, '"') // trim stray spaces around quotes, keeping spaces inside names like "Email Marketing"
      .replace(/(?!^){/g, '[') // replace braces (except the first) with brackets
      .replace(/}/g, ']') // replace closing braces with brackets
      .replace(/:]/g, ']') // remove erroneous colons before brackets
      .replace(/:,/g, ',') // remove erroneous colons before commas
      .replace(/,$/, '}'); // replace the trailing comma with the closing brace
    
    console.log(JSON.parse(json));
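
Since the question asks for a Python dict, the same chain of replacements can be sketched in Python. This is only a rough port, not a general repair tool: it assumes the input always has the shape shown in the question (one outer object, one level of inner braces, bare names with dangling colons, and a trailing comma). The function name `clean_to_dict` and the shortened `SAMPLE` string are made up for illustration.

```python
import json
import re

# Shortened, hypothetical sample with the same shape as the string in the question.
SAMPLE = '{"ECommerce ":{" Shopify ":," Magento ":},"Digital ":{" Email Marketing ":," SEO ":},'


def clean_to_dict(raw: str) -> dict:
    """Convert the malformed string into a dict of category -> list of names.

    Assumes exactly the structure shown in the question; other inputs may
    still fail with json.JSONDecodeError.
    """
    text = raw.strip()
    text = re.sub(r'\s*"\s*', '"', text)   # trim stray spaces around quotes
    text = re.sub(r'(?!^)\{', '[', text)   # every "{" except the first opens a list
    text = text.replace('}', ']')          # every "}" closes a list
    text = text.replace(':]', ']')         # drop the dangling colon before "]"
    text = text.replace(':,', ',')         # drop the dangling colon before ","
    text = re.sub(r',$', '}', text)        # the trailing comma becomes the final "}"
    return json.loads(text)


print(clean_to_dict(SAMPLE))
# {'ECommerce': ['Shopify', 'Magento'], 'Digital': ['Email Marketing', 'SEO']}
```

Each category ends up as a list because the inner entries have keys but no values. If a dict of empty values is preferred instead, the colons would need to be filled with a placeholder such as `null` rather than removed.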