
What regex will remove "note:" and "firstName:" from the left of a string?

  •  2
  • Edward Tanguay  · Tech Community  · 15 years ago

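    I need to strip a "label" off the front of strings, e.g.

    note: this is a note

    needs to return:

    note

    this is a note

    I've produced the following code example but am having trouble with the regexes.

    What code do I need in the two ??????????? areas below so that I get the expected results shown in the comments?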

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    
    namespace TestRegex8822
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<string> lines = new List<string>();
                lines.Add("note: this is a note");
                lines.Add("test:    just a test");
                lines.Add("test:\t\t\tjust a test");
                lines.Add("firstName: Jim"); //"firstName" IS a label because it does NOT contain a space
                lines.Add("She said this to him: follow me."); //this is NOT a label since there is a space before the colon
                lines.Add("description: this is the first description");
                lines.Add("description:this is the second description"); //no space after colon
                lines.Add("this is a line with no label");
    
                foreach (var line in lines)
                {
                    Console.WriteLine(StringHelpers.GetLabelFromLine(line));
                    Console.WriteLine(StringHelpers.StripLabelFromLine(line));
                    Console.WriteLine("--");
                    //note
                    //this is a note
                    //--
                    //test
                    //just a test
                    //--
                    //test
                    //just a test
                    //--
                    //firstName
                    //Jim
                    //--
                    //
                    //She said this to him: follow me.
                    //--
                    //description
                    //this is the first description
                    //--
                    //description
                    //this is the second description
                    //--
                    //
                    //this is a line with no label
                    //--
    
                }
                Console.ReadLine();
            }
        }
    
        public static class StringHelpers
        {
            public static string GetLabelFromLine(this string line)
            {
                string label = line.GetMatch(@"^?:(\s)"); //???????????????
                if (!label.IsNullOrEmpty())
                    return label;
                else
                    return "";
            }
    
            public static string StripLabelFromLine(this string line)
            {
                return ...//???????????????
            }
    
            public static bool IsNullOrEmpty(this string line)
            {
                return String.IsNullOrEmpty(line);
            }
        }
    
        public static class RegexHelpers
        {
            public static string GetMatch(this string text, string regex)
            {
                Match match = Regex.Match(text, regex);
                if (match.Success)
                {
                    string theMatch = match.Groups[0].Value;
                    return theMatch;
                }
                else
                {
                    return null;
                }
            }
        }
    }
    

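    Addendum

    @Keltex, I incorporated your idea as shown below, but it doesn't match any of the text (all entries come out blank). What do I need to adjust in the regex?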

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    
    namespace TestRegex8822
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<string> lines = new List<string>();
                lines.Add("note: this is a note");
                lines.Add("test:    just a test");
                lines.Add("test:\t\t\tjust a test");
                lines.Add("firstName: Jim"); //"firstName" IS a label because it does NOT contain a space
                lines.Add("first name: Jim"); //"first name" is not a label because it contains a space
                lines.Add("description: this is the first description");
                lines.Add("description:this is the second description"); //no space after colon
                lines.Add("this is a line with no label");
    
                foreach (var line in lines)
                {
                    LabelLinePair llp = line.GetLabelLinePair();
                    Console.WriteLine(llp.Label);
                    Console.WriteLine(llp.Line);
                    Console.WriteLine("--");
                }
                Console.ReadLine();
            }
        }
    
        public static class StringHelpers
        {
            public static LabelLinePair GetLabelLinePair(this string line)
            {
                Regex regex = new Regex(@"(?<label>.+):\s*(?<text>.+)");
                Match match = regex.Match(line); 
                LabelLinePair labelLinePair = new LabelLinePair();
                labelLinePair.Label = match.Groups["label"].ToString();
                labelLinePair.Line = match.Groups["line"].ToString();
                return labelLinePair;
            }
        }
    
        public class LabelLinePair
        {
            public string Label { get; set; }
            public string Line { get; set; }
        }
    
    }
    

    Solved:

    OK, I got it working, plus a small hack to handle labels that contain spaces. This is exactly what I wanted, thanks!

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    
    namespace TestRegex8822
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<string> lines = new List<string>();
                lines.Add("note: this is a note");
                lines.Add("test:    just a test");
                lines.Add("test:\t\t\tjust a test");
                lines.Add("firstName: Jim"); //"firstName" IS a label because it does NOT contain a space
                lines.Add("first name: Jim"); //"first name" is not a label because it contains a space
                lines.Add("description: this is the first description");
                lines.Add("description:this is the second description"); //no space after colon
                lines.Add("this is a line with no label");
                lines.Add("she said to him: follow me");
    
                foreach (var line in lines)
                {
                    LabelLinePair llp = line.GetLabelLinePair();
                    Console.WriteLine(llp.Label);
                    Console.WriteLine(llp.Line);
                    Console.WriteLine("--");
                }
                Console.ReadLine();
            }
        }
    
        public static class StringHelpers
        {
            public static LabelLinePair GetLabelLinePair(this string line)
            {
                Regex regex = new Regex(@"(?<label>.+):\s*(?<text>.+)");
                Match match = regex.Match(line); 
                LabelLinePair llp = new LabelLinePair();
                llp.Label = match.Groups["label"].ToString();
                llp.Line = match.Groups["text"].ToString();
    
                if (llp.Label.IsNullOrEmpty() || llp.Label.Contains(" "))
                {
                    llp.Label = "";
                    llp.Line = line;
                }
    
                return llp;
            }
    
            public static bool IsNullOrEmpty(this string line)
            {
                return String.IsNullOrEmpty(line);
            }
        }
    
        public class LabelLinePair
        {
            public string Label { get; set; }
            public string Line { get; set; }
        }
    
    }
    
    3 answers  |  as of 15 years ago
        1
  •  3
  •   Keltex    15 years ago

    It would probably look something like this:

    Regex myreg = new Regex(@"(?<label>.+):\s*(?<text>.+)");
    
    Match mymatch = myreg.Match(text); 
    
    if(mymatch.Success) // Match exposes Success; IsMatch is a method on Regex
    { 
        Console.WriteLine("label: "+mymatch.Groups["label"]); 
        Console.WriteLine("text: "+mymatch.Groups["text"]); 
    }
    

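    I've used named groups above, but you could do without them. I also think this is a bit more efficient than making two method calls: a single regex gets both the label and the text.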
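
    One thing to watch: `.+` in the pattern above is greedy and also matches spaces, so a line like "She said this to him: follow me." would still produce a label. Below is a minimal sketch of one possible tightening, assuming a label is a run of characters at the very start of the line that contains neither a colon nor whitespace. The names `LabelParser` and `Split` are hypothetical and not from the question, and the tuple syntax assumes C# 7 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LabelParser // hypothetical helper name
{
    // ^             anchor at the start of the line
    // (?<label>...) one or more chars that are neither ':' nor whitespace
    // :\s*          the colon plus any spaces or tabs that follow it
    // (?<text>.*)   the rest of the line
    private static readonly Regex LabelRegex =
        new Regex(@"^(?<label>[^:\s]+):\s*(?<text>.*)$");

    public static (string Label, string Text) Split(string line)
    {
        Match m = LabelRegex.Match(line);
        return m.Success
            ? (m.Groups["label"].Value, m.Groups["text"].Value)
            : ("", line); // no label found: return the line unchanged
    }
}

public static class Program
{
    public static void Main()
    {
        var (label, text) = LabelParser.Split("test:\t\t\tjust a test");
        Console.WriteLine(label); // test
        Console.WriteLine(text);  // just a test
    }
}
```

    With this pattern the space check happens inside the regex, so no post-processing of the label should be needed.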

        2
  •  5
  •   jball    15 years ago

    Couldn't you simply split the string on the first colon, and treat a line with no colon as having no label?

    public static class StringHelpers 
    { 
        public static string GetLabelFromLine(this string line) 
        { 
            int separatorIndex = line.IndexOf(':');
            if (separatorIndex > 0)
            {
                string possibleLabel = line.Substring(0, separatorIndex).Trim();
                // A label must not contain a space.
                if (possibleLabel.IndexOf(' ') < 0) 
                {
                    return possibleLabel;
                }
            }
            // No colon, or the text before it is not a valid label.
            return string.Empty;
        } 
    
        public static string StripLabelFromLine(this string line) 
        { 
            // Strip only when the text before the first colon is a valid label,
            // so "She said this to him: follow me." is returned unchanged.
            if (line.GetLabelFromLine().Length > 0)
            {
                int separatorIndex = line.IndexOf(':');
                return line.Substring(separatorIndex + 1).Trim();
            }
            return line;
        } 
    
        public static bool IsNullOrEmpty(this string line) 
        { 
            return String.IsNullOrEmpty(line); 
        } 
    } 
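
    The same split-on-the-first-colon idea can also be sketched with `String.Split` and a count of 2, which cuts at the first colon only and leaves later colons in the text. This is an illustrative variant, not the answer's own code; `SplitDemo` and `SplitLabel` are hypothetical names, and the "no spaces in a label" rule is taken from the question.

```csharp
using System;

public static class SplitDemo // hypothetical name
{
    // Returns { label, text }; label is "" when the line has no valid label.
    public static string[] SplitLabel(string line)
    {
        // A count of 2 splits at the first ':' only.
        string[] parts = line.Split(new[] { ':' }, 2);
        if (parts.Length == 2)
        {
            string label = parts[0].Trim();
            // Per the question, a label is non-empty and contains no space.
            if (label.Length > 0 && label.IndexOf(' ') < 0)
            {
                return new[] { label, parts[1].Trim() };
            }
        }
        return new[] { "", line };
    }

    public static void Main()
    {
        string[] result = SplitLabel("firstName: Jim");
        Console.WriteLine(result[0]); // firstName
        Console.WriteLine(result[1]); // Jim
    }
}
```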
    
        3
  •  1
  •   polygenelubricants    15 years ago

    This regex works (see it in action on rubular):

    (?: *([^:\s]+) *: *)?(.+)
    

    This captures the label (if any) into \1 and the body into \2.

    It makes generous allowances for whitespace, so labels can be indented, etc.
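
    As a rough illustration of how this pattern might be used from C#, the sketch below reads the two numbered groups with `Groups[1]` and `Groups[2]`. The class name `LabelRegexDemo` and the method `Describe` are hypothetical; the pattern itself is the one given above, unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LabelRegexDemo // hypothetical name
{
    private static readonly Regex Pattern =
        new Regex(@"(?: *([^:\s]+) *: *)?(.+)");

    // Formats a line as "[label] [body]"; the label is empty when the
    // optional group does not participate in the match.
    public static string Describe(string line)
    {
        Match m = Pattern.Match(line);
        return "[" + m.Groups[1].Value + "] [" + m.Groups[2].Value + "]";
    }

    public static void Main()
    {
        Console.WriteLine(Describe("note: this is a note"));
        // [note] [this is a note]
        Console.WriteLine(Describe("She said this to him: follow me."));
        // [] [She said this to him: follow me.]
        Console.WriteLine(Describe("description:this is the second description"));
        // [description] [this is the second description]
    }
}
```

    Because the pattern uses a literal space (` *`) rather than `\s*`, tabs after the colon, as in "test:\t\t\tjust a test", stay in the body. Calling `Trim()` on the second group is one way to remove them.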