Note: this is about .NET regular expressions. I have lines of text in the following form:
type Name(type arg1, type arg2, type arg3)
To match this, I came up with the following regular expression:
^(\w+)\s+(\w+)\s*\(\s*((\w+)\s+(\w+)(:?,\s+)?)*\s*\)$
This mess produces a match object that looks like this:
Group 0: type Name(type arg1, type arg2, type arg3)
Capture 0: type Name(type arg1, type arg2, type arg3)
Group 1: type
Capture 0: type
Group 2: Name
Capture 0: Name
Group 3: type arg3
Capture 0: type arg1,
Capture 1: type arg2,
Capture 2: type arg3
Group 4: type
Capture 0: type
Capture 1: type
Capture 2: type
Group 5: arg3
Capture 0: arg1
Capture 1: arg2
Capture 2: arg3
Group 6:
Capture 0: ,
Capture 1: ,
However, that does not cover all of the input. Some of the lines may look like this:
type Name(type arg1, type[] arg2, type arg3)
Note the brackets before arg2. To handle them, I changed the regular expression to:
^(\w+)\s+(\w+)\s*\(\s*((\w+)\s*(\[\])?\s+(\w+)(:?,\s+)?)*\s*\)$
This produces a match like the following:
Group 0: type Name(type arg1, type[] arg2, type arg3)
Capture 0: type Name(type arg1, type[] arg2, type arg3)
Group 1: type
Capture 0: type
Group 2: Name
Capture 0: Name
Group 3: type arg3
Capture 0: type arg1,
Capture 1: type[] arg2,
Capture 2: type arg3
Group 4: type
Capture 0: type
Capture 1: type
Capture 2: type
Group 5: []
Capture 0: []
Group 6: arg3
Capture 0: arg1
Capture 1: arg2
Capture 2: arg3
Group 7:
Capture 0: ,
Capture 1: ,
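Is there any way to associate this capture with the appropriate group, or am I barking up the wrong tree?
Edit:
To clarify, I am not building a language parser. I am converting old plain-text API documentation for a scripting language, which looks like this: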
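One possible way to make the association, offered only as a sketch: every Capture in .NET exposes Index and Length, so a capture from the optional [] group can be assigned to whichever per-argument capture contains it. The code below makes some assumptions that are not in the original pattern: the groups are renamed (ret, method, arg, type, array, name are my own names), the separator is written as the non-capturing (?:,\s+)?, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureCorrelation
{
    // Same shape as the pattern in the question, but with named groups
    // (names are illustrative) and a non-capturing separator group.
    private static readonly Regex Pattern = new Regex(
        @"^(?<ret>\w+)\s+(?<method>\w+)\s*\(\s*(?<arg>(?<type>\w+)\s*(?<array>\[\])?\s+(?<name>\w+)(?:,\s+)?)*\s*\)$");

    public static List<(string Type, bool IsArray, string Name)> ParseArgs(string line)
    {
        var result = new List<(string Type, bool IsArray, string Name)>();
        Match m = Pattern.Match(line);
        if (!m.Success) return result;

        CaptureCollection args = m.Groups["arg"].Captures;
        CaptureCollection types = m.Groups["type"].Captures;
        CaptureCollection names = m.Groups["name"].Captures;
        CaptureCollection arrays = m.Groups["array"].Captures;

        // "type" and "name" are mandatory in every repetition, so their
        // captures line up with "arg" by index. "array" is optional, so it
        // has to be matched to an argument by position in the input instead.
        for (int i = 0; i < args.Count; i++)
        {
            Capture arg = args[i];
            bool isArray = false;
            foreach (Capture a in arrays)
            {
                bool inside = a.Index >= arg.Index
                    && a.Index + a.Length <= arg.Index + arg.Length;
                if (inside) { isArray = true; break; }
            }
            result.Add((types[i].Value, isArray, names[i].Value));
        }
        return result;
    }

    public static void Main()
    {
        string line = "type Name(type arg1, type[] arg2, type arg3)";
        foreach (var arg in ParseArgs(line))
        {
            Console.WriteLine($"{arg.Type}{(arg.IsArray ? "[]" : "")} {arg.Name}");
        }
    }
}
```

For the sample line, only the second argument is reported as an array. Whether this is preferable to matching the argument list in a separate pass is a judgment call; it keeps a single pattern at the cost of some bookkeeping.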
--- foo object ---
void bar(int baz)
* This does something.
* Remember blah blah blah.
int getFrob()
* Gets the frob
into a new format that I can export to HTML and so on.
Edit mkII:
m = Regex.Match(line, @"^(\w+)\s+(\w+)\s*\((.*?)\)$");
if (m.Success) {
    // A new method line starts here, so store the previous member first.
    if (curMember != null) {
        curType.Add(curMember);
    }
    curMember = new XElement("method");
    curMember.Add(new XAttribute("type", m.Groups[1].Value));
    curMember.Add(new XAttribute("name", m.Groups[2].Value));
    if (m.Groups[3].Success) {
        XElement args = new XElement("arguments");
        // Second pass: match each "type name" pair inside the parentheses.
        MatchCollection matches = Regex.Matches(m.Groups[3].Value, @"(\w+)(\[\])?\s+(\w+)");
        foreach (Match m2 in matches) {
            XElement arg = new XElement("arg");
            arg.Add(new XAttribute("type", m2.Groups[1].Value));
            // Group 2 succeeds only when this argument carries "[]".
            if (m2.Groups[2].Success) {
                arg.Add(new XAttribute("array", "array"));
            }
            arg.Value = m2.Groups[3].Value;
            args.Add(arg);
        }
        curMember.Add(args);
    }
}
First, it matches the
type Name(*)
part. Once it has that, it matches
type Name
repeatedly within the argument section.