
Has anyone written an AST-aware regex engine?

Examples: s/(?ast:comment)foo/bar/ to replace foo with bar, but only in comments. Or s/(\w+): (?ast:openparen)(.*?)(?ast:closeparen)\s+(?ast:openbracket)(.*?)(?ast:closebracket)/function $1($2)\n{\n$3\n}\n/ to convert from

    foo: (args) {
      codeblock
    }
to

    function foo(args) {
      codeblock
    }
etc..?


semgrep will at least do the search portion; I'm not sure about replace. It was posted a few days ago:

https://news.ycombinator.com/item?id=23919313

Slightly related:

How to Build Static Checking Systems Using Orders of Magnitude Less Code

https://web.stanford.edu/~mlfbrown/paper.pdf

They use "microgrammars" and find that the technique is surprisingly effective despite not being "correct".


Perl 6 extends the regex system into a full-fledged recursive descent parser, which lets you do this kind of stuff. But as is pointed out in a sibling comment, a purely regular regex engine won't really do it, since the language being matched is non-regular.

Of course, some regex engines do let you do this (notably Perl 5), but it requires code of the "just because you can doesn't mean you should" variety.
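To make the "non-regular" point concrete, here is a minimal Python sketch of the one thing a purely regular engine can't do: track nesting depth to find the delimiter that closes a given opening one. The helper name match_balanced is made up for illustration, and it deliberately ignores strings and comments, which is exactly where a real parser would have to take over.

```python
def match_balanced(text: str, start: int, open_ch: str = "(", close_ch: str = ")") -> int:
    """Return the index just past the delimiter that closes text[start].

    Hypothetical helper: it only counts nesting depth and knows nothing
    about string literals or comments in the surrounding language.
    """
    if text[start] != open_ch:
        raise ValueError(f"expected {open_ch!r} at index {start}")
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                # Found the delimiter that balances the one at `start`.
                return i + 1
    raise ValueError("unbalanced delimiters")


snippet = "foo: (a, (b, c)) { body }"
open_idx = snippet.index("(")
args = snippet[open_idx:match_balanced(snippet, open_idx)]
```

A lazy `(.*?)` followed by a literal `)` would stop at the first closing paren here, while the depth counter returns the whole nested argument list.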


That's awkward because parsing source code into an AST usually can't be done with a regular language (most programming languages aren't regular), so a language-specific parser would be needed on top anyway.

This stuff is useful, though, and easy to do in Lisps and Schemes.
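As a rough sketch of the "language-specific parser on top" idea, here is the comment-only replacement from the question done for Python source, assuming the standard-library tokenize module as the language-aware layer and plain re for the substitution. The function name sub_in_comments is hypothetical, and this only approximates what a real (?ast:comment) anchor would do.

```python
import io
import re
import tokenize


def sub_in_comments(source: str, pattern: str, repl: str) -> str:
    """Apply re.sub(pattern, repl, ...) only inside Python comments.

    Hypothetical helper: the tokenizer decides what counts as a comment,
    so a '#' inside a string literal is left alone.
    """
    lines = source.splitlines(keepends=True)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        row, col = tok.start   # row is 1-based, col is 0-based
        end_col = tok.end[1]   # a comment never spans lines
        line = lines[row - 1]
        new_comment = re.sub(pattern, repl, tok.string)
        # A comment is the last token on its line, so splicing it in
        # does not shift any token we still need to visit.
        lines[row - 1] = line[:col] + new_comment + line[end_col:]
    return "".join(lines)


code = 'foo = 1  # foo is set here\nprint("foo # foo")\n# foo again\n'
result = sub_in_comments(code, r"\bfoo\b", "bar")
```

Only the two real comments are rewritten; the identifier foo and the string containing "# foo" are untouched. The same shape works for other languages if you swap in a tokenizer or parser for that language.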



