~groovestomp/c-parser

f3fb07eeec5757ecaa11d0aba434d876b4fe068b — Aaron Oman 2 years ago 4c15910
Remove unicode support from TODO and update README
2 files changed, 6 insertions(+), 23 deletions(-)

M README.md => README.md +6 -20
@@ 1,40 1,26 @@
# C Parser

## Motivation
My original intent with this project, back in 2015, was to start writing a compiler.

I got to the point of a functioning C parser and then stopped.

Since then, my motivation to write a compiler has almost entirely evaporated; but I still see a lot of value in having a _good_ C Parser that I understand; so I intend to maintain this.
Provide a library to interact with C source code.  This might include writing a
full-blown compiler, building a language server, manipulating source code,
integrating with an IDE, providing better debugging facilities, or any of a
number of other possibilities.

## Caveats

Only tested in Linux.

There is no support for the C preprocessor at all.  Any preprocessor commands
are simply ignored, as are comments.

The parser does parse some C programs, like the provided sample.c.
There's a really naive parse tree implementation to aid with debugging.
The `parse' subcommand has a `--show-parse-tree' option that will display this
parse tree.  Warning! It's really big!

## Dependencies

- linux
- gcc
- bash
  This is explicitly invoked in the Makefile, but I also test with tcc and pcc.

This might work with other shells and other C compilers, but no effort has been
made to be POSIX shell compliant, nor to follow the C standard. (Specifically
ignoring GNU extensions.)

## Installation

Clone this repo and chdir to it.
- posix-compliant shell

## Build

These commands output an executable named `cparser`

    $ make debug

M TODO => TODO +0 -3
@@ 1,9 1,6 @@
- Integer lexing is not correct? Double check C reference and
  integer-constants.txt test file.

- Unicode support? C unicode libraries: icu, utf8proc, *microutf8*, *nunicode*
  Both microutf8 and nunicode look quite good.

- Support the C preprocessor.

- Standard arg parsing.
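
For the integer-lexing item in the TODO above, here is a minimal sketch of a checker for C11 integer constants (decimal, octal, and hexadecimal, with an optional suffix). The names `is_integer_constant` and `skip_integer_suffix` are hypothetical and are not taken from this repository. The sketch assumes the whole string is one candidate token, and it does not handle C23 binary literals or digit separators.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: skip an optional integer-suffix (u, l, ll and
 * their combinations) and return a pointer just past it. */
static const char *skip_integer_suffix(const char *s)
{
    const char *p = s;
    bool seen_unsigned = false;

    if (*p == 'u' || *p == 'U') {
        seen_unsigned = true;
        p++;
    }
    if (*p == 'l' || *p == 'L') {
        char first = *p++;
        if (*p == first)        /* "ll" or "LL", never "lL" or "Ll" */
            p++;
    }
    /* The unsigned suffix may also come after the long suffix. */
    if (!seen_unsigned && (*p == 'u' || *p == 'U'))
        p++;
    return p;
}

/* Hypothetical checker: true if s is, in its entirety, a C11 integer
 * constant. */
bool is_integer_constant(const char *s)
{
    const char *p = s;

    if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X')) {
        /* Hexadecimal: at least one hex digit must follow the prefix. */
        p += 2;
        if (!isxdigit((unsigned char)*p))
            return false;
        while (isxdigit((unsigned char)*p))
            p++;
    } else if (*p == '0') {
        /* Octal: a leading 0 followed by digits 0-7 only. */
        p++;
        while (*p >= '0' && *p <= '7')
            p++;
    } else if (*p >= '1' && *p <= '9') {
        /* Decimal: a nonzero digit followed by any digits. */
        p++;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        return false;
    }

    p = skip_integer_suffix(p);
    return *p == '\0';
}
```

Inputs such as `08`, `0x`, and `42lL` are rejected, which are the kinds of edge cases worth comparing against a test file like integer-constants.txt.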