How to have an optional first line? #320

Closed
prokie opened this issue Apr 5, 2024 · 11 comments

@prokie

prokie commented Apr 5, 2024

Hi!

I am new to parsing and am trying to learn by writing a SPICE netlist parser. The first line of a SPICE netlist is an optional title string that names the circuit. How would I write that in my parol grammar file?

This is what I have so far.

%start SpiceParser
%title "SpiceParser grammar"
%comment "Empty grammar generated by `parol`"
%line_comment "//"
%auto_newline_off

%%
SpiceParser
    : { Resistor | Capacitor | Inductor | VoltageSource | CurrentSource } END
    ;

Resistor
    : "R"^ Identifier Identifier Identifier Identifier "\n"
    ;

Capacitor
    : "C"^ Identifier Identifier Identifier Identifier "\n"
    ;

Inductor
    : "L"^ Identifier Identifier Identifier Identifier "\n"
    ;

VoltageSource
    : "V"^ Identifier Identifier Identifier Identifier Identifier "\n"
    ;

CurrentSource
    : "I"^ Identifier Identifier Identifier Identifier "\n"
    ;

END : ".END"
    | ".END" "\n"^
    ;

Identifier
    : /[a-zA-Z0-9_]+/
    ;

@jsinger67
Owner

Hi @prokie,
It's great to see you're using parol for learning.

To fill me in, could you please give me an example of the beginning of an input that your new parser should be able to parse? That would help me understand the situation.

@jsinger67
Owner

jsinger67 commented Apr 5, 2024

OK, I did some research on the Internet.
If I understand correctly, the optional titles are comments, aren't they? If so, you can use the %line_comment directive to define the start of a line comment.

%line_comment '*'

@prokie
Author

prokie commented Apr 5, 2024

Oh, I actually messed up. The first line in a SPICE netlist is always the title. So I somehow need to have parol match the first line of the file as the title of the SPICE circuit.

Example netlist
v1 1 0 dc 15
r1 1 0 2.2k
r2 1 2 3.3k     
r3 2 0 150
.end

@jsinger67
Owner

Understood.
I would define a non-terminal Title that comes before the repetition in your start symbol.

Just a shot:

Title: /[^\n\r]+/ Newline;

Keep one thing in mind:
You switched auto newline off, so you have to handle newlines in your grammar yourself.
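
To see what the suggested Title rule would consume, here is a minimal sketch in Python rather than parol. It reuses the two character classes from the rule above and applies them to a shortened version of the example netlist, assuming that "Example netlist" is the title line. The function name `split_title` is made up for this illustration, and Python's `re` module stands in for parol's scanner.

```python
import re

# Same character classes as the suggested rule: the title is everything up to
# the first line break, Newline is one or more line-break characters.
TITLE = re.compile(r"[^\n\r]+")
NEWLINE = re.compile(r"[\n\r]+")


def split_title(netlist: str) -> tuple[str, str]:
    """Return (title, rest), where rest starts after the title's line break."""
    title = TITLE.match(netlist)
    if title is None:
        raise ValueError("netlist does not start with a title line")
    newline = NEWLINE.match(netlist, title.end())
    if newline is None:
        raise ValueError("title line is not terminated by a newline")
    return title.group(), netlist[newline.end():]


# Assumption: "Example netlist" is the title line of the posted example.
example = "Example netlist\nv1 1 0 dc 15\nr1 1 0 2.2k\n.end\n"
title, rest = split_title(example)
print(title)                  # Example netlist
print(rest.splitlines()[0])   # v1 1 0 dc 15
```

Because the newline is matched explicitly, the sketch also shows why the grammar needs a Newline terminal once auto newline is switched off: without it, nothing marks where the title stops.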

@jsinger67
Owner

Here is a starting point.
Have fun 🚀

%start SpiceParser
%title "SpiceParser grammar"
%comment "Empty grammar generated by `parol`"
%line_comment "\*"
%auto_newline_off

%%

SpiceParser
    : Title { Element } End
    ;

Title
    : /[^\n\r]+/ Newline^
    ;

Element
    : Resistor
    | Capacitor
    | Inductor
    | VoltageSource
    | CurrentSource
    ;

Resistor
    : 'R'^ Identifier Identifier Identifier Identifier Newline^
    ;

Capacitor
    : 'C'^ Identifier Identifier Identifier Identifier Newline^
    ;

Inductor
    : 'L'^ Identifier Identifier Identifier Identifier Newline^
    ;

VoltageSource
    : 'V'^ Identifier Identifier Identifier Identifier Identifier Newline^
    ;

CurrentSource
    : 'I'^ Identifier Identifier Identifier Identifier Newline^
    ;

End : /(?i)\.END/ [ Newline^ ]
    ;

Identifier
    : /[a-zA-Z0-9_]+/
    ;

Newline
    : /[\n\r]+/
    ;

@prokie
Author

prokie commented Apr 5, 2024

Thanks for the starting point, I am still struggling with what I currently have, but the starting point is great for reference. I will continue to work on it. I will let you know if I have any questions. Thanks again.

@jsinger67
Owner

jsinger67 commented Apr 5, 2024

I think you have to modify a few things to make this parser work.
First, consider ignoring case for the r, c, l, v, i prefixes, just like I did for the non-terminal End.
If the syntax for these elements (r, c, l, v, i) specifies that a number follows, consider including this number in the regex as well (e.g. /(?i)r\d+/); otherwise you'll struggle with whitespace too.
Unfortunately, I'm no expert on SPICE netlists.
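
The effect of the `(?i)` flag can be checked outside parol. The following is a small Python sketch, not parol code; the `\d+` suffix follows the example regex in this comment and is an assumption about how element names look, not a rule taken from the netlist format.

```python
import re

# Patterns modelled on the suggestion above. The \d+ suffix is an assumption
# about element names, not something the netlist format guarantees.
CASE_SENSITIVE = re.compile(r"R\d+")
CASE_INSENSITIVE = re.compile(r"(?i)r\d+")

for name in ["R1", "r1", "r22"]:
    print(
        name,
        bool(CASE_SENSITIVE.fullmatch(name)),
        bool(CASE_INSENSITIVE.fullmatch(name)),
    )
```

Only the case-insensitive pattern accepts the lower-case names used in the example netlist (`r1`, `r2`, `r3`). Folding the number into the same regex also means that the letter and the number form a single token, so no whitespace can be skipped between them.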

@prokie
Author

prokie commented Apr 6, 2024

Hi again. I made my problem smaller to try to get a better starting point, but I don't really know how to get around the following issue.

I am only using resistors now and have skipped the nodes and identifiers. I guess I somehow need to tell the parser not to look for another title after finding the first one.

%start Spice
%title "Spice grammar"
%comment "Empty grammar generated by `parol`"
%line_comment "\*"

%%

Spice
    : Title { Resistor } End
    ;

Title
    : "[a-zA-Z0-9]+"
    ;

End : ".END"
    ;

Resistor
    : ResistorIdentifier
    ;

ResistorIdentifier
    : "R[a-zA-Z0-9]+"
    ;

Input:
Blaa
R1
.END

This gives me the error that it expected R1 to be a title.

@jsinger67
Owner

Yes, I understand.
The problem is a token conflict: Title eats up the ResistorIdentifier, because both terminals match R1 and Title appears first in the grammar.
One solution could be to move Title behind ResistorIdentifier in the grammar.
The other one is more complicated and involves using scanner states.
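
The conflict can be reproduced with a toy scanner. The sketch below is Python, not parol, and assumes a simplified model in which terminals are tried in the order they are listed and the first match wins. Whether parol's scanner behaves exactly like this is an assumption made for illustration. The `scan` helper and the escaped dot in the End pattern are choices of this sketch and do not come from the grammar above.

```python
import re


def scan(text, terminals):
    """Tokenize `text`, trying `terminals` in order; the first match wins."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace and line breaks are skipped in this toy
            continue
        for name, pattern in terminals:
            match = re.compile(pattern).match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"no terminal matches at offset {pos}")
    return tokens


TITLE = ("Title", r"[a-zA-Z0-9]+")
END = ("End", r"\.END")  # dot escaped here; an assumption of this sketch
RESISTOR = ("ResistorIdentifier", r"R[a-zA-Z0-9]+")

source = "Blaa\nR1\n.END\n"

# Order as in the reduced grammar: Title comes first and claims "R1".
print(scan(source, [TITLE, END, RESISTOR]))
# [('Title', 'Blaa'), ('Title', 'R1'), ('End', '.END')]

# Title moved behind ResistorIdentifier: "R1" is now a ResistorIdentifier.
print(scan(source, [END, RESISTOR, TITLE]))
# [('Title', 'Blaa'), ('ResistorIdentifier', 'R1'), ('End', '.END')]
```

Reordering only helps as long as the title does not look like a resistor. In this model a title such as `RC filter` would start with a ResistorIdentifier token, which is the kind of case the scanner-state approach is meant to handle.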

@jsinger67
Owner

Hi @prokie

Here is a grammar that worked for me with your first example.
I hope this helps.

%start SpiceParser
%title "SpiceParser grammar"
%comment "Empty grammar generated by `parol`"
%line_comment "\*"
%auto_newline_off

%scanner TitleScanner {
    %line_comment "\*"
    %auto_newline_off
}

%%

SpiceParser
    : Title { Element } End
    ;

Title
    : [ Newline^ ] %push(TitleScanner) NonNewline Newline^ %pop()
    ;

Element
    : Resistor
    | Capacitor
    | Inductor
    | VoltageSource
    | CurrentSource
    ;

Resistor
    : RElem^ Identifier Identifier Identifier Newline^
    ;

Capacitor
    : CElem^ Identifier Identifier Identifier Newline^
    ;

Inductor
    : LElem^ Identifier Identifier Identifier Newline^
    ;

VoltageSource
    : VElem^ Identifier Identifier Identifier Identifier Newline^
    ;

CurrentSource
    : IElem^ Identifier Identifier Identifier Newline^
    ;

End : /(?i)\.END(?-i)/ [ Newline^ ]
    ;

RElem
    : /(?i)R\d+(?-i)/
    ;

CElem
    : /(?i)C\d+(?-i)/
    ;

LElem
    : /(?i)L\d+(?-i)/
    ;

VElem
    : /(?i)V\d+(?-i)/
    ;

IElem
    : /(?i)I\d+(?-i)/
    ;

Identifier
    : /[a-zA-Z0-9_\.]+/
    ;

Newline
    : <INITIAL, TitleScanner>/[\n\r]+/
    ;

NonNewline
    : <TitleScanner>/[^\n\r]+/
    ;
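
The idea behind the scanner states in this grammar can be modelled in a few lines. The following Python sketch is a hand-rolled approximation, not parol's implementation. It assumes that each state has its own ordered list of terminals, that the first match wins, and that the state stack is popped after the title's Newline. Only a subset of the terminals is modelled, the optional leading Newline of Title is left out, and the names `TERMINALS`, `next_token` and `scan` are invented for the illustration.

```python
import re

# Terminals per scanner state, loosely mirroring the <INITIAL, TitleScanner>
# prefixes in the grammar above. The order within a state is the match priority.
TERMINALS = {
    "INITIAL": [
        ("End", r"(?i:\.END)"),
        ("RElem", r"(?i:R\d+)"),
        ("Identifier", r"[a-zA-Z0-9_.]+"),
        ("Newline", r"[\n\r]+"),
    ],
    "TitleScanner": [
        ("Newline", r"[\n\r]+"),
        ("NonNewline", r"[^\n\r]+"),
    ],
}


def next_token(text, pos, state):
    """Return (name, lexeme, new_pos) for the first terminal of `state` matching."""
    # Skip blanks and tabs only; line breaks are tokens (auto newline is off).
    while pos < len(text) and text[pos] in " \t":
        pos += 1
    for name, pattern in TERMINALS[state]:
        match = re.compile(pattern).match(text, pos)
        if match:
            return name, match.group(), match.end()
    raise ValueError(f"no terminal of state {state!r} matches at offset {pos}")


def scan(text):
    """Scan the title line in TitleScanner, everything after it in INITIAL."""
    text = text.rstrip(" \t")
    tokens, pos = [], 0
    stack = ["INITIAL", "TitleScanner"]  # models %push(TitleScanner)
    while pos < len(text):
        state = stack[-1]
        name, lexeme, pos = next_token(text, pos, state)
        tokens.append((state, name, lexeme))
        if state == "TitleScanner" and name == "Newline":
            stack.pop()  # models %pop() after the title line
    return tokens


tokens = scan("Resistor divider\nR1 1 0 2.2k\n.END\n")
print([name for _, name, _ in tokens])
```

In this model the title `Resistor divider` is scanned as a single NonNewline token even though it starts with an R, because RElem and Identifier are not active in the TitleScanner state.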

Keep in mind that, in your grammar processing, you have to extract the first identifier (the part that comes directly after the r, c, l, v, i) from the token's text itself.
Let me know if you need further assistance.
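
Extracting the identifier from the token text could look roughly like the sketch below. It is written in Python for brevity; in a parol project the same logic would live in the generated parser's semantic actions. The token shape (one element letter followed by digits) is taken from RElem, CElem, LElem, VElem and IElem above, while the function name `split_element_token` is hypothetical.

```python
import re

# Assumed token shape, modelled on RElem, CElem, LElem, VElem and IElem:
# one element letter followed by digits, matched case-insensitively.
ELEMENT_TOKEN = re.compile(r"(?i)([RCLVI])(\d+)")


def split_element_token(text: str) -> tuple[str, str]:
    """Split an element token such as "r12" into ("R", "12")."""
    match = ELEMENT_TOKEN.fullmatch(text)
    if match is None:
        raise ValueError(f"not an element token: {text!r}")
    kind, number = match.groups()
    return kind.upper(), number


print(split_element_token("R1"))   # ('R', '1')
print(split_element_token("v15"))  # ('V', '15')
```

Normalizing the letter to upper case keeps later processing independent of how the netlist author wrote the element name.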

@jsinger67
Owner

I'm closing the issue.
If you need further help, please let me know.
