Fortran grammar for tree sitter #122

stadelmanma · 2018-01-04T01:33:58Z

Hi, I took notice of the pull request to use this type of parser for Atom syntax highlighting. Fortran is a language that would greatly benefit from this type of parsing for a lot of the same reasons as other lower level languages C, C++, etc. I am not experienced with parsing a language based on a CFG but I have been doing a lot of reading and digging into the existing code base to get a handle on things.

I have set up a repository to start working on the grammar, modeling it after the ones already under this group (at the time of this post only the basic dot files and the like are in place).

I was going to base it on the Waite/Cordy grammar provided in the Grammar Zoo since it seemed the easiest to work with. I noticed the C grammar was based on content from the same website, so I thought it would be a good starting point. I'll cross-reference this with the syntax highlighting grammar defined in the language-fortran package to reduce the odds of missing anything from newer standards. The main thing I am not sure how to handle is the difference between Free Form and Fixed Form Fortran.

If you have any tips beyond the example in the README on how best to proceed, they would be greatly appreciated. Or if there is somewhere better to pull an existing grammar from, I will gladly use that as the starting point instead.

Cheers!
Matt

maxbrunsfeld · 2018-01-04T01:49:11Z

Hi! Great to hear that you're working on a Fortran grammar. I wrote a little Fortran 90 myself in college, though I've mostly forgotten the language by now.

I think Fortran 90 (free format) should be fairly easy to handle with Tree-sitter. Fixed-format Fortran will probably be more difficult and require the use of an external scanner. External scanners are a feature that allows you to add custom C/C++ code to Tree-sitter's generated scanner. The Python grammar uses one in order to handle Python's indentation-sensitive syntax.

I'd suggest ignoring fixed-format Fortran for the time being, and concentrating on Fortran 90 for now. As you get started, feel free to ping me on any specific issues that you hit. There's definitely a learning curve to using Tree-sitter if you haven't used other LR-type parsing tools before, and unfortunately I haven't yet had the time to write good documentation; as you said, the other existing grammars are currently the best way to understand how to use the tool.

stadelmanma · 2018-01-09T03:55:39Z

That sounds fair. Ignoring fixed form in the meantime will certainly help get a minimally functional project out to the community more quickly. Does the parser support case-insensitive regexes of the form /program/i? All of Fortran is case insensitive, so otherwise we will be stuck with awkward patterns like /[Pp][Rr][Oo][Gg][Rr][Aa][Mm]/.

maxbrunsfeld · 2018-01-09T18:24:33Z

Unfortunately, case-insensitive regexes aren't supported right now. We could totally support them. In the meantime, you could approximate them yourself with a helper function:

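// Build a regex that matches `keyword` in any letter case by turning
// each character into a two-member character class, e.g. 'p' -> [pP].
function caseInsensitive (keyword) {
  return new RegExp(keyword
    .split('')
    .map(letter => `[${letter}${letter.toUpperCase()}]`)
    .join('')
  )
}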

which you could use like this:

program: $ => seq(
  caseInsensitive('program'),
  $.identifier,
  // ...
),
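
For reference, here is a minimal standalone sketch of what the helper above produces, runnable in plain Node.js without Tree-sitter. The names `programPattern` and `wholeWord` are only illustrative, and the `^...$` anchoring is an assumption added so that whole strings can be tested; inside a grammar the un-anchored regex would be passed to the rule as shown above.

```javascript
// Same helper as above, repeated so this snippet runs on its own.
function caseInsensitive(keyword) {
  return new RegExp(keyword
    .split('')
    .map(letter => `[${letter}${letter.toUpperCase()}]`)
    .join('')
  );
}

// Hypothetical names, used only for this illustration.
const programPattern = caseInsensitive('program');
console.log(programPattern.source); // [pP][rR][oO][gG][rR][aA][mM]

// Anchor the pattern so the test covers the whole string, not a substring.
const wholeWord = new RegExp(`^${programPattern.source}$`);
console.log(wholeWord.test('PROGRAM')); // true
console.log(wholeWord.test('Program')); // true
console.log(wholeWord.test('progrum')); // false
```

Note that this version assumes the keyword is written in lowercase letters only: an uppercase input letter such as `P` becomes `[PP]`, which no longer matches `p`.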

stadelmanma · 2018-01-13T02:22:16Z

@maxbrunsfeld could you provide a little more guidance on when to use the DSL helpers exported by tree-sitter-cli?

module.exports = {
  alias: alias, // I think I get it but tips on the intended usage here would be nice
  grammar: grammar, // no questions here
  blank: blank, // no questions here
  choice: choice, // no questions here
  err: err, // seems simple enough but not sure of the intended use case?
  optional: optional, // no questions here
  prec: prec, // I think I get this one but tips would be nice as well
  repeat: repeat, // this seems simple enough, repeat the given rule indefinitely
  repeat1: repeat1, // why is there this version? is it just to repeat once as the name suggests?
  seq: seq, // no questions here
  sym: sym, // not sure what this means
  token: token // not sure of the intended use case here either
};

stadelmanma · 2018-02-10T22:57:41Z

ping @maxbrunsfeld, see above

maxbrunsfeld · 2018-03-02T19:11:36Z

I finally started some official docs about creating parsers. There's a section that explains each public function here. There's still a lot that needs to be explained; this is just a start. Let me know what you think of these, and what you think needs to be added.

maxbrunsfeld · 2018-03-02T19:12:50Z

@stadelmanma I'm going to close this issue out. If you have additional questions, you could just @ mention me on issues/PRs in tree-sitter-fortran. It's very cool to see the progress you've made so far!

stadelmanma · 2018-03-03T14:30:00Z

@maxbrunsfeld the documentation is already a great help, thanks!

APerricone · 2019-04-26T13:22:47Z

In my parser I use an improved version of this function:

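// Expand a single ASCII letter into a two-case character class;
// any other character (parentheses, '?', digits, ...) is returned unchanged.
function toCaseInsensitive(a) {
  var ca = a.charCodeAt(0);
  if (ca>=97 && ca<=122) return `[${a}${a.toUpperCase()}]`; // a-z
  if (ca>=65 && ca<= 90) return `[${a.toLowerCase()}${a}]`; // A-Z
  return a;
}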

function caseInsensitive (keyword) {
  return new RegExp(keyword
    .split('')
    .map(toCaseInsensitive)
    .join('')
  )
}

so I can use it with groups, like:

    procedure_definition: $ => seq(
      caseInsensitive("proc(e(d(u(r(e)?)?)?)?)?"),
      $.identifier,
      $.parameter_list,
      $._endline,
      repeat($.local_list),
      repeat($._statementProc)
    ),
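
To illustrate why passing non-letters through matters, here is a minimal standalone sketch, runnable in plain Node.js. The names `procPattern`, `wholeProc`, and `results` are only illustrative, and the `^(?:...)$` wrapper is an assumption added so that whole strings can be tested outside a grammar.

```javascript
// Same two functions as above, repeated so this snippet runs on its own.
function toCaseInsensitive(a) {
  const ca = a.charCodeAt(0);
  if (ca >= 97 && ca <= 122) return `[${a}${a.toUpperCase()}]`; // a-z
  if (ca >= 65 && ca <= 90) return `[${a.toLowerCase()}${a}]`;  // A-Z
  return a; // regex syntax such as ( ) ? is left untouched
}

function caseInsensitive(keyword) {
  return new RegExp(keyword.split('').map(toCaseInsensitive).join(''));
}

// Hypothetical names, used only for this illustration.
const procPattern = caseInsensitive('proc(e(d(u(r(e)?)?)?)?)?');
console.log(procPattern.source);
// [pP][rR][oO][cC]([eE]([dD]([uU]([rR]([eE])?)?)?)?)?

// Anchor the pattern so each test covers the whole string.
const wholeProc = new RegExp(`^(?:${procPattern.source})$`);
const results = ['proc', 'PROCE', 'Procedure', 'pro', 'procx']
  .map(word => [word, wholeProc.test(word)]);
console.log(results);
// 'proc', 'PROCE' and 'Procedure' match; 'pro' and 'procx' do not
```

One caveat with this approach: letters that are part of regex escapes are expanded too, so an input such as `\d` becomes `\[dD]` and no longer means "digit". It is safest to restrict the keyword to letters plus grouping and quantifier characters.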

maxbrunsfeld closed this as completed Mar 2, 2018

jrsconfitto added a commit to jrsconfitto/tree-sitter-powershell that referenced this issue Mar 20, 2018

Support case insensitivity through a function

68cf8a8

Uses the suggestion from the following comment: tree-sitter/tree-sitter#122 (comment)

richjyoung mentioned this issue Mar 12, 2019

Case insensitive grammar support #299

Closed

nickolay mentioned this issue Apr 14, 2019

Support case-insensitive regex flag #261

Closed

dgarroDC added a commit to dgarroDC/tree-sitter-ldpl that referenced this issue Apr 18, 2019

Case insensitivity

5c7f249

See tree-sitter/tree-sitter#122 (comment)

alemuller mentioned this issue Dec 14, 2020

Fields of NewRegex #848

Closed

kristijanhusak mentioned this issue Sep 15, 2021

Add support for lowercase block keywords. milisims/tree-sitter-org#7

Closed
