Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add schema validation code to C generator #1098

Merged
merged 1 commit into from
Oct 22, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 11 additions & 8 deletions BUILD.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,9 +50,9 @@ following its website's instructions. This is necessary to install the
[libmxml-devel][Mini-XML] package.

You can use the `dnf` package manager to install most of the tools
needed to build Daffodil:
used to develop Daffodil:

sudo dnf install clang gcc git java-11-openjdk-devel llvm make mxml-devel pkgconf
sudo dnf install clang gcc git iwyu java-11-openjdk-devel llvm make mxml-devel pkgconf

If you want to use clang instead of gcc, you'll have to set your
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: why give people the choice?

We don't have to provide this flexibility unless we or users depend on it for some reason.

This works, so "it aint broke" so no need to change anything. I just wanted to comment that we don't really have to provide options like this unless they're really really required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're talking about the paragraph If you want to use clang instead of gcc, ..., I originally added that paragraph because you told me you didn't want to require GNU-licensed tools to develop or use any part of Daffodil. I even made sure that the GitHub Action workflow (main.yml) would set these environment variables to build Daffodil with clang and lvvm-ar instead of gcc and ar. I can reword the paragraph to say you can set these environment variables to use clang and llvm-ar instead of gcc and ar if you have a policy forbidding GNU-licensed tools.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, gcc is GPLv3 which is category X, so I had concerns about implying it was required. I did find this open Jira LEGAL issue that discusses this:

https://issues.apache.org/jira/browse/LEGAL-390?jql=project+%3D+LEGAL+AND+text+%7E+%22gcc%22+ORDER+BY+updated+DESC%2C+priority+DESC

Sounds like the resolution it's it's fine to use. I'd suggest we still do mention clang support so that people can use it if they want and don't have to figure it all out. But maybe that's an addendum or something, and everything else implies the use of gcc if the goal is to simplify this?

environment variables `CC` and `AR` to the clang binaries' names:
Expand All @@ -68,15 +68,17 @@ commands you type will be able to call the C compiler.

## Ubuntu

You can use the `apt` package manager to install most of the tools
needed to build Daffodil:
You can use the `apt` package manager to install all of the tools
used to develop Daffodil:

sudo apt install build-essential clang clang-format default-jdk git libmxml-dev
sudo apt install build-essential clang clang-format default-jdk git iwyu libcriterion-dev libmxml-dev
# If "iwyu -print-resource-dir" prints /usr/lib/clang/13.0.1 and it doesn't exist:
sudo apt install libclang-common-13-dev

If you want to use clang instead of gcc, you'll have to set your
environment variables `CC` and `AR` to the clang binaries' names:

export CC=clang AR=llvm-ar-14 # or llvm-ar-10 or whatever you have
export CC=clang AR=llvm-ar-14 # or whatever llvm-ar-* version you have

However, Ubuntu has no [sbt] package in its own repositories.
You'll have to install the latest [sbt] version following its
Expand All @@ -95,7 +97,7 @@ Install [MSYS2] following its website's instructions and open a new
libraries.

You can use the `pacman` package manager to install most of the tools
needed to build Daffodil:
used to develop Daffodil:

pacman -S clang diffutils gcc git make pkgconf

Expand Down Expand Up @@ -130,9 +132,10 @@ Install the Xcode Command Line Tools:
xcode-select --install

You can use the [Homebrew] package manager to install most of the tools
needed to build Daffodil:
used to develop Daffodil:

brew install clang-format
brew install criterion
brew install git
brew install llvm # needed by iwyu
brew install include-what-you-use
Expand Down
39 changes: 30 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,32 +59,53 @@ Compile source code:

sbt compile

### Tests
### Test

Run unit tests:
Check all unit tests pass:

sbt test

Run slower integration tests:
Check all integration tests pass:

sbt daffodil-test-integration/test

### Command Line Interface
### Format

Build the command line interface (Linux and Windows shell scripts in
`daffodil-cli/target/universal/stage/bin/`; see the [Command Line
Interface] documentation for details on their usage):
Check format of source and sbt files:

sbt scalafmtCheckAll scalafmtSbtCheck

Reformat source and sbt files if necessary:

sbt scalafmtAll scalafmtSbt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding commands for what you use to format h C code?

Maybe for a future PR, it might be worth adding an sbt task that executes those commands with the right options (e.g. sbt cfmtAll) so we don't have to remember them. Could also be useful to add a step in our github actions lint job to run that and error if git diff reports and changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can write a sbt task which runs clang-format on the C files and errors if it finds and reports any diffs. However, some people may get an error running the task (especially if the task's purpose is to run all the language formatters from a single command) since clang-format is not commonly installed on many (most?) machines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. As long as the error is something useful that should be fine. I recommend doing it later though, doesn't have to be part of this PR.


### Build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need another build section for building and locally publishing the jars? Anyone developing something integrating Daffodil may want that. Maybe something like this?

Build the Daffodil jars and publish to the a Maven or Ivy repository

Maven:

    sbt publishM2

Ivy:

    sbt publishLocal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll document the publish commands too and explain Ivy is also used by sbt to build schema-only projects.


Build the Daffodil command line interface (Linux and Windows shell
scripts in `daffodil-cli/target/universal/stage/bin/`; see the
[Command Line Interface] documentation for details on their usage):

sbt daffodil-cli/stage

### License Check
Publish the Daffodil jars to a Maven repository (for Java projects) or
Ivy repository (for Scala or schema projects).

Maven (for Java or mvn):

sbt publishM2

Ivy (for Scala or sbt):

sbt publishLocal

### Check Licenses

Run [Apache RAT] (license audit report in `target/rat.txt` and error
if any unapproved licenses are found):

sbt ratCheck

### Test Coverage Report
### Check Coverage

Run [sbt-scoverage] (report in `target/scala-ver/scoverage-report/`):

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,13 +13,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.

AlignConsecutiveDeclarations: true
AllowShortFunctionsOnASingleLine: None
AllowShortIfStatementsOnASingleLine: true
AllowShortIfStatementsOnASingleLine: WithoutElse
AlwaysBreakAfterReturnType: TopLevelDefinitions
BasedOnStyle: llvm
BasedOnStyle: LLVM
BreakBeforeBraces: Allman
ColumnLimit: 110
IndentWidth: 4
KeepEmptyLinesAtTheStartOfBlocks: false
SortIncludes: false
SortIncludes: Never
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ clean:
# $ make iwyu

FMT = clang-format -i
IWYU = iwyu -Xiwyu --max_line_length=999 -Xiwyu --update_comments
IWYU = iwyu -Xiwyu --max_line_length=111 -Xiwyu --update_comments

format:
$(FMT) $(HEADERS) $(SOURCES) $(TSOURCES)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
* limitations under the License.
*/

// auto-maintained by iwyu
// clang-format off
#include "cli_errors.h"
#include <assert.h> // for assert
Expand All @@ -31,9 +32,9 @@
static const ErrorLookup *
error_lookup(uint8_t code)
{
static const ErrorLookup table[CLI_ZZZ - ERR_ZZZ] = {
static const ErrorLookup table[CLI__NUM_CODES - ERR__NUM_CODES] = {
{CLI_DIAGNOSTICS, "parse failed with %" PRId64 " diagnostics\n", FIELD_D64},
{CLI_FILE_CLOSE, "error closing file\n", FIELD_ZZZ},
{CLI_FILE_CLOSE, "error closing file\n", FIELD__NO_ARGS},
{CLI_FILE_OPEN, "error opening file '%s'\n", FIELD_S},
{CLI_HELP_USAGE,
"Usage: %s [OPTION...] <command> [infile]\n"
Expand Down Expand Up @@ -61,36 +62,36 @@ error_lookup(uint8_t code)
{CLI_INVALID_INFOSET, "invalid infoset type -- '%s'\n" USAGE, FIELD_S},
{CLI_INVALID_OPTION, "invalid option -- '%c'\n" USAGE, FIELD_C},
{CLI_INVALID_VALIDATE, "invalid validate mode -- '%s'\n" USAGE, FIELD_S},
{CLI_MISSING_COMMAND, "missing command\n" USAGE, FIELD_ZZZ},
{CLI_MISSING_COMMAND, "missing command\n" USAGE, FIELD__NO_ARGS},
{CLI_MISSING_VALUE, "option requires an argument -- '%c'\n" USAGE, FIELD_C},
{CLI_PROGRAM_ERROR,
"unexpected getopt code %" PRId64 "\n"
"Check for program error\n",
FIELD_D64},
{CLI_PROGRAM_VERSION, "%s\n", FIELD_S_ON_STDOUT},
{CLI_STACK_EMPTY, "stack empty, stopping program\n", FIELD_ZZZ},
{CLI_STACK_OVERFLOW, "stack overflow, stopping program\n", FIELD_ZZZ},
{CLI_STACK_UNDERFLOW, "stack underflow, stopping program\n", FIELD_ZZZ},
{CLI_STACK_EMPTY, "stack empty, stopping program\n", FIELD__NO_ARGS},
{CLI_STACK_OVERFLOW, "stack overflow, stopping program\n", FIELD__NO_ARGS},
{CLI_STACK_UNDERFLOW, "stack underflow, stopping program\n", FIELD__NO_ARGS},
{CLI_STRTOBOOL, "error converting XML data '%s' to boolean\n", FIELD_S},
{CLI_STRTOD_ERRNO, "error converting XML data '%s' to number\n", FIELD_S},
{CLI_STRTOI_ERRNO, "error converting XML data '%s' to integer\n", FIELD_S},
{CLI_STRTONUM_EMPTY, "found no number in XML data '%s'\n", FIELD_S},
{CLI_STRTONUM_NOT, "found non-number characters in XML data '%s'\n", FIELD_S},
{CLI_STRTONUM_RANGE, "number in XML data '%s' out of range\n", FIELD_S},
{CLI_UNEXPECTED_ARGUMENT, "unexpected extra argument -- '%s'\n" USAGE, FIELD_S},
{CLI_XML_DECL, "error making new XML declaration\n", FIELD_ZZZ},
{CLI_XML_DECL, "error making new XML declaration\n", FIELD__NO_ARGS},
{CLI_XML_ELEMENT, "error making new XML element '%s'\n", FIELD_S},
{CLI_XML_ERD, "unexpected ERD typeCode %" PRId64 " while reading XML data\n", FIELD_D64},
{CLI_XML_GONE, "ran out of XML data\n", FIELD_ZZZ},
{CLI_XML_INPUT, "unable to read XML data from input file\n", FIELD_ZZZ},
{CLI_XML_GONE, "ran out of XML data\n", FIELD__NO_ARGS},
{CLI_XML_INPUT, "unable to read XML data from input file\n", FIELD__NO_ARGS},
{CLI_XML_LEFT, "did not consume all of the XML data, '%s' left\n", FIELD_S},
{CLI_XML_MISMATCH, "found mismatch between XML data and infoset '%s'\n", FIELD_S},
{CLI_XML_WRITE, "error writing XML document\n", FIELD_ZZZ},
{CLI_XML_WRITE, "error writing XML document\n", FIELD__NO_ARGS},
};

if (code >= ERR_ZZZ && code < CLI_ZZZ)
if (code >= ERR__NUM_CODES && code < CLI__NUM_CODES)
{
const ErrorLookup *lookup = &table[code - ERR_ZZZ];
const ErrorLookup *lookup = &table[code - ERR__NUM_CODES];

// Double check that we looked up correct row
assert(code == lookup->code);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,15 +18,16 @@
#ifndef CLI_ERRORS_H
#define CLI_ERRORS_H

// auto-maintained by iwyu
// clang-format off
#include "errors.h" // for ERR_ZZZ
#include "errors.h" // for ERR__NUM_CODES
// clang-format on

// CliCode - identifiers of libcli errors

enum CliCode
{
CLI_DIAGNOSTICS = ERR_ZZZ,
CLI_DIAGNOSTICS = ERR__NUM_CODES,
CLI_FILE_CLOSE,
CLI_FILE_OPEN,
CLI_HELP_USAGE,
Expand Down Expand Up @@ -59,7 +60,7 @@ enum CliCode
CLI_XML_LEFT,
CLI_XML_MISMATCH,
CLI_XML_WRITE,
CLI_ZZZ,
CLI__NUM_CODES,
};

// CliLimits - limits on how many elements static arrays can hold
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
* limitations under the License.
*/

// auto-maintained by iwyu
// clang-format off
#include "daffodil_getopt.h"
#include <string.h> // for strcmp, strrchr
Expand All @@ -29,23 +30,15 @@ struct daffodil_cli daffodil_cli = {
DAFFODIL_MISSING_COMMAND, // default subcommand
};

// Initialize our "daffodil parse" CLI options
// Initialize our "daffodil parse/unparse" CLI options

struct daffodil_parse_cli daffodil_parse = {
struct daffodil_pu_cli daffodil_pu = {
"xml", // default infoset type
"-", // default infile
"-", // default outfile
false, // default validate
};

// Initialize our "daffodil unparse" CLI options

struct daffodil_unparse_cli daffodil_unparse = {
"xml", // default infoset type
"-", // default infile
"-", // default outfile
};

// Parse our command line interface. Note there is NO portable way to
// parse "daffodil [options] command [more options] arguments" with
// getopt. We will have to put all options before all arguments,
Expand All @@ -63,7 +56,7 @@ parse_daffodil_cli(int argc, char *argv[])

// We expect callers to put all non-option arguments at the end
int opt = 0;
while ((opt = getopt(argc, argv, ":hI:o:V:v")) != -1)
while ((opt = getopt(argc, argv, ":hI:o:r:s:V:v")) != -1)
{
switch (opt)
{
Expand All @@ -78,17 +71,21 @@ parse_daffodil_cli(int argc, char *argv[])
error.arg.s = optarg;
return &error;
}
daffodil_parse.infoset_converter = optarg;
daffodil_unparse.infoset_converter = optarg;
daffodil_pu.infoset_converter = optarg;
break;
case 'o':
daffodil_parse.outfile = optarg;
daffodil_unparse.outfile = optarg;
daffodil_pu.outfile = optarg;
break;
case 'r':
// Ignore "-r root" option/optarg
break;
case 's':
// Ignore "-s schema" option/optarg
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this, why of options to just ignore them? Seems that could be very confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A workflow I sometimes use with code generators, DFDL schemas, and TDML tests is to run daffodil generate on the DFDL schemas, run a pair of daffodil parse and c/daffodil parse commands on the same data files, and check that they both produce the same exit status and similar infosets. This manual workflow is finer-grained than and sometimes easier to debug than running daffodil test on TDML tests using the same data and infosets. I thought it could be nice if I could run the same c/daffodil parse command as the daffodil parse command by appending only c/ to daffodil without having to change any of the options.

I thought I could hide this behavior from users (the usage message doesn't mention the -r and -s options) while making life easier for developers and testers who know about this hidden behavior for greater compatibility between daffodil and c/daffodil. I can revert this source file if you think users would be surprised and confused that c/daffodil allows -r and -s options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, makes sense. As long as the optinos aren't in the usage I think it's fine. But it would be worth adding that reasoning as a comment.

case 'V':
if (strcmp("limited", optarg) == 0 || strcmp("on", optarg) == 0)
{
daffodil_parse.validate = true;
daffodil_pu.validate = true;
}
else if (strcmp("off", optarg) != 0)
{
Expand Down Expand Up @@ -122,25 +119,14 @@ parse_daffodil_cli(int argc, char *argv[])
{
const char *arg = argv[i];

if (DAFFODIL_PARSE == daffodil_cli.subcommand)
{
if (strcmp("-", daffodil_parse.infile) == 0)
{
daffodil_parse.infile = arg;
}
else
{
error.code = CLI_UNEXPECTED_ARGUMENT;
error.arg.s = arg;
return &error;
}
}
else if (DAFFODIL_UNPARSE == daffodil_cli.subcommand)
if (DAFFODIL_MISSING_COMMAND != daffodil_cli.subcommand)
{
if (strcmp("-", daffodil_unparse.infile) == 0)
// Set infile only once
if (strcmp("-", daffodil_pu.infile) == 0)
{
daffodil_unparse.infile = arg;
daffodil_pu.infile = arg;
}
// Error if infile is followed by another arg
else
{
error.code = CLI_UNEXPECTED_ARGUMENT;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
#ifndef DAFFODIL_GETOPT_H
#define DAFFODIL_GETOPT_H

// auto-maintained by iwyu
// clang-format off
#include <stdbool.h> // for bool
#include "errors.h" // for Error
Expand All @@ -35,24 +36,15 @@ extern struct daffodil_cli
} subcommand;
} daffodil_cli;

// Declare our "daffodil parse" CLI options
// Declare our "daffodil parse/unparse" CLI options

extern struct daffodil_parse_cli
extern struct daffodil_pu_cli
{
const char *infoset_converter;
const char *infile;
const char *outfile;
bool validate;
} daffodil_parse;

// Declare our "daffodil unparse" CLI options

extern struct daffodil_unparse_cli
{
const char *infoset_converter;
const char *infile;
const char *outfile;
} daffodil_unparse;
bool validate;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change 👍 . I've never been a huge fan of aligning variable names in C.

} daffodil_pu;

// Parse our command line interface

Expand Down
Loading