C# Design Notes for Apr 1 and 8, 2015 #2119
Comments
Would this be useful beyond simple cases like this? If the parameter/variable has a deeper structure, only the "root" can be guaranteed to be null or not null.
In (Microsoft) code bases we often see code like this:
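The code sample did not survive in this copy of the comment; presumably something like the common guard-clause pattern, shown here as a purely illustrative reconstruction:

```csharp
public void Process(string text)
{
    if (text == null)
    {
        throw new ArgumentNullException(nameof(text));
    }

    // Past this point a flow analysis could treat text as not null.
    Console.WriteLine(text.Length);
}
```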
How would this work with flow analysis? Would it follow through methods? Across assembly boundaries? Into precompiled / binary code? Or would you support a contract annotation like ReSharper's StringAnnotation?
Everyone working on program analysis should know that dataflow analysis is orthogonal to state analysis. Pointer aliasing complicates things a lot; aliasing problems can only be solved via some kind of heap modeling, like "shape analysis". Even in ReSharper, the analysis of "dotted chains" is handled via a lot of heuristics (like dropping knowledge of dotted chains when some method is called), and it kind of works, but sometimes produces false positives.
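A small sketch of the aliasing hazard described above (hypothetical; `string?` stands in for a "may be null" annotation):

```csharp
class Node
{
    public string? Value; // '?': may be null
}

void M(Node a, Node b)
{
    if (a.Value != null)
    {
        b.Value = null;          // if a and b alias the same Node,
        int n = a.Value.Length;  // this dereference can still fail;
    }                            // ruling the alias out needs heap modeling
}
```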
I have spent some time researching non-nullable reference types, and I remember the ugly CodeContracts implementation. I have found a more general variant of a solution (or maybe just a way of thinking about it) for both problems: 1) nullability and 2) contracts.

```csharp
public constraint class NonZeroLengthString : String
{
    public bool InvariantConstraints()
    {
        return this != null && this.Length > 0; // Analyzed by a compile-time inference engine.
    }
}
```

Then we can define a C# 7.0 signature:

```csharp
public delegate NonZeroLengthString DoSomething(object obj);
```

It is compiled to this CLR code:

```csharp
[return: CS7Constraint("NonZeroLengthString")]
public delegate string DoSomething(object obj);
```

More examples:

```csharp
public constraint class NonNullReference<T> : class T
{
    public bool InvariantConstraints()
    {
        return this != null;
    }
}

public class Foo
{
    public Foo(NonNullReference<string> arg1)
    {
    }

    // The same signature, but with C# 7.0 syntax sugar.
    public void Bar(string! arg1)
    {
    }
}
```

```csharp
// C# 7.0 strict mode (all reference types are non-nullable by default)
strict public class Foo
{
    public Foo(string arg1, string? canBeNullVariable)
    {
    }
}
```

This approach could be a first step toward injecting a powerful static code checker and a built-in language inference engine. Every variable can carry additional semantics, and the inference engine can help protect developers from their mistakes. In short, this approach could be called: variable semantic markup.
Design notes have been archived at https://github.com/dotnet/roslyn/blob/future/docs/designNotes/2015-04-01%2C8%20C%23%20Design%20Meeting.md but discussion can continue here.
C# Design Meeting Notes for Apr 1 and Apr 8, 2015
Agenda
Matt Warren wrote a Roslyn analyzer as a low-cost way to experiment with nullability semantics. In these two meetings we looked at evolving versions of this analyzer, and what they imply for language design.
The analyzer is here: https://github.com/mattwar/nullaby.
Flow-based nullability checking
At the design review on Mar 25 (#1921), there was strong support for adding nullability support to the language, but also the advice to make it easy to transition into using nullability checking by recognizing current patterns for null checking.
This suggests a flow-based approach, where the "null state" of variables is tracked by the compiler, and may be different in different blocks of code. Comparisons of the variable, and assignments to the variable, would all change its null state within the scope of effect of those operations.
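A sketch of what that tracking could look like (hypothetical code; the `?` annotation is the one discussed elsewhere in these notes, and the exact rules are not specified here):

```csharp
void M(string? s)
{
    // Here the null state of s is "may be null".
    if (s != null)
    {
        int a = s.Length;  // ok: the comparison made s "not null" within this block
    }

    s = "hello";
    int b = s.Length;      // ok: the assignment made s "not null" from here on
}
```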
An inherent danger with flow-based checks like that is that a variable may change in an untracked way. The risk of that for parameters and local variables is limited: the variable would have to be captured and modified by a lambda, and that lambda would have to be executed elsewhere during the running of the current function. Given that any null-checking machinery we build would have to be somewhat approximate anyway, we can probably live with this risk.
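For instance, a hypothetical sketch of that untracked-mutation risk:

```csharp
void M(string? s)
{
    Action reset = () => s = null;  // the lambda captures s and can modify it

    if (s != null)
    {
        reset();            // untracked: s becomes null again
        int n = s.Length;   // the checker still believes s is not null;
    }                       // this throws at run time
}
```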
It gets gradually worse if we try to track fields of `this` or other objects, or even properties or array elements. In all likelihood, tracking just parameters and local variables would deliver the bulk of the value, but at least "dotted chains" would certainly also be useful.
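A hypothetical illustration of why "dotted chains" are harder than locals: a property can evaluate differently on each access, so a successful null check does not pin down the next read.

```csharp
class Config
{
    // A property may return a different value on every access.
    public string? Name => Environment.TickCount % 2 == 0 ? "even" : null;
}

void M(Config c)
{
    if (c.Name != null)
    {
        int n = c.Name.Length;  // re-evaluates Name: it may be null after all
    }
}
```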
Attributes versus syntax

The analyzer makes use of attributes to denote when a variable is "nullable" (`[CouldBeNull]`) or "non-nullable" (`[ShouldNotBeNull]`). Compared to a built-in syntax, this has several disadvantages. These limitations are inherent to the nature of the experiment, but we know how to counter them if we add language syntax, even if that syntax is encoded with the use of attributes. (We learned all the tricks when we introduced `dynamic`.)

Analyzers versus built-in rules
Providing nullability diagnostics by means of an analyzer comes with a number of pros and cons compared to having those diagnostics built into the language.

On the other hand:

- We would be adding the `!` and `?` syntax to the language without adding the accompanying semantics.

The analyzer has several shortcomings due to the lack of a public flow analysis API in Roslyn. That would be a great addition, regardless of what we do for nullability checking.
With a cruder/simpler analysis built into the language, you can imagine folks building enhancement analyzers on top of it. Those may not just add new diagnostics, but might want to turn off compiler warnings where more heuristics can determine that they are in fact not warranted. The analyzer infrastructure doesn't currently support this.
Taking the analyzer for a spin
Given the following declaration:
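The declaration itself was not preserved in this copy of the notes; a plausible stand-in using the analyzer's `[CouldBeNull]` attribute (the class and field names are made up):

```csharp
public class C
{
    [CouldBeNull]
    public string s;  // declared as possibly null
}
```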
The following statements would yield warnings, because `s` is declared to be nullable but is used in a way that requires it not to be:
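Again reconstructed rather than original; statements of this shape would be flagged:

```csharp
int n = s.Length;         // warning: s may be null here
string t = s.ToString();  // warning: s may be null here
```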
However, all the following methods are ok, because the flow analysis can determine that `s` is not null at the point where it is used (examples follow below).

This seems hugely useful, because current code will just continue to work in the vast majority of cases.
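The original examples were not preserved; patterns like these, written as members of the hypothetical class sketched above, are the kind the flow analysis accepts:

```csharp
void A()
{
    if (s != null)
    {
        int n = s.Length;  // ok: guarded by a null check
    }
}

void B()
{
    if (s == null) return;
    int n = s.Length;      // ok: the early return rules out null
}

void D()
{
    s = "hello";
    int n = s.Length;      // ok: the assignment establishes not null
}
```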
It is a change of thinking from a paradigm where nullability is strongly part of the type of a variable and is established at declaration time. In that paradigm, establishing that a given variable is not null doesn't help you; you have to capture its value in a new, more strongly typed variable, which is more cumbersome, and which would require existing code to be extensively rewritten to match new patterns.
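To make the contrast concrete (both snippets are hypothetical, using the `!`/`?` annotations discussed in these notes):

```csharp
// Declaration-time paradigm: the not-null fact must be captured in a new,
// more strongly typed variable before use.
if (s != null)
{
    string! t = s;     // hypothetical non-null type
    int n = t.Length;
}

// Flow-based paradigm: the existing variable's null state simply changes,
// so today's code keeps working unmodified.
if (s != null)
{
    int n = s.Length;  // s is known not null inside this block
}
```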
Conclusion
Matt's experiment is great, and we are very interested in the nullability tracking approach, because it has the potential to make the bulk of existing code work in the face of new annotations.