Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Homoglyphs detection in Minder #2312

Merged
merged 9 commits into from
Feb 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions docs/docs/ref/proto.md

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions internal/engine/eval/eval.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ import (
"fmt"
"os"

"github.com/stacklok/minder/internal/engine/eval/homoglyphs/application"
"github.com/stacklok/minder/internal/engine/eval/jq"
"github.com/stacklok/minder/internal/engine/eval/rego"
"github.com/stacklok/minder/internal/engine/eval/trusty"
Expand Down Expand Up @@ -58,6 +59,8 @@ func NewRuleEvaluator(rt *pb.RuleType, cli *providers.ProviderBuilder) (engif.Ev
trustyEvalConfig.Endpoint = os.Getenv("MINDER_UNSTABLE_TRUSTY_ENDPOINT")
}
return trusty.NewTrustyEvaluator(trustyEvalConfig, cli)
case application.HomoglyphsEvalType:
return application.NewHomoglyphsEvaluator(e.GetHomoglyphs(), cli)
default:
return nil, fmt.Errorf("unsupported rule type engine: %s", rt.Def.Eval.Type)
}
Expand Down
123 changes: 123 additions & 0 deletions internal/engine/eval/homoglyphs/application/homoglyphs_service.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
// Copyright 2023 Stacklok, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package application contains the application logic for the homoglyphs rule type
package application

import (
"context"
"fmt"
"strings"

"github.com/google/go-github/v56/github"

"github.com/stacklok/minder/internal/engine/eval/homoglyphs/communication"
"github.com/stacklok/minder/internal/engine/eval/homoglyphs/domain"
engif "github.com/stacklok/minder/internal/engine/interfaces"
"github.com/stacklok/minder/internal/providers"
pb "github.com/stacklok/minder/pkg/api/protobuf/go/minder/v1"
)

const (
// HomoglyphsEvalType is the type of the homoglyphs evaluator
HomoglyphsEvalType = "homoglyphs"

invisibleCharacters = "invisible_characters"
mixedScript = "mixed_scripts"
)

// NewHomoglyphsEvaluator creates a new homoglyphs evaluator
func NewHomoglyphsEvaluator(
reh *pb.RuleType_Definition_Eval_Homoglyphs,
pbuild *providers.ProviderBuilder,
) (engif.Evaluator, error) {
if pbuild == nil {
return nil, fmt.Errorf("provider builder is nil")
}
if reh == nil {
return nil, fmt.Errorf("homoglyphs configuration is nil")
}

switch reh.Type {
case invisibleCharacters:
return NewInvisibleCharactersEvaluator(pbuild)
case mixedScript:
return NewMixedScriptEvaluator(pbuild)
default:
return nil, fmt.Errorf("unsupported homoglyphs type: %s", reh.Type)
}
}

// evaluateHomoglyphs is a helper function to evaluate the homoglyphs rule type
func evaluateHomoglyphs(
ctx context.Context,
processor domain.HomoglyphProcessor,
res *engif.Result,
reviewHandler *communication.GhReviewPrHandler,
) error {
if res == nil {
return fmt.Errorf("result is nil")
}

//nolint:govet
prContents, ok := res.Object.(pb.PrContents)
if !ok {
return fmt.Errorf("invalid object type for homoglyphs evaluator")
}

if prContents.Pr == nil || prContents.Files == nil {
return fmt.Errorf("invalid prContents fields: %v, %v", prContents.Pr, prContents.Files)
}

if len(prContents.Files) == 0 {
return nil
}

// Note: This is a mandatory step to reassign certain fields in the handler.
// This is a workaround to avoid recreating the object.
reviewHandler.Hydrate(ctx, prContents.Pr)

for _, file := range prContents.Files {
for _, line := range file.PatchLines {
violations := processor.FindViolations(line.Content)
if len(violations) == 0 {
continue
}

var commentBody strings.Builder
commentBody.WriteString(processor.GetSubCommentText())

for _, v := range violations {
commentBody.WriteString(processor.GetLineCommentText(v))
}

reviewComment := &github.DraftReviewComment{
Path: github.String(file.Name),
Body: github.String(commentBody.String()),
Line: github.Int(int(line.LineNumber)),
}

reviewHandler.AddComment(reviewComment)
}
}

var reviewText string
if len(reviewHandler.GetComments()) > 0 {
reviewText = processor.GetFailedReviewText()
} else {
reviewText = processor.GetPassedReviewText()
}

return reviewHandler.SubmitReview(ctx, reviewText)
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
// Copyright 2023 Stacklok, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package application

import (
"context"
"fmt"

"github.com/stacklok/minder/internal/engine/eval/homoglyphs/communication"
"github.com/stacklok/minder/internal/engine/eval/homoglyphs/domain"
engif "github.com/stacklok/minder/internal/engine/interfaces"
"github.com/stacklok/minder/internal/providers"
)

// InvisibleCharactersEvaluator is an evaluator for the invisible characters rule type
type InvisibleCharactersEvaluator struct {
processor domain.HomoglyphProcessor
reviewHandler *communication.GhReviewPrHandler
}

// NewInvisibleCharactersEvaluator creates a new invisible characters evaluator
func NewInvisibleCharactersEvaluator(pbuild *providers.ProviderBuilder) (*InvisibleCharactersEvaluator, error) {
if pbuild == nil {
return nil, fmt.Errorf("provider builder is nil")
}

ghClient, err := pbuild.GetGitHub(context.Background())
if err != nil {
return nil, fmt.Errorf("could not fetch GitHub client: %w", err)
}

return &InvisibleCharactersEvaluator{
processor: domain.NewInvisibleCharactersProcessor(),
reviewHandler: communication.NewGhReviewPrHandler(ghClient),
}, nil
}

// Eval evaluates the invisible characters rule type
func (ice *InvisibleCharactersEvaluator) Eval(ctx context.Context, _ map[string]any, res *engif.Result) error {
return evaluateHomoglyphs(ctx, ice.processor, res, ice.reviewHandler)
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
// Copyright 2023 Stacklok, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package application

import (
"context"
"fmt"

"github.com/stacklok/minder/internal/engine/eval/homoglyphs/communication"
"github.com/stacklok/minder/internal/engine/eval/homoglyphs/domain"
engif "github.com/stacklok/minder/internal/engine/interfaces"
"github.com/stacklok/minder/internal/providers"
)

// MixedScriptsEvaluator is the evaluator for the mixed scripts rule type
type MixedScriptsEvaluator struct {
processor domain.HomoglyphProcessor
reviewHandler *communication.GhReviewPrHandler
}

// NewMixedScriptEvaluator creates a new mixed scripts evaluator
func NewMixedScriptEvaluator(pbuild *providers.ProviderBuilder) (*MixedScriptsEvaluator, error) {
if pbuild == nil {
return nil, fmt.Errorf("provider builder is nil")
}

ghClient, err := pbuild.GetGitHub(context.Background())
if err != nil {
return nil, fmt.Errorf("could not fetch GitHub client: %w", err)
}

msProcessor, err := domain.NewMixedScriptsProcessor()
if err != nil {
return nil, fmt.Errorf("could not create mixed scripts processor: %w", err)
}

return &MixedScriptsEvaluator{
processor: msProcessor,
reviewHandler: communication.NewGhReviewPrHandler(ghClient),
}, nil
}

// Eval evaluates the mixed scripts rule type
func (mse *MixedScriptsEvaluator) Eval(ctx context.Context, _ map[string]any, res *engif.Result) error {
Copy link
Contributor

@dmjb dmjb Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Eval method of the these two evaluation strategies seem to share mostly identical code. I suspect it's possible to refactor them to share a common function. From what I can see, the only things which seem to change between the two Eval methods is:

  1. The type of processor and the method of the processor which carries out the evaluation.
  2. The message written in the PR comment
  3. The status/message in the PR review

One solution for this would be to add a common interface for MixedScriptsEvaluator and InvisibleCharactersEvaluator which looks something like this:

type HomoglyphEvaluator interface {
  FindViolations(codeLine string) InfoType
  GetLineCommentText() string
  GetFailedReviewText() string
  GetPassedReviewText() string
}

(I am not 100% sure of the types involved here, and there may be better names than the ones in my example)

At which point you could write a common function with a signature such as:

func checkForHomoglyphViolation(ctx context.Context, evaluator HomoglyphEvaluator, res *engif.Result) errror

Which will mostly look like the current code, except that it delegates the methods of the HomoglyphEvaluator struct instead of having hard-coded evaluator-specific references. The Eval methods of both structs simply calls into the common function with the appropriate arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the descriptive suggestion and the discussion that followed in Slack!
I've addressed your comments.

return evaluateHomoglyphs(ctx, mse.processor, res, mse.reviewHandler)
}
Loading
Loading