Language Server Architecture

This is a summary of some main components of the language server, aiming to help maintainers and contributors with navigating the codebase.

Decoder

Majority of the language server functionality such as completion, hover, document links, semantic tokens, symbols etc. are provided by the decoder package of hashicorp/hcl-lang. hcl-lang is generally considered a reusable component for any HCL2-based language server (that is not just Terraform). Any functionality which other HCL2-based language server may reuse should be contributed there, not into terraform-ls.

The decoder essentially takes in directories of parsed HCL files + schemas and uses both to walk the AST to provide completion candidates, hover data and other relevant data.

Schema

Decoder needs schema to produce relevant completion candidates, hover data etc. hashicorp/terraform-schema houses most of the Terraform Core schema (such as terraform, resource or variable blocks) + helpers to combine that Core schema with provider schemas (such as inner parts of resource or data blocks) and help assemble schemas for modules.

Global State

Most of the global state is maintained within various go-memdb tables under state package, passed around via state.StateStore.

This includes

documents - documents open by the client (see Document Storage)
jobs - pending/running jobs (see Job Scheduler)
modules - AST and other metadata about Terraform modules collected by indexing jobs ^
provider_schemas - provider schemas pre-baked or obtained via Terraform CLI by indexing jobs ^
provider_ids & module_ids - mapping between potentially sensitive identifiers and randomly generated UUIDs, to enable privacy-respecting telemetry

Document Storage

documents package, and document.Document struct in particular represents open documents server receives from the client via LSP text synchronization methods such as textDocument/didOpen, textDocument/didChange, stored as an entry in the documents memdb table. The textDocument/didClose method removes the document from state, making other components assume that it then matches OS filesystem.

AST representation of these documents is passed to the decoder, which in turn ensures that all completion candidates, hover data etc. is relevant to what the user sees in their editor window even if the file/document is not saved.

Each document also maintains line-separated version, to enable line-based diffing and to enable conversion between LSP's representation of position (line:column) to HCL's representation (hcl.Pos) which mostly uses byte offsets.

Filesystem

filesystem package provides an io/fs compatible interface primarily for any jobs which need to operate on the whole directory (Terraform module) regardless of where the file contents comes from (virtual document or OS filesystem).

LSP/RPC Layer

langserver package represents the RPC layer responsible for processing any incoming and outgoing LSP (RPC JSON) requests/responses between the server and client. The langserver/handlers package generally follows a pattern of 1 file per LSP method. The package also contains E2E tests which exercise the language server from client's perspective. service.go represents the "hot path" of the LSP/RPC layer, basically mapping functions to method names which the server supports.

protocol package represents the structs reflecting LSP spec, i.e. the structure of request and response JSON bodies. Given that there is no other complete and/or well-maintained representation of the LSP spec for Go (at the time of writing), majority of this is copied from within gopls, which in turn generates these from the TypeScript SDK - practically the only officially maintained and most complete implementation of LSP spec to date.

Mentioned protocol request/response representations may not always be practical throughout the codebase and within hcl-lang, therefore lsp package contains various helpers to convert the protocol types from and to other internal types we use to represent the same data. It also filters and checks the data using client and server capabilities, such that other parts of the codebase don't have to.

"Features"

The internal/features package tries to group certain "dialects" of the Terraform language into self-contained features. A feature manages its own state, jobs, decoder, and file parsing logic.

We currently have several features:

*.tf and *.tf.json files are handled in the modules feature
*.tfvars and *.tfvars.json files are handled in the variables feature
.terraform/ and .terraform.lock.hcl related operations are handled in the rootmodules feature
*.tfstack.hcl and *.tfdeploy.hcl files are handled in the stacks feature

A feature can provide data to the external consumers through methods. For example, the variables feature needs a list of variables from the modules feature. There should be no direct import from feature packages (we could enforce this by using internal/, but we won't for now) into other parts of the codebase. The "hot path" service mentioned above takes care of initializing each feature at the start of a new LS session.

The jobs package of each feature contains all the different indexing jobs needed to retrieve all kinds of data and metadata, to perform completion, hover, go-to-definition, and so on. The jobs are scheduled on the global job scheduler as a result of various events (e.g. didOpen).

Modules Feature Jobs

ParseModuleConfiguration - parses *.tf files to turn []byte into hcl types (AST)
LoadModuleMetadata - uses earlydecoder to do early TF version-agnostic decoding to obtain metadata (variables, outputs etc.) which can be used to do more detailed decoding in hot-path within hcl-lang decoder
PreloadEmbeddedSchema – loads provider schemas based on provider requirements from the bundled schemas
DecodeReferenceTargets - uses hcl-lang decoder to collect reference targets within *.tf
DecodeReferenceOrigins - uses hcl-lang decoder to collect reference origins within *.tf
GetModuleDataFromRegistry - obtains data about any modules (inputs & outputs) from the Registry API based on module calls
SchemaModuleValidation - does schema-based validation of module files (*.tf) and produces diagnostics associated with any "invalid" parts of code
ReferenceValidation - does validation based on (mis)matched reference origins and targets, to flag up "orphaned" references
TerraformValidate - uses Terraform CLI to run the validate subcommand and turn the provided (JSON) output into diagnostics

Variables Feature Jobs

ParseVariables - parses *.tfvars files to turn []byte into hcl types (AST)
DecodeVarsReferences - uses hcl-lang decoder to collect references within *.tfvars
SchemaVariablesValidation - does schema-based validation of variable files (*.tfvars) and produces diagnostics associated with any "invalid" parts of code

Root Modules Feature Jobs

GetTerraformVersion - obtains Terraform version via terraform version -json
ParseModuleManifest - parses module manifest with metadata about any installed modules
ObtainSchema - obtains provider schemas via terraform providers schema -json
ParseProviderVersions is a job complimentary to ObtainSchema in that it obtains versions of providers/schemas from Terraform CLI's lock file

Stack Feature Jobs

ParseStackConfiguration - parses *.tfstack.hcl and *.tfdeploy.hcl files to turn []byte into hcl types (AST)
LoadStackMetadata - uses earlydecoder to do early TF version-agnostic decoding to obtain metadata (variables, outputs etc.) which can be used to do more detailed decoding in hot-path within hcl-lang decoder
PreloadEmbeddedSchema – loads provider schemas based on provider requirements from the bundled schemas
DecodeReferenceTargets - uses hcl-lang decoder to collect reference targets within *.tfstack.hcl and *.tfdeploy.hcl
DecodeReferenceOrigins - uses hcl-lang decoder to collect reference origins within *.tfstack.hcl and *.tfdeploy.hcl
SchemaStackValidation - does schema-based validation of module files (*.tfstack.hcl and *.tfdeploy.hcl) and produces diagnostics associated with any "invalid" parts of code
ReferenceValidation - does validation based on (mis)matched reference origins and targets, to flag up "orphaned" references

Adding a new feature / "language"

The existing variables feature is a good starting point when introducing a new language. Usually you need to roughly follow these steps to get a minimal working example:

Create a new feature with the same folder structure as existing ones
Model the internal state representation
Subscribe to some events of the event bus
Add a parsing job that gets triggered from an event
Add a decoder that makes use of some kind of schema
Register the new feature in internal/langserver/handlers/service.go
- Start the feature as part of configureSessionDependencies()
- Make sure to call the Stop() function in shutdown() as well
If the feature reports diagnostics, add a call to collect them in updateDiagnostics() in internal/langserver/handlers/hooks_module.go

Job Scheduler

All jobs end up in the jobs memdb table, from where they're picked up from by any of the two schedulers described below.

scheduler contains a relatively general-purpose implementation of a job scheduler. There are two instances of the scheduler in use, both of which are launched by initialize LSP request and shut down with shutdown LSP request.

openDirIndexer processes any jobs concerning directories which have any files open
closedDirIndexer processes any jobs concerning directories which do not have any files open

The overall flow of jobs is illustrated in the diagram below.

The mentioned documents memdb table is consulted for whether a directory has any open files - i.e. whether server has received textDocument/didOpen and not textDocument/didClose concerning a particular directory. Using two separate schedulers loosely reflects the fact that data for files which the user is editing at the moment are more critical, unlike additional data about other directories/modules which would only enrich editing of the open files (such as by adding cross-module context, providing go-to-definition etc.).

Jobs also depend on each other. These dependencies are illustrated in the diagrams below.

didOpen Job Flow

Event Bus

The eventbus is responsible for distributing events to subscribers. It comes with a fixed list of topics that anyone can subscribe to. An event is sent to all subscribers of a topic. A subscriber can decide to block until the event is processed by using a return channel. It is primarily used to distribute LSP document synchronization events.

Event Sources

Walker

The Walker is responsible for walking the file system hierarchy of the entire workspace (including files that the user may not have open) in the background to gain a better understanding of the workspace structure. The walker doesn't schedule any jobs and doesn't do any additional work other than reporting the directory structure and the files it contains. The walker follows the LSP/RPC lifecycle of the server, i.e. it is started by an initialize request and shut down by a shutdown request.

The walker logic is contained in internal/walker/walker.go.

Watched Files

Clients are expected to watch *.tf and *.tfvars files by default and send updates to the server via workspace/didChangeWatchedFiles notifications. Additionally, the server uses dynamic watcher registration per LSP to instruct clients to watch for plugin and module lock files within .terraform directories, such that it can refresh schemas or module metadata, both of which can be used to provide IntelliSense.

The mentioned dynamic registration happens as part of initialized.

workspace/didChangeWatchedFiles handler invalidates relevant data based on what files were changed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

architecture.md

architecture.md

Language Server Architecture

Decoder

Schema

Global State

Document Storage

Filesystem

LSP/RPC Layer

"Features"

Modules Feature Jobs

Variables Feature Jobs

Root Modules Feature Jobs

Stack Feature Jobs

Adding a new feature / "language"

Job Scheduler

didOpen Job Flow

Event Bus

Event Sources

Walker

Watched Files

Files

architecture.md

Latest commit

History

architecture.md

File metadata and controls

Language Server Architecture

Decoder

Schema

Global State

Document Storage

Filesystem

LSP/RPC Layer

"Features"

Modules Feature Jobs

Variables Feature Jobs

Root Modules Feature Jobs

Stack Feature Jobs

Adding a new feature / "language"

Job Scheduler

didOpen Job Flow

Event Bus

Event Sources

Walker

Watched Files