Resolution and Stitching
A Maven ecosystem-wide dependency graph is used to resolve all (transitive) dependencies and dependents of any Maven artifact.
NB! The dependency graph is large and requires a lot of RAM. It is best to run everything listed below on a machine with around 64 GB of RAM.
The global dependency graph is built by the DependencyGraphBuilder class in core/maven.
To build a dependency graph, create an instance of this class and call its buildDependencyGraph(...) method:
var dbContext = PostgresConnector.getDSLContext("jdbc:postgresql://localhost:5432/fasten_java", "fastenro");
var serializedGraphPath = "../mvn_dep_graph";
var graphBuilder = new DependencyGraphBuilder();
var graph = graphBuilder.buildDependencyGraph(dbContext);
DependencyGraphUtilities.serializeDependencyGraph(graph, serializedGraphPath);
Then GraphMavenResolver.dependencyGraph and GraphMavenResolver.dependentsGraph can be used for any kind of analysis or resolution.
NB! When running the code, don't forget to set the environment variable PGPASSWORD=fasten to provide the password for the database connection.
The code snippet above makes use of the serialized dependency graph stored locally. However, if it is not stored, a new graph will be built from scratch and serialized to that location.
To resolve all (transitive) dependencies, use the GraphMavenResolver class, which is also in core/maven, and call its resolveDependencies(...) method:
var graphResolver = new GraphMavenResolver();
var dbContext = PostgresConnector.getDSLContext("jdbc:postgresql://localhost:5432/fasten_java", "fastenro");
var serializedGraphPath = "../mvn_dep_graph";
// dbContext is only used when serializedGraphPath does not contain a valid serialized graph and the graph has to be built from scratch
graphResolver.buildDependencyGraph(dbContext, serializedGraphPath);
var dependencySet = graphResolver.resolveDependencies(group, artifact, version, timestamp, dbContext, transitive);
- group - groupId of the artifact to resolve
- artifact - artifactId of the artifact to resolve
- version - version of the artifact to resolve
- timestamp - a timestamp used to filter the dependency graph, removing all artifacts that were released later than the given timestamp. Use -1 to disable timestamp filtering.
- dbContext - database connection context. It can be acquired the same way as in the code snippet above (var dbContext = PostgresConnector.getDSLContext("jdbc:postgresql://localhost:5432/fasten_java", "fastenro");).
- transitive - a boolean indicating whether to resolve all transitive dependencies or only direct ones.
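To make the roles of timestamp and transitive concrete, here is a minimal, self-contained sketch of a timestamp-filtered breadth-first traversal over a toy in-memory graph. It is not the GraphMavenResolver implementation: the class ToyDependencyResolver, its resolve method, and the two maps are hypothetical names introduced only for illustration, and the real resolver may treat filtered-out artifacts differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names belong to the FASTEN API.
final class ToyDependencyResolver {

    /**
     * Collects the dependencies of root by walking the graph breadth-first.
     * timestamp == -1 disables filtering; otherwise artifacts released after
     * the timestamp are skipped. transitive == false keeps direct dependencies only.
     */
    static Set<String> resolve(Map<String, List<String>> edges,
                               Map<String, Long> releasedAt,
                               String root, long timestamp, boolean transitive) {
        Set<String> resolved = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String dep : edges.getOrDefault(current, List.of())) {
                boolean tooNew = timestamp != -1
                        && releasedAt.getOrDefault(dep, 0L) > timestamp;
                // Skip filtered artifacts, the root itself, and anything already seen.
                if (tooNew || dep.equals(root) || !resolved.add(dep)) {
                    continue;
                }
                if (transitive) {
                    queue.add(dep);
                }
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = Map.of(
                "g:app:1.0", List.of("g:lib:2.0", "g:util:1.1"),
                "g:lib:2.0", List.of("g:core:0.9"));
        Map<String, Long> releasedAt = Map.of(
                "g:lib:2.0", 100L, "g:util:1.1", 300L, "g:core:0.9", 50L);

        // prints [g:lib:2.0, g:util:1.1, g:core:0.9]
        System.out.println(resolve(edges, releasedAt, "g:app:1.0", -1, true));
        // prints [g:lib:2.0, g:core:0.9] because g:util:1.1 was released after 200
        System.out.println(resolve(edges, releasedAt, "g:app:1.0", 200, true));
        // prints [g:lib:2.0, g:util:1.1]
        System.out.println(resolve(edges, releasedAt, "g:app:1.0", -1, false));
    }
}
```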
The same class contains a method for resolving all (transitive) dependents, resolveDependents(...), which takes the same arguments as dependency resolution:
var graphResolver = new GraphMavenResolver();
// dependency graph is a static attribute so you don't need to rebuild the graph when creating a second instance of GraphMavenResolver
var dependentsSet = graphResolver.resolveDependents(group, artifact, version, timestamp, dbContext, transitive);
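One way to picture the relationship between the two graphs is that the dependents graph is the dependency graph with every edge reversed. The sketch below assumes exactly that and uses a toy map-based graph. ToyDependentsGraph and invert are hypothetical names, not part of GraphMavenResolver.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: assumes dependents are obtained by reversing dependency edges.
final class ToyDependentsGraph {

    /** Turns "artifact -> its dependencies" into "artifact -> its direct dependents". */
    static Map<String, Set<String>> invert(Map<String, List<String>> dependencies) {
        Map<String, Set<String>> dependents = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            for (String dependency : entry.getValue()) {
                dependents.computeIfAbsent(dependency, k -> new TreeSet<>())
                          .add(entry.getKey());
            }
        }
        return dependents;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependencies = Map.of(
                "g:app:1.0", List.of("g:lib:2.0"),
                "g:tool:3.0", List.of("g:lib:2.0"),
                "g:lib:2.0", List.of("g:core:0.9"));
        // prints {g:core:0.9=[g:lib:2.0], g:lib:2.0=[g:app:1.0, g:tool:3.0]}
        System.out.println(invert(dependencies));
    }
}
```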
The dependency graph is serialized and stored on Monster at /mnt/fasten/mvn-depgraph/.
Updating the dependency graph on Monster requires 4 steps:
- Stopping the REST API deployment (because it uses the current dependency graph)
kubectl delete deployment -n fasten fasten-restapi
- Deleting the old dependency graph
rm <dependency-graph-path>/mvn_dep_graph.*
- Generating a new dependency graph (make sure to first build the latest version of FASTEN with mvn clean install)
PGPASSWORD=<postgres fastenro password> java -Xmx64g -cp docker/server/server-0.0.1-SNAPSHOT-with-dependencies.jar eu.fasten.core.maven.DependencyGraphBuilder <dependency-graph-path>/mvn_dep_graph
- Starting REST API
kubectl apply -f k8s-deployments/fasten/rest-api/restapi-deployment.yaml
If the database is not available, online resolution can be used to resolve dependencies. It downloads the POM file of the given artifact, runs mvn dependency:list, and parses the output. Here is how it can be used:
var mavenResolver = new MavenResolver();
var dependencySet = mavenResolver.resolveFullDependencySetOnline(groupId, artifactId, version);
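The parsing step can be sketched as follows. This is not the MavenResolver implementation. It is a minimal illustration that assumes the usual mvn dependency:list line shape, "[INFO]    groupId:artifactId:type:version:scope", with an optional classifier before the version. ToyDependencyListParser and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy parser for the textual output of `mvn dependency:list`; not FASTEN code.
final class ToyDependencyListParser {

    /** Extracts groupId:artifactId:version from every line that looks like a dependency. */
    static List<String> parse(List<String> lines) {
        List<String> coordinates = new ArrayList<>();
        for (String line : lines) {
            if (!line.startsWith("[INFO]")) {
                continue;
            }
            String entry = line.substring("[INFO]".length()).trim();
            // Only the first whitespace-delimited token can be a coordinate.
            String[] parts = entry.split("\\s+")[0].split(":");
            if (parts.length == 5) {
                // groupId:artifactId:type:version:scope
                coordinates.add(parts[0] + ":" + parts[1] + ":" + parts[3]);
            } else if (parts.length == 6) {
                // groupId:artifactId:type:classifier:version:scope
                coordinates.add(parts[0] + ":" + parts[1] + ":" + parts[4]);
            }
        }
        return coordinates;
    }

    public static void main(String[] args) {
        List<String> output = List.of(
                "[INFO] The following files have been resolved:",
                "[INFO]    org.slf4j:slf4j-api:jar:1.7.30:compile",
                "[INFO]    junit:junit:jar:4.13:test",
                "[INFO] BUILD SUCCESS");
        // prints [org.slf4j:slf4j-api:1.7.30, junit:junit:4.13]
        System.out.println(parse(output));
    }
}
```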
Let's assume that we have a dependency set D = {A, B, C}. If we have the isolated call graphs for A, B, and C, then we can Stitch them with respect to D. Precomputed FASTEN call graphs (RCGs, ERCGs, etc.) all have a section that lists their call sites. These call sites cannot be fully resolved in isolation, because an isolated call graph lacks sufficient information about the libraries it calls. Once we have a dependency set, the context needed to resolve such calls is available. The Stitching algorithm finds the valid targets for calls within RCGs with respect to a provided dependency set. For example, once the dependency set D and the RCGs of A, B, and C are available, Stitching can determine where the target of each call belongs with respect to D.
Important: Note that each RCG needs to be Stitched separately. Also, note that the dependency set must include all revisions, including the artifact itself. If the dependency resolver does not include the artifact's revision in the dependency set, users need to add it to the resolver's result separately.
For example, to find all possible edges in every package in D, one first needs to create a context by instantiating a merger with D (var merger = new CGMerger(D)) and then Stitch A, B, and C.
The merger.mergeWithCHA(ERCG artifact) method is provided for this.
This gives the flexibility to Stitch only the packages that are needed rather than everything present in the dependency set.
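The following toy sketch illustrates why the targets of a call site depend on the dependency set. It assumes a simplified class-hierarchy-style resolution, as the name mergeWithCHA suggests, in which a call on a receiver type may reach that type and any of its subtypes that define the method. It is not the CGMerger algorithm. ToyStitcher, targets, and both maps are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy class-hierarchy resolution over types gathered from a dependency set; not FASTEN code.
final class ToyStitcher {

    /** Returns every "Type.method" that a call to receiver.method may reach. */
    static Set<String> targets(String receiver, String method,
                               Map<String, String> parentOf,
                               Map<String, Set<String>> methodsByType) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : methodsByType.entrySet()) {
            if (entry.getValue().contains(method)
                    && isSameOrSubtype(entry.getKey(), receiver, parentOf)) {
                result.add(entry.getKey() + "." + method);
            }
        }
        return result;
    }

    /** Walks up the (acyclic) parent chain of type looking for ancestor. */
    static boolean isSameOrSubtype(String type, String ancestor, Map<String, String> parentOf) {
        for (String t = type; t != null; t = parentOf.get(t)) {
            if (t.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A contains the call site Shape.area(); B defines Shape; C defines Circle extends Shape.
        Map<String, Set<String>> typesInAB = Map.of("Shape", Set.of("area"));
        Map<String, Set<String>> typesInABC = Map.of(
                "Shape", Set.of("area"), "Circle", Set.of("area"));
        Map<String, String> parentOf = Map.of("Circle", "Shape");

        // With D = {A, B} prints [Shape.area]
        System.out.println(targets("Shape", "area", Map.of(), typesInAB));
        // With D = {A, B, C} prints [Circle.area, Shape.area]
        System.out.println(targets("Shape", "area", parentOf, typesInABC));
    }
}
```

The same call site yields a different target set once C joins the dependency set, which is why the merger is constructed from D before any individual call graph is merged.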
There are currently two options for Stitching Revision Call Graphs. The first works with RCG objects, which are already stored as serialized JSON files on FASTEN servers. The second uses the information available in the graph and metadata databases. The Java implementation is available in the develop branch.
To Stitch an ExtendedRevisionJavaCallGraph object A (from the example above) with respect to D (a List of such objects), one first needs to instantiate a CGMerger object using D and then merge the desired call graph object (in this case A) as follows:
var merger = new CGMerger(D);
var mergedA = merger.mergeWithCHA(A);
mergedA is a DirectedGraph object containing the edges that can occur between A and its libraries in the context of D. The merger object also provides a getAllUris() method, which can be used to retrieve the full URIs of the IDs in the merged graph.
As in the previous example, a merger instance is needed to perform Stitching. This merger constructor also needs a dbContext and a rocksDao:
var dbContext = PostgresConnector.getDSLContext("DBUrl","DBUser");
var rocksDao = RocksDBConnector.createReadOnlyRocksDBAccessObject("GraphDBDir");
As with the local merger, pass the dependency set to the merger constructor and the artifact to merge to the merge method:
var merger = new CGMerger(depSet, dbContext, rocksDao);
var mergedDirectedGraph = merger.mergeWithCHA(artifact);
// Alternatively, calling merger.mergeAllDeps() stitches the whole dependency set.
Note that the database merger works with Maven coordinates or GIDs instead of ERCG objects. So depSet is a List<String> in which each String has the form groupId:artifactId:version (e.g. org.digidoc4j:digidoc4j:1.0.7.beta.2). artifact is also a String specifying the Maven coordinate of the artifact to resolve. The output of the merge is a DirectedGraph containing the GlobalIDs of the nodes stored in the GraphDB.
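Since the dependency set has to contain the artifact itself, a small helper can build the coordinate list and make sure the root coordinate is present exactly once. This is only a sketch. ToyCoordinates and withRoot are hypothetical names and are not provided by FASTEN.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy helper for groupId:artifactId:version strings; not FASTEN code.
final class ToyCoordinates {

    /** Returns the root coordinate followed by the dependencies, without duplicates. */
    static List<String> withRoot(Collection<String> dependencies,
                                 String groupId, String artifactId, String version) {
        String root = String.join(":", groupId, artifactId, version);
        Set<String> all = new LinkedHashSet<>();
        all.add(root);
        all.addAll(dependencies);
        return new ArrayList<>(all);
    }

    public static void main(String[] args) {
        List<String> resolved = List.of("org.slf4j:slf4j-api:1.7.30");
        // prints [org.digidoc4j:digidoc4j:1.0.7.beta.2, org.slf4j:slf4j-api:1.7.30]
        System.out.println(withRoot(resolved, "org.digidoc4j", "digidoc4j", "1.0.7.beta.2"));
    }
}
```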
Currently, both resolution and stitching for Java are available in the develop branch.
Here is an example of how they can be used together to obtain a merged call graph:
// Connect to the metadata database
var dbContext = PostgresConnector.getDSLContext("jdbc:postgresql://localhost:5432/fasten_java", "fastenro");
// Build dependency graph and resolve the dependency set of the Maven artifact
var graphResolver = new GraphMavenResolver();
var serializedGraphPath = "../mvn_dep_graph";
graphResolver.buildDependencyGraph(dbContext, serializedGraphPath);
var dependencySet = graphResolver.resolveDependencies(groupId, artifactId, version, -1, dbContext, true);
// Collect the IDs of the resolved dependencies into a set
var depIds = dependencySet.stream().map(e -> e.id).collect(Collectors.toSet());
// Connect to the graph database
var rocksDao = RocksDBConnector.createReadOnlyRocksDBAccessObject("path/to/graphdb");
// Obtain and merge call graphs
var databaseMerger = new DatabaseMerger(depIds, dbContext, rocksDao);
var mergedDirectedGraph = databaseMerger.mergeWithCHA(groupId + ":" + artifactId + ":" + version);