[flang][hlfir] Fixed some finalization/deallocation issues. #67047

vzakhari · 2023-09-21T17:54:55Z

This set of commits resolves some of the issues with elemental calls producing
results that may require finalization, and also some memory leak issues due to
the missing deallocation of allocatable components of the temporary buffers
created by the bufferization pass.

I split it into 7 commits to make the review a little bit more convenient, though,
I am planning to merge it as a single squashed commit.

[flang][runtime] Expose Finalize API for derived types.
[flang][hlfir] Add 'finalize' attribute for DestroyOp.
[flang][hlfir] Postpone result finalization for elemental calls.

The results of elemental calls generated inside hlfir.elemental must not
be finalized/destructed before they are copied into the resulting
array. The finalization must be done on the array as a whole
(e.g. there might be different scalar and array finalization routines).
The finalization work is left to the hlfir.destroy corresponding
to this hlfir.elemental.
[flang][hlfir] Tighten requirements on hlfir.end_associate operand.

If component deallocation might be required for the operand of
hlfir.end_associate, we have to be able to get the variable
shape/params to create a descriptor for calling the runtime.
This commit adds verification that we can do so.
[flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate.

The operand must be a Fortran entity, when allocatable component
deallocation may be required.
[flang][hlfir] Properly clean-up temporary buffers in bufferization pass.

This commit combines changes for proper finalization and component
deallocation of the temporary buffers. The finalization part
relates to hlfir.destroy operations with 'finalize' attribute.
The component deallocation might be invoked for both hlfir.destroy
and hlfir.end_associate, if the operand is of a derived type
with allocatable component(s).

The changes are mostly in one function, so I decided not to split them.
[flang][hlfir] Disable optimizations for hlfir.elemental requiring finalization.

If hlfir.elemental is coupled with hlfir.destroy with 'finalize' attribute,
the temporary array result of hlfir.elemental needs to be created
for the purpose of finalization. We cannot do certain optimizations
on such hlfir.elemental operations.

I was not able to come up with a test for the OptimizedBufferization pass,
but I put the check there as well.

The results of elemental calls generated inside hlfir.elemental must not be finalized/destructed before they are copied into the resulting array. The finalization must be done on the array as a whole (e.g. there might be different scalar and array finalization routines). The finalization work is left to the hlfir.destroy corresponding to this hlfir.elemental.

If component deallocation might be required for the operand of hlfir.end_associate, we have to be able to get the variable shape/params to create a descriptor for calling the runtime. This commit adds verification that we can do so.

The operand must be a Fortran entity, when allocatable component deallocation may be required.

…ass. This commit combines changes for proper finalization and component deallocation of the temporary buffers. The finalization part relates to hlfir.destroy operations with 'finalize' attribute. The component deallocation might be invoked for both hlfir.destroy and hlfir.end_associate, if the operand is of a derived type with allocatable component(s). The changes are mostly in one function, so I decided not to split them.

…nalization. If hlfir.elemental is coupled with hlfir.destroy with 'finalize' attribute, the temporary array result of hlfir.elemental needs to be created for the purpose of finalization. We cannot do certain optimizations on such hlfir.elemental operations. I was not able to come up with a test for the OptimizedBufferization pass, but I put the check there as well.

llvmbot · 2023-09-21T17:56:02Z

@llvm/pr-subscribers-flang-runtime

@llvm/pr-subscribers-flang-fir-hlfir

Changes

This set of commits resolves some of the issues with elemental calls producing
results that may require finalization, and also some memory leak issues due to
the missing deallocation of allocatable components of the temporary buffers
created by the bufferization pass.

I split it into 7 commits to make the review a little bit more convenient, though,
I am planning to merge it as a single squashed commit.

[flang][runtime] Expose Finalize API for derived types.
[flang][hlfir] Add 'finalize' attribute for DestroyOp.
[flang][hlfir] Postpone result finalization for elemental calls.
[flang][hlfir] Tighten requirements on hlfir.end_associate operand.
[flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate.
[flang][hlfir] Properly clean-up temporary buffers in bufferization pass.
[flang][hlfir] Disable optimizations for hlfir.elemental requiring finalization.

Patch is 68.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67047.diff

29 Files Affected:

(modified) flang/include/flang/Lower/ConvertCall.h (+3-1)
(modified) flang/include/flang/Optimizer/Builder/HLFIRTools.h (+7-31)
(modified) flang/include/flang/Optimizer/Builder/Runtime/Derived.h (+5)
(modified) flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h (+38)
(modified) flang/include/flang/Optimizer/HLFIR/HLFIROps.td (+23-2)
(modified) flang/include/flang/Runtime/derived-api.h (+4)
(modified) flang/lib/Lower/ConvertCall.cpp (+55-9)
(modified) flang/lib/Lower/HlfirIntrinsics.cpp (+4)
(modified) flang/lib/Optimizer/Builder/HLFIRTools.cpp (+9)
(modified) flang/lib/Optimizer/Builder/Runtime/Derived.cpp (+12)
(modified) flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp (+5)
(modified) flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp (+36-1)
(modified) flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp (+104-18)
(modified) flang/lib/Optimizer/HLFIR/Transforms/InlineElementals.cpp (+5)
(modified) flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp (+7)
(modified) flang/runtime/derived-api.cpp (+12)
(modified) flang/test/HLFIR/associate-codegen.fir (+1-1)
(added) flang/test/HLFIR/bufferize-destroy-for-derived.fir (+97)
(added) flang/test/HLFIR/bufferize-end-associate-for-derived.fir (+52)
(modified) flang/test/HLFIR/bufferize-poly-expr.fir (+1-1)
(modified) flang/test/HLFIR/destroy.fir (+16)
(modified) flang/test/HLFIR/elemental-codegen-nested.fir (+1-1)
(modified) flang/test/HLFIR/inline-elemental.fir (+37)
(modified) flang/test/HLFIR/invalid.fir (+16)
(modified) flang/test/HLFIR/mul_transpose.f90 (+2-1)
(added) flang/test/Lower/HLFIR/associate-for-args-with-alloc-components.f90 (+65)
(added) flang/test/Lower/HLFIR/elemental-call-with-finalization.f90 (+60)
(modified) flang/test/Lower/HLFIR/poly_expr_for_nonpoly_dummy.f90 (+2-2)
(modified) flang/test/Lower/HLFIR/polymorphic-expressions.f90 (+1-1)

diff --git a/flang/include/flang/Lower/ConvertCall.h b/flang/include/flang/Lower/ConvertCall.h
index 62ba229614d58c1..f8171236bb39d99 100644
--- a/flang/include/flang/Lower/ConvertCall.h
+++ b/flang/include/flang/Lower/ConvertCall.h
@@ -28,11 +28,13 @@ namespace Fortran::lower {
 /// the call and return the result. This function deals with explicit result
 /// allocation and lowering if needed. It also deals with passing the host
 /// link to internal procedures.
+/// \p isElemental must be set to true if elemental call is being produced.
+/// It is only used for HLFIR.
 fir::ExtendedValue genCallOpAndResult(
     mlir::Location loc, Fortran::lower::AbstractConverter &converter,
     Fortran::lower::SymMap &symMap, Fortran::lower::StatementContext &stmtCtx,
     Fortran::lower::CallerInterface &caller, mlir::FunctionType callSiteType,
-    std::optional<mlir::Type> resultType);
+    std::optional<mlir::Type> resultType, bool isElemental = false);
 
 /// If \p arg is the address of a function with a denoted host-association tuple
 /// argument, then return the host-associations tuple value of the current
diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h
index 6d73ebc3a7e1d9c..393e70f772e5ecf 100644
--- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h
+++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h
@@ -35,37 +35,6 @@ class ElementalOpInterface;
 class ElementalAddrOp;
 class YieldElementOp;
 
-/// Is this an SSA value type for the value of a Fortran procedure
-/// designator ?
-inline bool isFortranProcedureValue(mlir::Type type) {
-  return type.isa<fir::BoxProcType>() ||
-         (type.isa<mlir::TupleType>() &&
-          fir::isCharacterProcedureTuple(type, /*acceptRawFunc=*/false));
-}
-
-/// Is this an SSA value type for the value of a Fortran expression?
-inline bool isFortranValueType(mlir::Type type) {
-  return type.isa<hlfir::ExprType>() || fir::isa_trivial(type) ||
-         isFortranProcedureValue(type);
-}
-
-/// Is this the value of a Fortran expression in an SSA value form?
-inline bool isFortranValue(mlir::Value value) {
-  return isFortranValueType(value.getType());
-}
-
-/// Is this a Fortran variable?
-/// Note that by "variable", it must be understood that the mlir::Value is
-/// a memory value of a storage that can be reason about as a Fortran object
-/// (its bounds, shape, and type parameters, if any, are retrievable).
-/// This does not imply that the mlir::Value points to a variable from the
-/// original source or can be legally defined: temporaries created to store
-/// expression values are considered to be variables, and so are PARAMETERs
-/// global constant address.
-inline bool isFortranEntity(mlir::Value value) {
-  return isFortranValue(value) || isFortranVariableType(value.getType());
-}
-
 /// Is this a Fortran variable for which the defining op carrying the Fortran
 /// attributes is visible?
 inline bool isFortranVariableWithAttributes(mlir::Value value) {
@@ -442,6 +411,13 @@ hlfir::ElementalOp cloneToElementalOp(mlir::Location loc,
                                       fir::FirOpBuilder &builder,
                                       hlfir::ElementalAddrOp elementalAddrOp);
 
+/// Return true, if \p elemental must produce a temporary array,
+/// for example, for the purpose of finalization. Note that such
+/// ElementalOp's must be optimized with caution. For example,
+/// completely inlining such ElementalOp into another one
+/// would be incorrect.
+bool elementalOpMustProduceTemp(hlfir::ElementalOp elemental);
+
 } // namespace hlfir
 
 #endif // FORTRAN_OPTIMIZER_BUILDER_HLFIRTOOLS_H
diff --git a/flang/include/flang/Optimizer/Builder/Runtime/Derived.h b/flang/include/flang/Optimizer/Builder/Runtime/Derived.h
index 30998f8b0ea6558..d8b06f35b1da8a6 100644
--- a/flang/include/flang/Optimizer/Builder/Runtime/Derived.h
+++ b/flang/include/flang/Optimizer/Builder/Runtime/Derived.h
@@ -31,6 +31,11 @@ void genDerivedTypeInitialize(fir::FirOpBuilder &builder, mlir::Location loc,
 void genDerivedTypeDestroy(fir::FirOpBuilder &builder, mlir::Location loc,
                            mlir::Value box);
 
+/// Generate call to derived type finalization runtime routine
+/// to finalize \p box.
+void genDerivedTypeFinalize(fir::FirOpBuilder &builder, mlir::Location loc,
+                            mlir::Value box);
+
 /// Generate call to derived type destruction runtime routine to
 /// destroy \p box without finalization
 void genDerivedTypeDestroyWithoutFinalization(fir::FirOpBuilder &builder,
diff --git a/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h b/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h
index b76063fb7c53536..aa68d0811c4868a 100644
--- a/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h
+++ b/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h
@@ -78,6 +78,37 @@ inline bool isPolymorphicType(mlir::Type type) {
   return fir::isPolymorphicType(type);
 }
 
+/// Is this an SSA value type for the value of a Fortran procedure
+/// designator ?
+inline bool isFortranProcedureValue(mlir::Type type) {
+  return type.isa<fir::BoxProcType>() ||
+         (type.isa<mlir::TupleType>() &&
+          fir::isCharacterProcedureTuple(type, /*acceptRawFunc=*/false));
+}
+
+/// Is this an SSA value type for the value of a Fortran expression?
+inline bool isFortranValueType(mlir::Type type) {
+  return type.isa<hlfir::ExprType>() || fir::isa_trivial(type) ||
+         isFortranProcedureValue(type);
+}
+
+/// Is this the value of a Fortran expression in an SSA value form?
+inline bool isFortranValue(mlir::Value value) {
+  return isFortranValueType(value.getType());
+}
+
+/// Is this a Fortran variable?
+/// Note that by "variable", it must be understood that the mlir::Value is
+/// a memory value of a storage that can be reason about as a Fortran object
+/// (its bounds, shape, and type parameters, if any, are retrievable).
+/// This does not imply that the mlir::Value points to a variable from the
+/// original source or can be legally defined: temporaries created to store
+/// expression values are considered to be variables, and so are PARAMETERs
+/// global constant address.
+inline bool isFortranEntity(mlir::Value value) {
+  return isFortranValue(value) || isFortranVariableType(value.getType());
+}
+
 bool isFortranScalarNumericalType(mlir::Type);
 bool isFortranNumericalArrayObject(mlir::Type);
 bool isFortranNumericalOrLogicalArrayObject(mlir::Type);
@@ -94,6 +125,13 @@ bool isPolymorphicObject(mlir::Type);
 mlir::Value genExprShape(mlir::OpBuilder &builder, const mlir::Location &loc,
                          const hlfir::ExprType &expr);
 
+/// Return true iff `ty` may have allocatable component.
+/// TODO: this actually belongs to FIRType.cpp, but the method's implementation
+/// depends on HLFIRDialect component. FIRType.cpp itself is part of FIRDialect
+/// that cannot depend on HLFIRBuilder (there will be a cyclic dependency).
+/// This has to be cleaned up, when HLFIR is the default.
+bool mayHaveAllocatableComponent(mlir::Type ty);
+
 } // namespace hlfir
 
 #endif // FORTRAN_OPTIMIZER_HLFIR_HLFIRDIALECT_H
diff --git a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
index 1f584d6afd8fbcb..9fbf09331c099c0 100644
--- a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
+++ b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
@@ -705,6 +705,8 @@ def hlfir_EndAssociateOp : hlfir_Op<"end_associate", [MemoryEffects<[MemFree]>]>
 
   let description = [{
     Mark the end of life of a variable associated to an expression.
+    If the expression has a derived type that may contain allocatable
+    components, the variable operand must be a Fortran entity.
   }];
 
   let arguments = (ins AnyRefOrBoxLike:$var,
@@ -715,6 +717,7 @@ def hlfir_EndAssociateOp : hlfir_Op<"end_associate", [MemoryEffects<[MemFree]>]>
   }];
 
   let builders = [OpBuilder<(ins "hlfir::AssociateOp":$associate)>];
+  let hasVerifier = 1;
 }
 
 def hlfir_AsExprOp : hlfir_Op<"as_expr",
@@ -981,6 +984,11 @@ def hlfir_DestroyOp : hlfir_Op<"destroy", [MemoryEffects<[MemFree]>]> {
     Mark the last use of an hlfir.expr. This will be the point at which the
     buffer of an hlfir.expr, if any, will be deallocated if it was heap
     allocated.
+    If "finalize" attribute is set, the hlfir.expr value will be finalized
+    before the deallocation. Note that this implies that the hlfir.expr
+    is placed into a memory buffer, so that the library runtime
+    can be called on it. The element type of the hlfir.expr must be
+    derived type in this case.
     It is not required to create an hlfir.destroy operation for and hlfir.expr
     created inside an hlfir.elemental and returned in the hlfir.yield_element.
     The last use of such expression is implicit and an hlfir.destroy could
@@ -995,9 +1003,22 @@ def hlfir_DestroyOp : hlfir_Op<"destroy", [MemoryEffects<[MemFree]>]> {
     in bufferization instead.
   }];
 
-  let arguments = (ins hlfir_ExprType:$expr);
+  let arguments = (ins
+    hlfir_ExprType:$expr,
+    UnitAttr:$finalize
+  );
+
+  let assemblyFormat = [{
+    $expr (`finalize` $finalize^)? attr-dict `:` qualified(type($expr))
+  }];
+
+  let extraClassDeclaration = [{
+    bool mustFinalizeExpr() {
+      return getFinalize();
+    }
+  }];
 
-  let assemblyFormat = "$expr attr-dict `:` qualified(type($expr))";
+  let hasVerifier = 1;
 }
 
 def hlfir_CopyInOp : hlfir_Op<"copy_in", [MemoryEffects<[MemAlloc]>]> {
diff --git a/flang/include/flang/Runtime/derived-api.h b/flang/include/flang/Runtime/derived-api.h
index 3bf631bc4d0c763..decba9f686d9201 100644
--- a/flang/include/flang/Runtime/derived-api.h
+++ b/flang/include/flang/Runtime/derived-api.h
@@ -37,6 +37,10 @@ void RTNAME(Initialize)(
 // storage.
 void RTNAME(Destroy)(const Descriptor &);
 
+// Finalizes the object and its components.
+void RTNAME(Finalize)(
+    const Descriptor &, const char *sourceFile = nullptr, int sourceLine = 0);
+
 /// Deallocates any allocatable/automatic components.
 /// Does not deallocate the descriptor's storage.
 /// Does not perform any finalization.
diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp
index 59d059e27cf1aae..788aac2d583f93d 100644
--- a/flang/lib/Lower/ConvertCall.cpp
+++ b/flang/lib/Lower/ConvertCall.cpp
@@ -148,7 +148,7 @@ fir::ExtendedValue Fortran::lower::genCallOpAndResult(
     mlir::Location loc, Fortran::lower::AbstractConverter &converter,
     Fortran::lower::SymMap &symMap, Fortran::lower::StatementContext &stmtCtx,
     Fortran::lower::CallerInterface &caller, mlir::FunctionType callSiteType,
-    std::optional<mlir::Type> resultType) {
+    std::optional<mlir::Type> resultType, bool isElemental) {
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   using PassBy = Fortran::lower::CallerInterface::PassEntityBy;
   // Handle cases where caller must allocate the result or a fir.box for it.
@@ -435,7 +435,13 @@ fir::ExtendedValue Fortran::lower::genCallOpAndResult(
     std::optional<Fortran::evaluate::DynamicType> retTy =
         caller.getCallDescription().proc().GetType();
     bool cleanupWithDestroy = false;
-    if (!fir::isPointerType(funcType.getResults()[0]) && retTy &&
+    // With HLFIR lowering, isElemental must be set to true
+    // if we are producing an elemental call. In this case,
+    // the elemental results must not be destroyed, instead,
+    // the resulting array result will be finalized/destroyed
+    // as needed by hlfir.destroy.
+    if (!isElemental && !fir::isPointerType(funcType.getResults()[0]) &&
+        retTy &&
         (retTy->category() == Fortran::common::TypeCategory::Derived ||
          retTy->IsPolymorphic() || retTy->IsUnlimitedPolymorphic())) {
       if (retTy->IsPolymorphic() || retTy->IsUnlimitedPolymorphic()) {
@@ -692,6 +698,14 @@ struct PreparedDummyArgument {
     cleanups.emplace_back(
         CallCleanUp{CallCleanUp::ExprAssociate{tempVar, wasCopied}});
   }
+  void pushExprAssociateCleanUp(hlfir::AssociateOp associate) {
+    mlir::Value hlfirBase = associate.getBase();
+    mlir::Value firBase = associate.getFirBase();
+    cleanups.emplace_back(CallCleanUp{CallCleanUp::ExprAssociate{
+        hlfir::mayHaveAllocatableComponent(hlfirBase.getType()) ? hlfirBase
+                                                                : firBase,
+        associate.getMustFreeStrorageFlag()}});
+  }
 
   mlir::Value dummy;
   // NOTE: the clean-ups are executed in reverse order.
@@ -896,8 +910,7 @@ static PreparedDummyArgument preparePresentUserCallActualArgument(
           loc, builder, hlfir::Entity{copy}, storageType, "adapt.valuebyref");
       entity = hlfir::Entity{associate.getBase()};
       // Register the temporary destruction after the call.
-      preparedDummy.pushExprAssociateCleanUp(
-          associate.getFirBase(), associate.getMustFreeStrorageFlag());
+      preparedDummy.pushExprAssociateCleanUp(associate);
     } else if (mustDoCopyInOut) {
       // Copy-in non contiguous variables.
       assert(entity.getType().isa<fir::BaseBoxType>() &&
@@ -924,8 +937,7 @@ static PreparedDummyArgument preparePresentUserCallActualArgument(
     hlfir::AssociateOp associate = hlfir::genAssociateExpr(
         loc, builder, entity, storageType, "adapt.valuebyref");
     entity = hlfir::Entity{associate.getBase()};
-    preparedDummy.pushExprAssociateCleanUp(associate.getFirBase(),
-                                           associate.getMustFreeStrorageFlag());
+    preparedDummy.pushExprAssociateCleanUp(associate);
     if (mustSetDynamicTypeToDummyType) {
       // Rebox the actual argument to the dummy argument's type, and make
       // sure that we pass a contiguous entity (i.e. make copy-in,
@@ -1201,7 +1213,8 @@ genUserCall(Fortran::lower::PreparedActualArguments &loweredActuals,
   // arguments.
   fir::ExtendedValue result = Fortran::lower::genCallOpAndResult(
       loc, callContext.converter, callContext.symMap, callContext.stmtCtx,
-      caller, callSiteType, callContext.resultType);
+      caller, callSiteType, callContext.resultType,
+      callContext.isElementalProcWithArrayArgs());
 
   /// Clean-up associations and copy-in.
   for (auto cleanUp : callCleanUps)
@@ -1687,9 +1700,14 @@ class ElementalCallBuilder {
     mlir::Value elemental =
         hlfir::genElementalOp(loc, builder, elementType, shape, typeParams,
                               genKernel, !mustBeOrdered, polymorphicMold);
+    // If the function result requires finalization, then it has to be done
+    // for the array result of the elemental call. We have to communicate
+    // this via the DestroyOp's attribute.
+    bool mustFinalizeExpr = impl().resultMayRequireFinalization(callContext);
     fir::FirOpBuilder *bldr = &builder;
-    callContext.stmtCtx.attachCleanup(
-        [=]() { bldr->create<hlfir::DestroyOp>(loc, elemental); });
+    callContext.stmtCtx.attachCleanup([=]() {
+      bldr->create<hlfir::DestroyOp>(loc, elemental, mustFinalizeExpr);
+    });
     return hlfir::EntityWithAttributes{elemental};
   }
 
@@ -1743,6 +1761,26 @@ class ElementalUserCallBuilder
     return {};
   }
 
+  bool resultMayRequireFinalization(CallContext &callContext) const {
+    std::optional<Fortran::evaluate::DynamicType> retTy =
+        caller.getCallDescription().proc().GetType();
+    if (!retTy)
+      return false;
+
+    if (retTy->IsPolymorphic() || retTy->IsUnlimitedPolymorphic())
+      fir::emitFatalError(
+          callContext.loc,
+          "elemental function call with [unlimited-]polymorphic result");
+
+    if (retTy->category() == Fortran::common::TypeCategory::Derived) {
+      const Fortran::semantics::DerivedTypeSpec &typeSpec =
+          retTy->GetDerivedTypeSpec();
+      return Fortran::semantics::IsFinalizable(typeSpec);
+    }
+
+    return false;
+  }
+
 private:
   Fortran::lower::CallerInterface &caller;
   mlir::FunctionType callSiteType;
@@ -1804,6 +1842,14 @@ class ElementalIntrinsicCallBuilder
     return {};
   }
 
+  bool resultMayRequireFinalization(
+      [[maybe_unused]] CallContext &callContext) const {
+    // FIXME: need access to the CallerInterface's return type
+    // to check if the result may need finalization (e.g. the result
+    // of MERGE).
+    return false;
+  }
+
 private:
   const Fortran::evaluate::SpecificIntrinsic *intrinsic;
   const fir::IntrinsicArgumentLoweringRules *argLowering;
diff --git a/flang/lib/Lower/HlfirIntrinsics.cpp b/flang/lib/Lower/HlfirIntrinsics.cpp
index 62f63e376c815df..20e570044e8d4ae 100644
--- a/flang/lib/Lower/HlfirIntrinsics.cpp
+++ b/flang/lib/Lower/HlfirIntrinsics.cpp
@@ -330,6 +330,9 @@ std::optional<hlfir::EntityWithAttributes> Fortran::lower::lowerHlfirIntrinsic(
     const Fortran::lower::PreparedActualArguments &loweredActuals,
     const fir::IntrinsicArgumentLoweringRules *argLowering,
     mlir::Type stmtResultType) {
+  // If the result is of a derived type that may need finalization,
+  // we have to use DestroyOp with 'finalize' attribute for the result
+  // of the intrinsic operation.
   if (name == "sum")
     return HlfirSumLowering{builder, loc}.lower(loweredActuals, argLowering,
                                                 stmtResultType);
@@ -348,6 +351,7 @@ std::optional<hlfir::EntityWithAttributes> Fortran::lower::lowerHlfirIntrinsic(
   if (name == "dot_product")
     return HlfirDotProductLowering{builder, loc}.lower(
         loweredActuals, argLowering, stmtResultType);
+  // FIXME: the result may need finalization.
   if (name == "transpose")
     return HlfirTransposeLowering{builder, loc}.lower(
         loweredActuals, argLowering, stmtResultType);
diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp
index dd62aa0e370122f..7034d6e893e7e9a 100644
--- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp
+++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp
@@ -1021,3 +1021,12 @@ hlfir::cloneToElementalOp(mlir::Location loc, fir::FirOpBuilder &builder,
                                elementalAddrOp.getShape(), typeParams,
                                genKernel, !elementalAddrOp.isOrdered());
 }
+
+bool hlfir::elementalOpMustProduceTemp(hlfir::ElementalOp elemental) {
+  for (mlir::Operation *useOp : elemental->getUsers())
+    if (auto destroy = mlir::dyn_cast<hlfir::DestroyOp>(useOp))
+      if (destroy.mustFinalizeExpr())
+        return true;
+
+  return false;
+}
diff --git a/flang/lib/Optimizer/Builder/Runtime/Derived.cpp b/flang/lib/Optimizer/Builder/Runtime/Derived.cpp
index 8975656ffb38026..fe7e2d157ad9a6b 100644
--- a/flang/lib/Optimizer/Builder/Runtime/Derived.cpp
+++ b/flang/lib/Optimizer/Builder/Runtime/Derived.cpp
@@ -37,6 +37,18 @@ void fir::runtime::genDerivedTypeDestroy(fir::FirOpBuilder &builder,
   builder.create<fir::CallOp>(loc, func, args);
 }
 
+void fir::runtime::genDerivedTypeFinalize(fir::FirOpBuilder &builder,
+                                          mlir::Location loc, mlir::Value box) {
+  auto func = fir::runtime::getRuntimeFunc<mkRTKey(Finalize)>(loc, builder);
+  auto fTy = func.getFunctionType();
+  auto sourceFile = fir::factory::locationToFilename(builder, loc);
+  auto sourceLine =
+      fir::factory::locationToLineNo(builder, loc, fTy.getInput(2));
+  auto args = fir::runtime::createArguments(builder, loc, fTy, box, sourceFile,
+                                            sourceLine);
+  builder.create<fir::CallOp>(loc, func, args);
+}
+
 void fir::runtime::genDerivedTypeDestroyWithoutFinalization(
     fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value box) {
   auto func = fir::runtime::getRuntimeFunc<mkRTKey(DestroyWithoutFinalization)>(
diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp
index 7ca6108a31acbb6..d3a6fb305c19919 100644
--- a/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp
+++ b/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp
@@ -207,3 +207,8 @@ mlir::Value hlfir::genExprShape(mlir::OpBuilder &builder,
   fir::ShapeOp shape = builder.create<fir::ShapeOp>(loc, shapeTy, extents);
   return shape.getResult();
 }
+
+bool hlfir::mayHaveAllocatableComponent(mlir::Type ty) {
+  return fir::isPolymorphicType(ty) || fir::isUnlimitedPolymorphicType(ty) ||
+         fir::isRecordWithAllocatableMember(hlfir::getFortranElementType(ty));
+}
diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flan...
[truncated]

vzakhari · 2023-09-21T18:02:14Z

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

@@ -772,6 +852,12 @@ struct ElementalOpConversion
    // Assign the element value to the temp element for this iteration.
    auto tempElement =
        hlfir::getElementAt(loc, builder, temp, loopNest.oneBasedIndices);
+    // FIXME: if the elemental result is a function result temporary


I will do this in a separate PR, because this one already touches a few pieces.

jeanPerier

Thanks for the clean-up on this non trivial problem Slava!

jeanPerier · 2023-09-22T08:23:30Z

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

-    builder.create<fir::FreeMemOp>(loc, var);
+
+    if (mustFinalize)
+      fir::runtime::genDerivedTypeFinalize(builder, loc, var);


I think Destroy may actually be a better runtime call here since automatic component should be deallocated if any. Currently, this does not make a difference since "automatic component" requires support for derived type with length parameter in lowering.
In general, the only place where we would want to use Finalize instead of Destroy is when the entity will be used again after finalization (I only see assignment LHS, and INTENT(OUT) dummies that fit this pattern), otherwise we will have memory leaks.
Now, we do not support length PDTs, so I am also fine with calling Finalize here and wait until we see how lengths PDTs are implemented.

I believe the automatic components will be deallocated by DestroyWithoutFinalization call below. Of course, hlfir::mayHaveAllocatableComponent have to return true, if there is either allocatable or automatic component, which might be not the case right now.

We may also call Destroy if both mustFinalize and deallocComponent are true.

jeanPerier · 2023-09-22T08:32:16Z

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

+    if (auto destroy = mlir::dyn_cast<hlfir::DestroyOp>(useOp)) {
+      if (destroy.mustFinalizeExpr())
+        return false;


While semantically correct, it is a bit annoying to introduce a copy instead of running the finalizers at the hlfir.end_associate.

Moving the finalization at the end associate needs a bit of thinking though since it is not the same if the associated buffer was modified (I think correct Fortran cannot modified a variable that we create with hlfir.associate, except maybe for VALUE dummies...), so I am OK with doing the copy right now.

I agree! I think we will need finalize attribute for hlfir.end_associate to communicate the "theft" of the buffer that needs finalization.

I also agree that we have to be careful about stealing the buffer from the elemental op, when the buffer might be modified inside the association scope, e.g.:

module types type t integer :: i contains final :: finalize end type t contains subroutine finalize(x) type(t), intent(inout) :: x print *, x%i end subroutine finalize end module types subroutine test(x) use types type(t) :: x(10) call callee(elem(x)) contains subroutine callee(x) type(t), value :: x(10) x%i = 10 end subroutine callee impure elemental function elem(x) type(t), intent(in) :: x type(t) :: elem elem = x end function elem end subroutine test

Since the callee might modify the buffer, the modification can be made visible via an impure finalization routine. I guess it may only be the case for impure elemental calls, so we can probably rely on the unordered attribute to decide if it is safe to steal the buffer.

tblah

Great work!

flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp

vzakhari

Thank you both for the review!

I am going to merge this without further modifications, and will follow-up in additional PR(s).

flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp

vzakhari · 2023-09-22T16:19:02Z

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

+    if (auto destroy = mlir::dyn_cast<hlfir::DestroyOp>(useOp)) {
+      if (destroy.mustFinalizeExpr())
+        return false;


I agree! I think we will need finalize attribute for hlfir.end_associate to communicate the "theft" of the buffer that needs finalization.

I also agree that we have to be careful about stealing the buffer from the elemental op, when the buffer might be modified inside the association scope, e.g.:

module types type t integer :: i contains final :: finalize end type t contains subroutine finalize(x) type(t), intent(inout) :: x print *, x%i end subroutine finalize end module types subroutine test(x) use types type(t) :: x(10) call callee(elem(x)) contains subroutine callee(x) type(t), value :: x(10) x%i = 10 end subroutine callee impure elemental function elem(x) type(t), intent(in) :: x type(t) :: elem elem = x end function elem end subroutine test

Since the callee might modify the buffer, the modification can be made visible via an impure finalization routine. I guess it may only be the case for impure elemental calls, so we can probably rely on the unordered attribute to decide if it is safe to steal the buffer.

vzakhari · 2023-09-22T16:23:37Z

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

-    builder.create<fir::FreeMemOp>(loc, var);
+
+    if (mustFinalize)
+      fir::runtime::genDerivedTypeFinalize(builder, loc, var);


I believe the automatic components will be deallocated by DestroyWithoutFinalization call below. Of course, hlfir::mayHaveAllocatableComponent have to return true, if there is either allocatable or automatic component, which might be not the case right now.

We may also call Destroy if both mustFinalize and deallocComponent are true.

Local branch amd-gfx 957cf8f Merged main:3510552df673 into amd-gfx:d975d0fd1040 Remote branch main ab1db26 [flang][hlfir] Fixed some finalization/deallocation issues. (llvm#67047)

vzakhari added 7 commits September 21, 2023 10:49

[flang][runtime] Expose Finalize API for derived types.

f0c8d51

[flang][hlfir] Add 'finalize' attribute for DestroyOp.

1188081

[flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate.

f4533ec

The operand must be a Fortran entity, when allocatable component deallocation may be required.

vzakhari requested review from tblah and jeanPerier September 21, 2023 17:54

llvmbot added flang:runtime flang Flang issues not falling into any other category flang:fir-hlfir labels Sep 21, 2023

vzakhari commented Sep 21, 2023

View reviewed changes

jeanPerier approved these changes Sep 22, 2023

View reviewed changes

tblah approved these changes Sep 22, 2023

View reviewed changes

flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp Show resolved Hide resolved

vzakhari commented Sep 22, 2023

View reviewed changes

vzakhari merged commit ab1db26 into llvm:main Sep 22, 2023

kstoimenov mentioned this pull request Sep 22, 2023

Add memcpm test kstoimenov/llvm-project#13

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang][hlfir] Fixed some finalization/deallocation issues. #67047

[flang][hlfir] Fixed some finalization/deallocation issues. #67047

vzakhari commented Sep 21, 2023 •

edited

Loading

llvmbot commented Sep 21, 2023 •

edited

Loading

vzakhari Sep 21, 2023

jeanPerier left a comment

jeanPerier Sep 22, 2023

vzakhari Sep 22, 2023

jeanPerier Sep 22, 2023

vzakhari Sep 22, 2023

tblah left a comment

vzakhari left a comment

vzakhari Sep 22, 2023

vzakhari Sep 22, 2023

[flang][hlfir] Fixed some finalization/deallocation issues. #67047

[flang][hlfir] Fixed some finalization/deallocation issues. #67047

Conversation

vzakhari commented Sep 21, 2023 • edited Loading

llvmbot commented Sep 21, 2023 • edited Loading

vzakhari Sep 21, 2023

Choose a reason for hiding this comment

jeanPerier left a comment

Choose a reason for hiding this comment

jeanPerier Sep 22, 2023

Choose a reason for hiding this comment

vzakhari Sep 22, 2023

Choose a reason for hiding this comment

jeanPerier Sep 22, 2023

Choose a reason for hiding this comment

vzakhari Sep 22, 2023

Choose a reason for hiding this comment

tblah left a comment

Choose a reason for hiding this comment

vzakhari left a comment

Choose a reason for hiding this comment

vzakhari Sep 22, 2023

Choose a reason for hiding this comment

vzakhari Sep 22, 2023

Choose a reason for hiding this comment

vzakhari commented Sep 21, 2023 •

edited

Loading

llvmbot commented Sep 21, 2023 •

edited

Loading