Integrate featurizers #1573
Merged
abab596
Merged PR 4815: Added Sample Featurizer and Infrastructure
6ce331e
Merged PR 4877: Traits class added
michaelgsharp c313b3d
Merged PR 4881: Including boost as a header only library for now (whi…
3883a48
Merged PR 4846: DateTime Transformer
michaelgsharp 4c794a1
Boilerplate build changes for featurizers and Kernel wrappers.
yuslepukhin 5f73457
Mv to a target folder
yuslepukhin 117c860
Merge branch 'master' of d:\dev\dp_copy into yuslepukhin/integrate_fe…
yuslepukhin a1b1c98
Make featurizers and unit tests compile and run with GTest.
yuslepukhin 5f91738
Merge branch 'master' into yuslepukhin/integrate_featurizers
yuslepukhin 93c295f
Provide unit tests for new AutoML DateTimeTransformer kernel.
yuslepukhin 09acbab
Add a missing header to CMake.
yuslepukhin
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Copyright (c) Microsoft Corporation. All rights reserved. | ||
# Licensed under the MIT License. | ||
# This source code should not depend on the onnxruntime and may be built independently | ||
|
||
file(GLOB automl_featurizers_srcs CONFIGURE_DEPENDS | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/*.h" | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/Featurizers/*.h" | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/Featurizers/*.cpp" | ||
) | ||
|
||
source_group(TREE ${ONNXRUNTIME_ROOT}/core/automl/ FILES ${automl_featurizers_srcs}) | ||
|
||
add_library(automl_featurizers ${automl_featurizers_srcs}) | ||
|
||
target_include_directories(automl_featurizers PRIVATE ${ONNXRUNTIME_ROOT} PUBLIC ${CMAKE_CURRENT_BINARY_DIR}) | ||
|
||
set_target_properties(automl_featurizers PROPERTIES FOLDER "AutoMLFeaturizers") | ||
|
||
# Unit tests for individual featurizers are added in bulk | ||
file(GLOB automl_featurizers_tests_srcs | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/Featurizers/UnitTests/*.cpp" | ||
) | ||
|
||
list(APPEND automl_featurizers_tests_srcs | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/UnitTests/Traits_UnitTests.cpp" | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/UnitTests/Featurizer_UnitTest.cpp" | ||
"${ONNXRUNTIME_ROOT}/core/automl/featurizers/src/FeaturizerPrep/UnitTests/test_main.cpp" | ||
) | ||
|
||
add_executable(automl_featurizers_unittests ${automl_featurizers_tests_srcs}) | ||
add_dependencies(automl_featurizers_unittests automl_featurizers) | ||
target_link_libraries(automl_featurizers_unittests PRIVATE gtest automl_featurizers) | ||
source_group(TREE ${ONNXRUNTIME_ROOT}/core/automl/ FILES ${automl_featurizers_tests_srcs}) | ||
set_target_properties(automl_featurizers_unittests PROPERTIES FOLDER "AutoMLFeaturizers") | ||
add_test(NAME automl_featurizers_unittests | ||
COMMAND automl_featurizers_unittests | ||
WORKING_DIRECTORY $<TARGET_FILE_DIR:automl_featurizers_unittests> | ||
) | ||
|
||
|
||
if (WIN32) | ||
# Add Code Analysis properties to enable C++ Core checks. Have to do it via a props file include. | ||
set_target_properties(automl_featurizers PROPERTIES VS_USER_PROPS ${PROJECT_SOURCE_DIR}/ConfigureVisualStudioCodeAnalysis.props) | ||
endif() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
// Cumulative header with automl featurizers includes exposed to | ||
// ORT | ||
#pragma once | ||
|
||
#include "core/automl/featurizers/src/FeaturizerPrep/Featurizers/DateTimeFeaturizer.h" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
#include "core/common/common.h" | ||
#include "core/framework/data_types.h" | ||
#include "core/framework/op_kernel.h" | ||
|
||
#include "automl_ops/automl_types.h" | ||
#include "automl_ops/automl_featurizers.h" | ||
|
||
namespace dtf = Microsoft::Featurizer::DateTimeFeaturizer; | ||
|
||
namespace onnxruntime { | ||
|
||
// This is a temporary measure: register custom types so that ORT is aware of them, | ||
// although it still cannot serialize such types. | ||
// These character arrays must be extern so that the resulting template instantiation | ||
// is globally unique. | ||
|
||
extern const char kMsAutoMLDomain[] = "com.microsoft.automl"; | ||
|
||
extern const char kTimepointName[] = "DateTimeFeaturizer_TimePoint"; | ||
// This has to be under onnxruntime to properly specialize a function template | ||
ORT_REGISTER_OPAQUE_TYPE(dtf::TimePoint, kMsAutoMLDomain, kTimepointName); | ||
|
||
namespace automl { | ||
|
||
#define REGISTER_CUSTOM_PROTO(TYPE, reg_fn) \ | ||
{ \ | ||
MLDataType mltype = DataTypeImpl::GetType<TYPE>(); \ | ||
reg_fn(mltype); \ | ||
} | ||
|
||
void RegisterAutoMLTypes(const std::function<void(MLDataType)>& reg_fn) { | ||
REGISTER_CUSTOM_PROTO(dtf::TimePoint, reg_fn); | ||
} | ||
#undef REGISTER_CUSTOM_PROTO | ||
} // namespace automl | ||
} // namespace onnxruntime |
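The comment in the file above says the character arrays must be `extern`. A minimal sketch of the C++ rule behind that remark follows. It assumes a hypothetical `OpaqueTag` template and hypothetical `kDemo...` arrays; it is not the `ORT_REGISTER_OPAQUE_TYPE` machinery, whose internals are not shown in this diff. A `const` array at namespace scope has internal linkage by default, so each translation unit would get its own copy and its own template instantiation. Declaring it `extern` gives it a single program-wide identity.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a type keyed by (domain, name) strings.
template <const char* Domain, const char* Name>
struct OpaqueTag {
  static const char* domain() { return Domain; }
  static const char* name() { return Name; }
};

// `extern` gives these const arrays external linkage, so every translation
// unit that names them refers to the same objects.
extern const char kDemoDomain[] = "com.example.demo";
extern const char kDemoName[] = "Demo_TimePoint";
extern const char kOtherName[] = "Demo_TimePoint";  // same text, different object

using DemoTag = OpaqueTag<kDemoDomain, kDemoName>;
using OtherTag = OpaqueTag<kDemoDomain, kOtherName>;

// The instantiation is identified by the array's address, not its contents.
static_assert(!std::is_same<DemoTag, OtherTag>::value,
              "distinct arrays yield distinct types");
```

Because identity follows the address, two arrays with identical text still produce two unrelated types, which is why the registration relies on one externally visible definition per name.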
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
#pragma once | ||
|
||
#include "core/framework/data_types.h" | ||
#include <functional> | ||
|
||
namespace onnxruntime { | ||
namespace automl { | ||
void RegisterAutoMLTypes(const std::function<void(MLDataType)>& reg_fn); | ||
} // namespace automl | ||
} // namespace onnxruntime |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
#include "core/common/common.h" | ||
#include "core/framework/data_types.h" | ||
#include "core/framework/op_kernel.h" | ||
|
||
#include "core/automl/featurizers/src/FeaturizerPrep/Featurizers/DateTimeFeaturizer.h" | ||
#include <chrono> | ||
|
||
namespace dtf = Microsoft::Featurizer::DateTimeFeaturizer; | ||
|
||
namespace onnxruntime { | ||
namespace automl { | ||
|
||
class DateTimeTransformer final : public OpKernel { | ||
public: | ||
explicit DateTimeTransformer(const OpKernelInfo& info) : OpKernel(info) {} | ||
Status Compute(OpKernelContext* context) const override; | ||
}; | ||
|
||
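Status DateTimeTransformer::Compute(OpKernelContext* ctx) const { | ||
// Input 0 holds a single int64 value: seconds since the system_clock epoch. | ||
const auto* input_tensor = ctx->Input<Tensor>(0); | ||
ORT_ENFORCE(input_tensor != nullptr, "DateTimeTransformer: input is missing"); | ||
dtf::TimePoint* output = ctx->Output<dtf::TimePoint>(0); | ||
|
||
int64_t tp = *input_tensor->Data<int64_t>(); | ||
std::chrono::system_clock::time_point sys_time{std::chrono::seconds(tp)}; | ||
// std::move around a function-call result is unnecessary; assign it directly. | ||
*output = dtf::SystemToDPTimePoint(sys_time); | ||
return Status::OK(); | ||
} | ||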
|
||
ONNX_OPERATOR_KERNEL_EX( | ||
DateTimeTransformer, | ||
kMSAutoMLDomain, | ||
1, | ||
kCpuExecutionProvider, | ||
KernelDefBuilder() | ||
.TypeConstraint("T1", DataTypeImpl::GetTensorType<int64_t>()) | ||
.TypeConstraint("T2", DataTypeImpl::GetType<Microsoft::Featurizer::DateTimeFeaturizer::TimePoint>()), | ||
DateTimeTransformer); | ||
} // namespace automl | ||
} // namespace onnxruntime |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
#include "automl_ops/cpu_automl_kernels.h" | ||
#include "core/graph/constants.h" | ||
#include "core/framework/data_types.h" | ||
|
||
namespace onnxruntime { | ||
namespace automl { | ||
|
||
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kCpuExecutionProvider, kMSAutoMLDomain, 1, DateTimeTransformer); | ||
|
||
void RegisterCpuAutoMLKernels(KernelRegistry& kernel_registry) { | ||
static const BuildKernelCreateInfoFn function_table[] = { | ||
// add more kernels here | ||
BuildKernelCreateInfo<ONNX_OPERATOR_KERNEL_CLASS_NAME(kCpuExecutionProvider, kMSAutoMLDomain, 1, DateTimeTransformer)> | ||
}; | ||
|
||
for (auto& function_table_entry : function_table) { | ||
kernel_registry.Register(function_table_entry()); | ||
} | ||
} | ||
|
||
} // namespace automl | ||
} // namespace onnxruntime |
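The registration function above walks a static table of builder functions and registers each result. A simplified, self-contained sketch of that table-driven pattern follows. All names here (`DemoKernelInfo`, `DemoRegistry`, `BuildDemoKernelInfo`, `DemoDateTimeTransformer`, `RegisterDemoKernels`) are hypothetical stand-ins, not the onnxruntime `KernelRegistry` or `BuildKernelCreateInfo` types.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified description of a kernel.
struct DemoKernelInfo {
  std::string op_name;
  int since_version;
};

// Hypothetical registry that just collects the descriptions.
class DemoRegistry {
 public:
  void Register(DemoKernelInfo info) { entries_.push_back(std::move(info)); }
  const std::vector<DemoKernelInfo>& entries() const { return entries_; }

 private:
  std::vector<DemoKernelInfo> entries_;
};

// Hypothetical kernel class exposing its own metadata.
struct DemoDateTimeTransformer {
  static const char* Name() { return "DateTimeTransformer"; }
  static int Version() { return 1; }
};

// One builder per kernel type; the table below stores pointers to these.
using BuildInfoFn = DemoKernelInfo (*)();

template <typename Kernel>
DemoKernelInfo BuildDemoKernelInfo() {
  return DemoKernelInfo{Kernel::Name(), Kernel::Version()};
}

inline void RegisterDemoKernels(DemoRegistry& registry) {
  static const BuildInfoFn function_table[] = {
      BuildDemoKernelInfo<DemoDateTimeTransformer>,
      // add more kernels here
  };
  for (const auto& build : function_table) {
    registry.Register(build());
  }
}
```

Adding a kernel then means adding one table entry; the loop itself does not change.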
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
// Copyright (c) Microsoft Corporation. All rights reserved. | ||
// Licensed under the MIT License. | ||
|
||
#pragma once | ||
|
||
#include "core/framework/op_kernel.h" | ||
#include "core/framework/kernel_registry.h" | ||
|
||
namespace onnxruntime { | ||
namespace automl { | ||
void RegisterCpuAutoMLKernels(KernelRegistry& kernel_registry); | ||
} // namespace automl | ||
} // namespace onnxruntime |
Is this supposed to be a transformer that transforms a tensor (not necessarily a scalar) of ints to a tensor/array of time_points? Or, is it supposed to be restricted to the case where the input is a scalar? Assuming that automl needs to work with batches of inputs, I would have expected support for a tensor of ints, not just a scalar.
The answer to this is no. This transformer converts a single timepoint into a Featurizer-specific type. We may add batch support later. This kernel is very experimental; its purpose is to verify that we can share the code, create models on the .NET side, and run them.
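For illustration only, the batch case raised in the review could be sketched with the standard library alone. This is not part of the PR. The helper names `FromEpochSeconds` and `FromEpochSecondsBatch` are hypothetical, and the sketch stops at `std::chrono::system_clock::time_point` rather than the Featurizer-specific `TimePoint` type, whose definition is not shown here. It also assumes, as the kernel does, that the input values are seconds since the `system_clock` epoch.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <iterator>
#include <vector>

using SysTimePoint = std::chrono::system_clock::time_point;

// Same scalar step as the kernel: seconds since the epoch -> time_point.
inline SysTimePoint FromEpochSeconds(std::int64_t seconds) {
  return SysTimePoint{std::chrono::seconds(seconds)};
}

// Hypothetical batch variant: apply the scalar conversion to every element.
inline std::vector<SysTimePoint> FromEpochSecondsBatch(
    const std::vector<std::int64_t>& seconds) {
  std::vector<SysTimePoint> out;
  out.reserve(seconds.size());
  std::transform(seconds.begin(), seconds.end(), std::back_inserter(out),
                 FromEpochSeconds);
  return out;
}
```

A real batch kernel would also have to decide on the output container and shape, which the experimental single-value kernel leaves open.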