-
Notifications
You must be signed in to change notification settings - Fork 698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Partitioner] First Graph Partitioning #2268
Conversation
Thanks a lot @bertmaher ! Borrowed some ideas and code from your previous partition work :) |
64ab826
to
6580b76
Compare
might be worthwhile outline that design in the Github issue. |
lib/Partitioner/Partitioner.cpp
Outdated
/// Current only partition the representive function. | ||
DAGNode *Partitioner::doPartitioning(Function *F, NodeFunctionMap &mapping) { | ||
// The dummy node. | ||
DAGNode *DAG = new DAGNode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you return and use std::unique ptr to avoid any memory leak issues?
these days we should avoid naked new
, given all the support we have in std:: various pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdzhabarov After discussing with @gcatron, we use shared_ptr since all DAGNodes are in HM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nadavrot we don't use shared_ptr now. This is the current design: #2268 (comment)
1e6b60a
to
f6a36fe
Compare
|
||
using namespace runtime; | ||
|
||
using ModuleList = std::unique_ptr<Module>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this called list
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also it appears only to be used once so I'm not sure if it's needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. It was from previous design and not used anymore.
using MemUsageMap = std::unordered_map<Node *, unsigned>; | ||
|
||
/// Dependency map type. | ||
using MapTy = llvm::DenseMap<Module *, ModuleList>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused?
Map nodeToFunction_; | ||
|
||
public: | ||
/// Create a new partition \p F. Add it to the map with initial node \p N. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is the N param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the "N param" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, it says that in the comment on this line though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Forgot to update the comment :P
/// original function to destination partitions, along with a list of the | ||
/// newly-created functions. | ||
class NodeFunctionMap { | ||
using Map = llvm::DenseMap<Node *, Function *>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a name like NodeToFunctionMap
would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why DenseMap here but not in the visited list std::unordered_set<const Node *> visited;
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that xxxxxTy would be a better name, because this is not a map, it is a map type.
const FunctionList &getFunctions() const { return functions_; } | ||
|
||
/// \returns the number of partitions. | ||
Map::size_type size() const { return functions_.size(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is using Map
's size_type but returning the size of functions_ which is actually a FunctionList.
lib/Partitioner/Partitioner.cpp
Outdated
size_t size = memSize; | ||
|
||
// The set to keep the placeholders (only for Inputs) whose size is already | ||
// calcualted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calculated
// calcualted. | ||
std::set<std::string> pSet; | ||
|
||
for (auto &node : F->getNodes()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a quick overview of the logic here would be helpful like "Sum the sizes of all input Placeholders in the graph"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I'm surprised we don't have something for this already that gets all input placeholders, seems useful. I think I remember there being some attempt at this but some placeholders are inputs and outputs due to training, maybe we need to handle that case here also not sure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, a few utility functions (the input one, the memory cost for one arbitrary sub-graph...) should be helpful. But I would prefer separate PRs.
lib/Partitioner/Partitioner.cpp
Outdated
for (auto &node : F_->getNodes()) { | ||
int n = node.getNumInputs(); | ||
unsigned size = 0; | ||
if (node.getKind() != Kinded::Kind::SaveNodeKind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
easier to read if we do if (node.getKind() == Kinded::Kind::SaveNodeKind) {continue;}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there is a statement after this condition. I think it is OK to use "!=" here since I don't want to duplicate that statement .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me two reads to get it, how about:
if (node.getKind() == Kinded::Kind::SaveNodeKind) { memUsage_[&node] = 0; continue; }
It's not obvious that the n
and size
variables are unused in this case.
lib/Partitioner/Partitioner.cpp
Outdated
GraphPreOrderVisitor visitor(*F); | ||
Node *node = nullptr; | ||
for (auto &N : visitor.getPreOrder()) { | ||
if (isa<Storage>(N)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add brackets here
if (isa<Storage>(N)) | ||
continue; | ||
node = N; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, won't this just quit at the first non-storage node? Why can't there be multiple in the first level, is this the save node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a save node. The comment mentioned that now we assume there is only one root now. The current models I've met only have 1 output node. But it could be multiple roots as well later.
lib/Partitioner/Partitioner.cpp
Outdated
int current = 0; | ||
auto newPair = std::make_pair(level, std::vector<Node *>()); | ||
newPair.second.push_back(node); | ||
bfs.levels.push_back(newPair); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could just be bfs.levels.push_back({0, {node}});
Map nodeToFunction_; | ||
|
||
public: | ||
/// Create a new partition \p F. Add it to the map with initial node \p N. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, it says that in the comment on this line though
6874ac4
to
c9da452
Compare
#ifndef GLOW_PARTITIONER_PARTITIONER_H | ||
#define GLOW_PARTITIONER_PARTITIONER_H | ||
|
||
#include "glow/Graph/Context.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use forward declarations and move some of the includes to the cpp file?
Function *operator[](Node *n) { return nodeToFunction_[n]; } | ||
}; | ||
|
||
/// Partition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Partitions a function into multiple functions based on A... B... C...
/// The result of module partitioning. | ||
DAGNodeList partitions_; | ||
|
||
/// Total memory requested by one module. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Size in bytes.
lib/Partitioner/Partitioner.cpp
Outdated
using namespace glow; | ||
using llvm::isa; | ||
|
||
// Margin for code itself -- should be adjusted later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stale comment here.
// Margin for code itself -- should be adjusted later. | ||
//#define MARGIN 1024 * 128 | ||
|
||
struct BFSLevel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doxygen please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added.
lib/Partitioner/Partitioner.cpp
Outdated
|
||
auto funcList = module_->getFunctions(); | ||
for (Function *F : funcList) { | ||
// printf("function name : %s\n", F->getName().data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Printf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flush some comments
#define GLOW_PARTITIONER_PARTITIONER_H | ||
|
||
#include "glow/Graph/Graph.h" | ||
#include "llvm/ADT/DenseMap.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you separate includes for glow headers vs llvm vs standard ones.
|
||
namespace glow { | ||
|
||
using namespace runtime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should partitioner be in the runtime namespace as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DAGNode is defined in runtime namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean,
namespace glow {
namespace runtime {
... // rest of the code here.
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have include/glow/Runtime/.. and include/glow/Partitioner/...
I think Partitioner is more "compiler" level, not "runtime". The namespace should be glow::Partitioner.
|
||
public: | ||
/// Create a new partition \p F. | ||
void create(Function *F) { functions_.emplace_back(F); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
createPartition?
void add(Node *N, Function *F) { nodeToFunction_[N] = F; } | ||
|
||
/// Get list of functions contained in this map. | ||
const FunctionList &getFunctions() const { return functions_; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPartitions
}; | ||
|
||
/// Given a module, partitions each of the its functions into multiple ones | ||
/// based on memory constraints and minimizes the communication cost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd describe how partitioning is done here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current one is just a simple algorithm, still evaluating the next step improvement. Will add the algorithm description later.
Partitioner(Module *parent, std::vector<DeviceInfo> &devices); | ||
|
||
/// Decompose each function in a module and return a list of DAGNodes. | ||
DAGNodeList &Partition(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we allow caller to modify internal state of partitions by returning reference to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DAGNode will be fulfilled by Partitioner, Provisioner, and DeviceManager.
DAGNodeList &Partition(); | ||
}; | ||
} // namespace glow | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commend on what it ends after #endif
include/glow/Runtime/RuntimeTypes.h
Outdated
@@ -45,10 +45,10 @@ struct DeviceInfo { | |||
struct DAGNode { | |||
/// The children of this node, these are nodes that depend on the current | |||
/// node. | |||
std::vector<DAGNode *> children; | |||
std::vector<std::shared_ptr<DAGNode>> children; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rebase this on master.
lib/Partitioner/Partitioner.cpp
Outdated
|
||
Partitioner::Partitioner(Module *parent, std::vector<DeviceInfo> &devices) | ||
: deviceInfo_(devices) { | ||
module_ = parent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initialize in the initializer list.
/// original function to destination partitions, along with a list of the | ||
/// newly-created functions. | ||
class NodeFunctionMap { | ||
using Map = llvm::DenseMap<Node *, Function *>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why DenseMap here but not in the visited list std::unordered_set<const Node *> visited;
.
lib/Partitioner/Partitioner.cpp
Outdated
: deviceInfo_(devices) { | ||
module_ = parent; | ||
// Check if all the functions are in the same structure. | ||
if (!isSameFunctionFamily()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, for asserts like that you can do: assert(isSameFunctionFamily() && "not in the same function family")
lib/Partitioner/Partitioner.cpp
Outdated
|
||
/// Get the representative function (the one with the largest input) and update | ||
/// the memSize_. | ||
void Partitioner::selectRepFunc() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be static and have module as input, this way you can test it properly.
8e4a859
to
457e4e2
Compare
After more discussion with @opti-mix and @bertmaher, now inside of struct DAGNode, we use raw pointer. Then we use a DAGNodeList as a container to keep the unique_ptr of each DAGNode. |
Function *operator[](Node *n) { return nodeToFunction_[n]; } | ||
}; | ||
|
||
/// The struct contains all the created DAGNodes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
It could be a good idea to add some comments here and/or for DAGNode to explain the ownership model used here and who owns what. One of the consequences of this design is that DAGNode cannot outlive the DAGNodeList owning it. The second consequence is that DAGNodes can only refer to the DAGNodes from the same DAGNodeList and they don't need to use any smart pointers for that as all of DAGNodes are owned by the same DAGNodeList.
|
||
// The set to keep the placeholders (only for Inputs) whose size is | ||
// already calculated. | ||
std::set<std::string> pSet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can save stringRef here.
lib/Partitioner/Partitioner.cpp
Outdated
NodeToFunctionMap mapping; | ||
BFSLevel bfs = getBFSLevel(F); | ||
unsigned level = bfs.levels.size(); | ||
std::vector<int> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is formatted funny.
Function *F_; | ||
|
||
/// The cost model related to device. | ||
std::vector<DeviceInfo> &deviceInfo_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a const ref?
/// identical. The required memory and computation cost for each op can be | ||
/// found in Module. The \p devices provides the cost model related to | ||
/// devices. | ||
Partitioner(Module *parent, std::vector<DeviceInfo> &devices); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const ref then?
lib/Partitioner/Partitioner.cpp
Outdated
for (auto &node : F_->getNodes()) { | ||
int n = node.getNumInputs(); | ||
unsigned size = 0; | ||
if (node.getKind() != Kinded::Kind::SaveNodeKind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me two reads to get it, how about:
if (node.getKind() == Kinded::Kind::SaveNodeKind) { memUsage_[&node] = 0; continue; }
It's not obvious that the n
and size
variables are unused in this case.
lib/Partitioner/Partitioner.cpp
Outdated
// cut[0]), [cut[0] - 1, cut[1]), ..., [cut[n], -1). | ||
|
||
// Step 1 : get the initial cut based on BFS levels and avaiableMemory. | ||
// TODO .. need to remove the duplicated memory useage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: memory usage
lib/Partitioner/Partitioner.cpp
Outdated
BFSLevel bfs = getBFSLevel(F); | ||
unsigned level = bfs.levels.size(); | ||
std::vector<int> | ||
cut; // A list of cut. The graph can be partitioned by levels [level - 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is cut? Is this described in a different comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial partitions are based on the BFS level, and the cut is used for initial partitions.
lib/Partitioner/Partitioner.cpp
Outdated
/// Current only partition the representive function. | ||
void Partitioner::doPartitioning(Function *F, NodeToFunctionMap &mapping) { | ||
// The dummy node. | ||
std::unique_ptr<DAGNode> DAG(new DAGNode()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::make_unique<DAGNode>()
?
DAG->logicalDevice = 0; | ||
DAG->name = F->getName(); | ||
DAG->deviceID = 0; | ||
DAG->logicalDevice = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have a constructor of DAGNode that takes these initializations as arguments? Does it make sense to have a DAGNode without these set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DAGNode is filled by Partitioner, Provisioner and DeviceManager(?). According to my understanding, all the fields should be set before a DAGNode is actually in use. That is why I simply set each field related to Partitioner instead of using a constructor.
|
||
// Possible minimal k devices for a succesful partitioning | ||
// Note: here 2 is for testing; | ||
unsigned k = 2; //(memSize_ + MARGIN) / devices[0].availableMemory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can k
be a member of the Partitioner? This would let us write unit tests to cover multiple cases (k == 1, k == 2, k == n, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the partitioner, we will try to get the smallest k (which is the return value, not the param). Here 2 is just for testing.
Function *F_; | ||
|
||
/// The cost model related to device. | ||
std::vector<DeviceInfo> &deviceInfo_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a vector? Should the comment say that this is per-device info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason we use vector here is maybe later, we are consider heterogeneous devices. Will updated the comment.
|
||
/// Get the representative function (the one with the largest input) and | ||
/// update the memSize. | ||
static Function *selectRepFunc(Module *parent, size_t &memSize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should move this to the CPP file.
|
||
// The set to keep the placeholders (only for Inputs) whose size is | ||
// already calculated. | ||
std::set<llvm::StringRef> pSet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to save the names of the pointers to the placeholders?
for (auto &node : F->getNodes()) { | ||
int n = node.getNumInputs(); | ||
if (node.getKind() == Kinded::Kind::SaveNodeKind) { | ||
// Special node, the placeholder should be ignored? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on this? Why should we ignore this node? What is special?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For "Save" node, it is just create a placeholder for the output. Here, we only consider the input size.
for (int i = 0; i < n; i++) { | ||
Placeholder *in = | ||
llvm::dyn_cast<Placeholder>(node.getNthInput(i).getNode()); | ||
if (in && (pSet.find(in->getName()) == pSet.end())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find -> count()
} | ||
} | ||
} | ||
if (size > memSize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments.
lib/Partitioner/Partitioner.cpp
Outdated
Partitioner::Partitioner(Module *parent, std::vector<DeviceInfo> &devices) | ||
: module_(parent), deviceInfo_(devices) { | ||
// Check if all the functions are in the same structure. | ||
assert(isSameFunctionFamily() && "not in the same function family"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we decided not to use the concept of family functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nadavrot We are not using function families for partitioning (this definitely makes sense to me) or at all (I'm not clear yet on how this can done)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantics of "family" is not well defined. You can say that some functions are related if they share constants, or something, but these are all things that you can express in the Graph (with functions that refer to constants).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be removed after today's discussion.
lib/Partitioner/Partitioner.cpp
Outdated
bfs.visited.insert(node); | ||
level++; | ||
while (current < level) { | ||
std::vector<Node *> nList; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nList -> nodes?
/// the number of DAGNodes) is larger than the number of devices. E.g: | ||
/// node1(6GB) -> node2(14GB) -> node3(6GB). The memory limitation is 16GB, and | ||
/// there is only 2 devices. | ||
void Partitioner::adjustLogicalDeviceID(DAGNode *DAG, int num) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add something like assert(false && "invalid .. .. ");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partitioner will return the minimal k partitions. And the provisioner will double check if k fits the devices and give the invalid assert.
funcDAG[subF] = subDAG.get(); | ||
partitions_.nodes.push_back(std::move(subDAG)); | ||
} | ||
for (auto &N : subF->getNodes()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
if (subF == inputF) | ||
continue; | ||
|
||
if (funcDAG.find(inputF) == funcDAG.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
@beicy This PR is looking good. I have a few small comments but I think that we are close. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I am excited about seeing graph partitioning landing.
Description:
Graph partitioning. Use BFS order to get the initial partitions based on memory restrictions. (working on the advanced algorithm, which is not included in this PR)
Testing:
Added an unittest.
The original function is:
And the partitions are:
P1:
P2:
P3:
Documentation: #2045 #2298
[Optional Fixes #issue]
Please see a detailed explanation of how to fill out the fields in the relevant sections in PULL_REQUEST.md.