Read in VT object file and migrate objects accordingly #431

Closed
lifflander opened this issue Aug 27, 2019 · 33 comments

@lifflander
Collaborator

Currently, VT can only write an object map file. See ProcStats::outputStatsFile in src/vt/vrt/collection/balance/proc_stats.cc for the implementation of writing out the map. We need to read in the map (without communication edges) and then migrate objects based on that mapping. A new load balancer may need to be created to do this.

@lifflander
Collaborator Author

@mperrinel @ppebay

@mperrinel
Contributor

I have produced some object map files using ProcStats::outputStatsFile and the examples/lb_iter program:
mpirun --use-hwthread-cpus -n 4 examples/lb_iter --vt_lb_stats --vt_lb_stats_dir=statsoutput --vt_lb_stats_file=statgreedy

I don't know if this is normal, but each run produces different stats.
I had to add the enable_LB option in CMake.

I think the Python LoadReader expects a file named base-name.node.vom, whereas the VT writer generates base-name.node.out.
I can change the extension in the writer from out to vom.

I have written a simple program that reads (using fscanf) the load part (not the communication part) of these files.
Where should I put it in VT? Is src/vt/vrt/collection/balance/proc_stats.cc a good place for it?

Should I create a new feature branch based on develop or on #427-lb-base-class?
If I add a new method to proc_stats.cc to read the stats, I don't know when to call it or how to use its result. I will work on this last point tomorrow.

Why might a new load balancer need to be created?

@lifflander
Collaborator Author

@mperrinel I have just merged #427 into develop, so please branch directly off of develop now.

  • Yes, src/vt/vrt/collection/balance/proc_stats.cc is a good place for it.
  • Can you create a new static method static <some-return-type> inputStatsFile(std::string filename); that reads the file?

@lifflander
Collaborator Author

Regarding the new load balancer:

  • After the file is read we need to put it in a data structure that the system can use. It could have the type: std::vector<std::unordered_map<ElementIDType,TimeType>> like ProcStats::proc_data_.
  • With this data in hand, we need to actually migrate the objects based on the file. That will require a load balancer that uses the data that was read to enact those changes.
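The first point could be sketched roughly as follows. This is only an illustration: the line layout `phase,element_id,load` is an assumption about the file format, `parseLoadLines` is a hypothetical helper (not an existing VT function), and the two type aliases are stand-ins for the real VT types. Lines with any other shape, such as communication edges, are skipped.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-ins for the VT types named above; the real aliases live in VT.
using ElementIDType = std::uint64_t;
using TimeType = double;
using PhaseLoadMap = std::vector<std::unordered_map<ElementIDType, TimeType>>;

// Hypothetical helper: parse lines of the assumed form "phase,element_id,load".
// The result is indexed first by phase, then by element ID.
PhaseLoadMap parseLoadLines(std::istream& in) {
  PhaseLoadMap result;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::size_t phase = 0;
    ElementIDType id = 0;
    TimeType load = 0.0;
    char c1 = 0, c2 = 0;
    // Skip lines that do not start with three comma-separated numbers.
    if (!(fields >> phase >> c1 >> id >> c2 >> load) || c1 != ',' || c2 != ',') {
      continue;
    }
    // Skip lines with trailing fields (for example, communication edges).
    char extra = 0;
    if (fields >> extra) {
      continue;
    }
    if (phase >= result.size()) {
      result.resize(phase + 1);
    }
    result[phase][id] = load;
  }
  return result;
}
```

Reading from a `std::istream` rather than a filename keeps the sketch easy to exercise with a `std::istringstream`; a real `inputStatsFile` would open the file and pass the stream through.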

Would it be useful to have a Skype voice meeting to discuss exactly how the new load balancer should act?

@mperrinel
Contributor

I think I can do the first point on my own. For the second point, I need more explanation! So I'm OK with the Skype voice meeting.
Today I am available at midday your time; would 15 minutes be enough? I'm currently in another meeting, and I won't be free later.
I'm free tomorrow if today is too difficult.

@mperrinel
Contributor

I have added the new method to proc_stats.cc to read the stats.
It is only partial for now, because it doesn't read the communication and doesn't loop over the num_iters variable (which corresponds to proc_data_.size()).
A new proc_data_in_ member variable contains the load values.

@lifflander
Collaborator Author

@mperrinel Does 12:30pm PDT work for you?

@mperrinel
Contributor

Yes

mperrinel added a commit that referenced this issue Sep 1, 2019
The input statistic method is now called during the runtime initialize method
@mperrinel
Contributor

Input VOM statistics can now be read by passing the new --vt_lb_stats_dir_in and --vt_lb_stats_file_in arguments.
To finish this feature:

  • Fill the proc_data_ data member directly instead of the new proc_data_in_ (which will be removed)
      ◦ Use the num_iters variable to finish the reader
  • Block the regular filling of proc_data_ if the input VOM file is used
  • Create a new LB that performs the object migration using proc_data_

@ppebay
Contributor

ppebay commented Sep 1, 2019 via email

mperrinel added a commit that referenced this issue Sep 2, 2019
proc_data is now used instead of proc_data_in.
@mperrinel
Contributor

  • The proc_data_ data member is now used directly instead of proc_data_in_
  • The regular filling of proc_data_ is blocked if the input VOM file is used

To finish this feature:

  • Remove proc_data_in_
  • Add a new map to the proc_data_ variable for each distinct value in the first column (the reader needs to be updated)
  • Create a new LB that performs the object migration using proc_data_

@lifflander
Collaborator Author

lifflander commented Sep 10, 2019

More detailed comments based on our discussion:

The system currently sets proc_data_ for each phase as the program runs. proc_data_ contains instrumentation which is valuable to the runtime even when following a user-specified map. It's the data structure that records how long each object actually took, which should not be modified based on a user mapping. It's a derived value and we need it to not change with the file so the statistics run properly.

We do need a new data structure for a user-specified map. It can go in ProcStats, but we could call it something else, maybe user_specified_map.

After the map is populated, the data from the map needs to be imported into the LB. Instead of messing with startLBHandler, we should just read it in runLB in the new load balancer. BaseLB::phase_ will tell you which phase you are on so you can index the first dimension of the user_specified_map.

Next, after reading, we should do reductions to determine whether any object has moved for a given phase. Each processor can do this by locally checking whether user_specified_map[i] differs from user_specified_map[i+1]. We can create a std::vector<bool> locally and then boolean-or reduce that vector across the whole machine into a new variable std::vector<bool> user_specified_map_changed;. Then, in LBManager::decideLBToRun, if the selected load balancer is the one driven by the user-specified file, we should index that vector to determine whether it needs to run.
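The local check could look something like the sketch below. It assumes that "different" means the set of object IDs listed for this node changes between consecutive phases (the loads themselves are ignored); `sameObjects` and `computeLocalChanges` are hypothetical names, and the type aliases are stand-ins for the VT types.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-ins for the VT types; the real aliases live in VT.
using ElementIDType = std::uint64_t;
using TimeType = double;
using PhaseMap = std::unordered_map<ElementIDType, TimeType>;

// Returns true if both phases list exactly the same object IDs.
// Assumption: only the keys matter, not the recorded loads.
bool sameObjects(PhaseMap const& a, PhaseMap const& b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (auto const& entry : a) {
    if (b.find(entry.first) == b.end()) {
      return false;
    }
  }
  return true;
}

// Entry i is true when phase i and phase i+1 differ on this node.
// The result has size() - 1 entries (or none for fewer than two phases).
std::vector<bool> computeLocalChanges(
  std::vector<PhaseMap> const& user_specified_map
) {
  std::vector<bool> changed;
  for (std::size_t i = 0; i + 1 < user_specified_map.size(); ++i) {
    changed.push_back(
      !sameObjects(user_specified_map[i], user_specified_map[i + 1])
    );
  }
  return changed;
}
```

The resulting vector is what each node would contribute to the boolean-or reduction.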

The remainder is just calling migrateObjectTo(ProcStats::proc_perm_to_temp_[obj_id], this_node)

mperrinel added a commit that referenced this issue Sep 10, 2019
Use the new user_specified_map_changed_ data member instead
@mperrinel
Contributor

Thanks @lifflander for the details!

I updated the proc_data_ stats to keep their old behaviour. In addition, the user_specified_map_ member variable stores the data coming from the input file.
In the new StatsMapLB class, I added a new variable of type std::vector<bool> that is filled by comparing user_specified_map_[i+1] with user_specified_map_[i]. The size of this vector is user_specified_map_.size() - 1. If there is at least one difference between the two phases, the corresponding vector value is set to true. This means I don't have a boolean value for every load in a phase, but a single boolean value for all the loads of that phase.
I didn't find the LBManager::decideLBToRun method, but in the runLB method I do the migration for a phase only if the corresponding vector entry is true.
I need more information about the std::vector<bool> reduction.
Let's talk about that at the meeting.

@lifflander
Collaborator Author

@mperrinel Here's a snippet of how you might do a reduce:

namespace vt { namespace collective { namespace reduce { namespace operators {

template <typename T>
struct OrOp<std::vector<T>> {
  void operator()(std::vector<T>& v1, std::vector<T> const& v2) {
    vtAssert(v1.size() == v2.size(), "Sizes of vectors in reduce must be equal");
    for (size_t ii = 0; ii < v1.size(); ++ii)
      v1[ii] = v1[ii] or v2[ii];
  }
};

}}}} /* end namespace vt::collective::reduce::operators */


struct StatsMapLB {
  using ReduceMsgType = collective::ReduceVecMsg<bool>;
  void doneReduce(ReduceMsgType* msg) {

  }

  void doReduce() {
    auto cb = theCB()->makeBcast<StatsMapLB,ReduceMsgType,&LBManager::doneReduce>(proxy);
    auto msg = makeMessage<MsgType>(in_vector);
    proxy.reduce<OrOp<std::vector<int>>>(msg.get(),cb);
  }

private:
  objgroup::proxy::Proxy<RotateLB> proxy = {};
};

This might not compile, but that's an example of what you need to write.

@mperrinel
Contributor

mperrinel commented Sep 12, 2019

Hi @lifflander,

I am trying to understand your details:

  1. I guess the reduction combines all the local vectors (one per node) into a global vector. So for 4 nodes, a 5th, global vector will hold the reduction of the 4 local vectors. Because it is a vector of bool, the right value for the global vector is the result of the or operator between the global vector (initialised with false values) and each local vector. That means that, for a given phase, if at least one node's vector contains a true value, the corresponding global value (for the same phase) will also be true.

  2. I put the OrOp<std::vector<T>> into .../vt/collective/reduce/operators/functors/or_op.h

  3. I put doReduce inside the StatsMapLB class (the LB where I already have the runLB method)

  4. I don't know where to put the doneReduce method. In your example, you wrote it in StatsMapLB, but you also refer to it in doReduce as if it came from LBManager. So I have put it in the invoke.h file, inside the InvokeLB struct. I'm really not sure about that.

  5. The theCB()->makeBcast<StatsMapLB,ReduceMsgType,&LBManager::doneReduce> call doesn't compile. I think it's because the doneReduce method should be implemented in a struct that inherits from vt::Collection<T1, T2>, which is not the case for the InvokeLB class. The corresponding makeBcast template is:

  template <typename ObjT, typename MsgT, ObjMemType<ObjT,MsgT> f>
  Callback<MsgT> makeBcast(objgroup::proxy::Proxy<ObjT> proxy);

As we can see, LBManager::doneReduce does not have the ObjMemType<ObjT,MsgT> form.

  6. How do I combine all of these new methods?
  • In the runLB method, I can just call the doReduce method.
  • Then, in the doneReduce method, I can get the global vector from the ReduceMsgType that should contain it. With it, I can check, for the phase_ attribute of the StatsMapLB, whether the global vector.at(phase_) returns true and, in that case, call the migration.
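The reduction semantics described in point 1, together with the phase check from the last bullet, can be illustrated without any VT machinery. The sketch below is a serial stand-in only: `orInto`, `reduceOr` and `shouldMigrate` are hypothetical names, and the loop over `per_node` replaces what the runtime's reduction would do across nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise boolean OR of v2 into v1; both vectors must have the same size.
void orInto(std::vector<bool>& v1, std::vector<bool> const& v2) {
  assert(v1.size() == v2.size());
  for (std::size_t i = 0; i < v1.size(); ++i) {
    v1[i] = v1[i] || v2[i];
  }
}

// Serial stand-in for the machine-wide reduction: start from an all-false
// global vector and fold in each node's local vector.
std::vector<bool> reduceOr(std::vector<std::vector<bool>> const& per_node) {
  if (per_node.empty()) {
    return {};
  }
  std::vector<bool> global(per_node.front().size(), false);
  for (auto const& local : per_node) {
    orInto(global, local);
  }
  return global;
}

// Hypothetical check for the reduce callback: migrate for this phase only if
// some node reported a change. Out-of-range phases are treated as "no change".
bool shouldMigrate(std::vector<bool> const& global, std::size_t phase) {
  return phase < global.size() && global[phase];
}
```

With four nodes, an entry of the global vector is true as soon as one of the four local vectors has true at that index, which matches the understanding stated in point 1.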

Thanks for your help. We can discuss this whenever you want!

mperrinel added a commit that referenced this issue Sep 12, 2019
The input statistic method is now called during the runtime initialize method
mperrinel added a commit that referenced this issue Sep 12, 2019
proc_data is now used instead of proc_data_in.
mperrinel added a commit that referenced this issue Sep 12, 2019
Use the new user_specified_map_changed_ data member instead
lifflander pushed a commit that referenced this issue Jan 28, 2020
lifflander pushed a commit that referenced this issue Jan 28, 2020
lifflander added a commit that referenced this issue Jan 28, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
…put files in ProcStats. Clean up variable names.
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander pushed a commit that referenced this issue Feb 5, 2020
lifflander added a commit that referenced this issue Feb 5, 2020