master thesis placeholder - decentralised learning with security and unbounded scalability #7027
I have read through the existing papers and am still of the opinion that trust in Web3 is my favorite direction. Over the last two weeks, I have spent much time looking for related work and have composed a working document in which all relevant papers I have read are listed and briefly summarized for later use. The papers I have found can be divided into the following sections:
After having read the papers, I do not, as of now, know of any work where the user can frictionlessly experience trust in Web3 based on resource content alone. Many existing algorithms are based on the notion that agents can perform work for each other, and that it can automatically be checked whether this work was actually performed. For resource content, the solution is not so trivial, as computers cannot be trusted to automatically scan content and determine its trustworthiness. The following methods have been proposed:
One of my own ideas (brainstorming): use the duration spent accessing a certain resource as a metric to determine the trustworthiness of that resource for a specific search query. Not much work has been done on this method, and it comes with a few serious drawbacks:
I have more similar ideas, but they are all prone to Sybil attacks or require the notion of a 'session'. Another thought: if we were to use a trust graph, true information might be classified as unreliable due to its distance from the user in the trust graph. Also, how does the user indicate whom to trust? This can only be built dynamically if the user provides feedback on whether articles are trustworthy (i.e., not frictionless). Work document:
Web3 with a trust framework for good, evil, truth, and misinformation

Web3 platform information and finance are under continuous, relentless attack. We craft a future-proof architecture for trust in Web3.
We present the first frictionless Web3 platform which auto-detects trustworthiness based on reports by strangers. We combine a privacy-first method for detecting content attention, peer-to-peer collaboration, explicit user feedback on misinformation, and the user-to-user trust network, and use these to assess trustworthiness. ToDo: try to make an axiomatic model of your model. Idea: to solve Sybil attacks, the only thing we actually need is "reputation staking": users putting their own reputation on the line when claiming that a certain profile is real ("I know this real person! This is not a fake identity"). Instead of a social graph we have a "validity-of-personhood" graph. A mechanism to implement this could be tamper-proof recording of invite-code sharing and invite-code acceptance. Could be the thesis focus, dropping the tag complexity.
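The "validity-of-personhood" idea above could be prototyped roughly as follows. `VouchGraph`, the stake amounts, and the slashing rule are all hypothetical illustrations for this brainstorm, not a worked-out mechanism:

```python
# Hypothetical sketch of "reputation staking": a vouch is a claim that a
# profile belongs to a real person, backed by part of the voucher's own
# reputation. If the profile is exposed as fake, every staked vouch is lost.
# All names and numbers here are illustrative assumptions, not a spec.

class VouchGraph:
    def __init__(self):
        self.reputation = {}   # node -> current liquid reputation score
        self.vouches = {}      # target -> {voucher: staked amount}

    def add_node(self, node, initial_reputation=1.0):
        self.reputation[node] = initial_reputation

    def vouch(self, voucher, target, stake):
        # The voucher locks `stake` of their reputation behind the claim
        # "target is a real person".
        if stake <= 0 or stake > self.reputation[voucher]:
            raise ValueError("stake must be positive and within reputation")
        self.reputation[voucher] -= stake
        self.vouches.setdefault(target, {})[voucher] = stake

    def expose_sybil(self, target):
        # The target was proven fake: all stakes behind it are slashed.
        self.vouches.pop(target, None)
        self.reputation.pop(target, None)

    def validity(self, target):
        # Total reputation currently staked on the target's personhood.
        return sum(self.vouches.get(target, {}).values())
```

A 'negative vouch' could be modelled the same way, with the slashing condition inverted.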
Tried to look online for you for a broader perspective on Web3 trust. This is 2019 ML stuff with theoretical grounding: Privacy Risks of Securing Machine Learning Models against Adversarial Examples.
This iteration I spent mainly on reading more papers and made a start on the literature survey itself (see attached document). The idea of "reputation staking" seems very interesting. Although it makes me wonder:
The linked paper on social turing tests relies on a workforce actively checking entities, whereas we do not have such a trustworthy workforce (assume everyone is malicious). I am not sure about the papers on ML stuff (ML is not my forte), although I could consider leaving the hardcore ML outside the scope of this thesis and work solely on trust between cluster nodes. Take the attached document with a grain of salt, as it is merely a work document at the current stage.
On using vouching in reputation mechanisms: I have read the document by Microsoft on their vouching mechanism. However, that work is not focused on a decentralized setting:
A possible model I thought of, which supports web3 deployment, might include the following axiomatic properties:
Some related work:
Idea: the possibility of a 'negative vouch': I am willing to put some of my reputation at stake by saying that this person is not real. Although some incentive may be required for 'vouching negatively'.
Another idea: a less complex and (as far as I know) unexplored approach adopting a vouch graph:
Or, instead of the nodes that have vouched for exposed nodes having to perform the expensive task themselves:
Tip: a deep analysis of why academia failed and only industry succeeded?
Related: #3657
This iteration was spent mainly on (re)writing the literature survey. I have also read some more papers and plan to continue expanding on the current literature survey. Thoughts/discussion points for next meeting:
Current progress:
Idea of an internet neighbourhood in which you are born. Inspired by the evolution of cooperation: the social, biological angle with vouching, friends, "social capital" [ref], blood ties, merit, and hometown. The other angle is the commercial reputation angle, with rationality, slander, fake identities, and fraud. Edit: this old paper from 2003 on trust in online shopping has been cited over 10,000 times! See the excellent related-work table going back to 1967.
Final version of the literature survey. I will spend the coming week mainly focusing on a more specific model/algorithm for the thesis until our next meeting (which is also the kick-off for the master thesis).
Related work: this paper introduces a novel and economical protocol, entitled KarmaNET, which binds any routing protocol with trust to build a trusted social path and create judicious forwarders. This creates incentives for nodes to build good karma, and excises any node that has accumulated too much bad karma. Btw, quick reading feedback: Table 1: please add a year column. Plus also "something deep sciency" like: More in-depth comments:
"The Alchemy of Trust: The Creative Act of Designing Trustworthy Socio-Technical Systems" When designing a system with multiple stakeholders in their multiple contexts, how does one decide which trust model(s) to apply? And furthermore, how does one go from selecting a model or models to translating those into design? We review and analyse two prominent trust models, and apply them to the design of a trustworthy socio-technical system, namely virtual research environments.
Latest version of the literature survey (would like to start focusing exclusively on the thesis):

Latest vouching model:
Nice points:
Not so nice points:
Other remarks/observations:
Possible improvements:
Open questions:
Proposed direction:

Additional 'nice to have' features:
Alternatively, a more focused approach:
Infinitely shrinking ToDo list
Master thesis: first Sybil-resilient federated machine learning on edge devices. Use MeritRank.
- Improve MeritRank or improve federated learning? Choose!
- Strict 3 weeks of practical thesis direction exploration.
- Spend 3 days getting this code working.
- Formulate an IEEE 1-column "Problem Description".
- Check usability of a great dataset: 170,919 Creative Commons articles in the arXiv for biology.
- Explore: https://github.com/DQ-Zhang/refchaser
- Java: https://github.com/CeON/CERMINE
Note on master thesis direction which @synctext asked me to write down after meeting with Rohan and Johan: I would like to keep my thesis quite generic by continuing around the point where Joost (Bristle) left off (make it better/integrate with meritrank) as well as perform more thorough evaluation (both with and without Android device emulation for sequential round simulation). There is no point in focusing on a very specific field early on. However, a 'nice to have' would be to integrate it with either the Music DAO or the Literature DAO.
Thanks for the clarification! Let's see if we can nail down the non-IID part and possible dataset(s) for your thesis (e.g., I'm an ML beginner).
To discuss: "design, deployment, security, and even usage of federated machine learning with non-IID". Update: @devos50 do not use an experimental dataset (e.g. MusicDAO) AND create a new algorithm. So just boring ImageNet, because MNIST is just too embarrassing. Use the "edge devices" storyline to set us apart from classical learning and provide our Byzantine and fault-tolerance context.
@grimadas Notes from meeting just now: I am looking into the possibilities of improving MeritRank. Possible gaps:
On a further note: while improving MeritRank would be great, I am not sure whether this will actually be included in my master thesis. Integrating 'current' MeritRank with decentralized learning will be a prior step.
Brainstorm results. Focus: decentralised machine learning with non-IID data
Sprint goal: get familiar with the code and get the datasets going (Kotlin).
Cardinal lesson for your thesis design and experiments, don't mix like Bristle:
Note after lab meeting of yesterday:
Note: the third thesis direction will include the second direction, as the second direction is simply the combination of the weighted average and MeritRank. However, by choosing the third direction, we will not only find out the performance improvement of the second thesis direction, but also quantify the general increase in performance from introducing reputation metrics to decentralized learning. On the other hand, going with option 2 may allow us to add some extra neat tricks, such as gossiping edge weights rather than broadcasting them.
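The reputation-weighted aggregation that directions 2 and 3 share can be sketched as below. The flat-list model representation and the externally supplied reputation scores (e.g. produced by MeritRank) are simplifying assumptions:

```python
# Sketch of reputation-weighted model aggregation: instead of a plain
# average, each received model is weighted by the sender's reputation score.
# The reputation source (e.g. MeritRank) and the flat-vector model
# representation are simplifying assumptions for illustration.

def reputation_weighted_average(models, reputations):
    """models: {peer: [float, ...]}, reputations: {peer: float >= 0}."""
    total = sum(reputations[p] for p in models)
    if total == 0:
        raise ValueError("no reputation mass among senders")
    dim = len(next(iter(models.values())))
    aggregate = [0.0] * dim
    for peer, params in models.items():
        weight = reputations[peer] / total
        for i, value in enumerate(params):
            aggregate[i] += weight * value
    return aggregate
```

With equal reputations this reduces to plain averaging; skewed reputations pull the aggregate towards trusted peers.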
(relevant for future work, possibly relevant for this thesis)
Possible new idea (with inspiration from @devos50 ): For federated settings, there already exists a very nice defence mechanism: FoolsGold. They have shown it is possible to detect Sybils in federated learning by looking at the updates of the weights of the models every round (Sybils tend to have very similar updates every round). Visualization of intuition behind FoolsGold: However, in decentralized learning, this task becomes a bit more complex, as there is no central aggregator, and we do not have access to all trained models. My suggestion for solving this in a decentralized setting consists of the following items:
This way we handle the following cases:
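A minimal sketch of the FoolsGold intuition this builds on, assuming flat update vectors and omitting the paper's pardoning and logit re-scaling steps (so an illustration, not the full published algorithm):

```python
import math

# Minimal sketch of FoolsGold's core idea: Sybils pushing the same poisoning
# objective produce unusually similar update directions over time, so peers
# whose accumulated updates have high pairwise cosine similarity get their
# aggregation weight reduced. FoolsGold's pardoning and logit re-scaling
# steps are omitted here for brevity.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def foolsgold_weights(histories):
    """histories: {peer: accumulated update vector} -> {peer: weight in [0, 1]}."""
    peers = list(histories)
    weights = {}
    for p in peers:
        max_sim = max(
            (cosine(histories[p], histories[q]) for q in peers if q != p),
            default=0.0,
        )
        # Highly similar to some other peer -> likely Sybil -> low weight.
        weights[p] = max(0.0, 1.0 - max_sim)
    return weights
```

In the decentralized setting each node would run this locally over the subset of histories it has seen, which is exactly where the informativeness loss discussed above appears.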
The model gradients obtained from peers become the self-confession of being a Sybil. These ideas all sound interesting. Create a "thesis ideas" PDF. The next step is to find an academic sponsor for MSc thesis grading. Key question to ask: which idea could be published in an academic venue?
I do have some concerns about the proposed solution and non-IIDness of data. Non-IID data will lead to (honest) gradients with large differences and might incorrectly classify nodes as being Sybil. This effect is probably even more pronounced as each node has access to less models. It is hard to say if this idea works or not, and what it offers compared to related solutions. I would suggest an incremental approach here: 1) reproduce the experimental results of FoolsGold in centralised settings so you have a solid baseline, 2) have every node in the decentralized network run the FoolsGold main logic and quantify the "decentralisation penalty" / costs of decentralization in terms of attacker efficiency and possibly overhead, 3) add your magic solution and see how much it improves the situation.
Are you dynamically modifying the network topology? Note that there has been some work on building "optimal" network topologies to accelerate model convergence, see for example this paper. I was also recently pointed to this work that might be related to your approach.
Thesis very, very first draft (work document): all thesis ideas. Last week, I continued on the latest idea (FoolsGold decentralized). During this time, I have reconsidered the use of a reputation scheme
By using this approach, we do the following:
Main disadvantage of this approach:
Possible extension: the current idea of this new algorithm will only protect against targeted poisoning attacks (the adversary has a certain goal for which all Sybils need to send similar updates). Untargeted poisoning attacks (the adversary simply wants to ruin the global model by inserting random model updates) will not be caught by this algorithm, as the Sybils' model updates need not be similar. What we could do: by evaluating all received models on your own data, a naive reputation score can be calculated, which may be used in a more complex reputation scheme. The benefit of doing this is that nodes can be somewhat sure that received models are not just random noise. The drawback of such an approach is, evidently, the non-IIDness of data, which will result in inaccurate reputation scores, meaning that you may only trust someone who has somewhat similar data (we sacrifice informativeness for resilience). Proposed plan:
Lastly, I am planning on performing extensive evaluations (different network topologies, multiple datasets, many nodes).
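The naive local-evaluation reputation score mentioned above might look like this; representing models as bare predict functions is an illustrative simplification of evaluating a received neural network on local data:

```python
# Sketch of the "naive reputation" idea: evaluate every received model on
# your own local data and use its accuracy as a first-cut trust score.
# Models are abstracted as predict functions; with non-IID data these
# scores are noisy, which is exactly the drawback discussed above.

def local_reputation(models, local_data):
    """models: {peer: predict_fn}, local_data: [(x, label), ...] -> {peer: accuracy}."""
    scores = {}
    for peer, predict in models.items():
        correct = sum(1 for x, y in local_data if predict(x) == y)
        scores[peer] = correct / len(local_data)
    return scores
```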
Shorter update this week:
Let's further discuss this during our meeting. Furthermore, I have done some more reading on related work and started preparing the first experiment (reproducing FoolsGold's results). This experiment is almost ready to be performed on the DAS6.
D-Cliques: Compensating NonIIDness in Decentralized Federated Learning with Topology
This week's update: I am well under way with a simple framework in which I will be able to run all necessary experiments. To verify correct code execution, I ran default decentralized learning and federated learning, both non-IID and IID (1 hour). I also implemented a simple targeted poisoning Sybil attack, in which all Sybils attempt to make the final model classify a certain class as another class (label-flip attack). This graph shows the result with 100 Sybils and 100 honest nodes (9 hours). Furthermore, I started reproducing FoolsGold's results (1 hour).

Next iteration, I'm planning on continuing the reproduction of FoolsGold's results, implementing the backdoor attack, and showing why FoolsGold does not work as well in decentralized learning. I will also start on our own algorithm for which it does work.

General observation: machine learning people often do not see some risks in decentralized systems, while decentralized-systems people do not know much about machine learning. Trans-disciplinary research is required to truly push the frontier forward.

Hypothesis: naively applied decentralized FoolsGold may be vulnerable to very specific targeted poisoning Sybil attacks. The lack of a centralized server decreases informativeness during the aggregation process; removing the central server will result in significant performance degradation. After this week, I am confident that, given a proper implementation, we may be able to apply this to a large decentralized network (of Android devices). Our assumption is that generating random model updates is extremely cheap. Therefore, this will only work for targeted poisoning attacks, rather than untargeted poisoning attacks.
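The label-flip attack described in the update above amounts to relabelling one class as another in each Sybil's training data; a minimal sketch, with the dataset shape and class ids as assumptions:

```python
# Sketch of the targeted label-flipping attack: each Sybil trains on data
# where every sample of one (source) class is relabelled as another (target)
# class, steering the aggregate model to confuse the two. The tuple-based
# dataset representation is an illustrative assumption.

def label_flip(dataset, source_class, target_class):
    """dataset: [(features, label), ...] -> poisoned copy; input is untouched."""
    return [
        (x, target_class if y == source_class else y)
        for x, y in dataset
    ]
```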
Great progress!
This week's update (it's a long one):

More experimental results: our approach works

We have shown that FoolsGold works relatively well in specific cases, for example with a random geometric graph of 50 honest nodes and 50 Sybils performing a label-flipping attack. It is clear that FoolsGold performs significantly better than the basic averaging algorithm. If we compare FoolsGold to our own solution, we see that we lose some accuracy while completely removing the attacker's success. (We probably still need to tweak our algorithm a bit.) However, if we consider a different network topology, in which the honest nodes form a random geometric graph among themselves but each has exactly one attack edge, we see that our approach achieves much better results. Note that I realise this is a naive network topology, but it clearly shows that FoolsGold is not well-equipped for some decentralized learning configurations.

Our novel gossiping mechanism

Our gossiping mechanism works as follows:
We will also perform a theoretical analysis of this gossip mechanism and determine the relation between attack edge distribution and

Note: spoofing identities is out of scope of this research.

FoolsGold's experimental evaluation

Furthermore, I found that the results presented in FoolsGold are not very realistic. First of all, they use only 10 honest participants, which is by far not enough to obtain realistic results. Note that this number of participants coincidentally corresponds to the number of classes in MNIST and CIFAR-10. This brings us to the second point: their definition of non-IID entails that every peer has all or most samples of a single class. However, using the Dirichlet distribution to determine the class distribution over the peers is more realistic and more accepted in the scientific community (credits to Martijn). This means that FoolsGold's effectiveness is exaggerated, as it has been evaluated in an unrealistic environment, which, if necessary, we can prove. This does not mean that FoolsGold is useless; the core idea is still genius and works, and we can improve on it.

Experiments

Now that we have shown our solution works, I believe it is a good week to start deciding which experiments we want to perform and which graphs to include in the paper. As one would expect, there are many parameters involved in these decisions. I have composed a list of the relevant parameters, as well as whether I intend to include them.
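The Dirichlet-based non-IID partitioning argued for above can be sketched as follows. The Gamma-draw construction of Dirichlet proportions is standard, but the helper itself and the alpha value are illustrative assumptions:

```python
import random

# Sketch of Dirichlet non-IID partitioning, the more realistic alternative
# to FoolsGold's one-class-per-peer split: for every class, the share of its
# samples going to each peer is drawn from Dirichlet(alpha). Smaller alpha
# gives more skewed (more non-IID) partitions. Dirichlet proportions are
# obtained by normalising independent Gamma(alpha, 1) draws.

def dirichlet_partition(samples_by_class, num_peers, alpha=0.5, seed=42):
    rng = random.Random(seed)
    partitions = [[] for _ in range(num_peers)]
    for cls, samples in samples_by_class.items():
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_peers)]
        total = sum(draws)
        proportions = [d / total for d in draws]
        start = 0
        for peer, p in enumerate(proportions):
            count = round(p * len(samples))
            partitions[peer].extend(samples[start:start + count])
            start += count
        # Any rounding leftover goes to the last peer.
        partitions[-1].extend(samples[start:])
    return partitions
```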
Weight relevance

As was noted in FoolsGold's paper and by Johan previously, the cosine similarity function no longer works if Sybils add random irrelevant noise to the model updates. Therefore, we have strong motivation to consider techniques for determining the relevant weights in the model updates and to perform the cosine similarity only on these. The paper proposing FoolsGold does not go into detail on this, and only looks at the output layer to directly map edge weights to probabilities. I believe that we can do better and look at all weights in the model. Therefore, I have composed the following techniques (note that I do not yet know which one will be most effective).
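One candidate technique in this direction, restricting the cosine similarity to the k largest-magnitude coordinates of a reference update, might be sketched as below; the top-k criterion is just one of the options under consideration, not a settled choice:

```python
import math

# Sketch of "weight relevance" via magnitude: keep only the k coordinates
# with the largest absolute value in a reference update, and compute cosine
# similarity on that subset so random noise in irrelevant weights cannot
# dilute it. The top-k criterion is one illustrative option among several.

def top_k_indices(vector, k):
    """Indices of the k largest-magnitude coordinates."""
    return sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)[:k]

def masked_cosine(a, b, indices):
    """Cosine similarity restricted to the given coordinate subset."""
    dot = sum(a[i] * b[i] for i in indices)
    na = math.sqrt(sum(a[i] ** 2 for i in indices))
    nb = math.sqrt(sum(b[i] ** 2 for i in indices))
    return dot / (na * nb) if na and nb else 0.0
```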
I am not planning to go into much more detail here due to time constraints, but I would like to discuss this aspect and present our findings on the best option.

Paper

I am also planning to think about the paper structure. This is my current plan:

Paper structure:
Next iteration's plan
Plenty of work to be done! :) Also, this meeting we will have to complete the thesis examination committee form and submit it to the BoE.
@synctext Remark on Gossip relay feedback:
I will for now assume the first option. Note that this will not affect the results of the experiments, but will likely simplify the code base.
https://proceedings.mlr.press/v162/zhang22w/zhang22w.pdf UPDATE: try to do a grand thesis: the FoolsGold attack assumption is completely naive; attackers have lots of compute and humans. No shame in making the storyline "it can't be solved": proving the hardness of distributed ML security. We need passport-level identity 💡
This sprint's update: Thesis with problem description and draft structure

Other completed tasks:
Plan for next sprint:
Note: I took your advice to keep a single storyline into account and decided that we should indeed likely not focus too much on novel weight relevance methods (as described in my previous sprint update).
Thesis update: Other notes:
This week's update: Additional remarks:
This week's update: Additional remarks:
I have submitted the first thesis version for feedback to the thesis committee. |
Grades for With Honor graduation. So focus on a full scientific thesis. Graduation Fri 7 July 2023?
Brainstorm {proudly copied from Rohan issue}. Current main idea:
Other directions for inspiration:
OR
OR
OR
DAO cryptoeconomics. Warmly recommended is this cutting-edge-of-science stuff:
https://brian2509.github.io/test-123213/thresh-hold%20signatures/frost-implementations/
== crypto for creating a leaderless organisation scaling to billions of people and billions in treasury money.
First step: literature survey. Example from a while ago. 24 Oct target finish (start thesis 2 weeks early)