Using proximity to mitigate the Sybil attack #4481

synctext · 2019-05-03T12:50:49Z

We utilize the inherent fraud-resilience of proximity to stop unlimited fake identity creation

Close proximity is a property which is easy to validate. Also online it is possible to prove that somebody is less than 500km away. This enables us to distinguish close acquaintance from far away strangers.

Simple methods such as sending a large number and expecting it to be returned quickly offers a tamper-proof proximity measure. The laws of physics and limited speed of light prevent tampering and can prove that you are located nearby.

This master thesis will create a mathematical model which reflects the success of proximity-based defenses for a given strength of the Sybil attack.

We make a realistic assumption on attacker capabilities. It is far harder to create fake identities in close proximity to a victim than at any other random location on the Internet. We assume that attackers have finite resources and some financial constraints. The goal is to prove strong mathematical properties of defenses and attacks. In a generic cooperative agent model build trust after several successful interactions. We believe that a successful strategy to build cooperative cluster in an overwhelmingly defectors-dominated world is to do some initial risk taking when meeting strangers. By adding our proximity mechanism it is possible to build highly effective mechanisms to identify cooperators in close proximity plus test, discard non-cooperators, and mitigate (Sybil) attackers. We further assume cooperators can gossip about the collaboration levels of others using a tamper-proof data structure, for instance, our IETF Internet Standard draft trustchain.

Seuken proved that it is impossible to defend against the Sybil attack in the general case, in any system which has single-report responsiveness. This thesis work is meticulously designed to bypass that impossibility result and create a meaningful, effective and even highly efficient system without single-report responsiveness. We have looked into restricting communication to strangers using proximity for years, #2541. Prateek Mittal from Princeton exploited the temporal dynamics of the Sybil attack for a defense. This work also builds upon our our prior work by Pim Otte.

Desired outcome:

literature scan (games on graphs, All Novak work, social capital accounting, etc.)
Sybil-Proofness using proximity in model form.
ToDo: beautiful math proof

(still tentative, we just see how far we get with this master thesis direction)

alexander-stannat · 2019-05-17T12:04:03Z

Reading List (done):

Trust, Reputation and Accounting Mechanisms

Evolutionary Game Theory

To read:

I have written a pretty comprehensive introduction to the trust problem and have been thinking about the different existing trust and reputation mechanisms, such as

BarterCast
Netflow
Eigentrust
DropEdge
TrustRank
etc

I wrote down all necessary requirements that our mechanism will have to fulfill and devised a mathematical framework for these. In the coming week I will finish the framework for trust and incorporate elements from Nowak's papers about the development of trust and reputation and evolutionary game theory, so that we can send the first draft to Japan.

For the geographic proximity I have thought about this:
When an agent (A) in the network decides that one of its peers (S) might be a sybil, controlled by another peer (B) in the network. Then A wants to determine if S is, in fact the same person as B. In order to do this, A will ping both B and S and depending on the time it takes for the ping to return, A can determine whether its assumption is correct. In order to prevent S or B from spoofing the ping, by artificially delaying their response time, A can collude with other trusted peers in the network and send out multiple pings from different locations. The goal is to geographically surround S with A's trusted partners, which will prevent S from lying about its location. The resulting ping times are cross-referenced to determine whether S and B are in the same place.

There is a slight problem with this though. How many peers does A have to work with and where should these peers be located in order for no credible delay to exist with which S can convince everyone that S and B are not in the same place. I will perform some testing over the weekend and come up with some formulas in the next sprint.

devos50 · 2019-05-18T07:33:01Z

How many peers does A have to work with and where should these peers be located in order for no credible delay to exist with which S can convince everyone that S and B are not in the same place.

I would say this highly depends on the trust requirements of the application in which you deploy this system. It would be nice to have some model that determines: peer S is a Sybil, controlled by B with X% confidence. This is influenced by the number of peers you select for the "ping test", and the variance in ping times.

synctext · 2019-05-20T11:02:54Z

Fascinating discussion outcome:

emergent effect: only cooperators are long-term successful
- specifically aligned the incentives.
- applied evolutionary game theory
triangulation of exact location is possible, but unacceptable violation of privacy for majority of Internet users
close proximity is impossible to fake, it is indeed a very restrictive mechanism
goal is to create a full-proof mechanism that is guaranteed to always work
Transitive exploration of trust
- our mechanism is very restrictive in the bootstrap phase
- Strategy: build cooperative cluster in an overwhelmingly defectors-dominated world
- do some initial risk taking when meeting strangers
- after the bootstrap phase, we continue to meet strangers
  - reliable cooperators introduce us to their reliable cooperators.
  - generalized close proximity: either close to us or to our reliable cooperators
  - Emergent effect: successful clusters of cooperation grow, link with other clusters, and merge into a single connected cooperative cluster.
close attacker can generate an infinite number of Sybils
- Fundamentally impossible to stop them initially
- Each Sybil requires an Internet point of attachment to communicate (IP address)
- We require another mechanism to protect us, use IPv4 address blocks
Close attacker protection mechanism
- Like normal city neighborhoods and houses, the Internet not only has a sense of proximity, but also the concept of a block of houses, called Internet address blocks
- stop infinite identity creation by close attackers within your neighborhood
- attackers always control a limited amount of Internet address blocks
- Maintain a reputation score for each Internet address block (IPv4 range)
- Prefer to meet strangers in higher reputed neighborhoods within your proximity (like prefer traveling up North versus going to the unsafe South)
to illustrate the effectiveness of repeated interaction and evolutionary survival: http:https://netlogoweb.org/launch#http:https://netlogoweb.org/assets/modelslib/Sample%20Models/Social%20Science/Unverified/Prisoner's%20Dilemma/PD%20Basic%20Evolutionary.nlogo
Like postal codes/zip codes the Internet has neighborhoods, see Internet address block allocation:

synctext · 2019-05-21T05:54:14Z

initial netlogo simulation of 90% defectors with 10% cooperators. Shows the evolutionary survival of the blue (cooperators) versus the defectors (red).

synctext · 2019-06-11T09:46:01Z

More on the social side of "social proximity":
Online exchanges mimic the close ties once formed through face-to-face exchanges in villages, but on a much larger and unconfined scale. In other words, technology is reinventing old forms of trust

from the book: What's Mine Is Yours: The Rise of Collaborative Consumption, plus mentioned here.

cuileri · 2019-06-13T19:08:12Z

And more on the trustworthiness-proximity relation, from The concept of decentralized and secure electronic marketplace, 2008:

In traditional commerce, the trust between buyers and sellers is based on such things as societal laws and customs, and on the intuition people tend to develop about each other during interpersonal interactions. **The trustworthiness of these factors is based, to a large extent, on the geographical proximity between buyers and sellers.** It is the physical venue of the trade that is subject to specific trading laws, which may be enforced by the local police; and it is the eye contact between the trading partners that may induce a degree of trust between them.

MeadowHillSoftware · 2019-06-13T21:05:28Z

What if B controls a botnet?

synctext · 2019-06-26T11:55:10Z

Generic scientific problem formulation attempt draft

We demonstrate a strategy for cooperation in which only cooperators are long-term successful, given some assumptions. Our strategy is an "evolutionary stable strategy (ESS), meaning that if adopted by a population in a given environment, is impenetrable and cannot be invaded by any alternative strategy.

We assume the existence of a secure identity function. The agent population is not assumed to be static. New agents continuously enter the system. The secure identity function ensure agents can't defect in one round, re-enter the system with a new identity, and defect in future rounds. The secure identity function creates a probability that previous defections can be linked to the misbehaving agent. In general, each agent in our system can rely on the secure identity function to test a certain property of another agent which is unequivocally true and is resilient to strategic manipulation. (protects against botnet, as correctly remarked by @MeadowHillSoftware)

We assume the ability of agents to gossip successful acts of cooperation. We define this as the goodness_gossip(). Any interaction based on gossip and trust is exploitable. Our strategy relies on mitigation the effect of defection using the secure function.

Each agents uses a reciprocity_function(secure_function(), goodness_gossip()) to accept or reject requests for cooperation. Depending on both the outcome of the secure function and the gossip the agent can classify others as pure strangers, low-trust agents or long-term cooperators.

The core of each agent strategy is to build personalized community of trusted collaborators. Each agent is either building this community (bootstrap phase) or maintaining it (bootstrapped). Specifically, each agent continuously maintains a small community (e.g. 50) of other agents it has build up a long-term reciprocal relationship with. This community is different for each agent. We make the cardinal assumption that these long-term cooperators are also truthfully reporting on the cooperative behavior of others. These collaborators are used to conduct goodness_gossip().

The uniqueness of our strategy is the cryptographic protection of goodness_gossip(). We assume the existence of a mechanism which can guarantee that the original message can not be modified or tampered with. Agents can not hide gossip. They have to truthfully report all their interactions, or collude with other agents to hide them. When interacting with honest agents, successful gossip manipulation is impossible.

Our mechanism is very restrictive in the bootstrap phase. Reliable cooperators introduce us to their reliable cooperators. After the bootstrap phase, we continuously are introduced through gossip with strangers. We then expand our network into non-reciprocal agents, we transition into network reciprocity strategy. Emergent effect: successful clusters of cooperation grow, link with other clusters, and merge into a single connected cooperative cluster.

ToDo: mathematically prove certain attacks to be unexploitable.

Any exchange of value based on trust is exploitable

ichorid · 2019-06-27T13:04:07Z

After the bootstrap phase, we continuously are introduced through gossip with strangers. We then expand our network into non-reciprocal agents, we transition into network reciprocity strategy.

Sounds like a "proof of good deed" or "proof of public service".

synctext · 2019-06-27T13:48:51Z

Thnx for explaining your token cap idea:

simple model (exploitable)
second model: with token limits (give newcomers initial benefit)
third model: add neighborhood principle and bad neigborhoods

Emergent property: certain regions have ran out of "trust"; no kindness to strangers, other regions trust strangers.

alexander-stannat · 2019-06-27T23:35:23Z

Current idea:

Leech-Caps:
It is our goal to enable newcomers to the network to consume some data before having to contribute. A node that wants to bootstrap is eligible for data limited by an upper bound, which we call the leech-cap. Every agent in the network has personal cap of how much they're willing to seed to an unknown node that they have no transaction history with yet. A node can leech up to the cap and no more thereafter. In order to prevent excessive leeching by a group of new nodes, this cap decreases after every contribution made. This way we can prevent uncooperative behaviour through whitewashing and Sybil attacks.
The leech-cap of node i, could be determined through an exponential function and the formula might look something like this:

$-k\exp(cl) + c_o(i) + k$

Transitive Caps:
A node's leech-cap is also partially determined by the leech-cap of its neighbours (contributers only). This way if any excessive consuming without contributing occurs, node's in the region affected will reduce their caps collectively, rendering such an attack less beneficial. The effect the caps of nodes have on the caps of their neighbours is determined by the weight of the connecting edges.

$c_1(i) = \alpha\cdot c_0(i) + (1-\alpha)\cdot\frac{\sum\limits_{j\in V \\ (j,i)\in E}w(j,i)\cdot c_1(j)}{\sum\limits_{j\in v \\ (j,i)\in E}w(j,i)}$

We therefore obtain a recursive formula for a continuous process of cap updating in the graph. This makes Sybil attacks, without any significant work put into the network by the attacker, disadvantageous.

Maliciously increasing honest node's caps:
An attacker might decide to perform some work to increase the cap of honest nodes in the network such that other identities controlled by the attacker can benefit from the increase in capsize. This is prevented by an upper bound that the leech-cap cannot exceed. Hence, any work performed in such a way will have a decreasing return in the amount of work gained. This makes strongly beneficial Sybil attacks unfeasible.

Question:
A question that still needs to be answered is: What can be done to prevent the leaking of leech-caps into a particular region of the graph? When a node leeches from another node, some of the contributer's cap wanders over to the leecher. If the leecher does not provide any work for the network thereafter, this cap never 'moves'. If this happens a lot the network might loose all its cap and could then stagnate.

Idea:
Maybe introduce another kind of cap, called the seed-cap. While the leech-cap determined how much a node should be willing to contribute, the seed-cap determines how much a node is entitled to receiving from the network. When a node requests some work performed, the seeder will determine the requester's seed-cap and its own leech-cap and then provide no more than the minimum of the two. This could prevent excessive leaking of leech-cap out of the honest part of the network and provide some additional robustness against Sybil attacks.

Geographic Proxmitiy:
The Seed cap could be distributed among newcomers based on geographic proximity, i.e. if there are a very large number of bootstrapping nodes in a particular region then the seed-cap is split up among these nodes. This means an attacker that creates multiple identities does not gain any additional work, because the work they're entitled to is distributed among the multiple identities.

Note that this is not a mature idea yet and still needs to be given some thought.

ichorid · 2019-06-28T10:53:32Z

There are two kinds of resources: the ones that can be stored (stockpiled) and the ones that can't. One cannot store bandwidth. Owning a PC that has some bandwidth available to it at all times is like owning a small personal hydro power plant: I don't care who uses my electricity as long as I (first) and my friends (second) have priority of using it at all times. I'm completely fine with giving my "free" power to strangers if no one else wants it, because I can't store it. At any moment, I cannot consume more bandwidth than my uplink allows me to, thus I can't "save up" some bandwith for the future. Basically, this is a post-scarcity economy.

We don't need accounting. We need priority queues based on reciprocity.

synctext · 2019-07-01T16:28:30Z

We need priority queues based on reciprocity.

Excellent point. Many users have monthly caps, so bandwidth is finite. Unused bandwidth can be given away. The effort of leaving you laptop running overnight can be a burden.

cuileri · 2019-07-29T10:32:15Z

We now have this fraud-resilient overlay idea:

Fraud Resiliant Overlay

We want to create an overlay network which makes it difficult for sybils to gain majority in a specific node's neighborhood.

Motivation:

Sybils form close proximity regions
Latency is difficult to fake (?? to be discussed ??)
Limiting the number of neighbors from a specific distance (region) decreases the chance of a sybil region having the majority in the neighborhood of a single node.

Input: Network Layer
Output: Overlay Network where a node's direct neighbors are as distant as possible from each other.
Requirement: A discovery mechanism
Procedure:

Using discovery mechanism, learn about as many nodes as possible.
Group the nodes according to their distances (ping-pong time, latency).
Select nodes from each group fairly (w.r.t groups) and randomly (within group)
Use a b-matching algorithm to find the overlay network.

Notes - Todos:

Find another (possibly bidimensional) distance measure: Current idea of measuring distance returns a scalar distance value. (Two very distant nodes which have same distance with another node A are considered by A to be in the same region.)
Adapt distributed b-matching algorithm to this case. Use randomness to create edge weights in such a way that, when the edges are ordered according to their weights, the order sequence visit the regions fairly and one by one.
Find and prove the probability of nodes in a sybil region gaining the majority in one's neighborhood.
Find and prove the probability (for a byzantine node) of getting over a “double-spend detection” mechanism built on top of fraud resiliant overlay.

Related Works:

Sybil Detection Using Latent Network Structure, Schonebeck et al., 2016
The Latent Community Model for Detecting Sybil Attacks in Social Networks, Cai & Jermaine, 2012

Where to use
Fraud resiliant overlay will help the upper-layer mechanism to have the following features and guarantees:

Incentives for cooperation: Given the fraud resiliant overlay network, nodes select their counterparties only from their two-hop neighborhood. The node has to cooperate with its neighborhood to develop its identity.
Information hiding detection: (relies on the gossiping between peers)
Double-spending detection on Trustchain: (relies on the gossiping between peers)
Eventual-consistency of Trustchain: It is now our intuition that, assuming fraud-resilient overlay and a gossiping mechanism on top of this overlay, we can prove eventual consistency of Trustchain.
Risk-aware partner selection: Nodes have a threshold value for the maximum amount of tokens they can trade with a specific node, at a specific time. The threshold can change according to positive/negative observations about the trading partner. (Leech-cap and transitive-cap idea of Alexander)
Initial drawings of model:

grimadas · 2019-07-29T11:09:31Z

Some minor comments on the model:

Latency can be faked but only in one way. Peer can artificially slow itself down, but that would have an effect on upper layers.
Random walks are also helpful for avoiding Sybil regions (?)

alexander-stannat · 2019-08-05T09:58:12Z

Comments about my discussions with Hisashi Ohtsuki at Sokendai university in Japan:
I introduced Hisashi and his team to the trust problem in the tribler context with a presentation. The idea was to introduce the problem from the perspective of evolutionary biology and evolutionary stable strategies that induce cooperation in a population. Hisashi and Martin Nowak identified a set of 8 strategies which they labelled the leading eight, inducing cooperation and deterring defectors. They introduced a set of axioms which provide a framework for what properties such strategies need to satisfy in order to foster cooperative communities.

Hisashi's idea was to introduce the notions of reputation scores in combination with behavioural strategy sets. Given a fixed but arbitrary reputation score, agents choose a particular strategy.

Example Reputation Scores:

Example Behaviour Strategies

In Hisashi's and Nowak's case, reputation values are binary, whereby 0 is a bad reputation and 1 is a good reputation. The set of strategies is binary as well $\{C,D\}$ , with C standing for "cooperate" and D for "defect".

A behaviour strategy is a method $d:\{0,1\}^2\rightarrow{}\{C,D\}$ , which determines whether a node with a given reputation value wants to cooperate with another node. For example:

means that if a node has reputation 0 it wants to help a node with reputation 1.

A reputation dynamic is a function that assigns a node that has made a decision whether to cooperate or defect with another node a new reputation value, i.e.

$d:\{0,1\}^2\times{}\{C,D\}\rightarrow{}\{0,1\}$

This yields reputation dynamics and behavioural strategies.

Here Hisashi introduces the concept of a relative payoff of a behavioural strategy given a reputation dynamic. He then identifies the 8 most lucrative behaviour strategies, called "leading eight". It is our goal to find a continuous equivalent to this reputation dynamic, which additionally belongs to the leading 8. We explored a number of different graph theoretical centrality measures: EigenCentrality, KCores, Shortest Paths, Katz Centrality, Betweenness Centrality, Shapley Value, etc. We came to the conclusion that none of these metrics are truly sybil resistant and that an inversed approach might be smarter, i.e. find some kind of impossibility result. So far all of our attempts have been heuristic. Hisashi advised me to take a more theoretical approach and set up a more strictly defined set of axioms for such a dynamic function. Hisashi acknowledged that the problem was not as closely related to evolutionary biology as it first appears, as non-unique identities change the nature of the problem significantly.

I'm currently devising such a set of axioms oriented along the idea of the leading eight:

The highlighted tuple (d,p) would be the optimal in this case. So far, I have come up with 4: Cooperate, Punish, Apologise, Forgive:

Cooperation with reputable people yields increase in reputation
Punishing, i.e. defecting from low reputation does not decrease one's own reputation (it might even increase it)
Apologies are allowed, i.e. a low reputation peer cooperates can still cooperate with other peers in an attempt to recover from its bad reputation
Nodes that have redeemed themselves are forgiven by the rest of the network

If we choose a strict enough set of axioms we can maybe come up with a nice impossibility result.

There are number of differences between our model and the model of Hisashi's work, namely:

Reputation values should not be binary, but continuous
A reputation value should be aggregated from a node's entire transaction history and not just its most recent transaction
Identities are not unique (Sybil attacks)

synctext · 2019-08-12T12:52:00Z

Focus&brainstorm for thesis direction:

prisoners dilemma problem
- multiple rounds
- large population
- lot of defectors
- lots of prior work
assumption: complete network knowledge
focus on a variant of above problem, usable/valid for computer science
- stranger policy
- dissemination of outcomes and limited information storage
- tit-for-tat is blind to other people's prior interactions
- reputation systems require broad knowledge of others (e.g. personalised pagerank)
compare:
- blind tit-for-tat
- pagerank
- informed tit-for-tat:
- repeat prior 1-hop cooperators
- history of potential interaction partners is known
- push out defectors due to their history
- it is simple & effective; hopefully can prevents lazy freeriding
- idea we propose: tamper-proof and immutable record keeping (e.g. Trustchain)
  - not local information; not global information; now we have 1-hop information
  - key issue: how to select strangers
- extend with sybil attack problem or not?
  - limit stranger access (any non 1-hop agent)
  - bounded resources given to sybils
  - introduce secure attribute: age, ping time, proof-of-work, passport-identity, etc.

synctext · 2019-08-12T15:56:53Z

Please upload you current research notes here + add a brief 1 page with possible research directions; given discussion(s) we had. (e.g. something with strangers get lower priority)

alexander-stannat · 2019-08-20T11:09:05Z

Research Notes, thesis draft and possible research directions have been added here

synctext · 2019-08-20T14:13:15Z

remarks, as going through the current thesis draft.
Section 1.5 possibly focus on "1.5 fully self-organising systems"
Very few academically pure self-organising systems have been successfully engineered.
File sharing system Bittorrent and cybercurrency Bitcoin are two two leading example of systems without any single-points of failure. etc.
(see terminology abuse, like, "serverless")
Possibly find more appealing "Figure 1", https://commons.wikimedia.org/wiki/File:P2P_Topology.jpg

"2 Problem Description: Sybil Attacks and Misreports"
[dramatic angle] Dealing with scarce resources and our planetary limits is on of the cardinal questions of our times. Freeriding on the public good is a key open scientific problem. For all natural resources like fresh air, pristine forest, and abundant fisheries it is possible to exploit them.
In the digital world a similar freeriding problem exists. etc.. The central problems we address in this thesis are the Sybil attack and misreports. We explain in detail the background and ...

4.1 We improve upon the notation and model introduced by [PIMOTTE] and [SEUKEN].
(my advise would be copy where possible; improve when you can; needs approval of Dion)
Specifically, we add the novelty of a trust graph to the work graph. This enables a model of information availability. This enables us to reason with partial information and lack of information. Split certainty and uncertainty.

Definition 5.1
Our model introduces a mathematical notation of the IETF Trustchain draft Internet Standard. Indexes refer to hashpointers, etc.
Recall that in TrustChain elements of these sequences are ’linked’ through hashpointers. In trustchain a transaction corresponds to a single block. Engineering details such as digital signatures are covered in other parts of our model and assumptions (or something). [please remove 14-point bullet list]

"6.3.1 Allocation Policies"
This could be the road towards the key thesis contribution???
What is more fitting/grand: allocation strategy or allocation policy

Define freeriding please

Definition 5.5 (Misreport transactions).
Definition 6.5 (Misreport trust)

Define "stranger" and initial risk taking; people of the 'shadow of the graph'. No direct observation. Or another approach to a definition: if you know their interaction history they are no longer strangers. If the stranger shares their interaction history; this action transforms them into non-stranger.

Come up with some nice proof. Suggestion bounded Stealing with parameter S by strangers with this overly strict allocation policy :-)

alexander-stannat · 2019-08-22T09:00:03Z

Thesis draft updated.

alexander-stannat · 2019-08-23T10:48:05Z

Good meetings yesterday and today. A quick summary of my model (a more comprehensive discussion can be found in my thesis draft):
In yesterday's meeting we introduced the idea of a third graph structure which should be based on geographic locations of agents. We now have three graphs: A proximity graph, an interaction graph and a trust graph. The latter two are old concepts. The proximity graph is introduced with the sole purpose of nullifying Sybil attacks:

The concept of work graph is repeated below:

We begin by introducing what we call a similarity vector (might be renamed later). This is a vector for every node in the subjective work graph, consisting of attributes pertaining to the physical location of the node in question:

The idea behind the similarity vector is that by default Sybil nodes all will have very similar values. The individual components should relate to ping times from the seed node (as well as from certain established nodes perhaps), location in the network tree (traceroutes) and IP-addresses, etc. Note that we want to keep this as generic as possible for now.

Next, the proximity graph is derived from the similarity vectors as follows:

This means that every set of nodes in the network that are very close to one-another, by standards of similarity vectors are grouped into one neighbourhood. Note that some false positives might arise, but not false negatives, i.e. some non-sybils will be grouped into the same neighbourhood as sybils, but no sybils should be overlooked. Later on our goal will be to optimise our similarity vector in such a way as to minimise the amount of false positives through use of a clever metric...

Now, using the proximity graph we redefine the subjective work graph as follows:

This means that all nodes that are considered Sybils by our similarity vector are collapsed into one single node, whereby all edges in between the potential Sybil nodes are dropped and we obtain a "proxy node" representing all Sybils as one. If this is implemented effectively, all fake edges in between malicious nodes are removed.

In the next step we can rely on already established mechanisms of reputation, such as maxflow, netflow, hitting time, etc.

So instead of looking like this:

our new model would look as follows:

We would call this a "two-layer trust model" instead of our previous "one-layer trust model", i.e. trust models that are only based on the unmodified work graph.

In the next week, we will have to come up with reasonable metrics for the similarity vectors as well as determine the best possible reputation mechanisms to applied on the modified work graph now.

synctext · 2019-08-29T09:06:37Z

Thesis raw .PDF link ongoing discussion:

you are a theory person, therefore we simply assume the availability of a mechanism to produce a misreport-proof proximity graph.
Rapid depletion of attack edges is key, keep the 2nd order depletion effect in mind (e.g. 2-hop friendlies).
- possibly document this attack and give it a name
- Proof: having depletion means you are vulnerable to strongly beneficial attack
- Your proof idea: with sufficient depletion you are protected for n th order of strongly beneficial attack
after your two-layer trust model is documented in your thesis, possible we document a very restrictive mechanism which is proven to be Sybil proof:
- assume everybody can help each other (repeated PD game, large populations) {file sharing relaying}
- long-term tit-for-tat with memory
- proof-of-work, no negative balance
- do not allow any transitive trust
  - only 1-hop interactions
  - everybody else is a stranger
  - strangers need to build a positive balance first (e.g. become small 1-hop interactors)

alexander-stannat · 2019-09-04T15:53:22Z

Summary of my progress and my meeting with Dion:
I discovered a mistake in Seuken & Parkes. Theorem 1 states that in case an accounting mechanism satisfies three conditions (symmetry, independence of disconnected nodes, single-report responsiveness) there always exists a strongly beneficial (passive) sybil attack. Their proof is based on the following subjective work graph:

The argument is that while no work is required to go from the above graph to the one below, node k gains some reputation through adding the edge (j,k). If the weight of the edge is large enough then it will hold . Hence, the attack must be a strongly beneficial as it holds $\omega_{-}=0$ and $\omega_{+}>0$ and therefore $\frac{\omega_{+}}{\omega_{-}}=\infty$ .

However, this is actually incorrect as the work that corresponds to the edge (i,j) is not taken into account at all. Seeing as this (the attack edge) is an indispensable part of a sybil attack, I found myself disagreeing with the assertion of this theorem. I think that this kind of attack constitutes a weakly beneficial sybil attack at most, which is a pretty big difference.

I introduced two new definitions: parallel-report responsiveness and serial-report responsiveness, based on which I have come up with two theorems that actually prove the existence of strongly beneficial sybil attacks under the assumption of these slightly more restrictive conditions.

From this definition, in combination with the former definitions we can derive the existence of a strongly beneficial sybil attack.

The second definition is given as follows:

From this we can derive the theorem:

The proofs to the are on pages 20 and 23 here. In the future we might be able to prove that any accounting mechanism which does not allow for parallel and serial report responsiveness could be strongly sybil resistance. In order to do this, we need to define the reward of a sybil attack $\omega_{+}^n$ more accurately. So far, I have come up with the definitions:

and

However, the upper definition is given in terms of work consumed, while the lower definition is given in terms of reputation gained. These two are not directly comparable and it is our goal to find a function that translates an increase of reputation into an expected value/amount of additional data that can be consumed.

Over the course of the next week I will try to come up with a function for this. However, this is likely going to be a rather difficult thing to do as it is not deterministic, but rather probabilistic. Depending on the allocation policy a node will receive some data, if it is the node with the highest reputation among all nodes in the choice set. However, it does not have any influence over the nodes it is "up against". Hence an increase in reputation is only directly related to an increase in probability of being served, making this a rather difficult problem to solve.

Finally, I discussed the model of physical proximity with Dion and he suggested that in order to avoid any excessive free-riding of Sybil agents (through being assigned to the same neighbourhood as a large number of honest nodes) I should somehow limit the number of nodes in a neighbourhood of the proximity graph. I will have a think about this problem as well over the course of the coming week.

synctext · 2019-09-11T10:19:32Z

Goal of thesis: a theoretical proof of full Sybil resilience. We obtain this goal by imposing very harsh and restrictive conditions.

Definition: Samaritans are nodes which give away resources for free to total strangers. After a node acted as a samaritan, they are no longer strangers.

Definition: Positive net contribution. A particular node only collaborates with another node if the sum of all historical exchanges remains positive.

Definition: Remoteness. Unless you have a positive net contribution, we consider you a stranger and will not interact.

(now we have long-term tit-for-tat, very well studied stuff)

Definition: Custodian responsibility. A custodian responsibility is an alternative form of transitive trust. Taking custodian responsibility mean introducing a stranger to a node to which you have a positive net contribution. The amount of custodian responsibility you take is subtracted from your positive net contribution and assigned to the stranger. The stranger then becomes a samaritan, through "vouching". (not incentive compatible, but we'll get there)

Desired theorem:
Every accounting system which satisfies the remoteness, etc. custodian responsibility condition is misreporting-proof.

Definition: zero-cheating exposure is loosely defined as that you can never be cheated. Thus you are very paranoid, unforgiving, and only engage in positive net contributions. You never interact with strangers, only samaritans. Only interact to strangers with custodian responsibility. Thus there exists an initial investment, proof-of-work or attack edge. Only restrictive weak minimal transitive trust exists (add more weasel words). Open question: how do we restrict this or add incentives? Nodes primarily engage in direct reciprocity and ensure a positive net contribution.

Desired theorem:
Every accounting system which satisfies the "zero-cheating exposure" condition is Sybil attack proof.

synctext · 2019-09-13T08:42:59Z

@grimadas Please define the social profit incentive we just came up with.

grimadas · 2019-09-13T09:41:02Z

One idea to get further the idea of Custodian responsibility. Some peer with high reputation/balance can vouch(invest) in other peer using part of reputation/balance.
This will give a chance for peers with low balance to increase their reputation, while high reputation can see it as a profit possibility. That might increase total social capital.

ichorid · 2019-09-13T22:18:00Z

ISPs provide space for rent in their datacenters directly connected to the backbone. It is possible to strategically place a single node in each of these backbone facilities. Such a network will allow the attacker to emulate almost any latency mapping.

alexander-stannat · 2019-09-22T22:54:30Z

Progress Report:

I have solved the problem mentioned above, in which it was our goal to express the investment/cost of a Sybil attack in terms of work, which the attacker would be able to consume after the attack had been carried out. However, this was a problematic definition, as determining is non-deterministic. It depends on the reputation scores of the sybil nodes, the reputation scores of all other nodes, the allocation policies of the nodes queried for work and the probability of other nodes being in the choice set of the node queried.

I suggested a model with rounds. In each round a proportion q of honest nodes will decide to query another honest for a fixed amount of data, d. The honest node to be queried is chosen at random from a uniform distribution. Hence for an honest node i, we find that the number of nodes in the choice set will be given by the binomial distribution

Now, whether a sybil node will be served by the honest node i, is given by the probability of it being the node with highest accounting value in the choice set

Now, we determine the random variable as as the indicator function which is equal to one if node j is the node with the highest accounting value in the choice set of i at round n. Now the amount of work node j may be able to consume from the point of the Sybil attack

Now, this model may return a decent value for the return of a Sybil attack. However, it involved a number of significant restrictions. Hence, we came up with another type of model, which might be more informative and easier to determine. Instead of trying to determine the return of a Sybil attack in units of work consumed, we may try to determine the investment of a Sybil attack in terms of accounting values. Recall that the earlier definition of a Sybil attack investment

We redefine this definition as follows: Recall the work graph after the Sybil attack, which contains all attack edges as well as all sybils.

From this we derive a new third work graph

in which all sybils (including the attacker themselves) are collapsed into a single node, i.e. all internal edges of the sybil region are dropped and all outgoing and incoming edges are connected to the attacker j.

Now, we determine the investment of a Sybil attack as the reputation the attacker has legitimately received, i.e.

This a much cleaner and easier definition of Sybil attack investment which is now easy to compare to the return of a Sybil attack and we can from hereon out easily determine the ratio

A question however arises now. Are these two definitions equivalent? Is a Sybil attack that is strongly beneficial in the former definition automatically strongly beneficial in the latter definition as well? The immediate answer to this question is no. There are sybil attacks, in which the amount of reputation that can be gained is bounded while the amount of work that can be consumed as a consequence is not. There are also sybil attacks in which the amount of work that can be consumed is finite while the aggregated reputation is infinite. Hence, we find that whether or not a sybil attack is beneficial highly depends on which definition of sybil attack profit (and investment) we choose.

It is now our goal to make some restrictions on the definition of accounting mechanisms that lead to an equivalence between the two concepts. This will be worked on in the days to come.

cuileri · 2019-09-23T10:15:10Z

Impressive progress. Here are come comments and questions:

However, this was a problematic definition, [...] depends on the reputation scores of the sybil nodes, the reputation scores of all other nodes, the allocation policies of the nodes queried for work and the probability of other nodes being in the choice set of the node queried.

Seuken seemed to overcome this by assuming "all of this (adding sybils to the network, make multiple false reports about created sybils, etc) happens in one step". What I understand here is that is the "marginal" gain of an action of the attacker, assuming all other things remain unchanged. His reasoning seems weak as he only says "we do not need a dynamic, multi-step analysis to study sybil attacks" and I couldn't find any other intuition for it. However I wouldn't say off hand that the "marginal" gain/cost approach is wrong. What do you think? Can't we use the same assumption?

What is q in the following formula?

This distribution would work if you assume all the nodes who are to evaluate (choose/not choose) the honest node have the same number of neighbors.

Now, whether a sybil node will be served by the honest node i, is given by the probability of it being the node with highest accounting value in the choice set

It seems you assumed that the honest node will select its transaction party with the condition of "being the node with the highest accounting value in the choice set". Is there any motivation for this? Why not to use a lottery biased by the accounting value?

Instead of trying to determine the return of a Sybil attack in units of work consumed, we may try to determine the investment of a Sybil attack in terms of accounting values.

Interesting approach. It would be nice to see the difference of approaches on a simple illustrative example.

synctext · 2019-10-03T10:41:49Z

LATEST .PDF of master thesis now 56 pages long (@grimadas please read and comment if possible, today I have a 14:00 meeting with @alexander-stannat; all welcome)
https://www.coindesk.com/the-blockchain-paradox, thesis could mention Nobel prize theory in intro
please add "proof of work" model before an identity is trusted. Superior idea to the "custodian responsibility", you simply need to work(relay) for X amount of units to gain the trust of a single individual. Identities obtain some cost. Sybil attacks are no longer profitable, each fake identity needs to invest X work units to gain the trust of a single victim. Is there any related theory work to this? Is this deceptively simple idea mentioned in classical game theory?

synctext · 2019-10-04T10:05:09Z

Great progress! (took me a while to read in detail). Your works has now reached "scientific quality" I believe.

Note that language is everything. Similarity vector, nice one. Proximity graph, you can make it a stronger concept by giving it a stronger name like clone graph used to estimate suspicious nodes likely controlled by a single entity. Something less strong does not communicate the strength that this concept represent, twin graph, coupling graph, equality graph or duplicates graph. Clone is a nice negative word that destroys individualism.

Please add the "Beta reputation system" from 2002, credited as being the first such proposal in history. How is your model more evolved and powerful?
http:https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.5461&rep=rep1&type=pdf
How does your work differ from belief discounting?

Finally, please start considering getting your work in article form before the end of this year. We might be able to get into Auckland, New Zealand; see 2019 workshop version https://sites.google.com/view/trust2019/

alexander-stannat · 2019-10-28T15:43:30Z

Presentation on Friday went very well.

You find the Powerpoint here.

synctext · 2019-10-28T15:51:54Z

Mid-term presentation review. Result: happy. Swap "We seek a generic incentive mechanism for human cooperation", discuss this first; then file sharing? Slide 10: cite to related work in small font? Slide 19: challenge of open systems that fake identities are cheap; no central control point or Internet police.

synctext · 2019-10-28T17:28:42Z

Somewhat related work: "The Economics of Trust and Reputation: A Primer, Lu´ıs M B Cabral"

synctext · 2019-11-20T12:11:27Z

LATEST thesis is now 58 pages long Solid progress:

Corollary 6.1.
Every accounting mechanism SM that satisfies independence of disconnected nodes, symmetry,
single-report responsiveness and weak parallel-report responsiveness, has a strongly beneficial
sybil attack.
Definition 6.17 (Convergence of parallel reports).
7. A Mathematical Framework for Physical Proximity

Good idea now to make each chapter thesis ready.

Key Chapters:

8: A two-layered Trust Model.
9: A mathematical Framework for Trust and Reputation Mechanisms

Chapter 14 now contains great stuff: "Trust and Reputation are equivalent to the flow of water in between containers (equivalently capacitors). In an interaction, reputation will flow from the higher container to the lower one.". In December the thesis might be included in some form in some chapter(s):

perfect cooperation algorithms does not exist
- weakly or strongly beneficial sybil attack
- or no informativeless
Positive bilaterial balance; lack of informativeness (problem with cooperation with strangers)
Netflow (needs an Alpha to make it work)
Drop-Edge mechanism
Personalised pagerank
personalised hitting time.
priority-based allocation; using "choice set". no strong Sybil proof, very practical!
Perfect algorithm tentatively called 2-hop T4T
- is slow to bootstrap
- runs parallel, connects with numerous strangers (2-hop) only
- always net-positive bilaterial connections
- strong Sybil attack resilience ????

synctext · 2020-01-08T10:30:29Z

LATEST thesis is now 67 pages long Solid progress:

Focus is now on finishing the Heavy Math, text polish later
ongoing polishing ("What about overlaps and loops? Needs to be re-evaluated...")
integrate chapter 2,3 and 4 possibly?
Chapter focus: Requirements for sybil-proof Accounting Mechanisms
Please consider if you have created a circular definition as the core of the heavy math? Sybils and non-Sybils execute various algorithms. The circular algorithm relies on the execution by non-sybils to detect other non-Sybils.
9 A mathematical Framework for Trust and Reputation Mechanisms. Note this actually starts with clean slate math definitions of allocation policy and Sybils.
within 6 weeks after green light is master defense

synctext · 2020-03-28T16:45:31Z

Switched thesis repo, 102 page thesis .PDF now here

alexander-stannat · 2020-03-28T18:18:37Z

Yes, please note that the current final thesis is found here.

I transitioned to the official TU Delft thesis template a few months ago. Sorry if this caused any confusion. I will make sure to "tidy up" the repository in the days to come.

alexander-stannat · 2020-04-08T13:22:25Z

Thesis finished! Final product can be found here. Thanks a lot everyone.

alexander-stannat · 2020-04-11T01:06:02Z

Due to some merge conflicts my final thesis repository has now been moved here.

synctext · 2020-07-14T12:57:13Z

Lets make this this the main weekly update issue. Target for coming time: a scientific publication. Also some experimental work with single Python script that can handle 120 million blocks?
Grander vision of real-world reputation, going even beyond the master thesis.
ToDo: 24July, write 10-line-ish goal for your 3-month project here please. Deploy a real algorithm?

qstokkink · 2023-11-30T12:26:21Z

@synctext this thesis has been completed. Can we close the issue?

synctext added the type: MSc-Thesis-Work label May 3, 2019

synctext added this to the Backlog milestone May 3, 2019

synctext assigned alexander-stannat May 3, 2019

synctext mentioned this issue May 20, 2019

Incremental update of trust levels in a dynamic blockchain graph #2805

Open

grimadas mentioned this issue Jul 29, 2019

phd placeholder: Accountability Protocol for Future Decentralized Applications #4719

Open

ichorid mentioned this issue Sep 23, 2019

Lab goal: "Distributed Trust Design" #3571

Open

synctext mentioned this issue Nov 11, 2019

Bitcoin-Intelligent-Life - Blockchain engineering class project #4039

Closed

4 tasks

devos50 mentioned this issue Apr 20, 2021

[CSE3000] Towards a Digital Foundational Identity #6050

Closed

synctext closed this as completed Nov 30, 2023

Using proximity to mitigate the Sybil attack #4481

Using proximity to mitigate the Sybil attack #4481

Comments

synctext commented May 3, 2019 • edited

alexander-stannat commented May 17, 2019

Reading List (done):

To read:

devos50 commented May 18, 2019

synctext commented May 20, 2019 • edited

synctext commented May 21, 2019

synctext commented Jun 11, 2019

cuileri commented Jun 13, 2019 • edited

MeadowHillSoftware commented Jun 13, 2019

synctext commented Jun 26, 2019 • edited

ichorid commented Jun 27, 2019

synctext commented Jun 27, 2019

alexander-stannat commented Jun 27, 2019

ichorid commented Jun 28, 2019

synctext commented Jul 1, 2019

cuileri commented Jul 29, 2019

Fraud Resiliant Overlay

grimadas commented Jul 29, 2019

alexander-stannat commented Aug 5, 2019

synctext commented Aug 12, 2019 • edited

synctext commented Aug 12, 2019

alexander-stannat commented Aug 20, 2019

synctext commented Aug 20, 2019

alexander-stannat commented Aug 22, 2019

alexander-stannat commented Aug 23, 2019 • edited

synctext commented Aug 29, 2019 • edited

alexander-stannat commented Sep 4, 2019

synctext commented Sep 11, 2019

synctext commented Sep 13, 2019

grimadas commented Sep 13, 2019

ichorid commented Sep 13, 2019

alexander-stannat commented Sep 22, 2019

cuileri commented Sep 23, 2019

synctext commented Oct 3, 2019 • edited

synctext commented Oct 4, 2019

alexander-stannat commented Oct 28, 2019

synctext commented Oct 28, 2019

synctext commented Oct 28, 2019

synctext commented Nov 20, 2019 • edited

synctext commented Jan 8, 2020

synctext commented Mar 28, 2020

alexander-stannat commented Mar 28, 2020

alexander-stannat commented Apr 8, 2020

alexander-stannat commented Apr 11, 2020

synctext commented Jul 14, 2020 • edited

qstokkink commented Nov 30, 2023

synctext commented May 3, 2019 •

edited

synctext commented May 20, 2019 •

edited

cuileri commented Jun 13, 2019 •

edited

synctext commented Jun 26, 2019 •

edited

synctext commented Aug 12, 2019 •

edited

alexander-stannat commented Aug 23, 2019 •

edited

synctext commented Aug 29, 2019 •

edited

synctext commented Oct 3, 2019 •

edited

synctext commented Nov 20, 2019 •

edited

synctext commented Jul 14, 2020 •

edited