Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trustworthy gossip: integrating gossip mechanisms with trust calculations #3690

Open
synctext opened this issue Jun 26, 2018 · 25 comments
Open
Assignees
Milestone

Comments

@synctext
Copy link
Member

synctext commented Jun 26, 2018

This thesis project explores making the connection between gossip and trust.

This work directly ties in with the mission of the lab Distributed Trust Design, see #3571. We are currently implementing incremental personalised pagerank #2805 and IPv8 distributed apps #2943.

This work shall demonstrate in 3 use-cases the linkage of IPv8 community gossip mechanism with trust calculations:

  • Youtube-alternative. spread of publish/subscribe in the @xoriole search community.
  • Facebook-alternative. spread of news updates in new discovery community.
  • Uber-alternative. spread of bid/asks in the IPv8 market community.

Sprint milestones:

  • implement 63k instances of IPv8 communities on our 48-core machine
  • visualize real-time gossip on 1 machine connected to the IPv8 testnet
  • Deeply understand all gossip communities in Tribler: AllChannel, Market, Search, and tunnel.
  • devise common features and re-usable mechanisms for trustworthy gossip
  • re-factor the code and re-use Trustworthy_Gossip(), possible design:
  • Trustworthy_Gossip(suppress,delete,store,collect,spread)
    • all information comes from a known origin
    • you only connect to long-term neighbors which you trust somewhat
    • for each piece of incoming information, depending on available metainformation (source, path, size, date, type etc.) you either:
      • actively suppress it (drop)
      • pro-actively spread it
      • or something in between
      • Pub/Sub is simply user-defined origin is trusted, always spread

Our #10 operational prototype from 2009 and 2013. Uses random graph walk:
web-of-trust-in-Tribler-GUI

@synctext synctext added this to the Backlog milestone Jun 26, 2018
@synctext
Copy link
Member Author

synctext commented Jun 27, 2018

IPv8/TCP (Trust Control Protocol). Game Theory direction title would be: using natural selection for the evolution of trust. Even more over-the-top: game theory of trust.

illustration from done by the Prof. Nowak team at Harvard. Their Nature paper. This model is designed for population growth and wrong for trust. But perhaps it can be modified to model the growth of trust.
image

@qstokkink
Copy link
Contributor

A first version of the suppress, delete, store, collect, spread protocol has been captured in Tribler/py-ipv8#186. We have found that this roughly relates to the emotion of a user for a given incoming message.
afbeelding

@synctext
Copy link
Member Author

(posting here, as it seems the best fit for now)

Any good, this related work? @qstokkink "Practical Homomorphic Encryption Over the Integers"
We present novel homomorphic encryption schemes for integer arithmetic, intended for use in secure single-party computation in the cloud. These schemes are capable of securely computing only low degree polynomials homomorphically, but this appears sufficient for most practical applications. In this setting, our schemes lead to practical key and ciphertext sizes. We present a sequence of generalisations of our basic schemes, with increasing levels of security, but decreasing practicality. We have evaluated the first four of these algorithms by computing a low-degree inner product. The timings of these computations are extremely favourable. Finally, we use our ideas to derive a fully homomorphic system, which appears impractical, but can homomorphically evaluate arbitrary Boolean circuits.

@qstokkink
Copy link
Contributor

@synctext not particularly.

@DanGraur
Copy link

DanGraur commented Jul 27, 2018

Gossip Overlay Security Concerns

Task Description

Find vulnerabilities in the Gossip overlay. Try to do the following:

  • Document the security concerns

  • If possible, find a fix for / remove the threats

Types of threats

Attacks which should be prevented:

  • Impersonation attack: An attack in which an adversary successfully assumes the identity of one of the legitimate parties in the system or in a communication protocol.

    • Some Identified cases:

      • Allowing malicious peers to infiltrate the Gossip Overlay might be, generally speaking, the biggest threat to disrupting the normal workflow of this overlay: a malicious peer somehow becomes a trusted peer / neighbor, i.e. in the case of this application: becomes verified. However, this should not be possible, generally speaking, if only the (somewhat) trusted peers become verified. One way of augmenting trust might be to use the age of the peer as an indication of its perceived trustworthiness, i.e. how long has it been active. Alternatively / Additionally, one may take into consideration the number of neighboring peers a peer has (if the number is high, it's probably trustworthy, otherwise we can't say much about it); this might open the system up to Sybil or Eclipse attacks though, and might be easily exploitable. If a malicious peer does infiltrate the overlay as a verified peer, then it can do all sorts of undesirable things: mark good peers as suppressed, spread malicious messages, return malicious messages when queried with a COLLECT request, tamper with their own message DB (avoid storing messages from a non-malicious peer, modify messages or do something similar - perhaps this would be a bit more difficult, however, since the signatures would probably not match, still, it may present a security risk -).

      • An example of such a case: a malicious, yet trusted peer flags another trusted, yet non-malicious peer as a spammer for SUPPRESSION, once this happens, there is no way to go back, and un-SUPPRESS the peer. This is a serious issue, since it opens up many opportunities for malefactors to negatively impact the overlay and its peers by relying on other types of attacks, after suppressing many other (non-malicious) peers, and thus, implicitly, changing the network / overlay topology to one which suits their harmful intentions. SUPPRESS should probably be addressed in some of the next increments of the Gossip Overlay's implementation, in addition to adding the general case of changing the Gossip Rule for a peer.

Attacks which should be protected against (cannot be prevented):

  • Eclipse attack: An attack wherein an adversary controlling a sufficient number of IP addresses to monopolize all connections to and from a victim crypto node. The attacker can then exploit the victim for attacks on a crypto’s mining and consensus system, including N-confirmation double spending, selfish mining, and adversarial forks in the blockchain.

    • Some Identified cases:

      • For instance, an adversary attaches itself to a (set) of non-malicious peer(s). In such a case, the adversary can, for instance, spam the SUPPRESSED message against other non-malicious peers to have them suppressed, thus elimintaing any 'positive' influence from the peers on which it leeches. From this point on, it can pretty much provide its own desired image of the Gossip Network to the peers it leeches, or it can disrupt the normal flow of the Gossip Network. Depending on the specific application domain of the Gossip Overlay, this sort of attack may have serious negative effects. For instance, if one would use the Gossip Overlay for gossiping blocks in a blockchain, this sort of attack could trick non-malicious nodes into agreeing on a fraudulent chain, it could alternatively enable the attacker to project a false majority in mining power, etc..

      • Even if the attacker does not suppress non-malicious nodes, it can still disrupt normal network functionality by, for instance, flooding the peers to which it is connected with bogus requests, which shouldn't be too difficult, since most requests are multicast / broadcast to all known neighbors. Perhaps introducing a limit on how much a peer can communicate, or limiting the communication scope of some of the requests might help towards reducing the network communication overhead.

  • Sinkhole attack: In this type of attack, a malicious node advertises itself as a best possible route to the base-station which deceives its neighbors to use the route more frequently. Thus, the malicious node has the opportunity to tamper with the data, damage the regular operation or even conduct many further challenges to the security of the network. In this case, the attacker might not necessarily tamper with the route cost function, but rather try to affect the topology of the Gossip Overlay such that it becomes a popular node (that is, it finds itself in a large number of neighborhoods) or becomes an essential node (that is, it stands on one of the few paths - or perhaps even the only path - between two neighborhoods).

    • Some Identified cases:

      • A verified (and malicious) node may become a sinkhole by telling other peers to suppress other (non-malicious) peers. This is turn could lead to a situation as dire as that described in figure 1, where a malicious verified node stands between two or more ‘clusters’ of peers / otherwise connected components (as defined by the ‘verified peers’ neighborhood). Communication between these clusters should always go through the sinkhole peer, which might be very problematic, since the sinkhole peer can now, amongst others, do the following: drop packets, exploit their contents, modify the packets, etc.

      • Alternatively, this may also happen in a sense by becoming a ‘neighboring peer’ / trusted peer to many other peers, or by becoming in a sense a central node to a cluster, as seen in figure 2. In both these cases, the adversary may receive a lot of messages which can be tampered with or snooped, etc. In the latter case, the malicious node may even project its own desired image of the Gossip Overlay to the neighboring nodes, similar to what was described in the Eclipse Attack section.

      • Should avoid against these attacks: monitor traffic for each peer, if traffic towards peer too high, possibility of Sinkhole attack → need another peer for redundancy → try to find another trusted peer which offers good routing possibilities.

      • Perhaps a good idea here is to use a trust metric which includes the number of peers which find a peer trustworthy (as directly specified by them, rather than the evaluated peer) → might be a good approach against the sinkhole attack, however this might encourage Sybil or Eclipse attacks. Additionally, this idea can be augmented to introduce a limit to the number of neighborhoods a peer can belong to, so as to reduce it influence (this is somewhat contrary to what I suggested in the Impersonation attack section regarding peer popularity and trustworthiness). Perhaps a lower bound can be introduced as well.

Alt Text
Alt Text

  • Sybil attack: In a Sybil attack, the attacker subverts the reputation system of a peer-to-peer network by creating a large number of pseudonymous identities, using them to gain a disproportionately large influence. A reputation system's vulnerability to a Sybil attack depends on how cheaply identities can be generated, the degree to which the reputation system accepts inputs from entities that do not have a chain of trust linking them to a trusted entity, and whether the reputation system treats all entities identically.

    • Some Identified cases:

      • Idea to remove SUPPRESS message abuse: a peer may be SUPPRESSED by perhaps using a voting scheme. The problem with this is that a voting scheme may be subverted by Sybil attacks, where a node creates a lot of identities, which become verified, and overthrow the vote (peer verification is done at the peer level in the gossip overlay’s network object, however, it may be possible to create multiple peers, and try to implement this sort of attack).

Miscellaneous observations

  • Method enforce → rule == GossipRule.COLLECT: The Bloom Filter (more specifically its contents) might represent a security risk. For instance, a node is free to tamper with the information stored in the filter: add bogus messages (from real or fake peers), changing the contents of the messages themselves (perhaps difficult due to the private key based message signature), or discarding real ones.

  • When a message from a non-suppressed peer whose rule (i.e. that of the message) is DEFAULT and source is not a neighbor, the self.update() method is called, which calls the listener methods to process the newly arrived message. These hook methods may be an easy place where one could place malicious code. Also, in this case, the message will be deleted from the message DB of the receiving peer, even if no hook placed it there. Could this be a way of force deleting a message from a peer’s DB without its consent?. Alternatively, if the source is a neighbor, the message is stored, and the callbacks are also invoked. Obviously, if the peer is actually malicious, this allows it to insert bogus messages in message DB (but this is beyond the scope of this observation).

  • Method enforce → rule == GossipRule.SPREAD. This might mean that a malicious peer (who is trusted either by Sybil or Impersonation attacks) might force spread malicious messages through the overlay.

  • Also, adding the possibility of code injection through the self.listeners data structure (by demanding the addition of a new listener through a Gossip Overlay request - currently this is not implemented, but in case it will be in the future -) might be dangerous since the stored listeners get executed on the arrival of a SPREAD or DEFAULT request. The data itself can simply and conveniently be passed through one of the aforementioned requests.

  • Also, there currently is no way of setting the rules for a peer (via a Gossip Overlay request, that is), other than setting SUPPRESS when told to do so by another trusted peer. Regardless of how this will be implemented it should be done in such a way that a malicious peer cannot abuse it. A good idea here would be to develop a solution which should work for setting any of the possible Gossip Rules, and still protect from abuse, thus making the solution reusable.

  • Currently, it seems that a lot of the available requests (nearly all of them bar the COLLECT request - when sent, in the on_message method, the COLLECT branch -) communicate with all of their neighbors, i.e. a multicast type of communication. This might be dangerous, since a malicious peer might seek to flood the network with bogus requests. This may also apply when the peer is not really malicious, but rather just irresponsible when using the available network capacity. Perhaps introducing a bandwidth limit to the peers might be a good idea here.

  • Generally speaking, the SUPPRESS request is quite dangerous due the fact that a peer can easily demand that other peers ignore neighbors (even if they are non-malicious). This allows malicious peers to abuse this functionality in various ways, for instance: to remove 'positive' influence from the neighborhood, modify the topology of the network, or increase vulnerability against other types of attacks. This particular request should be further refined, to prevent abuse.

  • Since most attacks are dependent on the ability of a peer to become trusted and join neighborhoods, then it's probably a good idea to seek ways through which malicious peers will find it hard or even impossible to do that. This will likely reduce the vulnerability of the system to other (more harmful) types of attacks.

@synctext
Copy link
Member Author

synctext commented Aug 1, 2018

Wireguard: related work on security, session cookies, DDoS protections and 4000 lines of beauty.

Getting our layering model clear: IPv8 delivers UDP messages. Also a discovery services, bit of DHT lookup. Should it stay away from sessions and shared state?

Brainstorming... Trustworthy_Gossip() implements the "Gossip Control Protocol" (GCP on top of IPv8). This adds sessions, statefullness, and long-lived shared secrets? Plus Libtorrent-based congestion control and bulk data exchange. Trustchain also has/needs a bulk data transfer protocol?

@qstokkink
Copy link
Contributor

Trustchain also has/needs a bulk data transfer protocol?

If libtorrent channels work out, this could also be a good candidate to sync with libtorrent.

@synctext
Copy link
Member Author

Boostrap idea:

  • we manually create thousands-million pre-trusted nodes
  • 25 MByte of top trustchain records of all time.
  • download this using Libtorrent and trustchain records
  • This links you to the existing network.
  • crawl to discover signed trustchain records by trust-by-default top trustchain nodes.
  • New Trustchain agent then can calculate personalised pagerank.

@qstokkink
Copy link
Contributor

Best idea so far:

Use a Polytomous Rasch model (Wikipedia) to reach a decision (SUPPRESS, DELETE, STORE, COLLECT, SPREAD) based on the decisions (SUPPRESS, DELETE, STORE, COLLECT, SPREAD) of others for a specific message.

@qstokkink qstokkink self-assigned this Feb 27, 2019
@synctext
Copy link
Member Author

Title brainstorm: 'middleware for provably trustworthy information flow'.

  • irrifutable ledger as underlying mechanism
  • system model: interaction, gossip and attestations
  • tamper-proof and Sybil-resilience
  • an verifiable random function guarantees uncoruptable witnesses and erasure protection

@grimadas
Copy link
Contributor

Some of the relevant work:

@qstokkink

This comment has been minimized.

@qstokkink

This comment has been minimized.

@synctext
Copy link
Member Author

synctext commented Mar 20, 2019

Related work called "safe gossip" by Maidsafe team, they did a badly executed, early, few million ICO. Push-pull gossip protocol. They use Scott Shenker approach from 2000.
image

We investigate whether such a large communication overhead is inherent to epidemic algorithms.
On the positive side, we show that the communication overhead can be reduced significantly.
We give an algorithm using only O(n ln ln n) transmissions and O(ln n)rounds.
In addition, we prove the robustness of this algorithm, e.g., against adversarial failures
struct Network {
    pool: CpuPool,
    // An mpsc channel sender for each node for giving new client messages to that node.
    message_senders: Vec<mpsc::UnboundedSender<String>>,
    // An mpsc channel receiver for getting the client messages and stats from the nodes.
    stats_receiver: mpsc::UnboundedReceiver<(Id, Vec<String>, Statistics)>,
    // The last set of client messages received via `stats_receiver` for each node.
    received_messages: HashMap<Id, Vec<String>>,
    // The last set of stats received via `stats_receiver` for each node.
    stats: HashMap<Id, Statistics>,
    // The futures for all nodes.  When these return ready, that node has finished running.
    node_futures: Vec<CpuFuture<(), Error>>,
    // All messages sent in the order they were passed in.  Tuple contains the message and the index
    // of the node used to send.
    client_messages: Vec<(String, usize)>,
    // Message which when sent to a node via its `message_sender` indicates to the node that it
    // should terminate.
    termination_message: String,
}

@qstokkink

This comment has been minimized.

@qstokkink

This comment has been minimized.

@qstokkink

This comment has been minimized.

@qstokkink

This comment has been minimized.

@synctext
Copy link
Member Author

Related work: the I2P people also build a gossip overlay (they have 32k nodes). The Network-Database offers:

I2P's netDb is a specialized distributed database, containing just two types of data - router contact information (RouterInfos)
and destination contact information (LeaseSets). Each piece of data is signed by the appropriate party and verified by anyone
 who uses or stores it. In addition, the data has liveliness information within it, allowing irrelevant entries to be dropped, newer
 entries to replace older ones, and protection against certain classes of attack.
The netDb is distributed with a simple technique called "floodfill", where a subset of all routers, called "floodfill routers",
maintains the distributed database.

Floodfill is their networkDatabase update mechanism and seems vulnerable and under-documented.

The floodfill netDb is a simple distributed storage mechanism. The storage algorithm is simple: send the data to the
closest peer that has advertised itself as a floodfill router. When the peer in the floodfill netDb receives a netDb store
from a peer not in the floodfill netDb, they send it to a subset of the floodfill netDb-peers. The peers selected are the
ones closest (according to the XOR-metric) to a specific key.

Not unlike our community numbering they also have identifiers:
image

@synctext
Copy link
Member Author

synctext commented Sep 2, 2020

Related work, also for the @alexander-stannat work. "RepuCoin: Your Reputation is Your Power". Show how to "grow" a reputation in mid-life and mature system. Quote:

image

@synctext
Copy link
Member Author

synctext commented Oct 19, 2021

Related work:

@synctext
Copy link
Member Author

synctext commented Oct 13, 2022

Related work:
Based on a June 2002 algorithm. Gossip library by Cornell design: "Smudge is a minimalist Go implementation of the "SWIM: scalable weakly-consistent infection-style process group membership protocol" (Scalable Weakly-consistent Infection-style Membership) protocol for cluster node membership, status dissemination, and failure detection developed at Cornell University by Motivala, et al. It isn't a distributed data store in its own right, but rather a framework intended to facilitate the construction of such systems." GITHUB : https://github.com/clockworksoul/smudge
[Based on historical 2001 work "On scalable and efficient distributed failure detectors"
Super naive all-to-all. Oops, no security, reputation, spam measures, or trust! (because this was, like, 21 years ago)

@synctext
Copy link
Member Author

synctext commented Nov 14, 2023

2021 related work: BASALT: A Rock-Solid Foundation for Epidemic Consensus Algorithms in Very Large, Very Open Networks. Use also the idea of diversity to fight Sybils. we will consider two paradigmatic scenarios where an attacker attempts a Sybil attack using many IP addresses Complex solution:
image
Quite clever bit of have dictated randomness with a pseudo-random function. Bit this still gives bias and is not strictly unbiased random. Difficult to predict if this design will actually work in Real NAT-based Internet.

definition by repeatedly exchanging neighbor lists with other peers,
discovering at each step peers that better match its ranking functions.
In the following, we first detail the use of ranking functions to
identify target neighbors. Then we discuss how nodes update these
ranking functions to make the random graph dynami

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants