Bitswap, the concrete implementation of the IPFS network data exchange interface, focuses on obtaining the required blocks quickly and efficiently.
Original title: “Secret IPFS Data Exchange Module Bitswap”
Written by: Ma Yaoyao, working at Data Grid Lab BitXMesh
InterPlanetary File System (IPFS) is a network transmission protocol designed to create persistent and distributed storage and sharing of files.
Unlike a traditional file system, it is a peer-to-peer hypermedia distribution protocol based on content addressing.
The nodes in the IPFS network form a distributed file system network. Bitswap is the core module of IPFS, responsible for requesting and sending data blocks with other nodes in the network.
Figure 1
IPFS splits files into data blocks, each identified by a content identifier (CID).
The content of an IPFS file is stored across different nodes: every node stores the root block, a small number of nodes store the complete file data, and most nodes store only some of the file's blocks. Because the blocks are spread across nodes, the Bitswap protocol addresses the problem of efficiently obtaining all data blocks from multiple nodes.
Figure 2 Schematic diagram of multi-node file storage
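To make content addressing concrete, the following is a minimal sketch of splitting content into blocks keyed by a content-derived identifier. It rests on simplifying assumptions: the identifier is a plain hex SHA-256 digest rather than a real CID, the block size is tiny and purely illustrative, and the root block that links the pieces together is omitted. The names `block_id` and `split_into_blocks` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose; an illustrative value, not an IPFS default


def block_id(data: bytes) -> str:
    """Stand-in for a CID: the hex SHA-256 digest of the block's bytes."""
    return hashlib.sha256(data).hexdigest()


def split_into_blocks(content: bytes, size: int = BLOCK_SIZE) -> dict:
    """Cut content into fixed-size blocks keyed by their content-derived ID."""
    blocks = {}
    for offset in range(0, len(content), size):
        chunk = content[offset:offset + size]
        blocks[block_id(chunk)] = chunk
    return blocks


# Any node holding a block can serve it, because the ID depends only on the bytes.
blocks = split_into_blocks(b"hello bitswap")
```

Because the identifier is derived from the bytes alone, a receiver can verify a block from any node by recomputing the digest.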
Overall architecture
This analysis is based on Bitswap v0.3.3.
IPFS obtains blocks from Bitswap. Bitswap is the specific implementation of the IPFS exchange interface and is responsible for completing the IPFS network data exchange function.
Figure 3 Bitswap component architecture diagram
The Bitswap protocol is relatively complex. To reduce implementation complexity, it is divided into a block request module, a block sending module, a session management module, a block provider discovery module, a network module, and others.
The Bitswap module is responsible for receiving new messages and providing an interface for external data exchange.
The session management module is used to manage multiple sessions, and each session manages the download of a set of files to improve download efficiency.
The block sending module manages the sending of data blocks to other nodes. The block request module manages data block requests. The block provider discovery module locates data blocks in the network through the DHT. A session usually calls this module when it cannot find a node that already has the required block.
Protocol overview
In order to complete the block exchange between nodes, Bitswap defines communication messages and communication protocols.
Figure 4
Bitswap message processing flow:
- When IPFS requests a file block, Bitswap sends a want-have message carrying CID1 to all connected nodes. Each node replies with a have or dont-have message depending on whether it holds the block for CID1.
- The client then sends a want-block message to a node that has CID1, and that node returns the corresponding block message.
- When no node has the requested block, Bitswap broadcasts want-have to all connected nodes, or finds nodes that hold the block through the DHT.
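The exchange above can be sketched as a small in-memory simulation. This is only an illustration of the message order, assuming direct method calls in place of network messages; the `Node` class and the `fetch` function are hypothetical names, not part of Bitswap.

```python
class Node:
    """A peer that answers Bitswap-style requests from a local block store."""

    def __init__(self, name, blocks=None):
        self.name = name
        self.blocks = dict(blocks or {})

    def want_have(self, cid):
        # Answer only with presence information, never with the data itself.
        return "HAVE" if cid in self.blocks else "DONT_HAVE"

    def want_block(self, cid):
        # Return the block bytes, or None if this node does not hold them.
        return self.blocks.get(cid)


def fetch(cid, peers):
    """Ask every peer want-have, then send want-block to the first HAVE."""
    responses = {peer.name: peer.want_have(cid) for peer in peers}
    for peer in peers:
        if responses[peer.name] == "HAVE":
            return peer.name, peer.want_block(cid)
    # No connected peer has the block; the protocol would turn to the DHT here.
    return None, None
```

The cheap want-have round tells the client who holds the block before any block data is requested, so the block itself is requested from a single node.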
Detailed module
Bitswap module
The Bitswap module is responsible for receiving new messages and implementing the external data exchange interface.
When a new message is received, Bitswap processes it as follows:
- Record statistics about the message.
- Notify the block sending module of any wants in the message, so that it can respond to the requesting node as appropriate.
- Notify the block sending module of any received blocks, so that it can send them to any node that needs them according to each node’s want list.
- Notify SessionManager of the received blocks, HAVEs and DONT_HAVEs, so that SessionManager can notify the sessions related to the message.
Bitswap uses the facade pattern to make the module easier to use: the Bitswap subsystem is simple to call, and IPFS and Bitswap avoid tight coupling.
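The dispatch steps listed above might look like the following sketch of a facade. It is not the go-bitswap code: `Message`, `Recorder` and `BitswapFacade` are hypothetical names, and the two subsystems are replaced by recorders that only log the notifications they receive.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    wants: list = field(default_factory=list)       # CIDs the sender asks for
    blocks: dict = field(default_factory=dict)      # CID -> block bytes
    haves: list = field(default_factory=list)
    dont_haves: list = field(default_factory=list)


class Recorder:
    """Hypothetical stand-in for a subsystem; it just logs the calls it gets."""

    def __init__(self):
        self.calls = []

    def notify(self, event, *args):
        self.calls.append((event, *args))


class BitswapFacade:
    """Single entry point that fans a message out to the subsystems."""

    def __init__(self, engine, session_manager):
        self.engine = engine
        self.session_manager = session_manager
        self.messages_received = 0

    def receive_message(self, msg):
        self.messages_received += 1                              # statistics
        if msg.wants:
            self.engine.notify("wants", msg.sender, msg.wants)   # remote wants
        if msg.blocks:
            self.engine.notify("blocks", list(msg.blocks))       # new blocks
        self.session_manager.notify(                             # sessions
            "received", msg.sender, list(msg.blocks), msg.haves, msg.dont_haves)
```

The caller only ever touches `receive_message`; which subsystem handles which part of the message stays an internal detail.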
Session management module
The session management module is used to manage multiple data block download sessions, and each session manages the download of a set of files.
When SessionManager receives a new message, it will:
- Notify the BlockPresenceManager component so that it can track which nodes have each block.
- Notify the Sessions that are interested in the received blocks and wants.
- Notify the PeerManager component of the received blocks. PeerManager checks whether wants for a received block were sent to any nodes. If so, it sends “CANCEL” messages to those nodes so that they do not send the same block again.
The Session management module improves the efficiency of data interaction by coordinating the data requirements of multiple sessions, and avoids repeated requests and transmissions of data blocks.
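As a rough sketch of this coordination, the class below keeps three small indexes and, for each incoming message, works out which sessions to notify and which CANCELs to send. All names are hypothetical, and skipping the CANCEL for the node that delivered the block is an assumption of this sketch.

```python
class SessionManagerSketch:
    def __init__(self):
        self.interest = {}      # session id -> set of CIDs it wants
        self.sent_wants = {}    # CID -> set of peers we sent a want to
        self.presence = {}      # CID -> set of peers known to have it

    def register(self, session_id, cids):
        self.interest.setdefault(session_id, set()).update(cids)

    def record_want(self, cid, peer):
        self.sent_wants.setdefault(cid, set()).add(peer)

    def receive(self, sender, blocks=(), haves=()):
        """Return (sessions to notify, CANCELs to send) for one message."""
        for cid in list(blocks) + list(haves):
            self.presence.setdefault(cid, set()).add(sender)
        touched = set(blocks) | set(haves)
        sessions = sorted(s for s, cids in self.interest.items() if cids & touched)
        cancels = []
        for cid in blocks:
            # The block arrived, so outstanding wants elsewhere are redundant.
            for peer in sorted(self.sent_wants.pop(cid, set()) - {sender}):
                cancels.append((peer, cid))
        return sessions, cancels
```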
Session module
A Session manages the download of a group of related files, or of a single file, to make that download more efficient. When IPFS calls Bitswap, Bitswap creates a new Session and calls the corresponding Session methods, such as GetBlocks(), to obtain blocks. The Session maintains a list of nodes, and data is requested only from the nodes in the session rather than from all connected nodes. When none of the nodes in the session has a block, Bitswap finds nodes that have the block through the DHT and adds them to the session.
A session has no nodes when it starts, so it begins in “discovery” mode. When IPFS initially requests a block from a session, the session processes the request as follows:
- Notify the SessionInterestManager component of the blocks it is interested in.
- Notify sessionWantManager of the required blocks.
- Notify the PeerManager component to send a “want-have” message to the connected nodes to discover which nodes have the required blocks.
- Query the ProviderQueryManager component to discover which nodes have the block.
When the session receives a message with “HAVE” or “block”, it will notify the SessionPeerManager component.
When the session receives a message with “block”, it will notify the SessionInterestManager component.
Once the session has a node, it is no longer in “discovery” mode. When IPFS requests a subsequent block, the session will notify the sessionWantSender component. The sessionWantSender component notifies the PeerManager component to send “want-have” and “want-block” messages to the nodes in the session.
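The switch out of “discovery” mode can be illustrated with a few lines of state. This is a simplified sketch with hypothetical names: it tracks only which nodes have proved useful and which nodes the next want should go to.

```python
class SessionSketch:
    def __init__(self, connected_peers):
        self.connected_peers = list(connected_peers)
        self.session_peers = []     # peers that answered HAVE or sent a block

    @property
    def discovering(self):
        # With no session peers yet, wants must go to every connected peer.
        return not self.session_peers

    def targets(self):
        """Peers that should receive the next want."""
        return self.connected_peers if self.discovering else self.session_peers

    def on_response(self, peer, kind):
        # Only HAVE and block responses show that the peer holds wanted data.
        if kind in ("HAVE", "block") and peer not in self.session_peers:
            self.session_peers.append(peer)
```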
For each block required by the session, the sessionWantSender component asks the BlockPresenceManager component which nodes have sent “HAVE” for the block, in order to determine which node is most likely to own it. If multiple nodes have sent “HAVE”, the choice among them is based on how quickly each responded to previous requests.
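A minimal sketch of that selection follows, assuming response speed is available as a single latency figure per node. The function name and the `latency_ms` mapping are hypothetical; the actual component may weigh nodes differently.

```python
def pick_peer(cid, haves, latency_ms):
    """Pick the fastest peer that announced HAVE for cid, or None.

    haves:      CID -> set of peers that sent HAVE
    latency_ms: peer -> response time of its previous request (assumed metric)
    """
    candidates = haves.get(cid, set())
    if not candidates:
        return None  # nobody announced the block yet
    # Peers with no recorded latency sort last; ties break on the peer name.
    return min(candidates, key=lambda p: (latency_ms.get(p, float("inf")), p))
```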
Because DHT content discovery is slow and consumes considerable network bandwidth, the session instead requests data from a group of nodes that are likely to hold the target blocks, which greatly improves the efficiency of block acquisition.
Block sending module
The block sending module is responsible for managing the sending of data blocks to other nodes, including the node demand management component (Ledger) and the message sending task queue (PeerTaskQueue).
Engine is the processing class of the block sending module. When Engine is notified of new wants, it will:
- Add each want to the Ledger, which stores the want list of each node.
- Check the blockstore for the corresponding block and add a task to the PeerTaskQueue component. If the blockstore does not have the block, it adds a “DONT_HAVE” task; if the blockstore has the block, it adds a “HAVE” task for a “want-have” and a “block” task for a “want-block”.
When Engine receives a new block, it checks the Ledger to see whether any node wants that block. For each node that sent “want-have”, it adds a “HAVE” task to the PeerTaskQueue component, and for each node that sent “want-block”, it adds a “block” task.
Engine periodically fetches tasks from the PeerTaskQueue component and creates messages containing “blocks”, “HAVEs” and “DONT_HAVEs”.
The PeerTaskQueue component prioritizes tasks, and the node with the least amount of data in the sending queue has the highest priority. Through this flow control measure, the efficiency of data block sending processing is improved.
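The Engine behaviour described in this section can be sketched as follows. The sketch makes several assumptions: the class and method names are hypothetical, the Ledger and the task queue are plain dictionaries, and "least data in the sending queue" is approximated by the total size of the block tasks queued for a node.

```python
class EngineSketch:
    def __init__(self, blockstore):
        self.blockstore = blockstore    # CID -> block bytes
        self.ledger = {}                # peer -> {CID: "want-have" | "want-block"}
        self.queue = {}                 # peer -> list of (kind, CID) tasks

    def _task(self, cid, want_type):
        if cid not in self.blockstore:
            return ("DONT_HAVE", cid)
        return ("HAVE", cid) if want_type == "want-have" else ("block", cid)

    def on_want(self, peer, cid, want_type):
        self.ledger.setdefault(peer, {})[cid] = want_type
        self.queue.setdefault(peer, []).append(self._task(cid, want_type))

    def on_block(self, cid, data):
        self.blockstore[cid] = data
        for peer, wants in self.ledger.items():
            if cid in wants:
                self.queue.setdefault(peer, []).append(self._task(cid, wants[cid]))

    def _pending_bytes(self, peer):
        # Only "block" tasks carry payload; HAVE and DONT_HAVE count as zero.
        return sum(len(self.blockstore[c])
                   for kind, c in self.queue[peer] if kind == "block")

    def next_message(self):
        """Drain the tasks of the peer with the least queued block data."""
        ready = [p for p, tasks in self.queue.items() if tasks]
        if not ready:
            return None
        peer = min(ready, key=lambda p: (self._pending_bytes(p), p))
        tasks, self.queue[peer] = self.queue[peer], []
        return peer, tasks
```

Serving the node with the least queued data first keeps a single node with many large requests from starving the others.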
Block request module
The block request module is responsible for managing data block requests. The PeerManager component creates a MessageQueue component for each node connected to Bitswap, records which nodes each “want-have” and “want-block” has been sent to, and directs any new wants to the correct node. The MessageQueue component makes sending more efficient by combining multiple wants into one message before sending it to the node.
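The batching idea can be shown with a small per-node queue. This is a sketch under assumptions: the class name is hypothetical, a message is a plain dictionary, and the rule that a “want-block” supersedes a “want-have” for the same CID is an assumption of this example.

```python
class MessageQueueSketch:
    """Per-peer queue that merges pending wants into a single message."""

    def __init__(self, peer):
        self.peer = peer
        self.pending = {}   # CID -> want type, waiting to be sent
        self.sent = {}      # CID -> want type, already sent to this peer

    def add_want(self, cid, want_type):
        # Skip wants this peer already received, unless a want-have is
        # being upgraded to a want-block.
        if self.sent.get(cid) in (want_type, "want-block"):
            return
        # Never downgrade a pending want-block to a want-have.
        if self.pending.get(cid) == "want-block":
            return
        self.pending[cid] = want_type

    def flush(self):
        """Return one message with every pending want, or None if empty."""
        if not self.pending:
            return None
        message = {"to": self.peer, "wants": dict(self.pending)}
        self.sent.update(self.pending)
        self.pending.clear()
        return message
```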
Content discovery module
When Bitswap cannot find a node that already has the required block, it will use DHT for content discovery. Bitswap manages these requests through the ProviderQueryManager component. The ProviderQueryManager component rate-limits the requests and deduplicates the requests in progress.
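Deduplication and rate limiting of provider queries might be sketched as below. The sketch models the rate limit as a cap on the number of queries in progress, which is an assumption, and `find_providers` is a hypothetical stand-in for the DHT lookup. Completion is triggered explicitly so the example stays synchronous.

```python
from collections import deque


class ProviderQuerySketch:
    def __init__(self, find_providers, max_in_progress=2):
        self.find_providers = find_providers    # assumed lookup: CID -> peers
        self.max_in_progress = max_in_progress  # illustrative limit
        self.listeners = {}                     # CID -> callbacks sharing one query
        self.running = set()                    # CIDs whose query has started
        self.queued = deque()                   # CIDs waiting for a free slot

    def request(self, cid, callback):
        if cid in self.listeners:
            # Deduplicate: attach to the query that already exists.
            self.listeners[cid].append(callback)
            return
        self.listeners[cid] = [callback]
        if len(self.running) < self.max_in_progress:
            self.running.add(cid)
        else:
            self.queued.append(cid)

    def complete(self, cid):
        """Finish a running query, deliver its result, start a queued one."""
        self.running.remove(cid)
        providers = self.find_providers(cid)
        for callback in self.listeners.pop(cid):
            callback(cid, providers)
        if self.queued:
            self.running.add(self.queued.popleft())
```

Two sessions asking for the same CID share one lookup, and a burst of different CIDs waits in the queue instead of flooding the DHT.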
Summary
Bitswap, as the specific implementation of the IPFS network data exchange interface, has designed and implemented a set of efficient data exchange protocols between nodes.
The main focus of the protocol is how to quickly and efficiently obtain the required blocks, including node selection strategies, and how to maximize the use of each node’s capabilities.
Bitswap uses the facade pattern to make the module easier to use, and in its implementation it distributes complex dependencies across the individual functional modules, thereby reducing the complexity of the system.