IPIP-0000: On-Demand Pinning Based on DHT Provider Counts


1. Summary

Defines a mechanism for IPFS nodes to automatically pin content when DHT provider counts fall below a configurable replication target, and unpin when replication has recovered above target for a grace period.

2. Motivation

IPFS content availability depends on at least one provider remaining online. Content hosted by a small number of nodes is fragile.

Node operators who want to help keep content alive must manually monitor provider counts and re-pin content when replication drops. This is tedious, error-prone, and doesn't scale. Conversely, nodes may continue pinning content that already has abundant providers elsewhere, wasting local storage that could serve under-replicated content instead.

A standardized on-demand pinning mechanism would let community nodes act as an automatic safety net: pinning content that is at risk of disappearing, and releasing it once enough other providers exist.

3. Detailed design

This IPIP defines an on-demand pinning mechanism with three main components: a registry of monitored CIDs (3.1), a background checker (3.2), and configuration parameters (3.3).

3.1 Registry

Implementations maintain a persistent registry of CIDs to monitor. Each entry tracks at minimum the CID, whether the checker currently holds a pin for it, and the state of any running grace-period timer (see 3.2).

Users add and remove CIDs from the registry explicitly. Adding a CID does not immediately pin it -- it only registers it for monitoring.
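A minimal sketch of such a registry, assuming illustrative field names derived from the checker state described in 3.2 (persistence is omitted; a real implementation would write entries to disk):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RegistryEntry:
    """State tracked for one monitored CID (field names are illustrative)."""
    cid: str
    pinned_by_checker: bool = False         # did the on-demand checker create a pin?
    grace_deadline: Optional[float] = None  # unix time when the grace period expires

class Registry:
    """In-memory stand-in for the persistent registry of CIDs to monitor."""
    def __init__(self):
        self._entries: dict[str, RegistryEntry] = {}

    def add(self, cid: str) -> None:
        # Adding a CID registers it for monitoring; it does NOT pin it.
        self._entries.setdefault(cid, RegistryEntry(cid))

    def remove(self, cid: str) -> None:
        self._entries.pop(cid, None)

    def entries(self) -> list[RegistryEntry]:
        return list(self._entries.values())
```

Note that `add` leaves `pinned_by_checker` false: pinning only happens later, when the background checker observes low replication.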

3.2 Background checker

A periodic loop evaluates every registered CID:

  1. Query the DHT for providers of the CID (excluding self)
  2. If providers < replication target and not currently pinned: recursively pin the content
  3. If providers >= replication target and currently pinned: start the grace period timer (if not already running)
  4. If grace period has elapsed: unpin the content
  5. If providers drop below target again while grace period is running: reset the timer

The checker skips CIDs that have a user-created pin, to avoid interfering with manual pin management. The periodic provider query replaces the periodic re-provide and only adds overhead for actively pinned CIDs (those below the replication target).
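One pass of the checker over a single entry can be sketched as follows. This is an illustrative sketch, not a normative algorithm: `count_providers`, `pin`, and `unpin` are hypothetical stand-ins for the node's DHT query and pin APIs, and the entry fields (`pinned_by_checker`, `grace_deadline`) are assumed bookkeeping:

```python
import time

REPLICATION_TARGET = 3    # hypothetical value; the spec leaves concrete values TBD
GRACE_PERIOD = 24 * 3600  # hypothetical 24-hour grace period (see section 4)

def check_entry(entry, count_providers, pin, unpin, now=None):
    """One evaluation of a registered CID, following steps 1-5 above.

    `entry` needs fields: cid, pinned_by_checker, grace_deadline.
    `count_providers(cid)` queries the DHT for providers, excluding self;
    `pin(cid)` / `unpin(cid)` stand in for the node's (recursive) pin API.
    """
    now = time.time() if now is None else now
    providers = count_providers(entry.cid)             # step 1: query the DHT

    if providers < REPLICATION_TARGET:
        entry.grace_deadline = None                    # step 5: reset any running timer
        if not entry.pinned_by_checker:
            pin(entry.cid)                             # step 2: recursively pin
            entry.pinned_by_checker = True
    elif entry.pinned_by_checker:
        if entry.grace_deadline is None:
            entry.grace_deadline = now + GRACE_PERIOD  # step 3: start grace timer
        elif now >= entry.grace_deadline:
            unpin(entry.cid)                           # step 4: grace elapsed, unpin
            entry.pinned_by_checker = False
            entry.grace_deadline = None
```

Because the timer is reset whenever providers drop below target, a CID only gets unpinned after replication has stayed at or above target for the full grace period.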

3.3 Configuration parameters

Implementations MUST support the following parameters (values are TBD):

  - Replication target: the minimum number of DHT providers below which the checker pins the content
  - Check interval: how often the background checker evaluates registered CIDs
  - Grace period: how long replication must stay at or above target before the checker unpins
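A configuration sketch with hypothetical parameter names and placeholder defaults (the spec deliberately leaves concrete values TBD; the validation rules shown are illustrative, not normative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OnDemandPinningConfig:
    """Hypothetical parameter set; names and defaults are placeholders."""
    replication_target: int = 3      # pin when DHT providers fall below this
    check_interval_s: int = 3600     # how often the background checker runs
    grace_period_s: int = 24 * 3600  # sustained recovery time before unpinning

    def __post_init__(self):
        if self.replication_target < 1:
            raise ValueError("replication target must be at least 1")
        if self.check_interval_s <= 0 or self.grace_period_s < 0:
            raise ValueError("check interval must be positive, grace period non-negative")
```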

3.4 Pin naming

When the checker creates a pin, it SHOULD use a well-known name (e.g., "on-demand") to distinguish it from user-created pins. The checker MUST NOT unpin content whose pin name does not match, to avoid removing user pins.
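The MUST NOT rule above amounts to a name check before every unpin. A minimal guard, assuming hypothetical `get_pin_name` and `unpin` helpers standing in for the node's pin API:

```python
ON_DEMAND_PIN_NAME = "on-demand"  # the well-known name suggested above

def safe_unpin(cid, get_pin_name, unpin):
    """Unpin only pins the checker itself created, identified by name.

    `get_pin_name(cid)` and `unpin(cid)` are stand-ins for the node's pin API.
    Returns True if the pin was removed, False if it was left alone.
    """
    if get_pin_name(cid) != ON_DEMAND_PIN_NAME:
        return False  # MUST NOT remove pins with other names (user pins)
    unpin(cid)
    return True
```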

3.5 Pinning scope

The checker MUST use recursive pins. Direct pins do not preserve content availability since they do not protect linked blocks.

4. Design rationale

The design favors simplicity: it relies entirely on existing DHT infrastructure for provider discovery and existing pin semantics for storage. No new wire protocols or peer coordination are introduced.

The grace period mechanism prevents thrashing. Without it, a CID hovering near the replication target would be pinned and unpinned on every check cycle. A 24-hour default gives enough time to confirm that new providers are stable.

The pin naming convention lets the checker coexist with user pins. If a user manually pins a CID that is also registered for on-demand pinning, the checker will not interfere.

4.1 User benefit

Nodes can contribute storage where it matters most. Instead of pinning content indefinitely regardless of how many other providers exist, on-demand pinning frees storage automatically when content is well-replicated, making room for content that actually needs help.

On-demand pinning can be easily integrated into existing UI flows.

4.2 Compatibility

This feature is purely additive. Nodes that do not implement on-demand pinning are unaffected. On-demand pinning nodes interact with the network using only existing DHT queries and standard pin operations -- no protocol changes are required.

4.3 Security

DHT provider counts can be gamed. A Sybil attack could inflate provider counts by announcing many fake provider records, tricking nodes into unpinning content that is not actually well-replicated. Implementations SHOULD document this limitation. The grace period provides partial mitigation: an attacker would need to sustain fake provider records for the full grace duration.

4.4 Alternatives

A dedicated replication protocol between cooperating nodes was considered. This would allow nodes to explicitly coordinate who pins what, avoiding redundant work. However, it would require an overlay network protocol and peer discovery for replication partners -- dramatically increasing complexity or introducing centralized components. The DHT-based approach reuses existing infrastructure and requires no coordination between nodes.

5. Test fixtures

Not applicable. This IPIP defines node behavior, not content-addressed data formats.

A. References

[rfc2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119

B. Acknowledgments

Editor
Cornelius Ihle