~pierrenn/barkaelogy



barkaelogy

A small npm package to do bitcoin file archaeology.


Since the early days of bitcoin, non-financial data has been etched into the blockchain via several methods.

This npm package provides easy-to-use functions to help decode files contained in blockchains.

We support all methods given in the aforementioned link and more.

Specifically, this library supports:

If you're interested in coinbase or ASCII data, just use strings (or bitcoinstrings.com).

Getting Started

Prerequisites

You need a running Bitcoin Core (or bcash, bsv, ...) node daemon with txindex=1 enabled.

Pass connection information to the Parser constructor as described here.

Basic example

Read the bitcoin whitepaper stored in the blockchain into a local buffer:

const bark = require('barkaelogy');
const whitepaper_txid = "54e48e5f5c656b26c3bca14a8c95aa583d07ebe84dde3b7dd4a78f4e4186e713";

// await is only valid inside an async function in a CommonJS module
(async () => {
  const parser = new bark.Parser({network:"mainnet", username: "foo", password: "bar"});
  const data = await parser.extractData([whitepaper_txid]);
  // data.file is a buffer containing the file, data.mimetype contains its mimetype
})();

Some helpful functions:

  • getInputData/getOutputData: retrieve the parts of data contained in a raw transaction
  • parseData: converts parts of data into a file by detecting its mimetype
  • extractData(txid, parse_input=false): extracts data from a list of sequential txids; inputs are only checked if parse_input=true (calls the previous functions for you)

See the files in the tests/ directory for more helpers and usage examples.
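As a complement to the basic example, here is a minimal sketch of how the result of extractData could be written to disk. saveExtracted and the EXTENSIONS table are hypothetical helpers, not part of barkaelogy, and the sketch assumes data.mimetype is a plain string such as "application/pdf".

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical mimetype -> extension table (not part of barkaelogy); extend as needed.
const EXTENSIONS = {
  'application/pdf': '.pdf',
  'image/png': '.png',
  'image/jpeg': '.jpg',
  'text/plain': '.txt',
};

// Writes data.file to <dir>/<basename><ext> and returns the path written.
// Assumes data has the { file: Buffer, mimetype: string } shape described above.
function saveExtracted(data, dir, basename) {
  const ext = EXTENSIONS[data.mimetype] || '.bin'; // unknown types fall back to .bin
  const target = path.join(dir, basename + ext);
  fs.writeFileSync(target, data.file);
  return target;
}
```

With the parser from the basic example, this could be called as saveExtracted(await parser.extractData([whitepaper_txid]), '.', 'whitepaper') inside an async function.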

Running the tests

Before running the tests, make sure that the __connectionDetails variable in jest.config.js matches your bitcoin daemon configuration.

Then install the dependencies and check that the package works with npm run test (which uses jest).

License

This project is licensed under the GPLv3 License - see the COPYING file for details.

FAQ

How to extract the whole UTXO set?

Etching methods that use scriptPubKeys produce outputs that stay in the UTXO set, so we can trivially obtain a small superset of the etched transactions.

For example, you can use bitcoin-utxo-dump and keep only the txids that appear at least N times in the UTXO set:

$ bitcoin-utxo-dump -f txid -db chainstate_folder_location
$ huniq -c < utxodump.csv > uniq_tx_utxo.csv #use huniq (https://github.com/ahamlinman/huniq) or just uniq+sort
$ grep -v "^[1-9] " uniq_tx_utxo.csv | awk '{print $2}' > txids_to_check.csv # get a list of txids appearing at least 10 times in the UTXO set, ~450k as of March 2020
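The same counting step can also be done in Node if you would rather not chain shell tools. The sketch below is only an illustration: txidsFromDump and frequentTxids are hypothetical names, and it assumes the dump holds one txid per line with a "txid" header line.

```javascript
// Turns the text of the dump into a list of txids.
// Assumption: one txid per line, preceded by a "txid" header line.
function txidsFromDump(csvText) {
  return csvText
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line !== '' && line !== 'txid');
}

// Counts how often each txid occurs and keeps those seen at least minCount times.
function frequentTxids(txids, minCount) {
  const counts = new Map();
  for (const txid of txids) {
    counts.set(txid, (counts.get(txid) || 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n >= minCount)
    .map(([txid]) => txid);
}
```

For example, frequentTxids(txidsFromDump(require('fs').readFileSync('utxodump.csv', 'utf8')), 10) would give the candidates to pass to extractData. For a full chainstate dump, a streaming reader would be kinder to memory than reading the whole file at once.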

How can I etch my data?

Please just don't.

Bitcoin was designed to store financial data, not your wedding pictures. Don't spam the chain (and pollute the UTXO set) at the expense of everyone who runs a bitcoin node. Or at least use a blockchain dedicated to that. Or go spam BSV, they seem to like it. Or even better, use an appropriate data structure.

And if you insist, please at least:

  • don't create a new format
  • prepend <4B LE length><4B LE crc32 checksum> to your data
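To make the header layout concrete, here is a minimal sketch of how such a prefix could be built and verified. It rests on assumptions the text above does not spell out: the length counts the payload only (not the 8 header bytes), the checksum covers the payload only, and the CRC-32 variant is the common zlib/IEEE one. frame and unframe are hypothetical names.

```javascript
// Lookup table for the standard CRC-32 (reflected polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256).map((_, n) => {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? 0xEDB88320 ^ (c >>> 1) : c >>> 1;
  return c >>> 0;
});

// Returns the CRC-32 of a Buffer as an unsigned 32-bit integer.
function crc32(buf) {
  let crc = 0xFFFFFFFF;
  for (const byte of buf) crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >>> 8);
  return (crc ^ 0xFFFFFFFF) >>> 0;
}

// Prepends <4B LE length><4B LE crc32> to the payload.
// Assumption: both fields describe the payload only.
function frame(payload) {
  const header = Buffer.alloc(8);
  header.writeUInt32LE(payload.length, 0);
  header.writeUInt32LE(crc32(payload), 4);
  return Buffer.concat([header, payload]);
}

// Reads the header back and throws if the length or checksum does not match.
function unframe(framed) {
  const length = framed.readUInt32LE(0);
  const payload = framed.subarray(8, 8 + length);
  if (payload.length !== length || crc32(payload) !== framed.readUInt32LE(4)) {
    throw new Error('length or checksum mismatch');
  }
  return payload;
}
```

With a header like this, a reader can tell where the data ends and whether it was reassembled correctly, without guessing from the file contents.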