www.crossflux.org

crcp: cooperative remote file copy

Overview

Distributing a file to a large number of hosts is time-consuming, even when the file is of modest size. Consider, for instance, a company that needs to distribute a 4 MB virus update to all 100,000 machines on its intranet. With a single server that has a bandwidth capacity of 100 Mb/s and 90% link utilization, distribution takes almost 10 hours.

With cooperative methods, where each client that has already downloaded the file serves another client, this time can be reduced to approximately 1 minute (or even less with more elaborate distribution protocols). Unsurprisingly, the improvement factor grows significantly as the file size increases to tens or hundreds of megabytes.
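
The single-server figure can be checked with a little arithmetic. The sketch below uses only the numbers quoted above. The doubling model in the second half is an idealized assumption of ours (every host that holds the file serves exactly one new host per round) and not a description of any particular crcp protocol. The function name estimate_distribution is hypothetical.

```shell
# Back-of-the-envelope estimate using the figures quoted above.
estimate_distribution() {
    awk 'BEGIN {
        file_mbit = 4 * 8        # 4 MB file, in megabits
        clients   = 100000       # machines to update
        rate      = 100 * 0.90   # 100 Mb/s link at 90% utilization

        # Single server: all copies share the one server link.
        printf "single server: %.1f hours\n", file_mbit * clients / rate / 3600

        # Idealized cooperation (assumption): the number of hosts holding
        # the file doubles in every round, starting from the source alone.
        have = 1; rounds = 0
        while (have < clients + 1) { have *= 2; rounds++ }
        printf "doubling rounds: %d\n", rounds
    }'
}

estimate_distribution
```

Under these assumptions the server needs about 9.9 hours, whereas doubling reaches every machine in 17 rounds, each costing roughly the time of a single transfer. Real protocols add connection setup and coordination overhead on top of this.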

Cooperative remote file copy (crcp) is a cooperative version of the well-known rcp program. It was developed as part of a research project and should therefore be considered unstable software. It provides middleware that can be used not only to quickly replicate files on a set of remote machines, but also to test and deploy new distribution protocols. At the end of the distribution process, statistics are returned to the source so that developers can evaluate the performance of their protocols. The most efficient protocol is called ptree. The least efficient is parallel, in which the source sends the data simultaneously to all peers (no cooperation).

For more information on program usage, run crcp with the -h flag. Note that the program has many undocumented features.

Download

Documentation

Documentation is currently minimal. Some information is available in the README file included in the software distribution. For general information about the protocols currently implemented in crcp, please refer to the following paper (available from the publications page):

  • E.W. Biersack, P. Rodriguez, and P. Felber.
    Performance Analysis of Peer-to-Peer Networks for File Distribution.
    In Proceedings of the 5th International Workshop on Quality of Future Internet Services (QofIS'04), pp. 1-10, Barcelona, Spain, September 2004.

Installation

Before installing crcp, make sure you have installed the libevent library (available from http://www.monkey.org/~provos/libevent/).

After installing libevent and unpacking the crcp distribution, you can compile it using the following commands:

    ./configure --prefix=<installation-dir> --with-libevent=<libevent-dir>
    make
    make install

The program is statically linked against libevent, which makes it unnecessary to deploy the library on all participating peers. We have tested crcp on various distributions of Linux and on Windows (using Cygwin).

Execution

The crcp program can be started in source or peer mode. The source drives the whole content distribution process. The peers listen for commands from the source and actively participate in the distribution process.

The source can optionally launch remote peers via ssh. The easiest approach is to deploy ssh keys and use a local ssh agent.
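
As a rough sketch of that approach, the functions below deploy a key and load it into an agent using the standard OpenSSH tools. The key path, the host names, and the function names are hypothetical placeholders, so adjust them to your environment.

```shell
# Hypothetical key path and peer names; replace with your own.
KEY="$HOME/.ssh/id_crcp"
PEERS="peer1.example.org peer2.example.org"

deploy_keys() {
    # Generate a key pair once (ssh-keygen prompts for a passphrase).
    [ -f "$KEY" ] || ssh-keygen -t ed25519 -f "$KEY"
    # Install the public key on every peer.
    for host in $PEERS; do
        ssh-copy-id -i "$KEY.pub" "$host"
    done
}

start_agent() {
    # Start an agent for the current shell and load the key into it.
    eval "$(ssh-agent -s)"
    ssh-add "$KEY"
}
```

Run deploy_keys once, then start_agent in the shell from which you launch the crcp source, so that the ssh connections it opens do not prompt for a passphrase.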

You can test crcp by executing the following steps:

  1. Create a sample file for distribution:
      dd if=/dev/zero count=1 ibs=5M of=5M
    
  2. Start crcp manually (here on a single host):
      ./crcp -P 9000 --info &
      ./crcp -P 9001 --info &
      ./crcp --info \
             --protocol parallel \
             file://5M \
             tcp://localhost:9000/5M-9000 \
             tcp://localhost:9001/5M-9001
    
  3. Alternatively, you can start crcp on remote peers via ssh. The program must be installed in the same directory on all machines, and the user must have permission to execute commands on the remote machines (e.g., using a local ssh agent):
      ./crcp --info \
             --protocol tree \
             --program `pwd`/crcp \
             --extra "degree=2" \
             file://5M \
             ssh://localhost:9000/5M-9000 \
             ssh://localhost:9001/5M-9001 \
             ssh://localhost:9002/5M-9002 \
             ssh://localhost:9003/5M-9003 \
             ssh://localhost:9004/5M-9004 \
             ssh://localhost:9005/5M-9005
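
After a test run, you may want to confirm that the copies match the original. The helper below is a minimal sketch using cmp. The function name verify_copies is hypothetical, and it assumes the copies are reachable as local files (as in the single-host test of step 2, where we assume the peers write 5M-9000 and 5M-9001 into their working directory).

```shell
# Compare each copy byte-for-byte against the source file.
# Returns non-zero if any copy is missing or differs.
verify_copies() {
    src=$1
    shift
    status=0
    for copy in "$@"; do
        if cmp -s "$src" "$copy"; then
            echo "OK       $copy"
        else
            echo "MISMATCH $copy"
            status=1
        fi
    done
    return $status
}

# Example, after step 2 has completed:
#   verify_copies 5M 5M-9000 5M-9001
```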
    

Command-line options

The command-line options, as of version 0.8.4, are listed below:

    crcp -- cooperative remote file copy

    Usage:
      crcp [options...] local remote...
      crcp [options...] local @file
      crcp [options...]

    Arguments:
      local
            Local path name of the form `file://<path>'
      remote
            Remote path name of the form `ssh://<user>@<host>:<port>/<path>' or `tcp://<host>:<port>/<path>'
      @file
            File containing a list of remote path names
      <no argument>
            Wait for incoming file

    Options:
      --silent
      --fatal
      --error
      --warn
      --info
      --debug
            Print no, little, or much information (default=warn)
      -C, --program <string> (default="crcp")
            Path to the crcp program to start remotely using ssh
      -H, --host <string> (default="localhost")
            Local host
      -P, --port <int>
            Local port (default=9999)
      -S, --ssh "option"
            Option passed to ssh, e.g., -S "-2" (one ssh option per -S)
      -T, --timeout <int>
            Timeout before shutting down when there is no activity (default=60 seconds)
      -V, --version
            Print version number
      -d, --daemon
            Run program as daemon (peer only)
      -h, --help
            Print this message
      -p, --protocol <string>
            Protocol to use for content distribution (default="parallel")
      -s, --size <int>
            Define chunk size (default=16384 bytes)
      -x, --extra "args"
            Extra protocol-specific arguments
    Valid protocols:
      parallel
      linear
      tree
      ptree
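
With many peers, the @file form is more convenient than listing every remote path on the command line. The sketch below generates such a list. The help text does not specify the file format, so one remote path per line is an assumption on our part, and the function name, host names, and target path are hypothetical.

```shell
# Print one tcp:// remote path per line (assumed @file format).
# Usage: make_peer_list <port> <remote-path> <host>...
make_peer_list() {
    port=$1
    path=$2
    shift 2
    for host in "$@"; do
        printf 'tcp://%s:%s/%s\n' "$host" "$port" "$path"
    done
}

make_peer_list 9000 5M-copy peer1.example.org peer2.example.org

# To use the list with crcp, redirect it to a file and pass it as @file:
#   make_peer_list 9000 5M-copy peer1.example.org peer2.example.org > peers.txt
#   ./crcp --info file://5M @peers.txt
```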

Feedback

If you have any questions, suggestions, or comments, please do not hesitate to contact us.