Setting Up a Local Musicbrainz Mirror

lpallard1 edited this page Oct 25, 2014 · 26 revisions

Installing a Local Musicbrainz Mirror

This guide shows how to set up a local MusicBrainz mirror on a Debian-based Linux distro. It is untested elsewhere, but it should work on other Linux distros too, usually just by installing the required packages with your distro's package manager.

The following is pulled from the official guide at: https://github.com/metabrainz/musicbrainz-server/blob/master/INSTALL with some notes on some error-prone parts.

Also see: http://pastebin.com/raw.php?i=a99EqPzb for a more condensed version of the steps involved.

Installing the MusicBrainz Server

The MusicBrainz Server is the web frontend to the MusicBrainz Database, and is accessible at http://musicbrainz.org.

This document explains the steps necessary to set up your own MusicBrainz Server. If you require any assistance with these instructions, please feel free to contact us via the information given at the bottom of this document.

Prerequisites

  1. A Unix-based operating system
The MusicBrainz development team uses a mix of Ubuntu and Debian, but Mac OS
X will work just fine, if you're prepared to potentially jump through some
hoops. If you are running Windows we recommend you set up an Ubuntu virtual
machine.
This document will assume you are using Ubuntu for its instructions.
  1. Perl (at least version 5.10.1)
Perl comes bundled with most Linux operating systems. You can check your
installed version of Perl with:
    perl -v
  1. PostgreSQL (at least version 8.4)
PostgreSQL is required, along with its development libraries. To install
using packages run the following, replacing 8.x with the latest version.
    sudo apt-get install postgresql-8.x postgresql-server-dev-8.x postgresql-contrib
Alternatively, you may compile PostgreSQL from source, but then make sure to
also compile the cube extension found in contrib/cube. The database import
script will take care of installing that extension into the database when it
creates the database for you.
  1. Git
The MusicBrainz development team uses Git for their DVCS. To install Git,
run the following:
    sudo apt-get install git-core
  1. Memcached
By default the MusicBrainz server requires a Memcached server running on the
same server with default settings.  You can change the memcached server name
and port or configure other datastores in lib/DBDefs.pm.
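
The version requirements above can be checked before going any further. The following is a minimal sketch and not part of the official guide: the helper names version_ge and check_prereqs are made up for illustration, and it assumes GNU sort (for the -V option) and that the psql client reports the same version as the installed server.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on the version sort (-V) of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: prints one line per prerequisite named in this guide.
check_prereqs() {
    # $^V holds the running Perl version; %vd prints it as e.g. 5.10.1.
    perl_version=$(perl -e 'printf "%vd", $^V' 2>/dev/null)
    if version_ge "${perl_version:-0}" 5.10.1; then
        echo "perl $perl_version: ok"
    else
        echo "perl ${perl_version:-not found}: need at least 5.10.1"
    fi

    # "psql --version" starts with a line like "psql (PostgreSQL) 8.4.22".
    pg_version=$(psql --version 2>/dev/null | awk 'NR == 1 {print $3}')
    if version_ge "${pg_version:-0}" 8.4; then
        echo "postgresql $pg_version: ok"
    else
        echo "postgresql ${pg_version:-not found}: need at least 8.4"
    fi

    # Git and Memcached only need to be present on the PATH.
    for tool in git memcached; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found"
        fi
    done
}
```

After loading these definitions into a shell, running check_prereqs prints a short report; nothing is installed or changed.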

Server configuration

(Note: The recommended way to set this up is to create a musicbrainz user with a home in /home/musicbrainz. You can do this on Debian by running "adduser musicbrainz", which automatically sets up the home directory in /home/musicbrainz. You can then switch to this user by running "su musicbrainz" and then "cd ~" before following the steps below. To run sudo commands as this user, add musicbrainz to the sudoers file: run "visudo" and add the line "musicbrainz ALL=(ALL) ALL" under "# User privilege specification".)
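
The account creation from the note can be made safe to run more than once. This is only a sketch: the helper names user_exists and setup_mb_user are hypothetical, and it assumes the calling account is already allowed to use sudo.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the named account already exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Hypothetical helper: creates the account only when it is missing.
# On Debian, adduser also creates the home directory under /home.
setup_mb_user() {
    user="${1:-musicbrainz}"
    if user_exists "$user"; then
        echo "user $user already exists"
    else
        sudo adduser "$user"
    fi
}
```

Calling setup_mb_user with no argument uses the name musicbrainz. The sudoers entry still has to be added by hand through visudo.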

  1. Download the source code.
    git clone https://github.com/metabrainz/musicbrainz-server.git
    cd musicbrainz-server
  1. Modify the server configuration file.
    cp lib/DBDefs.pm.sample lib/DBDefs.pm
Fill in the appropriate values for MB_SERVER_ROOT and WEB_SERVER.
Determine what type of server this will be using REPLICATION_TYPE:

(NOTE: If you're just going to be using this as a Headphones mirror, you can set it as RT_SLAVE)

a) RT_SLAVE (mirror server)
   A mirror server will always be in sync with the master database at
   http://musicbrainz.org by way of an hourly replication packet. Mirror
   servers do not allow any local editing: after the initial data import, the
   only changes allowed will be to load the next replication packet in turn.
   Mirror servers will have their WikiDocs automatically kept up to date.
   If you are not setting up a mirror server for development purposes, make
   sure to set DB_STAGING_SERVER to 0.
b) RT_STANDALONE
   A stand alone server is recommended if you are setting up a server for
   development purposes. It does not accept replication packets, so bringing
   it up to date with the master database requires manually importing a new
   database dump.  Local editing is available, but keep in
   mind that none of your changes will be pushed up to
   http://musicbrainz.org.
   Stand alone servers will need to manually download and update their
   WikiDoc transclusion table:
        wget -O root/static/wikidocs/index.txt http://musicbrainz.org/static/wikidocs/index.txt
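
After editing lib/DBDefs.pm, it can help to print the settings discussed above and confirm which values are active. This is a minimal sketch, not something the guide prescribes: the helper name show_dbdefs_settings is hypothetical, and it assumes each setting sits on a single line and that disabled lines start with "#".

```shell
#!/bin/sh
# Hypothetical helper: lists the active (uncommented) lines in DBDefs.pm
# that mention one of the settings covered in this section.
show_dbdefs_settings() {
    file="${1:-lib/DBDefs.pm}"
    # Drop commented-out lines first, then keep the settings of interest.
    grep -v '^[[:space:]]*#' "$file" |
        grep -E 'MB_SERVER_ROOT|WEB_SERVER|REPLICATION_TYPE|DB_STAGING_SERVER'
}
```

Run from the musicbrainz-server directory, show_dbdefs_settings reads lib/DBDefs.pm by default; a different path can be passed as the first argument.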

Installing Perl dependencies

What needs to happen here is that every Perl module the server depends on gets installed somewhere your server can find it. There are many ways to make this happen, and the best choice will be very site-dependent. MusicBrainz ships with support for Carton, a Perl package manager, which will allow you to have the exact same dependencies as our production servers. Carton also manages everything for you, and lets you avoid polluting your system installation with these dependencies.

The steps below outline how to set up MusicBrainz Server with Carton.

  1. Prerequisites
Before you get started you will actually need to have Carton installed as
MusicBrainz does not yet ship with an executable. There are also a few
development headers that will be needed when installing dependencies. Run
the following steps as a normal user on your system.
    sudo apt-get install libxml2-dev libpq-dev libexpat1-dev libdb-dev memcached libyaml-perl

(NOTE: In order to install all the Carton dependencies, all of these packages are required: build-essential git-core libssl-dev libxml2-dev memcached libexpat-dev postgresql-8.4 postgresql-server-dev-8.4 postgresql-contrib liblocal-lib-perl libossp-uuid-perl)

    sudo cpan Carton
NOTE: This installs Carton at the system level. If you prefer to install
it in your home directory, use http://search.cpan.org/perldoc?local::lib .
  1. Install dependencies
To install the dependencies for MusicBrainz server, first make sure you are
in the MusicBrainz source code directory and run the following:
    grep '^requires' Makefile.PL > cpanfile
    carton install --deployment
The following three libraries failed to install with Carton when this guide
was written, so if they fail for you too, install them manually:
    sudo cpan Hash::Merge
    sudo cpan JSON::Syck
    sudo cpan Net::CoverArtArchive::CoverArt
Note that if you've previously used this command in the musicbrainz folder it
will not always upgrade all packages to their correct version.  If you're
having trouble running musicbrainz, run "rm -rf local" in the musicbrainz
directory to remove all packages previously installed by carton, and then run
the above step again.
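
The clean reinstall described above can be wrapped in a small function. This is a sketch under assumptions: the function name reinstall_deps and its optional directory argument are hypothetical, and the CARTON variable follows the convention used for the mbcontrol script later on this page.

```shell
#!/bin/sh
# Path to the carton executable; override it if carton is not on the PATH.
CARTON="${CARTON:-carton}"

# Hypothetical helper: removes everything Carton installed earlier and
# reinstalls the dependencies. The cpanfile generated in the step above
# is left untouched.
reinstall_deps() {
    src_dir="${1:-.}"
    (
        # Work in a subshell so the caller's directory does not change.
        cd "$src_dir" || exit 1
        rm -rf local
        "$CARTON" install --deployment
    )
}
```

For example, reinstall_deps ~/musicbrainz-server rebuilds the local directory of that checkout; with no argument it works in the current directory.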

Creating the database

  1. Install PostgreSQL Extensions
Before you start, you need to install the PostgreSQL Extensions on your
database server. First, from the MusicBrainz source code directory run these
commands to pull the extensions from the Git server:
    git submodule init
    git submodule update
Then to build the musicbrainz_unaccent extension run these commands:
    cd postgresql-musicbrainz-unaccent
    make
    sudo make install
    cd ..
To build our collate extension you will need libicu and its development
files. To install these run:
    sudo apt-get install libicu-dev
With libicu installed, you can build and install the collate extension by
running:
    cd postgresql-musicbrainz-collate
    make
    sudo make install
    cd ..
Note: If you are using Ubuntu 11.10, the collate extension currently does
not work with gcc 4.6 and needs to be built with an older version such as
gcc 4.4. To do this, run "sudo apt-get install gcc-4.4" and then run the
following:
    cd postgresql-musicbrainz-collate
    CC=gcc-4.4 make -e
    sudo make install
    cd ..
  1. Set up PostgreSQL authentication
For normal operation, the server only needs to connect from one or two OS
users (whoever your web server / crontabs run as), to one database (the
MusicBrainz Database), as one PostgreSQL user. The PostgreSQL database name
and user name are given in DBDefs.pm (look for the "READWRITE" key).  For
example, if you run your web server and crontabs as "www-user", the
following configuration recipe may prove useful:

(NOTE: On Debian 6.0 with PostgreSQL 8.4, you'll find both postgresql.conf and pg_hba.conf in /etc/postgresql/8.4/main. Instead of only allowing the user musicbrainz to access the database, you can allow all local connections and make sure postgres only listens on localhost, using the two settings below.)

In postgresql.conf:

    listen_addresses = 'localhost'

In pg_hba.conf:

    # "local" is for Unix domain socket connections only
    local all all trust
    host all all 127.0.0.1/32 trust

The recipe for the "www-user" example is:

    # in pg_hba.conf (Note: The order of lines is important!):
    local    musicbrainz_db    musicbrainz    ident    mb_map
    # in pg_ident.conf:
    mb_map    www-user    musicbrainz
Alternatively, if you are running a server for development purposes and
don't require any special access permissions, the following configuration in
pg_hba.conf will suffice (make sure to insert this line before any other
permissions):
    local   all    all    trust
  1. Create the databases
You have two options when it comes to databases. You can either opt for a
clean database with just the schema (useful for developers with limited disk
space), or you can import a full database dump.
a)  Use a clean database
    To use a clean database, all you need to do is run:
        carton exec ./admin/InitDb.pl -- --createdb --clean
b)  Import an NGS database dump
    The easiest way to import the database is to use a database dump.  These
    dumps are provided twice a week and are available here:
        ftp://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/
    To get going, you need at least the mbdump.tar.bz2,
    mbdump-editor.tar.bz2 and mbdump-derived.tar.bz2 archives, but you can
    grab whichever dumps suit your needs. Assuming the dumps have been
    downloaded to /tmp/dumps/ you can import them with:
        carton exec ./admin/InitDb.pl -- --createdb --import /tmp/dumps/mbdump*.tar.bz2 --echo
    --echo just gives us a bit more feedback in case this goes wrong; you
    may leave it off. Remember to change the paths to your mbdump*.tar.bz2
    files, if they are not in /tmp/dumps/.
NOTE: on a fresh postgresql install you may see the following error:
    CreateFunctions.sql:33: ERROR:  language "plpgsql" does not exist
To resolve that, log in to PostgreSQL as the "postgres" user (or any other
PostgreSQL user with SUPERUSER privileges) and load the "plpgsql" language
into the database with the following command:
    postgres=# CREATE LANGUAGE plpgsql;
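
For option b) above, it is worth confirming that the three required archives are actually present before starting the import. A minimal sketch follows; the function name check_dumps is hypothetical, and the default directory is the /tmp/dumps location assumed in the import step.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the three archives that the
# import needs at a minimum are all present in the given directory.
check_dumps() {
    dump_dir="${1:-/tmp/dumps}"
    missing=0
    for name in mbdump.tar.bz2 mbdump-editor.tar.bz2 mbdump-derived.tar.bz2; do
        if [ ! -f "$dump_dir/$name" ]; then
            echo "missing: $dump_dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It can be chained in front of the import so that nothing runs on an incomplete download, for example: check_dumps /tmp/dumps && carton exec ./admin/InitDb.pl -- --createdb --import /tmp/dumps/mbdump*.tar.bz2 --echo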

Starting the server

  1. Start the development server
You should now have everything ready to run the development server!
The development server is a lightweight HTTP server that gives good debug
output and is much more convenient than having to set up a standalone
server. Just run:
    carton exec -- plackup -Ilib -r
Visiting http://your.machines.ip.address:5000 should now present you with
your own running instance of the MusicBrainz Server.
  1. Troubleshooting
If you have any difficulties, please feel free to contact ocharles or warp
in #musicbrainz-devel on irc.freenode.net, or email the developer mailing
list at musicbrainz-devel [at] lists.musicbrainz.org.
If you find any bugs, please report them on http://tickets.musicbrainz.org.
Good luck, and happy hacking!
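
To confirm from the command line that the development server started above is answering, a small polling loop can be used. This is a sketch that assumes curl is installed; the function name wait_for_server and its defaults (localhost, port 5000 as used by the development server, 30 attempts) are choices made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: polls the URL once per second until it answers
# with a successful HTTP status or the number of attempts runs out.
wait_for_server() {
    url="${1:-http://localhost:5000}"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # -s: quiet, -f: treat HTTP errors as failure, -o: discard the body.
        if curl -s -f -o /dev/null --max-time 5 "$url"; then
            echo "server is answering at $url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "no answer from $url" >&2
    return 1
}
```

Running wait_for_server in a second terminal after starting plackup returns as soon as the front page can be fetched.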

Keeping Your Local Musicbrainz Server Up-to-Date

To load the replication changes manually, you'll need to run:

    carton exec -- ./admin/replication/LoadReplicationChanges

Alternatively, you can use the following script which can start/stop the server, and load the replication changes.

http://paste.pocoo.org/show/555245/ (if that doesn't work try: http://paste2.org/p/2042937)

Save it as "/usr/bin/mbcontrol", and run:

    chmod a+x /usr/bin/mbcontrol

which will make it executable.

Run it as the musicbrainz user, making sure you have write access to /var/run/musicbrainz and /var/log/musicbrainz.

Usage:

mbcontrol start/stop (start and stop the server)

mbcontrol hourly (load the replication changes)

To have the script run automatically, you can stick a line into the musicbrainz user's crontab:

    crontab -e

and add this line, which will run "mbcontrol hourly" at 10 minutes past the hour every hour:

    10 * * * * /usr/bin/mbcontrol hourly

If you have trouble with the script not running, it could be because carton is not in the crontab path. If that's the case, you'll need to specify the path to carton in the mbcontrol script:

    CARTON=/usr/local/bin/carton
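
The pasted mbcontrol script is not reproduced on this page. As a rough idea of what its hourly step could look like, here is a minimal sketch (not the original script): it runs the replication command shown earlier and uses a lock directory so that a slow run is not overlapped by the next cron invocation. The function name load_replication, the MB_DIR and LOCK_DIR variables and their default paths are assumptions for illustration; MB_DIR assumes the checkout lives in the musicbrainz user's home directory.

```shell
#!/bin/sh
# Full path to carton, since cron usually runs with a minimal PATH.
CARTON="${CARTON:-/usr/local/bin/carton}"
# Assumed location of the musicbrainz-server checkout.
MB_DIR="${MB_DIR:-/home/musicbrainz/musicbrainz-server}"
# Assumed lock location; the musicbrainz user needs write access here.
LOCK_DIR="${LOCK_DIR:-/var/run/musicbrainz/replication.lock}"

# Hypothetical helper: loads pending replication changes, one run at a time.
load_replication() {
    # mkdir is atomic, so it fails if another run already holds the lock.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "replication already running (lock: $LOCK_DIR)" >&2
        return 1
    fi
    status=0
    (
        cd "$MB_DIR" || exit 1
        "$CARTON" exec -- ./admin/replication/LoadReplicationChanges
    ) || status=$?
    # Release the lock whether or not the load succeeded.
    rmdir "$LOCK_DIR"
    return "$status"
}
```

A script built around this function would call load_replication for its hourly action, and the crontab line above would stay the same.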
