Magnolia Workspace Clustering Tutorial

The scenario for clustering a Magnolia workspace across several instances is shown in the picture below. We will use one author and two public instances that all cluster/share the forum workspace. With that setup, comments made on one of the public instances don’t need to be synchronized to the other public instances. Because we include the author instance in the cluster, the comments are also available to content editors for moderation.

Pre-requisites

For this guide we will use a pre-packaged Magnolia bundle that you can download from the Magnolia website (http://download.magnolia-cms.com). You need a license for the enterprise versions. The community versions of Magnolia can be downloaded from SourceForge.

Clustering data also means we need a data store that can handle concurrent connections, so we cannot use the file-based Derby database for this task. This tutorial uses MySQL because it’s the most widespread relational database used with Magnolia CMS. You can also use a different RDBMS supported by Jackrabbit. In that case, besides the configuration changes, you also need to provide the correct JDBC driver.

You need a MySQL user account with sufficient access rights and a way to create and drop databases in MySQL (I will use the command line).

We will assume that the content of the repositories is split between the RDBMS and the file system (this is the default setting). It’s also possible to store everything in the database; in that case you wouldn’t need shared access to a file system (see below).

Prepare the environment

Create MySQL databases

In this tutorial I am using the command line to create the databases needed for the different Magnolia instances and for the forum/comments. Of course you can change the database names (and user accounts) as you like; you just have to use the same names in the later steps of this document.

mysql -u root -p

create database mgnl_cluster_public1;
create database mgnl_cluster_public2;
create database mgnl_cluster_author;
create database mgnl_cluster_forum;

(To remove a database, use drop database followed by its name.)
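If you prefer to script this step, here is a minimal sketch that prints the four statements so they can be piped into the mysql client. The function name print_create_statements is made up for this illustration, and IF NOT EXISTS is my addition so that a second run doesn’t fail.

```shell
# Print one CREATE DATABASE statement per database name given as an argument.
# IF NOT EXISTS is an addition to the tutorial's statements; it makes re-runs harmless.
print_create_statements() {
  for db in "$@"; do
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$db"
  done
}

# Example (prompts for the MySQL root password):
#   print_create_statements mgnl_cluster_public1 mgnl_cluster_public2 \
#     mgnl_cluster_author mgnl_cluster_forum | mysql -u root -p
```

Calling the function without the pipe lets you review the statements before anything is sent to the database.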

Provide a folder on a shared filesystem

Create a folder to be used by all clustered instances for storing content. This folder needs to be accessible from all the instances involved in the cluster.

  • /path/to/directory/shared/forum
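A quick way to create the folder and check that it is really writable could look like the sketch below. The function name prepare_shared_dir is hypothetical, and the check only tells you something if you run it as the same user that runs Tomcat, on every machine that takes part in the cluster.

```shell
# Create the shared folder if it is missing and confirm the current user can write to it.
prepare_shared_dir() {
  dir=$1
  mkdir -p "$dir" || return 1
  probe="$dir/.write_test_$$"          # temporary probe file, removed again below
  if touch "$probe" 2>/dev/null; then
    rm -f "$probe"
    echo "OK: $dir is writable"
  else
    echo "ERROR: $dir is not writable" >&2
    return 1
  fi
}

# Example:
#   prepare_shared_dir /path/to/directory/shared/forum
```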

Prepare the Magnolia bundle

Configure the author instance

First we will adjust the memory available to Tomcat so that the installation process doesn’t fail when deploying multiple Magnolia instances.

  • In the bundle, edit tomcatDir/bin/setenv.sh and adjust the memory settings, e.g.:

    export CATALINA_OPTS="$CATALINA_OPTS -XX:MaxPermSize=512m -Xms128M -Xmx3072M -Djava.awt.headless=true"

    Consider the physical memory available on your machine before adjusting the values!

  • copy the needed database driver JAR-file to tomcatDir/webapps/magnoliaAuthor/WEB-INF/lib (download it from the MySQL website)

  • delete the directory tomcatDir/webapps/magnoliaPublic
  • in the magnoliaAuthor directory, open WEB-INF/config/default/repositories.xml

At the end of the file just before the closing JCR tag, add the following configuration:

<Repository name="magnoliacluster-forum" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
  <param name="configFile" value="${magnolia.repositories.jackrabbit.cluster-forum.config}" />
  <param name="repositoryHome" value="${magnolia.repositories.home}/magnoliacluster-forum" />
  <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
  <param name="providerURL" value="localhost" />
  <param name="bindName" value="cluster-forum-${magnolia.webapp}" />
  <workspace name="forum" />
</Repository>

Now add a new repository mapping for magnoliacluster-forum in the RepositoryMapping section:

<RepositoryMapping>
  … existing mappings …
  <Map name="forum" repositoryName="magnoliacluster-forum" workspaceName="forum" />
</RepositoryMapping>
  • save the file

  • in the magnoliaAuthor directory, open WEB-INF/config/default/magnolia.properties

change the line

magnolia.repositories.jackrabbit.config=WEB-INF/config/repo-conf/jackrabbit-bundle-derby-search.xml

to

magnolia.repositories.jackrabbit.config=WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml

Hint: In a real development project, you would configure these settings per instance and not globally (and use other file names for the database configuration), but in the context of this tutorial it’s faster to modify the default. Read more about this in the Magnolia documentation.

Directly after the line above, add

magnolia.repositories.jackrabbit.cluster-forum.config=WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search-forum.xml
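The two property changes above can also be scripted. The following is only a sketch under the assumption that magnolia.properties still contains the Derby line exactly as shown; the function name configure_properties is my own. It rewrites the file via a temporary copy and skips the extra line if an earlier run already added it.

```shell
# Switch the Jackrabbit config from Derby to MySQL and add the cluster-forum
# property directly after that line. Safe to run more than once.
configure_properties() {
  props=$1
  forum_line='magnolia.repositories.jackrabbit.cluster-forum.config=WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search-forum.xml'
  # Do not add the line again if it is already present.
  if grep -q '^magnolia\.repositories\.jackrabbit\.cluster-forum\.config=' "$props"; then
    forum_line=''
  fi
  awk -v extra="$forum_line" '
    /^magnolia\.repositories\.jackrabbit\.config=/ {
      sub(/jackrabbit-bundle-derby-search\.xml/, "jackrabbit-bundle-mysql-search.xml")
      print
      if (extra != "") print extra
      next
    }
    { print }
  ' "$props" > "$props.tmp" && mv "$props.tmp" "$props"
}

# Example (run inside the magnoliaAuthor directory):
#   configure_properties WEB-INF/config/default/magnolia.properties
```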

  • save the file
  • duplicate the file WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml
  • rename it to jackrabbit-bundle-mysql-search-forum.xml
  • open the file to add the cluster configuration

Directly at the top of the file after the Repository tag add

<Cluster id="cid_author" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="driver" value="com.mysql.jdbc.Driver" />
    <param name="url" value="jdbc:mysql://localhost:3306/mgnl_cluster_forum" />
    <param name="user" value="root" />
    <param name="password" value="PASSWORD" />
    <param name="schema" value="mysql" />
    <param name="schemaObjectPrefix" value="journal_" />
  </Journal>
</Cluster>

Of course you have to set the correct values for the database and the user credentials.

Adjust the database configuration for the DataSource with the name magnolia

<DataSources>
  <DataSource name="magnolia">
    <param name="driver" value="com.mysql.jdbc.Driver" />
    <param name="url" value="jdbc:mysql://localhost:3306/mgnl_cluster_forum" />
    <param name="user" value="root" />
    <param name="password" value="PASSWORD" />
    <param name="databaseType" value="mysql"/>
    <param name="validationQuery" value="select 1"/>
  </DataSource>
</DataSources>

Adjust the path for FileSystem to point to your shared directory available to all clustered instances, e.g.

 <param name="path" value="/my/great/directory/_cluster_shared/shared_forum/repository" />

and also the path for the DataStore

<param name="path" value="/my/great/directory/_cluster_shared/shared_forum/repository/datastore"/>
  • save the file

  • duplicate the directory WEB-INF/config/magnoliaPublic and rename it to magnoliaPublicTwo

Add the public instances

  • duplicate the directory tomcatDir/webapps/magnoliaAuthor and rename it to magnoliaPublic
  • duplicate the directory tomcatDir/webapps/magnoliaAuthor and rename it to magnoliaPublicTwo
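On the command line, the two copies can be made with a small helper like the one sketched here. The name clone_webapp is hypothetical; the helper refuses to overwrite an existing directory so that a configured instance isn’t replaced by accident.

```shell
# Copy one webapp directory to a new name inside the same webapps folder.
clone_webapp() {
  webapps=$1
  source_name=$2
  target_name=$3
  if [ -e "$webapps/$target_name" ]; then
    echo "ERROR: $webapps/$target_name already exists" >&2
    return 1
  fi
  cp -R "$webapps/$source_name" "$webapps/$target_name"
}

# Example (tomcatDir stands for your Tomcat directory, as elsewhere in this guide):
#   clone_webapp tomcatDir/webapps magnoliaAuthor magnoliaPublic
#   clone_webapp tomcatDir/webapps magnoliaAuthor magnoliaPublicTwo
```

Because the copies include everything under WEB-INF, the JDBC driver and the edited configuration files come along with them.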

Adjust the database settings

Now we need to configure every instance to use the right database we created when we set up our environment.

  • open magnoliaAuthor/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml

Edit the DataSource named magnolia to match your database parameters, e.g.

<DataSource name="magnolia">
  <param name="driver" value="com.mysql.jdbc.Driver" />
  <param name="url" value="jdbc:mysql://localhost:3306/mgnl_cluster_author" />
  <param name="user" value="root" />
  <param name="password" value="PASSWORD" />
  <param name="databaseType" value="mysql"/>
  <param name="validationQuery" value="select 1"/>
</DataSource>
  • do the same for your public instances:
    • edit magnoliaPublic/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml
    • edit magnoliaPublicTwo/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml

Of course you have to adjust the values to use the correct databases.
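If all instances run against the same MySQL server, only the database name in the URL differs between them. The sketch below swaps that name with sed. It assumes the URL in the file already has the form jdbc:mysql://localhost:3306/name, and the function name set_database_name is invented for this example. Use it only on jackrabbit-bundle-mysql-search.xml, because the forum configuration must keep pointing at the shared forum database on every instance.

```shell
# Replace the database name in a jdbc:mysql://localhost:3306/<name> URL.
set_database_name() {
  file=$1
  db=$2
  sed "s|\(jdbc:mysql://localhost:3306/\)[A-Za-z0-9_]*|\1$db|" "$file" > "$file.tmp" \
    && mv "$file.tmp" "$file"
}

# Example (run inside tomcatDir/webapps):
#   set_database_name magnoliaAuthor/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml mgnl_cluster_author
#   set_database_name magnoliaPublic/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml mgnl_cluster_public1
#   set_database_name magnoliaPublicTwo/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search.xml mgnl_cluster_public2
```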

Adjust the cluster IDs for the public instances

Don’t forget to set an individual cluster ID for every single instance!

We already created the cluster configuration for the author instance in one of the steps above. Now we need to adjust the IDs for the public Magnolia instances.

  • edit magnoliaPublic/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search-forum.xml

Change the Cluster id to a different name than "cid_author", e.g.

<Cluster id="cid_public1" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log" />
    <param name="driver" value="com.mysql.jdbc.Driver" />
    <param name="url" value="jdbc:mysql://localhost:3306/mgnl_cluster_forum" />
    <param name="user" value="root" />
    <param name="password" value="PASSWORD" />
    <param name="schema" value="mysql" />
    <param name="schemaObjectPrefix" value="journal_" />
  </Journal>
</Cluster>
  • edit magnoliaPublicTwo/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search-forum.xml

Change the cluster id to a unique value, e.g.

<Cluster id="cid_public2" syncDelay="2000">
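Since a duplicated cluster ID is easy to overlook after copying the instances, a small check can help. This is a sketch only: set_cluster_id and duplicate_cluster_ids are hypothetical helper names, and both assume the opening Cluster tag is written on one line in the form shown in this guide, with an ID made of letters, digits and underscores.

```shell
# Set the id attribute of the <Cluster> tag in one config file.
set_cluster_id() {
  file=$1
  id=$2
  sed "s|<Cluster id=\"[^\"]*\"|<Cluster id=\"$id\"|" "$file" > "$file.tmp" \
    && mv "$file.tmp" "$file"
}

# Print every cluster id that occurs in more than one of the given files.
# No output means all ids are unique.
duplicate_cluster_ids() {
  for f in "$@"; do
    sed -n 's|.*<Cluster id="\([^"]*\)".*|\1|p' "$f"
  done | sort | uniq -d
}

# Example (run inside tomcatDir/webapps):
#   set_cluster_id magnoliaPublicTwo/WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search-forum.xml cid_public2
#   duplicate_cluster_ids */WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-search-forum.xml
```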

Test the configuration

For a fresh start, make sure that the databases used are empty and that all repository data on the file system has been removed.
Go to the tomcatDir/bin folder and start the server with ./catalina.sh run if you want to see the log output on the screen, or with ./magnolia_control.sh start if you don’t need to watch what is happening right away.

All three Magnolia instances should start and be ready to have their modules installed.

Log in to the admin interfaces of all three instances (host/context/.magnolia/admincentral) and open the "Forums" app from the "Edit" menu in Magnolia AdminCentral.
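Before opening three browser tabs, you can check from the shell whether each instance answers at all. The sketch below builds the AdminCentral URL from host and context and prints the HTTP status code that curl reports. The helper names are made up, and http://localhost:8080 in the example is an assumption about your Tomcat setup. Which status code you see depends on your installation; 000 generally means curl could not get a response.

```shell
# Build the AdminCentral URL for one instance from base URL and context name.
admincentral_url() {
  printf '%s/%s/.magnolia/admincentral\n' "$1" "$2"
}

# Print "<http status> <url>" for every context passed after the base URL.
check_instances() {
  base=$1
  shift
  for ctx in "$@"; do
    url=$(admincentral_url "$base" "$ctx")
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "$code $url"
  done
}

# Example (host and port are assumptions, adjust to your server):
#   check_instances http://localhost:8080 magnoliaAuthor magnoliaPublic magnoliaPublicTwo
```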

If you now create threads and messages on one of the three instances, they should automatically appear on all instances (you might have to refresh the interface).

You can now also moderate new comments on the author instance.

More resources

Configuration files

You can also find the configuration files on GitHub.

Lars Fischer
