05
March 2018
Multilanguage Database Design in MySQL
Building a multilanguage website is not a trivial task and you will encounter many problems on this way, and one of them is how you're planning to store the content of the site in the database for each language.
You may perform a small research on the Web and find enough resources about it, but there is no a magic solution, you have to understand this - each solution depends on your personal requirements, size of the database, complexity of your site, etc. So we'll discuss only major techniques. If you want to learn more, you may find additional information with a Google search.
Ok, so... there are more or less 4 popular databases schemas for multilanguage website.
- Column Approach
- Multirow Approach
- Single Translation Table Approach
- Additional Translation Table Approach
1. Column Approach
This solution is the simplest one and basically it creates an additional column for each text (each language) that needs to be translated (there is may be a number of such columns in your table, like: title, name, description etc.). Below the example for such table in MySQL:
Sample 1-1. Create column approach table.
CREATE TABLE app_product (
`id` int(10) NOT NULL AUTO_INCREMENT,
`date_created` datetime NOT NULL,
`price` decimal(10,2) unsigned NOT NULL DEFAULT '0.00',
`title_en` varchar(255) NOT NULL,
`title_es` varchar(255) NOT NULL,
`title_fr` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
Now, the way you would query it is also simple enough. You may do it by automatically selecting the right columns according to the chosen language:
Sample 1-2. Usage of column approach table.
<?php
// Retrieve titles for all languages
$sql = "SELECT * FROM `app_product` WHERE 1";
if($result = mysqli_query($link, $sql)){
if($row = mysqli_fetch_assoc($result)){
echo "English: ".$row["title_en"]."<br>";
echo "Spanish: ".$row["title_es"]."<br>";
echo "French: ".$row["title_fr"]."<br>";
}
}
// Retrieve appropriate title according to the chosen language in the system
$sql = "SELECT `title_".$_SESSION['current_language']."` as `title`
FROM `app_product`";
if($result = mysqli_query($link, $sql)){
if($row = mysqli_fetch_assoc($result)){
echo "Current Language: ".$row["title"];
}
}
?>
Advantages:
- Simplicity - easy to implement
- Easy querying - no JOINs required
- No duplicates - doesn't have duplicate content (there is only one row for each record and only the language columns are duplicated)
Disadvantages:
- Hard to maintain - works in easy way for 2-3 languages, but it becomes a really hard when you have a lot of columns or a lot of languages
- Hard to add a new language - adding new language requires schema changes (and special access rights for db user) for each table with multilanguage content
- Store empty space - if not all translations are required (e.g. at some places default language should always be used) it may cause redundant data or empty db fields
- Need to build the watch - what column you are working with depending on the language
2. Multirow Approach
This solution is similar to the one above, but instead of duplicating the content in columns it does it in rows. Below the example for such table in MySQL:
Sample 2-1. Create multirow approach table.
CREATE TABLE app_product (
`id` int(10) NOT NULL AUTO_INCREMENT,
`date_created` datetime NOT NULL,
`price` decimal(10,2) unsigned NOT NULL DEFAULT '0.00',
`language_id` varchar(2) NOT NULL,
`title` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
Lets check now, how we could query it. Generally the idea is to do it by automatically selecting the right rows according to the chosen language:
Sample 2-2. Usage of multirow approach table.
<?php
// Retrieve titles for all languages
$sql = "SELECT * FROM `app_product` WHERE `id` = 1";
if($result = mysqli_query($link, $sql)){
while($row = mysqli_fetch_assoc($result)){
echo "Language (".$row["language_id"]."): ".$row["title"]."<br>";
}
}
// Retrieve appropriate title according to the chosen language in the system
$sql = "SELECT `title`
FROM `app_product`
WHERE `language_id` = '".$_SESSION['current_language']."'";
if($result = mysqli_query($link, $sql)){
if($row = mysqli_fetch_assoc($result)){
echo "Current Language: ".$row["title"];
}
}
?>
Advantages:
- Simplicity - easy to implement
- Easy querying - no JOINs required
Disadvantages:
- Hard to maintain - every column that is not translated must be changed in all rows for each language. e.g changing the price for single product requires repeating of this operation for all languages
- Hard to add a new language - requires repeating insertion operation for each language (cloning the record for default language)
- Duplicate content - you will have a lot of duplicate content for all the columns that are not translated
3. Single Translation Table Approach
This solution seems to be the cleanest one from database structure perspective. You store all texts that need to be translated in a single translation table. It is more suited for dynamic websites and which have a large number of languages or which intend to add a new language in the future and want to do it with ease. Below the example for such database schema in MySQL:
Sample 3-1. Create single translation table approach.
CREATE TABLE IF NOT EXISTS `app_language` (
`code` char(2) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `app_product` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`date_created` datetime NOT NULL,
`price` decimal(10,2) NOT NULL DEFAULT '0.00',
`title` int(11) NOT NULL DEFAULT '0',
`description` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `title` (`title`),
KEY `description` (`description`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `app_translation` (
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;
CREATE TABLE IF NOT EXISTS `app_translation_entry` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`translation_id` int(11) NOT NULL,
`language_code` char(2) NOT NULL,
`field_text` text NOT NULL,
PRIMARY KEY (`id`),
KEY `translation_id` (`translation_id`),
KEY `language_code` (`language_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;
Now lets check how we could query it.
<?php
// Retrieve titles for all languages
$sql = "SELECT p.*, l.name as language_name, te.field_text as title
FROM `app_product` p
INNER JOIN `app_translation_entry` te ON p.title = te.translation_id
INNER JOIN `app_language` l ON te.language_code = l.code
WHERE p.id = 1";
if($result = mysqli_query($link, $sql)){
while($row = mysqli_fetch_assoc($result)){
echo "Language (".$row["language_name"]."): ".$row["title"]."<br>";
}
}
// Retrieve appropriate title according to the chosen language in the system
$sql = "SELECT p.*, l.name as language_name, te.field_text as title
FROM `app_product` p
INNER JOIN `app_translation_entry` te ON p.title = te.translation_id
INNER JOIN `app_language` l ON te.language_code = l.code
WHERE p.id = 1 AND
te.language_code = '".$_SESSION['current_language']."'";
if($result = mysqli_query($link, $sql)){
if($row = mysqli_fetch_assoc($result)){
echo "Current Language: ".$row["title"];
}
}
?>
Advantages:
- Proper normalization - seems like clean, relational approach
- Ease in adding a new language - doesn't require schema changes
- All translations in one place - readable/maintainable database
Disadvantages:
- Complex querying - multiple joins required to retrieve correct product description
- Hard to maintain - overcomplicated querying on all operations: insertion, removing and updating
- All translations in one place - one missing table leads to global problems
4. Additional Translation Table Approach
This is a variation of the above approach and it seems to be easier to maintain and work with. Let's check why: for each table that stores information that may need to be translated an additional table is created. The original table stores only language insensitive data and the new one all translated info. Below the example for such database schema in MySQL:
Sample 4-1. Create additional translation table approach.
CREATE TABLE IF NOT EXISTS `app_product` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`date_created` datetime NOT NULL,
`price` decimal(10,2) NOT NULL DEFAULT '0.00',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `app_product_translation` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`product_id` int(11) NOT NULL DEFAULT '0',
`language_code` char(2) NOT NULL,
`title` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`description` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `translation_id` (`product_id`),
KEY `language_code` (`language_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `app_language` (
`code` char(2) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Here the example of how we could query it.
Sample 4-2. Usage of additional translation table approach.
<?php
// Retrieve titles for all languages
$sql = "SELECT p.*, pt.title, pt.description, l.name as language_name
FROM `app_product` p
INNER JOIN `app_product_translation` pt ON p.id = pt.product_id
INNER JOIN `app_language` l ON pt.language_code = l.code
WHERE p.id = 1";
if($result = mysqli_query($link, $sql)){
while($row = mysqli_fetch_assoc($result)){
echo "Language (".$row["language_name"]."): ".$row["title"]."<br>";
}
}
// Retrieve appropriate title according to the chosen language in the system
$sql = "SELECT p.*, pt.title, pt.description
FROM `app_product` p
INNER JOIN `app_product_translation` pt ON p.id = pt.product_id
WHERE p.id = 1 AND pt.language_code = '".$_SESSION['current_language']."'";
if($result = mysqli_query($link, $sql)){
if($row = mysqli_fetch_assoc($result)){
echo "Current Language: ".$row["title"];
}
}
?>
Advantages:
- Proper normalization - seems like clean, relational approach
- Ease in adding a new language - doesn't require schema changes
- Columns keep there names - doesn't require "_lang" suffixes or something else
- Easy to query - relatively simple querying (only one JOIN is required)
Disadvantages:
- May double the amount of tables - You have to create translation tables for all your tables that have columns that need to be translated
Conclusion
These 4 examples that are presented above give us an idea of how different approaches may be used here. These are of course not all of possible options, just the most popular ones. You may always modify them e.g. by introducing some additional views that would save you writing complex joins direct from your code.
Rememeber, that a solution you choose mostly depends on your project requirements. If you need s simplicity and are sure that the number of supported languages is small and fixed you could go with option 1. If you require a bit more flexibility and can afford a simple join when querying for multilingual data options 3 or 4 would be a possible solution.
In ApPHP we commonly use the 4th solution, that is Additional Translation Table Approach.
Source: https://www.apphp.com/tutorials/index.php?page=multilanguage-database-design-in-mysql