Designing Databases for i18n and l10n in Global Services

Designing a Database for Localized Content
To support global services, designing a database that efficiently stores and manages multilingual data is essential. This goes beyond storing translated strings—it requires careful consideration of region-specific content, performance, and scalability. Various approaches exist, such as using separate tables, expanding columns, storing JSON structures, or leveraging key-value stores. Understanding the strengths, limitations, and appropriate use cases of each method is critical.
This post analyzes database design strategies with internationalization (i18n) and localization (l10n) in mind and explores how to apply them in real-world scenarios.
1. Various localization strategies
When designing a system that requires multilingual support, the database layer can adopt several strategies. Each approach differs in performance, scalability, and maintainability, making it important to choose the right one based on the use case. This section outlines three common strategies.
1-1. Column expansion strategy
The column expansion strategy is one of the simplest ways to store multilingual data. This approach adds separate columns for each language within the same table. For example, a product_name
field might be expanded to product_name_ko
, product_name_en
, and product_name_ja
to support Korean, English, and Japanese respectively.

This strategy is intuitive and easy to implement, especially in the early stages of a project. Queries remain straightforward since language-specific data resides in a single table. However, as the number of supported languages increases, the schema becomes bloated, potentially degrading performance. Moreover, adding new languages requires altering the table structure, limiting scalability.
1-1-1. Strengths and weaknesses of the column expansion strategy
The column expansion strategy is ideal when only a few languages need to be supported. It allows for fast retrieval of language-specific data without joins and keeps SQL queries simple. However, it lacks long-term flexibility. Every new language requires schema changes, and as column count grows, index efficiency can drop and table scans may become more expensive depending on the database engine.
- Fast data retrieval: All language variants are stored in the same row, enabling fast access without joins.
- Simple SQL: Queries are easy to write and maintain.
- Easy to adopt: Can be quickly integrated into existing systems with minimal changes.
However, the strategy has notable downsides:
- Limited scalability: Adding a new language requires modifying the schema, which makes long-term maintenance more difficult. As more languages are added, the table grows in width, increasing storage use and potentially impacting query performance.
- Column bloat: The more languages supported, the more columns are added, making the structure harder to manage.
- NULL inflation: If translations for some languages are missing, their respective columns will often be left as
NULL
, leading to sparsity.
1-1-2. Table schema for the column expansion strategy
In this strategy, separate columns are defined for each translatable field in every supported language. The following schema demonstrates how this might be implemented:
create table products
(
id bigint auto_increment primary key,
product_name_ko varchar(255), -- Product name in Korean
product_name_en varchar(255), -- Product name in English
product_name_ja varchar(255), -- Product name in Japanese
description_ko varchar(1000), -- Product description in Korean
description_en varchar(1000), -- Product description in English
description_ja varchar(1000), -- Product description in Japanese
price decimal(10, 2)
);
When inserting a product, values for all languages should be provided:
insert into products (product_name_ko, product_name_en, product_name_ja, description_ko, description_en, description_ja, price)
values ('스마트폰', 'Smartphone', 'スマートフォン', '최신 기술이 적용된 고급 스마트폰입니다.', 'A high-end smartphone with advanced features.', '最新技術が搭載された高級スマートフォンです。', 999.99);
To retrieve content in a specific language, use a query like this:
select
id as `Product ID`,
product_name_en as `Product Name`,
products.description_en as `Product Description`
from
products
where
id = 1;
The result would be:
Product ID | Product Name | Product Description |
---|---|---|
1 | Smartphone | A high-end smartphone with advanced features. |
As shown, the column expansion strategy enables fast, join-free queries for each language. However, its rigidity in accommodating new languages can become a significant limitation over time.
1-2. Column-based translation table strategy
The column-based translation table strategy stores multilingual data in a separate table, rather than using individual columns within the main table. Fields that require translation—such as product names or descriptions—are not stored directly in the primary entity table. Instead, the application retrieves localized content by referencing a translation table.

1-2-1. Strengths and weaknesses of the column-based translation table strategy
This approach allows multilingual support to be introduced with minimal changes to the existing data model. It reduces redundancy by centralizing translation data and supports easy expansion: new languages can be added by inserting data rather than altering the schema. However, it typically requires joins between the main and translation tables, which may affect performance and increase query complexity.
- Minimal changes to existing schema: New languages can be supported without modifying table structures—only new data needs to be inserted.
- Reusability of translation entries: When the same translated value is used in multiple places, a single reference can be reused.
- Consistency of translated data: Centralized management of translations ensures consistent messaging across the application.
There are also some trade-offs to consider:
- Increased SQL complexity: Retrieving multilingual data requires joining with the translation table. This can lead to more complex queries, especially when multiple fields (e.g., product name and description) each require a join.
- Potential for NULLs: If a translation for a particular language is missing, the corresponding field in the translation table will return
NULL
.
1-2-2. Table schema for the column-based translation table strategy
In this model, translations are stored in a separate table, while the main entity table keeps reference IDs to the appropriate translations. This allows multilingual content to be managed efficiently and flexibly. Below is an example schema:
create table translations
(
id bigint auto_increment primary key,
text_ko varchar(1000), -- Translated text in Korean
text_en varchar(1000), -- Translated text in English
text_ja varchar(1000) -- Translated text in Japanese
);
create table products
(
id bigint auto_increment primary key,
category_id bigint not null,
price decimal(10, 2) not null,
image_url varchar(255) not null,
name_translation_id bigint not null,
description_translation_id bigint not null,
foreign key (name_translation_id) references translations (id),
foreign key (description_translation_id) references translations (id)
);
Here, the products
table does not contain any multilingual fields directly. Instead, it refers to entries in the translations
table.
-- Insert translations (each row contains all language variants)
insert into
translations (text_ko, text_en, text_ja)
values
('스마트폰', 'Smartphone', 'スマートフォン'),
('최신 기술이 적용된 고급 스마트폰입니다.', 'A high-end smartphone with advanced features.', '最新技術が搭載された高級スマートフォンです。');
-- Insert product with references to the translation rows
insert into
products (category_id, price, image_url, name_translation_id, description_translation_id)
values
(1, 999.99, 'image.jpg', 1, 2);
To retrieve Japanese content for product name and description, you can use the following query:
select
p.id as `Product ID`,
productName.text_ja as `Product Name`,
productDesc.text_ja as `Product Description`
from
products p
left join translations productName on p.name_translation_id = productName.id
left join translations productDesc on p.description_translation_id = productDesc.id
where
p.id = 1;
The result would be:
Product ID | Product Name | Product Description |
---|---|---|
1 | スマートフォン | 最新技術が搭載された高級スマートフォンです。 |
This approach enables multilingual data to be managed independently of the main schema, allowing new languages to be added without any structural changes.
1-3. Row-based translation table strategy
The row-based translation table strategy stores multilingual data in separate rows rather than columns. In this approach, translatable fields are not kept in the main table but are retrieved from a translation table based on a locale value.

1-3-1. Strengths and weaknesses of the row-based translation table strategy
This strategy offers excellent scalability, as new languages can be added simply by inserting additional rows, with no changes to the schema. It also supports optimized queries by filtering on locale, minimizing the need for complex joins.
- Highly scalable: New languages can be supported by inserting data, without altering the table structure.
- Query optimization by locale: Only the required language data is retrieved by filtering on locale.
- Minimal data duplication: Translations can be reused across entities, improving consistency and reducing redundancy.
However, there are some limitations:
- Joins required for retrieval: Retrieving multilingual data typically involves joining the main table and the translation table.
- Performance impact on full scans: While locale-specific queries are efficient, retrieving all translations can lead to a large number of rows, which may affect performance if not properly optimized.
1-3-2. Table schema for the row-based translation table strategy
In this approach, translatable fields are moved to a separate translation table, and the main table holds only the non-localized data. The translation table stores one row per locale per entity, allowing multiple translations to be maintained efficiently. The example schema below illustrates this setup:
create table products
(
id bigint auto_increment primary key,
category_id bigint not null,
price decimal(10, 2) not null,
image_url varchar(255) not null
);
create table product_translations
(
id bigint auto_increment primary key,
product_id bigint not null,
locale varchar(10) not null, -- Language code (e.g., ko, en, ja)
name varchar(255) not null, -- Translated product name
description text not null, -- Translated product description
unique key unique_translation (product_id, locale),
foreign key (product_id) references products (id) on delete cascade
);
Translatable fields are now stored in the product_translations
table, separated by locale. This structure supports clean multilingual management and allows new languages to be added without modifying the schema.
-- Insert base product
insert into
products (category_id, price, image_url)
values
(1, 999.99, 'image.jpg');
-- Insert translations (one row per language)
insert into
product_translations (product_id, locale, name, description)
values
(1, 'ko', '스마트폰', '최신 기술이 적용된 고급 스마트폰입니다.'),
(1, 'en', 'Smartphone', 'A high-end smartphone with advanced features.'),
(1, 'ja', 'スマートフォン', '最新技術が搭載された高級スマートフォンです。');
To retrieve English content for a given product, use the following query:
select
product_id as `Product ID`,
name as `Product Name`,
description as `Product Description`
from
product_translations
where
product_id = 1
and locale = 'en';
The result will be:
Product ID | Product Name | Product Description |
---|---|---|
1 | Smartphone | A high-end smartphone with advanced features. |
This strategy provides an efficient and scalable way to manage multilingual data, with no need to alter the schema when adding new languages.
2. Implementation of a localized database
Previously, we explored three strategies for managing translated content tailored to specific regions and cultures. Among them, the row-based translation table strategy offers a balanced choice in terms of scalability and maintainability. It allows new languages to be added without altering the schema and provides reasonable performance. In this section, we’ll implement this strategy in code and examine how it works in practice.
2-1. Database schema definition
First, define the tables needed to store multilingual data. The products
table holds base product information, while the product_translations
table stores the localized text for each product. The two tables are linked via product_id
, and translations are filtered by locale
.
create table if not exists products
(
id bigint auto_increment primary key,
category_id bigint not null,
price decimal(10, 2) not null,
image_url varchar(255) not null
);
create table if not exists product_translations
(
id bigint auto_increment primary key,
product_id bigint not null,
locale varchar(10) not null,
name varchar(255) not null,
description text not null,
unique key unique_translation (product_id, locale),
foreign key (product_id) references products (id) on delete cascade
);
2-2. Insert sample data
Next, insert sample data to verify how multilingual content is stored. Add a product to the products
table, then insert translated content for that product into the product_translations
table.
insert into
products (category_id, price, image_url)
values
(10, 999.99, 'iphone.jpg');
insert into
product_translations (product_id, locale, name, description)
values
(1, 'en', 'iPhone 15', 'The latest iPhone model'),
(1, 'ko', '아이폰 15', '최신 아이폰 모델'),
(1, 'ja', 'アイフォン15', '最新のアイフォンモデル');
Each translation is stored as a separate row, making it easy to expand the system by simply inserting new data for additional languages.
2-3. API implementation
Now let’s implement an API that returns localized data. Using Spring Boot, create a controller that returns product information with translations based on the requested locale.
@GetMapping("/products/{id}")
public ResponseEntity<?> products(
@PathVariable Long id
) {
ProductResult result = service.retrieve(id);
return ResponseEntity
.ok(result);
}
To provide localized content, the application needs access to the request's locale. Spring’s i18n support automatically extracts the locale from the Accept-Language
header and stores it in a thread-local context. This enables locale-aware responses throughout the application without requiring custom logic.
public ProductResult retrieve(Long id) {
Locale locale = LocaleContextHolder.getLocale();
return productRepository.findProductWithTranslationById(id, locale);
}
The query to retrieve data for a specific locale can then be written as follows:
@Override
public ProductResult findProductWithTranslationById(Long id, Locale locale) {
return queryFactory
.select(Projections.constructor(
ProductResult.class,
product.id,
product.price,
product.imageUrl,
productTranslation.name,
productTranslation.description
))
.from(product)
.innerJoin(productTranslation).on(productTranslation.product.eq(product))
.where(
product.id.eq(id),
productTranslation.locale.eq(locale.getLanguage())
)
.fetchOne();
}
2-4. API request and response
With the multilingual API in place, let’s test it in action. The client sends a request with the desired language set via the Accept-Language
header, and the server returns the translated data accordingly. This enables a single endpoint to serve multiple languages.
To retrieve English content, include Accept-Language: en
in the request:
GET /products/1 HTTP/1.1
Content-Type: application/json
Accept-Language: en
Host: localhost:8080
The response will contain English translations:
{
"id": 1,
"price": 999.99,
"imageUrl": "iphone.jpg",
"name": "iPhone 15",
"description": "The latest iPhone model"
}
To retrieve the Japanese version, use Accept-Language: ja
:
GET /products/1 HTTP/1.1
Content-Type: application/json
Accept-Language: ja
Host: localhost:8080
The response will be:
{
"id": 1,
"price": 999.99,
"imageUrl": "iphone.jpg",
"name": "アイフォン15",
"description": "最新のアイフォンモデル"
}
This setup allows multilingual content to be served dynamically based on the client’s locale. Most importantly, it avoids the need to modify the database schema when new languages are added—only new rows must be inserted—making the system highly maintainable and easy to extend.
3. Final thoughts
While planning a side project with potential for global reach, I naturally began to think about multilingual support. Although the project started as a simple prototype, I felt a growing sense of responsibility to support at least Korean, rather than offering it in English only.
This led me to explore various strategies for managing multilingual data and designing a database that can support them efficiently. Beyond the strategies covered here, other approaches exist—such as storing schema-less data in JSON fields, using key-value stores, maintaining separate tables per language, or integrating with external translation services. Each option has its place, depending on the project’s scale, number of supported languages, frequency of content updates, and performance requirements.
The implementation covered in this post is available in the 🐙 GitHub repository. 🚀