Locale Aware Sorting in JavaScript
March 28, 2022
Problem
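When building a localized JavaScript web-app, the default sorting logic for strings doesn't quite yield the results that you might expect. By default, sort compares strings by their UTF-16 code unit values, so uppercase letters come before lowercase ones and accented characters land at the end. Take the following example…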
let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort();
console.log(strings);
// ['NOP', 'abc', 'abc', 'nop', 'äbc', 'ñop']
If it weren't for the accented characters, you could try lowercasing everything
to shift NOP to the intended place, but to properly sort with localization
in mind, this technique does not work.
You can jump to various sections of this blog post…
- What is a compareFunction?
- Using localeCompare in the compareFunction
- Using Intl.Collator in the compareFunction
- Sorting Objects with a Primary and Secondary Property
- Additional Locale Specific Sorting Options
Solutions
Thankfully, there are a couple of options that you can use to apply locale-aware
sorting (localeCompare and Intl.Collator).
We will take a look at both of these approaches used inside the Array's sort
method, but first let's briefly explain what a compareFunction is.
What is a compareFunction?
You can customize how an Array sorts by providing a compareFunction as an
argument. This function takes two parameters (typically named a and b)
where the return value of the function is positive, negative, or 0.
- If the result is negative, then a should be before b,
- If the result is positive, then b should be before a,
- If the result is zero, then a and b are equal.
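To make the contract above concrete, here is a minimal sketch that uses numbers instead of strings. The names ascending, descending, and numbers are made up for this illustration.

```javascript
// The sign of the return value decides the order.
const ascending = (a, b) => a - b;  // negative when a < b, so a sorts first
const descending = (a, b) => b - a; // negative when a > b, so a sorts first

const numbers = [10, 1, 5];

// Copy the array first because sort() reorders in place.
const sortedAscending = [...numbers].sort(ascending);
const sortedDescending = [...numbers].sort(descending);

console.log(sortedAscending);  // [1, 5, 10]
console.log(sortedDescending); // [10, 5, 1]
```

Subtracting works here because the values are numbers. Strings need an explicit comparison.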
The following is an example of what a compareFunction can look like. This
function lowercases both strings before comparing them. It does not solve our
sorting problem listed above. The sorting is a little better than our original
attempt, but it still doesn't account for the special locale-specific
characters.
let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort((a, b) => {
  const lowerA = a.toLowerCase();
  const lowerB = b.toLowerCase();
  if (lowerA < lowerB) {
    return -1; // A is less than B
  } else if (lowerA > lowerB) {
    return 1; // A is greater than B
  } else {
    return 0; // A and B are equal
  }
});
console.log(strings);
// ['abc', 'abc', 'nop', 'NOP', 'äbc', 'ñop']
Using localeCompare in the compareFunction
As mentioned above, modern browsers have better techniques to compare strings
with locale in mind. First we will look at the localeCompare method on the
String prototype. This method follows the contract defined by the
compareFunction as described above. It is called on one string and takes the
string to compare against as its argument, and it returns a positive value, a
negative value, or 0 depending on how the two strings compare to each other.
let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort((a, b) => a.localeCompare(b));
console.log(strings);
// ['abc', 'abc', 'äbc', 'nop', 'NOP', 'ñop']
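The example above relies on the browser's default locale. localeCompare also accepts an optional locale and an options object as its second and third arguments. The following is a small sketch of that, where the 'en' and 'de' locales and the sensitivity value are just choices made for illustration.

```javascript
const words = ["nop", "NOP", "ñop", "abc", "abc", "äbc"];

// Pin the locale instead of relying on the environment's default.
const sortedWords = [...words].sort((a, b) => a.localeCompare(b, "en"));
console.log(sortedWords);

// With sensitivity 'base', differences in accents and case are ignored,
// so these pairs compare as equal.
const sameBase = "a".localeCompare("ä", "de", { sensitivity: "base" });
const sameCase = "a".localeCompare("A", "de", { sensitivity: "base" });
console.log(sameBase, sameCase); // 0 0
```

Pinning the locale can be handy when the order should match the language of the content rather than the language of the visitor's browser.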
Sorting by an Object Property
When I need to sort in a web-app, I'm typically trying to sort objects in an
array. Thankfully, you can tweak the compareFunction to access the property
that needs sorting.
let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
objects.sort((a, b) => a.name.localeCompare(b.name));
console.log(objects);
/*
[
  { "name": "abc", "value": 3 },
  { "name": "abc", "value": 2 },
  { "name": "äbc", "value": 1 },
  { "name": "nop", "value": 3 },
  { "name": "NOP", "value": 2 },
  { "name": "ñop", "value": 1 }
]
*/
Using Intl.Collator in the compareFunction
Another way to sort with language-sensitive string comparison is to use
Intl.Collator.
With this approach, you use the Intl.Collator constructor to create a collator
object that will be used in your compareFunction. The collator has a compare
method that can be leveraged inside of the Array's sort method.
let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
const collator = new Intl.Collator('en');
strings.sort((a, b) => collator.compare(a, b));
console.log(strings);
// ['abc', 'abc', 'äbc', 'nop', 'NOP', 'ñop']
Since the collator.compare method accepts the same parameters as the
compareFunction, you can simplify the line above by passing the compare method
directly to the sort method.
strings.sort(collator.compare);
You might be wondering why you should use this approach versus the
localeCompare method in the previous section. MDN recommends that you use
Intl.Collator for performance reasons when "comparing large numbers of
strings".
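One way to follow that advice is to create each collator once and reuse its compare method for every sort. The sketch below does that for two locales to show that the locale really does change the order. The 'de' and 'sv' locales and the variable names are my own picks for illustration, and the exact order depends on the locale data available in your environment.

```javascript
const letters = ["Z", "a", "z", "ä"];

// Create each collator once and reuse it for every comparison.
const german = new Intl.Collator("de");
const swedish = new Intl.Collator("sv");

// compare is already bound to its collator, so it can be passed directly.
const germanOrder = [...letters].sort(german.compare);
const swedishOrder = [...letters].sort(swedish.compare);

console.log(germanOrder);  // ['a', 'ä', 'z', 'Z']
console.log(swedishOrder); // ['a', 'z', 'Z', 'ä']
```

In German, ä sorts next to a, while Swedish treats it as a separate letter that comes after z.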
Sorting by an Object Property
You can also sort arrays of objects like we did in the previous example. In this
case we leverage the collator.compare method and pass along the properties
that we want to sort by.
let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator('en');
objects.sort((a, b) => collator.compare(a.name, b.name));
console.log(objects);
/*
[
  { "name": "abc", "value": 3 },
  { "name": "abc", "value": 2 },
  { "name": "äbc", "value": 1 },
  { "name": "nop", "value": 3 },
  { "name": "NOP", "value": 2 },
  { "name": "ñop", "value": 1 }
]
*/
Sorting Objects with a Primary and Secondary Property
When you sort an array and have several exact matches, it is handy to have a
secondary property to sort by to break the tie. You can use the same approach
as above, but with a little more logic. Inside the compareFunction, if the two
properties have the same value (a zero compare value), then you can compare
again by a secondary property.
let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator('en');
objects.sort((a, b) => {
  // Compare the strings via locale
  let diff = collator.compare(a.name, b.name);
  if (diff === 0) {
    // If the strings are equal compare the numbers
    return a.value - b.value;
  }
  return diff;
});
console.log(objects);
/*
[
  { "name": "abc", "value": 2 }, // name is same, sort by value
  { "name": "abc", "value": 3 }, // name is same, sort by value
  { "name": "äbc", "value": 1 },
  { "name": "nop", "value": 3 },
  { "name": "NOP", "value": 2 },
  { "name": "ñop", "value": 1 }
]
*/
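If you prefer something more compact, the same tie-breaking logic can be sketched with the || operator, since a compare result of 0 is falsy. The byNameThenValue and items names are hypothetical and only used for this illustration.

```javascript
const collator = new Intl.Collator("en");

// When the names compare as equal (0), || falls through to the
// numeric comparison of the secondary property.
const byNameThenValue = (a, b) =>
  collator.compare(a.name, b.name) || a.value - b.value;

const items = [
  { name: "nop", value: 3 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];

items.sort(byNameThenValue);
console.log(items.map((item) => `${item.name}:${item.value}`));
// ['abc:2', 'abc:3', 'äbc:1', 'nop:3']
```

This reads nicely for two properties. With more levels, the explicit if statement above may be easier to follow.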
Additional Locale Specific Sorting Options
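Both of the above sorting techniques accept additional options that help refine the sorting logic. Intl.Collator takes them as the second argument of its constructor, and localeCompare takes them as its third argument.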
const collator = new Intl.Collator('en', {
  sensitivity: 'base', // base, accent, case, variant
  caseFirst: 'upper', // upper, lower, false
  usage: 'sort', // sort, search
  ignorePunctuation: true, // true, false
  numeric: true, // true, false
});
You can find more details about these options in the TC39 documentation.
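To get a feel for what a couple of these options do, here is a small sketch that tries numeric and sensitivity on their own. The sample strings and variable names are made up for illustration.

```javascript
const files = ["item 10", "item 2", "item 1"];

// Without numeric, digits are compared character by character.
const plainOrder = [...files].sort(new Intl.Collator("en").compare);

// With numeric: true, runs of digits are compared as numbers.
const numericCollator = new Intl.Collator("en", { numeric: true });
const numericOrder = [...files].sort(numericCollator.compare);

console.log(plainOrder);   // ['item 1', 'item 10', 'item 2']
console.log(numericOrder); // ['item 1', 'item 2', 'item 10']

// With sensitivity 'base', case and accent differences are ignored.
const baseCollator = new Intl.Collator("en", { sensitivity: "base" });
const caseResult = baseCollator.compare("a", "A");
const accentResult = baseCollator.compare("a", "ä");
console.log(caseResult, accentResult); // 0 0
```

The numeric option is especially useful for file names and version-like labels.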