Paul Irish

Making the www great

Quick Internationalized Sort in JavaScript

Let’s take a list of countries that was originally alphabetized in English, but is now translated to French.

var arr = ["Argentine", "Australie", "Autriche", "Belgique", "Brésil", "Canada", "Chili",
"Chine", "Costa Rica ", "République Tchèque", "Danemark", "Équateur", "El Salvador ",
"Finlande", "France", "Allemagne", "Guatemala", "Hong Kong", "Hongrie", "Inde", "Irlande",
"Italie", "Japon", "Corée du Sud", "Luxembourg", "Mexique", "Pays-Bas", "Nouvelle-Zélande",
"Norvège", "Panama", "Pologne", "Portugal", "Russie", "Slovaquie", "Espagne",
"la Suède", "Suisse", "Turquie", "Royaume-Uni", "Uruguay", "États-Unis"]

You can see the incorrect sort order for Germany (“Allemagne”) and the US (“États-Unis”), among others. Running the standard JavaScript Array.sort() with no comparator orders the strings by their UTF-16 code unit values, which only looks alphabetical for unaccented text of consistent case:

arr.sort();
/*==>
["Allemagne", "Argentine", "Australie", "Autriche", "Belgique", "Brésil", "Canada", "Chili", 
"Chine", "Corée du Sud", "Costa Rica ", "Danemark", "El Salvador ", "Espagne", "Finlande", 
"France", "Guatemala", "Hong Kong", "Hongrie", "Inde", "Irlande", "Italie", "Japon", 
"Luxembourg", "Mexique", "Norvège", "Nouvelle-Zélande", "Panama", "Pays-Bas", "Pologne", 
"Portugal", "Royaume-Uni", "Russie", "République Tchèque", "Slovaquie", "Suisse", "Turquie",
 "Uruguay", "la Suède", "Équateur", "États-Unis"] */

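To see why those entries end up where they do, it helps to look at the code units the default sort compares. This is only a small illustration; the `samples`, `codes` and `sorted` names are made up for the example.

```javascript
// The default sort compares UTF-16 code unit values, not dictionary order.
var samples = ["Z", "a", "l", "É"];
var codes = samples.map(function (s) { return s.charCodeAt(0); });
// codes is [90, 97, 108, 201]: every uppercase ASCII letter comes before
// every lowercase one, and accented letters come after both.

var sorted = ["Équateur", "la Suède", "Uruguay"].sort();
// sorted is ["Uruguay", "la Suède", "Équateur"]
```

So “la Suède” lands after every capitalized name, and anything starting with “É” lands after that.
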
Note the misplacement of the last three entries. A real internationalized sort of this would be a huge motherbitch to implement, but here is a quick and hacky way to get your ducks in order:

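  arr.sort(function(a,b){

    // Lowercase and strip common Latin diacritics so that "É" sorts with "e".
    // The g flag matters: without it only the first match in each string is replaced.
    function normalize(str){
       return str
               .toLowerCase()
               .replace(/[èéêë]/g,'e').replace(/[òóôõö]/g,'o').replace(/[ìíîï]/g,'i')
               .replace(/[àáâãäåæ]/g,'a').replace(/[ùúûü]/g,'u');
    }

    a = normalize(a);
    b = normalize(b);

    return ((a < b) ? -1 : ((a > b) ? 1 : 0));
  });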
/*==>
["Allemagne", "Argentine", "Australie", "Autriche", "Belgique", "Brésil", "Canada", "Chili", 
"Chine", "Corée du Sud", "Costa Rica ", "Danemark", "El Salvador ", "Équateur", "Espagne",
 "États-Unis", "Finlande", "France", "Guatemala", "Hong Kong", "Hongrie", "Inde", "Irlande", 
"Italie", "Japon", "la Suède", "Luxembourg", "Mexique", "Norvège", "Nouvelle-Zélande", 
"Panama", "Pays-Bas", "Pologne", "Portugal", "République Tchèque", "Royaume-Uni", 
"Russie", "Slovaquie", "Suisse", "Turquie", "Uruguay"] */

It’s not perfect (I bet that “la Suède” should actually be in the S’s), but it’ll get you a bit closer without too much effort.
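
If listing every accented letter by hand feels brittle, one possible variation is to let Unicode normalization do the stripping. This is a sketch rather than a drop-in replacement: it assumes a runtime with String.prototype.normalize (ES2015 or later), and `stripDiacritics` and `compareIgnoringAccents` are names invented here.

```javascript
// Decompose each character (NFD), then drop the combining marks
// in the U+0300 to U+036F range that carry the accents.
function stripDiacritics(str) {
  return str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Comparator: ignore case and accents, then compare as plain strings.
function compareIgnoringAccents(a, b) {
  a = stripDiacritics(a).toLowerCase();
  b = stripDiacritics(b).toLowerCase();
  return a < b ? -1 : (a > b ? 1 : 0);
}

var sample = ["États-Unis", "Espagne", "la Suède", "Équateur", "Uruguay"];
sample.sort(compareIgnoringAccents);
// sample is now ["Équateur", "Espagne", "États-Unis", "la Suède", "Uruguay"]
```

Ligatures such as “æ” have no decomposition, so they pass through unchanged and would still need a manual replacement.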

2009.10.29: A much better method:
  arr.sort(function(a, b) {
    if (typeof a === 'string' && typeof b === 'string') {
      return a.toLowerCase().localeCompare(b.toLowerCase());
    }
    // Non-string values are left in their relative order.
    return 0;
  });
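
In runtimes that ship the ECMAScript Internationalization API, a similar result can probably be had with Intl.Collator, which also lets you name the locale explicitly instead of relying on the browser default. A minimal sketch, assuming Intl is available with French locale data; the `collator` and `sample` names are just for illustration.

```javascript
// "base" sensitivity treats letters that differ only by accent or case as equal.
var collator = new Intl.Collator("fr", { sensitivity: "base" });

var sample = ["États-Unis", "Espagne", "la Suède", "Équateur", "Uruguay"];
// collator.compare is already bound to the collator, so it can be passed directly.
sample.sort(collator.compare);
// sample is now ["Équateur", "Espagne", "États-Unis", "la Suède", "Uruguay"]
```

Reusing one collator for the whole sort also avoids repeating the toLowerCase() calls on every comparison.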
