Quick internationalized sort in JavaScript

Let's take a list of countries that was originally alphabetized in English, but is now translated to French.

var arr = ["Argentine", "Australie", "Autriche", "Belgique", "Brésil", "Canada", "Chili", 
"Chine", "Costa Rica ", "République Tchèque", "Danemark", "Équateur", "El Salvador ", 
"Finlande", "France", "Allemagne", "Guatemala", "Hong Kong", "Hongrie", "Inde", "Irlande", 
"Italie", "Japon", "Corée du Sud", "Luxembourg", "Mexique", "Pays-Bas", "Nouvelle-Zélande", 
"Norvège", "Panama", "Pologne", "Portugal", "Russie", "Slovaquie", "Espagne", 
"la Suède", "Suisse", "Turquie", "Royaume-Uni", "Uruguay", "États-Unis"]

You can see the incorrect sort order for Germany ("Allemagne") and the US ("États-Unis").
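Running the standard JavaScript Array.sort() doesn't fix everything: with no comparator it compares strings by their UTF-16 code unit values, so uppercase letters sort before lowercase ones and accented letters land after the whole unaccented alphabet: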

arr.sort();
/*==>
["Allemagne", "Argentine", "Australie", "Autriche", "Belgique", "Brésil", "Canada", "Chili", 
"Chine", "Corée du Sud", "Costa Rica ", "Danemark", "El Salvador ", "Espagne", "Finlande", 
"France", "Guatemala", "Hong Kong", "Hongrie", "Inde", "Irlande", "Italie", "Japon", 
"Luxembourg", "Mexique", "Norvège", "Nouvelle-Zélande", "Panama", "Pays-Bas", "Pologne", 
"Portugal", "Royaume-Uni", "Russie", "République Tchèque", "Slovaquie", "Suisse", "Turquie",
 "Uruguay", "la Suède", "Équateur", "États-Unis"] */
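
The tail of that result follows from the default comparison, which looks at UTF-16 code unit values rather than at any language's alphabet. Here is a minimal sketch that makes the numbers visible; the helper name `firstCodeUnit` is made up for illustration, and the É is written as an escape so the example does not depend on how the file is encoded:

```javascript
// Hypothetical helper: numeric UTF-16 code unit of a string's first character.
function firstCodeUnit(str) {
  return str.charCodeAt(0);
}

// "\u00C9quateur" is "Équateur" with a precomposed É.
var samples = ["Uruguay", "la Suède", "\u00C9quateur"];
var units = samples.map(firstCodeUnit);
// "U" is 85, "l" is 108, "É" is 201, which is why they end up in that order.
```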

Note the misplacement of the last three entries, and of "République Tchèque", which lands after "Russie". A real internationalized sort would be a huge pain to implement by hand, but here is a quick and hacky way to get your ducks in order:

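  // Lowercase a string and strip common Latin accents so it compares as plain a-z.
  function normalize(str){
     return str
             .toLowerCase()
             // The g flag matters: without it only the first match in the string is replaced.
             .replace(/[èéêë]/g,'e').replace(/[òóôõö]/g,'o').replace(/[ìíîï]/g,'i')
             .replace(/[àáâãäåæ]/g,'a').replace(/[ùúûü]/g,'u');
  }

  arr.sort(function(a,b){
    a = normalize(a);
    b = normalize(b);

    return ((a < b) ? -1 : ((a > b) ? 1 : 0));
  });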
/*==>
["Allemagne", "Argentine", "Australie", "Autriche", "Belgique", "Brésil", "Canada", "Chili", 
"Chine", "Corée du Sud", "Costa Rica ", "Danemark", "El Salvador ", "Équateur", "Espagne",
 "États-Unis", "Finlande", "France", "Guatemala", "Hong Kong", "Hongrie", "Inde", "Irlande", 
"Italie", "Japon", "la Suède", "Luxembourg", "Mexique", "Norvège", "Nouvelle-Zélande", 
"Panama", "Pays-Bas", "Pologne", "Portugal", "République Tchèque", "Royaume-Uni", 
"Russie", "Slovaquie", "Suisse", "Turquie", "Uruguay"] */

It's not perfect (I bet that "la Suède" should actually be in the S's), but it'll get you a bit closer without too much effort.
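
If listing every accented letter by hand gets tedious, one variation on the same hack is to let Unicode normalization do the stripping. This is only a sketch, and it assumes an ES2015+ runtime where String.prototype.normalize is available. The names `sortKey` and `compareByKey` are made up for illustration:

```javascript
// Sort key: decompose accented letters (NFD), drop the combining marks
// in U+0300..U+036F, then lowercase the result.
function sortKey(str) {
  return str
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase();
}

// Comparator that orders two strings by their accent-free, lowercase keys.
function compareByKey(a, b) {
  var ka = sortKey(a);
  var kb = sortKey(b);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
}

var sample = ["Espagne", "États-Unis", "Équateur", "El Salvador"];
sample.sort(compareByKey);
// sample is now ["El Salvador", "Équateur", "Espagne", "États-Unis"]
```

Ligatures such as æ do not decompose under NFD, so they would still need an explicit replacement. Runtimes that implement the ECMAScript Internationalization API also offer Intl.Collator and a locale-aware String.prototype.localeCompare, which may make a hand-rolled key unnecessary.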
