In everyday text (as opposed to source code), the apostrophe is usually
typed as ' (U+0027), but the typographic form ’ (U+2019, &rsquo; in HTML)
is also common. See
https://fr.wikipedia.org/wiki/Apostrophe_%28typographie%29
This patch transliterates every apostrophe form to a space: ' (U+0027),
’ (U+2019) and the modifier letter apostrophe (U+02BC).
Test plan:
(’ below is the typographic apostrophe, U+2019)
- Without the patch
- Create a record with title: L’avion d’argile
- Index this record
- Search for "L’avion d’argile" => You find the record
- Search for "L'avion d'argile" => You do not find the record
- Apply the patch
- Search for "L’avion d’argile" => You find the record
- Search for "L'avion d'argile" => You find the record
- Search for "L avion d argile" => You find the record
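
The expected results above can be illustrated with a minimal Python sketch.
It only mimics the three apostrophe rules touched by this patch, not the
full Zebra ICU chain; the helper name normalize_apostrophes is hypothetical
and exists only for this illustration.

```python
# Illustration only: mimics the three apostrophe rules, not the whole ICU chain.
APOSTROPHES = {
    ord("'"): " ",       # U+0027 APOSTROPHE
    ord("\u2019"): " ",  # U+2019 RIGHT SINGLE QUOTATION MARK
    ord("\u02bc"): " ",  # U+02BC MODIFIER LETTER APOSTROPHE
}


def normalize_apostrophes(text: str) -> str:
    """Replace every supported apostrophe form with a space (hypothetical helper)."""
    return text.translate(APOSTROPHES)


title = "L\u2019avion d\u2019argile"
queries = [
    "L\u2019avion d\u2019argile",  # typographic apostrophe
    "L'avion d'argile",            # ASCII apostrophe
    "L avion d argile",            # plain spaces
]

# Once normalized, all three queries are identical to the normalized title.
for query in queries:
    print(normalize_apostrophes(query) == normalize_apostrophes(title))
```

Because the title and the query go through the same normalization, the
three spellings become equivalent, which is what the last three searches
in the test plan check.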
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
(cherry picked from commit b11eb03a4c9674f4f4dedadaa8790257e30fb1d0)
Signed-off-by: Frédéric Demians <f.demians@tamil.fr>
(cherry picked from commit c7f90a22be27a8960b6429a5a7e66bf77371f8da)
Signed-off-by: Liz Rea <wizzyrea@gmail.com>
<transliterate rule="{ æ > ae "/>
<transliterate rule="{ Æ > ae "/>
<transliterate rule="\'>\ "/>
+ <transliterate rule="\u2019>\ "/>
+ <transliterate rule="\u02BC>\ "/>
<transliterate rule="[:Number:] { '-' > '' "/>
<!-- Remove control characters except \t\n\r -->
<transform rule="[\x00-\x08\x0B\x0C\x0E-\x1F\x7F] Any-Remove"/>