NOTE: I use HTML codes for special characters to avoid encoding issues in the patch file.
In the ICU configuration, add transliterate rules for
œ = oe
æ = ae
Test plan :
- Without patch
- Create a record R1 with title containing for example "cœur"
- Create a record R2 with title containing for example "coeur"
- Index those records
- Search for "cœur"
=> You only find R1
- Search for "coeur"
=> You only find R2
- Apply patch
- Restart zebra
- Index R1 and R2
- Search for "cœur"
=> You find R1 and R2
- Search for "coeur"
=> You find R1 and R2
(Same test plan for ae)
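The expected behaviour after the patch can be sketched as follows. This is only an illustration in Python, not how Zebra or ICU implement it: the names LIGATURE_FOLDS, fold_ligatures and index_key are hypothetical, and the lowercasing step is an assumption standing in for the rest of the ICU chain.

```python
# Hypothetical illustration only: Zebra applies the ICU transliterate
# rules from the configuration; this code merely mimics their effect.
LIGATURE_FOLDS = {
    "œ": "oe",
    "Œ": "oe",
    "æ": "ae",
    "Æ": "ae",
}


def fold_ligatures(text: str) -> str:
    """Replace each ligature with its two-letter equivalent."""
    return "".join(LIGATURE_FOLDS.get(ch, ch) for ch in text)


def index_key(title: str) -> str:
    """Assumed normalization: fold ligatures, then lowercase."""
    return fold_ligatures(title).lower()


# R1 and R2 from the test plan end up with the same key, so a search
# for either spelling should match both records.
r1_title = "cœur"
r2_title = "coeur"
same_key = index_key(r1_title) == index_key(r2_title)
```

Because the same folding is applied to both the indexed titles and the search terms, either spelling should find R1 and R2 once the records are reindexed.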
------
Tested with all variants of Ae ae Oe oe. Search worked as expected.
Note: The words with special characters were not highlighted, but I think this can be handled in another bug.
Signed-off-by: Marc Veron <veron@veron.ch>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
<icu_chain locale="">
+ <transliterate rule="{ œ > oe "/>
+ <transliterate rule="{ Œ > oe "/>
+ <transliterate rule="{ æ > ae "/>
+ <transliterate rule="{ Æ > ae "/>
<!-- Remove control characters except \t\n\r -->
<transform rule="[\x00-\x08\x0B\x0C\x0E-\x1F\x7F] Any-Remove"/>
<tokenize rule="l"/>
<icu_chain locale="">
+ <transliterate rule="{ œ > oe "/>
+ <transliterate rule="{ Œ > oe "/>
+ <transliterate rule="{ æ > ae "/>
+ <transliterate rule="{ Æ > ae "/>
<transliterate rule="\'>\ "/>
<transliterate rule="[:Number:] { '-' > '' "/>
<!-- Remove control characters except \t\n\r -->