git-svn-id: svn://svn.open-ils.org/ILS/trunk@14004 dcc99617-32d9-48b4-a31d-7c20da2025e4
author kgs <kgs@dcc99617-32d9-48b4-a31d-7c20da2025e4>
Wed, 9 Sep 2009 21:15:08 +0000 (21:15 +0000)
committer kgs <kgs@dcc99617-32d9-48b4-a31d-7c20da2025e4>
Wed, 9 Sep 2009 21:15:08 +0000 (21:15 +0000)
docs/1.6/book1/sysadmin/indexedfieldweighting.xml [new file with mode: 0644]

diff --git a/docs/1.6/book1/sysadmin/indexedfieldweighting.xml b/docs/1.6/book1/sysadmin/indexedfieldweighting.xml
new file mode 100644
index 0000000..565a502
--- /dev/null
@@ -0,0 +1,233 @@
+<?xml version='1.0' encoding='UTF-8'?>\r
+<section xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude"\r
+    xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:id="indexedfieldweighting">\r
+    <title>Indexed-Field and Matchpoint Weighting</title>\r
+    <info>\r
+        <abstract>\r
+            <para>This chapter describes indexed field weighting and matchpoint weighting, which\r
+                control relevance ranking in Evergreen catalog search results.</para>\r
+            <para>\r
+                <tip>\r
+                    <para>In tuning search relevance, it is good practice to make incremental\r
+                        adjustments, capture search logs, and assess results before making further\r
+                        adjustments. </para>\r
+                </tip>\r
+            </para>\r
+        </abstract>\r
+    </info>\r
+    <section>\r
+        <title>Indexed-field Weighting</title>\r
+        <para>Indexed-field weighting is configured in the Evergreen database through the weight\r
+            column of the config.metabib_field table. The weight column follows the table's other\r
+            four columns: field_class, name, xpath, and format. </para>\r
+        <para>The following is one representative line from the config.metabib_field table:</para>\r
+        <para> author | conference |\r
+            //mods32:mods/mods32:name[@type='conference']/mods32:namePart[../mods32:role/mods32:roleTerm[text()='creator']]\r
+            | mods32 | 1 </para>\r
+        <para>The default value for indexed-field weights in config.metabib_field is 1. Adjust the\r
+            weight of an indexed field to raise or lower the relevance score for matches on that\r
+            field. The weight is an integer, so it can only be changed in whole-number steps. </para>\r
+        <para>For example, increasing the weight of the title-proper field from 1 to 2 doubles the\r
+            relevance that a search for <emphasis role="bold">jaguar</emphasis> assigns to the book\r
+            titled <emphasis role="italic">Aimee and Jaguar</emphasis>, relative to a record with the\r
+            term <emphasis role="bold">jaguar</emphasis> in another indexed field. </para>\r
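+        <para>As a minimal sketch of such an adjustment, assuming direct SQL access to the\r
+            Evergreen database and that the title-proper field is the row with field_class\r
+            <literal>title</literal> and name <literal>proper</literal>, the weight could be\r
+            inspected and then raised as follows. Note the original value first so that the change\r
+            can be reverted.</para>\r
+        <programlisting language="sql">-- Inspect the current weight of the title-proper field.\r
+SELECT field_class, name, weight\r
+  FROM config.metabib_field\r
+ WHERE field_class = 'title' AND name = 'proper';\r
+\r
+-- Raise the weight from the default of 1 to 2.\r
+UPDATE config.metabib_field\r
+   SET weight = 2\r
+ WHERE field_class = 'title' AND name = 'proper';</programlisting>\r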
+    </section>\r
+    <section>\r
+        <title>Matchpoint Weighting</title>\r
+        <para> Matchpoint weighting provides another way to fine-tune Evergreen relevance ranking,\r
+            and is configured through floating-point multipliers in the multiplier column of the\r
+            search.relevance_adjustment table.</para>\r
+        <para> Weighting can be adjusted for one, several, or all of the multiplier values in\r
+            search.relevance_adjustment. </para>\r
+        <para>You can adjust the following three matchpoints:</para>\r
+        <itemizedlist>\r
+            <listitem>\r
+                <para><indexterm>\r
+                        <primary>first_word</primary>\r
+                    </indexterm><literal>first_word</literal> boosts relevance if the query is one\r
+                    term long and matches the first term in the indexed field. A search for\r
+                    <emphasis role="bold">twain</emphasis> gets a bonus for\r
+                    <emphasis role="bold">twain, mark</emphasis> but not for\r
+                    <emphasis role="bold">mark twain</emphasis>.</para>\r
+            </listitem>\r
+            <listitem>\r
+                <para><indexterm>\r
+                        <primary>word_order</primary>\r
+                    </indexterm><literal>word_order</literal> increases relevance when the matched\r
+                    words appear in the same order as the search terms, so that the search\r
+                    <emphasis role="bold">legend suicide</emphasis> ranks the book\r
+                    <emphasis role="italic">Legend of a Suicide</emphasis> higher than the book\r
+                    <emphasis role="italic">Suicide Legend</emphasis>.</para>\r
+            </listitem>\r
+            <listitem>\r
+                <para><indexterm>\r
+                        <primary>full_match</primary>\r
+                    </indexterm><literal>full_match</literal> boosts relevance when the full query\r
+                    exactly matches the entire indexed field (after spacing, case, and diacritics\r
+                    are normalized). So a title search for <emphasis role="bold">the future of\r
+                    ice</emphasis> would boost <emphasis role="italic">The Future of Ice</emphasis>\r
+                    above <emphasis role="italic">Ice Ages of the Future</emphasis>.</para>\r
+            </listitem>\r
+        </itemizedlist>\r
+        <para> Here are the default settings of the search.relevance_adjustment table: </para>\r
+        <table xml:id="search.relevance">\r
+            <title>search.relevance_adjustment table</title>\r
+            <tgroup cols="4">\r
+                <thead>\r
+                    <row>\r
+                        <entry>field_class</entry>\r
+                        <entry>name</entry>\r
+                        <entry>bump_type</entry>\r
+                        <entry>multiplier</entry>\r
+                    </row>\r
+                </thead>\r
+                <tbody>\r
+                    <row>\r
+                        <entry>author</entry>\r
+                        <entry>conference</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>author</entry>\r
+                        <entry>corporate</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>author</entry>\r
+                        <entry>other</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>author</entry>\r
+                        <entry>personal</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>keyword</entry>\r
+                        <entry>keyword</entry>\r
+                        <entry>word_order</entry>\r
+                        <entry>10</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>series</entry>\r
+                        <entry>seriestitle</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>series</entry>\r
+                        <entry>seriestitle</entry>\r
+                        <entry>full_match</entry>\r
+                        <entry>20</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>abbreviated</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>abbreviated</entry>\r
+                        <entry>full_match</entry>\r
+                        <entry>20</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>abbreviated</entry>\r
+                        <entry>word_order</entry>\r
+                        <entry>10</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>alternative</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>alternative</entry>\r
+                        <entry>full_match</entry>\r
+                        <entry>20</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>alternative</entry>\r
+                        <entry>word_order</entry>\r
+                        <entry>10</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>proper</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>proper</entry>\r
+                        <entry>full_match</entry>\r
+                        <entry>20</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>proper</entry>\r
+                        <entry>word_order</entry>\r
+                        <entry>10</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>translated</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>translated</entry>\r
+                        <entry>full_match</entry>\r
+                        <entry>20</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>translated</entry>\r
+                        <entry>word_order</entry>\r
+                        <entry>10</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>uniform</entry>\r
+                        <entry>first_word</entry>\r
+                        <entry>1.5</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>uniform</entry>\r
+                        <entry>full_match</entry>\r
+                        <entry>20</entry>\r
+                    </row>\r
+                    <row>\r
+                        <entry>title</entry>\r
+                        <entry>uniform</entry>\r
+                        <entry>word_order</entry>\r
+                        <entry>10</entry>\r
+                    </row>\r
+                </tbody>\r
+            </tgroup>\r
+        </table>\r
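+        <para>The table above shows each multiplier alongside the field_class and name of its\r
+            indexed field. As a hedged sketch, assuming that search.relevance_adjustment refers to\r
+            its indexed field through a <literal>field</literal> column holding the\r
+            <literal>id</literal> of a config.metabib_field row (an assumption to verify against\r
+            your own schema before running anything), a single multiplier could be changed like\r
+            this:</para>\r
+        <programlisting language="sql">-- Assumption: relevance_adjustment.field references config.metabib_field.id.\r
+UPDATE search.relevance_adjustment\r
+   SET multiplier = 1.2\r
+ WHERE bump_type = 'first_word'\r
+   AND field = (SELECT id\r
+                  FROM config.metabib_field\r
+                 WHERE field_class = 'title' AND name = 'proper');</programlisting>\r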
+    </section>\r
+    <section>\r
+        <title>Combining Indexed-Field Weighting and Matchpoint Weighting</title>\r
+        <para>Indexed-field weighting and matchpoint weighting may be combined. The combined\r
+            relevance boost is the product of the two values. </para>\r
+        <para>If the weight in config.metabib_field were increased to 2, and the multiplier in the\r
+            search.relevance_adjustment table set to 1.2, the combined boost would be\r
+            2 × 1.2 = 2.4, that is, 240% of the unweighted relevance. </para>\r
+        <note>\r
+            <para>In practice, these weights are applied serially (first the indexed-field weight,\r
+                then all the matchpoint weights that apply) because they are evaluated at different\r
+                stages of the search process.</para>\r
+        </note>\r
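+        <para>As a worked illustration under the product model described in this section, and\r
+            assuming that the multipliers of all matchpoints that apply simply compound: with the\r
+            title-proper weight raised to 2 and the default multipliers from the\r
+            search.relevance_adjustment table, a match that triggers both first_word (1.5) and\r
+            full_match (20) would receive a combined factor of 2 × 1.5 × 20 = 60, whereas a match\r
+            that triggers only word_order (10) would receive 2 × 10 = 20.</para>\r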
+    </section>\r
+</section>\r