-rw-r--r-- Implementation.md 49
1 files changed, 48 insertions, 1 deletions
diff --git a/Implementation.md b/Implementation.md
index b60b8b9..ae527d7 100644
--- a/Implementation.md
+++ b/Implementation.md
@@ -187,4 +187,51 @@ This assumes that you have the following files from Nordlys:
The system is agnostic with regards to the ranking function (BM25 or another
method).
-## Intermediate Result
+## Intermediate Results
+These are the thirty most important fields by our measure, in ascending order:
+
+| Field | Score | Used by Nordlys |
+|------------------------------|----------:|:---------------:|
+| `<dbp:imageFlag>` | 2205.50 | ![][n] |
+| `<dbp:office>` | 2246.90 | ![][n] |
+| `<dbp:pushpinMapCaption>` | 2357.07 | ![][n] |
+| `<dbp:description>` | 2357.35 | ![][n] |
+| `<dbp:placeOfBirth>` | 2384.14 | ![][n] |
+| `<dbp:fastTime>` | 2440.73 | ![][n] |
+| `<dbp:imageMap>` | 2485.96 | ![][n] |
+| `<dbp:writer>` | 2689.86 | ![][n] |
+| `<dbp:alt>` | 2691.94 | ![][n] |
+| `<foaf:givenName>` | 2694.41 | ![][y] |
+| `<dbp:poleTime>` | 2698.75 | ![][n] |
+| `<dbp:country>` | 2836.44 | ![][n] |
+| `<dbp:type>` | 3248.58 | ![][n] |
+| `<dbo:office>` | 3425.58 | ![][n] |
+| `<dbp:location>` | 3430.20 | ![][n] |
+| `<dbp:officialName>` | 4316.34 | ![][y] |
+| `<dbp:quote>` | 4470.38 | ![][n] |
+| `<dbp:imageCaption>` | 4480.06 | ![][n] |
+| `<dbp:producer>` | 4704.52 | ![][n] |
+| `<dbp:mapCaption>` | 8040.36 | ![][n] |
+| `<dbp:title>` | 10999.72 | ![][n] |
+| `<dbp:shortDescription>` | 22065.46 | ![][n] |
+| `<dc:description>` | 23442.34 | ![][n] |
+| `<dbp:caption>` | 24697.75 | ![][n] |
+| `<dbp:name>` | 25500.42 | ![][y] |
+| `<foaf:name>` | 32860.37 | ![][y] |
+| `<dbo:wikiPageWikiLinkText>` | 86218.71 | ![][y] |
+| `<rdfs:label>` | 105358.89 | ![][y] |
+| `<rdfs:comment>` | 514446.08 | ![][n] |
+| `<dbo:abstract>` | 581355.57 | ![][n] |
+
+We see that 24 of these 30 fields are [not used by
+Nordlys](https://iai-group.github.io/DBpedia-Entity/index_details.html).
+However, this alone does not mean that these fields should be added to the
+index. After all, adding more fields means more computation time to build the
+index and to retrieve search results.
+
+In fact, we expect that many of the unused fields overlap in content with
+fields that *are* indexed. For example, the `<dbo:abstract>` field will
+probably match because the title is repeated in the abstract.
+
+[y]: http://i.stack.imgur.com/iro5J.png
+[n]: http://i.stack.imgur.com/asAya.png