-rw-r--r-- Evaluation.md 3
1 files changed, 1 insertions, 2 deletions
diff --git a/Evaluation.md b/Evaluation.md
index 6ae19be..2c9aaec 100644
--- a/Evaluation.md
+++ b/Evaluation.md
@@ -4,8 +4,7 @@
The results we obtained are described in the second blogpost: implementation. In this evaluation we will describe the fields that can be added to the index in order to find out the importance of all fields. As stated in table with the scores and ranks, we can see that not all fields that are found to be relevant by BM25 are included in Nordlys. Only `<foaf:givenName>`, `<dpb:name>`, `<foaf:name>`, `<dbo:wikiPageWikiLinkText>` and `<rfds:label>` are used in Nordlys. There might be some fields that we want to add to the index.
-The fields with the top two ranks in both the bm25 and the human assesment rankings are `<dbo:abstract>` and `<rdfs:comment>`.
-Even though these fields are ranked so highly by BM25 we do not recommend to add them, since the abstract contains a lot of information that is covered by other fields such as `<dbp:title>`. Also, since these fields contain large texts, adding them to the index would increase the computing time.
+The fields with the top two ranks in both the bm25 and the human assesment rankings are `<dbo:abstract>` and `<rdfs:comment>`. Even though these fields are ranked so highly by BM25 we do not recommend adding them both, since the `<rdfs:comment>` field is simply a shorter version of `<dbo:abstract>`. Also, since these fields contain large texts, adding them both to the index would likely increase the computing time by quite a bit. Instead, we recommend only adding the `<rdfs:comment>` field.
In this case, we might try adding fields that have a high BM25 score, such as `<dbp:caption>`, which is rank 7 by BM25. However, `<dbp:caption>` can be the caption of an image or an table, which tend to be very specific and not very likely to be relevant overall.