Diffstat (limited to 'Evaluation.md')
-rw-r--r-- | Evaluation.md | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/Evaluation.md b/Evaluation.md
index f32cf82..0a08922 100644
--- a/Evaluation.md
+++ b/Evaluation.md
@@ -6,7 +6,7 @@ The results we obtained are described in the second blogpost: implementation. In

 Some fields that scored well using our human assesment as a relevance measures turned out to have a low ranking when using bm25. These fields are `<dbp:ground>` (802), `<dbo:foundingYear>` (299), `<dbp:foundation>` (266).

-In this case, we might try adding fields that have a high BM25 score. The fields with the top two ranks in both the bm25 and the human assesment rankings are `<dbo:abstract>` and `<rdfs:comment>`. Even though these fields are ranked so highly by BM25 we do not recommend adding them both, since the `<rdfs:comment>` field is simply a shorter version of `<dbo:abstract>`. Also, since these fields contain large texts, adding them both to the index would likely increase the computing time by quite a bit. Instead, we recommend only adding the `<rdfs:comment>` field.
+In this case, we might try adding fields that have a high BM25 score. The fields with the top two ranks in both the BM25 and the human assesment rankings are `<dbo:abstract>` and `<rdfs:comment>`. Even though these fields are ranked so highly by BM25 we do not recommend adding them both, since the `<rdfs:comment>` field is simply a shorter version of `<dbo:abstract>`. Also, since these fields contain large texts, adding them both to the index would likely increase the computing time by quite a bit. Instead, we recommend only adding the `<rdfs:comment>` field.

 Another field is `<dbp:caption>`, which is rank 7 for BM25. However, `<dbp:caption>` can be the caption of an image or an table, which tend to be very specific and not very likely to be relevant overall.
