author | Luuk Arts | 2018-01-12 11:30:16 +0100 |
---|---|---|
committer | GitHub | 2018-01-12 11:30:16 +0100 |
commit | f7dc76443d122c60763e6f4ede69d7777a3bfe1e (patch) | |
tree | fae84b3efd1bfc7a3c64baee97ccea4ec3dc5d34 | |
parent | Update Evaluation.md (diff) |
Update Evaluation.md
-rw-r--r-- | Evaluation.md | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/Evaluation.md b/Evaluation.md
index f32cf82..0a08922 100644
--- a/Evaluation.md
+++ b/Evaluation.md
@@ -6,7 +6,7 @@ The results we obtained are described in the second blogpost: implementation. In
 Some fields that scored well using our human assesment as a relevance measures turned out to have a low ranking when using bm25. These fields are `<dbp:ground>` (802), `<dbo:foundingYear>` (299), `<dbp:foundation>` (266).
-In this case, we might try adding fields that have a high BM25 score. The fields with the top two ranks in both the bm25 and the human assesment rankings are `<dbo:abstract>` and `<rdfs:comment>`. Even though these fields are ranked so highly by BM25 we do not recommend adding them both, since the `<rdfs:comment>` field is simply a shorter version of `<dbo:abstract>`. Also, since these fields contain large texts, adding them both to the index would likely increase the computing time by quite a bit. Instead, we recommend only adding the `<rdfs:comment>` field.
+In this case, we might try adding fields that have a high BM25 score. The fields with the top two ranks in both the BM25 and the human assesment rankings are `<dbo:abstract>` and `<rdfs:comment>`. Even though these fields are ranked so highly by BM25 we do not recommend adding them both, since the `<rdfs:comment>` field is simply a shorter version of `<dbo:abstract>`. Also, since these fields contain large texts, adding them both to the index would likely increase the computing time by quite a bit. Instead, we recommend only adding the `<rdfs:comment>` field.
 Another field is `<dbp:caption>`, which is rank 7 for BM25. However, `<dbp:caption>` can be the caption of an image or an table, which tend to be very specific and not very likely to be relevant overall.