Efficient Graph-based Document Similarity

What phenomena or properties are being investigated?
The quality and speed of an ‘efficient’ method for ‘similar-document search’.
The innovative idea of the method is to ‘semantically expand’ documents as a pre-processing step rather than at search time, and to use a new similarity measure that combines hierarchical and transversal information.
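To make that concrete, here is a minimal sketch of the pre-processing idea as I understand it (my own illustration in Python; the toy graph, function names, and depth limit are assumptions, not the paper's code):

```python
from collections import deque

# Toy knowledge graph: entity -> list of (relation type, neighbour) edges.
# "hierarchical" stands in for taxonomy edges (e.g. broader/category links),
# "transversal" for all other relations.
GRAPH = {
    "Berlin": [("hierarchical", "City"), ("transversal", "Germany")],
    "Germany": [("hierarchical", "Country")],
    "City": [("hierarchical", "Settlement")],
}

def expand_entities(entities, max_depth=2):
    """Expand a document's annotated entities along graph edges, recording
    the depth and the edge type through which each node was reached."""
    expansion = {}
    queue = deque((e, 0, None) for e in entities)
    while queue:
        node, depth, via = queue.popleft()
        if node in expansion or depth > max_depth:
            continue
        expansion[node] = (depth, via)
        for relation, neighbour in GRAPH.get(node, []):
            queue.append((neighbour, depth + 1, relation))
    return expansion

# Done once per document at indexing time, so search time only has to
# compare stored expansions instead of traversing the graph.
print(expand_entities({"Berlin"}))
```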

Why are those phenomena or properties of interest?
Similar-document search is used in ‘many’ applications: document retrieval, recommendation …

Has the aim of the research been articulated?
I think it is “improving similar-document search”.

What are the specific hypotheses and research questions?
Does the new method for storing and querying improve the speed and quality of the recommendations?

Are these elements convincingly connected to each other?
I think you could separate this paper into two: one on the speed of retrieval using known similarity measures, and one on the quality of the new measure.
But I do not know much about this area; maybe the two are interlinked somehow.

To what extent is the work innovative? Is this reflected in the claims?
A new method of storage and retrieval and a new similarity measure are presented.
The work claims to outperform related work in both quality and speed of retrieval.

What would disprove the hypothesis? Does it have any improbable consequences?
‘Better similarity’ could be disproved by a test on another dataset with different characteristics.
For ‘better speed’ I did not find support for that claim: the timings of the approach were reported, but not compared to anything.

What are the underlying assumptions? Are they sensible?
Knowledge-graph-based similarity measures are better than word-distribution-based ones = sensible.
Inverted-index-based searches are fast = sensible.
Creating a candidate set quickly but roughly and then running a slower, better algorithm on that set is faster than a full search without sacrificing too much quality = sensible (see the sketch after this list).
Using two benchmarks is enough to claim it outperforms = not sensible.
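A minimal sketch of that candidate-set assumption (my own illustration; the function names, the placeholder scores, and the toy corpus are hypothetical, not from the paper):

```python
def cheap_score(query, doc):
    """Fast, rough similarity: overlap of annotated entity sets."""
    return len(query & doc)

def expensive_score(query, doc):
    """Placeholder for the slower, higher-quality graph-based measure
    (Jaccard similarity stands in here)."""
    return len(query & doc) / len(query | doc)

def search(query, corpus, k=10, candidates=100):
    # Stage 1: cheap filter over the whole corpus. In the paper this role
    # is played by an inverted index over the entity expansions.
    pool = sorted(corpus, key=lambda d: cheap_score(query, d), reverse=True)
    pool = pool[:candidates]
    # Stage 2: the expensive measure runs only on the small candidate set.
    return sorted(pool, key=lambda d: expensive_score(query, d), reverse=True)[:k]

corpus = [{"Berlin", "Germany"}, {"Paris", "France"}, {"Berlin", "City"}]
print(search({"Berlin"}, corpus, k=2, candidates=2))
```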

Has the work been critically questioned? Have you satisfied yourself that it is sound science?
It was accepted at the 13th ESWC 2016 (European Semantic Web Conference), so at least someone has looked at it.

What forms of evidence are to be used?
Experiments.

How is the evidence to be measured? Are the chosen methods of measurement objective, appropriate and reasonable?
They use what sounded like established measures for retrieval quality, and they used them on established benchmarks. OK.
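One example of what such an established measure looks like is normalized discounted cumulative gain (nDCG); this sketch is my own illustration, not the paper's evaluation code, and the paper may well use different measures:

```python
import math

def ndcg(relevances, k=None):
    """relevances: graded relevance of the returned documents, in ranked
    order. Returns a score in [0, 1]; 1.0 means the ranking is ideal."""
    rel = relevances[:k] if k else relevances
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rel))
    ideal = sorted(relevances, reverse=True)[:len(rel)]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg([3, 2, 3, 0, 1]))  # ~0.97: a good but not ideal ordering
```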

For speed they use time measurements, but they do not state what hardware was used for the experiment, or whether the competing approaches even ran on the same hardware with the same experimental setup. NOT OK.
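For comparison, a sketch of the kind of controlled timing I would have expected, with repeated runs and a report of the environment (my own illustration; run_query and the queries are hypothetical):

```python
import platform
import statistics
import time

def time_queries(run_query, queries, repeats=5):
    """Time each query several times and keep the median, so that one-off
    background noise does not dominate the result."""
    timings = []
    for query in queries:
        runs = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(query)
            runs.append(time.perf_counter() - start)
        timings.append(statistics.median(runs))
    return timings

# Report the environment alongside the numbers, so that competing
# approaches can be compared on an equal footing.
print(platform.processor(), platform.python_version())
```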

What are the qualitative aims, and what makes the quantitative measures you have chosen appropriate to those aims?
There is a quantitative measure for quality, with higher numbers being better.
The same holds for speed, but with lower numbers being better.

What compromises or simplifications are inherent in your choice of measure?
The time an algorithm takes to execute depends on many things (CPU speed, background processes, etc.) and in this case also on the network speed.

Will the outcome be predictive?
No; a different setup or a different dataset can produce different results.

What is the argument that will link the evidence to the hypothesis?
Better numbers (higher for quality, lower for speed) than the competing approaches support the hypothesis, since it only claims to be better than others.

To what extent will positive results persuasively confirm the hypothesis? Will negative results disprove it?
To a small extent, since there is only a small number of test datasets. Negative results would disprove the hypothesis, since it is so general.

What are the likely weaknesses of or limitations to your approach?
Since the claim is so general, it is hard to prove.