Parsisanj: a semi-automatic component-based approach towards search engine evaluation
Accessing to required data on the internet is wide via search engines in the last two decades owing to the huge amount of available data and the high rate of new data is generating daily. Accordingly, search engines are encouraged to make the most valuable existing data on the web searchable. Knowing how to handle a large amount of data in each step of a search engines' procedure from crawling to indexing and ranking is just one of the challenges that a professional search engine should solve. Moreover, it should also have the best practices in handling users' traffics, state-of-the-art natural language processing tools, and should also address many other challenges on the edge of science and technology. As a result, evaluating these systems is too challenging due to the level of internal complexity they have, and is crucial for finding the improvement path of the existing system. Therefore, an evaluation procedure is a normal subsystem of a search engine that has the role of building its roadmap. Recently, several countries have developed national search engine programs to build an infrastructure to provide special services based on their needs on the available data of their language on the web. This research is conducted accordingly to enlighten the advancement path of two Iranian national search engines: Yooz and Parsijoo in comparison with two international ones, Google and Bing. Unlike related work, it is a semi-automatic method to evaluate the search engines at the first pace. Eventually, we obtained some interesting results which based on them the component-based improvement roadmap of national search engines could be illustrated concretely.
READ FULL TEXT