Previous research shows that users tend to change their assessment of search results over time. This is a first study that investigates the factors and reasons for these changes, and describes a stochastic model of user behaviour that may explain these changes. In particular, we hypothesise that most of the changes are local, i.e. between results with similar or close relevance to the query, and thus belong to the same "coarse" relevance category. According to the theory of coarse beliefs and categorical thinking, humans tend to divide the range of values under consideration into coarse categories, and are thus able to distinguish only between cross-category values but not within them. To test this hypothesis we conducted five experiments with about 120 subjects divided into 3 groups. Each student in every group was asked to rank and assign relevance scores to the same set of search results over two or three rounds, with a period of three to nine weeks between rounds. T...
This paper proposes a set of measures to evaluate search engine functionality over time. When evaluating the performance of Web search engines, the evaluation criteria used in traditional information retrieval systems (precision, recall, etc.) are not sufficient. Web search engines operate in a highly dynamic, distributed environment; it therefore becomes necessary to assess search engine performance not just at a single point in time, but over a whole period.
... our evaluation methodology, section 4 is dedicated to the results and section 5 concludes and ... The evaluation was based on ten queries (when the queries included diacritics the searches were ... in the context of information retrieval, but not in the context of the Web (e.g. Arabic [1 ...
The objective of this study was to characterize the changes in the rankings of the top-n results of major search engines over time and to compare the rankings between these engines. We considered only the top-ten results, since users usually inspect only the first page returned by the search engine, which normally contains ten results. In particular, we compare rankings
Anecdotal evidence exists that in many positions two distinct chess engines will choose different moves and, moreover, that their top-n rankings of move choices also differ. Here we set out to quantify this difference, including the difference between move choices by chess engines and those made by humans. For our analysis we used FRITZ 8 and JUNIOR 9 as representative chess search engines and the POWERBOOK opening book as representing human choices. We collected the top-5 ranked moves and their scores as reported by FRITZ and JUNIOR, after 15 and 30 minutes of thinking time, and the top-5 moves recorded in the POWERBOOK, for the Nunn2 test positions and the initial board position. The data analysis was carried out using several nonparametric measures, including the amount of overlap in the top-5 choices of the engines and their association as measured by three variants of Spearman's footrule. Our preliminary results show that, overall, the engines differ substantially in their choice of moves, and, furthermore, the engines' choices also differ substantially from human choice.
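The two kinds of measure named in this abstract, overlap and Spearman's footrule, can be illustrated with a minimal Python sketch. The abstract does not say which three footrule variants were used, so the version below is an assumption: it is the common top-k form in which an item missing from one list is assigned rank k + 1. The function names and the example move lists are hypothetical and are not taken from the paper.

```python
def overlap(list_a, list_b):
    """Number of items that appear in both top-k lists."""
    return len(set(list_a) & set(list_b))


def footrule(list_a, list_b, missing_rank=None):
    """Spearman's footrule distance between two ranked lists.

    Sums the absolute rank differences over all items in either list.
    Assumption: an item absent from a list gets rank k + 1, where k is
    the length of the longer list (one common top-k convention).
    """
    k = max(len(list_a), len(list_b))
    if missing_rank is None:
        missing_rank = k + 1
    rank_a = {item: pos + 1 for pos, item in enumerate(list_a)}
    rank_b = {item: pos + 1 for pos, item in enumerate(list_b)}
    items = set(rank_a) | set(rank_b)
    return sum(
        abs(rank_a.get(item, missing_rank) - rank_b.get(item, missing_rank))
        for item in items
    )


# Hypothetical top-5 move lists for two engines (illustration only).
engine_a = ["e4", "d4", "Nf3", "c4", "g3"]
engine_b = ["d4", "e4", "c4", "Nf3", "b3"]

print(overlap(engine_a, engine_b))   # shared moves
print(footrule(engine_a, engine_b))  # total rank displacement
```

Identical lists give a footrule distance of 0, and the distance grows as shared moves drift apart in rank or drop out of one list entirely.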
Web research is based on data from or about the Web. Often data is collected using search engines. Here we describe our "wish list" for the ideal search engine, explain the need for the specific features and examine whether the currently existing major search engines can at ...
Abstract: In this paper, we analyze those publications of the home institutes of the iSchools that are indexed by Thomson Reuters (ISI) Web of Science in the information science and library science category, and were published between 2000 and 2009.
Abstract: Currently existing data sources for informetric research are far from being perfect. Some of the imperfections are caused by uneven coverage, errors or changes in indexing policies that are often not retroactive or by mistaken or ineffective retrieval strategies ...
Abstract: The Internet, and more specifically the World Wide Web, is quickly becoming one of our main information sources. Systematic evaluation and analysis can help us understand how this medium works, grows, and changes, and how it influences our lives and ...
Journal of the Association for Information Science and Technology, 2014
Blogs that cite academic articles have emerged as a potential source for alternative impact metrics for the visibility of the blogged articles. Nevertheless, in order to more fully evaluate the value of blog citations, it is necessary to investigate whether research blogs focus on particular types of articles or give new perspectives into scientific discourse. Thus, we studied the characteristics of peer-reviewed references in blogs and the typical content of blog posts to get insights into the bloggers' motivations. The sample consisted of 391 blog posts from 2010-2012 in Researchblogging.org's Health category. The bloggers mostly cited recent research articles or reviews from top multidisciplinary and general medical journals. Using content analysis methods, we created a general classification scheme for blog post content with ten major topic categories, each with several subcategories. The results suggest that health research bloggers rarely self-cite and the vast majority of their blog posts (90%) include a general discussion of the issue covered in the article, with over a quarter providing health-related advice based on the article(s) covered. These factors suggest a genuine attempt to engage with a wider nonacademic audience. Nevertheless, almost 30% of the posts included some criticism of the issues being discussed. Given that explicit criticism is rare in academic articles, this suggests that blogs are a more natural home for this important scientific activity.
Proceedings of the American Society for Information Science and Technology, 2014
Abstract: The aim of this SIG/MET-sponsored panel is to discuss major informetric topics including the impact factor, the h-index, sources of citation data, the Eigenfactor, the making and use of base maps of science, application of informetrics (e.g., for retrieval purposes), altmetrics, and future perspectives on bibliometrics. The panel especially addresses attendees who want to expand their knowledge in this area or have come into contact with it only recently.
... and fields. We would like to thank Ms. Aviva Joseph for helping with the characterization of the changes that URLs had undergone and thanks to Tamar Bar-Ilan for helping with the data collection. Bibliography ALMIND, TC ...
Abstract: In this paper we investigate the retrieval capabilities of six Internet search engines on a simple query. As a case study the query Erdos was chosen. Paul Erdos was a world-famous Hungarian mathematician, who passed away in September 1996. Existing work ...
Proceedings of the American Society for Information Science and Technology, 2009
... the European countries, as of December 2008; it was 88.8% (Parag, 2009). Currently Google has local sites for 168 countries or territories (Google, 2009). The localized Google versions allow displaying country-specific sponsored results, and in addition they allow Google to provide country-specific display of the organic results as well. As an example, consider Figures 1 and 2: in both cases we searched for Brown on January 27, 2009, once on google.com and once on google.co.uk. Only on google.co.uk do we see among the top-three results a result related to Prime Minister Gordon Brown. On google.com there was no mention of Gordon Brown among the top-ten results for the query Brown. Similarly, at Live Search, the user can select a region to "discover a search experience tailored to your part of the world" (Live Search, 2009). Currently one can choose from 58 regions.
Papers by Judit Bar-Ilan