In the Internet age, enormous amounts of data are stored daily. Using data mining techniques to extract knowledge from web log files has become a necessity. The behavior of Internet users can be found in the log files stored on Internet servers. Web log analysis can improve businesses that rely on a Web site by learning user behavior and applying this knowledge, for example to direct users to pages that other users with similar behavior have visited. Extracting useful information from these data has proved very useful for optimizing Web sites and marketing campaigns. In this paper I focus on finding associations as a data mining technique to extract potentially useful knowledge from web usage data. I implemented a program in Java, using the NetBeans IDE, that identifies page associations from sessions. As an example, I used the log files from a commercial web site.
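The idea of finding page associations from sessions can be illustrated with a minimal sketch. It is written in Python rather than the Java of the paper, and it is not the author's program: the function name `page_associations`, the `min_support` parameter, and the sample sessions are all assumptions made for illustration. It counts how often two pages occur in the same session and keeps the pairs whose support (share of sessions containing both pages) reaches a threshold.

```python
from collections import Counter
from itertools import combinations


def page_associations(sessions, min_support=0.5):
    """Return {(page_a, page_b): support} for pairs seen together often enough.

    sessions is a list of page lists, one per session (hypothetical input format).
    """
    n = len(sessions)
    pair_counts = Counter()
    for pages in sessions:
        # Count each unordered pair once per session, ignoring repeat visits.
        for pair in combinations(sorted(set(pages)), 2):
            pair_counts[pair] += 1
    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}


# Made-up sessions, not data from the paper.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/contact"],
    ["/products", "/cart"],
]
rules = page_associations(sessions, min_support=0.5)
```

With these sample sessions, only the pairs that appear together in at least half of the sessions survive the threshold. A fuller treatment would also compute confidence for directed rules, which this sketch leaves out.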
The technological revolution of recent years, driven by the spectacular development of the Internet, has made its presence felt in the economy. Electronic commerce is already a major component of the economy and thus influences the labor market. In this article, we present electronic commerce as a way to increase the number of employees. Given that only 6% of Romanians currently use electronic commerce, its growth potential is huge, and with thoughtful strategies this growth can be guided toward specific regions. Companies operating on the Internet can be attracted to a given region by providing incentives, and the beneficial effects will be felt not only through the number of employees but also through the services used by these companies.
The World Wide Web, or simply the web, is the most dynamic environment. The web has grown steadily in recent years and its content changes every day. Today, there are several billion HTML documents, pictures and other multimedia files available on the Internet. Methods are needed to help us extract information from the content of web pages. One answer to this problem is the application of data mining techniques known as web content mining, which is defined as "the process of extracting useful information from the text, images and other forms of content that make up the pages".
Ovidius University Annals, Economic Sciences Series, 2011
The age of information technology, with its many services, is upon us. Nowadays, using computers to do all sorts of daily tasks has become a necessity. Internet technology is changing ever faster, and the pace of its innovation and adoption is truly staggering. Apart from ...
In data preprocessing, session identification is a very important step. Algorithms used so far to identify sessions use fixed values to mark the end of one session and the beginning of another. In this paper we explain why the use of fixed values causes errors in identifying sessions, and we propose a new method for identifying sessions based on the average time spent visiting web pages. We implemented two session identification algorithms in Java, using the NetBeans IDE. The first uses a fixed value of 30 minutes (1800 seconds) to indicate the end of a session; the second uses the average time users spend on the pages of the website. As an example we used the NASA log file available online at
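The difference between a fixed timeout and an average-based one can be sketched as follows. This is a Python illustration under stated assumptions, not the Java implementation described in the abstract: the helper names `split_sessions` and `average_page_time`, the sample timestamps, and in particular the choice of three times the average as the adaptive timeout are hypothetical, since the abstract does not say how the average is turned into a threshold.

```python
def split_sessions(timestamps, timeout):
    """Split one visitor's sorted request times (in seconds) into sessions.

    A new session starts whenever the gap to the previous request exceeds timeout.
    """
    sessions = []
    for t in timestamps:
        if sessions and t - sessions[-1][-1] <= timeout:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def average_page_time(timestamps, cap=1800):
    """Mean gap between consecutive requests, ignoring gaps longer than cap."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:]) if b - a <= cap]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Made-up request times for a single visitor.
times = [0, 40, 100, 1300, 1340, 4000]

fixed = split_sessions(times, timeout=1800)        # fixed 30-minute rule
avg = average_page_time(times)                     # average time per page
adaptive = split_sessions(times, timeout=3 * avg)  # factor 3 is an arbitrary assumption
```

On this sample the fixed 30-minute rule merges the 20-minute pause into one session, while the average-based threshold splits it, which is the kind of disagreement the abstract refers to.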
The quality of decisions depends on the quality of the processed data, so it is important to provide correct, high-quality data at the beginning of the data mining process. Data preprocessing is necessary to avoid failure of the data analysis. The idea that the data mining process can be carried out without human supervision has proved to be wrong. Even so, people try to automate as much of the process as possible. This has resulted in many algorithms and techniques implemented in various programming languages. This work presents an algorithm for identifying sessions from a web log file. It uses a value of 30 minutes to mark the end of one session and the start of another. We compute the average time spent visiting pages and use it to show that the presented algorithm produces errors in identifying sessions. We consider that the correct way to identify sessions is to take into account the average time spent visiting pages.
Web servers worldwide generate a vast amount of information on web users’ browsing activities. Several researchers have studied these so-called clickstream or web access log data to better understand and characterize web users. Clickstream data can be enriched with information about the content of visited pages and the origin (e.g., geographic, organizational) of the requests. The goal of this project is to analyse user behaviour by mining enriched web access log data. With the continued growth and proliferation of e-commerce, Web services, and Web-based information systems, the volumes of clickstream and user data collected by Web-based organizations in their daily operations have reached astronomical proportions. This information can be exploited in various ways, such as enhancing the effectiveness of websites or developing directed web marketing campaigns. The discovered patterns are usually represented as collections of pages, objects, or resources that are frequently accessed ...
In this article I focus on developing an expert system that advises on the choice of wine that best matches a specific occasion. An expert system is a computer application that performs a task that would otherwise be performed by a human expert. The implementation is done in the Delphi programming language. I used a set of rules to represent the knowledge base. The rules are IF-THEN-ELSE decision rules based on important wine features.
Due to the continuous growth and spread of the Internet, using Web Mining to improve the quality of different services has become a necessity. Web Mining is the application of data mining techniques and algorithms to web data. In this work we present two algorithms used in Web Structure Mining, namely PageRank and HITS. Both algorithms originate in social network analysis and are modeled on the theory of Markov chains. PageRank is used by the search engine Google and HITS by the search engine CLEVER. We present their strengths, weaknesses and other areas of applicability.
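To make the Markov chain view of PageRank concrete, here is a minimal power-iteration sketch in Python. It is a textbook-style illustration, not code from the paper: the function name `pagerank`, the damping value 0.85, the fixed iteration count, the handling of pages without outgoing links, and the three-page example graph are all assumptions. HITS is not shown.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to (hypothetical format).
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly over its outgoing links.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank


# Made-up three-page site.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

In this toy graph page C collects links from both A and B and ends up with the highest score, while the scores as a whole remain a probability distribution over pages.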
Using data mining techniques to extract knowledge from web log files has become a necessity. The behavior of Internet users can be found in the log files stored on Internet servers. Web log analysis can improve businesses that rely on a Web site by learning user behavior and applying this knowledge, for example to direct users to pages that other users with similar behavior have visited. In this paper we adapt the random navigation model, which is used to rank web pages, to a single Web site by replacing the jump probability matrix with a matrix derived from the adjacency matrix of the web site. We use the modified algorithm to predict the next page that the user will navigate to. We also note future research directions.
The World Wide Web has become one of the most valuable resources for information retrieval and knowledge discovery, owing to the constant increase in the amount of data available online. Given the size of the web, users easily get lost in its rich hyperlink structure. Applying data mining methods is the right solution for knowledge discovery on the Web. The knowledge extracted from the Web can be used to improve the performance of Web information retrieval, question answering and Web-based data warehousing. In this paper, I provide an introduction to the categories of Web mining and focus on one of them: Web structure mining. Web structure mining, one of the three categories of web mining, is a tool used to identify the relationships between Web pages linked by information or by direct link connections. It offers information about how different pages are linked together to form this huge web. Web structure mining finds hidden basic structures and uses hyperlinks for further web applications such as web search. KEY WORDS: web mining; internet; web structure mining; link mining. JEL CLASSIFICATION: L86.
Papers by Claudia Dinuca