Analyzing of the Evolution of Web Pages by Using a Domain Based Web Crawler. Uzun, E., Yerlikaya, T., & Kurt, M. In Techsys, 26-28 May, Plovdiv, Bulgaria, pages 151-156, 2011.
To improve the algorithms used in search engines, crawlers and indexers, the evolution of web pages should be examined. For this purpose, we developed a domain-based crawler, SET Crawler, which collects the 1998-2008 web archives of three popular Turkish daily newspapers (Hurriyet, Milliyet and Sabah). The completed crawl yielded a set of 3430997 HTML pages. The average file size of a web page was approximately 5.19 KB in 1998 and 53.94 KB in 2008. Given that the main content of web pages is similar in size, this observation shows how much the use of unnecessary content and tags has increased. Analyses indicate that the use of link, image and layout tags has increased significantly in recent decades. Moreover, the tag has been used instead of the
@inproceedings{Uzun2011b,
title = {Analyzing of the Evolution of Web Pages by Using a Domain Based Web Crawler},
type = {inproceedings},
year = {2011},
keywords = {Degree of Changes in Web Pages,Web Crawlers,Web Evolution},
pages = {151--156},
websites = {https://erdincuzun.com/wp-content/uploads/download/2011/plovdiv01.pdf},
id = {0966001a-cb13-3243-a557-ea1c1ee9f9c1},
created = {2018-06-05T12:53:51.612Z},
file_attached = {false},
profile_id = {37fa15c3-e5d0-3212-8e18-e4c72814fd47},
last_modified = {2021-02-21T16:02:44.528Z},
read = {false},
starred = {false},
authored = {true},
confirmed = {true},
hidden = {false},
citation_key = {Uzun2011b},
private_publication = {false},
abstract = {To improve the algorithms used in search engines, crawlers and indexers, the evolution of web pages should be examined. For this purpose, we developed a domain-based crawler, SET Crawler, which collects the 1998-2008 web archives of three popular Turkish daily newspapers (Hurriyet, Milliyet and Sabah). The completed crawl yielded a set of 3430997 HTML pages. The average file size of a web page was approximately 5.19 KB in 1998 and 53.94 KB in 2008. Given that the main content of web pages is similar in size, this observation shows how much the use of unnecessary content and tags has increased. Analyses indicate that the use of link, image and layout tags has increased significantly in recent decades. Moreover, the tag has been used instead of the},
bibtype = {inproceedings},
author = {Uzun, Erdinç and Yerlikaya, Tarık and Kurt, Meltem},
booktitle = {Techsys, 26-28 May, Plovdiv, Bulgaria}
}