01461nas a2200157 4500008004100000245007300041210006900114260000900183520091500192653000801107653001701115100002201132700002101154700001801175856011001193 2015 eng d00aEstimation and Visualization of Digital Library Content Similarities0 aEstimation and Visualization of Digital Library Content Similari c20153 aWe report on a process for similarity estimation and two-dimensional mapping of lesson materials stored in a Web-based K12 Science, Technology, Engineering and Mathematics (STEM) digital library. The process starts with automated removal of all information which should not be included in the similarity estimations followed by automated indexing. Similarity estimation itself is conducted through a natural language processing algorithm which heavily relies on bigrams. The resulting similarities are then used to compute a Sammon-map; i.e., a projection in n dimensions, the item-to-item distances of which best reflect the input similarities. In this paper we concentrate on specification and validation of this process. The similarity results show almost 100% precision-by-rank in the top three to five ranks. Sammon mapping in two dimensions corresponds well with the digital library’s table of content.10aBIS10aSupply Chain1 aReitsma, Reindert1 aHsieh, Ping-Hung1 aRobson, Robby u/biblio/estimation-and-visualization-digital-library-content-similarities