Serendipitous Exploration of the Carleton Curriculum

Cathy Duan, Willow Gu, Markus Gunadi, Zoey La, Charlie Ney, and Kai Weiner

Our Website

The core feature of our project is a force-directed graph in which every node represents a unique course. Using our Doc2Vec model, each node is connected to at least two other nodes, and to more if the course has additional strong connections to other courses. A line between two courses means that their descriptions are similar according to the model. The graph can be navigated by hovering, clicking, and zooming, and it is linked to a search bar and a quiz feature. The website also has a calendar component: any course the user saves appears in a built-in calendar, so users can better visualize a potential schedule. We plan to implement more features that promote serendipity.
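The connection rule described above (every course linked to at least two others, plus any further strong matches) can be sketched roughly as follows. This is a minimal illustration rather than the site's actual code: the function name build_edges, the min_links parameter, the 0.6 threshold, and the sample scores are all assumptions made for the example.

```python
def build_edges(similarity, min_links=2, threshold=0.6):
    """Return a set of undirected edges between courses.

    similarity maps each course to a dict of {other_course: score}.
    Each course is linked to its `min_links` most similar courses, plus
    any other course whose score is at least `threshold` (hypothetical value).
    """
    edges = set()
    for course, scores in similarity.items():
        # Rank the other courses from most to least similar.
        ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
        for rank, (other, score) in enumerate(ranked):
            if rank < min_links or score >= threshold:
                # Store each pair in sorted order so A-B and B-A are one edge.
                edges.add(tuple(sorted((course, other))))
    return edges


# Made-up similarity scores for four placeholder courses.
sample_similarity = {
    "A": {"B": 0.9, "C": 0.7, "D": 0.65},
    "B": {"A": 0.9, "C": 0.3, "D": 0.2},
    "C": {"A": 0.7, "B": 0.3, "D": 0.1},
    "D": {"A": 0.65, "B": 0.2, "C": 0.1},
}

sample_edges = build_edges(sample_similarity)
```

In this sample, course A ends up with three links because its score with D clears the threshold, while C and D are linked only to their two closest matches.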



Background

Carleton College provides extensive tools, such as Workday and the Academic Course Catalog, for students to filter and search for classes. These search tools function well and are what all students currently use to find classes. However, Carleton lacks tools that emphasize exploration and browsing. Our aim is therefore to build a tool that highlights the liberal arts experience. Our tool will work in tandem with the existing tools, helping students consider more unique classes before they finalize their schedule and register through Workday.

Our main inspiration for this project is the research paper The Bohemian Bookshelf: Supporting Serendipitous Book Discoveries through Information Visualization (Thudt, 2012). The researchers, Alice Thudt, Uta Hinrichs, and Sheelagh Carpendale, created five visualizations of book collections aimed at encouraging “serendipitous discoveries” by highlighting different patterns and connections between books. We hope to emulate their success by encouraging students at Carleton College to make their own “serendipitous discoveries” when exploring courses to register for in future terms. To that end, we built our own tool that applies NLP to course descriptions to highlight unique connections between Carleton’s courses.

Our Data

Our data includes 6-credit courses offered in the Spring 2025 term at Carleton College (excluding seminar and capstone courses). It includes only one section of each class, so to find out whether a course offers additional sections, please check Carleton’s Course Catalog.

The Model

We used Gensim's Doc2Vec (Paragraph Vector) model to capture the semantic meaning of each course description as a vector in a multidimensional space. Unlike traditional bag-of-words models, which disregard word order and the relationships between words, Doc2Vec learns a fixed-length vector for each document that carries contextual information about how its words are used together. Because the model trains individual word vectors and a document vector simultaneously, it better captures the overall meaning of a description. With our courses vectorized, we compared them using cosine similarity, the cosine of the angle between two vectors, as a measure of semantic closeness. The resulting scores give us the course recommendations that we use to create the graph’s connections.
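To make the comparison step concrete, here is a small sketch of cosine similarity in plain Python. It does not train a Doc2Vec model; the three-dimensional vectors and the course names below are made-up stand-ins for the much longer vectors a trained model would produce, and the helper names cosine_similarity and most_similar are chosen only for this example.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical course vectors (assumed values, not real model output).
course_vectors = {
    "COURSE_A": [1.0, 0.0, 0.0],
    "COURSE_B": [1.0, 1.0, 0.0],
    "COURSE_C": [0.0, 0.0, 1.0],
}


def most_similar(course, vectors):
    """Rank every other course by cosine similarity to `course`, best first."""
    target = vectors[course]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != course
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("COURSE_A", course_vectors)
```

A score near 1 means two descriptions point in nearly the same direction in the vector space, and a score near 0 means they are unrelated. Because cosine similarity ignores vector length, a long description and a short one can still score as close matches.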

Languages and Libraries

We used React.js, D3.js, CSS, Python, Node.js, Express.js, and PostgreSQL to create our website.