Dataset viewer on Hugging Face

Work

This work was done for Hugging Face.

2022 2023 2024 Python Svelte SVG Table Tailwind CSS TypeScript

The dataset viewer has been my main work at Hugging Face since I joined the company in 2021.

The first goal was to offer a preview of every dataset on the Hugging Face Hub. Back in 2021, the datasets were very heterogeneous, and many were generated by a Python script. I therefore created a backend service that preprocessed every public dataset to extract its first 100 rows, and exposed an API to retrieve that data. On the Hub, I built a basic data table to display it. Because each dataset page is generated server-side, search engines index the preview data, which also helped the datasets rank in search results.
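To illustrate the idea of extracting only the first rows, here is a minimal sketch in Python. It is not the actual service code: the names first_rows and generate_rows are hypothetical, and the generator merely stands in for a dataset produced by a script. The point it shows is that a preview can be built without reading the whole dataset.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]

# The preview size mentioned above; the constant name is an assumption.
PREVIEW_SIZE = 100


def first_rows(rows: Iterable[Row], limit: int = PREVIEW_SIZE) -> List[Row]:
    """Return at most `limit` rows, stopping as soon as the limit is reached."""
    return list(islice(rows, limit))


def generate_rows(count: int) -> Iterable[Row]:
    """Hypothetical stand-in for a dataset generated by a Python script."""
    for index in range(count):
        yield {"id": index, "text": f"example {index}"}


# Only the first 100 rows are ever produced by the generator here.
preview = first_rows(generate_rows(1_000_000))
```

Because the rows are consumed lazily, the cost of building the preview depends on the preview size rather than on the size of the dataset.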

Features we developed over time include random access and pagination, embedding the viewer in external websites, support for more writing systems, full-text search, filtering, sorting, and support for private datasets.

Sort and filter in Hugging Face dataset viewer