Introduction

About Auto-Metadata Generator

This project was developed during the 2023 Academia Sinica summer internship.

Depositar is an online repository for research data, curated by Academia Sinica, Taiwan. As with any data repository, the importance of metadata cannot be overstated: it is what users rely on to discover datasets that meet their needs.

However, filling in metadata accurately and completely remains difficult for dataset providers. When uploading a dataset, users are often unsure which Wikidata keywords best describe its content, and specifying spatial coverage can be similarly complicated.

This project therefore aims to develop a metadata generator. Based on textual information about a dataset, such as descriptions or source file names, the program recommends Wikidata keywords and spatial coverage information to help users complete the dataset's metadata.

TL;DR

  • input: dataset title, dataset description, source file names, source file descriptions, project name, project description

  • output: Wikidata keywords and spatial information for the uploaded dataset
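
The input and output listed above can be pictured as two small record types. The following is only an illustrative sketch: the class names, field names, and the `combined_text` helper are hypothetical and are not taken from the project's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetInfo:
    """Textual inputs from the TL;DR list (all names here are hypothetical)."""
    title: str = ""
    description: str = ""
    source_filenames: list[str] = field(default_factory=list)
    source_descriptions: list[str] = field(default_factory=list)
    project_name: str = ""
    project_description: str = ""

    def combined_text(self) -> str:
        """Join all non-empty text fields into one newline-separated string."""
        parts = [
            self.title,
            self.description,
            *self.source_filenames,
            *self.source_descriptions,
            self.project_name,
            self.project_description,
        ]
        return "\n".join(p.strip() for p in parts if p and p.strip())


@dataclass
class MetadataSuggestion:
    """Recommended outputs from the TL;DR list (names are hypothetical)."""
    wikidata_keywords: list[str] = field(default_factory=list)
    spatial_coverage: list[str] = field(default_factory=list)
```

A caller might fill in `DatasetInfo` from the upload form and pass `combined_text()` to whatever keyword and place extraction step the generator uses; how that step works is not described here.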

More Information

Depositar

depositar — taking the term from the Portuguese/Spanish verb for to deposit — is an online repository for research data. The site data.depositar.io is built by the researchers for the researchers. You are free to deposit, discover, and reuse datasets on depositar for all your research purposes.

--Source: Depositar

Website: Depositar: deposit・discover・reuse

Wikidata

Wikidata is a free and open knowledge base that can be read and edited by both humans and machines.

Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.

--Source: Wikidata

Website: Wikidata
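
As an illustration of machine access, a candidate keyword could be looked up through Wikidata's public `wbsearchentities` API action. The sketch below only builds the request URL and sends nothing over the network. Whether this project uses that endpoint is an assumption, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint (MediaWiki action API).
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(term: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities URL for a search term (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",  # entity search by label or alias
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"
```

Fetching the resulting URL with any HTTP client should return JSON containing candidate entities, from which keyword suggestions could be drawn.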

OpenStreetMap

OpenStreetMap is built by a community of mappers that contribute and maintain data about roads, trails, cafés, railway stations, and much more, all over the world.

--Source: OpenStreetMap

Website: OpenStreetMap