DBAcademic
Connecting open data from Brazilian public educational institutions using Linked Data.
DBAcademic: Connecting Open Public Data from Educational Institutions
Public institutions hold large volumes of data that could be used to improve their services, a potential that drives the Open Data movement. In Brazil, Decree No. 8.777 (2016) required federal institutions to create an Open Data Plan (PDA), which led to a surge in published datasets. However, each institution typically maintains its data in isolation, which makes cross-institutional queries practically impossible. This project connects these datasets into a large **Linked Data** repository called DBAcademic. Through this portal, users can run complex queries that span data from multiple public educational institutions at once.
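To illustrate why a shared, linked representation makes cross-institutional queries possible, here is a minimal sketch in plain Python. It is not the DBAcademic implementation: the `example.org` URIs, the property names `name` and `interest`, and the sample records are all hypothetical. Each institution's data is modeled as a set of (subject, predicate, object) triples, and because both sets use the same vocabulary, their union can be queried as a single graph.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical shared vocabulary; the real ontology may use different terms.
VOCAB = "http://example.org/vocab/"

# Triples as two institutions might publish them (invented sample data).
inst_a: Set[Triple] = {
    ("http://example.org/inst-a/prof/1", VOCAB + "name", "Ana"),
    ("http://example.org/inst-a/prof/1", VOCAB + "interest", "Semantic Web"),
}
inst_b: Set[Triple] = {
    ("http://example.org/inst-b/prof/7", VOCAB + "name", "Bruno"),
    ("http://example.org/inst-b/prof/7", VOCAB + "interest", "Semantic Web"),
    ("http://example.org/inst-b/prof/8", VOCAB + "name", "Carla"),
    ("http://example.org/inst-b/prof/8", VOCAB + "interest", "Databases"),
}


def match(
    graph: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for ts, tp, to in graph:
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o):
            yield (ts, tp, to)


def names_by_interest(graph: Set[Triple], interest: str) -> List[str]:
    """Return the sorted names of everyone who lists the given interest."""
    names = []
    for subject, _, _ in match(graph, p=VOCAB + "interest", o=interest):
        for _, _, name in match(graph, s=subject, p=VOCAB + "name"):
            names.append(name)
    return sorted(names)


# Merging is a set union because both sources share one vocabulary.
linked = inst_a | inst_b
print(names_by_interest(linked, "Semantic Web"))
```

In a real deployment the same question would be expressed in SPARQL against the published RDF data; the pattern-matching function above only mimics that idea on a small scale.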
Datasets
Our research identified 45 public educational institutions that publish open data, with an average of 20 datasets each. For this project, we selected the most relevant datasets to build our linked repository:
| Data | Description |
|---|---|
| Faculty | Information about professors, including name, bio, interests, department, and Lattes CV URL. |
| Courses | Details on academic programs, CNPq knowledge areas, coordinators, and professional titles. |
| Department | Data regarding university departments, including location, leadership, and associated centers. |
| Center | Information on the higher-level hierarchy (Colleges/Centers) that oversees departments. |
| Research Groups | Groups of faculty and students organized by theme, including areas of expertise and coordinators. |
| Monographs (TCC) | Data on student theses, including titles, advisors, defense dates, and years. |
| Students | Records of active, incoming, or former students, including enrollment IDs and course names. |
Technical Resources
- Modeling: Access the data model for each dataset.
- Ontology: Access the ontology used for data publication.
- Published Data: Explore the datasets published on Data.world.
- Query Examples: See examples of SPARQL queries.
Software
- Simple Object-triple Mapping (SiMPoT): A tool developed to facilitate the publication of these data.
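
The general idea behind object-triple mapping can be sketched as follows. This is not SiMPoT's API, which is not described here; the `Faculty` class, the `to_triples` helper, and the `example.org` namespaces are hypothetical names chosen for illustration. The sketch turns each field of a Python object into one (subject, predicate, object) triple, plus an `rdf:type` triple that identifies the class.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespaces; a real publication would use the project's ontology.
BASE = "http://example.org/resource/"
VOCAB = "http://example.org/vocab/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class Faculty:
    """A simplified faculty record (fields loosely follow the Faculty dataset)."""
    id: str
    name: str
    department: str
    lattes_url: str


def to_triples(obj) -> List[Triple]:
    """Map a dataclass instance to triples: one rdf:type plus one per field."""
    class_name = type(obj).__name__
    subject = f"{BASE}{class_name.lower()}/{obj.id}"
    triples: List[Triple] = [(subject, RDF_TYPE, VOCAB + class_name)]
    for field in fields(obj):
        if field.name == "id":
            continue  # the id is already encoded in the subject URI
        value = getattr(obj, field.name)
        if value is None:
            continue  # skip missing values instead of emitting empty literals
        triples.append((subject, VOCAB + field.name, str(value)))
    return triples


# Invented sample record, for illustration only.
professor = Faculty(
    id="42",
    name="Ana Silva",
    department="Informatics",
    lattes_url="http://example.org/lattes/0000",
)
for triple in to_triples(professor):
    print(triple)
```

Deriving predicates from field names keeps the mapping declarative: adding a field to the class adds a triple without touching the conversion code. A production tool would also need to distinguish URI objects from literals and handle datatypes, which this sketch omits.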
Selected Publications
- Costa, Sérgio Souza et al. “DBacademic: Connecting open data from educational institutions in Brazil.” Ciência da Informação, v. 49, 2021.
- Costa, Sérgio Souza et al. “A semi-automatic solution for extraction, transformation, and loading of linked data.” WIDaT, 2019.
Mentorship & Supervision
- Mateus Vitor Duarte Sousa. An Extensible Solution for Extracting Public Data into Linked Data. (Undergraduate Thesis), 2019.
- Jose Victor Meireles Guimaraes. Migrating from Open Data to Linked Data: A Proposal for UFMA. (Undergraduate Thesis), 2018.